Project site for UVA's Spring 2020 Public Interest Data Lab, a core project of the CommPAS Lab
View the Project on GitHub datafordemocracy/public-interest-data-2020
This is a 10-week hands-on lab; we’ll be learning how to use data and data science to answer important and public-minded questions with an emphasis on justice and equity. You don’t need to be an expert in statistics or R or coding coming in - though the project will provide plenty of opportunities to become more adept. Good data-oriented work comes from the collaboration of people with diverse talents, perspectives, and developing expertise. We need individuals interested in data wrangling and analysis, working in R, and visualizing data effectively; but we also need people with an understanding of or interest in child welfare, who can ask probing questions and think creatively, who can help teams work better and keep projects moving, who can dig into a new substantive area and synthesize information, and who can communicate about results and write effectively. Everyone will still be responsible for learning about and contributing to each step of the project, but you may find that at some points you are more of a learner and at other points you are more of a leader.
Throughout the project you’ll be working in smaller teams (of about 3) to complete assigned tasks for the fuller project. We’ll need to quickly develop a better understanding of the child welfare system and processes. We’ll need to understand the data we have, consider its provenance, and evaluate its limitations. There will be no “right answers,” and we don’t have pre-determined outcomes. So it will be very representative of real-life data science work - embrace the ambiguity!
Grades will depend on your learning and contribution to the project, which will be assessed by:
Three data exploration assignments (10%): In weeks 1-3, individuals will start to learn about the data while practicing using R. Individuals should take a stab at completing these on their own, but can ask a colleague or the instructors for help or troubleshooting if they get stuck, ideally through Slack so that the learning can be shared. If you’re in the position of helping a colleague, use it as a chance to deepen your own understanding by explaining ideas (don’t just share your work).
Class contributions and participation (25%): Since we are working as a group to complete a real project for a real partner/client, attending class prepared to be fully engaged will help you get the most out of the experience and give the most to our joint effort. Come prepared, do the reading, ask questions, chime in with ideas, help develop solutions, and complete the work you’ve been assigned.
Team assignments (30%): From week 4 on, we’ll begin working in groups to complete parts of the overall project, including reviewing different parts of the literature, exploring different parts of the data set, analyzing different subsets of questions, and writing different elements of the overall report. Once teams have been formed, I expect groups to meet each week between classes to work together. Group members will also have a chance to evaluate one another’s contributions. Each group will report on its progress each week, sometimes to the class, sometimes through documents submitted on Slack. Based on that progress, we’ll determine the next steps that should be completed for the following class. Though I have something of an outline in my head of where I imagine we’ll be from week to week, applied research and data analysis often do not conform to planned timelines. In part, that’s what makes it exciting, but I know that can be uncomfortable for some, so I want to be upfront that the project will require some flexibility on all our parts.
Weekly updates (10%): Beginning in week 5, you will submit a weekly update to me on Slack briefly describing your individual contribution in the past week and what you anticipate completing in the coming week, and alerting me to any problems you or your team are facing.
Final report (25%): Each group will be responsible for part of the final report, including some piece of the pre-analysis sections (overview, literature, and description of the data), as well as the communication of the processes and results around the research questions on which you took the lead.
We will be using R to do our analytical work. Some of us will have prior experience with R and with statistics, some of us will not – and that’s entirely fine.
For Data Analysis: You may find you need to learn about models or methods you don’t already know (it’s not yet clear what models we might need until we really get started). But we won’t let you go outside the bounds of appropriate statistical practice or leave you on your own to figure it out!
Some of our time together will be spent discussing the readings and projects, how we will practice responsible use of the data and the narrative we are creating, and providing feedback to our colleagues on their work. During these times, release yourself from your laptop and other electronic devices.
Some of our time together will be spent actively learning to use R for data wrangling, analysis, and visualization and working on the project in teams. During these times, of course, you’ll need your laptop.