In June, it was just rows and rows of data: property assessments, school records, smart meter readings, contract bids, and census results. By the end of August, this raw material was transformed into tools for detecting international corruption and political earmarks, predicting buildings that pose lead hazards and students at risk of dropping out, and suggesting strategies to fight maternal mortality, unemployment, and homelessness.
For the second year of the Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship, 48 fellows from around the world worked on 12 projects that helped non-profits and government agencies do more with their data to improve the world. Beyond the accomplishments of the 12-week program, the fellowship hopes to seed a new, international community of data-savvy minds driven to make an impact that goes deeper than click rates.
“The primary goal of this fellowship was to train these students to not only solve real problems but problems with social impact,” said Rayid Ghani, director of the fellowship and researcher at the Computation Institute and the Harris School of Public Policy. “Our hope is that these fellows will not only go out and continue this work, but they’ll also start forming more local communities which will evolve into a larger set of people who care about these issues, who care about using their data science skills to make a social impact.”
This year, fellows from the United States, Mexico, Colombia, Australia, and Nigeria worked with project partners including the World Bank Group, Chicago Public Schools, the Office of the President of Mexico, Enroll America, and the City of Memphis to build new data-driven tools and generate novel, actionable insights for the organizations.
On August 19th, the fruits of the fellowship’s frantic summer were presented at the 1871 startup hub in Chicago’s Merchandise Mart. In a succession of 3-minute talks, the fellows shared predictive models, interactive dashboards, and policy recommendations designed to help partners use their data to better achieve their missions.
“I was amazed by what they were able to do with our data,” said Bill Thorland, national program evaluator at Nurse-Family Partnership, a project partner in 2013 and 2014. “It’s given us an opportunity to bring fresh eyes in, to look at our problems, to give us some new ideas. It’s been extremely valuable; I can’t put a price on what this is worth to us. I think we’ve turned the corner in the way which we’ll be using data in the future.”
Summer Camp, with More Statistics
When the fellows gathered at the fellowship’s downtown Loop space in early June, they represented a range of disciplines, from computer science, statistics, and machine learning to economics, public policy, psychology, and architecture/design. The early days were spent finding common technical ground: Python, scikit-learn, R, IPython Notebook, GitHub, SQL, Redshift, and MapReduce, to name a few.
But learning new technical skills was only part of the fellows’ education. Speakers from non-profits such as the YMCA, After School Matters, Chicago Community Trust, and LISC told them about the problems they face, the data their organizations collect, and the obstacles to realizing that data’s full potential.
Data scientists working at GitHub and the City of Chicago talked about how they use data to improve services and conduct innovative research. Sociologists from Yale and Ohio State University talked about the use of new computational data analysis techniques to answer social science questions. Panels on workplace diversity and entrepreneurship gave fellows valuable insight into professional careers within and beyond the tech industry.
The diverse knowledge helped the 12 teams of the fellowship, each made up of four fellows and one experienced mentor, explore their data and scope their projects to achieve a useful result on a brisk schedule. Before jumping into the data, the teams spent considerable time understanding the problems. One team traveled to Memphis to work with the Mayor’s Innovation Delivery Team on strategies to rehabilitate distressed properties. Another visited the World Bank Group headquarters in Washington, D.C., participating in a “Data Dive” on finding signs of corruption in data about the organization’s contracts around the world.
By diving deep into complex problems of education, public health, conservation, and social services, fellows learned what they could provide that would best suit the needs of their partners.
“You could have the best analytical model in the world, but if your partner can’t understand what it’s doing or why they should want to use it, it doesn’t really do that much good,” said Andrew Reece, a graduate student studying psychology at Harvard University. “Ultimately, it’s the partner that we’re trying to help, so they need to really feel and understand what it is we’re doing, in a way that makes sense to them.”
An Antidote Before The Poison
Since bans on leaded paint and gasoline went into effect, the lead levels of Chicago children have dropped precipitously. But in some areas of the city, children still present elevated blood lead levels when tested by pediatricians, reflecting exposure with serious effects upon brain development and other health outcomes.
Finding and mitigating lead hazards in city homes is a leading priority of the Chicago Department of Public Health (CDPH), yet currently it can only investigate a residence after an elevated blood lead test from a child who lives there. One DSSG team (Reece, Alex Loewi, Joe Brew, Subhabrata Majumdar, and mentor Eric Rozier) worked with CDPH to change that process from reactive to proactive, directing home inspections before lead poisoning occurs.
Using data from blood tests, housing records, census demographics, and other sources, the team constructed a model that classified every building in Chicago (over 1 million structures) based on its risk of containing hazardous sources of lead. Combined with birth record data on where pregnant mothers and young children live, the model provides a list of high-priority homes for CDPH inspectors to check and, if necessary, to target with lead mitigation efforts.
“The data we’re getting from DSSG is helping us rethink the way we do our operations,” said Bechara Choucair, Commissioner of the Chicago Department of Public Health. “When you have limited resources, you have to be smarter in how you use those resources, and that’s exactly what we’re doing.”