Where to Buy was a collaboration between Crain's Chicago Business and DataMade to help Crain's readers find the right Chicagoland neighborhood based on their needs. I prepared the data for the app and designed the UI interactions.
We scored places based on five variables of interest:
- High diversity
- Good schools
- Low crime
- Solid real estate price growth
- Short commutes
Users can rearrange the variables to reflect their own priorities and get an updated neighborhood recommendation. They can also choose one of a few preset profiles put together by Crain's reporters.
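To illustrate how reordering priorities can change a recommendation, here is a minimal sketch. It is not the app's actual ranking logic: the `recommend` function, the linear rank-based weights, the variable keys, and the two example towns with their scores are all hypothetical, and each score is assumed to be already scaled so that higher is better.

```python
def recommend(scores, priorities):
    """Return place names sorted best-first for a given priority order.

    scores: {place: {variable: score}}, where higher is assumed to be better.
    priorities: variable names, most important first.
    """
    n = len(priorities)
    # Hypothetical weighting: the top priority gets weight n, the last gets 1.
    weights = {var: n - i for i, var in enumerate(priorities)}
    total = sum(weights.values())

    def weighted(place):
        # Weighted average of the place's scores across all variables.
        return sum(weights[v] * scores[place][v] for v in priorities) / total

    return sorted(scores, key=weighted, reverse=True)


# Made-up scores for two made-up places, for illustration only.
scores = {
    "Town A": {"diversity": 0.9, "schools": 0.4, "crime": 0.5,
               "price_growth": 0.6, "commute": 0.3},
    "Town B": {"diversity": 0.3, "schools": 0.9, "crime": 0.8,
               "price_growth": 0.5, "commute": 0.4},
}

diversity_first = recommend(
    scores, ["diversity", "schools", "crime", "price_growth", "commute"])
schools_first = recommend(
    scores, ["schools", "crime", "diversity", "price_growth", "commute"])
```

With these made-up numbers, putting diversity first ranks Town A on top, while putting schools first ranks Town B on top.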
Data pipeline
The data preparation pipeline for Where to Buy is extensive. With Forest's guidance, I put together an ETL pipeline that includes:
- Scraping PDF reports of real estate prices and transforming them into structured data
- Inferring stats on racial diversity and commute times using US Census data and the Google Directions API
- Linking crime stats from the Illinois State Police's Uniform Crime Reports to jurisdictional boundaries
- Scraping short descriptions of towns and neighborhoods from Wikipedia
- Scoring towns and neighborhoods across the five variables of interest using Principal Component Analysis
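As a rough illustration of the scoring step, one plausible use of Principal Component Analysis is to collapse several correlated indicators for a single variable into one score per place, using the first principal component. The sketch below assumes that approach. The function name `pca_score`, the NumPy-based implementation, and the example numbers are assumptions rather than the project's actual code.

```python
import numpy as np


def pca_score(indicators):
    """Project standardized indicators onto their first principal component.

    indicators: 2-D array-like, one row per place, one column per indicator.
    Assumes every column varies (nonzero standard deviation) and is already
    oriented so that higher is better (a crime rate would be negated first).
    """
    X = np.asarray(indicators, dtype=float)
    # Standardize so no indicator dominates because of its units.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # The rows of Vt are the principal directions, ordered by variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    pc1 = Vt[0]
    # The sign of a principal component is arbitrary; orient it so that
    # higher indicator values tend to produce higher scores.
    if pc1.sum() < 0:
        pc1 = -pc1
    return Z @ pc1


# Made-up example: three indicators for four places.
example = [
    [1.0, 2.0, 1.0],
    [2.0, 3.0, 2.0],
    [3.0, 5.0, 3.0],
    [4.0, 6.0, 5.0],
]
example_scores = pca_score(example)
```

The resulting scores are centered on zero, so they are easiest to read as a relative ranking of places rather than as absolute values.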
While parts of the pipeline remain closed source in order to respect the privacy policies of certain data sources, take a look at the data subdirectory on the GitHub page for a taste.