Creating Neighborhood Networks with Walkscore

I want to find somewhere super livable; aside from “livable” being subjective, it is often hard to find data even if you have specific criteria. The data at Walkscore is really interesting but I see two big problems with it.

  1. The data for an entire city factors in parts of the city I likely do not care about.
  2. The data for one neighborhood likely covers too small an area for the places I’d visit most days.

What I really want is a dataset that looks at “networks” of neighborhoods. Specifically, I want the aggregate data for a neighborhood and all other neighborhoods within a specific distance of it. To me, a network is an area which I’d “call home” and likely visit frequently. In my case, I consider anything within 3 miles of my neighborhood to be within my “network” since I can get there on foot, by bike, or by transit pretty easily.
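
To make the idea concrete, here is a minimal sketch of building a network and aggregating over it. It is not the actual script: the neighborhood names, scores, and distances are made up, the helper names (miles_between, build_network, network_score) are hypothetical, and using a plain average as the "aggregate" is an assumption.

```python
from statistics import mean

# Hypothetical walk scores per neighborhood (made-up numbers for illustration).
scores = {"A": 90, "B": 70, "C": 50, "D": 30}

# Hypothetical pairwise distances in miles; each pair is listed once.
distances = {
    ("A", "B"): 1.2,
    ("A", "C"): 2.8,
    ("A", "D"): 5.0,
    ("B", "C"): 2.0,
    ("B", "D"): 4.1,
    ("C", "D"): 2.5,
}


def miles_between(a, b):
    """Look up the distance between two neighborhoods in either order."""
    if a == b:
        return 0.0
    return distances.get((a, b), distances.get((b, a)))


def build_network(center, radius_miles=3.0):
    """Return the center plus every neighborhood within radius_miles of it."""
    return sorted(n for n in scores if miles_between(center, n) <= radius_miles)


def network_score(center, radius_miles=3.0):
    """Aggregate (here: average) the scores of everything in the network."""
    return mean(scores[n] for n in build_network(center, radius_miles))


print(build_network("A"), network_score("A"))
print(build_network("D"), network_score("D"))
```

Note that networks are centered on a neighborhood, so they overlap: C is in both A's network and D's network even though A and D are 5 miles apart.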

Instead of just complaining into the wind, I decided to explore the idea of creating neighborhood networks with Walkscore. To do this, I wrote a Python script which uses the Walkscore data to create aggregate datasets for neighborhood networks. You can find the script on my GitHub repository. The input for this script is a simple CSV file with the cities and states you want to look up; comments in the script document the details. To calculate distance I use a simple mathematical formula; you could also use Google if you have API keys.
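
For a sense of what a formula-based distance can look like, here is a sketch using the haversine (great-circle) formula. This is an assumption for illustration and not necessarily the formula the script uses; the function name haversine_miles and the Earth-radius constant are my own choices.

```python
import math

# Mean Earth radius in miles (an approximation; the Earth is not a perfect sphere).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# One degree of latitude is roughly 69 miles.
print(round(haversine_miles(0.0, 0.0, 1.0, 0.0), 1))
```

This measures distance "as the crow flies" between two points, so it will understate the real walking, biking, or transit distance.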

For anyone curious, I looked up a few thousand US cities and calculated all networks within a 3-mile distance. The dataset is available here in CSV format.

There are some gotchas with the script, without question. Here are some of the biggest ones:

  • If no data is found on Walkscore, a zero is used (bad-ish).
  • You’re beholden to how Walkscore defined neighborhoods.
  • My method of calculating distance is very approximate.
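
To show why the zero-for-missing gotcha matters, here is a small sketch comparing zero-filling with skipping missing values. The function name network_average, the skip_missing flag, and the use of None for "no data" are all hypothetical, and the numbers are made up.

```python
from statistics import mean


def network_average(scores, skip_missing=False):
    """Average a list of scores where None means no data was found.

    With skip_missing=False, missing values count as zero, which is the
    behavior the gotcha describes. With skip_missing=True they are left out.
    """
    if skip_missing:
        present = [s for s in scores if s is not None]
        return mean(present) if present else None
    return mean(0 if s is None else s for s in scores)


sample = [80, None, 60]  # made-up scores with one missing neighborhood
print(network_average(sample))                     # zero-filled average
print(network_average(sample, skip_missing=True))  # average of known scores only
```

With one missing value out of three, zero-filling drags the average from 70 down to about 47, so networks with spotty coverage look less walkable than they probably are.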

Next up – visualizing and playing with this data!