Last week I teased an independent data project I was working on.
Reminder: The prompt we were given was: “Something you miss from home”. I immediately thought of street food. Except, there’s not really a database out there (that I know of) of street food in India. It’s a largely unorganized industry made up of extremely small-scale businesses.
Here’s a link to last week’s post: Conceptualizing a data project … Part one of two!
I looked into Biryani restaurants in India, focusing on six cities (Delhi, Lucknow, Hyderabad, Mumbai, Kolkata and Bengaluru) to keep the scope of the project relatively manageable.
I first thought of scraping Google Maps to create my database but ran into trouble with the very first query. Here’s a blurb from the Google Maps API documentation:
What this means is that for each loop through the code (see below), Google gave me only 60 results. Each loop here meant one unique combination of latitude and longitude values.
One option was to divide each city I was interested in into quadrants, take the midpoint of each quadrant and scrape 240 results per city (four quadrants at 60 results each). That didn’t seem right. For one, I had no way of knowing what these 60 results were based on: proximity to the input latitude and longitude, user ratings or something else entirely.
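For anyone curious what the quadrant idea looks like in practice, here is a minimal Python sketch. It is not the code from this project: the function name `quadrant_midpoints` and the bounding box values are made up for illustration, and the only number taken from the post is the 60-result cap per query.

```python
from typing import List, Tuple


def quadrant_midpoints(
    south: float, west: float, north: float, east: float
) -> List[Tuple[float, float]]:
    """Split a bounding box into four quadrants and return each quadrant's center."""
    mid_lat = (south + north) / 2
    mid_lng = (west + east) / 2
    # Center of the lower and upper halves, then of the left and right halves.
    lat_centers = ((south + mid_lat) / 2, (mid_lat + north) / 2)
    lng_centers = ((west + mid_lng) / 2, (mid_lng + east) / 2)
    return [(lat, lng) for lat in lat_centers for lng in lng_centers]


RESULTS_PER_QUERY = 60  # the per-query cap described in the post

# Placeholder bounding box, NOT real city coordinates.
points = quadrant_midpoints(south=0.0, west=0.0, north=2.0, east=4.0)
max_results = len(points) * RESULTS_PER_QUERY

print(points)
print(max_results)
```

Each of the four points would be the input to one query, which is where the ceiling of 240 results per city comes from. The sketch says nothing about how those results get ranked, which was the bigger problem.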
So. I pivoted to scraping data from Zomato. For the project, I scraped user ratings and price ranges and created a near-perfect dataset. From my findings, Kolkata has the cheapest Biryani, with a portion for one costing ₹112 on average. At the other end, Delhi-NCR has some of the swankiest options, with restaurants selling a serving for one for ₹500. If you’d like to look through my dataset, I posted a searchable version on my website! Check it out here!
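A per-city average like the one above can be computed with a few lines of Python. This is a rough sketch, not the project's code: the field names `city` and `price_for_one` are my assumptions, and the rows are made-up placeholders rather than values from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up placeholder rows, NOT values from the actual dataset.
rows = [
    {"city": "City A", "price_for_one": 100},
    {"city": "City A", "price_for_one": 140},
    {"city": "City B", "price_for_one": 300},
    {"city": "City B", "price_for_one": 500},
]


def average_price_by_city(rows):
    """Group prices by city and return the mean price for each city."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["city"]].append(row["price_for_one"])
    return {city: mean(values) for city, values in prices.items()}


averages = average_price_by_city(rows)
# The city whose average price for one is lowest.
cheapest = min(averages, key=averages.get)

print(averages)
print(cheapest)
```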
I understand this dataset and approach both have caveats and this is in no way a comprehensive database of Biryani in India, but hey, it’s a start!
Here’s some other cool stuff from around the internet this week:
Sunlight as Infrastructure
Solar infrastructure has not yet figured in policy discussions because it is profoundly unprofitable. In other words, windows reverse the incentives driving every other form of investment in energy. Electric wires occupy public space while moving a private good.
Maple syrup meltdown: in a changing climate, what’s to become of Canada’s sweetest commodity?
With the earth warming, the age-old cycle that makes maple syrup may stop being so reliable. And that threat could be coming sooner rather than later.
Thank you for making it this far!
Here’s the little bit of ✨personal news✨ I teased above. I’m going to NICAR next month in Atlanta. It’s one of the biggest conferences in the data journalism world. AND. I’ll be giving a lightning talk too! It’s a pretty big deal and I’m STOKED!
That’s all for now. Cheers and see you next week! 🥂
Amazing stuff, loved the pivot to scraping when you hit the Google API wall—can be so frustrating!
Hi, just one question - did you use Python for this? (Can sound like a noob question) - Thinking of taking up a programming language to learn, I think such use cases are only possible because of Python. Did you use the same?