Putting together BBC, BuzzFeed, NPR, The Guardian and Reuters styleguides!
Hi friends!
I just got done with my first semester of grad school and I’m super stoked to share my first data project.
Here’s the story behind how I came up with the idea. Last year, The New York Times announced it would capitalize ‘Black’. They explained in detail, including the historical context, how they came to this decision. The change inspired me to imagine a one-stop shop for comparing styleguides from different news organizations. Then, for over a year, the idea just sat on a shelf in my brain.
Halfway through this semester, we began thinking about our final projects to showcase our newly acquired programming skills and well … eureka!
For the project, I scraped the styleguides from BBC, BuzzFeed, NPR, Reuters and The Guardian. Here’s the tech stack I used:
Python — BeautifulSoup and Pandas
Regex
PDFMiner
HTML
Excel
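To give a rough idea of how the scraping step can work, here is a minimal sketch (not the project's actual code). It assumes a styleguide page lays out its entries as `<dt>` headwords followed by `<dd>` guidance, which will not be true of every site. The sample HTML, the `parse_entries` helper and the "Example Guide" name are all made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample markup standing in for a downloaded styleguide page.
# Real guides are structured differently, so the tag names would need adjusting.
SAMPLE_HTML = """
<dl>
  <dt>adviser</dt><dd>Not advisor.</dd>
  <dt>OK</dt><dd>Not okay.</dd>
</dl>
"""

def parse_entries(html, source):
    """Return one dict per styleguide entry found in the HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    entries = []
    for term in soup.find_all("dt"):
        # The guidance is assumed to sit in the <dd> right after the headword.
        guidance = term.find_next_sibling("dd")
        if guidance is None:
            continue
        entries.append({
            "source": source,
            "term": term.get_text(strip=True),
            "guidance": guidance.get_text(" ", strip=True),
        })
    return entries

entries = parse_entries(SAMPLE_HTML, "Example Guide")
for entry in entries:
    print(entry["source"], "|", entry["term"], "|", entry["guidance"])
```

A list of dicts like this can be handed straight to `pandas.DataFrame(entries)` if you want to clean it up or export it to Excel.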
If you’d like to check out my code, you can find it here.
Here’s what it looks like:
The final final code where you input a word
I tried the word ‘told’ for this example and here’s the result
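For anyone curious how a lookup like this might work, here is a small sketch under a few assumptions: each guide has already been scraped into a dictionary of headword to guidance, and a word counts as a match when it appears as a whole word in a headword, ignoring case. The `GUIDES` data and the `look_up` function are placeholders invented for the example, not real styleguide text or the project's actual code.

```python
import re

# Placeholder data: guide names and guidance text are invented for illustration.
GUIDES = {
    "Guide A": {"told": "placeholder guidance from guide A", "OK": "placeholder"},
    "Guide B": {"Told": "placeholder guidance from guide B", "untold": "placeholder"},
    "Guide C": {"said": "placeholder"},
}

def look_up(word, guides):
    """Return {guide name: {headword: guidance}} for headwords containing the word."""
    # \b keeps 'told' from matching inside 'untold'; re.escape guards against
    # regex metacharacters in the user's input.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    results = {}
    for source, entries in guides.items():
        matches = {term: text for term, text in entries.items() if pattern.search(term)}
        if matches:
            results[source] = matches
    return results

results = look_up("told", GUIDES)
for source, matches in results.items():
    for term, text in matches.items():
        print(f"{source}: {term} -> {text}")
```

Matching on headwords only keeps the output short. Searching the guidance text as well would surface more results but also more noise.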
I want to work on making the final output more visually appealing and perhaps build a website where people can just hop on, input a word and see how different organizations define it.
If you’d like to play around with my code, please read this instructions document. If you have any questions, suggestions or thoughts, please reach out!
We’re almost at the end of 2021 and with all my heart, I hope the next year is brighter, healthier and happier for everyone. Cheers! 🥂