A collection of my writing and analysis as a Data Scientist at the Environmental Policy Innovation Center evaluating current data availability and investigating how historic funds are being utilized to expand access to clean drinking water.
Hasbro’s stock has fallen sharply in recent months after mainstream news outlets accused the company of killing its “golden goose” by overproducing and overcharging for Magic: the Gathering products. Fans of Magic: the Gathering, however, had expressed these same frustrations for months before journalists and news pundits caught on. Could investors have predicted this downturn sooner by analyzing the secondary market for Magic products?
As an MDI Scholar, working in partnership with Justice Innovation Lab (JIL), I helped investigate racial disparities in drug arrests. In prior research on a medium-sized jurisdiction, JIL found that Black men are approximately six times more likely to be arrested for drug-related offenses than White men. One argument for the disparity is that arrests simply reflect how often drug offenses are committed. For this to be true, Black men would have to use drugs at a far higher rate than White men. But what does the data say?
Using data available from three U.S. cities, this project deploys machine learning to predict the likelihood that a dog entering an animal shelter will be euthanized and to identify which characteristics most strongly determine a dog’s outcome. This information could help shelters allocate resources and reduce the total number of euthanasias.
One of the key findings suggests that, perhaps counterintuitively, the worse a shelter is doing at limiting euthanasias, the easier it is to identify ‘high-risk’ animals. This is an encouraging sign: the animal shelters needing the most help are the ones where the model is best suited to provide meaningful results.
This project uses data available as early as possible in a student’s collegiate career to estimate their likelihood of losing their state-funded scholarship. In addition to information available through a student’s application to the College, the project also uses custom metrics based on the student’s course schedule and academic performance throughout their first semester to improve the model’s results.
By the middle of the fall semester, the models can predict whether a student will retain their scholarship at the start of the following fall semester with approximately 75% accuracy.
In this collection of blog posts, I use R and Tableau to analyze the effectiveness and efficiency of the New Orleans Saints’ run offense and offensive line over the course of the Brees-Payton era.
How well can a team’s seasonal performance data predict their playoff performance? Using team aggregate data, this project compares the predictive capabilities of four different classification models and explores how the results inform our understanding of a team’s playoff chances.
Can the categorical descriptions and mechanics of popular board games illuminate meaningful trends or interesting connections between particular titles and their features? This project examines complexity, popularity, and each game’s theme and mechanical features to uncover trends in tabletop board games.
As the culmination of our Data Science undergraduate program, my team contributed to the documentation of scikit-learn and created instructive examples of the machine learning algorithms we worked on throughout the semester. As our deliverables, we created a digital poster and video presentation to share with our classmates during a remote presentation event.