Henry Johnson

About Me

Henry is a data scientist with a background in economics, mathematics, and quality assurance. He has spent most of his professional career writing SQL and Python scripts to test software in FinTech and e-commerce. In academia, he has focused on econometrics and data science, with an emphasis on international trade. In his free time, he plays music (piano and guitar), goes mountain biking, and tends to his 100+ houseplants.

LinkedIn
GitHub
CV

Projects

Phoenix Trail Usage Dashboard

This Tableau dashboard presents trail usage data from the City of Phoenix's Parks and Recreation Department. The data are collected by infrared trail counters placed on trails throughout the city, with daily observations beginning January 1, 2019. The dashboard also incorporates temperature data for the City of Phoenix from the National Centers for Environmental Information (pulled via synopticdata.com) to show the inverse relationship between temperature and trail usage.

Tableau Dashboard
Hiking Trails Counter Data
City of Phoenix Temperature Data


The Effect of the 2018 Tariffs on European Wine

This research estimates a vector autoregression (VAR) model of average wine prices across U.S. cities to assess the impact of the tariffs levied on the U.K., France, Germany, and Spain after they were enacted in October 2019. It uses impulse response functions to gauge how a one-unit impulse in the per-liter duty rate may affect the average wine price in the U.S. and the quantity of wine shipped to the U.S. by various exporters. It finds that a one-unit impulse in the duty rate levied against the bloc of affected countries reduces the quantity of wine imported from those countries, and that wine from the bloc is replaced by wine from the top three exporters outside the bloc.

Paper (PDF); (Link to Publication)
Jupyter Notebook
Data Collection GitHub Repo (new)
Original Analysis GitHub Repo
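
For readers unfamiliar with impulse response functions, the idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's model: it assumes a one-lag VAR with three variables (duty rate, price, quantity), traces a non-orthogonalized one-unit shock, and uses a made-up coefficient matrix `A` rather than estimates from the wine data. The function names are hypothetical.

```python
# Illustrative only: the coefficients below are invented, not estimated
# from the wine data. Assumed variable order: [duty_rate, price, quantity].

def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def impulse_response(coef, shock_index, horizons):
    """Trace a one-unit shock through a VAR(1): y_t = coef * y_{t-1} + e_t.

    Returns the response vectors for horizons 0..horizons.
    """
    response = [0.0] * len(coef)
    response[shock_index] = 1.0  # one-unit impulse at horizon 0
    path = [response]
    for _ in range(horizons):
        response = mat_vec(coef, response)  # propagate one period ahead
        path.append(response)
    return path

# Hypothetical coefficient matrix (rows = equations, columns = lagged variables)
A = [
    [0.5, 0.0, 0.0],   # duty rate depends only on its own lag
    [0.2, 0.3, 0.0],   # price rises with last period's duty rate
    [-0.4, 0.0, 0.6],  # quantity falls with last period's duty rate
]

irf = impulse_response(A, shock_index=0, horizons=4)
for h, (duty, price, quantity) in enumerate(irf):
    print(f"h={h}: duty={duty:+.3f} price={price:+.3f} quantity={quantity:+.3f}")
```

In practice a VAR would be estimated from data with a statistics package, and the lag order and shock identification would be chosen as part of the analysis; the sketch only shows how a shock to one variable propagates to the others over time.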


Classifying Movie Genres

Using a dataset of movie scripts scraped from IMSDB and movie genres detailed in MovieLens, I linked the movie titles and release years from the two sources using a fuzzy match and built a supervised multi-label classifier to predict the genres of a movie from its script. I used a variety of natural language processing techniques to extract features from the scripts, and then compared the performance of several models on the classification problem: KNN, one-vs-rest (OVR) Naïve Bayes, an OVR linear model trained with stochastic gradient descent, and OVR logistic regression on SVD-transformed text. I found that OVR Naïve Bayes tended to perform best.

Jupyter Notebook
GitHub Repo
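
The title-and-year linkage described above can be sketched with the standard library's `difflib`. This is a minimal illustration, not the notebook's actual code: the helper names, the 0.85 similarity threshold, the requirement that release years match exactly, and the sample records are all assumptions made for the example.

```python
from difflib import SequenceMatcher

def normalize(title):
    """Lowercase and drop punctuation so trivial differences don't lower the score."""
    kept = "".join(ch for ch in title.lower() if ch.isalnum() or ch.isspace())
    return kept.strip()

def best_match(title, year, candidates, threshold=0.85):
    """Return the (title, year) candidate most similar to `title`, or None.

    Assumption: only candidates with the same release year are considered,
    and a match must reach `threshold` similarity to be accepted.
    """
    best, best_score = None, threshold
    for cand_title, cand_year in candidates:
        if cand_year != year:
            continue
        score = SequenceMatcher(None, normalize(title), normalize(cand_title)).ratio()
        if score >= best_score:
            best, best_score = (cand_title, cand_year), score
    return best

# Sample records for illustration only
genre_source = [("Alien", 1979), ("Aliens", 1986), ("Blade Runner", 1982)]

print(best_match("Blade Runner.", 1982, genre_source))
print(best_match("Totally Different", 1979, genre_source))
```

Filtering on year before scoring keeps similarly named films from different years (such as sequels and remakes) from being linked to each other; a looser rule, such as allowing the year to differ by one, would be a reasonable variation.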


Predicting the Change in Social Security Filers

This study forecasts the direction of change in the number of Social Security filers using financial data from Yahoo Finance and the Federal Reserve Bank of St. Louis. The goal of the research is to predict that change from financial data alone. Plotting was done in Python and the analysis was performed in Stata. The research won second place in the graduate research poster competition at Boise State.

Paper (PDF)
Poster (PDF)