Abstract
I built a fully automatic tool [link] to predict market behavior and price indices of popular snack foods. It uses a simple linear regression model to predict the future price of snacks such as potato chips. I regressed prices against the date and found that the date has a statistically significant effect on price. This is consistent with the common view of inflation: in the short run, a market is more often in an inflationary state than a deflationary one, so prices tend to rise over time, often at fairly consistent rates.
Inspiration and goals
As of June 2025, a 12.5-ounce bag of chips costs $5.44. That's roughly how much you would have paid for an entire McDonald's Big Mac [link] during the 2010s. It's no secret that the prices of snack foods and cheap lunches have been rising rapidly. However, Consumer Price Index (CPI) data is often hard to find and easy to misunderstand. As an economics student, I know where to look for it. My friends and family, though, might have a harder time turning it into useful information, such as how much they'll pay for groceries next quarter.
To deal with this evident inaccessibility of information, I decided to create a tool that'd track the inflation of snacks and predict their price in the future. That way, anyone could click on a link, go to a web page, and trivially find out how much more everything will cost in the near future.
Theory
Empirical evidence shows that prices tend to rise at a rate called the inflation rate. I conjecture that, because consumers and firms reject large leaps in price, this rate can be approximated linearly. In particular, I think that we can infer the future inflated price using a linear model that correlates time and price.
I take as the null hypothesis that my conjecture is incorrect (that is, the slope is indistinguishable from 0 and price cannot be estimated from time), and as the alternative hypothesis that there is a statistically significant correlation between time and price.
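Written out, the conjecture and the two hypotheses look roughly like this. The symbols are notation I am introducing here for illustration: p_t is the price at time t, beta_0 and beta_1 are the intercept and slope, and epsilon_t is the error term.

```latex
% Linear model of price as a function of time
p_t = \beta_0 + \beta_1 t + \varepsilon_t

% Null hypothesis: time has no linear effect on price
H_0 : \beta_1 = 0

% Alternative hypothesis: time has a linear effect on price
H_1 : \beta_1 \neq 0
```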
Implementation
To implement this project, I first had to choose where to get my data. I chose the FRED API because it delivers decades-long CPI sub-indices in a clean JSON format, all free of charge. Most other datasets are paid and cover only short time spans, which makes them a poor basis for a robust regression model. A custom web scraper would also be free, but it would be brittle and lack historical data. With FRED, a free API key and a single HTTPS call give me up-to-date monthly prices for any given CPI item category all the way back to 1958 [link].
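The single HTTPS call described above can be sketched as follows. This is a minimal illustration, not my exact script: it assumes FRED's `series/observations` endpoint, the function names are made up for this example, and the series ID and API key are placeholders you would supply yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# FRED endpoint that returns the observations of one data series.
FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"


def build_observations_url(series_id: str, api_key: str) -> str:
    """Build the request URL for one series, asking for JSON output."""
    query = urlencode(
        {"series_id": series_id, "api_key": api_key, "file_type": "json"}
    )
    return f"{FRED_OBSERVATIONS_URL}?{query}"


def fetch_observations(series_id: str, api_key: str) -> list[dict]:
    """Download one series and return its list of observation records.

    Each record is expected to hold a "date" and a "value" string.
    """
    url = build_observations_url(series_id, api_key)
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload["observations"]
```

Calling `fetch_observations("<series id>", "<api key>")` would return the raw records that the next step cleans up.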
To handle the data processing, I created a lightweight Python 3.12 script. It starts by using a FRED API key to download JSON files containing all the historic price data. It then sorts the data into time and price columns, being careful to eliminate rows with malformed data. Dates are converted into UNIX timestamps [link] to allow me to treat time as an integer. Finally, I use scipy to regress price against time.
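A rough sketch of the cleaning and regression steps is below. The function names are hypothetical, and it assumes each record has a "date" in YYYY-MM-DD form and a "value" string, with unusable values (such as a lone ".") dropped. The regression itself uses `scipy.stats.linregress`.

```python
import math
from datetime import datetime, timezone

from scipy.stats import linregress


def to_unix_timestamp(date_text: str) -> int:
    """Convert a YYYY-MM-DD date string to a UNIX timestamp (UTC midnight)."""
    parsed = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def clean_observations(observations: list[dict]) -> list[tuple[int, float]]:
    """Return (timestamp, price) rows sorted by time, skipping malformed rows."""
    rows = []
    for obs in observations:
        try:
            timestamp = to_unix_timestamp(obs["date"])
            price = float(obs["value"])
        except (KeyError, TypeError, ValueError):
            continue  # missing field, bad date, or non-numeric value
        if math.isfinite(price):
            rows.append((timestamp, price))
    rows.sort()
    return rows


def fit_price_trend(rows: list[tuple[int, float]]):
    """Regress price against time; the result has slope, intercept, rvalue, pvalue."""
    timestamps = [t for t, _ in rows]
    prices = [p for _, p in rows]
    return linregress(timestamps, prices)
```

Because time is measured in seconds, the fitted slope is a price change per second; multiplying it by the number of seconds in a month or year gives a more readable rate.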
Of course, producing a linear model isn't enough; I must also interpret and apply it. A simple hypothesis-testing routine examines the correlation coefficient (r) and the p-value to determine the direction and strength of the correlation and whether it is statistically significant. If the correlation is strong, the routine notes this in the writeup. A helper function then generates a series of future timestamps, which are fed into the model to predict future prices. These predictions are also added to the writeup.
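The interpretation and prediction steps might look something like the sketch below. The function names, the 0.05 significance level, and the choice of the first day of each coming month as the prediction dates are all assumptions made for illustration, not values taken from my script.

```python
from datetime import datetime, timezone

ALPHA = 0.05  # assumed significance level for rejecting the null hypothesis


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Reject the null hypothesis (slope of 0) when p falls below alpha."""
    return p_value < alpha


def future_timestamps(start: datetime, months: int) -> list[int]:
    """UNIX timestamps for the first day of each of the next `months` months."""
    stamps = []
    year, month = start.year, start.month
    for _ in range(months):
        month += 1
        if month > 12:  # roll over into the next year
            month = 1
            year += 1
        first_day = datetime(year, month, 1, tzinfo=timezone.utc)
        stamps.append(int(first_day.timestamp()))
    return stamps


def predict_prices(slope: float, intercept: float, timestamps: list[int]) -> list[float]:
    """Apply the fitted line, price = intercept + slope * time, to each timestamp."""
    return [intercept + slope * t for t in timestamps]
```

Given a fitted `slope` and `intercept`, `predict_prices(slope, intercept, future_timestamps(datetime.now(timezone.utc), 3))` would give the projected values for the next three months.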
Takeaways
I tested the null hypothesis that there was no correlation between time and price against the alternative hypothesis that a correlation existed. I concluded that there is a statistically significant correlation. This makes sense, because it matches how inflation is generally assumed to work.
I also learned that APIs are often quite easy to use. I generally gather data by writing scripts to scrape the web. However, this is time-consuming and brittle. It's far easier to just use an API when one is made available.