All images, unless explicitly stated otherwise, were generated using DALL·E 3 via https://www.bing.com/images/create
United States Stock Market Prediction
Stock prediction has long been the Holy Grail of time series analysis because of its potentially lucrative applications.
However, a huge number of variables influence the price of even a single stock.
The market price of correlated stocks, holidays, overall stock market valuations, seasonal differences, international finance news, company financial news, company hype: the full list would be extremely long.
This makes predicting the price of a single company's stock extremely challenging, not to mention unreliable.
So what do we do?
I decided to take on what I consider the most difficult stock to predict, Nvidia, because of its high volatility and non-linear behavior over the last few months.
I tried the following methods:
- ARIMA
- Exponential smoothing
- LSTM
- Neural Networks
- Prophet
- XGBoost
- Moving Averages
- Random Forests
- Support Vector Regression
- Logistic Regression
Out of all these, I ensembled the two most promising ones.
I made several predictions of stock prices till the end of 2024.
Forecasting further than six months out is risky because of the huge number of variables that need to be accounted for.
Still, out of everything listed above, I ended up with a script that makes reasonably accurate predictions for a single stock.
I’m sharing the code below.
Insert your favorite stock ticker to get a prediction through January 2025, the script's default forecast end date.
Seriously!
Editor’s note: This article should be relied upon for informational purposes only. Stocks can be speculative, complex, and involve high risks. This can mean high price volatility and potential loss of your initial investment. You should consider your financial situation, investment purposes, and consult with a financial advisor before making any investment decisions. HackerNoon and its distribution partners disclaim any liability for losses or damages resulting from the use of this information and do not endorse or guarantee the accuracy, reliability, or completeness of the information within. #DYOR
The Methods I Chose and Why I Chose Them
The two methods I chose after trying all of the above were Prophet and moving averages.
There’s not much to discuss about moving averages: you simply predict the next price as the average of the prices over the last chosen window.
A moving average tracks existing data well, but on its own it cannot extrapolate a trend, so beyond the last data point its forecast is a flat line. Combined with Prophet, however, it can yield powerful results. What is Prophet, you ask? The next section explains.
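To make the moving-average idea concrete, here is a minimal sketch in plain Python. The function name `moving_average_forecast` and the sample prices are invented for illustration; they are not part of the script shared later in this article.

```python
from statistics import mean


def moving_average_forecast(prices, window, horizon):
    """Forecast `horizon` future values as the mean of the last `window` prices.

    A plain moving average has no trend term, so every future step
    receives the same value (a flat line).
    """
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    last_average = mean(prices[-window:])
    return [last_average] * horizon


# Hypothetical closing prices, invented for illustration only
closes = [100.0, 102.0, 101.0, 105.0, 107.0]
forecast = moving_average_forecast(closes, window=3, horizon=2)
print(forecast)  # two identical values: the mean of 101, 105 and 107
```

The flat output is exactly why a moving average alone is a weak forecaster and why it is paired with a trend model here.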
Prophet - Meta’s All-in-One Swiss Army Knife for Time Series Prediction
Below is a summary of the research paper published about Prophet.
Prophet is an open-source time series forecasting tool developed by Facebook's Core Data Science team. According to the sources, the key points about Prophet are:
- Additive Model with Non-Linear Trends: Prophet is based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects.
- Robust to Outliers and Missing Data: Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
- Fully Automatic Forecasting: Prophet can provide reasonable forecasts on messy data with no manual effort. It is designed to be robust to outliers, missing data, and dramatic changes in the time series.
- Tunable Forecasts: The Prophet procedure includes many possibilities for users to tweak and adjust forecasts by using human-interpretable parameters to incorporate domain knowledge.
- Accurate and Fast: Prophet is used extensively across Facebook for producing reliable forecasts, outperforming other approaches in most cases. The models are fit in Stan, allowing forecasts to be generated quickly.
More details available in the References.
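For reference, the additive model described in the summary above is written in the Prophet paper ("Forecasting at scale") as a sum of components. The symbols follow that paper:

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

Here $g(t)$ is the trend, $s(t)$ the periodic (seasonal) component, $h(t)$ the holiday effects, and $\varepsilon_t$ the error term that the model does not explain.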
Advantages of Prophet
- Can handle non-linear trends and complex patterns in the data using an additive model with piecewise linear trends.
- Robust to outliers and missing data, making it suitable for messy real-world time series.
- Fully automatic forecasting with reasonable results even without much manual effort.
- Tunable forecasts that allow incorporating domain knowledge through human-interpretable parameters.
- Available in both R and Python, sharing the same underlying Stan code for fitting models.
- Relatively computationally efficient compared to other time series methods.
- Provides an interpretable decomposition of the forecast into trend, seasonality, holiday and extra regressors components.
Disadvantages of Prophet
- Subpar predictive performance compared to classical time series models in some cases.
- Only appropriate for univariate time series, not designed for forecasting multiple correlated time series jointly.
- Extra covariates must be supplied explicitly as additional regressors (via add_regressor), and their future values must be known for the whole forecast period.
- Trend component tends to explain the majority of the prediction (around 90% in one case study), making it essential to get the trend right.
- Confidence intervals can be quite large, especially for long-term forecasts.
- Requires some tuning of parameters like changepoint_prior_scale to get the best results.
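Since the trend carries most of the prediction, it helps to see what changepoint_prior_scale controls. As described in the Prophet paper, the piecewise linear trend with changepoints at times $s_j$ is:

```latex
g(t) = \left(k + \mathbf{a}(t)^{\top} \boldsymbol{\delta}\right) t
     + \left(m + \mathbf{a}(t)^{\top} \boldsymbol{\gamma}\right),
\qquad
\delta_j \sim \mathrm{Laplace}(0, \tau),
\qquad
\gamma_j = -s_j \delta_j
```

Here $k$ is the base growth rate, $m$ the offset, $\delta_j$ the rate change at changepoint $s_j$, and $a_j(t)$ is 1 once $t \ge s_j$ and 0 before. To my understanding, changepoint_prior_scale sets $\tau$: a larger value allows bigger rate changes (a more flexible trend), while a smaller value pulls the rate changes toward zero (a stiffer trend).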
The Source Code for the Stock Estimator That You Can Run on Your Local System
You will need an Alpha Vantage API key. The instructions to get that are available here:
https://www.alphavantage.co/support/#api-key
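Once you have a key, one option is to keep it out of the source file by reading it from an environment variable. The sketch below is only a suggestion: the variable name `ALPHA_VANTAGE_API_KEY` and the helper names `load_api_key` and `build_daily_url` are assumptions for illustration, while the query parameters are the ones the script in this article uses.

```python
import os
from urllib.parse import urlencode


def load_api_key(env_var="ALPHA_VANTAGE_API_KEY"):
    """Return the API key stored in the environment variable `env_var`."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def build_daily_url(symbol, api_key):
    """Build the daily time series query URL used by the script."""
    params = {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "outputsize": "full",
        "apikey": api_key,
    }
    return "https://www.alphavantage.co/query?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder so the demo runs; export your real key in the shell instead
    os.environ.setdefault("ALPHA_VANTAGE_API_KEY", "XXXXXXXXXXXXXXXXXXXX")
    print(build_daily_url("NVDA", load_api_key()))
```

With this in place, the hard-coded key in the script could be replaced by a call to `load_api_key()`.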
This is the requirements.txt file:
yfinance==0.2.18
pandas==1.5.3
numpy==1.23.5
matplotlib==3.7.1
prophet==1.1.2
statsmodels==0.13.5
scikit-learn==1.2.2
tensorflow==2.12.0
xgboost==1.7.5
requests
This is the source code. In this first section, replace the symbol with your desired ticker:
# Change this symbol variable to get predictions for any stock ticker you need
symbol = "NVDA"
end_date = "2024-04-30"
# You may want to experiment with the forecast end date, but normally
# the longer the forecast horizon, the less accurate the prediction.
forecast_end = "2025-01-31"
Replace the Alpha Vantage API Key with your own in this code block below:
# Replace with your Alpha Vantage API key
ALPHA_VANTAGE_API_KEY = "XXXXXXXXXXXXXXXXXXXX"
import requests
import pandas as pd
import numpy as np
from datetime import datetime
from prophet import Prophet
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
# Replace with your Alpha Vantage API key
ALPHA_VANTAGE_API_KEY = "XXXXXXXXXXXXXXXXXXXX"

def get_stock_data(symbol, end_date):
    """Download daily closing prices for `symbol` up to and including `end_date`."""
    url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol={symbol}&outputsize=full&apikey={ALPHA_VANTAGE_API_KEY}"
    response = requests.get(url, timeout=30)
    data = response.json()
    if "Time Series (Daily)" not in data:
        # The API returns a JSON message instead of data when the request fails
        raise ValueError(f"Error fetching data from Alpha Vantage API: {data}")
    df = pd.DataFrame(data["Time Series (Daily)"]).T
    df.index = pd.to_datetime(df.index)
    df = df.sort_index()
    df = df.astype(float)
    # Filter data up to end_date
    df = df[df.index <= end_date]
    return df[["4. close"]].rename(columns={"4. close": "y"})
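
def prepare_data_for_prophet(df):
    # Prophet expects a 'ds' (date) column and a 'y' (value) column
    df_prophet = df.reset_index()
    df_prophet.columns = ['ds', 'y']
    return df_prophet

def moving_average_model(data, window_size):
    # Rolling mean; the first window_size - 1 values are NaN
    return data.rolling(window=window_size).mean()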

def train_and_predict(df, periods):
    # Prophet model (trend only: all seasonal components are switched off)
    model_prophet = Prophet(
        yearly_seasonality=False,
        weekly_seasonality=False,
        daily_seasonality=False,
        changepoint_prior_scale=0.05
    )
    model_prophet.fit(df)
    future_prophet = model_prophet.make_future_dataframe(periods=periods)
    forecast_prophet = model_prophet.predict(future_prophet)
    # Moving average models (only the 200-day average feeds the ensemble)
    ma_short = moving_average_model(df['y'], window_size=50)
    ma_long = moving_average_model(df['y'], window_size=200)
    # Combine forecasts: the last 200-day average is held flat over the forecast period
    forecast = forecast_prophet.copy()
    forecast['yhat_ma'] = np.concatenate([ma_long.values, np.full(periods, ma_long.iloc[-1])])
    forecast['yhat_ensemble'] = (forecast['yhat'] + forecast['yhat_ma']) / 2
    return model_prophet, forecast

def plot_predictions(historical_data, forecast, stock_name):
    plt.figure(figsize=(15, 8))
    # Plot historical data
    plt.plot(historical_data.index, historical_data['y'], label='Historical', color='blue')
    # Plot Prophet forecast
    forecast_start = forecast[forecast['ds'] >= '2020-01-01']
    plt.plot(forecast_start['ds'], forecast_start['yhat'], label='Prophet Forecast', color='red')
    # Plot Moving Average forecast
    plt.plot(forecast_start['ds'], forecast_start['yhat_ma'], label='MA Forecast', color='green')
    # Plot Ensemble forecast
    plt.plot(forecast_start['ds'], forecast_start['yhat_ensemble'], label='Ensemble Forecast', color='purple')
    # Shade Prophet's uncertainty interval
    plt.fill_between(forecast_start['ds'], forecast_start['yhat_lower'], forecast_start['yhat_upper'],
                     color='red', alpha=0.2)
    plt.title(f'{stock_name} Stock Price: Historical and Forecast (2020-2025)')
    plt.xlabel('Date')
    plt.ylabel('Close Price')
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()

def calculate_direction(series):
    # 1 if the value rose compared with the previous row, otherwise 0
    return (series.diff() > 0).astype(int)

def evaluate_model(actual, predicted):
    actual_direction = calculate_direction(actual)
    predicted_direction = calculate_direction(predicted)
    # Skip the first row, which has no previous value to compare against
    return classification_report(actual_direction[1:], predicted_direction[1:])

def main():
    # Change this symbol to get predictions for any stock ticker you need
    symbol = "NVDA"
    end_date = "2024-04-30"
    # You may want to experiment with the forecast end date, but normally
    # the longer the forecast horizon, the less accurate the prediction.
    forecast_end = "2025-01-31"
    try:
        # Fetch historical data
        stock_data = get_stock_data(symbol, end_date)
        df_prophet = prepare_data_for_prophet(stock_data)
        # Train model and make predictions
        periods = (datetime.strptime(forecast_end, "%Y-%m-%d") - datetime.strptime(end_date, "%Y-%m-%d")).days
        model, forecast = train_and_predict(df_prophet, periods)
        # Print data info
        print(stock_data.head())
        print(stock_data.tail())
        print(f"\nTotal data points: {len(stock_data)}")
        print(f"Date range: {stock_data.index.min()} to {stock_data.index.max()}")
        # Plot predictions from 2020 to 2025
        plot_predictions(stock_data['2020':], forecast, symbol)
        # Print predictions for 2025
        predictions_2025 = forecast[forecast['ds'].dt.year == 2025]
        print("\nPredicted prices for 2025:")
        print(predictions_2025.groupby(predictions_2025['ds'].dt.to_period('Y')).agg(
            {'yhat_ensemble': ['mean', 'min', 'max']}))
        # Evaluate model (in-sample: fitted values are compared with the training prices)
        actual = stock_data['y']
        predicted = forecast.set_index('ds')['yhat_ensemble'].loc[actual.index]
        print("\nClassification Report:")
        print(evaluate_model(actual, predicted))
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()
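The two numeric steps at the heart of the script, the forecast horizon and the ensemble average, can be sketched in plain Python. This is a toy illustration only: the helper names are hypothetical, and `model_yhat` holds made-up stand-in numbers, not real Prophet output.

```python
from datetime import date


def forecast_horizon_days(end_date, forecast_end):
    """Calendar days between the last training date and the forecast end date."""
    return (date.fromisoformat(forecast_end) - date.fromisoformat(end_date)).days


def ensemble(model_forecast, last_moving_average):
    """Average each model forecast with a flat moving-average forecast."""
    return [(yhat + last_moving_average) / 2 for yhat in model_forecast]


# Same dates as the script's defaults
periods = forecast_horizon_days("2024-04-30", "2025-01-31")
print(periods)  # 276

# Stand-in values, NOT real Prophet output
model_yhat = [110.0, 112.0, 114.0]
print(ensemble(model_yhat, last_moving_average=100.0))  # [105.0, 106.0, 107.0]
```

Because the moving-average half is constant over the horizon, the ensemble keeps the direction of the trend forecast but halves its slope.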
Program Console Output (NVDA):
... head() and tail() output omitted for brevity
Total data points: 6163
Date range: 1999-11-01 00:00:00 to 2024-04-30 00:00:00

Predicted prices for 2025:
     yhat_ensemble
              mean         min         max
ds
2025    534.933425  534.059932  535.806918

Classification Report:
              precision    recall  f1-score   support

           0       0.51      0.46      0.48      2977
           1       0.54      0.59      0.56      3185

    accuracy                           0.53      6162
   macro avg       0.52      0.52      0.52      6162
weighted avg       0.52      0.53      0.52      6162
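The classification report scores direction, not price: each day is labelled 1 if the price rose compared with the previous day and 0 otherwise, and the predicted labels are compared with the actual ones. Here is a minimal sketch of that logic without pandas or scikit-learn. The price lists are invented for illustration, and the helper names are not part of the script.

```python
def direction_labels(prices):
    """1 if the price rose versus the previous day, else 0 (one label per day after the first)."""
    return [1 if curr > prev else 0 for prev, curr in zip(prices, prices[1:])]


def accuracy(actual, predicted):
    """Share of days on which the predicted direction matches the actual one."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def precision_recall(actual, predicted, label=1):
    """Precision and recall for one class, as reported per row in the table."""
    true_pos = sum(a == label and p == label for a, p in zip(actual, predicted))
    predicted_pos = sum(p == label for p in predicted)
    actual_pos = sum(a == label for a in actual)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Made-up prices for illustration only
actual_prices = [10.0, 11.0, 10.5, 12.0, 12.5]
predicted_prices = [10.0, 10.4, 10.6, 11.0, 10.9]

actual_dir = direction_labels(actual_prices)        # [1, 0, 1, 1]
predicted_dir = direction_labels(predicted_prices)  # [1, 1, 1, 0]
print(accuracy(actual_dir, predicted_dir))          # 0.5
print(precision_recall(actual_dir, predicted_dir))  # precision 2/3, recall 2/3
```

Note that a prediction can score well on direction while being far off on price, and the reverse is also possible.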
The Graphical Plot
This is what you get if you change the symbol to AAPL (with your own Alpha Vantage API key, of course).
Program Console Output (AAPL):
Total data points: 6163
Date range: 1999-11-01 00:00:00 to 2024-04-30 00:00:00

Predicted prices for 2025:
     yhat_ensemble
              mean         min         max
ds
2025    159.003104  158.695001  159.311208

Classification Report:
              precision    recall  f1-score   support

           0       0.52      0.39      0.45      2965
           1       0.54      0.66      0.59      3197

    accuracy                           0.53      6162
   macro avg       0.53      0.53      0.52      6162
weighted avg       0.53      0.53      0.52      6162
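One way to put the 0.53 accuracy into context is to compare it with a naive baseline that always predicts the more frequent class (here, "up"). The support counts below are taken from the two reports above; the baseline comparison itself is an addition for context and is not part of the script.

```python
def majority_baseline(support_by_class):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(support_by_class.values())
    return max(support_by_class.values()) / total


# Support counts copied from the NVDA and AAPL classification reports
nvda_baseline = majority_baseline({0: 2977, 1: 3185})
aapl_baseline = majority_baseline({0: 2965, 1: 3197})

print(f"NVDA baseline: {nvda_baseline:.3f} vs reported accuracy 0.53")  # 0.517
print(f"AAPL baseline: {aapl_baseline:.3f} vs reported accuracy 0.53")  # 0.519
```

Both reported accuracies sit only a point or so above this baseline, so the directional signal should be treated as weak.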
The Graphical Plot
Have fun playing with the script.
Install the dependencies from requirements.txt with the following command, get a free Alpha Vantage API key, and you are good to go.
pip install -r requirements.txt
Future Directions
These are really exciting times. Facebook (Meta) uses Prophet throughout its ecosystem, and it works well, especially where there is seasonality.
Meta has also announced a newer time series predictor called NeuralProphet, which is presented as a major improvement on the Prophet algorithm.
The library has been released but is still in beta. A stable release is expected soon.
Expect another blog when that happens!
The world is changing fast.
This was not a super-complex prediction system – it just used a super powerful prediction tool.
You can insert any stock ticker symbol you want at the point marked in the source code and you will get a prediction of the price for Jan 2025.
MSFT, anyone?
References:
- Forecasting at scale (PeerJ Preprints)
- Prophet | Forecasting at scale. (facebook.github.io)
- Time Series Analysis using Facebook Prophet - GeeksforGeeks
- Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice. OTexts. ISBN-10: 0987507133, ISBN-13: 978-0987507136. Available online (fully) at Forecasting: Principles and Practice (3rd ed) (otexts.com)
- When to use Facebook Prophet - Crunching the Data
- Is Facebook Prophet suited for doing good predictions in a real-world project? - Artefact
- NeuralProphet: A Time-Series Modeling Library based on Neural-Networks | by Essi Alizadeh | Towards Data Science
- NeuralProphet
- https://ai.meta.com/blog/neuralprophet-the-neural-evolution-of-facebooks-prophet