AI‑powered models for predicting U.S. stock prices (2025 update)

Artificial intelligence (AI) has become integral to quantitative investing in the United States. Deep‑learning architectures can process huge volumes of market data and extract patterns that traditional statistical models miss. However, stock markets are complex systems influenced by economic indicators, investor psychology and unexpected events. A 2025 survey of multimodal stock‑forecasting frameworks notes that the U.S. market is driven by historical prices, trading volumes, economic indicators, global events and external data such as news and social media, which makes accurate forecasting inherently challenging (mdpi.com). Even small errors in a forecast can lead to material losses, so researchers are looking beyond linear methods to models that capture hidden relationships (mdpi.com).

Evolution from classic time‑series models to deep learning

Early financial forecasting used linear models such as ARIMA and GARCH. These methods are easy to interpret but assume stationarity and linear relationships, which limits their usefulness when data exhibit nonlinear dynamics (arxiv.org). A recent study comparing ARIMA with Long Short‑Term Memory (LSTM) networks on the S&P 500 index demonstrates why investors are adopting deep learning. Using 10 years of price data, ARIMA achieved a mean absolute error (MAE) of 462.1, a root mean squared error (RMSE) of 614 and an accuracy of ≈89.8 %, whereas an LSTM with optimised features reduced the MAE to 175.9 and the RMSE to 207.34 and increased accuracy to 96.41 % (arxiv.org). LSTMs maintain an internal memory and use forget, input and output gates to decide which information to retain or discard (simplilearn.com), which helps them model both short‑ and long‑term dependencies. Variants such as BiLSTM, GRU and Attention‑LSTM further improve performance (arxiv.org).
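To make the gating idea concrete, here is a minimal NumPy sketch of a single LSTM time step. It is an illustration only, not the model from the cited study: the function name lstm_step, the stacking order of the gate weights and the random toy weights are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative).

    W: (4*H, D), U: (4*H, H), b: (4*H,). The four blocks are stacked in the
    order forget, input, candidate, output (an assumption of this sketch).
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])            # forget gate: how much old memory to keep
    i = sigmoid(z[H:2 * H])        # input gate: how much new information to write
    g = np.tanh(z[2 * H:3 * H])    # candidate values for the cell state
    o = sigmoid(z[3 * H:4 * H])    # output gate: how much memory to expose
    c_t = f * c_prev + i * g       # updated cell state (internal memory)
    h_t = o * np.tanh(c_t)         # updated hidden state
    return h_t, c_t

# Toy run: 5 time steps, 3 input features, 4 hidden units, random weights.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

In practice the weights are learned by backpropagation through time, and a deep‑learning library would normally supply the layer rather than hand‑written code like this.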

LSTM‑based hybrids

While LSTMs capture temporal patterns, they overlook relationships between stocks. To address this, researchers combine LSTMs with Graph Neural Networks (GNNs), which model interactions among companies. A 2025 hybrid LSTM‑GNN model constructs a stock network using Pearson correlation and association analysis, allowing the GNN to learn how stocks influence each other (arxiv.org). When tested on historical U.S. market data under an expanding‑window validation scheme, the hybrid achieved a mean squared error (MSE) of 0.00144, a 10.6 % reduction compared with a standalone LSTM (MSE = 0.00161) (arxiv.org). The hybrid outperformed linear regression, convolutional neural networks (CNNs) and dense neural networks, underscoring the value of capturing both temporal and relational features (arxiv.org).
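As a rough illustration of the graph‑construction step, the sketch below builds a stock network from a Pearson correlation matrix of returns. It covers only the correlation part; the association analysis used in the study is not reproduced. The 0.5 threshold, the function name correlation_graph and the synthetic return series are assumptions made for the example.

```python
import numpy as np

def correlation_graph(returns, threshold=0.5):
    """Build an undirected stock graph from daily returns.

    returns: array of shape (T, N), one column per stock.
    An edge links two stocks when |Pearson correlation| >= threshold
    (the threshold value is an assumption of this sketch).
    """
    corr = np.corrcoef(returns, rowvar=False)      # (N, N) Pearson matrix
    adj = (np.abs(corr) >= threshold).astype(int)  # binary adjacency
    np.fill_diagonal(adj, 0)                       # no self-loops
    return corr, adj

# Synthetic example: stocks 0 and 1 share a common factor, stock 2 does not.
rng = np.random.default_rng(0)
common = rng.normal(size=500)
returns = np.column_stack([
    common + 0.1 * rng.normal(size=500),
    common + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
corr, adj = correlation_graph(returns)
print(adj)
```

The resulting adjacency matrix is the kind of input a GNN layer would consume alongside the per‑stock sequence features produced by the LSTM.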

Transformer‑based models

Transformers, originally designed for natural‑language tasks, have become popular in finance because they use self‑attention to focus on relevant parts of the input sequence. A 2025 study evaluated five transformer architectures (encoder‑only, decoder‑only, vanilla encoder–decoder, vanilla without embeddings and a ProbSparse variant) on the S&P 500 index, using daily closing prices from May 2015 to May 2024 (arxiv.org). Sliding‑window inputs of 5, 10 and 15 days were used to predict 1‑, 5‑ and 10‑day returns (arxiv.org). The authors found that transformers generally outperformed LSTM, Temporal Convolutional Network (TCN), Support Vector Regression (SVR) and Random Forest models, and that a decoder‑only transformer delivered the best results across all horizons (arxiv.org). Conversely, the ProbSparse version performed worst, highlighting that architectural choices matter (arxiv.org).
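The sliding‑window setup can be sketched as follows. This is one plausible way to frame the data, not the study's exact pipeline: using past daily returns as the input features and the function name make_windows are assumptions of this example.

```python
import numpy as np

def make_windows(prices, lookback, horizon):
    """Turn a closing-price series into (input window, future return) pairs.

    X[i] holds the `lookback` most recent daily returns up to day t;
    y[i] is the return from day t to day t + horizon.
    """
    prices = np.asarray(prices, dtype=float)
    rets = prices[1:] / prices[:-1] - 1.0   # rets[k]: return from day k to k+1
    X, y = [], []
    for t in range(lookback, len(prices) - horizon):
        X.append(rets[t - lookback:t])      # only information available at day t
        y.append(prices[t + horizon] / prices[t] - 1.0)
    return np.array(X), np.array(y)

# Toy price series; the window and horizon sizes mirror those named in the text.
prices = 100.0 + np.arange(30)
for lookback in (5, 10, 15):
    for horizon in (1, 5, 10):
        X, y = make_windows(prices, lookback, horizon)
        print(lookback, horizon, X.shape, y.shape)
```

Keeping the target strictly after the last day in the window avoids look‑ahead leakage, which matters more for a fair comparison than the choice of architecture.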

Large‑language‑model (LLM) approaches

Sentiment‑driven prediction

LLMs such as GPT‑4 and FinBERT can interpret news, analyst reports or social‑media posts to extract sentiment signals. A December 2024 study compared FinBERT, GPT‑4 and a baseline logistic regression model using Nigerian Stock Exchange news and all‑share index data. Surprisingly, the simple logistic regression achieved the highest accuracy (≈81.83 %) and ROC AUC (≈89.76 %), while FinBERT and GPT‑4 were resource‑intensive and delivered only moderate performance (arxiv.org). The authors emphasised that although LLMs provide sophisticated text analysis, they may not always outperform simpler models, especially when data are limited or well‑labelled (arxiv.org).

FinGPT and dissemination‑aware sentiment

The AI4Finance community is developing specialised financial LLMs such as FinGPT. Researchers have noted that existing LLM‑based sentiment models often focus solely on the content of news articles. The FinGPT framework introduces dissemination‑aware and context‑enriched prompts. It clusters recent company news to measure how widely a story spreads and includes this context in the input prompt. Experimental results reported at an AAAI 2025 workshop show that dissemination‑aware tuning improved short‑term stock movement prediction accuracy by about 8 % compared with previous LLM‑based methods (arxiv.org).
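The idea of measuring how widely a story spreads can be illustrated with a very small sketch. This is not the FinGPT implementation: the Jaccard word‑overlap similarity, the greedy clustering, the 0.5 threshold, the prompt wording and the fictional "Acme Corp" headlines are all assumptions chosen to keep the example self‑contained.

```python
import re

def tokens(text):
    """Lower-cased word set of a headline."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_headlines(headlines, threshold=0.5):
    """Greedy clustering: a headline joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for idx, headline in enumerate(headlines):
        for cluster in clusters:
            if jaccard(tokens(headline), tokens(headlines[cluster[0]])) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters

def build_prompt(headline, breadth):
    """Hypothetical prompt that adds dissemination breadth as context."""
    return (f"News: {headline}\n"
            f"Coverage: reported by {breadth} source(s) in the recent window\n"
            "Question: will the stock move up or down in the short term?")

headlines = [
    "Acme Corp beats quarterly earnings estimates",
    "Acme Corp beats quarterly earnings estimates, shares rise",
    "Acme Corp names new chief financial officer",
]
clusters = cluster_headlines(headlines)
prompts = [build_prompt(headlines[c[0]], len(c)) for c in clusters]
print(prompts[0])
```

A production system would more likely cluster on text embeddings than on word overlap, but the principle is the same: the cluster size becomes a feature that tells the model how far a story has travelled.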

FinBERT‑LSTM hybrids

Researchers are also blending language models with deep time‑series networks. FinBERT can convert financial news into sentiment scores; these scores are then fed into an LSTM to forecast price movements. A 2024 survey in Mathematics notes that incorporating FinBERT‑generated sentiment into LSTM models improves the prediction of short‑term price changes (mdpi.com). Another study demonstrates that FinBERT‑LSTM models outperform pure LSTM or deep neural networks when predicting stock indices (mdpi.com).
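The hand‑off from sentiment model to sequence model can be sketched like this. The sentiment scores below are made‑up placeholders standing in for FinBERT output, and the daily averaging rule, the neutral fill value of 0.0 and the function name daily_sentiment are assumptions of the example rather than details from the cited studies.

```python
import numpy as np
from collections import defaultdict

def daily_sentiment(articles, dates):
    """Average per-article sentiment scores by day.

    articles: list of (date, score) pairs, score in [-1, 1].
    dates: ordered trading dates. Days without news get 0.0 (neutral).
    """
    buckets = defaultdict(list)
    for day, score in articles:
        buckets[day].append(score)
    return np.array([float(np.mean(buckets[d])) if d in buckets else 0.0
                     for d in dates])

dates = ["2024-01-02", "2024-01-03", "2024-01-04"]
returns = np.array([0.010, -0.020, 0.005])   # placeholder daily returns
articles = [                                 # placeholder sentiment scores
    ("2024-01-02", 0.8),
    ("2024-01-02", 0.4),
    ("2024-01-04", -0.5),
]
sentiment = daily_sentiment(articles, dates)

# One row per day, two channels: price return and news sentiment.
features = np.column_stack([returns, sentiment])
print(features)
```

The resulting matrix would then be cut into fixed‑length windows and passed to the LSTM, so the network sees price and sentiment side by side at every time step.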

LLM‑augmented Linear Transformer–CNN framework

An innovative 2024 framework combines an LLM, Linear Transformer, and CNN to forecast stock prices. Using only historical S&P 500 data (2022‑2023), the model first employs ChatGPT4o to generate textual technical analyses (moving averages, relative strength index and Bollinger bands) from numerical data (mdpi.com). FinBERT transforms these text summaries into embeddings, which are merged with features extracted by a Linear Transformer (temporal patterns) and a CNN (visual patterns from candlestick charts). Experiments show that this multimodal approach significantly improves prediction accuracy and that integrating LLM‑generated insights helps the model capture temporal, spatial and contextual dependencies (mdpi.com). The study highlights three key contributions: (i) designing a hybrid Linear Transformer–CNN model enhanced by LLM features; (ii) using prompt engineering with ChatGPT4o to derive high‑quality technical indicators; and (iii) demonstrating improved performance on the 2022‑2023 S&P 500 dataset (mdpi.com).
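For readers unfamiliar with the three indicators named above, here is a minimal NumPy sketch of how they are commonly computed directly from prices. It does not reproduce the framework's LLM prompting step. The window lengths (20 and 14) are conventional defaults, and the RSI below uses plain averages of gains and losses rather than Wilder's smoothing; both choices are assumptions of this example.

```python
import numpy as np

def sma(x, n):
    """Simple moving average; the first n-1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[n - 1:] = (c[n:] - c[:-n]) / n
    return out

def bollinger(x, n=20, k=2.0):
    """Lower, middle and upper Bollinger bands (mean +/- k rolling std)."""
    x = np.asarray(x, dtype=float)
    mid = sma(x, n)
    std = np.full(len(x), np.nan)
    for t in range(n - 1, len(x)):
        std[t] = x[t - n + 1:t + 1].std()
    return mid - k * std, mid, mid + k * std

def rsi(x, n=14):
    """Relative strength index from average gains and losses over n days."""
    x = np.asarray(x, dtype=float)
    delta = np.diff(x)
    gains = np.where(delta > 0, delta, 0.0)
    losses = np.where(delta < 0, -delta, 0.0)
    out = np.full(len(x), np.nan)
    for t in range(n, len(x)):
        g, l = gains[t - n:t].mean(), losses[t - n:t].mean()
        out[t] = 100.0 if l == 0 else 100.0 - 100.0 / (1.0 + g / l)
    return out

prices = np.arange(1.0, 31.0)           # toy, steadily rising series
lower, mid, upper = bollinger(prices)
print(sma(prices, 3)[:5], rsi(prices)[14], mid[19])
```

On a steadily rising series the RSI saturates at 100, which is a quick sanity check that the gain and loss bookkeeping is wired correctly.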

StockTime – an LLM tailored for time‑series data

Standard financial LLMs are typically used for textual analysis rather than direct price forecasting. The StockTime architecture (2024) addresses this gap by treating stock prices as a sequence of tokens and harnessing the autoregressive capabilities of LLMs. StockTime embeds patches of price data and derives textual information (correlations, trends and timestamps) from them. These textual and numerical embeddings are fused so that the LLM predicts future prices across flexible look‑back windows (arxiv.org). Compared with previous financial LLMs, StockTime reduces memory usage, lowers runtime costs and achieves more accurate predictions on time‑series benchmarks (arxiv.org).
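The patching step can be pictured with a short sketch. This is a loose illustration of the general idea, not StockTime's code: the patch length, the non‑overlapping stride, the helper names make_patches and describe_patch, and the wording of the derived text are all assumptions.

```python
import numpy as np

def make_patches(series, patch_len, stride=None):
    """Split a price series into fixed-length patches (token-like units).

    Assumes len(series) >= patch_len; trailing values that do not fill a
    whole patch are dropped.
    """
    series = np.asarray(series, dtype=float)
    stride = stride or patch_len            # default: non-overlapping patches
    starts = range(0, len(series) - patch_len + 1, stride)
    return np.stack([series[s:s + patch_len] for s in starts])

def describe_patch(patch):
    """Derive a short textual summary of a patch (hypothetical format)."""
    change = patch[-1] - patch[0]
    trend = "up" if change > 0 else "down" if change < 0 else "flat"
    return f"trend: {trend}, start: {patch[0]:.2f}, end: {patch[-1]:.2f}"

series = np.linspace(100.0, 111.0, 12)      # toy prices 100, 101, ..., 111
patches = make_patches(series, patch_len=4)
for p in patches:
    print(p, "|", describe_patch(p))
```

In a full model each numeric patch and its text summary would be embedded separately and fused before being fed to the LLM backbone.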

Dealing with noisy external information

External news and social media introduce significant noise. The authors of the LLM‑augmented Linear Transformer–CNN framework emphasise that while online platforms offer valuable sentiment signals, they also contain unverified information and rumours that mask true market patterns (mdpi.com). To avoid this, the authors rely exclusively on data‑driven technical indicators derived from price and volume data (mdpi.com). This highlights a broader trend: many modern models either combine sentiment with robust preprocessing or focus solely on market‑generated data to reduce noise.

Comparison of recent AI models for U.S. stock forecasting

Model/Approach	Key idea / Features	Data & results
LSTM vs ARIMA	RNN variant with memory cells and gates; models non‑linear dependencies (simplilearn.com)	S&P 500 prices; LSTM improved accuracy from ≈89.8 % (ARIMA) to ≈96.41 %, reducing RMSE from 614 to 207.34 (arxiv.org)
LSTM‑GNN hybrid	Combines LSTM (temporal patterns) with graph neural networks (inter‑stock relationships) (arxiv.org)	Historical U.S. stock data; hybrid model achieved 10.6 % lower MSE than standalone LSTM and outperformed linear, CNN and dense networks (arxiv.org)
Transformer variants	Evaluates encoder‑only, decoder‑only, vanilla and ProbSparse transformers on S&P 500 (2015‑2024) (arxiv.org)	Transformers outperform LSTM, TCN, SVR and Random Forest; decoder‑only transformer performs best, while ProbSparse variant performs worst (arxiv.org)
FinGPT	Fine‑tuned LLM that includes dissemination breadth and context when analysing news (arxiv.org)	Instruction‑tuned FinGPT improves short‑term stock movement prediction by ≈8 % over previous LLM methods (arxiv.org)
FinBERT‑LSTM	Uses FinBERT to extract sentiment from news and feeds it into an LSTM (mdpi.com)	Demonstrated improved short‑term price prediction versus LSTM alone (mdpi.com)
LLM‑augmented Linear Transformer–CNN	Generates technical indicators via ChatGPT4o; uses FinBERT embeddings and combines Linear Transformer (temporal) + CNN (visual) features (mdpi.com)	On S&P 500 (2022‑2023), multimodal model captures temporal, spatial and contextual patterns and significantly improves accuracy (mdpi.com)
StockTime	Treats price series as tokens; fuses time‑series patches with derived text; uses autoregressive LLM to forecast beyond fixed look‑backs (arxiv.org)	Reduces memory usage and runtime while outperforming other financial LLMs on time‑series prediction (arxiv.org)
FinBERT, GPT‑4 vs Logistic Regression	Compares two LLMs with a simple logistic regression model for sentiment analysis and prediction (arxiv.org)	Logistic regression achieved ≈81.83 % accuracy and higher ROC AUC (≈89.76 %); FinBERT and GPT‑4 were resource‑intensive and offered moderate performance (arxiv.org)

Practical considerations

Data quality and preprocessing – Deep models require large, clean datasets. Preprocessing steps such as standardisation, normalisation and handling missing values are essential (mdpi.com). Failure to apply these can lead to poor convergence and unreliable predictions.
Noise and interpretability – Incorporating external sentiment can improve accuracy, but noisy data and rumours can harm performance (mdpi.com). Hybrid models that extract sentiment from trusted sources or rely solely on market data help mitigate this issue.
Computational resources – LLMs are computationally expensive. The Nigerian Stock Exchange study discussed above shows that logistic regression can still outperform LLMs when data are limited (arxiv.org), so researchers must weigh accuracy gains against resource costs.
Generalisation and overfitting – Models trained on a specific period (e.g., 2022‑2023) may not generalise well to different market regimes. Expanding‑window validation and cross‑validation can help evaluate model robustness (arxiv.org).
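Two of the points above, standardisation and expanding‑window validation, fit together naturally and can be sketched in a few lines. The function names, the fold sizes and the random toy data are assumptions of this example; the cited studies do not publish this exact procedure.

```python
import numpy as np

def expanding_window_splits(n_samples, initial_train, test_size):
    """Yield (train_idx, test_idx) pairs in time order.

    The training set always starts at index 0 and grows by test_size
    after each fold; the test block always lies strictly after it.
    """
    end = initial_train
    while end + test_size <= n_samples:
        yield np.arange(0, end), np.arange(end, end + test_size)
        end += test_size

def standardise(train, test):
    """Scale both sets with statistics from the training set only,
    so that no information from the test period leaks backwards."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # guard against constant columns
    return (train - mu) / sigma, (test - mu) / sigma

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(10, 2))   # toy feature matrix
splits = list(expanding_window_splits(len(data), initial_train=4, test_size=2))
for train_idx, test_idx in splits:
    train_s, test_s = standardise(data[train_idx], data[test_idx])
    print(len(train_idx), list(test_idx))
```

scikit‑learn's TimeSeriesSplit offers a comparable ordered splitting scheme for anyone who prefers a library routine to a hand‑rolled generator.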

Conclusion

AI has transformed U.S. stock forecasting. LSTM networks offer strong baselines by modelling sequential patterns and outperform linear models such as ARIMA. Hybrid architectures like LSTM‑GNN and LLM‑augmented Transformer–CNNs further enhance performance by capturing inter‑stock relationships and combining numerical, visual and textual information. Transformers are increasingly popular; in the study reviewed here, the decoder‑only version outperformed the other configurations. Large language models open new possibilities for sentiment‑driven forecasting, yet studies show that simpler models can sometimes deliver better results when data are scarce or well‑structured. The future of AI‑driven stock prediction lies in integrating diverse data sources while controlling noise and computational complexity. Researchers and practitioners should continue evaluating new architectures (such as StockTime) on broad U.S. market datasets while keeping expectations cautious: no model can fully eliminate market uncertainty.

#AIStockPrediction #DeepLearning #USStockMarket #LSTM #Transformer #GraphNeuralNetwork #FinGPT #FinBERT #StockTime #FinancialAI #MachineLearning #MarketForecasting #FinTech

AI Money Lab

Monday, July 28, 2025