
Background and Context
Research Motivation
The study investigates whether machine learning algorithms predict stock price movements better than the traditional human-coded sentiment dictionaries used in finance.
Data Sources
The research analyzes 85,530 earnings call transcripts from 3,229 firms (2006 to 2020), 76,922 10-K filings (1996 to 2018), and 87,198 Wall Street Journal articles (2000 to 2021).
Methodology
The authors develop new sentiment dictionaries using a machine learning algorithm, multinomial inverse regression (MNIR), which learns from stock price reactions which words and phrases carry positive or negative tone.
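The idea of learning word tone from price reactions can be illustrated with a toy sketch. This is not MNIR: it is a much simpler stand-in that scores each word by the average return of the documents containing it, relative to the overall average. The function name `learn_word_tone`, the `min_count` parameter, and the sample documents and returns are all hypothetical.

```python
from collections import defaultdict


def learn_word_tone(docs, returns, min_count=2):
    """Score words by the mean return of documents that contain them.

    Simplified illustration only, NOT the MNIR estimator: a word's score is
    the average return of the documents it appears in, minus the overall
    average return. Words seen in fewer than `min_count` documents are dropped.
    """
    overall = sum(returns) / len(returns)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, ret in zip(docs, returns):
        # Use a set so each word counts once per document.
        for word in set(text.lower().split()):
            totals[word] += ret
            counts[word] += 1
    return {
        w: totals[w] / counts[w] - overall
        for w in counts
        if counts[w] >= min_count
    }


# Hypothetical toy data: four short "transcripts" and their event returns.
docs = [
    "strong growth record",
    "weak demand challenging",
    "strong record margins",
    "weak challenging outlook",
]
returns = [0.04, -0.03, 0.05, -0.06]

tone = learn_word_tone(docs, returns)
positive = sorted(w for w, s in tone.items() if s > 0)
negative = sorted(w for w, s in tone.items() if s < 0)
print("positive:", positive)
print("negative:", negative)
```

Words that appear in only one document are excluded here, which is one crude way to keep rare terms from being scored on a single observation.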
Machine Learning vs Traditional Dictionaries: Predictive Power for Stock Returns
- Shows how well different methods predict stock price movements during earnings calls
- ML methods achieve more than twice the predictive power of traditional dictionaries
- Combining ML and traditional methods yields the best results
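Predictive power in comparisons like this one is typically measured by the R-squared of a regression of returns on a sentiment score. The sketch below computes R-squared for a single-regressor ordinary least squares fit with an intercept, where it equals the squared correlation. The function name `r_squared` and all numbers are hypothetical, not the paper's data, and the combined specification (both scores in one regression) would need a multivariate fit that is not shown.

```python
def r_squared(x, y):
    """R-squared of an OLS regression of y on x with an intercept.

    For a single regressor this equals the squared sample correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical event returns and two hypothetical sentiment scores per call.
returns = [0.04, -0.03, 0.05, -0.06, 0.01]
ml_scores = [0.8, -0.5, 0.9, -0.7, 0.1]
lm_scores = [0.2, 0.1, 0.3, -0.4, -0.1]

print("R2 with ML score:", r_squared(ml_scores, returns))
print("R2 with LM score:", r_squared(lm_scores, returns))
```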
Coverage of Different Dictionaries Across Text Sources
- Compares how much text is captured by different dictionaries across document types
- The ML dictionary achieves 3 to 4 times greater coverage than the traditional Loughran-McDonald (LM) dictionary
- Coverage advantage holds across all three types of financial documents
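One way to compute a coverage statistic is sketched below. It assumes coverage means the share of a document's tokens that appear in the dictionary; the summary does not spell out the exact definition, so this is an assumption. The function name `coverage` and the two tiny dictionaries are hypothetical.

```python
def coverage(text, dictionary):
    """Fraction of whitespace-separated tokens found in `dictionary`.

    Assumed definition for illustration; returns 0.0 for empty text.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


# Hypothetical dictionaries: a small one and a broader one.
small_dictionary = {"weak"}
broad_dictionary = {"strong", "weak", "growth", "demand"}

sentence = "Strong growth and weak demand"
print("small:", coverage(sentence, small_dictionary))
print("broad:", coverage(sentence, broad_dictionary))
```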
Economic Impact of Sentiment on Stock Returns
- Shows how much stock prices move in response to positive and negative sentiment
- ML methods identify stronger market reactions than traditional dictionaries
- Using word pairs (bigrams) captures the strongest market reactions
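A short sketch shows why bigrams can add information that single words miss. It builds adjacent word pairs and counts matches against positive and negative term lists that mix unigrams and bigrams. The names `bigrams` and `net_tone`, the scaling by token count, and the term lists are hypothetical choices for illustration, not the paper's specification.

```python
def bigrams(tokens):
    """Return adjacent word pairs joined by a space."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def net_tone(text, positive, negative):
    """(positive matches - negative matches) / number of tokens.

    Matches are counted over both unigrams and bigrams of the text.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    terms = tokens + bigrams(tokens)
    pos = sum(1 for term in terms if term in positive)
    neg = sum(1 for term in terms if term in negative)
    return (pos - neg) / len(tokens)


# Hypothetical term lists: the bigram "not good" offsets the unigram "good".
positive_terms = {"good", "record"}
negative_terms = {"not good", "weak"}

print(bigrams("results were not good".split()))
print(net_tone("results were not good", positive_terms, negative_terms))
print(net_tone("results were good", positive_terms, negative_terms))
```

With unigrams alone, both example sentences would count "good" as positive; the bigram entry lets the negated phrase cancel it.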
Dictionary Size Comparison
- Compares the number of words in traditional vs ML dictionaries
- ML achieves better results with far fewer words
- Shows efficiency of machine learning in identifying the most important sentiment words
External Validity: Performance Across Different Text Sources
- Shows how well the dictionaries work across different types of financial texts
- ML dictionary maintains its advantage across all document types
- The performance difference is largest for earnings calls, the text source on which the ML dictionary was trained
Contribution and Implications
- The research demonstrates that machine learning can create more effective sentiment dictionaries for analyzing financial texts than traditional human-coded approaches
- The new ML dictionaries achieve better predictive power while using fewer words, making them more efficient tools for financial analysis
- The methods developed can be applied to other languages and contexts, opening new possibilities for automated financial text analysis
Chart Sources
- Predictive Power Chart: Based on Table 2 regression results
- Coverage Chart: Based on Table 5 dictionary coverage statistics
- Economic Impact Chart: Based on Table 2 coefficient estimates
- Dictionary Size Chart: Based on Table 5 dictionary statistics
- External Validity Chart: Based on Tables 2, 3, and 4 R-squared values