Key Findings
Machine Learning Outperforms Traditional Dictionaries
ML dictionaries predict stock price movements considerably better than the standard Loughran-McDonald (LM) dictionaries: in out-of-sample tests, R² more than doubles, from 2.1% to 4.6%.
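To make "out-of-sample R²" concrete, the sketch below fits a one-variable linear regression on a training sample and then measures how much of the variation in held-out outcomes the fitted line explains. This is a simplified illustration, not the specification used in the study: the function names, the single-regressor setup, and the choice to benchmark against the test-sample mean are assumptions, and the numbers are made up.

```python
def fit_ols(x, y):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def out_of_sample_r2(train_x, train_y, test_x, test_y):
    """R² on held-out data, using coefficients estimated on the training data.

    Assumption: the benchmark is the mean of the test outcomes.
    Other conventions (e.g. the training mean) also exist.
    """
    a, b = fit_ols(train_x, train_y)
    mean_test = sum(test_y) / len(test_y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(test_x, test_y))
    ss_tot = sum((yi - mean_test) ** 2 for yi in test_y)
    return 1.0 - ss_res / ss_tot


# Made-up data: x stands in for a sentiment score, y for a return.
train_x, train_y = [0, 1, 2, 3], [1, 3, 5, 7]
test_x, test_y = [4, 5, 6], [9, 12, 13]

r2 = out_of_sample_r2(train_x, train_y, test_x, test_y)
print(f"out-of-sample R²: {r2:.3f}")
```

Because the coefficients are estimated only on the training sample, this measure penalizes a model that fits noise in-sample, which is why it is a stricter comparison than in-sample fit.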
Enhanced Dictionary Coverage
ML dictionaries cover more financial text with fewer words: the 57 ML positive words cover 8.4% of earnings call text, compared with 1.9% for the 329 LM positive words.
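One way to read "coverage" is the share of word tokens in a text that belong to a dictionary. The sketch below computes that share under this assumption, with a simple lowercase regex tokenizer. The word list and sample sentence are hypothetical illustrations, not entries from the ML or LM dictionaries.

```python
import re


def coverage(text, dictionary):
    """Return the fraction of word tokens in `text` that appear in `dictionary`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in dictionary)
    return hits / len(tokens)


# Hypothetical word list and sentence, for illustration only.
toy_positive = {"strong", "growth", "pleased"}
sample = "We are pleased with the strong growth this quarter"

share = coverage(sample, toy_positive)
print(f"{share:.1%}")
```

A short list of frequently used words can cover more text than a long list of rare ones, which is how a 57-word dictionary can out-cover a 329-word one.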
Robust External Validity
ML dictionaries constructed from earnings calls maintain strong predictive power when applied to 10-K filings and WSJ articles, demonstrating broader applicability across different types of financial text.
Comparative Performance in Earnings Calls Analysis
- ML dictionaries achieve more than double the explanatory power of LM dictionaries
- The overlap between the LM and ML dictionaries shows the strongest performance
- ML bigrams demonstrate comparable performance to ML unigrams
Dictionary Coverage Comparison
- ML dictionaries achieve higher coverage with far fewer words
- ML positive dictionary achieves 4.4x the coverage of LM (8.4% vs. 1.9%) with only 17% as many words (57 vs. 329)
- ML negative dictionary achieves 3.2x the coverage of LM with only 3% as many words
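The positive-dictionary ratios follow directly from the word counts and coverage percentages quoted in the key findings. A quick check of that arithmetic (the variable names are illustrative; the source gives no underlying negative-dictionary figures, so they are not recomputed here):

```python
# Figures quoted in the key findings for the positive dictionaries.
ml_words, lm_words = 57, 329
ml_coverage, lm_coverage = 8.4, 1.9  # percent of earnings call text

coverage_ratio = ml_coverage / lm_coverage  # how many times more text ML covers
word_share = ml_words / lm_words            # ML word count as a share of LM's

print(f"coverage ratio: {coverage_ratio:.1f}x")  # coverage ratio: 4.4x
print(f"word share: {word_share:.0%}")           # word share: 17%
```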
External Validity Across Different Text Types
- ML dictionaries maintain superior performance across different document types
- The performance gap is most pronounced for earnings calls
- ML dictionaries show a consistent advantage over LM dictionaries in all contexts
Contribution and Implications
- Demonstrates the potential of machine learning to improve financial text analysis beyond traditional human-coded dictionaries
- Provides new, more efficient dictionaries that achieve better coverage and predictive power with fewer words
- Offers a methodology for creating context-specific sentiment dictionaries that can be applied to different types of financial documents
Data Sources
- Performance comparison chart constructed using data from Table 2, showing R² values for different dictionary approaches
- Dictionary coverage comparison constructed using data from Table 5, comparing word counts and coverage percentages
- External validity chart constructed using data from Tables 2, 3, and 4, showing R² values across different document types