Background and Context

Research Motivation

The study investigates whether machine learning algorithms can predict stock price movements better than the traditional human-coded sentiment dictionaries used in finance.

Data Sources

The research analyzes 85,530 earnings call transcripts from 3,229 firms from 2006 to 2020, 76,922 10-K filings from 1996 to 2018, and 87,198 Wall Street Journal articles from 2000 to 2021.

Methodology

The authors develop new sentiment dictionaries using a machine learning algorithm, multinomial inverse regression (MNIR), that learns from stock price reactions which words and phrases are positive and which are negative.
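
The idea of learning word sentiment from price reactions can be illustrated with a much simpler stand-in than MNIR. The sketch below is not the authors' method: it just averages the return of the documents in which each word appears and labels words by the sign of that average. The function names, the toy documents, the returns, and the threshold are all hypothetical.

```python
from collections import defaultdict


def score_words(docs, returns, min_count=2):
    """Average the stock return over all documents that contain each word.

    Simplified stand-in for MNIR, for illustration only.
    Words seen in fewer than `min_count` documents are dropped.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for text, ret in zip(docs, returns):
        # Count each word once per document.
        for word in set(text.lower().split()):
            total[word] += ret
            count[word] += 1
    return {w: total[w] / count[w] for w in total if count[w] >= min_count}


def build_dictionary(scores, threshold=0.0):
    """Split scored words into positive and negative sets."""
    positive = {w for w, s in scores.items() if s > threshold}
    negative = {w for w, s in scores.items() if s < -threshold}
    return positive, negative


# Hypothetical toy data: four "transcripts" and their announcement returns.
docs = [
    "strong growth this quarter",
    "strong demand and record growth",
    "weak demand this quarter",
    "weak margins and impairment charges",
]
returns = [0.03, 0.05, -0.04, -0.06]

scores = score_words(docs, returns)
positive, negative = build_dictionary(scores, threshold=0.01)
print(sorted(positive), sorted(negative))
```

On this toy input, neutral words such as "this" or "and" end up with averages near zero and fall inside the threshold band, so they are assigned to neither list.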

Machine Learning vs Traditional Dictionaries: Predictive Power for Stock Returns

  • Shows how well different methods predict stock price movements during earnings calls
  • ML methods achieve more than twice the predictive power of traditional dictionaries
  • Combining ML and traditional methods yields the best results
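
Predictive power in comparisons like this is typically measured by the R-squared of a regression of returns on a sentiment score. As a minimal sketch, assuming a single regressor (the paper's regressions may include further controls), the R-squared equals the squared correlation between score and return. The function name and the numbers below are hypothetical.

```python
def r_squared(x, y):
    """R-squared of a one-regressor OLS fit of y on x (squared correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical sentiment scores from two dictionaries and the realized returns.
returns = [0.03, 0.05, -0.04, -0.06, 0.01]
score_a = [0.2, 0.4, -0.3, -0.5, 0.0]
score_b = [0.1, 0.0, -0.1, 0.1, 0.2]

r2_a = r_squared(score_a, returns)
r2_b = r_squared(score_b, returns)
print(r2_a > r2_b)
```

A dictionary whose score tracks returns more closely yields the higher R-squared, which is the sense in which one method has more predictive power than another.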

Coverage of Different Dictionaries Across Text Sources

  • Compares how much text is captured by different dictionaries across document types
  • The ML dictionary achieves 3-4 times greater coverage than the traditional Loughran-McDonald (LM) dictionary
  • Coverage advantage holds across all three types of financial documents
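
One plausible way to compute coverage is the share of a document's tokens that appear in a dictionary. The sketch below assumes that definition and simple whitespace tokenization; the paper's exact definition may differ, and the function name and word lists are hypothetical.

```python
def coverage(text, dictionary):
    """Fraction of whitespace-separated tokens in `text` found in `dictionary`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


# Hypothetical word lists, for illustration only.
small_dictionary = {"strong", "weak"}
larger_dictionary = {"strong", "weak", "growth", "demand"}

sentence = "strong growth offset weak demand"
small_cov = coverage(sentence, small_dictionary)
larger_cov = coverage(sentence, larger_dictionary)
print(small_cov, larger_cov)
```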

Economic Impact of Sentiment on Stock Returns

  • Shows how much stock prices move in response to positive and negative sentiment
  • ML methods identify stronger market reactions than traditional dictionaries
  • Using word pairs (bigrams) captures the strongest market reactions
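
Bigrams are pairs of adjacent words, which lets a dictionary distinguish a phrase such as "not strong" from the single word "strong". A minimal sketch of extracting them, assuming whitespace tokenization and a hypothetical function name:

```python
def bigrams(text):
    """Return adjacent word pairs from `text`, lowercased and space-joined."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


pairs = bigrams("Cash flow was not strong")
print(pairs)
```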

Dictionary Size Comparison

  • Compares the number of words in traditional vs ML dictionaries
  • ML achieves better results with far fewer words
  • Shows the efficiency of machine learning in identifying the most important sentiment words

External Validity: Performance Across Different Text Sources

  • Shows how well the dictionaries work across different types of financial texts
  • ML dictionary maintains its advantage across all document types
  • The performance difference is largest for earnings calls, the text source on which the ML dictionary was trained

Contribution and Implications

  • The research demonstrates that machine learning can create more effective sentiment dictionaries for analyzing financial texts than traditional human-coded approaches
  • The new ML dictionaries achieve better predictive power while using fewer words, making them more efficient tools for financial analysis
  • The methods developed can be applied to other languages and contexts, opening new possibilities for automated financial text analysis

Chart Data Sources

  • Predictive Power Chart: Based on Table 2 regression results
  • Coverage Chart: Based on Table 5 dictionary coverage statistics
  • Economic Impact Chart: Based on Table 2 coefficient estimates
  • Dictionary Size Chart: Based on Table 5 dictionary statistics
  • External Validity Chart: Based on Tables 2, 3, and 4 R-squared values