Key Findings
Patent Data Bias Issues
Patent and citation counts aggregated at the firm level are significantly biased, especially for newer patents and their citations. These biases are systematically related to firm characteristics.
Technology & Regional Variation
Patent and citation biases vary substantially across technology classes and geographical regions. Computer/electronics patents and states such as California show the largest disparities.
Machine Learning Solutions
Machine learning approaches using firm-level information perform significantly better than traditional adjustment methods in addressing patent and citation biases.
Firm-Level Patent and Citation Bias Correlations
- Larger firms show greater patent bias (coefficient 0.0548) and citation bias (coefficient 0.121)
- Higher market-to-book ratios are associated with greater patent bias (0.0464) and citation bias (0.0920)
- R&D intensity is positively associated with both patent bias (0.0235) and citation bias (0.0417)
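The coefficients above are regression coefficients: each one measures how the bias moves with a firm characteristic. As a rough illustration of how such a coefficient is estimated and read, the sketch below fits a one-variable least-squares line in plain Python. The data, the names `fit_line`, `log_size`, and `patent_bias`, and the single-regressor setup are all assumptions made for illustration; the paper's actual specification and controls are not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical firm-level data, NOT taken from the paper.
log_size = [1.0, 2.0, 3.0, 4.0, 5.0]          # e.g. log of total assets
patent_bias = [0.10, 0.15, 0.20, 0.25, 0.30]  # measured bias per firm

slope, intercept = fit_line(log_size, patent_bias)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

A positive slope here would be read the same way as the bullets above: firms with a larger value of the characteristic tend to have larger bias.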
Machine Learning Model Performance Comparison
- Machine learning models achieve higher R² values (0.74 to 0.81) than traditional benchmarks (0.43 to 0.63)
- Linear SVR performs best, with the highest R² (0.81) and the lowest RMSE (66.22)
- All ML models outperform conventional adjustment methods
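The comparison above rests on two metrics, R² and RMSE. The following is a minimal sketch of how they are computed and compared across two sets of predictions. The numbers and the names `r_squared`, `rmse`, `model_a`, and `model_b` are hypothetical and do not come from the paper.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical values, NOT taken from the paper.
true_counts = [10.0, 20.0, 30.0, 40.0]
model_a = [12.0, 18.0, 33.0, 39.0]  # predictions that track the target closely
model_b = [20.0, 20.0, 20.0, 40.0]  # a cruder adjustment

for name, preds in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(r_squared(true_counts, preds), 3),
          round(rmse(true_counts, preds), 3))
```

A better-fitting method shows a higher R² and a lower RMSE at the same time, which is the pattern reported for Linear SVR above.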
Regional Patent Bias Distribution
- California and Massachusetts show a 2.5x increase in patenting between 1990 and 2000
- Delaware shows only a minimal increase in patenting activity
- Regional differences persist even after traditional adjustments
Contribution and Implications
- Demonstrates systematic biases in patent data that affect research inferences in corporate finance
- Provides an actionable checklist for researchers using patent data
- Introduces machine learning as a promising solution for addressing patent data biases
Data Sources
- Firm-level bias correlations based on Table 1 regression coefficients
- Machine learning performance metrics derived from Table 4 model comparisons
- Regional patent distribution based on Figure 3 patent application trends