Key Findings
Machine Learning Predicts Fund Performance
Nonlinear machine learning methods select mutual fund portfolios that earn significant out-of-sample annual alphas of 2.4% net of all costs
Past Performance & Fund Activeness Interaction
Past performance is a particularly strong predictor of future performance for more active funds, revealing important interactions between fund characteristics
Capital Misallocation Effects
Machine learning identifies managers whose skill is not fully offset by diseconomies of scale, which is consistent with informational frictions keeping these funds smaller than their skill would warrant
Out-of-Sample Portfolio Performance
- Machine learning methods (Gradient Boosting and Random Forest) significantly outperform traditional approaches
- Nonlinear methods achieve monthly alphas of 19.7-22.4 basis points
- Traditional passive strategies (equally weighted and asset-weighted portfolios) show negative alphas
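As a rough consistency check between the monthly alphas in this list and the 2.4% annual alpha quoted in the key findings, a monthly alpha in basis points can be converted to an annual percentage. The sketch below is illustrative only: the function names are hypothetical, and whether the paper annualizes by simple multiplication or by compounding is an assumption, since this summary does not say.

```python
def annualize_simple(monthly_alpha_bp: float) -> float:
    """Annual alpha in percent, multiplying the monthly alpha by 12 (no compounding)."""
    return monthly_alpha_bp * 12 / 100.0


def annualize_compounded(monthly_alpha_bp: float) -> float:
    """Annual alpha in percent, compounding the monthly alpha over 12 months."""
    monthly = monthly_alpha_bp / 10_000.0  # basis points -> decimal return
    return ((1.0 + monthly) ** 12 - 1.0) * 100.0


# Monthly alphas reported for the nonlinear methods, in basis points.
low_bp, high_bp = 19.7, 22.4

simple_low = annualize_simple(low_bp)
simple_high = annualize_simple(high_bp)
compounded_low = annualize_compounded(low_bp)

print(f"simple:     {simple_low:.2f}% to {simple_high:.2f}% per year")
print(f"compounded: {compounded_low:.2f}% per year at the low end")
```

Under simple annualization the lower bound of 19.7 basis points per month is about 2.4% per year, which matches the headline figure.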
Top Fund Characteristics for Performance Prediction
- Value added and alpha t-stat are the strongest predictors of future performance
- Fund activeness measures (Market Beta t-stat, R²) are crucial determinants
- Combining past performance with activeness measures gives the best predictions
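One common way to rank predictors like those above is permutation importance: shuffle one characteristic at a time and measure how much the prediction error grows. The sketch below is a minimal pure-Python illustration, not the method behind Figure 2, which this summary does not specify. The toy model, the synthetic data, and all names are hypothetical; the toy model multiplies a past-performance signal by an activeness signal only to mimic the interaction described in the findings.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when each column of X is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical model: past performance matters more for more active funds."""
    past_performance, activeness, _unused = row
    return past_performance * activeness


# Synthetic data (assumption): columns are past performance, activeness, noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.uniform(0, 1), data_rng.gauss(0, 1)]
     for _ in range(200)]
y = [toy_predict(row) for row in X]

importances = permutation_importance(toy_predict, X, y)
for name, value in zip(["past_performance", "activeness", "noise"], importances):
    print(f"{name:>16}: {value:.4f}")
```

In this toy setup the noise column gets zero importance because the model never reads it, while shuffling either interacting characteristic degrades the predictions.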
Fund Size vs. Manager Skill
- Top-performing funds (decile 10, D10) are smaller than their skill level would imply
- This is evidence of capital misallocation in the mutual fund market
- Informational frictions prevent funds from reaching their optimal size
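The size comparison above rests on sorting funds into decile portfolios and averaging log fund size within each decile. A minimal sketch of that calculation follows. It assumes funds are sorted by a predicted-alpha score, which this summary does not confirm, and the function name and example data are hypothetical.

```python
import math


def decile_mean_log_size(predicted_alpha, fund_size, n_groups=10):
    """Sort funds by predicted alpha and return the mean log size of each group.

    Group 0 holds the lowest predictions and group n_groups - 1 the highest
    (D10 when n_groups is 10). Requires at least n_groups funds.
    """
    order = sorted(range(len(predicted_alpha)), key=lambda i: predicted_alpha[i])
    n = len(order)
    means = []
    for g in range(n_groups):
        # Integer boundaries split the sorted funds into near-equal groups.
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        means.append(sum(math.log(fund_size[i]) for i in members) / len(members))
    return means


# Tiny synthetic example (assumption): 20 funds split into 2 groups.
example_alpha = list(range(20))
example_size = [math.exp(1)] * 10 + [math.exp(2)] * 10
example_means = decile_mean_log_size(example_alpha, example_size, n_groups=2)
print(example_means)
```

Comparing the mean log size of the top group with what its estimated skill would justify is the kind of check the misallocation finding describes.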
Contribution and Implications
- First study to show investors can earn significant positive alpha using machine learning on fund characteristics
- Demonstrates importance of nonlinear relationships and interactions in predicting fund performance
- Provides evidence that sophisticated prediction methods can help improve mutual fund selection
- Has implications for pension plan administrators and financial advisors who select funds
Data Sources
- Portfolio Performance Chart: Based on Table 3 - Out-of-sample alpha of fund portfolios using the Fama-French five-factor plus momentum (FF5+MOM) model
- Characteristics Importance Chart: Based on Figure 2 - Characteristic importance analysis
- Capital Misallocation Chart: Based on Figure 10 - Mean log size across decile portfolios
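The alphas behind the Portfolio Performance Chart are intercepts from a factor regression: portfolio excess returns are regressed on a constant and the six FF5+MOM factors (market, size, value, profitability, investment, momentum). The sketch below estimates such an intercept by ordinary least squares in pure Python. It is an illustration under stated assumptions, not the paper's code: the factor returns are simulated, the true alpha and betas are made up, and all function names are hypothetical.

```python
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def estimate_alpha(excess_returns, factors):
    """OLS of excess returns on a constant plus factors; returns (alpha, betas)."""
    X = [[1.0] + list(f) for f in factors]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * y for row, y in zip(X, excess_returns)) for i in range(k)]
    coef = solve_linear_system(XtX, Xty)  # normal equations
    return coef[0], coef[1:]


# Simulated monthly data (assumption): 120 months, 6 factors, no noise.
rng = random.Random(0)
factors = [[rng.gauss(0, 0.03) for _ in range(6)] for _ in range(120)]
true_alpha = 0.002  # 20 basis points per month, chosen for illustration
true_betas = [1.0, 0.2, -0.1, 0.05, 0.0, 0.1]
excess_returns = [true_alpha + sum(b * f for b, f in zip(true_betas, row))
                  for row in factors]

alpha_hat, betas_hat = estimate_alpha(excess_returns, factors)
print(f"estimated monthly alpha: {alpha_hat * 10_000:.2f} bp")
```

With real data the residual noise would not be zero, so the estimate would come with a standard error and a t-statistic rather than recovering the alpha exactly.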