Browse Infographics



Title | Journal | Date | Author | Abstract | Link
Informed Trading Intensity | Journal of Finance | 2024-04-01 | Bogousslavsky, Vincent; Fos, Vyacheslav; Muravyev, Dmitriy | We train a machine learning method on a class of informed trades to develop a new measure of informed trading, informed trading intensity (ITI). ITI increases before earnings, mergers and acquisitions, and news announcements, and has implications for return reversal and asset pricing. ITI is effective because it captures nonlinearities and interactions between informed trading, volume, and volatility. This data-driven approach can shed light on the economics of informed trading, including impatient informed trading, commonality in informed trading, and models of informed trading. Overall, learning from informed trading data can generate an effective informed trading measure. | View Infographic
The Virtue of Complexity in Return Prediction | Journal of Finance | 2024-02-01 | Kelly, Bryan; Malamud, Semyon; Zhou, Kangying | Much of the extant literature predicts market returns with "simple" models that use only a few parameters. Contrary to conventional wisdom, we theoretically prove that simple models severely understate return predictability compared to "complex" models in which the number of parameters exceeds the number of observations. We empirically document the virtue of complexity in U.S. equity market return prediction. Our findings establish the rationale for modeling expected returns through machine learning. | View Infographic
(Re-)imag(in)ing Price Trends | Journal of Finance | 2023-12-01 | Jiang, Jingwen; Kelly, Bryan; Xiu, Dacheng | We reconsider trend-based predictability by employing flexible learning methods to identify price patterns that are highly predictive of returns, as opposed to testing predefined patterns like momentum or reversal. Our predictor data are stock-level price charts, allowing us to extract the most predictive price patterns using machine learning image analysis techniques. These patterns differ significantly from commonly analyzed trend signals, yield more accurate return predictions, enable more profitable investment strategies, and demonstrate robustness across specifications. Remarkably, they exhibit context independence, as short-term patterns perform well on longer time scales, and patterns learned from U.S. stocks prove effective in international markets. | View Infographic
Firm-Level Climate Change Exposure | Journal of Finance | 2023-06-01 | Sautner, Zacharias; Van Lent, Laurence; Vilkov, Grigory; Zhang, Ruishen | We develop a method that identifies the attention paid by earnings call participants to firms' climate change exposures. The method adapts a machine learning keyword discovery algorithm and captures exposures related to opportunity, physical, and regulatory shocks associated with climate change. The measures are available for more than 10,000 firms from 34 countries between 2002 and 2020. We show that the measures are useful in predicting important real outcomes related to the net-zero transition, in particular, job creation in disruptive green technologies and green patenting, and that they contain information that is priced in options and equity markets. | View Infographic
Biased Auctioneers | Journal of Finance | 2023-04-01 | Aubry, Mathieu; Kraussl, Roman; Manso, Gustavo; Spaenjers, Christophe | We construct a neural network algorithm that generates price predictions for art at auction, relying on both visual and nonvisual object characteristics. We find that higher automated valuations relative to auction house presale estimates are associated with substantially higher price-to-estimate ratios and lower buy-in rates, pointing to estimates' informational inefficiency. The relative contribution of machine learning is higher for artists with less dispersed and lower average prices. Furthermore, we show that auctioneers' prediction errors are persistent both at the artist and at the auction house level, and hence directly predictable themselves using information on past errors. | View Infographic
Do Municipal Bond Dealers Give Their Customers 'Fair and Reasonable' Pricing? | Journal of Finance | 2023-04-01 | Griffin, John M.; Hirschey, Nicholas; Kruger, Samuel | Municipal bonds exhibit considerable retail pricing variation, even for same-size trades of the same bond on the same day, and even from the same dealer. Markups vary widely across dealers. Trading strongly clusters on eighth price increments, and clustered trades exhibit higher markups. Yields are often lowered to just above salient numbers. Machine learning estimates exploiting the richness of the data show that dealers that use strategic pricing have systematically higher markups. Recent Municipal Securities Rulemaking Board rules have had only a limited impact on markups. While a subset of dealers focus on best execution, many dealers appear focused on opportunistic pricing. | View Infographic
Anomalies and the Expected Market Return | Journal of Finance | 2022-02-01 | Dong, Xi; Li, Yan; Rapach, David E.; Zhou, Guofu | We provide the first systematic evidence on the link between long-short anomaly portfolio returns--a cornerstone of the cross-sectional literature--and the time-series predictability of the aggregate market excess return. Using 100 representative anomalies from the literature, we employ a variety of shrinkage techniques (including machine learning, forecast combination, and dimension reduction) to efficiently extract predictive signals in a high-dimensional setting. We find that long-short anomaly portfolio returns evince statistically and economically significant out-of-sample predictive ability for the market excess return. The predictive ability of anomaly portfolio returns appears to stem from asymmetric limits of arbitrage and overpricing correction persistence. | View Infographic
Predictably Unequal? The Effects of Machine Learning on Credit Markets | Journal of Finance | 2022-02-01 | Fuster, Andreas; Goldsmith-Pinkham, Paul; Ramadorai, Tarun; Walther, Ansgar | Innovations in statistical technology in functions including credit-screening have raised concerns about distributional impacts across categories such as race. Theoretically, distributional effects of better statistical technology can come from greater flexibility to uncover structural relationships or from triangulation of otherwise excluded characteristics. Using data on U.S. mortgages, we predict default using traditional and machine learning models. We find that Black and Hispanic borrowers are disproportionately less likely to gain from the introduction of machine learning. In a simple equilibrium credit market model, machine learning increases disparity in rates between and within groups, with these changes attributable primarily to greater flexibility. | View Infographic
Missing Values Handling for Machine Learning Portfolios | Journal of Financial Economics | 2024-05-01 | Chen, Andrew Y.; McCoy, Jack | We characterize the structure and origins of missingness for 159 cross-sectional return predictors and study missing value handling for portfolios constructed using machine learning. Simply imputing with cross-sectional means performs well compared to rigorous expectation-maximization methods. This stems from three facts about predictor data: (1) missingness occurs in large blocks organized by time, (2) cross-sectional correlations are small, and (3) missingness tends to occur in blocks organized by the underlying data source. As a result, observed data provide little information about missing data. Sophisticated imputations introduce estimation noise that can lead to underperformance if machine learning is not carefully applied. | View Infographic
Charting by Machines | Journal of Financial Economics | 2024-03-01 | Murray, Scott; Xia, Yusen; Xiao, Houping | We test the efficient market hypothesis by using machine learning to forecast stock returns from historical performance. These forecasts strongly predict the cross-section of future stock returns. The predictive power holds in most subperiods and is strong among the largest 500 stocks. The forecasting function has important nonlinearities and interactions, is remarkably stable through time, and captures effects distinct from momentum, reversal, and extant technical signals. These findings question the efficient market hypothesis and indicate that technical analysis and charting have merit. We also demonstrate that machine learning models that perform well in optimization continue to perform well out-of-sample. | View Infographic
Machine Learning and Fund Characteristics Help to Select Mutual Funds with Positive Alpha | Journal of Financial Economics | 2023-12-01 | DeMiguel, Victor; Gil-Bazo, Javier; Nogales, Francisco J.; Santos, Andre A. P. | Machine-learning methods exploit fund characteristics to select tradable long-only portfolios of mutual funds that earn significant out-of-sample annual alphas of 2.4% net of all costs. The methods unveil interactions in the relation between fund characteristics and future performance. For instance, past performance is a particularly strong predictor of future performance for more active funds. Machine learning identifies managers whose skill is not sufficiently offset by diseconomies of scale, consistent with informational frictions preventing investors from identifying the outperforming funds. Our findings demonstrate that investors can benefit from active management, but only if they have access to sophisticated prediction methods. | View Infographic
Machine-Learning the Skill of Mutual Fund Managers | Journal of Financial Economics | 2023-10-01 | Kaniel, Ron; Lin, Zihan; Pelger, Markus; Van Nieuwerburgh, Stijn | We show, using machine learning, that fund characteristics can consistently differentiate high from low-performing mutual funds, before and after fees. The outperformance persists for more than three years. Fund momentum and fund flow are the most important predictors of future risk-adjusted fund performance, while characteristics of the stocks that funds hold are not predictive. Returns of predictive long-short portfolios are higher following a period of high sentiment. Our estimation with neural networks enables us to uncover novel and substantial interaction effects between sentiment and both fund flow and fund momentum. | View Infographic
The Colour of Finance Words | Journal of Financial Economics | 2023-03-01 | Garcia, Diego; Hu, Xiaowen; Rohrer, Maximilian | Our paper relies on stock price reactions to colour words, in order to provide new dictionaries of positive and negative words in a finance context. We extend the machine learning algorithm of Taddy (2013), adding a cross-validation layer to avoid over-fitting. In head-to-head comparisons, our dictionaries outperform the standard bag-of-words approach (Loughran and McDonald, 2011) when predicting stock price movements out-of-sample. By comparing their composition, word-by-word, our method refines and expands the sentiment dictionaries in the literature. The breadth of our dictionaries and their ability to disambiguate words using bigrams both help to colour finance discourse better. | View Infographic
Machine Learning in the Chinese Stock Market | Journal of Financial Economics | 2022-08-01 | Leippold, Markus; Wang, Qian; Zhou, Wenyu | We add to the emerging literature on empirical asset pricing in the Chinese stock market by building and analyzing a comprehensive set of return prediction factors using various machine learning algorithms. Contrasting previous studies for the US market, liquidity emerges as the most important predictor, leading us to closely examine the impact of transaction costs. The retail investors' dominating presence positively affects short-term predictability, particularly for small stocks. Another feature that distinguishes the Chinese market from the US market is the high predictability of large stocks and state-owned enterprises over longer horizons. The out-of-sample performance remains economically significant after transaction costs. | View Infographic
Price Revelation from Insider Trading: Evidence from Hacked Earnings News | Journal of Financial Economics | 2022-03-01 | Akey, Pat; Gregoire, Vincent; Martineau, Charles | From 2010 to 2015, a group of traders illegally accessed earnings information before their public release by hacking several news wire services. We use this scheme as a natural experiment to investigate how informed investors select among private signals and how efficiently financial markets incorporate private information contained in trades into prices. We construct a measure of qualitative information using machine learning and find that the hackers traded on both qualitative and quantitative signals. The hackers' trading caused 15% more of the earnings news to be incorporated in prices before their public release. Liquidity providers responded to the hackers' trades by widening spreads. | View Infographic
The Partisanship of Financial Regulators | Review of Financial Studies | 2023-11-01 | Engelberg, Joseph; Henriksson, Matthew; Manela, Asaf; Williams, Jared | We analyze the partisanship of Commissioners at the SEC and Governors at the Federal Reserve Board. Using recent advances in machine learning, we identify partisan phrases in Congress, such as "red tape" and "climate change," and observe their usage among regulators. Although the Fed has remained relatively nonpartisan throughout our sample period (1930-2019), we find that partisanship among SEC Commissioners rose to an all-time high during the 2010-2019 period, driven by more-partisan Commissioners replacing less-partisan ones. Partisanship at the SEC appears in both the language of new SEC rules and the voting behavior of SEC Commissioners. | View Infographic
Option Return Predictability with Machine Learning and Big Data | Review of Financial Studies | 2023-09-01 | Bali, Turan G.; Beckmeyer, Heiner; Morke, Mathis; Weigert, Florian | Drawing upon more than 12 million observations over the period from 1996 to 2020, we find that allowing for nonlinearities significantly increases the out-of-sample performance of option and stock characteristics in predicting future option returns. The nonlinear machine learning models generate statistically and economically sizable profits in the long-short portfolios of equity options even after accounting for transaction costs. Although option-based characteristics are the most important standalone predictors, stock-based measures offer substantial incremental predictive power when considered alongside option-based characteristics. Finally, we provide compelling evidence that option return predictability is driven by informational frictions and option mispricing. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online. | View Infographic
Man versus Machine Learning: The Term Structure of Earnings Expectations and Conditional Biases | Review of Financial Studies | 2023-06-01 | van Binsbergen, Jules H.; Han, Xiao; Lopez-Lira, Alejandro | We introduce a real-time measure of conditional biases to firms' earnings forecasts. The measure is defined as the difference between analysts' expectations and a statistically optimal unbiased machine-learning benchmark. Analysts' conditional expectations are, on average, biased upward, a bias that increases in the forecast horizon. These biases are associated with negative cross-sectional return predictability, and the short legs of many anomalies contain firms with excessively optimistic earnings forecasts. Further, managers of companies with the greatest upward-biased earnings forecasts are more likely to issue stocks. Commonly used linear earnings models do not work out-of-sample and are inferior to those analysts provide. | View Infographic
Credit Building or Credit Crumbling? A Credit Builder Loan's Effects on Consumer Behavior and Market Efficiency in the United States | Review of Financial Studies | 2023-04-01 | Burke, Jeremy; Jamison, Julian; Karlan, Dean; Mihaly, Kata; Zinman, Jonathan | A randomized encouragement design yields null average effects of a credit builder loan (CBL) on consumer credit scores. But machine learning algorithms indicate the nulls are due to stark, offsetting treatment effects depending on baseline installment credit activity. Delinquency on preexisting loan obligations drives the negative effects, suggesting that adding a CBL overextends some consumers and generates negative externalities on other lenders. More favorably for the market, CBL take-up generates positive selection on score improvements. Simple changes to CBL practice, particularly to provider screening and credit bureau reporting, could ameliorate the negative effects for consumers and the market. | View Infographic
The Party Structure of Mutual Funds | Review of Financial Studies | 2022-06-01 | Bubb, Ryan; Catan, Emiliano M. | We investigate the structure of mutual funds' corporate governance preferences as revealed by how they vote their shares in portfolio companies. We apply unsupervised learning tools from the machine learning literature to analyze mutual funds' votes and find that a parsimonious two-dimensional model can explain the bulk of mutual fund voting. The dimensions capture competing visions of corporate governance and are related to the leading proxy advisors' recommendations. Cluster analysis shows that mutual funds are organized into three "parties"--the Traditional Governance Party, Shareholder Reform Party, and Shareholder Protest Party--that follow distinctive philosophies of corporate governance and shareholders' role. | View Infographic
The Use and Misuse of Patent Data: Issues for Finance and Beyond | Review of Financial Studies | 2022-06-01 | Lerner, Josh; Seru, Amit | Patents and citations are powerful tools increasingly used in financial economics (and management research more broadly) to understand innovation. Biases may result, however, from the interactions between the truncation of patents and citations and the changing composition of inventors. When aggregated at the firm level, these patent and citation biases can survive popular adjustment methods and are correlated with firm characteristics. These issues can lead to problematic inferences. We provide an actionable checklist to avoid biased inferences and also suggest machine learning as a potential new way to address these problems. | View Infographic
Thousands of Alpha Tests | Review of Financial Studies | 2021-07-01 | Giglio, Stefano; Liao, Yuan; Xiu, Dacheng | Data snooping is a major concern in empirical asset pricing. We develop a new framework to rigorously perform multiple hypothesis testing in linear asset pricing models, while limiting the occurrence of false positive results typically associated with data snooping. By exploiting a variety of machine learning techniques, our multiple-testing procedure is robust to omitted factors and missing data. We also prove its asymptotic validity when the number of tests is large relative to the sample size, as in many finance applications. To improve the finite sample performance, we also provide a wild-bootstrap procedure for inference and prove its validity in this setting. Finally, we illustrate the empirical relevance in the context of hedge fund performance evaluation. | View Infographic
Measuring Corporate Culture Using Machine Learning | Review of Financial Studies | 2021-07-01 | Li, Kai; Mai, Feng; Shen, Rui; Yan, Xinyan | We create a culture dictionary using one of the latest machine learning techniques--the word embedding model--and 209,480 earnings call transcripts. We score the five corporate cultural values of innovation, integrity, quality, respect, and teamwork for 62,664 firm-year observations over the period 2001-2018. We show that an innovative culture is broader than the usual measures of corporate innovation--R&D expenses and the number of patents. Moreover, we show that corporate culture correlates with business outcomes, including operational efficiency, risk-taking, earnings management, executive compensation design, firm value, and deal making, and that the culture-performance link is more pronounced in bad times. Finally, we present suggestive evidence that corporate culture is shaped by major corporate events, such as mergers and acquisitions. | View Infographic
Selecting Directors Using Machine Learning | Review of Financial Studies | 2021-07-01 | Erel, Isil; Stern, Lea H.; Tan, Chenhao; Weisbach, Michael S. | Can algorithms assist firms in their decisions on nominating corporate directors? Directors predicted by algorithms to perform poorly indeed do perform poorly compared to a realistic pool of candidates in out-of-sample tests. Predictably bad directors are more likely to be male, accumulate more directorships, and have larger networks than the directors the algorithm would recommend in their place. Companies with weaker governance structures are more likely to nominate them. Our results suggest that machine learning holds promise for understanding the process by which governance structures are chosen and has potential to help real-world firms improve their governance. | View Infographic
Microstructure in the Machine Age | Review of Financial Studies | 2021-07-01 | Easley, David; Lopez de Prado, Marcos; O'Hara, Maureen; Zhang, Zhibai | Understanding modern market microstructure phenomena requires large amounts of data and advanced mathematical tools. We demonstrate how machine learning can be applied to microstructural research. We find that microstructure measures continue to provide insights into the price process in current complex markets. Some microstructure features with high explanatory power exhibit low predictive power, while others with less explanatory power have more predictive power. We find that some microstructure-based measures are useful for out-of-sample prediction of various market statistics, leading to questions about market efficiency. We also show how microstructure measures can have important cross-asset effects. Our results are derived using 87 liquid futures contracts across all asset classes. | View Infographic
Bond Risk Premiums with Machine Learning | Review of Financial Studies | 2021-02-01 | Bianchi, Daniele; Buchner, Matthias; Tamoni, Andrea | We show that machine learning methods, in particular, extreme trees and neural networks (NNs), provide strong statistical evidence in favor of bond return predictability. NN forecasts based on macroeconomic and yield information translate into economic gains that are larger than those obtained using yields alone. Interestingly, the nature of unspanned factors changes along the yield curve: stock- and labor-market-related variables are more relevant for short-term maturities, whereas output and income variables matter more for longer maturities. Finally, NN forecasts correlate with proxies for time-varying risk aversion and uncertainty, lending support to models featuring both channels. | View Infographic
Empirical Asset Pricing via Machine Learning | Review of Financial Studies | 2020-05-01 | Gu, Shihao; Kelly, Bryan; Xiu, Dacheng | We perform a comparative analysis of machine learning methods for the canonical problem of empirical asset pricing: measuring asset risk premiums. We demonstrate large economic gains to investors using machine learning forecasts, in some cases doubling the performance of leading regression-based strategies from the literature. We identify the best-performing methods (trees and neural networks) and trace their predictive gains to allowing nonlinear predictor interactions missed by other methods. All methods agree on the same set of dominant predictive signals, a set that includes variations on momentum, liquidity, and volatility. | View Infographic
How Valuable Is FinTech Innovation? | Review of Financial Studies | 2019-05-01 | Chen, Mark A.; Wu, Qinxi; Yang, Baozhong | We provide large-scale evidence on the occurrence and value of FinTech innovation. Using data on patent filings from 2003 to 2017, we apply machine learning to identify and classify innovations by their underlying technologies. We find that most FinTech innovations yield substantial value to innovators, with blockchain being particularly valuable. For the overall financial sector, internet of things (IoT), robo-advising, and blockchain are the most valuable innovation types. Innovations affect financial industries more negatively when they involve disruptive technologies from nonfinancial startups, but market leaders that invest heavily in their own innovation can avoid much of the negative value effect. | View Infographic
Big Loans to Small Businesses: Predicting Winners and Losers in an Entrepreneurial Lending Experiment | American Economic Review | 2024-09-01 | Bryan, Gharad; Karlan, Dean; Osman, Adam | We experimentally study the impact of relatively large enterprise loans in Egypt. Larger loans generate small average impacts, but machine learning using psychometric data reveals "top performers" (those with the highest predicted treatment effects) substantially increase profits, while profits drop for poor performers. The large differences imply that lender credit allocation decisions matter for aggregate income, yet we find existing practice leads to substantial misallocation. We argue that some entrepreneurs are overoptimistic and squander the opportunities presented by larger loans by taking on too much risk, and show the promise of allocations based on entrepreneurial type relative to firm characteristics. | View Infographic
From Mad Men to Maths Men: Concentration and Buyer Power in Online Advertising | American Economic Review | 2021-10-01 | Decarolis, Francesco; Rovigatti, Gabriele | This paper analyzes the impact of intermediary concentration on the allocation of revenue in online platforms. We study sponsored search documenting how advertisers increasingly bid through a handful of specialized intermediaries. This enhances automated bidding and data pooling, but lessens competition whenever the intermediary represents competing advertisers. Using data on nearly 40 million Google keyword auctions, we first apply machine learning algorithms to cluster keywords into thematic groups serving as relevant markets. Using an instrumental variable strategy, we estimate a decline in the platform's revenue of approximately 11 percent due to the average rise in concentration associated with intermediary merger and acquisition activity. | View Infographic
Predicting and Understanding Initial Play | American Economic Review | 2019-12-01 | Fudenberg, Drew; Liang, Annie | We use machine learning to uncover regularities in the initial play of matrix games. We first train a prediction algorithm on data from past experiments. Examining the games where our algorithm predicts correctly, but existing economic models don't, leads us to add a parameter to the best performing model that improves predictive accuracy. We then observe play in a collection of new "algorithmically generated" games, and learn that we can obtain even better predictions with a hybrid model that uses a decision tree to decide game-by-game which of two economic models to use for prediction. | View Infographic
Does Machine Learning Automate Moral Hazard and Error? | American Economic Review | 2017-05-01 | Mullainathan, Sendhil; Obermeyer, Ziad | Machine learning tools are beginning to be deployed en masse in health care. While the statistical underpinnings of these techniques have been questioned with regard to causality and stability, we highlight a different concern here, relating to measurement issues. A characteristic feature of health data, unlike other applications of machine learning, is that neither y nor x is measured perfectly. Far from a minor nuance, this can undermine the power of machine learning algorithms to drive change in the health care system--and indeed, can cause them to reproduce and even magnify existing errors in human judgment. | View Infographic
Wearable Technologies and Health Behaviors: New Data and New Methods to Understand Population Health | American Economic Review | 2017-05-01 | Handel, Benjamin; Kolstad, Jonathan | We study a randomized control trial in a large employer population of access to "wearable" technologies and the associated planning and monitoring tools on improved health behaviors (sleep and exercise). Both ITT and IV estimates based on actual plan enrollment for the treatment group suggest statistically significant but economically small changes in behavior after three months. We then implement machine learning-based models to assess treatment effect heterogeneity. We find little evidence for heterogeneous treatment effects based on observables. We also present detailed data on sleep patterns underscoring the value of this new data source to researchers. | View Infographic
Double/Debiased/Neyman Machine Learning of Treatment Effects | American Economic Review | 2017-05-01 | Chernozhukov, Victor; Chetverikov, Denis; Demirer, Mert; Duflo, Esther; Hansen, Christian; Newey, Whitney | Chernozhukov et al. (2016) provide a generic double/de-biased machine learning (ML) approach for obtaining valid inferential statements about focal parameters, using Neyman-orthogonal scores and cross-fitting, in settings where nuisance parameters are estimated using ML methods. In this note, we illustrate the application of this method in the context of estimating average treatment effects and average treatment effects on the treated using observational data. | View Infographic
Productivity and Selection of Human Capital with Machine Learning | American Economic Review | 2016-05-01 | Chalfin, Aaron; Danieli, Oren; Hillis, Andrew; Jelveh, Zubin; Luca, Michael; Ludwig, Jens | Economists have become increasingly interested in studying the nature of production functions in social policy applications, with the goal of improving productivity. Traditionally, models have assumed workers are homogeneous inputs. However, in practice, substantial variability in productivity means the marginal productivity of labor depends substantially on which new workers are hired--which requires not an estimate of a causal effect, but rather a prediction. We demonstrate that there can be large social welfare gains from using machine learning tools to predict worker productivity, using data from two important applications: police hiring and teacher tenure decisions. | View Infographic
Machine Learning as a Tool for Hypothesis Generation | Quarterly Journal of Economics | 2024-05-01 | Ludwig, Jens; Mullainathan, Sendhil | While hypothesis testing is a highly formalized activity, hypothesis generation remains largely informal. We propose a systematic procedure to generate novel hypotheses about human behavior, which uses the capacity of machine learning algorithms to notice patterns people might not. We illustrate the procedure with a concrete application: judge decisions about whom to jail. We begin with a striking fact: the defendant's face alone matters greatly for the judge's jailing decision. In fact, an algorithm given only the pixels in the defendant's mug shot accounts for up to half of the predictable variation. We develop a procedure that allows human subjects to interact with this black-box algorithm to produce hypotheses about what in the face influences judge decisions. The procedure generates hypotheses that are both interpretable and novel: they are not explained by demographics (e.g., race) or existing psychology research, nor are they already known (even if tacitly) to people or experts. Though these results are specific, our procedure is general. It provides a way to produce novel, interpretable hypotheses from any high-dimensional data set (e.g., cell phones, satellites, online behavior, news headlines, corporate filings, and high-frequency time series). A central tenet of our article is that hypothesis generation is a valuable activity, and we hope this encourages future work in this largely "prescientific" stage of science. | View Infographic
Diagnosing Physician Error: A Machine Learning Approach to Low-Value Health Care | Quarterly Journal of Economics | 2022-05-01 | Mullainathan, Sendhil; Obermeyer, Ziad | We use machine learning as a tool to study decision making, focusing specifically on how physicians diagnose heart attack. An algorithmic model of a patient's probability of heart attack allows us to identify cases where physicians' testing decisions deviate from predicted risk. We then use actual health outcomes to evaluate whether those deviations represent mistakes or physicians' superior knowledge. This approach reveals two inefficiencies. Physicians overtest: predictably low-risk patients are tested, but do not benefit. At the same time, physicians undertest: predictably high-risk patients are left untested, and then go on to suffer adverse health events including death. A natural experiment using shift-to-shift testing variation confirms these findings. Simultaneous over- and undertesting cannot easily be explained by incentives alone, and instead point to systematic errors in judgment. We provide suggestive evidence on the psychology underlying these errors. First, physicians use too simple a model of risk. Second, they overweight factors that are salient or representative of heart attack, such as chest pain. We argue health care models must incorporate physician error, and illustrate how policies focused solely on incentive problems can produce large inefficiencies. | View Infographic
Human Decisions and Machine PredictionsQuarterly Journal of Economics20180201Kleinberg, Jon; Lakkaraju, Himabindu; Leskovec, Jure; Ludwig, Jens; Mullainathan, SendhilCan machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.View Infographic
Personalized Pricing and Consumer WelfareJournal of Political Economy20230101Dube, Jean-Pierre; Misra, SanjogWe study the welfare implications of personalized pricing implemented with machine learning. We use data from a randomized controlled pricing field experiment to construct personalized prices and validate these in the field. We find that unexercised market power increases profit by 55%. Personalization improves expected profits by an additional 19% and by 86% relative to the nonoptimized price. While total consumer surplus declines under personalized pricing, over 60% of consumers benefit from personalization. Under some inequity-averse welfare functions, consumer welfare may even increase. Simulations reveal a nonmonotonic relationship between the granularity of data and consumer surplus under personalization.View Infographic
CEO Behavior and Firm PerformanceJournal of Political Economy20200401Bandiera, Oriana; Prat, Andrea; Hansen, Stephen; Sadun, RaffaellaWe develop a new method to measure CEO behavior in large samples via a survey that collects high-frequency, high-dimensional diary data and a machine learning algorithm that estimates behavioral types. Applying this method to 1,114 CEOs in six countries reveals two types: "leaders," who do multifunction, high-level meetings, and "managers," who do individual meetings with core functions. Firms that hire leaders perform better, and it takes three years for a new CEO to make a difference. Structural estimates indicate that productivity differentials are due to mismatches rather than to leaders being better for all firms.View Infographic
Estimation Based on Nearest Neighbor Matching: From Density Ratio to Average Treatment EffectEconometrica20231101Lin, Zhexiao; Ding, Peng; Han, FangNearest neighbor (NN) matching is widely used in observational studies for causal effects. Abadie and Imbens (2006) provided the first large-sample analysis of NN matching. Their theory focuses on the case with the number of NNs, M fixed. We reveal something new out of their study and show that once allowing M to diverge with the sample size an intrinsic statistic in their analysis constitutes a consistent estimator of the density ratio with regard to covariates across the treated and control groups. Consequently, with a diverging M, the NN matching with Abadie and Imbens' (2011) bias correction yields a doubly robust estimator of the average treatment effect and is semiparametrically efficient if the density functions are sufficiently smooth and the outcome model is consistently estimated. It can thus be viewed as a precursor of the double machine learning estimators.View Infographic
Terrorism Financing, Recruitment, and AttacksEconometrica20220701Limodio, NicolaThis paper investigates the effect of terrorism financing and recruitment on attacks. I exploit a Sharia-compliant institution in Pakistan, which induces unintended and quasi-experimental variation in the funding of terrorist groups through their religious affiliation. The results indicate that higher terrorism financing, in a given location and period, generates more attacks in the same location and period. Financing exhibits a complementarity in producing attacks with terrorist recruitment, measured through data from Jihadist-friendly online fora and machine learning. A higher supply of terror is responsible for the increase in attacks and is identified by studying groups with different affiliations operating in multiple cities. These findings are consistent with terrorist organizations facing financial frictions to their internal capital market.View Infographic
Locally Robust Semiparametric EstimationEconometrica20220701Chernozhukov, Victor; Escanciano, Juan Carlos; Ichimura, Hidehiko; Newey, Whitney K.; Robins, James M.Many economic and causal parameters depend on nonparametric or high dimensional first steps. We give a general construction of locally robust/orthogonal moment functions for GMM, where first steps have no effect, locally, on average moment functions. Using these orthogonal moments reduces model selection and regularization bias, as is important in many applications, especially for machine learning first steps. Also, associated standard errors are robust to misspecification when there is the same number of moment functions as parameters of interest. We use these orthogonal moments and cross-fitting to construct debiased machine learning estimators of functions of high dimensional conditional quantiles and of dynamic discrete choice parameters with high dimensional state variables. We show that additional first steps needed for the orthogonal moment functions have no effect, globally, on average orthogonal moment functions. We give a general approach to estimating those additional first steps. We characterize double robustness and give a variety of new doubly robust moment functions. We give general and simple regularity conditions for asymptotic theory.View Infographic
Automatic Debiased Machine Learning of Causal and Structural EffectsEconometrica20220501Chernozhukov, Victor; Newey, Whitney K.; Singh, RahulMany causal and structural effects depend on regressions. Examples include policy effects, average derivatives, regression decompositions, average treatment effects, causal mediation, and parameters of economic structural models. The regressions may be high-dimensional, making machine learning useful. Plugging machine learners into identifying equations can lead to poor inference due to bias from regularization and/or model selection. This paper gives automatic debiasing for linear and nonlinear functions of regressions. The debiasing is automatic in using Lasso and the function of interest without the full form of the bias correction. The debiasing can be applied to any regression learner, including neural nets, random forests, Lasso, boosting, and other high-dimensional methods. In addition to providing the bias correction, we give standard errors that are robust to misspecification, convergence rates for the bias correction, and primitive conditions for asymptotic inference for estimators of a variety of estimators of structural and causal effects. The automatic debiased machine learning is used to estimate the average treatment effect on the treated for the NSW job training data and to estimate demand elasticities from Nielsen scanner data while allowing preferences to be correlated with prices and income.View Infographic
Bootstrap-Based Inference for Cube Root AsymptoticsEconometrica20200901Cattaneo, Matias D.; Jansson, Michael; Nagasawa, KenichiThis paper proposes a valid bootstrap-based distributional approximation for M-estimators exhibiting a Chernoff (1964)-type limiting distribution. For estimators of this kind, the standard nonparametric bootstrap is inconsistent. The method proposed herein is based on the nonparametric bootstrap, but restores consistency by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. This modification leads to a generic and easy-to-implement resampling method for inference that is conceptually distinct from other available distributional approximations. We illustrate the applicability of our results with four examples in econometrics and machine learning.View Infographic
Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional SpeechEconometrica20190701Gentzkow, Matthew; Shapiro, Jesse M.; Taddy, MattWe study the problem of measuring group differences in choices when the dimensionality of the choice set is large. We show that standard approaches suffer from a severe finite-sample bias, and we propose an estimator that applies recent advances in machine learning to address this bias. We apply this method to measure trends in the partisanship of congressional speech from 1873 to 2016, defining partisanship to be the ease with which an observer could infer a congressperson's party from a single utterance. Our estimates imply that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century.View Infographic
Program Evaluation and Causal Inference with High-Dimensional DataEconometrica20170101Belloni, A.; Chernozhukov, V.; Fernandez-Val, I.; Hansen, C.In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle control variables, receipt of treatment, treatment effects, and outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced-form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced-form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets. The results on program evaluation are obtained as a consequence of more general results on honest inference in a general moment-condition framework, which arises from structural equation models in econometrics. Here, too, the crucial ingredient is the use of orthogonal moment conditions, which can be constructed from the initial moment conditions. We provide results on honest inference for (function-valued) parameters within this general framework where any high-quality, machine learning methods (e.g., boosted trees, deep neural networks, random forest, and their aggregated and hybrid versions) can be used to learn the nonparametric/high-dimensional components of the model. These include a number of supporting auxiliary results that are of major independent interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid functional delta method, and (3) provide results for sparsity-based estimation of regression functions for function-valued outcomes.View Infographic
Artificial Intelligence, Education, and EntrepreneurshipJournal of Finance20240201Gofman, Michael; Jin, ZhaoWe document an unprecedented brain drain of Artificial Intelligence (AI) professors from universities from 2004 to 2018. We find that students from the affected universities establish fewer AI startups and raise less funding. The brain-drain effect is significant for tenured professors, professors from top universities, and deep-learning professors. Additional evidence suggests that unobserved city- and university-level shocks are unlikely to drive our results. We consider several economic channels for the findings. The most consistent explanation is that professors' departures reduce startup founders' AI knowledge, which we find is an important factor for successful startup formation and fundraising.View Infographic
Artificial Intelligence, Firm Growth, and Product InnovationJournal of Financial Economics20240101Babina, Tania; Fedyk, Anastassia; He, Alex; Hodson, JamesWe study the use and economic impact of AI technologies. We propose a new measure of firm-level AI investments using employee resumes. Our measure reveals a stark increase in AI investments across sectors. AI-investing firms experience higher growth in sales, employment, and market valuations. This growth comes primarily through increased product innovation. Our results are robust to instrumenting AI investments using firms' exposure to universities' supply of AI graduates. AI-powered growth concentrates among larger firms and is associated with higher industry concentration. Our results highlight that new technologies like AI can contribute to growth and superstar firms through product innovation.View Infographic
Artificial Intelligence, Algorithmic Pricing, and CollusionAmerican Economic Review20201001Calvano, Emilio; Calzolari, Giacomo; Denicolo, Vincenzo; Pastorello, SergioIncreasingly, algorithms are supplanting human decision-makers in pricing goods and services. To analyze the possible consequences, we study experimentally the behavior of algorithms powered by Artificial Intelligence (Q-learning) in a workhorse oligopoly model of repeated price competition. We find that the algorithms consistently learn to charge supracompetitive prices, without communicating with one another. The high prices are sustained by collusive strategies with a finite phase of punishment followed by a gradual return to cooperation. This finding is robust to asymmetries in cost or demand, changes in the number of players, and various forms of uncertainty.View Infographic
AI-tocracyQuarterly Journal of Economics20230801Beraja, Martin; Kao, Andrew; Yang, David Y.; Yuchtman, NoamRecent scholarship has suggested that artificial intelligence (AI) technology and autocratic regimes may be mutually reinforcing. We test for a mutually reinforcing relationship in the context of facial-recognition AI in China. To do so, we gather comprehensive data on AI firms and government procurement contracts, as well as on social unrest across China since the early 2010s. We first show that autocrats benefit from AI: local unrest leads to greater government procurement of facial-recognition AI as a new technology of political control, and increased AI procurement indeed suppresses subsequent unrest. We show that AI innovation benefits from autocrats' suppression of unrest: the contracted AI firms innovate more both for the government and commercial markets and are more likely to export their products; noncontracted AI firms do not experience detectable negative spillovers. Taken together, these results suggest the possibility of sustained AI innovation under the Chinese regime: AI innovation entrenches the regime, and the regime's investment in AI for political control stimulates further frontier innovation.View Infographic
Data-Intensive Innovation and the State: Evidence from AI Firms in ChinaReview of Economic Studies20230701Beraja, Martin; Yang, David Y.; Yuchtman, NoamDeveloping artificial intelligence (AI) technology requires data. In many domains, government data far exceed in magnitude and scope data collected by the private sector, and AI firms often gain access to such data when providing services to the state. We argue that such access can stimulate commercial AI innovation in part because data and trained algorithms are shareable across government and commercial uses. We gather comprehensive information on firms and public security procurement contracts in China's facial recognition AI industry. We quantify the data accessible through contracts by measuring public security agencies' capacity to collect surveillance video. Using a triple-differences strategy, we find that data-rich contracts, compared to data-scarce ones, lead recipient firms to develop significantly and substantially more commercial AI software. Our analysis suggests a contribution of government data to the rise of China's facial recognition AI firms, and that states' data collection and provision policies could shape AI innovation.View Infographic
Platform Design When Sellers Use Pricing AlgorithmsEconometrica20230901Johnson, Justin P.; Rhodes, Andrew; Wildenbeest, MatthijsWe investigate the ability of a platform to design its marketplace to promote competition, improve consumer surplus, and increase its own payoff. We consider demand-steering rules that reward firms that cut prices with additional exposure to consumers. We examine the impact of these rules both in theory and by using simulations with artificial intelligence pricing algorithms (specifically Q-learning algorithms, which are commonly used in computer science). Our theoretical results indicate that these policies (which require little information to implement) can have strongly beneficial effects, even when sellers are infinitely patient and seek to collude. Similarly, our simulations suggest that platform design can benefit consumers and the platform, but that achieving these gains may require policies that condition on past behavior and treat sellers in a nonneutral fashion. These more sophisticated policies disrupt the ability of algorithms to rotate demand and split industry profits, leading to low prices.View Infographic
Goal Setting and Saving in the FinTech EraJournal of Finance20240601Gargano, Antonio; Rossi, Alberto G.We study the effectiveness of saving goals in increasing individuals' savings using data from a Fintech app. Using a difference-in-differences identification strategy that randomly assigns users into a group of beta testers who can set goals and a group of users who cannot, we find that setting goals increases individuals' savings rate. The increased savings within the app do not reduce savings outside the app. Moreover, goal setting helps those individuals previously identified as having the lowest propensity to save. Matching App user survey responses to their behavior highlights the relative merits of monitoring and concreteness channels in explaining our findings.View Infographic
Lender Automation and Racial Disparities in Credit AccessJournal of Finance20240401Howell, Sabrina T.; Kuchler, Theresa; Snitkof, David; Stroebel, Johannes; Wong, JunProcess automation reduces racial disparities in credit access by enabling smaller loans, broadening banks' geographic reach, and removing human biases from decision making. We document these findings in the context of the Paycheck Protection Program (PPP), where private lenders faced no credit risk but decided which firms to serve. Black-owned firms obtained PPP loans primarily from automated fintech lenders, especially in areas with high racial animus. After traditional banks automated their loan processing procedures, their PPP lending to Black-owned firms increased. Our findings cannot be fully explained by racial differences in loan application behaviors, preexisting banking relationships, firm performance, or fraud rates.View Infographic
Did FinTech Lenders Facilitate PPP Fraud?Journal of Finance20230601Griffin, John M.; Kruger, Samuel; Mahajan, PrateekIn the $793 billion Paycheck Protection Program, we examine metrics related to potential misreporting including nonregistered businesses, multiple businesses at residential addresses, abnormally high implied compensation per employee, and large inconsistencies with jobs reported in another government program. These measures consistently concentrate in certain FinTech lenders and are cross-verified by seven additional measures. FinTech market share increased significantly over time, and suspicious lending by FinTechs in 2021 is four times the level at the start of the program. Suspicious loans are being overwhelmingly forgiven at rates similar to other loans.View Infographic
Attention-Induced Trading and Returns: Evidence from Robinhood UsersJournal of Finance20221201Barber, Brad M.; Huang, Xing; Odean, Terrance; Schwarz, ChristopherWe study the influence of financial innovation by fintech brokerages on individual investors' trading and stock prices. Using data from Robinhood, we find that Robinhood investors engage in more attention-induced trading than other retail investors. For example, Robinhood outages disproportionately reduce trading in high-attention stocks. While this evidence is consistent with Robinhood attracting relatively inexperienced investors, we show that it is also driven in part by the app's unique features. Consistent with models of attention-induced trading, intense buying by Robinhood users forecasts negative returns. Average 20-day abnormal returns are -4.7% for the top stocks purchased each day.View Infographic
Regulatory Arbitrage or Random Errors? Implications of Race Prediction Algorithms in Fair Lending AnalysisJournal of Financial Economics20240701Greenwald, Daniel L.; Howell, Sabrina T.; Li, Cangyuan; Yimfor, EmmanuelWhen race is not directly observed, regulators and analysts commonly predict it using algorithms based on last name and address. In small business lending--where regulators assess fair lending law compliance using the Bayesian Improved Surname Geocoding (BISG) algorithm--we document large prediction errors among Black Americans. The errors bias measured racial disparities in loan approval rates downward by 43%, with greater bias for traditional vs. fintech lenders. Regulation using self-identified race would increase lending to Black borrowers, but also shift lending toward affluent areas because errors correlate with socioeconomics. Overall, using race proxies in policymaking and research presents challenges.View Infographic
Open Banking: Credit Market Competition When Borrowers Own the DataJournal of Financial Economics20230201He, Zhiguo; Huang, Jing; Zhou, JidongOpen banking facilitates data sharing consented to by customers who generate the data, with the regulatory goal of promoting competition between traditional banks and challenger fintech entrants. We study lending market competition when sharing banks' customer transaction data enables better borrower screening for fintechs. Open banking promotes competition if it helps level the playing field for all lenders in screening borrowers; however, if it over-empowers fintechs, it can also hinder competition and leave all borrowers worse off. Due to the credit quality inference from borrowers' sign-up decisions, this remains true even if borrowers have the control of whether to share their banking data. We also study extensions with fintech affinities and data sharing on borrower preferences.View Infographic
Measuring the Welfare Cost of Asymmetric Information in Consumer Credit MarketsJournal of Financial Economics20221201DeFusco, Anthony A.; Tang, Huan; Yannelis, ConstantineInformation asymmetries are known in theory to lead to inefficiently low credit provision, yet empirical estimates of the resulting welfare losses are scarce. This paper leverages a randomized experiment conducted by a large fintech lender to estimate welfare losses arising from asymmetric information in the market for online consumer credit. Building on methods from the insurance literature, we show how exogenous variation in interest rates can be used to estimate borrower demand and lender cost curves and recover implied welfare losses. While asymmetric information generates large equilibrium price distortions, we find only small overall welfare losses, particularly for high-credit-score borrowers.View Infographic
Can FinTech Reduce Disparities in Access to Finance? Evidence from the Paycheck Protection ProgramJournal of Financial Economics20221001Erel, Isil; Liebersohn, JackNew technology promises to expand the supply of financial services to small businesses poorly served by banks. Does it succeed? We study the response of FinTech to financial services demand created by the introduction of the Paycheck Protection Program. FinTech is disproportionately used in ZIP codes with fewer bank branches, lower incomes, and more minority households, and in industries with fewer banking relationships. It is also greater in counties where the economic effects of the COVID-19 pandemic were more severe. Substitution between FinTech and banks is economically small, implying that FinTech mostly expands, rather than redistributes, the supply of financial services.View Infographic
Consumer-Lending Discrimination in the FinTech EraJournal of Financial Economics20220101Bartlett, Robert; Morse, Adair; Stanton, Richard; Wallace, NancyU.S. fair-lending law prohibits lenders from making credit determinations that disparately affect minority borrowers if those determinations are based on characteristics unrelated to creditworthiness. Using an identification under this rule, we show risk-equivalent Latinx/Black borrowers pay significantly higher interest rates on GSE-securitized and FHA-insured loans, particularly in high-minority-share neighborhoods. We estimate these rate differences cost minority borrowers over $450 million yearly. FinTech lenders' rate disparities were similar to those of non-Fintech lenders for GSE mortgages, but lower for FHA mortgages issued in 2009-2015 and for FHA refi mortgages issued in 2018-2019.View Infographic
Fintech, Regulatory Arbitrage, and the Rise of Shadow BanksJournal of Financial Economics20181201Buchak, Greg; Matvos, Gregor; Piskorski, Tomasz; Seru, AmitShadow bank market share in residential mortgage origination nearly doubled from 2007 to 2015, with particularly dramatic growth among online "fintech" lenders. We study how two forces, regulatory differences and technological advantages, contributed to this growth. Difference in difference tests exploiting geographical heterogeneity induced by four specific increases in regulatory burden--capital requirements, mortgage servicing rights, mortgage-related lawsuits, and the movement of supervision to Office of Comptroller and Currency following closure of the Office of Thrift Supervision--all reveal that traditional banks contracted in markets where they faced more regulatory constraints; shadow banks partially filled these gaps. Relative to other shadow banks, fintech lenders serve more creditworthy borrowers and are more active in the refinancing market. Fintech lenders charge a premium of 14-16 basis points and appear to provide convenience rather than cost savings to borrowers. They seem to use different information to set interest rates relative to other lenders. A quantitative model of mortgage lending suggests that regulation accounts for roughly 60% of shadow bank growth, while technology accounts for roughly 30%.View Infographic
When FinTech Competes for Payment FlowsReview of Financial Studies20221101Parlour, Christine A.; Rajan, Uday; Zhu, HaoxiangWe study the impact of FinTech competition in payment services when a monopolist bank uses payment data to learn about consumers' credit quality. Competition from FinTech payment providers disrupts this information spillover. The bank's price for payment services and its loan offers are affected. FinTech competition promotes financial inclusion, may hurt consumers with a strong bank preference, and has an ambiguous effect on the loan market. Both FinTech data sales and consumer data portability increase bank lending, but the effects on consumer welfare are ambiguous. Under mild conditions, consumer welfare is higher under data sales than with data portability.View Infographic
Small Bank Lending in the Era of Fintech and Shadow Banks: A Sideshow?Review of Financial Studies20221101Begley, Taylor A.; Srinivasan, KandarpAmid the emerging dominance of nonbanks, small banks use key financing advantages to persist in the mortgage market. We provide evidence of the heterogeneous impact of two shocks to the supply of mortgage credit: postcrisis regulatory burden and GSE financing cost changes. Small banks exploit regulation disproportionately affecting the largest four banks (Big4) and their ability to lend on balance sheet to strongly substitute for the retreating Big4. The erasure of guarantee fee (g-fee) discounts for large lenders facilitates small bank growth in GSE lending. Small banks also grow balance sheet loans in areas more exposed to g-fee hikes.View Infographic
The Rise of Finance Companies and FinTech Lenders in Small Business LendingReview of Financial Studies20221101Gopal, Manasa; Schnabl, PhilippWe document that finance companies and FinTech lenders increased lending to small businesses after the 2008 financial crisis. We show that most of the increase substituted for a reduction in bank lending. In counties in which banks had a larger market share before the crisis, finance companies and FinTech lenders increased their lending more. We find no effect of reduced bank lending on employment, wages, and new business creation by 2016. Our results suggest that finance companies and FinTech lenders are major suppliers of credit to small businesses and played an important role in the recovery from the 2008 financial crisis.View Infographic
Regressive Mortgage Credit Redistribution in the Post-crisis EraReview of Financial Studies20220101D'Acunto, Francesco; Rossi, Alberto G.We document four secular trends about U.S. mortgage origination by traditional and FinTech lenders after the 2008-2009 financial crisis. First, since 2011, the overall number, size, and approval rate of small and medium-sized loans have been decreasing over time, relative to large loans. Second, the largest lenders redistribute their lending the most. Third, this loan-size redistribution of credit increases in the size of the lender. Fourth, the effects are stronger for mortgages further away from the conforming loan limit(s) in both directions. We argue that the supply of credit drives these secular trends, and we assess several potential economic mechanisms.View Infographic
Fintech Borrowers: Lax Screening or Cream-Skimming?Review of Financial Studies20211001Di Maggio, Marco; Yao, VincentWe study the personal credit market using unique individual-level data covering fintech and traditional lenders. We show that fintech lenders acquire market share by lending first to higher-risk borrowers and then to safer borrowers, and rely mainly on hard information to make credit decisions. Fintech borrowers are significantly more likely to default than neighbor individuals with the same characteristics borrowing from traditional financial institutions. Furthermore, they tend to experience a short-lived reduction in the cost of credit, because their indebtedness increases more than non-fintech borrowers after loan origination. However, fintech lenders' pricing strategies are likely to take this into account.View Infographic
On the Rise of FinTechs: Credit Scoring Using Digital Footprints | Review of Financial Studies | 20200701 | Berg, Tobias; Burg, Valentin; Gombovic, Ana; Puri, Manju | We analyze the information content of a digital footprint--that is, information that users leave online simply by accessing or registering on a Web site--for predicting consumer default. We show that even simple, easily accessible variables from a digital footprint match the information content of credit bureau scores. A digital footprint complements rather than substitutes for credit bureau information and affects access to credit and reduces default rates. We discuss the implications for financial intermediaries' business models, access to credit for the unbanked, and the behavior of consumers, firms, and regulators in the digital sphere. | View Infographic
How Valuable Is FinTech Innovation? | Review of Financial Studies | 20190501 | Chen, Mark A.; Wu, Qinxi; Yang, Baozhong | We provide large-scale evidence on the occurrence and value of FinTech innovation. Using data on patent filings from 2003 to 2017, we apply machine learning to identify and classify innovations by their underlying technologies. We find that most FinTech innovations yield substantial value to innovators, with blockchain being particularly valuable. For the overall financial sector, internet of things (IoT), robo-advising, and blockchain are the most valuable innovation types. Innovations affect financial industries more negatively when they involve disruptive technologies from nonfinancial startups, but market leaders that invest heavily in their own innovation can avoid much of the negative value effect. | View Infographic
To FinTech and Beyond | Review of Financial Studies | 20190501 | Goldstein, Itay; Jiang, Wei; Karolyi, G. Andrew | FinTech is about the introduction of new technologies into the financial sector, and it is now revolutionizing the financial industry. In 2017, when the academic finance community was not actively researching FinTech, the editorial team of the Review of Financial Studies launched a competition to develop research proposals focused on this topic. This special issue is the result. In this introductory article, we describe the recent FinTech phenomenon and the novel editorial protocol employed for this special issue following the Registered Reports format. We discuss what we learned from the submitted proposals about the field of FinTech and which ones we selected to be completed and ultimately come out in this special issue. We also provide several observations to help guide future research in the emerging area of FinTech. | View Infographic
The Role of Technology in Mortgage Lending | Review of Financial Studies | 20190501 | Fuster, Andreas; Plosser, Matthew; Schnabl, Philipp; Vickery, James | Technology-based ("FinTech") lenders increased their market share of U.S. mortgage lending from 2% to 8% from 2010 to 2016. Using loan-level data on mortgage applications and originations, we show that FinTech lenders process mortgage applications 20% faster than other lenders, controlling for observable characteristics. Faster processing does not come at the cost of higher defaults. FinTech lenders adjust supply more elastically than do other lenders in response to exogenous mortgage demand shocks. In areas with more FinTech lending, borrowers refinance more, especially when it is in their interest. We find no evidence that FinTech lenders target borrowers with low access to finance. | View Infographic
Belief Distortions and Macroeconomic Fluctuations | American Economic Review | 20220701 | Bianchi, Francesco; Ludvigson, Sydney C.; Ma, Sai | This paper combines a data-rich environment with a machine learning algorithm to provide new estimates of time-varying systematic expectational errors (belief distortions) embedded in survey responses. We find sizable distortions even for professional forecasters, with all respondent-types overweighting the implicit judgmental component of their forecasts relative to what can be learned from publicly available information. Forecasts of inflation and GDP growth oscillate between optimism and pessimism by large margins, with belief distortions evolving dynamically in response to cyclical shocks. The results suggest that artificial intelligence algorithms can be productively deployed to correct errors in human judgment and improve predictive accuracy. | View Infographic
A Picture Is Worth a Thousand Words: Measuring Investor Sentiment by Combining Machine Learning and Photos from News | Journal of Financial Economics | 20220401 | Obaid, Khaled; Pukthuanthong, Kuntara | By applying machine learning to the accurate and cost-effective classification of photos based on sentiment, we introduce a daily market-level investor sentiment index (Photo Pessimism) obtained from a large sample of news photos. Consistent with behavioral models, Photo Pessimism predicts market return reversals and trading volume. The relation is strongest among stocks with high limits to arbitrage and during periods of elevated fear. We examine whether Photo Pessimism and pessimism embedded in news text act as complements or substitutes for each other in predicting stock returns and find evidence that the two are substitutes. | View Infographic
The Mortality and Medical Costs of Air Pollution: Evidence from Changes in Wind Direction | American Economic Review | 20191201 | Deryugina, Tatyana; Heutel, Garth; Miller, Nolan H.; Molitor, David; Reif, Julian | We estimate the causal effects of acute fine particulate matter exposure on mortality, health care use, and medical costs among the US elderly using Medicare data. We instrument for air pollution using changes in local wind direction and develop a new approach that uses machine learning to estimate the life-years lost due to pollution exposure. Finally, we characterize treatment effect heterogeneity using both life expectancy and generic machine learning inference. Both approaches find that mortality effects are concentrated in about 25 percent of the elderly population. | View Infographic