Machine Learning Missing European Household Wealth


Household portfolios differ substantially within and across European economies. While public debates tend to focus on differences in net values, measures of compositional differences attract growing interest in economic research. The reason is that aspects of household portfolios such as liquidity constraints have been found to play a major role in determining the aggregate economic response to monetary and fiscal policy interventions. Yet, in most European countries, information on household assets and liabilities is collected in surveys, and accounting for item non-response is a serious challenge. In this paper, I exploit cross-country variation in the availability of administrative information on household wealth to conduct an alternative imputation of missing wealth items in the Household Finance and Consumption Survey (HFCS) countries. I do so by combining concepts from Item Response Theory with tools from Machine Learning. Specifically, I train an imputation algorithm on a dataset constructed from country surveys in which the use of administrative information leads to low numbers of missing items. I then apply this algorithm to impute missing items in those HFCS countries that have high rates of missing wealth items. Complementing the existing imputation, my approach makes it possible to share the benefits of administrative data with those countries that have to rely on surveys.
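
The train-on-donor, impute-on-recipient idea described above can be illustrated with a minimal sketch. It is not the algorithm used in this paper: a simple k-nearest-neighbour rule stands in for the machine-learning model, the Item Response Theory component is left out, and the function name `knn_impute`, the variable names, and all records are hypothetical and made up for illustration (they are not HFCS data).

```python
from statistics import median


def knn_impute(donors, recipients, target, features, k=3):
    """Fill missing `target` values in `recipients` from the k nearest donors.

    `donors` stands in for surveys backed by administrative data (few missing
    items); `recipients` stands in for surveys with many missing items.
    """
    # Only donor records with an observed target value can serve as donors.
    complete = [d for d in donors if d.get(target) is not None]
    imputed = []
    for r in recipients:
        row = dict(r)  # copy, so the input records are left unchanged
        if row.get(target) is None:
            # Squared Euclidean distance over the shared covariates.
            # Covariates are unscaled here; real data would need scaling.
            nearest = sorted(
                complete,
                key=lambda d: sum((d[f] - row[f]) ** 2 for f in features),
            )[:k]
            row[target] = median(d[target] for d in nearest)
        imputed.append(row)
    return imputed


# Hypothetical donor-country records (made-up numbers).
donors = [
    {"age": 30, "income": 20, "deposits": 5},
    {"age": 35, "income": 25, "deposits": 8},
    {"age": 60, "income": 50, "deposits": 40},
    {"age": 65, "income": 55, "deposits": 45},
    {"age": 62, "income": 52, "deposits": 42},
]

# Hypothetical recipient-country records; None marks item non-response.
recipients = [
    {"age": 61, "income": 51, "deposits": None},
    {"age": 33, "income": 22, "deposits": 6},
]

result = knn_impute(donors, recipients, "deposits", ["age", "income"], k=3)
print(result)
```

Observed values in the recipient records are kept as reported; only items marked as missing are filled from the donor records.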