“Taxes are what we pay for civilized society.”
– Supreme Court Justice Oliver Wendell Holmes
Collecting revenue is a core function of government that funds nearly all public programs, from health care to criminal justice and from economic regulation to military defense. Taxpayer audits are a central part of the collection process: they aim to detect underpayment of taxes and to encourage honest income reporting. In the United States, the Internal Revenue Service (IRS) audits less than 0.5% of 200M annual returns, so a central challenge for the IRS has been how to accurately identify which returns are at greatest risk of non-compliance.
In recent years, the IRS’s random audit process has come under increasing scrutiny. The tax gap – the difference between total taxes owed and taxes paid – is estimated to be nearing $500 billion. This shortfall starves the government of needed resources, contributes to growing wealth inequality, and reduces the willingness of honest taxpayers to comply with the law. Given the size of this gap, it is important that audits seeking to detect and deter evasion be directed where the benefits are expected to be greatest. The main approach the IRS uses to select which taxpayers to audit, first pioneered in the 1970s and periodically updated since, is a form of linear discriminant analysis, and modern machine learning may offer substantial gains. Some analysts have suggested that audits focus excessively on lower-income taxpayers, and a more complete analysis of, and approach to, these distributive concerns is important. At the same time, the IRS has faced shrinking enforcement resources and a dwindling capacity to audit taxpayers for evasion or fraud. The impact of this contraction in funding has been dramatic: from 2010 to 2017, the IRS experienced a 42% drop in the audit rate.
In partnership with the IRS, RegLab is working to use AI to help modernize the system for tax collection. This research focuses on developing new active learning methods to overhaul the risk prediction system with advances in machine learning and data science, designing explainable and trustworthy AI methods that put these risk estimates into practice, and evaluating the effectiveness of the system. Our work seeks to develop a fair, efficient, and intelligent AI system for identifying tax evasion and to address the human-centered challenges of integrating AI into a complex bureaucracy with ~10,000 diverse revenue agents, 150M taxpayers, and 1M audits annually.