Nikola Schmidt
August 01, 2022
Extracting information from documents may not be the splashiest application of automation, but it has massive potential to reduce errors and costs.
Processing invoices is a fundamental and critical component of business operations. But it’s a slog. Each supplier has its own quirks, each invoice its own nomenclature—one company’s “payment term 15 days” is another’s “payment due in two weeks.” Even if invoices come from the same supplier every month, procurement agents change, formats vary, and typos creep in. And of course, invoices are just the tip of the documentation iceberg. Every day, at every company, at every level of management and operations, employees need to extract details from contracts, leases, tax forms, surveys, and other documents.
The good news? Artificial intelligence (AI) offers ways to perform these complex, integrated tasks far more efficiently. These solutions are seamless and scalable, simple to operate, and easy to manage. Using a variety of innovative AI techniques, organizations can process documents faster and simplify operational procedures; fewer errors mean fewer corrections and retractions. Recent research by PwC on automating analytics found that even the most rudimentary AI-based extraction techniques can save businesses 30–40% of the hours typically spent on such processes.
We all know about the paradigm-changing use of AI for Netflix recommendations, chatbots that impersonate customer service agents online, the dynamic pricing of hotel rooms, and the creation of routes for delivery companies. These efforts are the value creation engines of countless large, successful companies. What we’re talking about here is a decidedly less splashy and, at face value, more pedestrian use of AI—it’s aimed at reducing costs and optimizing operations rather than transforming or creating industries. But this boring AI is actually quite exciting, because it confronts issues that all companies wrestle with, and because the gains in productivity (and hence margins and valuations) are real.
Yet despite its huge potential, PwC’s AI Predictions 2021 found that only 28% of executives have prioritized using AI and machine learning for information extraction, significantly less than for other uses, such as chatbots and solutions for workplace safety. Some leaders are likely overwhelmed by the time and resources required to develop, scale, and integrate these advanced technologies. Some will be hesitant to trust AI or will feel skeptical about its utility. Others may simply be overlooking the value of automated information extraction because it is a back-office function. Regardless of the reason, they are missing an opportunity to streamline processes and improve their return on investment.
The paperwork problem
Any company that audits a client’s books spends an enormous number of hours every year gathering evidence and verifying transactions to confirm that the balances and transactions associated with the client’s financial statements are correct; this is known as a “test of details.” For nearly three decades, workers have used spreadsheets (first Lotus 1-2-3, then Microsoft Excel) as the primary tool to complete the test of details.
Today, the evidence for these audits usually appears in PDF form—invoices, account statements, receipts—and it can run into the thousands of pages. Information residing on these PDFs must be manually entered into the spreadsheet. For a midsized company that processes 100,000 pages of documents annually at three minutes per page, it would take approximately 5,000 person-hours to complete the task; at US$50 per hour, that’s $250,000.
Now, what if the same company could deploy augmented intelligence? This is the term for applications based on adaptive systems powered by machine learning in which the algorithms learn from human experience, but humans make the ultimate decision. The AI tool can “read” text on each of the invoices and use relational data search to quickly identify supporting documentation that the organization had previously tagged as being important—a powerful shortcut when trying to manage millions of invoice exceptions. Even though paper invoices can be unique to each supplier, AI techniques can identify important fields in the different invoices, such as unit cost and quantity, and calculate ledger balances automatically. By implementing an AI solution and assuming the 40% estimate above, the example midsized company could save 2,000 hours for every 100,000 pages processed.
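The arithmetic behind this example can be checked with a short Python sketch. The inputs (100,000 pages, three minutes per page, US$50 per hour, a 40% reduction) come from the text above; the function and variable names are illustrative only, and the dollar value of the saved hours is simply implied by those same inputs.

```python
def manual_processing_hours(pages: int, minutes_per_page: float) -> float:
    """Person-hours needed to key in `pages` pages of documents by hand."""
    return pages * minutes_per_page / 60


# Figures from the midsized-company example in the text.
PAGES_PER_YEAR = 100_000
MINUTES_PER_PAGE = 3
HOURLY_RATE_USD = 50
AI_SAVINGS_RATE = 0.40  # upper end of the 30-40% estimate cited earlier

baseline_hours = manual_processing_hours(PAGES_PER_YEAR, MINUTES_PER_PAGE)
baseline_cost = baseline_hours * HOURLY_RATE_USD
hours_saved = baseline_hours * AI_SAVINGS_RATE
cost_saved = hours_saved * HOURLY_RATE_USD

print(f"Baseline: {baseline_hours:,.0f} hours, US${baseline_cost:,.0f}")
print(f"Saved at 40%: {hours_saved:,.0f} hours, US${cost_saved:,.0f}")
```

Changing the constants shows how sensitive the savings are to page volume and handling time per page.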
Another issue at many companies is the need to interpret and respond to tax notices or letters issued from government revenue agencies to both individuals and corporations. In the US, the federal government has more than 100 of these types of tax notices, and individual states have thousands more: account changes, payment requests, tax return discrepancies. In every case, someone needs to read the letter or notice, interpret it, verify its accuracy and applicability, catalog it, and, finally, respond. It is a challenging process, prone to error. Besides involving typical data-entry errors, these types of documents can get lost in the shuffle, literally, resulting in missed notices, late responses, and thousands of additional work hours to rectify the situation.
As part of an internal, enterprise-wide initiative to encourage an employee-led automation effort, PwC used augmented intelligence to read and respond to tax notices. The tool read many different types of forms, and extracted and understood terms and phrases that required particular actions, such as due dates, notice codes, amounts owed, failure-to-file penalties, and so forth. The tool then used natural language generation techniques to automatically create responses—bypassing the need to manually create them. In combination with other information extraction tools, as well as solutions for compliance, scenario planning, and international tax situations, PwC reduced the time normally required to execute these various tasks by more than 5 million hours, a savings of 16%.
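To make the extraction step concrete, here is a minimal Python sketch that pulls a notice code, a due date, and an amount owed out of a notice's text. It is a rule-based stand-in built on regular expressions, not the machine learning tool described above; the sample notice, the notice code format, and the patterns are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical patterns. Real notices vary by agency and would need
# their own rules or a trained model.
FIELD_PATTERNS = {
    "notice_code": re.compile(r"Notice\s+(?:code\s+)?([A-Z]{2}\d{2,4}[A-Z]?)"),
    "due_date": re.compile(
        r"[Dd]ue\s+(?:date|by)[:\s]+([A-Z][a-z]+ \d{1,2}, \d{4})"
    ),
    "amount_owed": re.compile(r"[Aa]mount\s+(?:owed|due)[:\s]+\$([\d,]+\.\d{2})"),
}


def extract_fields(text: str) -> dict[str, Optional[str]]:
    """Return the first match for each field, or None if it is absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up notice text, for illustration only.
sample_notice = (
    "Notice XY123\n"
    "Amount owed: $1,250.00\n"
    "Due date: March 15, 2023\n"
)

print(extract_fields(sample_notice))
```

Fields that come back as None are natural candidates for human review, which keeps a person in the loop for notices the rules do not cover.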
When a human spot-checks such tax notices, it is a form of augmented intelligence. But when the responses are sent automatically, it is autonomous intelligence—an AI system that both is adaptive and makes decisions itself, without human involvement. (Both options work; they are applied in different areas, depending on risk tolerance.) Companies that implement advanced pattern-matching techniques could automatically identify trends that may cause them to receive specific notices—such as adding the same erroneous information in the same section of a tax form—and thereby avoid such notices in the future, saving more time and resources.
Digital data, such as details gathered from a survey, also runs up against the problems of manual analysis. When a firm conducts an employee survey, for example, somebody has to tally and analyze the results. Even if the survey is done online (as opposed to via paper-and-pencil responses, which open up significant risk of data entry errors), someone still has to compile, analyze, and summarize the data. This task, frequently delegated to junior analysts with a basic level of statistical knowledge and experience, is also a minefield for inaccuracies. Relationships between variables could be spurious and yet be proclaimed significant and transformative, leading to flawed conclusions that fuel faulty and unreliable strategies. A classic example: ice cream sales are often positively correlated with crime. Of course, ice cream sales do not cause crime (or vice versa); it’s simply that both rise during hot summer weather.
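The ice cream example can be reproduced with a small Python sketch. The numbers below are invented toy data, constructed so that temperature drives both series; they are not real measurements. The sketch compares the raw correlation with the correlation that remains once a straight-line temperature effect is removed from each series (a simple form of partial correlation).

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def residuals(y, x):
    """What is left of y after removing a straight-line fit on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


# Invented toy data: both series are driven by temperature plus
# unrelated noise. These are not real measurements.
temperature = [10, 12, 14, 16, 18, 20, 22, 24]
noise_a = [1, -1, -1, 1, 1, -1, -1, 1]
noise_b = [1, -1, 1, -1, -1, 1, -1, 1]
ice_cream_sales = [5 * t + 3 * n for t, n in zip(temperature, noise_a)]
crime_reports = [2 * t + 2 * n for t, n in zip(temperature, noise_b)]

raw_r = pearson(ice_cream_sales, crime_reports)
partial_r = pearson(
    residuals(ice_cream_sales, temperature),
    residuals(crime_reports, temperature),
)

print(f"Raw correlation: {raw_r:.2f}")
print(f"Correlation after controlling for temperature: {partial_r:.2f}")
```

In this constructed example the raw correlation is strong, but it vanishes once temperature is accounted for, which is the pattern an analyst should look for before declaring a relationship meaningful.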
But comparing groups, putting respondents into categories, and checking for significant differences can all be automated. Moreover, for “free response” answers to questions such as “Do you have any additional ideas for how we can improve employee benefits?”, natural language processing techniques can identify important or prominent topics in respondents’ answers, summarize the main points, and generate automated reports, reducing the need to manually read through hundreds or thousands of responses to scores of questions. For the question above about benefits, such a system could identify responses focused on healthcare, flexibility, life insurance, and so on. Nevertheless, these AI systems’ static recommendations (a type of assisted intelligence) still require human judgment and decision-making.
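As a rough illustration of topic identification in free-response answers, the Python sketch below tags each response with the benefit topics it mentions and tallies the results. It uses simple phrase matching as a stand-in for real natural language processing; the phrase lists and the sample responses are hypothetical.

```python
from collections import Counter

# Hypothetical phrase lists. A production system would more likely
# learn topics from the responses themselves.
TOPIC_PHRASES = {
    "healthcare": ("health", "medical", "dental"),
    "flexibility": ("flexib", "remote", "work from home"),
    "life insurance": ("life insurance",),
}


def tag_topics(response: str) -> set[str]:
    """Return every topic whose phrases appear in the response."""
    text = response.lower()
    return {
        topic
        for topic, phrases in TOPIC_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    }


def summarize(responses: list[str]) -> Counter:
    """Count how many responses mention each topic."""
    counts = Counter()
    for response in responses:
        counts.update(tag_topics(response))
    return counts


# Made-up survey answers, for illustration only.
sample_responses = [
    "Better dental and medical coverage would help.",
    "More flexible hours and remote work options.",
    "Please add a life insurance option and lower medical premiums.",
    "The cafeteria could be better.",
]

for topic, count in summarize(sample_responses).most_common():
    print(f"{topic}: {count} response(s)")
```

Responses that match no topic, such as the last one here, are the ones a human reader would still need to look at.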
Tools of the trade
AI-driven information extraction can tackle many of the inefficiencies and problems endemic in the above scenarios. However, unlike the work of manufacturing robots that do spot welding or spray painting, AI-enabled information extraction is not a rote, routinized activity. It requires a slew of complex data science techniques involving multiple dynamic components that must adapt to ever-changing conditions. Integrating cutting-edge technologies such as optical character recognition (OCR), supervised machine learning, and automated analytics that incorporate natural language processing into a seamless process will require time and technical expertise.
Consider OCR, which is the ability to read printed characters on a page—even handwritten characters—regardless of font, size, orientation, and brightness. At present, we encounter this technology frequently with the automated deposit of checks using our phone, when the OCR reads not only the routing number and account number but also the check amount and date. OCR is an older technology but is still essential as the first step in the process that gathers the relevant data from the documents in question.
For many uses, turning that data into action requires sophisticated machine learning algorithms that can recognize and classify patterns. Machine learning algorithms can be calibrated on existing data to tune their parameters, and then unleashed onto novel data. They can be calibrated to recognize patterns that are sophisticated yet subtle indicators of monetary fraud, such as misspelled information on a loan application or excessive numbers of transfers or cash deposits. They can also unearth similar meanings in different legal contracts, for example, in exclusion, limitation, and indemnity clauses, which are all related to exemptions.
Additionally, machine learning algorithms can tackle a data set and categorize a set of entities into different groups. Automated customer segmentation is one well-known example of this, but categorizing tax notices, letters, or contract clauses is also possible and can save enormous amounts of time that would otherwise be spent reading these documents.
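The calibrate-then-apply pattern described above can be sketched in a few lines of Python. The example below is a minimal bag-of-words naive Bayes classifier, chosen here only because it is compact enough to show in full; it is not necessarily the technique used in any system mentioned in this article. The training sentences are invented, and the two labels borrow the notice types named earlier.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Bag-of-words naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def fit(self, labeled_docs):
        """Calibrate on existing (text, label) pairs."""
        for text, label in labeled_docs:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def predict(self, text):
        """Return the most probable label for a new document."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training examples, for illustration only.
training_docs = [
    ("please remit the amount owed by the due date", "payment request"),
    ("balance due pay the amount owed", "payment request"),
    ("we updated the address on your account", "account change"),
    ("your account name was changed", "account change"),
]

model = TinyNaiveBayes()
model.fit(training_docs)
print(model.predict("amount owed is due"))
print(model.predict("your account address changed"))
```

With only four training sentences the model is a toy, but the workflow is the same at scale: fit on labeled documents, then classify documents the model has not seen.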
Advances in natural language processing in the last few years have been impressive. Although it is not necessary to use the most advanced algorithms, such as the natural language generation application GPT-3, AI-enabled information extraction can nevertheless take advantage of some of these advances by identifying the true “meaning” of a document, through identification of contextual words, parts of speech, and so on. The AI itself does not understand what it is saying (although it might appear that way), but algorithms are able to generate summaries of documents; identify topics; judge the sentiment (positive or negative) of prose; identify key terms, provisions, or clauses within documents; and identify clusters of documents requiring similar actions.
Combining these AI techniques, it is possible to read and summarize long documents from third parties, competitors, or internal sources quickly and easily, and generate rapid and appropriate responses. In one application, in which we were searching for 35 different conceptual terms (for example, “governing law” or “termination date”) in various document types, such as loans and derivatives, we initially trained the AI system using only five documents and received an F1 score of 0.28. An F1 score is a measure of accuracy that mathematically combines false positives, false negatives, and true positives into a single score; a perfect F1 score would be 1, whereas a worthless F1 score would be 0. Further training over time with 565 more documents raised that F1 score to 0.83—not perfect, but quite good.
Each new document provides more context—a wider array of examples on which the algorithm can train its parameters and increase its accuracy. It should be noted, however, that accuracy can’t be measured by an F1 score alone (for example, a model with an F1 score of 0.60 might produce exactly what the business needs). The F1 score should be a guide, but in the end, it is human judgment and expertise that will validate the model and its level of accuracy.
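For readers who want the calculation itself, the standard F1 formula is 2 × TP / (2 × TP + FP + FN), where TP, FP, and FN are the counts of true positives, false positives, and false negatives; true negatives do not enter the score. The Python sketch below implements that formula. The counts in the example calls are arbitrary and are not taken from the project described above.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # Nothing to find and nothing predicted; treated as 0 here by convention.
        return 0.0
    return 2 * true_positives / denominator


# Arbitrary example counts, for illustration only.
print(f1_score(true_positives=8, false_positives=2, false_negatives=2))
print(f1_score(true_positives=10, false_positives=0, false_negatives=0))
print(f1_score(true_positives=0, false_positives=5, false_negatives=5))
```

Because false positives and false negatives are weighted equally, two models with the same F1 score can make very different kinds of mistakes, which is one reason the score works better as a guide than as a verdict.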
The human element
AI tools tend to be highly accurate, but when they do make errors, they can be nonsensical and downright bizarre. Maintaining human oversight during the implementation of these AI techniques is crucial to ensuring quality, both for model training and for the final correction of the output in downstream processes. Successful implementation thus requires more than procuring the tools. Companies will also need to take the following actions:
Create a new platform (or reconfigure an existing one) that combines data management, automation tools, and AI applications, but also keeps people in the loop. This platform could be a central enterprise-level portal, wherein data could be stored and exchanged, applications uploaded and downloaded, and collaboration and joint development encouraged through a communication interface. This platform should be accessible to everyone in the organization and should be receptive to employee-led innovations and applications as well as those from professional developers. Of course, such democratization of these powerful technologies ought to proceed responsibly; leaders must stay vigilant about the potential risks and cognizant of the need for proper training and corporate governance.
Develop an enterprise-wide training program focused on digital and analytic understanding and awareness. Everyone will need to be upskilled, from the CEO to the newest entry-level hire, across all functions. Companies should consider training many of these employees not only in the use of these time-saving information extraction tools, but also in the fundamentals of the AI technologies behind them. With a better understanding of the capabilities, risks, limitations, and assumptions of the AI, employees will better understand how to use the tools responsibly and effectively. Every organization should ensure that its employees are conversant with current technologies, and this transformation will take hold only if the entire workforce is brought along.
Pay special attention to the impact on middle managers for whom a substantial portion of daily tasks will essentially be eliminated. That is a reality of automation—it creates efficiencies by taking over some tasks that are currently done by humans. The important message to communicate to managers is that, in so doing, AI will free them to focus on harder-to-solve problems, and to work on issues that demand human judgment or creativity—to do more managing and fewer mind-numbing repetitive tasks.
Enthusiastically offer incentives for those at the tactical level to use these tools and the new platform, beyond simply citing facts regarding the potential ROI. These incentives are dependent on the corporate culture, but could include KPIs for performance reviews, real-time bonuses, entry into a lottery for a large prize, and so forth. Incentivizing initial use of these tools will likely accelerate their acceptance. People will be won over when they start to see how the tools enhance their productivity.
Promote culture change by designating top-down champions who consistently and frequently communicate the benefits of AI implementation. The message that using these tools is on-strategy, is viewed favorably, and is good not only for the organization’s customers but also for the organization’s health and growth will accelerate adoption and make the technical and cultural changes stick.
As an application of AI, information extraction may appear mundane, but a closer look reveals that the opposite is true. With automated or augmented solutions, businesses have the potential to energize processes that have traditionally been time-consuming and error-prone, identify opportunities to add speed and efficiency, and unlock new insights that contribute to long-term growth. Boring has never seemed so exciting.