In Part 1 we outlined, broadly, an approach to reducing false positives. Let's delve a bit deeper.
Some of our client's customers hold multiple product-specific accounts, and the client lets customers combine (link) those accounts, saving money through consolidation without losing the distinct functionality of the individual products. Linking is popular across the industry, as the saving is not trivial, but it can easily fail because of the system and process complexity involved. Such failures have attracted considerable regulatory (and media) attention over the past few years, with hefty infringement penalties and costs.
The Internal Audit function is relatively small, with fewer than ten FTEs, but progressive, punching significantly above its weight and respected by stakeholders. The team decided to take a data-driven approach to reviewing a certain category of products, opting to cover the full population of related accounts over ~16 months.
We used an open-source tool to analyse customer master, account master, account linkage, transactional, and CRM data. The CRM data was primarily free text, so we applied a set of natural language processing techniques to give it structure, then blended the processed data with the other (structured) data sets. To put the exercise into perspective, we were dealing with more than 750 million source records.
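The post doesn't name the tool or the specific NLP techniques used, but the structure-then-blend step can be illustrated with a minimal Python sketch. All data, field names, and the account-number pattern below are hypothetical; real CRM text would call for far richer processing than a regular expression, yet the shape of the workflow is the same: derive structured fields from free text, then join them to the structured masters.

```python
import re

# Hypothetical free-text CRM notes (illustrative only)
crm_notes = [
    {"customer_id": "C001",
     "note": "Customer asked to link account ACC-1001 with ACC-1002."},
    {"customer_id": "C002",
     "note": "Customer called about a card replacement."},
]

# Hypothetical structured account master
account_master = {
    "ACC-1001": {"product": "savings"},
    "ACC-1002": {"product": "brokerage"},
}

# Assumed account-number format for this sketch
ACCOUNT_RE = re.compile(r"\bACC-\d{4}\b")

def structure_note(note_row):
    """Pull structured fields out of a free-text CRM note."""
    return {
        "customer_id": note_row["customer_id"],
        "accounts_mentioned": ACCOUNT_RE.findall(note_row["note"]),
        "mentions_linking": "link" in note_row["note"].lower(),
    }

def blend(structured_notes, master):
    """Join the NLP-derived rows with the structured account master."""
    blended = []
    for row in structured_notes:
        for acct in row["accounts_mentioned"]:
            blended.append({
                "customer_id": row["customer_id"],
                "account": acct,
                "mentions_linking": row["mentions_linking"],
                **master.get(acct, {}),
            })
    return blended

structured = [structure_note(n) for n in crm_notes]
result = blend(structured, account_master)
```

At scale, the same join would run on a distributed engine rather than in-memory dictionaries, but the output is the point: free text becomes rows that can sit alongside the account, linkage, and transactional data.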
With the data in a readily usable format, we performed several analyses. One concerned account linkage: identifying expected/potential account links and comparing them to the actual links that had been set up. The number of raw exceptions this generated was significant, too many to investigate manually, and we expected that many of them would be false positives.
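The post doesn't spell out the linkage rule used, but the comparison it describes can be sketched in a few lines of Python. The rule below (expect a link between every pair of accounts owned by the same customer) and all the sample data are assumptions for illustration; the real analysis would use the client's actual eligibility criteria.

```python
from collections import defaultdict

# Hypothetical account master: account -> owning customer
account_owner = {
    "A1": "C001", "A2": "C001",
    "A3": "C002", "A4": "C002",
    "A5": "C003",
}

# Hypothetical linkage table: links actually set up in the system
actual_links = {frozenset({"A1", "A2"})}

def expected_links(owners):
    """Assumed rule: expect a link between every pair of
    accounts owned by the same customer."""
    by_customer = defaultdict(list)
    for acct, cust in owners.items():
        by_customer[cust].append(acct)
    pairs = set()
    for accts in by_customer.values():
        accts.sort()
        for i in range(len(accts)):
            for j in range(i + 1, len(accts)):
                pairs.add(frozenset({accts[i], accts[j]}))
    return pairs

# Raw exceptions: expected links with no corresponding actual link
exceptions = expected_links(account_owner) - actual_links
```

Here C002's accounts A3 and A4 should be linked but aren't, so they surface as an exception, while C003's single account generates none. On the real population this set difference produced far more exceptions than could be worked manually, which is the problem the rest of the series addresses.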
Next, we explore a traditional approach to dealing with this, outline why it wouldn't work in this case, and then explain how we resolved it.