Why AI is revolutionising compliance screening

Reading like a human – in a zillionth of the time
For financial sector companies seeking a better and more efficient way to meet compliance requirements, next-generation technology is an enticing prospect. But how does it work? And why is it worth spending time and resources on? This white paper, which includes an interview with Sabrina Boer, Research Engineer at Vartion, makes a case for AI-based compliance screening with Vartion’s unique tool Pascal.

The past decade has seen a push for greater transparency and accountability in the financial sector. Know Your Customer (KYC) and Client Due Diligence (CDD) requirements have become stricter and more complex than ever. Checking clients against PEP and sanction lists, corporate registers and other such sources is pretty straightforward, but in today’s complex and hyperconnected world, it is not enough. These days, adverse media screening of clients is advised and in some cases even mandatory. Financial institutions are increasingly applying such screening, not only for the sake of compliance, but also to meet their own ethical standards and to head off risks that could damage their brand.

Searching on keywords
But where do you even begin to screen media coverage on a client? Depending on their profile, a simple internet search on a name can produce anything from 500 to several million hits. How do you sift out the articles that matter: unfavourable news on your client? In an ordinary search engine, you can add a keyword to your search, like “murder”. Besides the obvious limitation to English-language articles, this is problematic in two ways:

False positives: A search engine ignores context. The name and the keyword can both appear in an article without being connected (“The jury in the murder case included XYZ”). Or the word can have a different meaning (“XYZ and his friends were screaming blue murder about the COVID-19 measures”).
False negatives: For the same reason, the search will miss relevant instances where a different form of the keyword is used (“XYZ’s murderous attack on his wife”) or a synonym (“XYZ killed her in cold blood”) or where the keyword is missing altogether (“XYZ took her life”).
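
The two failure modes above can be reproduced in a few lines of code. The following is a minimal sketch, not a description of any real search engine: it assumes a naive matcher that reports a hit whenever the name and the exact keyword both occur as whole words. The function name keyword_hit and the adverse/non-adverse labels are illustrative assumptions; the sentences are the examples quoted above.

```python
import re

# Example sentences from the text; the boolean marks whether the sentence
# is actually adverse news about the client "XYZ" (our own labelling).
SENTENCES = {
    "The jury in the murder case included XYZ": False,
    "XYZ and his friends were screaming blue murder about the COVID-19 measures": False,
    "XYZ's murderous attack on his wife": True,
    "XYZ killed her in cold blood": True,
    "XYZ took her life": True,
}


def keyword_hit(text: str, name: str, keyword: str) -> bool:
    """Hypothetical naive matcher: name and exact keyword both appear as whole words."""
    def has_word(word: str) -> bool:
        pattern = rf"\b{re.escape(word)}\b"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None

    return has_word(name) and has_word(keyword)


# Hits that are not adverse news about the client.
false_positives = [
    s for s, adverse in SENTENCES.items()
    if keyword_hit(s, "XYZ", "murder") and not adverse
]
# Adverse news about the client that the keyword search does not find.
false_negatives = [
    s for s, adverse in SENTENCES.items()
    if not keyword_hit(s, "XYZ", "murder") and adverse
]

print(len(false_positives), "false positives")
print(len(false_negatives), "false negatives")
```

In this toy setting the matcher flags both non-adverse sentences and misses all three adverse ones, which is the pattern described above.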

The keyword method leaves you with masses of hits that may or may not be relevant. Wading through that is a very time-consuming process. And when you’re done, there may still be many relevant articles you have missed. You don’t know what you don’t know. But you do have a decision to make.

Intelligent search technology
But what if you had an intelligent search engine? One that could actually “read” articles in multiple languages like a human, taking the context into account? But in a zillionth of the time? Advances in artificial intelligence (AI) technology now make this possible. Scientists are developing powerful search engines with machine learning and natural language processing (NLP) capabilities. It’s a matter of applying this technology to compliance screening.

Vartion has done this. The scientists of this data analytics company have built a state-of-the-art, AI-powered search engine in-house and teamed up with seasoned financial professionals to create a unique compliance support tool called Pascal. Pascal performs split-second searches in a proprietary data lake, scraped daily from global media sources. By finding and classifying the articles that matter, Pascal provides greater assurance to the users who ultimately decide which clients pass the screening and which don’t. Artificial intelligence complementing human intelligence. And Pascal’s speed makes compliance screening far more efficient, significantly limiting costs.

The merits of AI, straight from the scientist’s mouth
Although some users are happy just to know a tool works, many professionals may be keen to know more about the nuts and bolts. Sabrina Boer, Research Engineer at Vartion, is one of the developers who helped design Pascal, and is always working on ways to achieve even better search results. In an interview, she sheds some light on the science at the heart of Pascal.

Sabrina, what is your research about?
Media is an enormous source within Pascal, with currently over 300 million articles in our data lake. To ensure that our users only see the most relevant information regarding their clients, we created several intelligent features to push relevant articles to the forefront. One such feature is adverse media classification, which can identify 23 adverse events within media articles. These adverse events include the designated categories of offences as specified by the Financial Action Task Force (FATF), an intergovernmental organisation which focuses on combating money laundering and terrorist financing. Specifying the adverse event gives our users quick insight into the crimes their client may be involved in. This helps users to navigate the large number of media articles more efficiently before coming to a decision about their client.

How does adverse event classification work?
Creating an adverse event classifier involves several processes. First, we created extensive definitions of each of the adverse events and the differences between them. Then a team of annotators, including a domain expert, annotated a large number of media articles by hand, identifying one, or none, of the 23 adverse event categories. The hand-annotated data was then used to train a transformer-based classification model. Finally, the performance of the model was validated using two evaluation methods.

Could you describe the groundwork you did?
For the manual annotation of the datasets, the guidelines had to clearly define each offence category so annotators would know when a media article should be labelled with a particular adverse event. Moreover, keywords were selected which would assist in recognising certain events. The definitions and accompanying keywords for the 23 adverse events were collected by a Vartion employee and read, reviewed, and approved by a domain expert.

The data for annotation was compiled in two ways: through a basic text classifier and through keyword selection. The basic text classifier was trained to differentiate between articles that contained adverse-related keywords and articles that did not. This provided a dataset with a high likelihood of covering adverse events. However, as some adverse event categories were underrepresented in this dataset, we also used a keyword-based approach to collect articles which contained keywords related to a specific adverse event. At the end of the annotation process, each adverse event had at least 150 example articles (figure 2).
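
The keyword-based top-up described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Vartion's actual procedure: the helper names underrepresented and keyword_candidates, the category names and the keyword stems are all hypothetical. Only the minimum of 150 examples per adverse event comes from the text.

```python
from collections import Counter

MIN_EXAMPLES = 150  # per-category minimum mentioned in the text


def underrepresented(labels, categories, minimum=MIN_EXAMPLES):
    """Return how many more annotated examples each category still needs."""
    counts = Counter(labels)  # categories that never occur count as 0
    return {c: minimum - counts[c] for c in categories if counts[c] < minimum}


def keyword_candidates(articles, keywords):
    """Select articles containing any keyword (case-insensitive substring match).

    The selected articles are only candidates; they would still be
    labelled by hand by the annotators.
    """
    lowered = [k.lower() for k in keywords]
    return [a for a in articles if any(k in a.lower() for k in lowered)]


# Illustrative label counts after the first collection round.
labels = ["fraud"] * 200 + ["smuggling"] * 40
categories = ["fraud", "smuggling", "environmental crime"]
shortfall = underrepresented(labels, categories)
print(shortfall)

# Illustrative articles and keyword stems for one underrepresented category.
articles = [
    "Customs seized smuggled ivory at the port",
    "Local team wins the cup",
]
print(keyword_candidates(articles, ["smuggl", "contraband"]))
```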

Figure 2 – Number of articles per adverse event present in the dataset used for training the adverse classifier.

Why did you choose a transformer-based model?
There are many different machine learning models, and transformer-based models are relatively new, but for many tasks within the natural language processing field they are already the method of choice. Transformer-based models have the benefit of not needing to process sequential data in order. Consequently, much more of the training can be run in parallel, which reduces the training time. Simply stated, this means that transformer-based models can be trained on more data in the same amount of time than models that have to read a text strictly word by word.

Another advantage is that there are already many pre-trained models, such as BERT. These pre-trained models have already learned general language patterns from large text collections and only need to be fine-tuned for a specific task, such as text classification. Most transformer-based models are only able to handle short texts, such as Twitter messages, but some of them can process longer texts, making them a good choice for performing text classification on media articles.

How did you train your model?
To train the transformer-based model, we split the hand-annotated dataset into a training set and a test set (ratio 9:1). The training set was used to train the model. The test set was used to evaluate the model and make changes to it based on the results.
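
A 9:1 split of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library. The text does not say whether the split was shuffled or stratified per adverse event, so the seeded random shuffle is an assumption, and the function name train_test_split and the toy data are hypothetical.

```python
import random


def train_test_split(examples, test_fraction=0.1, seed=0):
    """Shuffle the examples reproducibly and hold out test_fraction of them."""
    shuffled = list(examples)              # copy, so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy stand-in for the hand-annotated data: (article text, category id) pairs.
examples = [(f"article {i}", i % 23) for i in range(1000)]
train, test = train_test_split(examples)
print(len(train), len(test))
```

With 1,000 toy examples this yields 900 training and 100 test examples, and every example ends up in exactly one of the two sets.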

How did the model perform?
Two independent datasets were created to further evaluate the model. One dataset consisted of representative media articles. It was created by pre-selecting media articles based on keywords which represented each of the adverse events. It contained 70 hand-annotated articles per adverse event. The other dataset contained more complex media articles. For example, articles with a positive sentiment and an adverse event label, or vice versa. This dataset contained 30 hand-annotated articles per adverse event.

The model was evaluated on both the representative and the complex dataset using two performance measures. First, we measured how well the model was able to classify media articles which contained an adverse event as an adverse article, regardless of the specific adverse event. The model scored 83% on the representative dataset and 77% on the complex one. Its main difficulty lies in articles which mention adverse-related words in a non-adverse sense, such as anti-money laundering or anti-terrorism. Second, we examined how good the model was at identifying the exact adverse event that was present in an article. Here, the scores were 72% and 65%, respectively. The model has difficulty separating adverse events which have a high degree of overlap. For example, animal smuggling can be considered both smuggling and environmental crime.
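
To make the difference between the two measures concrete, here is a small sketch. The text does not name the exact metric behind the percentages, so this example assumes a simple proportion over the articles that truly contain an adverse event. The function names and the toy labels are hypothetical, and None stands for "no adverse event".

```python
from typing import Optional, Sequence

Label = Optional[str]  # None means "no adverse event"


def detection_rate(gold: Sequence[Label], pred: Sequence[Label]) -> float:
    """Share of truly adverse articles flagged as adverse, whatever the category."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g is not None]
    return sum(p is not None for _, p in pairs) / len(pairs)


def exact_event_rate(gold: Sequence[Label], pred: Sequence[Label]) -> float:
    """Share of truly adverse articles that received exactly the right category."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g is not None]
    return sum(p == g for g, p in pairs) / len(pairs)


# Toy annotations (gold) and model output (pred) for five articles.
gold = ["smuggling", "smuggling", "fraud", "fraud", None]
pred = ["environmental crime", "smuggling", None, "fraud", None]

detection = detection_rate(gold, pred)
exact = exact_event_rate(gold, pred)
print(detection, exact)
```

In the toy data, the first article is detected as adverse but assigned an overlapping category, so it counts towards the first measure and not the second. That is why the second score can never exceed the first.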

As a small test, the adverse event classifier was used to predict adverse events in 10,000 random English media articles. For about two-thirds of these articles, the classifier predicted no adverse event. Assuming our users are only interested in adverse events related to their clients, filtering out those articles could save them up to 66% of their time on media analysis.

What are you working on now?
We are now rolling out adverse media classification across other languages. A translation model is used to translate non-English texts to English, after which the English adverse media classifier can be applied. The performance of the model on these texts is evaluated by native speakers from Vartion and third parties. Currently, adverse event classification is available for English, Dutch, German, Spanish and Russian.
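
The translate-then-classify approach described above can be sketched as a small pipeline. This is an illustration only: toy_translate and toy_classifier are hypothetical stand-ins for the real translation model and the English adverse media classifier, neither of which is specified in the text, and the function name classify_article is likewise an assumption.

```python
from typing import Callable, Optional


def classify_article(
    text: str,
    language: str,
    translate: Callable[[str, str], str],
    classify_english: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Translate non-English text to English, then apply the English classifier."""
    english = text if language == "en" else translate(text, language)
    return classify_english(english)


# Hypothetical stand-in for a translation model: a fixed lookup table.
TOY_TRANSLATIONS = {
    ("nl", "XYZ is veroordeeld voor fraude"): "XYZ was convicted of fraud",
}


def toy_translate(text: str, language: str) -> str:
    return TOY_TRANSLATIONS[(language, text)]


# Hypothetical stand-in for the English classifier: a single keyword rule.
def toy_classifier(english_text: str) -> Optional[str]:
    return "fraud" if "fraud" in english_text.lower() else None


print(classify_article("XYZ is veroordeeld voor fraude", "nl", toy_translate, toy_classifier))
print(classify_article("XYZ opened a new office", "en", toy_translate, toy_classifier))
```

Passing the translation and classification steps in as functions keeps the pipeline itself independent of the models, so a new language only requires a translation step rather than a new classifier.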

Conclusion
Credible science shows that while compliance decision making is and remains up to humans, using AI for adverse media screening is a quantum leap in terms of accuracy and efficiency. Vartion’s compliance support platform Pascal effectively harnesses these benefits for the financial industry – or other companies that are interested in knowing who they are doing business with. Moreover, as this field of science further matures, even more benefits are emerging that will quickly find their way into Vartion’s cutting-edge product.

Would you like to know more about Pascal and the potential advantages for your organisation? Contact [email protected].