Suspicious Account Detection - How data analytics plays a key role in a secure and stable economy


With growing awareness of fraud detection and extensive news coverage of money laundering cases, financial institutions in Pakistan are making it a top priority to keep a check on fake accounts and suspicious transactions. With international regulations on bribery and anti-money laundering well established, Pakistani firms are also stepping up and putting strict controls in place to limit such transactions. Globally, AI is at the forefront of reducing money laundering and combating the financing of terrorism, and LFD is all set to implement these best practices.

Challenge

As numerous fake accounts and dubious transactions of billions of rupees were recently unearthed, the State Bank of Pakistan issued a complete guideline for all commercial banks to measure suspicious transactions, which are then referred to the FIA (Source: Guidelines on Compliance of Government of Pakistan's Notifications issued under United Nations Security Council (UNSC) Resolutions). The central bank also directed all commercial banks to monitor all bank accounts and report immediately in case of a suspicious transaction.

Current analysis mechanisms are primarily manual, time-consuming, and inefficient, leaving considerable room for negligence and personal discretion. Compliance officers and financial experts have noted that most of these systems apply overly broad rules which don't reflect the real money-laundering risks. High-risk entities can go undetected and escape scrutiny, while legitimate accounts and transactions may be flagged for investigation, wasting investigative effort.

One of the largest banks in Pakistan provided LFD with a database of transactional activities and customer details for approximately 15 million customers. This data also included billions of transactions made over 11 years (2008 – 2018), which we were to clean, sort, and analyse in order to develop an algorithm for detecting fake accounts, one that could be integrated seamlessly with the bank's own system without compromising the quality or confidentiality of the data.

Solution

LFD took on this Herculean Big Data Analytics task and, within six months, delivered a Risk Scoring Algorithm, 213 Anomalous Activity Cases, a Graph Network Analysis Application, and a clearer view of the high-net-worth clients, with the ability to mix and match conditions specified by the client. This ability reflects the client's requirement to analyse the data against 13 different conditions, e.g., the number of accounts opened with the same CNIC, the number of accounts opened with the same contact number, and accounts opened and closed within 18 months.
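The mix and match idea can be illustrated with a minimal Python sketch. It is not LFD's implementation: the record fields, the sample data, the thresholds (three or more accounts per CNIC, two or more per phone number), and the helper names `CONDITIONS` and `select` are all assumptions made for illustration. Only three of the 13 conditions are shown, since the source names only these.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records; field names and values are illustrative only.
accounts = [
    {"id": "A1", "cnic": "11111", "phone": "0300-1", "opened": date(2015, 1, 10), "closed": None},
    {"id": "A2", "cnic": "11111", "phone": "0300-2", "opened": date(2016, 3, 5), "closed": date(2016, 9, 1)},
    {"id": "A3", "cnic": "11111", "phone": "0300-2", "opened": date(2017, 6, 1), "closed": None},
    {"id": "A4", "cnic": "22222", "phone": "0300-3", "opened": date(2010, 2, 1), "closed": date(2014, 2, 1)},
]

# How many accounts share each CNIC and each contact number.
cnic_counts = Counter(a["cnic"] for a in accounts)
phone_counts = Counter(a["phone"] for a in accounts)


def months_open(account):
    """Whole months between opening and closing, or None if still open."""
    if account["closed"] is None:
        return None
    opened, closed = account["opened"], account["closed"]
    months = (closed.year - opened.year) * 12 + (closed.month - opened.month)
    if closed.day < opened.day:
        months -= 1  # the last month was not completed
    return months


# Each condition is a named predicate; the thresholds are assumed values.
CONDITIONS = {
    "shared_cnic": lambda a: cnic_counts[a["cnic"]] >= 3,
    "shared_phone": lambda a: phone_counts[a["phone"]] >= 2,
    "short_lived": lambda a: months_open(a) is not None and months_open(a) <= 18,
}


def select(records, names, mode="all"):
    """Return ids of accounts matching all (or any) of the named conditions."""
    combine = all if mode == "all" else any
    return [a["id"] for a in records if combine(CONDITIONS[n](a) for n in names)]


print(select(accounts, ["shared_cnic", "short_lived"]))
print(select(accounts, ["shared_phone", "short_lived"], mode="any"))
```

Keeping each condition as a separate named predicate means an analyst can combine any subset of them without changing the selection logic itself.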

The data shared with us was open-text data for approximately 15 million customers, consisting of contact information, i.e., addresses, emails, and phone numbers. This dataset lacked consistency and appropriate categorization for further processing. As a first step, LFD cleaned this large dataset and categorized the addresses into provinces, cities, districts, and towns. The second, more challenging step was low-profile area marking, where LFD went into the details of each individual account, customer-wise and area-wise, and aggregated the total amount of transactions for each customer per district and town. Areas were then marked using a composite statistical measure of these metrics.
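The aggregation and area-marking step can be sketched roughly as follows. The source does not specify the composite statistical measure, so this sketch substitutes a single, simpler stand-in: a z-score of each district's average per-customer total, with an assumed cutoff of -0.5. The district names, amounts, and the names `customer_totals` and `low_profile` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical transactions: (customer_id, district, amount in rupees).
transactions = [
    ("C1", "District A", 500_000), ("C1", "District A", 300_000),
    ("C2", "District A", 900_000),
    ("C3", "District B", 20_000), ("C3", "District B", 10_000),
    ("C4", "District C", 400_000),
    ("C5", "District C", 200_000),
]

# Step 1: total transaction amount for each customer within each district.
customer_totals = defaultdict(int)
for customer, district, amount in transactions:
    customer_totals[(customer, district)] += amount

# Step 2: collect the per-customer totals under their district.
totals_by_district = defaultdict(list)
for (customer, district), total in customer_totals.items():
    totals_by_district[district].append(total)

# Step 3: average per-customer total for each district.
district_avg = {d: mean(v) for d, v in totals_by_district.items()}

# Step 4: z-score each district against all districts (stand-in measure).
overall_mean = mean(district_avg.values())
overall_std = pstdev(district_avg.values())
z_scores = {
    d: (avg - overall_mean) / overall_std if overall_std else 0.0
    for d, avg in district_avg.items()
}

# Districts well below the overall average are marked low profile.
LOW_PROFILE_CUTOFF = -0.5  # assumed threshold, not from the source
low_profile = sorted(d for d, z in z_scores.items() if z < LOW_PROFILE_CUTOFF)
print(low_profile)
```

The same grouping would apply at the town level by replacing the district key with a town key.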

Result

Within six months, LFD provided the client with a Risk Scoring Algorithm, 213 Anomalous Activity Cases, a Graph Network Analysis Application, and a clearer view of its high-net-worth clients, with the ability to mix and match conditions specified by the client. This ability reflects the client's requirement to analyse the data against 13 different conditions, e.g., the number of accounts opened with the same CNIC, the number of accounts opened with the same contact number, and accounts opened and closed within 18 months.

Some of the outstanding features of our AI-driven solution for suspicious account detection were as follows:

1. Security

Owing to the extremely sensitive nature of the data and the task, LFD deploys the application entirely at the client's site. Once the data is uploaded to the server, the application does not require internet access. In the case of our client, once the data was shared, no internet access was allowed and even USB ports were blocked. As mentioned in our core values, trust and confidentiality with our clients are central to our DNA, and LFD provides maximum support and assistance to ensure its clients' data security.

2. Creativity

We had the mammoth task of sifting through data for about 15 million customers, comprising 3 billion transactional records and 70 million descriptive data points, with the dataset lacking consistency and appropriate categorization. We created multi-layered and interlinked categories based on geography (from provinces down to districts and towns), transactions and values, and customer contact information. Through our network analysis feature, the client could view all these linkages graphically, especially the movement of funds, in two distinct formats: (i) multi-tier analysis of each customer and (ii) clusters among the customer base. This allows the client to take a bird's-eye as well as a microscopic view of the data, ensuring we neither oversimplify the data nor take too narrow a view.
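The two network views can be approximated with a small Python sketch. It is a simplification under stated assumptions, not the Graph Network Analysis Application itself: the transfer data is hypothetical, fund movements are treated as undirected links, "multi-tier analysis" is read as a breadth-first search out to a fixed number of hops, and "clusters" are read as connected components. The function names `tiers` and `clusters` are illustrative.

```python
from collections import defaultdict, deque

# Hypothetical fund transfers between customers: (sender, receiver).
transfers = [("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C5", "C6")]

# Build an undirected adjacency map (direction is ignored in this sketch).
graph = defaultdict(set)
for sender, receiver in transfers:
    graph[sender].add(receiver)
    graph[receiver].add(sender)


def tiers(start, max_depth):
    """Multi-tier view: customers reachable from start, grouped by hop count."""
    seen = {start}
    queue = deque([(start, 0)])
    result = defaultdict(list)
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested tier
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                result[depth + 1].append(neighbour)
                queue.append((neighbour, depth + 1))
    return dict(result)


def clusters():
    """Cluster view: groups of customers linked directly or indirectly."""
    seen = set()
    groups = []
    for node in sorted(graph):
        if node in seen:
            continue
        seen.add(node)
        component = []
        queue = deque([node])
        while queue:
            current = queue.popleft()
            component.append(current)
            for neighbour in graph[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(component))
    return groups


print(tiers("C1", 2))
print(clusters())
```

The tier view corresponds to the microscopic look at one customer, while the cluster view gives the bird's-eye picture of the whole customer base.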

3. Scale and Speed

The client can extract and plot data from huge datasets within seconds. To illustrate, our client can view data for 900,000 customers in a graphical format, including transactional activity, movement of funds, and demographic information, within just 5 seconds. Moreover, the algorithm for cleaning and classifying information can be extended to further domains and used for various categories and levels of customer data.

4. Openness and Integrability

Suspicious Account Detection is a stand-alone application which can be deployed on the client's servers. The application can also be deployed live, but due to the confidential nature of the data, it is usually restricted to the client's own servers under strict data security.