Case Study

Synthetic data for advanced analytics and testing with a leading international bank

About the client

Our customer is a multinational banking and financial services corporation. Its primary focus areas are retail banking, commercial banking, investment banking, wholesale banking, private banking, asset management, and insurance services. It serves more than 50 million clients in more than 30 countries. The bank ranks in the top 100 of the World’s 1000 Largest Banks.

The situation

Navigating the data landscape within the banking sector poses challenges. The bank had difficulty accessing and using data because storage was fragmented across disparate databases and subject to compliance regulations. Furthermore, anonymizing data to protect privacy often caused machine learning models to underperform, because valuable contextual information was lost.

The bank’s commitment to stringent data privacy measures further complicated seamless data sharing and collaboration. These obstacles interfered with leveraging data-driven insights for decision-making, innovation, and efficient fraud detection, and with realizing its ambition toward open banking.

The solution

Deploying Syntho’s AI-synthetic data generation platform within the bank offers a transformative approach to address complex data challenges. By generating privacy-compliant realistic datasets, synthetic data empowers accurate machine learning model training, elevating fraud detection and risk assessment capabilities. This approach not only speeds up development cycles and enhances model performance, but also allows secure data collaboration among institutions.

The benefits

Upsampling minority groups

Synthetic data offers a powerful way to upsample minority groups within datasets, giving machine learning models more balanced and representative input. This approach is used, for example, in fraud detection and anti-money laundering, where data is often scarce.


By using synthetic data, banks can adhere to strict data privacy regulations while still achieving accurate results and innovative advancements. Because sensitive customer information remains protected, the bank is now able to realize data-driven innovation in a privacy-preserving manner. Innovative models are used, for example, to predict defaults, optimize marketing, and support KYC.

KYC: combating fraud, anti-money laundering and anti-terrorist financing

Synthetic data sharing is a strategic advantage in the fight against financial crime within the banking sector, enabling collaboration without exposing sensitive real-world data. It facilitates secure data sharing and analysis among financial institutions, regulatory bodies, and law enforcement. In addition, upsampling scarce financial crime data (e.g. limited fraud data) allowed the bank to compare AI-based upsampling with traditional techniques such as interpolation and SMOTE.
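To make the traditional baseline mentioned above concrete, here is a minimal sketch of SMOTE-style interpolation in plain Python. It is not Syntho's method and not the bank's implementation; the function name `smote_like_upsample`, the two features (transaction amount and daily transaction count), and the sample values are all hypothetical and chosen only for illustration.

```python
import random
from typing import List


def smote_like_upsample(minority: List[List[float]], n_new: int,
                        k: int = 3, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic rows by interpolating between a minority-class
    row and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the line segment between base and neighbour.
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical fraud rows: [transaction amount, transactions per day].
fraud_rows = [[120.0, 1.0], [135.0, 2.0], [900.0, 14.0], [950.0, 15.0]]
new_rows = smote_like_upsample(fraud_rows, n_new=6, k=2)
print(len(new_rows))
```

Because every synthetic row lies on a straight line between two existing fraud rows, this technique can only fill in the space already covered by the real minority samples. That limitation is one reason to compare it against generative, AI-based upsampling.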

Keeping data value and quality

Legacy anonymization degrades the value of the data and requires domain knowledge to apply. Synthetic data not only mimics the real data, but also keeps the original format, even for complex data structures such as time-series data (in transactions) and location data.

Organization: Leading International Dutch Bank

Location: The Netherlands

Industry: Finance

Size: 60,000+ employees

Use case: Analytics

Target data: Financial transaction data

Website: on request
