Syntho’s quality assurance report

Assess generated synthetic data on accuracy, privacy, and speed

Introduction to the quality assurance report

What is a quality assurance report?

Syntho’s quality assurance report assesses generated synthetic data and demonstrates the accuracy, privacy, and speed of the synthetic data compared to the original data.

Why do we provide a quality assurance report for every generated synthetic data set?

At Syntho, we understand the importance of reliable and accurate synthetic data. That’s why we provide a comprehensive quality assurance report for every synthetic data run. The report includes metrics such as distributions, correlations, multivariate distributions, and privacy metrics. This way, you can verify for yourself that the synthetic data is of high quality and can be used with accuracy and reliability comparable to your original data.

What do we assess in our quality assurance report?

Accuracy
Privacy
Speed

Synthetic data accuracy metrics

This section shows highlights from our synthetic data quality report. Our assessments compare the synthetic data with the real data across several dimensions.

Distributions

Synthetic Data Distributions in comparison to real data

Distributions show how frequently each category or value of a variable occurs. The Syntho Engine accurately captures these distributions.
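As a rough illustration of what comparing distributions can mean in practice, the sketch below computes the total variation distance between the category frequencies of a real column and a synthetic column. This is not the metric used in the Syntho report, which is not specified here; the function names and the example columns are hypothetical.

```python
from collections import Counter


def relative_frequencies(values):
    """Return the share of each distinct value in a column."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation_distance(real, synthetic):
    """Half the summed absolute difference between two frequency tables.

    0.0 means identical distributions, 1.0 means no overlap at all.
    """
    p = relative_frequencies(real)
    q = relative_frequencies(synthetic)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical example columns (not taken from a real report)
real_column = ["A", "A", "B", "C"]
synthetic_column = ["A", "B", "B", "C"]

tvd = total_variation_distance(real_column, synthetic_column)
print(f"Total variation distance: {tvd:.2f}")
```

A value close to 0 suggests the synthetic column reproduces the real frequencies closely.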

Correlations

Synthetic Data Correlations in comparison to real data

Correlations show the relationship between variables, illustrating the degree to which variables are related. The Syntho Engine accurately captures these relationships.
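To make the idea concrete, one simple check is to compute the same correlation coefficient on the real and the synthetic data and look at the gap. The sketch below uses the Pearson coefficient on two numeric columns; the choice of Pearson, the column names, and the values are assumptions for illustration, not a description of how the report computes correlations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical example columns (not taken from a real report)
real_age = [25, 35, 45, 55]
real_income = [30, 40, 50, 60]
synthetic_age = [27, 33, 47, 53]
synthetic_income = [31, 41, 49, 59]

real_corr = pearson(real_age, real_income)
synthetic_corr = pearson(synthetic_age, synthetic_income)
correlation_gap = abs(real_corr - synthetic_corr)
print(f"real={real_corr:.3f} synthetic={synthetic_corr:.3f} gap={correlation_gap:.3f}")
```

A small gap for every pair of columns indicates that the pairwise relationships are preserved.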

Multivariates

Synthetic Data Multivariate Distributions in comparison to real data

Multivariate distributions and multivariate correlations take us beyond singular dimensions, providing a comprehensive view of how multiple variables are related. The Syntho Engine captures these relations.

Synthetic data privacy metrics

Why are synthetic data privacy metrics relevant?

Synthetic data generation is complex, and its pitfalls have to be controlled for. As with any AI algorithm, overfitting is a risk when generating synthetic data with AI: a model that overfits may reproduce records from the original data too closely. The Syntho Engine controls for this risk. On top of that, the Syntho Quality Assurance (QA) report allows organizations to demonstrate that the synthetic data did not overfit on the original data. We also assess additional privacy-related aspects, which internal auditors often use.

Test on exact matches

Test on “Exact matches” with the Identical Match Ratio (IMR)

Demonstrates that the ratio of synthetic data records that exactly match a real record from the original data is not significantly greater than the ratio that can be expected when analyzing the training data.
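A minimal sketch of an identical-match check is shown below. It assumes that the "expected" ratio is estimated from a holdout set of real records that were not used for training, which is a common convention but is not confirmed by this page. The function name and all records are hypothetical, and the sketch compares raw ratios only; a real report would add a statistical significance test.

```python
def identical_match_ratio(candidates, reference):
    """Share of candidate records that appear verbatim in the reference records."""
    reference_set = set(reference)
    matches = sum(1 for record in candidates if record in reference_set)
    return matches / len(candidates)


# Hypothetical records: (gender, age, country)
train = [("F", 34, "NL"), ("M", 51, "DE"), ("F", 29, "BE"), ("M", 51, "DE")]
holdout = [("M", 51, "DE"), ("F", 40, "FR")]  # assumed baseline set
synthetic = [("F", 33, "NL"), ("M", 51, "DE"), ("F", 41, "FR"), ("M", 28, "BE")]

imr_synthetic = identical_match_ratio(synthetic, train)
imr_baseline = identical_match_ratio(holdout, train)
print(f"IMR synthetic={imr_synthetic:.2f} baseline={imr_baseline:.2f}")
```

If the synthetic ratio does not exceed the baseline ratio, exact matches are no more frequent than would be expected from real data that the model never saw.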

Test on similar matches

Test on “Similar matches” with the Distance to Closest Record (DCR)

Demonstrates that the normalized distance from each synthetic data record to its nearest real record in the original data is not significantly smaller than the distance that can be expected when analyzing the training data.
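The following sketch illustrates one way a distance-to-closest-record check could work. It assumes numeric columns, min-max normalization based on the training data, Euclidean distance, and a holdout set as the baseline; none of these choices are confirmed by this page, and all names and values are hypothetical.

```python
import math
from statistics import median


def min_max_scale(records, lows, highs):
    """Scale each column to [0, 1] using the given per-column bounds."""
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(record, lows, highs))
        for record in records
    ]


def distance_to_closest_record(candidates, reference):
    """For each candidate, the Euclidean distance to its nearest reference record."""
    return [min(math.dist(c, r) for r in reference) for c in candidates]


# Hypothetical records: (age, monthly income)
train = [(20, 1000), (30, 2000), (40, 3000), (60, 5000)]
holdout = [(35, 2500), (45, 3500)]  # assumed baseline set
synthetic = [(25, 1500), (50, 4000)]

# Column bounds are taken from the training data only
lows = tuple(min(col) for col in zip(*train))
highs = tuple(max(col) for col in zip(*train))

train_scaled = min_max_scale(train, lows, highs)
dcr_synthetic = distance_to_closest_record(min_max_scale(synthetic, lows, highs), train_scaled)
dcr_holdout = distance_to_closest_record(min_max_scale(holdout, lows, highs), train_scaled)

median_synthetic = median(dcr_synthetic)
median_holdout = median(dcr_holdout)
print(f"median DCR synthetic={median_synthetic:.3f} holdout={median_holdout:.3f}")
```

Synthetic records that sit no closer to the training data than unseen real records do are a sign that the generator did not memorize individual records.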

Test on outliers

Test on “Outliers” with the Nearest Neighbour Distance Ratio (NNDR)

Demonstrates that, for each synthetic record, the ratio between its distance to the nearest record and its distance to the second-nearest record in the original data is not significantly smaller than the ratio that is to be expected for the training data.
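As an illustration, the sketch below computes a nearest neighbour distance ratio per record under the assumption that it is defined as the distance to the nearest original record divided by the distance to the second-nearest original record. A ratio near 0 means the record sits very close to one isolated original record, such as an outlier. The function name, the tie-handling convention, and the data are hypothetical.

```python
import math


def nearest_neighbour_distance_ratio(candidates, reference):
    """Per candidate: distance to nearest reference record / distance to second nearest."""
    ratios = []
    for candidate in candidates:
        nearest, second = sorted(math.dist(candidate, r) for r in reference)[:2]
        # Convention chosen for this sketch: if both distances are zero, report 1.0
        ratios.append(nearest / second if second > 0 else 1.0)
    return ratios


# Hypothetical, already normalized records; (5.0, 5.0) is an outlier in the original data
original = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
synthetic = [(0.5, 0.5), (4.9, 5.0)]

ratios = nearest_neighbour_distance_ratio(synthetic, original)
for record, ratio in zip(synthetic, ratios):
    print(record, f"NNDR={ratio:.3f}")
```

In this toy data the first synthetic record lies between several original records and gets a ratio near 1, while the second hugs the outlier and gets a ratio near 0, which is the pattern such a test is meant to flag.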

Request a quality assurance report

This page is only a snapshot of our quality assurance report. It shows how the Syntho Engine captures distributions, correlations, and multivariate distributions in synthetic data. More details on our quality assurance report are available on request.
