Data sharing

Freely share your data in synthetic form to enable a data-driven organization without privacy concerns.

Can I share my client data freely according to GDPR?

Since the introduction of GDPR, companies are obligated to define why using personal data is required and, if so, to obtain specific permission. Typically, data sharing is not the initial purpose of personal data collection. Moreover, the "data minimisation" principle requires companies to minimize the use of personal data and, consequently, to use it only when strictly necessary. Hence, sharing original client data is not allowed, while this is typically a precondition for collaborations, a data-driven organization and boosting innovation.

Is it only GDPR?

No. First, humans cause 90% of all data breaches. Hence, limiting the number of people with access to original client data is the simplest step in reducing the likelihood of data breaches. Second, based on research from the Dutch Privacy Authorities*, 94% of the Dutch population has privacy concerns, while only 6% has none. This directly impacts how companies should treat client data. For example, the share value of Facebook dropped by $119bn in one day after the Cambridge Analytica data scandal.

The classic risk assessment

The result: sharing original sensitive data is often strictly limited. Subsequently, when data sharing with third parties is desirable, as illustrated in figure 1, one typically runs into a slow and tedious process. It may require a risk assessment or certificates of good conduct, or it is simply not allowed. Moreover, sharing sensitive data within the organisation can be cumbersome because of the "data minimisation" principle (GDPR). Consequently, getting access to the right data can be time-consuming for data analysts or data scientists who would like to start immediately on a proof of concept instead of on data collection.

Figure 1: classic data sharing

Our solution: data sharing in synthetic form

Synthetic data by Syntho reproduces the statistical characteristics of your original dataset, while warranting that no records from the original dataset are present and that specific individuals cannot be traced back. When applied on premise, the desired dataset can be synthesised and shared in synthetic form, resulting in four benefits, as illustrated in figure 2:

  1. Synthetic data approaches the statistical properties of the original data, so interactions and patterns are preserved. Consequently, synthetic data is realistic and representative.
  2. Synthetic data does not contain records from the original dataset. Hence, synthetic data rules out privacy risk.
  3. Original sensitive data does not leave the building, so the likelihood of data breaches is minimized.
  4. Time-consuming data sharing processes can be avoided.
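
The first two benefits can be illustrated with a toy sketch. This is not the Syntho Engine or its method: the generator below is a deliberately simple stand-in (a two-column normal model), and the column names, sample size and helper functions `pearson`, `synthesize` and `exact_overlap` are hypothetical. It only shows what "statistical properties are preserved" and "no original records are present" could look like as checks on a dataset.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def synthesize(original, n, seed=0):
    """Toy generator (an assumption, not Syntho's algorithm): fit means,
    standard deviations and the correlation of two columns, then sample
    new rows from the fitted two-column normal model."""
    rng = random.Random(seed)
    xs, ys = zip(*original)
    mx, sx = statistics.mean(xs), statistics.stdev(xs)
    my, sy = statistics.mean(ys), statistics.stdev(ys)
    r = pearson(xs, ys)
    slope = r * sy / sx
    resid_sd = sy * max(0.0, 1 - r * r) ** 0.5
    rows = []
    for _ in range(n):
        x = rng.gauss(mx, sx)
        y = my + slope * (x - mx) + rng.gauss(0, resid_sd)
        rows.append((x, y))
    return rows


def exact_overlap(original, synthetic):
    """Number of synthetic rows that are exact copies of original rows."""
    return len(set(original) & set(synthetic))


# Hypothetical "original" client data: (age, income) pairs.
rng = random.Random(42)
original = []
for _ in range(2000):
    age = rng.uniform(20, 70)
    income = 20000 + 800 * age + rng.gauss(0, 8000)
    original.append((age, income))

synthetic = synthesize(original, n=len(original), seed=1)

# Benefit 1: the relationship between the columns is approximately preserved.
r_original = pearson(*zip(*original))
r_synthetic = pearson(*zip(*synthetic))
print(f"correlation original={r_original:.2f} synthetic={r_synthetic:.2f}")

# Benefit 2: no record from the original dataset appears in the synthetic one.
print("exact copies of original rows:", exact_overlap(original, synthetic))
```

A check for exact copies is only the most basic test of the second benefit; it says nothing on its own about whether individuals could be traced back in other ways.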

The result: one is able to share representative data with no privacy risk.

Figure 2: the Syntho Engine on premise for data sharing purposes