Case Study

Synthetic data for the National Statistical Office, Statistics Netherlands (CBS)

About the client

As the national statistical office, Statistics Netherlands (CBS) provides reliable statistical information and data to produce insight into social issues, thus supporting public debate, policy development, and decision-making while contributing to prosperity, well-being, and democracy.

CBS was established in 1899 in response to the need for independent and reliable information that advances the understanding of social issues. This is still the main role of CBS. Over time, CBS has grown into an innovative knowledge institution that continuously adopts new technologies and developments in order to safeguard the quality of its data and its independent position.

The situation

CBS holds a substantial amount of data for which privacy has to be fully guaranteed. From an organizational and operations perspective, there is a need for improved data-exchange methods in response to increasingly stringent privacy regulations and the obstacles they present in terms of data exchange.

CBS provides relevant, independent data on a wide range of societal issues. This requires a high degree of flexibility from CBS, something the staff works hard to achieve on a daily basis. Whether the issue is climate change, sustainability, the housing challenge, or poverty, CBS responds to the need for transparent and accessible information. The availability of data and the role of privacy are key, as CBS serves as a role model in the way it utilizes data.

The solution

Synthetic data could play a key role in this regard. It is important to note that privacy regulations, such as the GDPR, also need to be observed in these applications. They provide guidelines on the purposes for which sensitive data can and cannot be used. CBS sees added value in using synthetic data to facilitate, accelerate, and simplify data exchange.

CBS sees opportunities for synthetic data for certain use cases and continues to explore further possibilities. In concrete terms, CBS will start using synthetic data for use cases that carry the least risk. These will be internal CBS cases in which synthetic data are generated for testing and development purposes. In addition, CBS will release a synthetic dataset for educational purposes, which will be subject to a high degree of privacy. For other potential synthetic data services, CBS will first need to gain more experience while involving relevant parties in the process.

The benefits

Accelerate data exchange with the scientific community

The demand for data and the amount of data available continue to grow, but data exchange with the scientific community still falls short of what is needed.

Position itself as a data partner and data hub

CBS seeks to use and share data securely. Synthetic data are increasingly being seen as an alternative to exchanging privacy-sensitive data. CBS regularly receives inquiries about synthetic data and is happy to address them. As a knowledge institute, CBS positions itself as a data partner and data hub. Synthetic data can be used to strengthen both specific collaborations and the role that CBS plays in society.

Synthetic data as test data

CBS sees value in using synthetic data internally for testing and evaluation purposes as an alternative to using real personal data from production.

Synthetic data for educational purposes

In addition, CBS will release a synthetic dataset for educational purposes, which will be subject to a high degree of privacy. The aim is to improve the quality of education by supporting it with relevant and representative data.

Organization: Centraal Bureau voor de Statistiek (CBS)

Location: The Netherlands

Industry: Public sector

Size: 2000+ employees

Use case: Analytics, Test Data

Target data: Data related to the Dutch population

Website: https://www.cbs.nl/en-gb

Data is synthetic, but our team is real!

Contact Syntho and one of our experts will get in touch with you at the speed of light to explore the value of synthetic data!