Rule-based Synthetic Data

Generate synthetic data to mimic real-world or targeted scenarios using predefined rules and constraints

Why do organizations use rule-based generated synthetic data?

Generate data from scratch

When data is limited or not available at all, representative data becomes crucial for developing new functionality. Rule-based synthetic data lets you generate data from scratch, giving testers and developers the test data they need.
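As a minimal sketch of the idea, the plain Python snippet below builds a small table from nothing but rules. The column names, value ranges, and the `generate_customers` helper are hypothetical assumptions made for illustration; this is not the platform's own syntax.

```python
import random
from datetime import date, timedelta

# Hypothetical rules for a "customers" table; names and ranges are
# illustrative assumptions, not taken from any real schema.
COUNTRIES = ["NL", "DE", "FR"]

def generate_customers(n, seed=0):
    rng = random.Random(seed)  # seeded so the output is reproducible
    start = date(2020, 1, 1)
    rows = []
    for i in range(1, n + 1):
        rows.append({
            "customer_id": i,                      # rule: sequential and unique
            "age": rng.randint(18, 90),            # rule: adults only, 18 to 90
            "country": rng.choice(COUNTRIES),      # rule: fixed set of values
            # rule: sign-up date falls within the year 2020
            "signup_date": start + timedelta(days=rng.randint(0, 365)),
        })
    return rows

customers = generate_customers(5)
for row in customers:
    print(row)
```

Every row satisfies the rules by construction, which is what makes this kind of data usable as test data before any real records exist.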

Enrich data

Rule-based synthetic data can enrich existing data by adding rows and/or columns. It can produce extra rows to create larger datasets easily and efficiently, and it can add new columns whose values may depend on existing columns.
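A small sketch of both kinds of enrichment, written in plain Python rather than in the platform's own syntax. The `orders` table and the helpers `add_total` and `extend_rows` are hypothetical names used only for this example.

```python
import random

# Hypothetical starting data.
orders = [
    {"order_id": 1, "quantity": 2, "unit_price": 9.50},
    {"order_id": 2, "quantity": 1, "unit_price": 20.00},
]

def add_total(row):
    # New column derived from two existing columns.
    return {**row, "total": round(row["quantity"] * row["unit_price"], 2)}

def extend_rows(rows, extra, seed=0):
    # Append `extra` rule-generated rows with fresh, non-clashing ids.
    rng = random.Random(seed)
    next_id = max(r["order_id"] for r in rows) + 1
    new_rows = [
        {
            "order_id": next_id + offset,
            "quantity": rng.randint(1, 5),
            "unit_price": rng.choice([9.50, 20.00, 35.00]),
        }
        for offset in range(extra)
    ]
    return rows + new_rows

enriched = [add_total(r) for r in extend_rows(orders, extra=3)]
for row in enriched:
    print(row)
```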

Flexibility and customization

The rule-based approach adapts to diverse data formats and structures, so synthetic data can be fully tailored to specific needs. You can design rules to simulate a wide range of scenarios.

Data cleansing

Rule-based synthetic data supports data cleansing: data generated according to predefined rules can correct inconsistencies, fill in missing values, and remove errors. This preserves the integrity of the dataset and improves its quality.

Privacy and confidentiality

Rule-based synthetic data generation is particularly useful in scenarios where real personal data cannot be used due to privacy concerns or legal restrictions. By creating synthetic data as an alternative, organizations can test and develop without compromising sensitive information.

Why rule-based synthetic data is more advanced

Examples of synthetic data you can generate with Calculated Column functions:

Data Cleaning and Transformation

Effortlessly clean and reformat data, such as trimming whitespace, changing text casing, or converting date formats.
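To make the three operations concrete, here is a plain Python sketch that trims whitespace, normalizes casing, and converts a date format. The record layout, the assumed `DD/MM/YYYY` input format, and the `clean_record` helper are hypothetical; this does not reproduce Calculated Column syntax.

```python
from datetime import datetime

def clean_record(record):
    # Assumes dates arrive as DD/MM/YYYY and should become ISO 8601.
    parsed = datetime.strptime(record["birth_date"].strip(), "%d/%m/%Y")
    return {
        "name": record["name"].strip().title(),    # trim, then Title Case
        "email": record["email"].strip().lower(),  # trim, then lowercase
        "birth_date": parsed.date().isoformat(),   # convert to YYYY-MM-DD
    }

raw = {
    "name": "  jane DOE ",
    "email": " Jane.Doe@Example.COM",
    "birth_date": "31/01/1990",
}
cleaned = clean_record(raw)
print(cleaned)
```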

Statistical Calculations

Perform statistical calculations like averages, variances, or standard deviations to derive insights from numerical data sets.
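As an illustration, the sketch below computes a mean, variance, and standard deviation with Python's standard `statistics` module and uses them to derive a new z-score column. The `amount` column and the sample values are made up for this example.

```python
import statistics

# Hypothetical numeric column.
rows = [{"id": i, "amount": a}
        for i, a in enumerate([2, 4, 4, 4, 5, 5, 7, 9], start=1)]

amounts = [r["amount"] for r in rows]
mean = statistics.mean(amounts)
variance = statistics.pvariance(amounts)  # population variance
stddev = statistics.pstdev(amounts)       # population standard deviation

# Derived column: how many standard deviations each amount is from the mean.
with_z = [{**r, "z_score": (r["amount"] - mean) / stddev} for r in rows]

print(mean, variance, stddev)
for row in with_z:
    print(row)
```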

Logical Operations

Apply logical tests to data to create flags, indicators, or to filter and categorize data based on specific criteria.
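A short sketch of flags and categories derived from logical tests, in plain Python. The thresholds, the `tier` and `needs_review` columns, and the review rule itself are hypothetical choices for illustration.

```python
def categorize(amount):
    # Hypothetical thresholds for three tiers.
    if amount >= 1000:
        return "high"
    if amount >= 100:
        return "medium"
    return "low"

transactions = [
    {"id": 1, "amount": 50.0, "country": "NL"},
    {"id": 2, "amount": 250.0, "country": "DE"},
    {"id": 3, "amount": 1500.0, "country": "NL"},
    {"id": 4, "amount": 4000.0, "country": "US"},
]

flagged = [
    {
        **t,
        "tier": categorize(t["amount"]),
        # Flag: large transactions from outside the (assumed) home country.
        "needs_review": t["amount"] >= 1000 and t["country"] != "NL",
    }
    for t in transactions
]

# The same test can also be used to filter rows.
to_review = [t for t in flagged if t["needs_review"]]
print(to_review)
```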

Mathematical Operations

Execute a variety of mathematical operations, enabling complex calculations like financial modeling or engineering calculations.

Text and Date Manipulation

Extract or transform portions of text and date fields, which is particularly useful in data preparation for reporting or further analysis.

Data simulation

Generate data that follows a specified distribution, minimum, maximum, data format, and more.
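One way to combine these constraints is sketched below in plain Python: values are drawn from a normal distribution, redrawn until they fall between a minimum and a maximum, and then rendered in a fixed format. The salary parameters, the `EUR` format, and the helper names are assumptions for this example, and rejection sampling is only one of several ways to bound a distribution.

```python
import random

def bounded_normal(rng, mean, stddev, minimum, maximum):
    # Rejection sampling: redraw until the value falls inside the bounds.
    while True:
        value = rng.gauss(mean, stddev)
        if minimum <= value <= maximum:
            return value

def simulate_salaries(n, seed=42):
    rng = random.Random(seed)  # seeded for reproducibility
    return [
        # Format rule: currency prefix and exactly two decimals.
        f"EUR {bounded_normal(rng, 3500, 800, 2000, 6000):.2f}"
        for _ in range(n)
    ]

salaries = simulate_salaries(1000)
print(salaries[:5])
```

Note that truncating the distribution this way shifts its mean and variance slightly compared with the unbounded normal, which matters if downstream tests rely on exact moments.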

How to generate rule-based synthetic data

Our platform supports rule-based synthetic data generation through Calculated Column functions. These functions perform a wide range of operations on data and other columns, from simple arithmetic to complex logical and statistical computations. Whether you are rounding numbers, extracting parts of dates, calculating averages, or transforming text, they give you the versatility to create exactly the data you need.

Create data based on specific business logic

Users can generate tailored data by applying business logic using tools like mockers and calculated columns.

Preserve data relationships across tables

Users can maintain consistent mapped values across tables, ensuring that data relationships are preserved and reliable.
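The sketch below shows one generic way to keep mapped values consistent across tables: a deterministic function replaces each key, so the same original value always maps to the same substitute wherever it appears. The hashing scheme, the `secret` parameter, and the `map_id` helper are assumptions for illustration and are not a description of how the platform implements this.

```python
import hashlib

def map_id(original_id, secret="demo-secret"):
    # Deterministic: the same input always yields the same mapped value.
    # Truncating the hash keeps ids short but makes collisions possible
    # on large tables; a real setup would need to handle that.
    digest = hashlib.sha256(f"{secret}:{original_id}".encode()).hexdigest()
    return f"CUST-{digest[:8].upper()}"

customers = [
    {"customer_id": "C001", "segment": "retail"},
    {"customer_id": "C002", "segment": "business"},
]
orders = [
    {"order_id": 1, "customer_id": "C001"},
    {"order_id": 2, "customer_id": "C002"},
    {"order_id": 3, "customer_id": "C001"},
]

mapped_customers = [{**c, "customer_id": map_id(c["customer_id"])} for c in customers]
mapped_orders = [{**o, "customer_id": map_id(o["customer_id"])} for o in orders]

# Every order still points at an existing customer after mapping.
known_ids = {c["customer_id"] for c in mapped_customers}
print(all(o["customer_id"] in known_ids for o in mapped_orders))
```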

Expand datasets for better testing and analysis

Users can expand datasets while maintaining statistical consistency, enhancing the value of data for testing and analytics purposes.
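As a deliberately simple illustration of expanding a dataset without distorting it, the plain Python sketch below resamples rows with replacement within each category, so category shares stay exactly the same as in the original. The `patients` table and the `upsample` helper are hypothetical, and this basic resampling is a stand-in technique: it repeats existing values rather than creating new ones, and it is not a description of the platform's upsampling method.

```python
import random
from collections import Counter, defaultdict

def upsample(rows, factor, key, seed=0):
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    result = []
    for members in groups.values():
        # Draw factor * group size rows, with replacement, from this group,
        # which keeps each group's share of the total unchanged.
        result.extend(
            dict(rng.choice(members)) for _ in range(factor * len(members))
        )
    return result

patients = [
    {"diagnosis": "A", "age": 34},
    {"diagnosis": "A", "age": 51},
    {"diagnosis": "A", "age": 47},
    {"diagnosis": "B", "age": 62},
]

upsampled = upsample(patients, factor=10, key="diagnosis")
shares = Counter(r["diagnosis"] for r in upsampled)
print(len(upsampled), dict(shares))
```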

Frequently asked questions

Rule-based synthetic data is artificial data generated according to predefined (business) rules and constraints. Generating it involves defining the specific guidelines, conditions, and relationships that the data must follow.

Build better and faster with synthetic data

Unlock data access, accelerate development, and enhance data privacy. Book a session with our experts now.