How Worksuite uses synthetic data in its screening process

Introduction to Worksuite

Worksuite is an exclusive network of more than 500 top-tier Data Science and AI freelancers. We bring experts and companies together on our platform and guide freelancers before and during projects. We call this Data Science & AI as a Service.

The added value of synthetic data in the screening process

The freelancers on the Worksuite platform go through a screening process that consists of a profile screen, a video call, and a data science challenge. The challenges cover areas such as NLP, Image Recognition, Time Series Forecasting, Classification, and Regression. For the last two, an applicant receives a training dataset and a test dataset, where the test dataset is not labeled. The applicant then implements a solution and returns the predicted labels for the test dataset. It is imperative that the dataset is either proprietary or cannot be found online. Otherwise, an applicant could look up the true labels of the test dataset, and the chance of fraud would be significant.
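To make the challenge format concrete, here is a minimal sketch of how such a challenge could be prepared and scored. It is an illustration only and not Worksuite's actual tooling: the function names `make_challenge` and `score_submission`, the toy dataset, and the choice of accuracy as the metric are all assumptions.

```python
import random


def make_challenge(rows, label_key, test_fraction=0.25, seed=0):
    """Split labeled rows into a labeled training set, an unlabeled
    test set, and a hidden answer key (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test_rows, train_rows = shuffled[:n_test], shuffled[n_test:]
    # The answer key stays with the organizer and is never sent out.
    answer_key = [row[label_key] for row in test_rows]
    # The applicant only sees the test features, not the labels.
    unlabeled_test = [
        {k: v for k, v in row.items() if k != label_key} for row in test_rows
    ]
    return train_rows, unlabeled_test, answer_key


def score_submission(predicted, answer_key):
    """Return the fraction of predicted labels that match the answer key."""
    if len(predicted) != len(answer_key):
        raise ValueError("submission must contain one label per test row")
    correct = sum(p == a for p, a in zip(predicted, answer_key))
    return correct / len(answer_key)


# Toy dataset (made up for this sketch): the label is the parity of the feature.
rows = [{"feature": i, "label": i % 2} for i in range(20)]
train, test, key = make_challenge(rows, "label")

# Stand-in for an applicant's model that has learned the parity rule.
predictions = [row["feature"] % 2 for row in test]
print(score_submission(predictions, key))
```

Because the answer key never leaves the organizer, a score is only meaningful if the applicant cannot obtain the true test labels elsewhere, which is why a publicly available dataset would not work in this setup.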

Therefore, Worksuite worked together with Syntho to anonymize classical (structured) Machine Learning datasets and build fraud-free classification and regression challenges. By using the Syntho Engine to anonymize datasets, we can leverage the interesting properties of Machine Learning research datasets without opening up the possibility of fraud.