The Tonic Project

My second summer internship at RigUp (now Workrise) was a 10-week stint on the Engineering Enablement team. My project this time was Data Improvement using Tonic (I was the only intern assigned to the project, though I received a great deal of help from my mentor, Bert, on the Enablement team).

The problem: current development environments did not have the data needed to thoroughly test developers' changes. Tonic solved this not only because it can provide that data, but because it generates data in a way that mathematically guarantees both security and utility. The goal of my work, then, was to incorporate the tool into Workrise's infrastructure and remove any blockers that would keep developers from using the new output data.

We achieved this goal by first deploying the Tonic application on Kubernetes (GKE) with Helm. The Tonic workspace and its supporting resources were defined in Terraform, which made connecting existing Postgres and MongoDB databases straightforward once we had written a Terraform module for each database type along with a custom "Tonic" Terraform provider. As a result, connecting a database to Tonic took only a few lines of Terraform HCL invoking the appropriate module, along the lines of the sketch below.
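To give a sense of what that looked like, here is a minimal sketch of the kind of module invocation a team would write. The module source path, input names, and values are illustrative only; the real modules and the internal Tonic provider are Workrise-specific.

```hcl
# Hypothetical invocation of an internal Postgres-to-Tonic module.
# The module source, variable names, and inputs are placeholders;
# the actual definitions live in Workrise's Terraform repositories.
module "users_db_tonic" {
  source = "./modules/tonic-postgres-connection"

  workspace_name     = "users-service"
  source_host        = var.users_db_host
  source_database    = "users"
  output_database    = "users_tonic_output"
  credentials_secret = var.users_db_credentials_secret
}
```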

Once a database was connected, we could choose generators in the Tonic UI for columns containing PII, PI, or PD. Working with the database owners was essential at this step to identify exactly which columns needed to be masked. With the masking configuration complete, a generation job could be run to produce the masked data and write it to the specified Tonic output database. That left the question of how developers would actually consume the output data.

To distribute the Tonic output databases, we containerized each one into a Docker image stored in GCP Artifact Registry. With a docker-compose file like the sketch below, developers could spin up production-like data on their local machines, which is exactly what we set out to enable. By the time my 10 weeks were up, several production databases had already been connected, and self-service documentation existed so developers could connect their own databases to Tonic in a few simple steps.
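For illustration, a developer-facing docker-compose file might look something like the following; the Artifact Registry path, image name, and service name are placeholders, since the actual repositories and tags are internal to Workrise. A developer would then run `docker compose up` to get a local copy of the masked data.

```yaml
# Hypothetical docker-compose file for spinning up one Tonic output database
# locally. The Artifact Registry path and image tag are placeholders; the real
# images live in Workrise's internal registry.
services:
  users-db:
    image: us-central1-docker.pkg.dev/example-project/tonic-output/users-db:latest
    ports:
      - "5432:5432"  # expose the masked Postgres copy on the usual port
```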
