Case Study

Here is a case study from our work that we'd like you to see.

GenRocket

GenRocket’s data generation platform required a secure, efficient mechanism to mask sensitive data fields across billions of records to ensure compliance and protect Personally Identifiable Information (PII). While the initial implementation was functionally sound, it suffered from severe performance limitations, taking approximately 70 hours to complete an end-to-end masking cycle.

This delay created a major bottleneck for production workflows where the timely delivery of masked datasets was critical for downstream operations.

The Challenge

The performance bottleneck stemmed from multiple technical inefficiencies within the existing data masking process:

Inefficient query design and sequential operations led to high execution time.
Excessive use of update statements caused locking, transaction log bloat, and slow processing.
Large-scale updates frequently pushed SQL Server’s transaction log to capacity, leading to potential failures and rollbacks.

These challenges collectively made the masking process unreliable and unsuitable for enterprise-scale, production-grade use.

The Solution

A targeted optimization initiative was launched to address the core performance issues rather than applying surface-level tuning. The team conducted an in-depth analysis of query execution plans, system resource utilization, and transaction log behaviour to identify critical pain points.

Key strategies implemented included:

Insert-over-Update Strategy: Replaced expensive update operations with optimized insert processes to minimize locking and logging overhead.

Parallel Processing with Threading: Divided large datasets into manageable chunks and processed them concurrently, dramatically improving throughput.

Transaction Log Management: Implemented proactive measures to ensure logs never reached capacity, preventing failures and unnecessary rollbacks.

Query and Execution Plan Optimization: Rewrote queries, improved joins, and leveraged SQL Server indexing for optimal performance.

Set-Based Query Design: Eliminated loops and cursors in favour of set-based operations to enhance scalability and efficiency.
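
To make the chunking and parallel-processing idea concrete, here is a minimal Java sketch. It is not GenRocket's actual implementation: the class name, method names, chunk size, and thread count are all hypothetical, and the masking step is a placeholder that only counts rows. In a real system, each chunk would presumably run one set-based insert statement over its id range through JDBC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: split an id range into chunks and process them concurrently.
final class ChunkedMaskingSketch {

    /** Inclusive range of row ids handled by one worker task. */
    static final class IdRange {
        final long from;
        final long to;

        IdRange(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Splits [minId, maxId] into consecutive ranges of at most chunkSize ids. */
    static List<IdRange> splitIntoChunks(long minId, long maxId, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<IdRange> chunks = new ArrayList<>();
        for (long start = minId; start <= maxId; start += chunkSize) {
            chunks.add(new IdRange(start, Math.min(start + chunkSize - 1, maxId)));
        }
        return chunks;
    }

    /**
     * Placeholder for the masking work (an assumption, not the real logic).
     * A real version might execute a single set-based statement per chunk,
     * for example an INSERT ... SELECT filtered by "id BETWEEN ? AND ?".
     * Here it only reports how many ids the range covers.
     */
    static long maskChunk(IdRange range) {
        return range.to - range.from + 1;
    }

    /** Processes every chunk on a fixed-size thread pool and returns the total row count. */
    static long maskAll(long minId, long maxId, long chunkSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (IdRange range : splitIntoChunks(minId, maxId, chunkSize)) {
                Callable<Long> task = () -> maskChunk(range);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that chunk has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Masking run was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 1,000,000 ids in chunks of 50,000 across 4 worker threads.
        System.out.println(maskAll(1, 1_000_000, 50_000, 4)); // prints 1000000
    }
}
```

Keeping each chunk small and independent means every worker can commit its own short transaction, which is one plausible way to keep lock durations and transaction log growth bounded.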

Through a combination of strategic query redesign, parallelization, and transaction log management, the team cut the end-to-end masking cycle from approximately 70 hours to just 7 hours, a 90% reduction in processing time and roughly a tenfold speedup.

GenRocket now benefits from a high-performance, secure, and compliant data masking process, enabling faster delivery of protected datasets that align with enterprise speed and compliance requirements.

Technology Stack

SQL Server

Java

SFTP Server (via JDBC)