May 15, 2025

Scaling UX Discovery With Sigma: Mining Patterns In Massive Datasets


I’m Matt Cowger, and I’m the Table Tamer in Chief (a.k.a. Writeback Engineering Manager) at Sigma. I take care of our writeback and input tables—an awesome feature that lets users edit cloud data warehouses through a familiar, spreadsheet-like interface. My mission? Make sure this feature is as powerful as it is easy to use.

This capability sits at the center of our data apps initiative and played a major role in our latest launch. It lets users go beyond static analysis and interact with their data in real time, unlocking a level of flexibility and control that most traditional BI tools just can’t offer.


And as with any feature this dynamic, real-world usage often raises good questions: the kind that surface subtle UX issues, expose hidden assumptions, and push us to dig into how people actually use the product. One of those questions recently came up around simultaneous edits, and it sent us deep into the telemetry to find out what was really going on.

A UX challenge hiding in plain sight

How often do users edit the same input table simultaneously—or nearly so, say within 60 seconds of one another?

If two users edit the same cell at the same time, we need to make sure they’re aware of the overlap. Otherwise, they might mistakenly believe data is being lost, when in fact, one edit is simply overwriting another.

At Sigma, we pride ourselves on making data-driven decisions. So when this question came up, I knew it was an opportunity to lean into our data infrastructure and answer it with precision, rather than relying on assumptions or guesswork. Because I was new to the team, it was also my first chance to really dive into product telemetry at scale. It wasn’t enough to simply look at usage patterns across tables in a general sense; we needed to zoom in on specific events (user edits) and cross-reference them to see if they overlapped.

If we determined that this situation happened frequently enough, we’d need to consider engineering changes to improve the experience. For example, we could surface a notification to users when another person is editing the same cell, helping them avoid the risk of unknowingly overwriting each other’s changes.

Mining 14 billion events for a single insight

At Sigma, we have a robust and sophisticated telemetry infrastructure. Our product event data lives in a Snowflake database, with over 14 billion rows tracking everything from user actions to performance metrics. The good news is that this massive dataset is well-structured and rich with detail—ideal for analyzing user behavior. Side Note: We take privacy seriously. All personally identifiable information (PII) is appropriately protected and masked to ensure full compliance with privacy regulations.

That said, querying a dataset of this size is no small task. You need the right tools and infrastructure to sift through billions of rows efficiently—which is exactly where Sigma, powered by Snowflake’s query engine, comes in.

The hands-on workflow behind the answer

I turned to Sigma’s interface, purpose-built for this kind of exploration. I could dive into the data and analyze events happening within 60 seconds of one another—without ever writing complex SQL by hand. Here’s how I did it:

Step 1: Filtering for relevant events

First, I filtered for the specific user events tied to input tables—cell edits, deletions, additions. From there, I zeroed in on interactions by different users occurring within a 60-second window. Sigma’s filtering tools made it easy to sort by key fields like Start Time and User ID, cutting through the noise to focus on what mattered.

Sigma’s spreadsheet-like interface helped me hide irrelevant columns, home in on the right variables, and visualize the data clearly. I was essentially working with a time series of user events, which is perfect for spotting patterns in overlapping edits.
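For readers who think in code, here is a minimal Python sketch of the same filtering idea. It is not the query Sigma generated. The event names, field names, and sample rows are all hypothetical stand-ins, since the real telemetry schema isn’t shown in this post.

```python
from datetime import datetime

# Hypothetical event names for input-table activity (assumed, not Sigma's real schema).
INPUT_TABLE_EVENTS = {"cell_edit", "row_delete", "row_add"}

# Hypothetical sample telemetry rows.
raw_events = [
    {"event": "cell_edit", "user_id": "u1", "table_id": "t1",
     "start_time": datetime(2025, 5, 1, 9, 0, 0), "browser": "firefox"},
    {"event": "page_view", "user_id": "u2", "table_id": "t1",
     "start_time": datetime(2025, 5, 1, 9, 0, 5), "browser": "chrome"},
    {"event": "row_add", "user_id": "u2", "table_id": "t1",
     "start_time": datetime(2025, 5, 1, 9, 0, 30), "browser": "chrome"},
]


def relevant_events(rows, keep_fields=("table_id", "user_id", "start_time")):
    """Keep only input-table events, drop unneeded columns, sort by start time."""
    filtered = [
        {field: row[field] for field in keep_fields}
        for row in rows
        if row["event"] in INPUT_TABLE_EVENTS
    ]
    # Sort by start time first, then user ID, to get a stable time series.
    return sorted(filtered, key=lambda r: (r["start_time"], r["user_id"]))


edits = relevant_events(raw_events)
```

The result is a narrow, time-ordered list of edit events, which is the shape the later comparison step needs.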

Step 2: Creating custom columns

To pinpoint events within 60 seconds of each other, I built custom columns in Sigma using functions familiar to any SQL user.

Lag Function: Pulled info from previous rows—essentially the last event.
Lead Function: Looked ahead to the next row in the dataset.
DateDiff: Let me calculate time differences between events.

Using Start Time as the stable sort key, I could easily compare adjacent user actions and identify overlaps. This was the heart of the analysis: finding out if multiple users were editing the same input table within our defined 60-second window.
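The adjacent-row comparison can be sketched in plain Python. This is an illustration of the lag-and-date-difference idea under stated assumptions, not the formulas from the actual workbook: the tuple layout, the sample events, and the `find_overlaps` helper are all hypothetical.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical sample events as (table_id, user_id, start_time) tuples.
EVENTS = [
    ("t1", "alice", datetime(2025, 5, 1, 9, 0, 0)),
    ("t1", "bob", datetime(2025, 5, 1, 9, 0, 45)),
    ("t1", "bob", datetime(2025, 5, 1, 9, 5, 0)),
    ("t2", "carol", datetime(2025, 5, 1, 9, 0, 10)),
    ("t2", "carol", datetime(2025, 5, 1, 9, 0, 20)),
]


def find_overlaps(events, window_seconds=60):
    """Return (previous, current) pairs of adjacent edits to the same table
    made by different users within window_seconds of each other."""
    overlaps = []
    # Sort by table, then start time, so each table's edits form a time series.
    ordered = sorted(events, key=itemgetter(0, 2))
    for _, group in groupby(ordered, key=itemgetter(0)):
        rows = list(group)
        # Pairing each row with the one before it plays the role of Lag.
        for prev, curr in zip(rows, rows[1:]):
            gap = (curr[2] - prev[2]).total_seconds()  # the DateDiff step
            if prev[1] != curr[1] and gap <= window_seconds:
                overlaps.append((prev, curr))
    return overlaps


overlaps = find_overlaps(EVENTS)
```

With the sample data, only the alice and bob edits on table `t1` qualify, since they are 45 seconds apart and come from different users. Note that this sketch only compares each edit with the one immediately before it on the same table; it would miss a qualifying pair separated by a third event.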

Step 3: Running the query

Once I had the custom columns set up and the dataset properly filtered, I used Sigma’s ability to automatically generate SQL queries. The platform created a query that ended up being over 600 lines long—built to scan more than 14 billion rows of product event data in Snowflake.

The beauty of this process was that Sigma handled the heavy lifting. I didn’t have to manually write long or complex SQL. Instead, I used Sigma’s interface to structure the logic, which let me focus on the analysis itself. I could review the data, fine-tune my formulas, and iterate on the setup until the results gave us exactly what we needed.

Once the query was generated, it was automatically executed against our Snowflake database. And thanks to Snowflake’s powerful architecture, it completed in just 2 to 3 minutes—despite the massive scale of the dataset.

This kind of performance is exactly why the Sigma + Snowflake combination is so effective: massive scale, fast results, and no need to get buried in boilerplate SQL.

Step 4: Making this easier to use and share

Once I had the insights, I made the workbook more reusable. I pulled out key filters—like the evaluation window and 60-second threshold—into page controls. This allowed others to explore the data without digging into formulas.

It only took a few right-clicks and drag-and-drops. (Forgive my UI design skills; there’s a reason I focus on backend work and leave the interface to our exceptional front-end team.)

The insight that led to real product change

After crunching the numbers, the answer was clear: users were editing the same input tables within 60 seconds more often than we expected.

This validated the need for a UX improvement. Real-time awareness was missing—and users were accidentally overwriting each other’s changes. Based on the data, we moved forward with engineering changes to surface live editing notifications and improve collaboration.

Turning gut instinct into data-driven decisions

This process—from defining the question to analyzing billions of rows—was a perfect example of how Sigma turns instinct into insight.

At Sigma, this is how we operate: we use data to drive smarter decisions. And this experience reinforced what we already believe—when you have the right tools, data becomes action.

What could have been a guessing game became a precise, evidence-based decision. Sigma’s spreadsheet-like interface, combined with the power of Snowflake, helped me answer a nuanced product question in under an hour. And more importantly, it led to a better user experience for our customers.

