Sigma Compute Cost Efficiency and Caching Overview: Snowflake and BigQuery
Sigma deploys unique caching and query functionality to ensure you are making the most out of your Snowflake and BigQuery investments, without undertaking unnecessary spend.
Sigma provides multiple tiers of caching and evaluation to reduce data warehouse load and enhance user experience. Using these mechanisms, Sigma helps ensure that customers typically gain far more value from fast, easy access to data than the increased cost. We also typically see that for a given organization, economies of scale are realized, meaning as more Sigma users onboard, the associated cost per user decreases. The mechanisms that Sigma uses to do this are the Sigma Results ID Cache, Snowflake/BigQuery Results Cache, Sigma Browser Cache, Sigma Alpha Query, Sigma Materialization, and Sigma Query Timeouts:
Sigma Workbook Cache Duration/Sigma Results ID Cache
A key benefit of Sigma is that data is never stored or cached in our platform, and the default behavior is that we are always running live on top of your data warehouse. But what if your data is not often changing?
You can set a cache duration so that your workbooks are not always running new queries/ compute. In the example below, a cache duration of 10 minutes means Sigma will run fresh queries if it has been longer than 10 minutes since the queries for that workbook have been run to work with fresh data.
For workbook interactions within the 10 minute interval, Sigma maintains a mapping of the structure of each query run and the Snowflake/BigQuery query ID. If a Sigma-generated SQL query has been previously run within the 10 minute cache duration, Sigma can request the previously run result from Snowflake/BigQuery using the query ID. Such queries are not seen in Snowflake/BigQuery’s query history and do not use any compute.
Sigma Browser Cache
Sigma is fast even when working with datasets in the billions of records because all compute usage is happening within the data warehouse itself and a full dataset never has to be migrated into Sigma’s systems. In fact, when working on a large dataset in a table element in a Sigma workbook, data is requested in a paginated manner, and only 10,000 rows are dispatched to your browser for viewing., Aas you scroll down in the table, an additional 10,000 rows is requested for viewing as you get to the end of that page of data. Sigma maintains a cache of these recent results that have been dispatched to the web browser in the browser itself. This cannot help on the initial load since the cache is empty. However, as changes are made in the workbook, it is automatically leveraged. If Sigma determines that there is enough data in the browser cache to support new visualizations and queries, no network request is issued, and Sigma Alpha Query, which is detailed below, can be leveraged.
Sigma Alpha Query
Sigma’s Alpha Query engine leverages the Sigma Browser Cache to perform new calculations using the browser itself, instead of sending new queries for the database to compute. The Sigma Alpha Query Engine is able to intelligently determine when it can use data in the browser cache to perform new calculations vs. when new queries should be issued to the database. An example would be a user creating a new calculation such as a percentage change ([column 2] - [column 1])/[column 1].
Sigma also employs unique write-back functionality to turn complex, expensive, and slow datasets into efficient and fast ones. A complex dataset (one that may involve many joins, multiple aggregations, or complex calculations) can be materialized as a single table back to Snowflake/BigQuery, with data being refreshed on a schedule set and automated in Sigma. This will lead the same query to be much less expensive and much more performant compared to when all operations are being done on the fly.
Sigma Query Timeout
Sigma enforces query timeouts to ensure that resource-intensive queries are canceled when a certain time threshold is met. This ensures that runaway queries do not unnecessarily consume resources and credits. The default timeout limit is two minutes and is configurable for any data warehouse connection.
Snowflake/BigQuery Results Cache
Sigma will always generate the same exact SQL for a given configuration of any element in a Sigma workbook. If a workbook element has had a certain configuration within the last 24 hours, and the data has not changed in the source tables, (amongst some other considerations), Snowflake/BigQuery will determine that the results can be fetched from their respective Results Caches, and compute is not used, resulting in no cost and better performance.
Where can I learn more about Sigma features and use cases?
Our online documentation is a great way to get high-level information on product features along with as much fine detail as you want.
Sigma QuickStarts provide “step-by-step” guides to using Sigma, exploring specific features and use-cases.