ENGINEERING

Your BI Tool May Be Giving You the Wrong Answers

Jonathan Avrach

Sr. Director, Solution Architecture and Evangelism, Sigma

2^53. What a boring way to begin a blog. But read on because that number may be sabotaging the answers you’re getting from your BI tool and you don’t even know it.

Modern BI tools like Sigma use a web browser for everything. This means all client side logic is written in JavaScript. However, JavaScript wasn’t built to handle the large numbers you might see in enterprise applications. In this post we take a look at the problem with JavaScript and large numbers, what the impact is for business intelligence tools, and why this is not an issue for Sigma.

Problem

When JavaScript was first introduced in 1995, it wasn’t aimed at enterprise applications. It was geared toward dynamic web page content for applications like email, stock quotes, and news. The built-in JavaScript math library handled 52 bit integers. This meant the largest number it could accurately represent was (2^53) – 1, about 9 quadrillion, which was more than enough for the applications of the time. CPU and memory capabilities also made this a reasonable limit.

2^53 

= 9,007,199,254,740,991
~ 9 quadrillion

The largest number JavaScript can handle natively

This raises two interesting questions:

 Why would a number ever be larger than 2^53?

What happens when the number is larger?

The answer to the first question is that it likely wasn’t in the early days. Fast forward to today and we find many enterprise applications regularly encounter large integers. One example is ID values that need to be globally unique, are auto-generated, and are most efficiently stored as integer data types.

The second question poses a problem of correctness. When an integer is greater than 253-1, JavaScript’s built in number primitive has no way of representing it. It’s limited to 52 bits of accuracy so large numbers get rounded. Think of it like a poorly sampled mp3 file — it sounds similar to the original music but it’s not quite the same. The resulting number in JavaScript can’t be trusted.

Impact

If not handled correctly, rounding large numbers in a BI tool impacts rendering, filtering, aggregation, and row-wise updates. In short, you get inaccurate results. Let’s explore each one of these to see how.

Rendering means showing the number on the screen in the browser. For example, if a query returns the value “4” but shows “2” on the screen, that’s clearly wrong. This is what’s happening but for much larger numbers. If you observed this happening in your BI tool, you’d quickly lose trust in all future results.

Filtering can also break for similar reasons. If you see a value on screen that’s wrong and you filter on that value, it will either return zero rows or the wrong rows.

Aggregation, e.g. summing up multiple values, can be wrong if client-side computation is happening and large integers aren’t handled. In Sigma, compute is pushed down to the cloud data warehouse which has many benefits. Pushdown achieves performance at scale and ensures the latest results since the data is live. It also avoids the large integer problem. Unlike JavaScript, databases are designed to crunch numbers so precision is sacrosanct. To achieve this, they use exact numeric data types. This means all of your computing is performed correctly on large integers.

Sigma pushes compute down to the CDW

Finally, if your BI tool supports editing cell level data like Sigma’s newly introduced Warehouse Data Editing (WDE), but doesn’t handle large integers correctly, cell updates will fail. Updates rely on a primary key and these are often large integer IDs.

Solution

To handle large numbers, JavaScript introduced an object known as BigInt. Let’s look at an example of a large integer to see how BigInt fixes the precision problem:

152879992176018639719

BigInt Representation

152879992176018639800

JavaScript Number Primirive

Here, we see a large ID value. In the first line ending in 719, the value is correct. This is using BigInt. In the second line, the same number is stored as a native JavaScript number. Since it’s limited to 52 bits of accuracy, it has no way of representing such a large value. Left with no other option, it rounds the number. The result is nonsense and can’t be trusted.

The solution sounds simple on the surface — change your code to use BigInt instead of primitive numbers. However, code changes can impact other areas. You also need to ensure dependent libraries use it. There may be dependencies elsewhere in the stack beyond the JavaScript layer. Sigma is built to natively use JavaScript’s BigInt library so this is not an issue.

Conclusion

Correctness is paramount. You should always scrutinize the results in your BI product. The next time you see a large integer (such as an ID), do a spot check against the source data. Rest assured, if you are using Sigma this is handled correctly!

Ready to visualize your data for actionable insights?