Generating Insights from Semi-Structured JSON Analysis
Content Marketing Manager, Sigma
As our modern society has become more internet-connected, the JSON data interchange format has become THE format for sharing data. Websites, apps, mobile devices, wearables, and IoT devices all share semi-structured JSON. Nearly every business is generating JSON data or has access to vast public JSON repositories. JSON analysis has the potential to deliver valuable insights, including consumer behavior trends, buying patterns, and inventory demand by location.
JSON is clearly a deep pool of opportunity for companies that seek to be data-driven. But up to 73% of all enterprise data goes unused, including JSON data. Why? And what can be done to change that stat? In this post, we examine the challenges of JSON data analysis and how modern cloud data warehouses and analytics tools are making the task simpler. Companies of any size can now take advantage of all that JSON data has to offer. Let’s dive in.
Challenges of JSON data analysis
There are good reasons why companies aren’t using JSON data as they could be. JSON has three significant built-in challenges that make it difficult to work with.
First, JSON repositories are remarkably vast. Because nearly all connected devices use the JSON format, data is being generated rapidly. For example, every single in-app event generates a data point, from tracked activity by people using workout apps to GPS pings triggered by delivery truck drivers using navigation apps. The sheer volume and variety of data can be overwhelming: you may be running a query against 2.5 billion rows of data, and doing it in real time. Even after this data is formatted and modeled, it’s tricky to analyze without the right tool.
The second, related issue with JSON analysis is that JSON data can’t be stored, managed, and analyzed as quickly as data in structured formats. JSON is a schema-less, text-based representation of structured data that is based on key-value pairs and ordered lists. Files can nest these structures to an arbitrary depth, which makes them even harder to map onto a table. With readable tags and an implicit organizational structure, JSON isn’t the Wild West that fully unstructured data is, but it’s hardly as easy to digest as structured data with its rows, columns, and rules.
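To make the nesting problem concrete, here is a minimal Python sketch of one common way to turn a nested JSON document into a single flat row with dotted column names. The event payload, the flatten helper, and the column naming scheme are all invented for illustration; they are not taken from any particular product.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict keyed by dotted paths."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        # Scalars (strings, numbers, booleans, null) become column values.
        flat[prefix] = value
    return flat


# Hypothetical workout-app event, invented for this example.
raw_event = """
{
  "user": {"id": 42, "device": {"type": "watch", "os": "1.4"}},
  "event": "workout_completed",
  "laps": [{"seconds": 95}, {"seconds": 101}]
}
"""

row = flatten(json.loads(raw_event))
for column, value in row.items():
    print(column, "=", value)
```

Each nested key becomes its own column (for example, user.device.type), which is roughly the shape a spreadsheet or relational table expects. Note that the number of columns depends on the data itself, which is part of what makes deeply nested JSON awkward to model ahead of time.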
The advantage of JSON as a semi-structured format is that you can load the data into a schema-flexible platform like Hadoop or a NoSQL database like MongoDB, delaying data modeling and schema design until later. This allows you to get the data into the repository very quickly. But then the data must be parsed into an understandable schema, and even for data scientists, this process is time-consuming. Until quite recently, unless a company could afford to employ large data teams to keep up with its domain experts’ ad hoc report requests, the technical nature of JSON data created bottlenecks in the path to insights. Thanks to modern tools like Snowflake and Sigma, this challenge can now be overcome.
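As a rough illustration of why parsing schema-less data into a schema takes effort, the Python sketch below scans a handful of already-loaded records and reports which fields exist and which value types each one holds. The delivery-truck records and the infer_schema helper are hypothetical, and a real pipeline would also need to handle nested fields and far larger volumes.

```python
import json
from collections import defaultdict


def infer_schema(documents):
    """Map each top-level key to the sorted type names observed for it."""
    observed = defaultdict(set)
    for doc in documents:
        for key, value in doc.items():
            observed[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in observed.items()}


# Hypothetical GPS pings, invented for this example. Note the optional
# fields and the truck_id that changes type between records.
raw_lines = [
    '{"truck_id": "T-17", "lat": 40.71, "lon": -74.0}',
    '{"truck_id": "T-17", "lat": 40.72, "lon": -74.01, "speed_kph": 38}',
    '{"truck_id": 18, "lat": 40.9, "lon": -73.9, "driver": {"id": 7}}',
]

documents = [json.loads(line) for line in raw_lines]
schema = infer_schema(documents)
for field, types in schema.items():
    print(field, types)
```

Even in three records, one field appears with two different types and two fields are only sometimes present. Deciding how to reconcile cases like these is the modeling work that gets deferred when data is loaded first and structured later.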
Cloud data warehouses open up new possibilities for JSON analysis
Technological developments in two areas have changed what’s possible with JSON analysis. Let’s examine these new capabilities.
JSON and the modern cloud data warehouse
The most significant development in the cloud data warehouse is its ability to store semi-structured and unstructured data side by side with structured data. New technologies have allowed ELT (extract, load, transform) processes to replace the slower and more limited ETL (extract, transform, load) sequence. Rather than needing to be structured and summarized first, data now needs only a simple cleaning before it’s added to the data warehouse. Additionally, data vault modeling provides even more flexibility: it integrates data from various systems and traces the origin of all data without requiring an upfront judgment about what’s valuable and what isn’t.
And even newer capabilities provide more power. An example is Snowflake’s patented technology that allows you to load JSON data directly into a relational table without first defining a schema for it. Snowflake’s software monitors schema changes automatically, so there’s no need to rely on ETL or parsing algorithms. In short, the modern cloud data warehouse speeds up data processing dramatically, overcoming the problem of scale.
JSON and the newest cloud analytics tools
Cloud-native analytics tools like Sigma are built to take advantage of all the capabilities of the modern data warehouse. They connect to JSON repositories and allow you to work with the data directly, using the built-in tools that the cloud data warehouse provider offers. While different tools have different functionality, let’s look at what you can do using Sigma’s unique technology for JSON analysis.
How Sigma makes JSON analysis easy
Sigma is an ideal JSON query tool for several reasons. First, it allows users to easily extract JSON’s semi-structured data fields and create relevant dataset views for exploration. The extracted dataset view can then be analyzed in the Sigma Spreadsheet.
The interface is one of Sigma’s most significant innovations. Sigma can parse JSON data directly in the warehouse, and you can then query it and join it to structured data. The Sigma Spreadsheet helps non-technical users query and manipulate data without writing SQL: the software automatically turns user actions into SQL “under the hood,” so users don’t need any knowledge of code. This visual approach makes data analysis simple.
Thanks to the capabilities of Sigma’s interface, business teams can explore vetted data and build ad-hoc visualizations, dashboards, and reports in minutes without technical assistance. They can go as deep as they like, asking follow-up questions and exploring the causes of trends and related issues without the need to rely on the data team.
The path to JSON insights is now clear
Modern cloud data warehouses and analytics tools have essentially eliminated the roadblocks that companies have faced in the past when seeking insights from JSON data. Even non-technical users can move quickly to work with semi-structured and unstructured data, making the mission to become data-driven more attainable.
Want to learn more about how you can easily manage and analyze JSON using Snowflake and Sigma? Check out our eBook, Cracking the JSON Code.