3 Data Extract Strategies That Help — Not Hurt! — Your Business
Director of Content, Sigma
Data extracts: the arch nemesis of security and compliance-concerned teams at companies everywhere. And it’s really no wonder, since nearly half (47%) of data breaches stem from negligence often caused by lost or stolen spreadsheet data.
What’s more, 88% of spreadsheets contain errors, whether from copy/paste mistakes or incorrect formulas and calculations. This doesn’t include inaccuracies caused by stale data, siloed views, and other common problems.
But while the latest wave of analytics and business intelligence tools provides companies with the governed data access and familiar user interfaces needed to minimize these risks, data extracts as CSVs and PDFs are not totally avoidable. They’re often the only format certain tools will accept, or the most convenient way to share information with others.
So how can data and BI experts better enable teams whose workflows depend on spreadsheets and PDFs while still protecting the larger organization? Here are 3 ways to approach data extracts to help your company’s data processes — not hurt them.
of spreadsheets contain errors.
Maintain analysis structure
Data and BI teams put a lot of thought and strategy into how they structure and format analyses in their analytics and BI tools. Unfortunately, this level of detail is often lost in the data extract process, leaving teams with a rudimentary series of rows and columns.
This makes it nearly impossible for anyone on the receiving end of these extracts to figure out how the analyst got to a particular answer. And digging into the data to ask additional questions? Forget about it. Lack of clarity and visibility around the context and calculations behind extracts also leaves them susceptible to misinterpretation and misuse.
To avoid these all-too-common issues, be sure all extracted analyses maintain their intended structure, format, and labels. This will make it easier for anyone who sees the spreadsheet to determine how particular conclusions were reached and how the data should be used. It will extend the value of your work by enabling others to jump into the data, ask follow-up questions, and easily move to the next stage of analysis.
How Sigma Helps
Sigma carefully preserves column formatting across all data exports. It also allows users to export pivot tables they build using the Sigma Spreadsheet as Google Sheets, CSVs, or Excel documents. This is particularly valuable as it makes it faster and easier to analyze large data sets — which are difficult to structure and format using traditional spreadsheets — as extracts.
Teach teams about “TEL” and “ELT”
Some applications still require teams to manually import data via Excel or CSV format. Even more common is having to export data from these applications as CSV files. Business teams often find themselves struggling to work within these various extracts in an attempt to import the best possible data into their applications, as well as generate meaningful insights post extract.
“ETL,” or Extract, Transform, Load, is arguably one of the most important terms in the data world. It refers to the process data must go through to be retrieved and integrated across various sources and deposited in a company’s data warehouse for analysis.
Repurpose this acronym for teams working with extract-dependent applications, and teach them to Transform, Extract, Load — or “TEL.” This will get them into the habit of doing any necessary analysis — or “transformation” — in your BI tool, rather than extracting the data first and then trying to manipulate it to fit their needs.
Here they can do things that aren’t scalable or possible once the data has been extracted, like access pre-joined data sources, leverage predefined calculations, easily extract JSON data, and much more. Once this is finished, all they need to do is extract the data in the correct format, and load it into the next application. This improves the accuracy and completeness of the data being uploaded, and also gives data teams visibility and control around what’s being extracted.
Similarly, give non-technical teams a rundown of “ELT.” This increasingly popular modified version of the ETL pipeline loads extracted data into a cloud data warehouse (CDW) like Snowflake, where transformation begins and then continues using a CDW-native BI solution.
Use this concept to explain why teams should seek assistance uploading CSV exports directly to your BI platform and analyze them there. Rather than attempting to glean insights from high-level, siloed application extracts, business teams can join them with other relevant data sources for a more holistic and contextual view.
How Sigma Helps
Some 85% of people use a spreadsheet in their work. Business teams manage data in Excel because they know how to use it — and most BI tools require coding expertise to analyze data. The Sigma Spreadsheet empowers business users to visually analyze data using the power of SQL without writing any code.
Analyses are done using familiar, Excel-like formulas, so teams feel just as comfortable doing analyses in Sigma as they do in a spreadsheet! What’s more, Sigma enables teams to analyze massive data sets in real time by giving them direct, governed access to the CDW and all its power, scale, and speed.
Schedule exports for snapshotting
One of the biggest issues with data extracts is that they quickly become outdated. In fact, 21% of people admit to working with month-old business data. What’s more, it’s estimated that 60% of corporate data has lost some or even all of its business, legal, or regulatory value.
Today’s on-demand economy moves at the speed of real time. Making decisions based on stale insights can have major consequences for marketing campaigns, sales pipelines, customer relationships, and even the entire business.
It’s critical for data and BI teams to clearly convey that report extracts represent a single moment in time. Every employee should know not to use extracts for on-going analysis, and that the data these reports represent is only a fraction of what’s available.
One of the most effective ways to get these points across and avoid people using outdated reports is to schedule frequently requested extracts on a regular basis. Most BI tools can be programmed to automatically refresh the data underlying these reports and send out a fresh version via email, Slack, or other channel.
Even better? Schedule these reports to deliver to a shared location, like a particular Google Drive Folder. This allows teams to create a snapshot history of reports in a central location for future reference and easy sharing. The resulting report history can then be leveraged to track and reflect on data changes and business decisions over time.
of people admit to working with month-old business data.
How Sigma Helps
Sigma offers scheduled export functionality for visualizations, dashboards, and worksheets as PDFs, CSVs, and/or Google Sheets. Users can choose to have their exports delivered via Slack, email, or Google Drive. In addition to sending recurring reports to a shared Google Drive folder, Sigma also offers the option to rewrite over a single Google Sheet each time the schedule runs. This is particularly effective for ensuring teams are always working with the most recent and relevant extracts.
Rethink your data extract strategy
Keeping data inside your data warehouse and off employees’ PCs as much as possible is critical to maintaining company-wide security and compliance. But sometimes data extracts are unavoidable.
Data and BI experts must develop strategies and work with lines of business teams to minimize the risk these scenarios can pose for the organization. Leveraging the three tactics outlined in this post is a great place to start — and Sigma can help.