Breaking barriers: How BigQuery data insights boosts the data exploration journey

Most data analysis starts with exploration: finding the right dataset, understanding the data's structure, identifying key patterns, and deciding which insights are most valuable to extract. This step can be cumbersome and time-consuming, especially if you are working with a new dataset or are new to the team.

To address this problem, at Next '24 we announced the preview of a new data insights capability in BigQuery that surfaces relevant, executable queries for your tables, which you can run with a single click. These features are available as part of Gemini in BigQuery and use the metadata and profiling information of tables from Dataplex.

In this blog post, we explore how Alex, a data analyst working for a large enterprise, can use the new BigQuery data insights features to accelerate his analytics workflows. Like many data professionals, he often encounters the "cold-start" problem when exploring new datasets: with little or no prior knowledge of the data, it is difficult to identify patterns, much less uncover valuable insights. We also look more closely at how generated queries are grounded and at the roles of the different personas involved in this journey.

Addressing the cold-start problem with data insights

Data insights uses Google's Gemini models to generate queries that surface hidden patterns within a table, drawing on the table's metadata. By analyzing data types, statistical summaries, and other metadata attributes, it gives data analysts like Alex a set of ready-to-run starting points for exploration, which helps them overcome the cold-start problem.

“The Insights feature felt like it understood the table, it filtered out not so useful columns like Created_at time, Transaction Id, while highlighting important columns like Amount, Intent Type, Bank Name, App Version, Platform.” - Product Manager, Financial Services Industry

Grounding generated queries for data relevance and accuracy

One of the key features of BigQuery data insights is its ability to ground generated queries. This means that the queries are based on the actual data distribution and patterns within the dataset, ensuring their relevance and accuracy. The grounding process involves:

  1. Analyzing profile scan data: Data insights examines the published profile scan data of the dataset, which includes information such as data types, statistical summaries, and other metadata attributes.

  2. Generating queries based on data distribution: Using the profile scan data, data insights creates queries that are tailored to the specific data distribution and patterns within the dataset.

  3. Validating queries: The generated queries are validated to ensure their relevance and accuracy.
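
To make the idea of grounding more concrete, here is a minimal Python sketch of how profile statistics could steer query generation. It is not how data insights or Gemini works internally, since the post does not describe that. The `ColumnProfile` class, the thresholds, the table name, and the column names are all hypothetical stand-ins for published profile scan data.

```python
from dataclasses import dataclass

NUMERIC_TYPES = {"INT64", "FLOAT64", "NUMERIC", "BIGNUMERIC"}


@dataclass
class ColumnProfile:
    """Hypothetical, simplified stand-in for one column's profile scan result."""
    name: str
    data_type: str         # e.g. "STRING", "FLOAT64", "TIMESTAMP"
    distinct_ratio: float  # distinct values / total rows
    null_ratio: float      # null values / total rows


def pick_dimensions(profiles, max_distinct_ratio=0.05, max_null_ratio=0.5):
    """Keep low-cardinality string columns; skip ID-like or mostly empty ones."""
    return [
        p.name for p in profiles
        if p.data_type == "STRING"
        and p.distinct_ratio <= max_distinct_ratio
        and p.null_ratio <= max_null_ratio
    ]


def pick_measures(profiles, max_null_ratio=0.5):
    """Keep numeric columns that are populated often enough to aggregate."""
    return [
        p.name for p in profiles
        if p.data_type in NUMERIC_TYPES and p.null_ratio <= max_null_ratio
    ]


def is_grounded(columns, profiles):
    """Toy validation step: every referenced column must exist in the profile."""
    known = {p.name for p in profiles}
    return all(c in known for c in columns)


def build_query(table, dimension, measure):
    """Build one GROUP BY query for a (dimension, measure) pair."""
    return (
        f"SELECT `{dimension}`, COUNT(*) AS row_count, "
        f"AVG(`{measure}`) AS avg_{measure}\n"
        f"FROM `{table}`\n"
        f"GROUP BY `{dimension}`\n"
        f"ORDER BY row_count DESC\n"
        f"LIMIT 10"
    )


def generate_queries(table, profiles):
    """Combine the selected dimensions and measures into candidate queries."""
    return [
        build_query(table, d, m)
        for d in pick_dimensions(profiles)
        for m in pick_measures(profiles)
        if is_grounded([d, m], profiles)
    ]


# Made-up profile values for a payments-style table.
profiles = [
    ColumnProfile("transaction_id", "STRING", 1.0, 0.0),    # unique per row
    ColumnProfile("created_at", "TIMESTAMP", 0.98, 0.0),
    ColumnProfile("amount", "FLOAT64", 0.40, 0.01),
    ColumnProfile("bank_name", "STRING", 0.001, 0.02),
    ColumnProfile("platform", "STRING", 0.0001, 0.0),
]

queries = generate_queries("my-project.payments.transactions", profiles)
for q in queries:
    print(q, end="\n\n")
```

In this sketch, `transaction_id` is skipped because nearly every value is distinct, while `bank_name` and `platform` are kept as grouping columns. That mirrors the behavior described in the quote above, although the real feature may use entirely different signals.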

Two key personas: admin and data consumer

Two primary personas can benefit from BigQuery data insights:

Admins - responsible for generating insights using the data insights feature. Admins are typically data stewards, data governors, or other technical users who have the necessary permissions and access to the underlying data.

Data consumers - can view and execute the generated queries without needing direct access to the underlying data. Data consumers may include business analysts, data scientists, and non-technical users who rely on the insights generated by BigQuery data insights to make informed decisions. In our story, Alex is a data consumer.

Getting started with BigQuery data insights

To use BigQuery data insights, follow these steps:

  1. Access data insights: With your data in BigQuery, navigate to BigQuery Studio in the Google Cloud console. Here, you'll find an overview of your tables and their associated metadata.

  2. Generate queries: Select a table and click on the "Generate insights" button. Data insights analyzes the metadata and generates a list of insightful queries tailored to your dataset.

  3. Explore and refine queries: Review the generated queries and refine them as needed. 

  4. Run queries: Execute the queries against your table and analyze the results to uncover valuable insights.
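
The steps above use the console, but a generated query is ordinary SQL, so you can also copy it and run it programmatically. The following is a minimal sketch, assuming the `google-cloud-bigquery` Python client library. `client.query()` and `job.result()` are the library's calls for starting a query and waiting for its rows, while the helper name `run_insight_query`, the table name, and the SQL text are hypothetical placeholders.

```python
def run_insight_query(client, sql, max_rows=10):
    """Run one query and return up to max_rows result rows as plain dicts.

    `client` is expected to behave like google.cloud.bigquery.Client:
    client.query(sql) starts a query job and job.result() waits for its rows.
    """
    job = client.query(sql)
    rows = job.result(max_results=max_rows)
    return [dict(row.items()) for row in rows]


# Hypothetical query of the kind data insights might suggest. The table
# and column names are placeholders, not taken from a real dataset.
EXAMPLE_SQL = """
SELECT platform, COUNT(*) AS transactions, AVG(amount) AS avg_amount
FROM `my-project.payments.transactions`
GROUP BY platform
ORDER BY transactions DESC
"""
```

With the library installed and credentials configured, you could call `run_insight_query(bigquery.Client(), EXAMPLE_SQL)` after `from google.cloud import bigquery`. Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real project.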

Alex's path to greater data insights

Initially, Alex struggled to get up to speed when working with a new dataset. However, after discovering BigQuery data insights, he was able to streamline his data exploration process. Here's what data insights brought to Alex's work:

  1. Efficient data exploration: By automatically generating insightful queries based on metadata, data insights enabled Alex to explore new tables more efficiently and independently.

  2. Time and resource savings: With data insights handling low-to-moderate complexity data analysis tasks, Alex was able to focus on more challenging projects and save valuable time and resources.

  3. Collaboration and democratization: Data insights made data analysis more accessible to non-technical users in Alex's organization, fostering collaboration and promoting a unified approach to data interpretation.

  4. Real-time insights: By automatically deriving insights from continuously flowing business data, data insights helped Alex and his team respond to changing business conditions in real time.

“It’s fantastic that the Insights generation feature in BigQuery not only provides new insights but also simplifies the process of running derived queries. The tool surprises me with fresh perspectives, going beyond what I initially considered. Its user-friendly nature makes it accessible to everyone, enabling efficient query execution.” - Data Analyst, renewable energy industry

Unlock insights from your data, fast

BigQuery data insights is a powerful tool to help you unlock valuable insights from your data. By leveraging the metadata of tables, it streamlines the data exploration process and enables data professionals to focus on more challenging tasks. The grounding of generated queries ensures the relevance and accuracy of the insights, while the two primary personas – admin and data consumer – facilitate collaboration and democratization of data analysis. 

Check out the documentation to learn more about data insights and reimagine the way you explore and analyze data.