Google Introduces Zero-ETL Approach for Analytics on Bigtable Data Using BigQuery

Recently, Google announced the general availability of Bigtable federated queries with BigQuery, allowing customers to query data residing in Bigtable through BigQuery more quickly. In addition, queries run without moving or copying data, are available in all Google Cloud regions, and come with increased federated query concurrency limits, closing the long-standing gap between operational data and analytics, according to the company.

BigQuery is Google Cloud’s serverless, multi-cloud data warehouse that simplifies analytics by bringing together data from a variety of sources, while Cloud Bigtable is Google Cloud’s fully managed NoSQL database for time-sensitive transactional and analytical workloads. The latter suits use cases such as real-time fraud detection, recommendations, personalization, and time series.

Previously, customers had to use ETL tools like Dataflow or homegrown Python tools to copy data from Bigtable into BigQuery. Now they can query the data in place with BigQuery SQL, because BigQuery federated queries can access data stored in Bigtable directly.

To query Bigtable data, users can create an external table for a Cloud Bigtable data source by providing the Cloud Bigtable URI, which can be obtained through the Cloud Bigtable console. The URI contains the following:

  • project_id is the project containing the Cloud Bigtable instance
  • instance_id is the Cloud Bigtable instance ID
  • (Optional) app_profile is the application profile ID to use
  • table_name is the name of the table to query
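
As a rough sketch of how these parts fit together, the snippet below assembles the URI and a matching CREATE EXTERNAL TABLE statement. The helper names, the project, instance, and table IDs, and the `stats` column family are hypothetical and chosen only for illustration; the URI layout and the `CLOUD_BIGTABLE` format follow the BigQuery documentation, so check the DDL against the current docs before relying on it.

```python
import json
from typing import Optional


def build_bigtable_uri(project_id: str, instance_id: str, table_name: str,
                       app_profile: Optional[str] = None) -> str:
    """Assemble a Cloud Bigtable URI from the parts listed above."""
    parts = [
        "https://googleapis.com/bigtable",
        f"projects/{project_id}",
        f"instances/{instance_id}",
    ]
    if app_profile:  # the app profile segment is optional
        parts.append(f"appProfiles/{app_profile}")
    parts.append(f"tables/{table_name}")
    return "/".join(parts)


def build_external_table_ddl(bq_table: str, uri: str, column_family: str) -> str:
    """Return DDL for a BigQuery external table backed by Bigtable.

    The single STRING/TEXT column family is an assumed schema, not one
    taken from the article.
    """
    options = {
        "columnFamilies": [
            {"familyId": column_family, "type": "STRING", "encoding": "TEXT"}
        ],
        "readRowkeyAsString": True,
    }
    return (
        f"CREATE EXTERNAL TABLE `{bq_table}`\n"
        "OPTIONS (\n"
        "  format = 'CLOUD_BIGTABLE',\n"
        f"  uris = ['{uri}'],\n"
        f"  bigtable_options = '''{json.dumps(options)}'''\n"
        ");"
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    uri = build_bigtable_uri("my-project", "my-instance", "events")
    print(build_external_table_ddl("analytics.events_ext", uri, "stats"))
```

The generated statement can be pasted into the BigQuery console or submitted through any BigQuery client.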

Source: https://cloud.google.com/blog/products/data-analytics/bigtable-bigquery-federation-brings-hot–cold-data-closer

Once the external table is created, users can query Bigtable like any other table in BigQuery. They can also take advantage of BigQuery features such as JDBC/ODBC drivers and connectors for popular business intelligence and data visualization tools such as Data Studio, Looker, and Tableau, as well as AutoML Tables for training machine learning models and the BigQuery Spark connector for loading data into model development environments.
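
To illustrate what such a query might look like, here is a minimal sketch using the google-cloud-bigquery Python client. It assumes an external table named `analytics.events_ext` (a hypothetical name) already exists over a Bigtable table and was defined so that the row key is read as a string. BigQuery exposes the Bigtable row key as a `rowkey` column; everything else here is illustrative.

```python
def build_prefix_query(bq_table: str) -> str:
    """Build a parameterized query that filters rows by row key prefix."""
    return (
        "SELECT rowkey\n"
        f"FROM `{bq_table}`\n"
        "WHERE STARTS_WITH(rowkey, @prefix)\n"
        "LIMIT 10"
    )


def run_prefix_query(bq_table: str, prefix: str) -> list:
    """Run the query against BigQuery and return the matching row keys.

    Needs the google-cloud-bigquery package and application default
    credentials; the import is local so the SQL builder works without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("prefix", "STRING", prefix)
        ]
    )
    rows = client.query(build_prefix_query(bq_table), job_config=job_config).result()
    return [row["rowkey"] for row in rows]


if __name__ == "__main__":
    # Hypothetical table name; prints the SQL without contacting BigQuery.
    print(build_prefix_query("analytics.events_ext"))
```

Filtering on the row key is a natural fit here because the row key is how Bigtable indexes its data.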

Big data enthusiast Christian Laurer explains in a Medium post the benefits of the new approach with Bigtable federated queries:

By using the new approach, you can overcome some shortcomings of the traditional ETL approach, such as:

• Fresher data (up-to-date insights for your business, rather than data that is hours or even days old)

• No paying twice to store the same data (customers typically store terabytes or more in Bigtable)

• Less ETL pipeline monitoring and maintenance

Finally, more details on Bigtable federated queries with BigQuery can be found on the documentation page. Querying data in Cloud Bigtable is available in all supported Cloud Bigtable regions.
