Google BigQuery Artifact Extraction Prerequisites

This topic describes the prerequisites for Google BigQuery artifact extraction.

In This Topic:

  • Introduction
  • Artifact Extraction
  • Getting Help



Introduction

LeapLogic’s Assessment profiles existing inventory, identifies complexity, performs dependency analysis, and provides recommendations for migration to other data platforms.


Artifact Extraction

LeapLogic requires certain artifacts to perform workload assessment. As a prerequisite, you need an admin service account to fetch the required data. You can copy the artifacts from your Git instance or export them from the BigQuery repository where the DDL scripts, stored procedures, functions, query execution logs, DML scripts, and other database objects are stored. LeapLogic needs all these artifacts as .sql files.
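Before running the extraction scripts, it can help to confirm that the service account can authenticate and see the project's datasets. Below is a minimal sketch, assuming the google-cloud-bigquery package (installed in the next section) and illustrative placeholder values for the key-file path and project ID:

from google.cloud import bigquery
import os

# Assumption: illustrative path to the admin service account key file
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = r"/path/to/service_account.json"

# Assumption: illustrative project ID
client = bigquery.Client(project="your-gcp-project-id")

# List datasets to confirm the credentials and project are correct
for dataset in client.list_datasets():
    print(dataset.dataset_id)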

DDL and Routine Scripts

Use the Python script (attached below), which uses the GCP BigQuery Python API, to extract the DDL scripts. The BigQuery Python API requires specific Python packages. Use the following commands to install them:

pip install google-cloud-bigquery

pip install pandas

pip install db-dtypes

Below is the Python script that extracts tables, views, routines, etc. from BigQuery. The script is also attached as a text file.

from google.cloud import bigquery
import os
import pandas as pd
import csv
import warnings
warnings.filterwarnings("ignore")


# Function to get DDL for a given table
def get_table_ddl(dataset_id, output_folder):
    try:
        output_folder = os.path.join(output_folder, dataset_id)
        os.makedirs(output_folder, exist_ok=True)
        query = f"""
        SELECT ddl
        FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.TABLES`
        WHERE table_type = 'BASE TABLE'
        """
        ddls = client.query(query).to_dataframe()
        if not ddls.empty:
            ddls.to_csv(os.path.join(output_folder, dataset_id+'_tables_ddl_export.sql'), index=False, mode='a', header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    except Exception as e:
        raise


# Function to get DDL for a given view
def get_view_ddl(dataset_id, output_folder):
    try:
        output_folder = os.path.join(output_folder, dataset_id)
        os.makedirs(output_folder, exist_ok=True)
        query = f"""
        SELECT ddl
        FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.TABLES`
        WHERE table_type = 'VIEW'
        """
        ddls = client.query(query).to_dataframe()
        if not ddls.empty:
            ddls.to_csv(os.path.join(output_folder, dataset_id+'_views_ddl_export.sql'), index=False, mode='a', header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    except Exception as e:
        raise


# Function to get DDL for a given materialized view
def get_mv_ddl(dataset_id, output_folder):
    try:
        output_folder = os.path.join(output_folder, dataset_id)
        os.makedirs(output_folder, exist_ok=True)
        query = f"""
        SELECT ddl
        FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.TABLES`
        WHERE table_type = 'MATERIALIZED VIEW'
        """
        ddls = client.query(query).to_dataframe()
        if not ddls.empty:
            ddls.to_csv(os.path.join(output_folder, dataset_id+'_materialized_views_ddl_export.sql'), index=False, mode='a', header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    except Exception as e:
        raise


# Function to get DDL for a given procedure
def get_procedure_ddl(dataset_id, output_folder):
    try:
        output_folder = os.path.join(output_folder, dataset_id)
        os.makedirs(output_folder, exist_ok=True)
        query = f"""
        SELECT ddl
        FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.ROUTINES`
        WHERE routine_type = 'PROCEDURE'
        """
        ddls = client.query(query).to_dataframe()
        if not ddls.empty:
            ddls.to_csv(os.path.join(output_folder, dataset_id+'_procedure_ddl_export.sql'), index=False, mode='a', header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    except Exception as e:
        raise


# Function to get DDL for a given function
def get_function_ddl(dataset_id, output_folder):
    try:
        output_folder = os.path.join(output_folder, dataset_id)
        os.makedirs(output_folder, exist_ok=True)
        query = f"""
        SELECT ddl
        FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.ROUTINES`
        WHERE routine_type = 'FUNCTION'
        """
        ddls = client.query(query).to_dataframe()
        if not ddls.empty:
            ddls.to_csv(os.path.join(output_folder, dataset_id+'_udf_ddl_export.sql'), index=False, mode='a', header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    except Exception as e:
        raise


# Function to get DDL for a given table function
def get_table_function_ddl(dataset_id, output_folder):
    try:
        output_folder = os.path.join(output_folder, dataset_id)
        os.makedirs(output_folder, exist_ok=True)
        query = f"""
        SELECT ddl
        FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.ROUTINES`
        WHERE routine_type = 'TABLE FUNCTION'
        """
        ddls = client.query(query).to_dataframe()
        if not ddls.empty:
            ddls.to_csv(os.path.join(output_folder, dataset_id+'_table_function_ddl_export.sql'), index=False, mode='a', header=False, quoting=csv.QUOTE_NONE, escapechar=' ')
    except Exception as e:
        raise


# Export DDLs
def export_ddl(datasets, output_folder):
    try:
        # Iterate over all datasets
        for dataset in datasets:
            dataset_id = dataset
            print(f"Processing dataset: {dataset_id}")

            # Export table DDLs
            get_table_ddl(dataset_id, output_folder)

            # Export view DDLs
            get_view_ddl(dataset_id, output_folder)

            # Export materialized view DDLs
            get_mv_ddl(dataset_id, output_folder)

            # Export procedure DDLs
            get_procedure_ddl(dataset_id, output_folder)

            # Export UDF DDLs
            get_function_ddl(dataset_id, output_folder)

            # Export table function DDLs
            get_table_function_ddl(dataset_id, output_folder)

    except Exception as e:
        raise


if __name__ == "__main__":
    try:
        # Provide Google Cloud credentials: Service Account JSON file path
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = r"/path/to/service_account.json"

        # Specify your GCP project ID
        project_id = 'your-gcp-project-id'

        # Directory to store DDL files
        output_dir = r"/path/to/bigquery_ddls"

        # Dataset names for DDL export. Keep empty list [] for all datasets
        datasets = ['dataset1', 'dataset2']

        # Initialize a BigQuery client
        client = bigquery.Client()

        # Get a list of all datasets in the project
        if not datasets:
            datasets = [dataset.dataset_id for dataset in client.list_datasets(project=project_id)]

        # Export DDLs
        export_ddl(datasets, output_dir)

        print("Export completed.")

    except Exception as e:
        print('Exception while executing the ddl export script: ' + str(e))

Replace the placeholder values in the script (the service account key path, project ID, output directory, and dataset list) with your actual values.

Python Script:

bigquery_ddl_export

Script Output

The Python script will create a separate folder for each dataset and export DDLs, including tables, views, routines, etc.
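For example, with the illustrative values used in the script above (output directory /path/to/bigquery_ddls and datasets dataset1 and dataset2), the output layout looks roughly like this; a file is written only when the corresponding object type exists in the dataset:

/path/to/bigquery_ddls/
    dataset1/
        dataset1_tables_ddl_export.sql
        dataset1_views_ddl_export.sql
        dataset1_materialized_views_ddl_export.sql
        dataset1_procedure_ddl_export.sql
        dataset1_udf_ddl_export.sql
        dataset1_table_function_ddl_export.sql
    dataset2/
        ...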

Query Execution Logs

There are two options for extracting the query execution logs. The first uses the EXPORT DATA command to export the query execution logs directly to a GCS bucket. The second requires you to run the query in the BigQuery editor and export the output as a CSV file.

Note:

We recommend using the EXPORT DATA command, especially when the file size is substantial.

Using EXPORT DATA Command

The EXPORT DATA command generates the required data files in a GCS bucket. Review the required inputs described below before running it.

To export the required query execution logs, use the following EXPORT DATA command. Remember to change the project-id, the dataset, the date filter (CAST(start_time AS date) between '2024-01-01' and '2024-05-31'), and the GCS path where the files should be generated. All these inputs appear in the script below.

-- Create temp table to persist query log data
CREATE TABLE `project-id.dataset.sample_query_log`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
) AS
SELECT concat('IDWWM', '~~',
ifnull(user_email,'NA'), '~~',
'app_u', '~~',
'client_u', '~~',
ifnull(start_time,'9999-01-01 00:00:00'), '~~',
round(ifnull(AmpIO,0)), '~~',
ifnull(total_bytes_processed,0), '~~',
ifnull(creation_time, '9999-01-01 00:00:00'), '~~',
ifnull(start_time,'9999-01-01 00:00:00'), '~~',
ifnull(start_time,'9999-01-01 00:00:00'), '~~',
0, '~~',
CASE WHEN TBl1.Query_ID IS NULL THEN jb.job_id ELSE TBl1.Query_ID END, '~~',
(max(ifnull(AmpIO,0)) OVER(
      ORDER BY NULL)), '~~',
(max(ifnull(total_bytes_processed,0)) OVER (ORDER BY NULL )) , '~~',
(SUM(ifnull(AmpIO,0)) OVER (ORDER BY NULL)), '~~',
(SUM(ifnull(total_bytes_processed,0)) OVER (ORDER BY NULL )), '~~',
IFNULL(total_slot_ms, 0) / 1000, '~~',
CASE WHEN LEFT(TBl1.dataset_id,7) = '_script' THEN 'Not Present' ELSE IFNULL(TBl1.dataset_id, 'Not Present') END, '~~',
0, '~~',
COALESCE (statement_type,''), '~~',
'', '~~',
'', '~~',
COALESCE (query,''), '~~',
COALESCE (job_type,'NA'), '~~',
COALESCE (TBl1.tables_used, ''), '~~',
ifnull(priority,''), '~~',
COALESCE (dml_statistics.inserted_row_count,0) + COALESCE (dml_statistics.deleted_row_count,0) + COALESCE (dml_statistics.updated_row_count, 0)) as query_log
FROM `region-us`.INFORMATION_SCHEMA.JOBS AS JB
LEFT JOIN (
SELECT DISTINCT JOB_ID, query_info.query_hashes.normalized_literals AS query_id, r.dataset_id, string_agg(DISTINCT r.table_id ORDER BY r.table_id) AS tables_used
FROM `region-us`.INFORMATION_SCHEMA.JOBS
LEFT JOIN UNNEST (referenced_tables) AS r
GROUP BY 1, 2, 3
) AS TBl1
ON (JB.job_id = TBl1.JOB_ID)
LEFT JOIN (SELECT job_id, round(sum(j.compute_ratio_avg) + sum(j.compute_ratio_max) + sum(j.compute_ms_avg) + sum(j.compute_ms_max)
+ sum(read_ratio_avg) + sum(read_ms_avg) + sum(read_ratio_max)
+ sum(read_ms_max) + sum(write_ratio_avg) + sum(write_ms_avg) + sum(write_ratio_max) + sum(write_ms_max)) AS AmpIO
FROM `region-us`.INFORMATION_SCHEMA.JOBS, unnest(job_stages) AS j
GROUP BY job_id) AS TB12
ON (JB.JOB_ID = TB12.JOB_ID)
WHERE error_result.message IS NULL AND state = 'DONE'
AND CAST(start_time AS date) > '2024-01-01' AND CAST(start_time AS date) < '2024-05-31'
;

-- Export data from the temp table
EXPORT DATA OPTIONS(
  uri='gs://path/to/gcs/*.gzip',
  format='CSV',
  overwrite=true,
  header=false,
  compression='GZIP',
  field_delimiter='~') AS
select * from `project-id.dataset.sample_query_log` ;
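Once the export completes, the compressed CSV parts sit in the GCS path you specified and can be pulled down locally to share with LeapLogic. Below is a minimal sketch for downloading them, assuming the google-cloud-storage package (not listed in the prerequisites above) and illustrative bucket, prefix, and path values:

from google.cloud import storage  # pip install google-cloud-storage
import os

# Assumption: same service account key used for the export (illustrative path)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = r"/path/to/service_account.json"

bucket_name = "your-bucket"   # illustrative: bucket portion of the gs:// URI above
prefix = "path/to/gcs/"       # illustrative: folder portion of the gs:// URI above
local_dir = r"/path/to/query_logs"
os.makedirs(local_dir, exist_ok=True)

client = storage.Client()
# Download every exported part file under the prefix
for blob in client.list_blobs(bucket_name, prefix=prefix):
    target = os.path.join(local_dir, os.path.basename(blob.name))
    blob.download_to_filename(target)
    print(f"Downloaded {blob.name} -> {target}")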

Using BigQuery Query Editor

To export the required query execution logs, refer to the script below. Change the project-id, the dataset, and the date filter (CAST(start_time AS date) between '2024-01-01' and '2024-05-31') as required.

We recommend exporting the query logs for a three-month timeframe. Execute the script below in the BigQuery query editor UI and download the output after execution.

SELECT concat('IDWWM', '~~',
ifnull(user_email,'NA'), '~~',
'app_u', '~~',
'client_u', '~~',
ifnull(start_time,'9999-01-01 00:00:00'), '~~',
round(ifnull(AmpIO,0)), '~~',
ifnull(total_bytes_processed,0), '~~',
ifnull(creation_time, '9999-01-01 00:00:00'), '~~',
ifnull(start_time,'9999-01-01 00:00:00'), '~~',
ifnull(start_time,'9999-01-01 00:00:00'), '~~',
0, '~~',
CASE WHEN TBl1.Query_ID IS NULL THEN jb.job_id ELSE TBl1.Query_ID END, '~~',
(max(ifnull(AmpIO,0)) OVER(
      ORDER BY NULL)), '~~',
(max(ifnull(total_bytes_processed,0)) OVER (ORDER BY NULL )) , '~~',
(SUM(ifnull(AmpIO,0)) OVER (ORDER BY NULL)), '~~',
(SUM(ifnull(total_bytes_processed,0)) OVER (ORDER BY NULL )), '~~',
IFNULL(total_slot_ms, 0) / 1000, '~~',
CASE WHEN LEFT(TBl1.dataset_id,7) = '_script' THEN 'Not Present' ELSE IFNULL(TBl1.dataset_id, 'Not Present') END, '~~',
0, '~~',
COALESCE (statement_type,''), '~~',
'', '~~',
'', '~~',
COALESCE (query,''), '~~',
COALESCE (job_type,'NA'), '~~',
COALESCE (TBl1.tables_used, ''), '~~',
ifnull(priority,''), '~~',
COALESCE (dml_statistics.inserted_row_count,0) + COALESCE (dml_statistics.deleted_row_count,0) + COALESCE (dml_statistics.updated_row_count, 0)) as query_log
FROM `region-us`.INFORMATION_SCHEMA.JOBS AS JB
LEFT JOIN (
SELECT DISTINCT JOB_ID, query_info.query_hashes.normalized_literals AS query_id, r.dataset_id, string_agg(DISTINCT r.table_id ORDER BY r.table_id) AS tables_used
FROM `region-us`.INFORMATION_SCHEMA.JOBS
LEFT JOIN UNNEST (referenced_tables) AS r
GROUP BY 1, 2, 3
) AS TBl1
ON (JB.job_id = TBl1.JOB_ID)
LEFT JOIN (SELECT job_id, round(sum(j.compute_ratio_avg) + sum(j.compute_ratio_max) + sum(j.compute_ms_avg) + sum(j.compute_ms_max)
+ sum(read_ratio_avg) + sum(read_ms_avg) + sum(read_ratio_max)
+ sum(read_ms_max) + sum(write_ratio_avg) + sum(write_ms_avg) + sum(write_ratio_max) + sum(write_ms_max)) AS AmpIO
FROM `region-us`.INFORMATION_SCHEMA.JOBS, unnest(job_stages) AS j
GROUP BY job_id) AS TB12
ON (JB.JOB_ID = TB12.JOB_ID)
WHERE error_result.message IS NULL AND state = 'DONE'
AND CAST(start_time AS date) > '2024-01-01' AND CAST(start_time AS date) < '2024-05-31' ;

Other Scripts and Artifacts

Copy any other scripts, such as DML scripts, from your Git repository and share them with the LeapLogic team to produce more extensive insights.


Getting Help

Contact LeapLogic technical support at info@leaplogic.io

