Azure Synapse Query Execution Logs Extraction and Source Code Prerequisites

This topic describes the prerequisites for extracting Azure Synapse query execution logs and source code for assessment.

In This Topic:

  • Introduction
  • Assessment Process
  • Source Code Assessment

Introduction

LeapLogic's Assessment profiles the existing inventory, identifies workload complexity and interdependencies, and provides comprehensive, actionable recommendations for migration to a modern data platform.

Assessment checklist

Information                              Format                        Availability?
Database Object Count                    File Export in text format    Yes/No
Database Volume                          File Export in text format    Yes/No
High Data Volume Tables                  File Export in text format    Yes/No
Total Number of Users                    File Export in text format    Yes/No
Total Number of User Tables              File Export in text format    Yes/No
Objects                                  File Export in text format    Yes/No
Total I/O Usage by Database              File Export in text format    Yes/No
Top CPU and I/O Consuming Queries        File Export in text format    Yes/No
Top CPU and I/O Consuming Procedures     File Export in text format    Yes/No
Data for Partitioning and Bucketing      File Export in text format    Yes/No
Source Code Assessment                   File Export in text format    Yes/No

Follow the steps in the Assessment Process and Source Code Assessment sections below to collect the required information for assessment.


Assessment Process

All the environment-specific variables are highlighted. The start and end dates should be consistent across all the queries.
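For example, a minimal sketch of declaring one shared extraction window (the dates shown here are placeholders; substitute your own range and reuse it in every dated query below):

-- Placeholder dates; keep the same window in every extraction query
DECLARE @startdt VARCHAR(10) = '2023-01-01';
DECLARE @enddt   VARCHAR(10) = '2023-11-23';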

Databases

This query provides the total number of databases. Execute it in the master database and save the results in CSV format.

-- To be run under the master database
select count(*) from sys.databases

Database Volume

This query provides the database volume in MB. Execute it in the master database and save the results in CSV format.

-- To be run under the master database
DECLARE @Database TABLE
(
database_nm VARCHAR(50),
log_write DECIMAL(8,2),
row_size_mb DECIMAL(8,2),
total_size_mb DECIMAL(8,2)
)
INSERT INTO @Database
SELECT
database_name = CAST(database_name AS VARCHAR(50))
, log_write_secs = MAX(avg_log_write_percent)
, row_size_mb = MAX(storage_in_megabytes)
, total_size_mb = MAX(allocated_storage_in_megabytes)
FROM sys.resource_stats rs
INNER JOIN sys.databases db ON rs.database_name = db.name
WHERE sku = 'DW'
GROUP BY database_name
SELECT * FROM @Database

High Data Volume Tables

This query provides the result for tables with a high data volume. Execute it for all the databases that need to be migrated and save the results in CSV format.

-- To be run under all the databases of Synapse
WITH base
AS
(
SELECT
  GETDATE()                                                           AS [execution_time]
, DB_NAME()                                                           AS [database_name]
, s.name                                                              AS [schema_name]
, t.name                                                              AS [table_name]
, QUOTENAME(s.name)+'.'+QUOTENAME(t.name)                             AS [two_part_name]
, nt.[name]                                                           AS [node_table_name]
, ROW_NUMBER() OVER(PARTITION BY nt.[name] ORDER BY (SELECT NULL))    AS [node_table_name_seq]
, tp.[distribution_policy_desc]                                       AS [distribution_policy_name]
, c.[name]                                                            AS [distribution_column]
, nt.[distribution_id]                                                AS [distribution_id]
, i.[type]                                                            AS [index_type]
, i.[type_desc]                                                       AS [index_type_desc]
, nt.[pdw_node_id]                                                    AS [pdw_node_id]
, pn.[type]                                                           AS [pdw_node_type]
, pn.[name]                                                           AS [pdw_node_name]
, di.name                                                             AS [dist_name]
, di.position                                                         AS [dist_position]
, nps.[partition_number]                                              AS [partition_nmbr]
, nps.[reserved_page_count]                                           AS [reserved_space_page_count]
, nps.[reserved_page_count] - nps.[used_page_count]                   AS [unused_space_page_count]
, nps.[in_row_data_page_count]
    + nps.[row_overflow_used_page_count]
    + nps.[lob_used_page_count]                                       AS [data_space_page_count]
, nps.[reserved_page_count]
    - (nps.[reserved_page_count] - nps.[used_page_count])
    - ([in_row_data_page_count]
        + [row_overflow_used_page_count] + [lob_used_page_count])     AS [index_space_page_count]
, nps.[row_count]                                                     AS [row_count]
FROM sys.schemas s
INNER JOIN sys.tables t
    ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.indexes i
    ON t.[object_id] = i.[object_id]
    AND i.[index_id] <= 1
INNER JOIN sys.pdw_table_distribution_properties tp
    ON t.[object_id] = tp.[object_id]
INNER JOIN sys.pdw_table_mappings tm
    ON t.[object_id] = tm.[object_id]
INNER JOIN sys.pdw_nodes_tables nt
    ON tm.[physical_name] = nt.[name]
INNER JOIN sys.dm_pdw_nodes pn
    ON nt.[pdw_node_id] = pn.[pdw_node_id]
INNER JOIN sys.pdw_distributions di
    ON nt.[distribution_id] = di.[distribution_id]
INNER JOIN sys.dm_pdw_nodes_db_partition_stats nps
    ON nt.[object_id] = nps.[object_id]
    AND nt.[pdw_node_id] = nps.[pdw_node_id]
    AND nt.[distribution_id] = nps.[distribution_id]
    AND i.[index_id] = nps.[index_id]
LEFT OUTER JOIN (SELECT * FROM sys.pdw_column_distribution_properties WHERE distribution_ordinal = 1) cdp
    ON t.[object_id] = cdp.[object_id]
LEFT OUTER JOIN sys.columns c
    ON cdp.[object_id] = c.[object_id]
    AND cdp.[column_id] = c.[column_id]
WHERE pn.[type] = 'COMPUTE'
)
, size
AS
(
SELECT
   [execution_time]
,  [database_name]
,  [schema_name]
,  [table_name]
,  [two_part_name]
,  [node_table_name]
,  [node_table_name_seq]
,  [distribution_policy_name]
,  [distribution_column]
,  [distribution_id]
,  [index_type]
,  [index_type_desc]
,  [pdw_node_id]
,  [pdw_node_type]
,  [pdw_node_name]
,  [dist_name]
,  [dist_position]
,  [partition_nmbr]
,  [reserved_space_page_count]
,  [unused_space_page_count]
,  [data_space_page_count]
,  [index_space_page_count]
,  [row_count]
,  ([reserved_space_page_count] * 8.0)              AS [reserved_space_KB]
,  ([reserved_space_page_count] * 8.0)/1000         AS [reserved_space_MB]
,  ([reserved_space_page_count] * 8.0)/1000000      AS [reserved_space_GB]
,  ([reserved_space_page_count] * 8.0)/1000000000   AS [reserved_space_TB]
,  ([unused_space_page_count]   * 8.0)              AS [unused_space_KB]
,  ([unused_space_page_count]   * 8.0)/1000         AS [unused_space_MB]
,  ([unused_space_page_count]   * 8.0)/1000000      AS [unused_space_GB]
,  ([unused_space_page_count]   * 8.0)/1000000000   AS [unused_space_TB]
,  ([data_space_page_count]     * 8.0)              AS [data_space_KB]
,  ([data_space_page_count]     * 8.0)/1000         AS [data_space_MB]
,  ([data_space_page_count]     * 8.0)/1000000      AS [data_space_GB]
,  ([data_space_page_count]     * 8.0)/1000000000   AS [data_space_TB]
,  ([index_space_page_count]    * 8.0)              AS [index_space_KB]
,  ([index_space_page_count]    * 8.0)/1000         AS [index_space_MB]
,  ([index_space_page_count]    * 8.0)/1000000      AS [index_space_GB]
,  ([index_space_page_count]    * 8.0)/1000000000   AS [index_space_TB]
FROM base
)
SELECT
     database_name
,    table_name
,    schema_name
,    SUM(row_count)                                    AS table_row_count
,    SUM(([reserved_space_page_count] * 8.0)/1000000)  AS table_reserved_space_GB
,    SUM(([data_space_page_count]     * 8.0)/1000000)  AS table_data_space_GB
,    SUM(([unused_space_page_count]   * 8.0)/1000000)  AS table_unused_space_GB
FROM base
GROUP BY
     database_name
,    schema_name
,    table_name
,    distribution_policy_name
,    distribution_column
,    index_type_desc
HAVING SUM(([data_space_page_count] * 8.0)/1000000) >= 10
ORDER BY table_reserved_space_GB DESC

This SQL collects tables with a data volume of 10 GB or more. This step must be performed for all the databases.

Note:

The filter condition can be changed to collect lower volumes.
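For example, to collect tables of 1 GB or more instead, the HAVING clause would become (a sketch; pick the threshold that fits your scope):

-- Example only: lower the threshold from 10 GB to 1 GB
HAVING SUM(([data_space_page_count] * 8.0)/1000000) >= 1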

Users

This query provides the total number of users. Execute it for all the databases that need to be migrated and save the results in CSV format.

-- To be run under all the databases of Synapse
SELECT
database_name = CAST(DB_NAME(DB_ID()) AS VARCHAR(50)),
type, COUNT(*) [NUM]
FROM sys.database_principals
GROUP BY type

User Tables

This query provides the total number of user tables. Execute it for all the Synapse databases that need to be migrated and save the results in CSV format.

-- To be executed in all Synapse databases
SELECT
database_name = CAST(DB_NAME(DB_ID()) AS VARCHAR(50)),
name, type_desc, create_date, modify_date
FROM sys.tables

Objects

This query provides all the objects in a database. Execute it for all the databases that need to be migrated and save the results in CSV format.

-- To be executed in all Synapse databases
SELECT
database_name = CAST(DB_NAME(DB_ID()) AS VARCHAR(50)),
name, type_desc, create_date, modify_date
FROM sys.objects
WHERE is_ms_shipped = 0
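If a per-type breakdown is useful for the checklist's object counts, a hedged variant of the same query (same filter, grouped by object type):

-- Optional variant: object counts by type in the current database
SELECT
database_name = CAST(DB_NAME(DB_ID()) AS VARCHAR(50)),
type_desc, COUNT(*) AS object_count
FROM sys.objects
WHERE is_ms_shipped = 0
GROUP BY type_desc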

Server Configuration and Concurrent Number of Queries Allowed

This query provides the server configuration along with the number of concurrent queries allowed. Execute it in the master database and save the results in CSV format.

-- To be run under the master database
CREATE TABLE #Concurrent_queries
(
  service_objectives VARCHAR(50),
  Max_concurrent_queries DECIMAL(8,2)
);
INSERT INTO #Concurrent_queries VALUES ('DW100c',4);
INSERT INTO #Concurrent_queries VALUES ('DW200c',8);
INSERT INTO #Concurrent_queries VALUES ('DW300c',12);
INSERT INTO #Concurrent_queries VALUES ('DW400c',16);
INSERT INTO #Concurrent_queries VALUES ('DW500c',20);
INSERT INTO #Concurrent_queries VALUES ('DW1000c',32);
INSERT INTO #Concurrent_queries VALUES ('DW1500c',48);
INSERT INTO #Concurrent_queries VALUES ('DW2000c',48);
INSERT INTO #Concurrent_queries VALUES ('DW2500c',64);
INSERT INTO #Concurrent_queries VALUES ('DW3000c',64);
INSERT INTO #Concurrent_queries VALUES ('DW5000c',128);
INSERT INTO #Concurrent_queries VALUES ('DW6000c',128);
INSERT INTO #Concurrent_queries VALUES ('DW7500c',128);
INSERT INTO #Concurrent_queries VALUES ('DW10000c',128);
INSERT INTO #Concurrent_queries VALUES ('DW15000c',128);
INSERT INTO #Concurrent_queries VALUES ('DW30000c',128);
SELECT  db.name [Database]
,       ds.edition [Edition]
,       ds.service_objective [Service_Objective]
,       cq.Max_concurrent_queries [Concurrent_queries]
FROM    sys.database_service_objectives AS ds
JOIN    sys.databases AS db ON ds.database_id = db.database_id
JOIN    #Concurrent_queries AS cq ON UPPER(cq.service_objectives) = UPPER(ds.service_objective)
;

Average, Max CPU, IO and Storage Stats

This query provides average and maximum CPU, I/O, and storage statistics. Execute it in the master database and save the results in CSV format.

-- To be run under the master database
DECLARE @s datetime;
DECLARE @e datetime;
SET @s = DATEADD(d,-7,GETUTCDATE());
SET @e = GETUTCDATE();
SELECT
    database_name,
    AVG(avg_cpu_percent) AS 'Average CPU Utilization In Percent',
    MAX(avg_cpu_percent) AS 'Maximum CPU Utilization In Percent',
    AVG(avg_data_io_percent) AS 'Average Data IO In Percent',
    MAX(avg_data_io_percent) AS 'Maximum Data IO In Percent',
    AVG(avg_log_write_percent) AS 'Average Log Write I/O Throughput Utilization In Percent',
    MAX(avg_log_write_percent) AS 'Maximum Log Write I/O Throughput Utilization In Percent',
    MAX(storage_in_megabytes) AS 'Maximum storage size in megabytes',
    MAX(allocated_storage_in_megabytes) AS 'Allocated storage size in megabytes'
FROM sys.resource_stats rs
INNER JOIN sys.databases db ON rs.database_name = db.name
WHERE sku = 'DW'
GROUP BY database_name
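Note that @s and @e are declared above but the query does not filter on them; a minimal sketch of applying them, assuming you want to restrict the statistics to a bounded window:

-- Sketch: apply the declared window (sys.resource_stats keeps roughly 14 days of history)
WHERE sku = 'DW'
  AND rs.start_time BETWEEN @s AND @e
GROUP BY database_name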

Total I/O Usage by Database

This query provides the total I/O usage by database. Execute it in the master database and save the results in CSV format.

-- Total I/O usage by database
-- To be executed in the master database
SELECT name AS Database_Name
, SUM(num_of_reads) AS Number_of_Reads
, SUM(num_of_writes) AS Number_of_Writes
, (SUM(num_of_reads) + SUM(num_of_writes)) AS TotalIO
FROM sys.dm_io_virtual_file_stats(NULL,NULL) I
INNER JOIN sys.databases D ON I.database_id = D.database_id
GROUP BY name
ORDER BY TotalIO DESC

Top CPU and I/O Consuming Queries

This query provides the top CPU- and I/O-consuming queries. Execute it in a Synapse database and save the results in CSV format.

-- CPU- and I/O-consuming queries using system tables (these retain at most 10K rows)
-- To be executed on the Synapse database
set nocount on;
DECLARE @startdt VARCHAR(10);
DECLARE @enddt VARCHAR(10);

-- Provide start and end date
SET @startdt = '2017-03-05'
SET @enddt = '2023-11-23'

SELECT
concat('IDWWM','~~',
CURRENT_TIMESTAMP,'~~'
    ,s.session_id,'~~'
    ,r.request_id,'~~'
    ,r.status,'~~'
    ,isnull(cast(r.resource_allocation_percentage as nvarchar(20)),''),'~~'
    ,s.login_name,'~~'
    ,r.submit_time,'~~'
    ,r.end_compile_time,'~~'
    ,r.total_elapsed_time,'~~'
    ,isnull(ISNULL(r.command2, r.command),''),'~~'
    ,isnull(sum(ner.cpu_time)/1000,''),'~~'
    ,isnull(sum(ner.reads),''),'~~'
    ,isnull(sum(ner.writes),''),'~~'
    ,isnull(max(ner.granted_query_memory),''),'~~'
    ,isnull(round(log(sum(g.query_cost)),2),''))
FROM sys.dm_pdw_exec_requests r
JOIN sys.dm_pdw_exec_sessions s
    ON r.session_id = s.session_id
LEFT JOIN sys.dm_pdw_request_steps rs
    ON r.request_id = rs.request_id
LEFT JOIN sys.dm_pdw_sql_requests sr
    ON rs.request_id = sr.request_id
    AND rs.step_index = sr.step_index
LEFT JOIN sys.dm_pdw_nodes_exec_requests ner
    ON sr.spid = ner.session_id
    AND sr.pdw_node_id = ner.pdw_node_id
LEFT JOIN sys.dm_pdw_nodes_exec_query_memory_grants g
    ON ner.session_id = g.session_id
    AND ner.pdw_node_id = g.pdw_node_id
WHERE cast(r.submit_time as date) BETWEEN CAST(@startdt+' 00:00:00:00' as DATETIME) AND CAST(@enddt+' 00:00:00:00' AS DATETIME)
GROUP BY r.request_id
    ,r.status
    ,r.command
    ,r.command2
    ,r.resource_allocation_percentage
    ,s.session_id
    ,s.login_name
    ,r.submit_time
    ,r.end_compile_time
    ,r.total_elapsed_time
    ,r.[label]
    ,r.classifier_name
    ,r.group_name
    ,r.result_cache_hit;

-- CPU- and I/O-consuming queries using Query Store tables
set nocount on;
DECLARE @startdt VARCHAR(10)
DECLARE @enddt VARCHAR(10)

-- Provide start and end date
SET @startdt = '2013-03-05'
SET @enddt = '2023-11-23'

SELECT concat('IDWWM','~~',
CURRENT_TIMESTAMP,'~~',
q.query_id,'~~',
qt.query_text_id,'~~'
,isnull(rs.avg_physical_io_reads,''),'~~'
,isnull(rs.avg_cpu_time,''),'~~'
,p.plan_id,'~~'
,isnull(rs.runtime_stats_id,''),'~~'
,qt.query_sql_text,'~~'
,rsi.start_time,'~~'
,rsi.end_time,'~~'
,rs.avg_rowcount,'~~'
,rs.count_executions)
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON p.plan_id = rs.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi
    ON rsi.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE rsi.start_time >= CAST(@startdt+' 00:00:00:00' as DATETIME) and rsi.end_time <= CAST(@enddt+' 00:00:00:00' AS DATETIME)
ORDER BY rs.avg_physical_io_reads DESC;

Change the start and end date values in the parameters at the beginning of each script. Save the output of the above queries in CSV format if you are executing them from an editor such as SSMS. If the log size is too large, use the alternative method below to generate the output.

Change the start and end dates as required, execute the scripts using the below commands from the Windows command line, and share the output files. This method can be used to execute any SQL command through the command prompt.

sqlcmd -S server -U username -P password -d database -I -i ./SQL_Server_query_log_1.sql -y0 -o output_file_1.log

sqlcmd -S server -U username -P password -d database -I -i ./SQL_Server_query_log_2.sql -y0 -o output_file_2.log
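If comma-separated output is preferred for direct CSV hand-off, a sketch using sqlcmd's column-separator options (the server, credentials, and file names are placeholders, as above):

sqlcmd -S server -U username -P password -d database -I -i ./SQL_Server_query_log_1.sql -s"," -W -y0 -o output_file_1.csv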

Top CPU and I/O Consuming Procedures

This query provides the top CPU- and I/O-consuming procedures.

DECLARE @startdt VARCHAR(10)
DECLARE @enddt VARCHAR(10)
SET @startdt = '2019-03-05'
SET @enddt = '2023-04-10'
select CONCAT('IDWWM','~~',
    sess.login_name,'~~',
    coalesce(sess.client_id,''),'~~',
    sess.app_name,'~~',
    convert(VARCHAR,d.last_execution_time,120),'~~',
    (d.total_elapsed_time/d.execution_count)/1000000.0,'~~',
    (total_logical_reads+total_physical_reads+total_logical_writes)/d.execution_count,'~~',
    0,'~~',
    convert(VARCHAR,(d.last_execution_time + d.total_elapsed_time/86400000000),120),'~~',
    convert(VARCHAR,d.last_execution_time,120),'~~',
    0,'~~',
    row_number() over (ORDER BY total_worker_time/d.execution_count desc),'~~',
    (total_worker_time/execution_count)/1000000.0,'~~',
    (total_logical_reads+total_physical_reads+total_logical_writes),'~~',
    tb4.ampcputime,'~~',
    tb4.TotalIOCount,'~~',
    d.total_elapsed_time/1000000.0,'~~',
    OBJECT_SCHEMA_NAME(d.object_id,d.database_id),'~~',
    db.name,'~~',
    REPLACE(CAST(text as NVARCHAR(MAX)), CHAR(10), ' '),'~~',
    execution_count
    )
FROM sys.dm_pdw_nodes_exec_procedure_stats AS d
LEFT OUTER JOIN sys.dm_pdw_nodes_exec_sql_text st ON (d.sql_handle = st.sql_handle)
INNER JOIN
(
SELECT DISTINCT
a.last_execution_time,
SUM(total_worker_time)/1000000.0 AS AMPCPUTIME,
(SUM(total_logical_reads)+SUM(total_physical_reads)+SUM(total_logical_writes)) AS TOTALIOCOUNT
FROM sys.dm_pdw_nodes_exec_procedure_stats AS a
WHERE cast(a.last_execution_time as date) BETWEEN CAST(@startdt+' 00:00:00:00' as DATETIME) AND CAST(@enddt+' 00:00:00:00' AS DATETIME)
GROUP BY a.last_execution_time
) TB4 ON TB4.last_execution_time = d.last_execution_time
INNER JOIN sys.databases db ON (db.database_id = d.database_id)
LEFT JOIN sys.dm_pdw_exec_sessions AS Sess
ON (Sess.session_id) = 'SID'+cast(st.session_id as nvarchar(60));

Data for Partitioning and Bucketing

This query extracts table size and column details used for partitioning and bucketing recommendations. Execute it and save the results in CSV format.

Note:

This query provides details only for the current database, so you need to execute it for all the databases. Also, generate statistics before executing this query.
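A minimal sketch of refreshing statistics on one table beforehand (the table name is a placeholder; repeat for each in-scope table):

-- Placeholder table name; run for each table in scope
UPDATE STATISTICS dbo.my_table;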

WITH base
AS
(
SELECT
  GETDATE()                                                           AS [execution_time]
, DB_NAME()                                                           AS [database_name]
, s.name                                                              AS [schema_name]
, t.name                                                              AS [table_name]
, QUOTENAME(s.name)+'.'+QUOTENAME(t.name)                             AS [two_part_name]
, nt.[name]                                                           AS [node_table_name]
, ROW_NUMBER() OVER(PARTITION BY nt.[name] ORDER BY (SELECT NULL))    AS [node_table_name_seq]
, tp.[distribution_policy_desc]                                       AS [distribution_policy_name]
, c.[name]                                                            AS [distribution_column]
, nt.[distribution_id]                                                AS [distribution_id]
, i.[type]                                                            AS [index_type]
, i.[type_desc]                                                       AS [index_type_desc]
, nt.[pdw_node_id]                                                    AS [pdw_node_id]
, pn.[type]                                                           AS [pdw_node_type]
, pn.[name]                                                           AS [pdw_node_name]
, di.name                                                             AS [dist_name]
, di.position                                                         AS [dist_position]
, nps.[partition_number]                                              AS [partition_nmbr]
, nps.[reserved_page_count]                                           AS [reserved_space_page_count]
, nps.[reserved_page_count] - nps.[used_page_count]                   AS [unused_space_page_count]
, nps.[in_row_data_page_count]
    + nps.[row_overflow_used_page_count]
    + nps.[lob_used_page_count]                                       AS [data_space_page_count]
, nps.[reserved_page_count]
    - (nps.[reserved_page_count] - nps.[used_page_count])
    - ([in_row_data_page_count]
        + [row_overflow_used_page_count] + [lob_used_page_count])     AS [index_space_page_count]
, nps.[row_count]                                                     AS [row_count]
FROM sys.schemas s
INNER JOIN sys.tables t
    ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.indexes i
    ON t.[object_id] = i.[object_id]
    AND i.[index_id] <= 1
INNER JOIN sys.pdw_table_distribution_properties tp
    ON t.[object_id] = tp.[object_id]
INNER JOIN sys.pdw_table_mappings tm
    ON t.[object_id] = tm.[object_id]
INNER JOIN sys.pdw_nodes_tables nt
    ON tm.[physical_name] = nt.[name]
INNER JOIN sys.dm_pdw_nodes pn
    ON nt.[pdw_node_id] = pn.[pdw_node_id]
INNER JOIN sys.pdw_distributions di
    ON nt.[distribution_id] = di.[distribution_id]
INNER JOIN sys.dm_pdw_nodes_db_partition_stats nps
    ON nt.[object_id] = nps.[object_id]
    AND nt.[pdw_node_id] = nps.[pdw_node_id]
    AND nt.[distribution_id] = nps.[distribution_id]
LEFT OUTER JOIN (SELECT * FROM sys.pdw_column_distribution_properties) cdp
    ON t.[object_id] = cdp.[object_id]
LEFT OUTER JOIN sys.columns c
    ON cdp.[object_id] = c.[object_id]
    AND cdp.[column_id] = c.[column_id]
WHERE pn.[type] = 'COMPUTE'
)
, size
AS
(
SELECT
   [execution_time]
,  [database_name]
,  [schema_name]
,  [table_name]
,  [two_part_name]
,  [node_table_name]
,  [node_table_name_seq]
,  [distribution_policy_name]
,  [distribution_column]
,  [distribution_id]
,  [index_type]
,  [index_type_desc]
,  [pdw_node_id]
,  [pdw_node_type]
,  [pdw_node_name]
,  [dist_name]
,  [dist_position]
,  [partition_nmbr]
,  [reserved_space_page_count]
,  [unused_space_page_count]
,  [data_space_page_count]
,  [index_space_page_count]
,  [row_count]
,  ([reserved_space_page_count] * 8.0)              AS [reserved_space_KB]
,  ([reserved_space_page_count] * 8.0)/1000         AS [reserved_space_MB]
,  ([reserved_space_page_count] * 8.0)/1000000      AS [reserved_space_GB]
,  ([reserved_space_page_count] * 8.0)/1000000000   AS [reserved_space_TB]
,  ([unused_space_page_count]   * 8.0)              AS [unused_space_KB]
,  ([unused_space_page_count]   * 8.0)/1000         AS [unused_space_MB]
,  ([unused_space_page_count]   * 8.0)/1000000      AS [unused_space_GB]
,  ([unused_space_page_count]   * 8.0)/1000000000   AS [unused_space_TB]
,  ([data_space_page_count]     * 8.0)              AS [data_space_KB]
,  ([data_space_page_count]     * 8.0)/1000         AS [data_space_MB]
,  ([data_space_page_count]     * 8.0)/1000000      AS [data_space_GB]
,  ([data_space_page_count]     * 8.0)/1000000000   AS [data_space_TB]
,  ([index_space_page_count]    * 8.0)              AS [index_space_KB]
,  ([index_space_page_count]    * 8.0)/1000         AS [index_space_MB]
,  ([index_space_page_count]    * 8.0)/1000000      AS [index_space_GB]
,  ([index_space_page_count]    * 8.0)/1000000000   AS [index_space_TB]
FROM base
)
SELECT execution_time, database_name, schema_name, table_name, distribution_column, SUM(row_count), SUM(data_space_GB)
FROM size
GROUP BY execution_time, database_name, schema_name, table_name, distribution_column


Source Code Assessment

Provide the following active, in-scope Synapse source code artifacts, as applicable to the migration scope.

Code Artifact                                              Criticality   Remarks
Orchestration Scripts (Control-M / AutoSys / Cron etc.)    Must          To identify interdependencies across scheduler scripts/jobs, queries, and dependent workloads
Procedures / Functions                                     Must          To identify workload complexity, query patterns, query count, etc. for effort estimation and technical debt
Views                                                      Must          To identify view complexity, patterns, and effort estimates
Shell Scripts                                              Must          To identify count, dependencies, SQL queries and PL/SQL, logic (for example: email, notification), and effort estimates
DDL                                                        Must          To identify column usage and provide recommendations on column-level lineage and query optimization on the target system
DML / SQL Files                                            Must          To identify count, dependencies, SQL queries, and effort estimates

Note:

Limit: This assumes that the orchestration scripts are the trigger points for every use-case execution in the existing setup. If the customer is not comfortable sharing all the workloads, share those workloads that are referred to or executed through the orchestration scripts. However, in such scenarios the scope and effort estimates will be based only on the given workloads.


To learn more, contact our support team or write to: info@leaplogic.io
