Redshift Artifact Extraction Prerequisites

This topic describes the prerequisites for extracting Redshift artifacts for assessment.

In This Topic:

  • Introduction
  • Artifact Extraction
  • Getting Help



Introduction

LeapLogic’s Assessment profiles existing inventory, identifies complexity, performs dependency analysis, and provides recommendations for migration to a modern data platform.


Artifact Extraction

LeapLogic requires certain artifacts to perform an assessment. As a prerequisite, you need superuser privileges to fetch the required data. You can copy the artifacts from your Git instance or from the Redshift repository where DDL scripts, stored procedures, functions, query execution logs, DML scripts, and other database objects are stored. LeapLogic needs all of these artifacts as .sql files.
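If it helps to keep the hand-off organized, the sketch below shows one possible way to group the extracted .sql files before sharing them. The directory layout and file names are illustrative assumptions, not a LeapLogic requirement; adjust them to whatever you actually export.

#!/bin/bash
# Illustrative packaging sketch -- directory and file names are assumptions.
# Group the extracted .sql files by artifact type and bundle them for hand-off.
mkdir -p redshift_artifacts/{table_ddl,view_ddl,procedures,functions,query_logs,dml}
cp table_ddl_output.sql redshift_artifacts/table_ddl/
cp view_ddl_output.sql  redshift_artifacts/view_ddl/
cp sp_ddl_output.sql    redshift_artifacts/procedures/
zip -r redshift_artifacts.zip redshift_artifacts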

DDL Scripts

Table DDLs

There are two options for extracting table DDL scripts: a Python script, or a shell script that uses psql.

Note:

To use the Python script, you need the redshift-connector module. For the Shell script, you need a psql client.

Table DDLs using Python

Install the redshift_connector module using the pip command.

pip install redshift_connector

Use the pip3 command if you have multiple Python versions installed.

pip3 install redshift_connector

Use the attached Python script and put it on your local machine or server. Run it from there to get the table DDLs.

# Connect to the cluster
import redshift_connector

conn = redshift_connector.connect(
    host='redshift_host',
    database='database',
    port=5439,
    user='username',
    password='password'
)

output_file = r'/path/table_ddl_output.sql'

# Create a Cursor object
cursor = conn.cursor()

# Query the catalog for all user tables
res = cursor.execute("""SELECT DISTINCT schemaname || '.' || tablename FROM pg_tables
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema') limit 10""").fetchall()

for value in res:
    print(value[0])
    res2 = cursor.execute("""show table """ + value[0]).fetchone()
    # Write the result to the file
    with open(output_file, 'a') as f:
        f.write('-- Table: ' + value[0] + '\n')
        f.write(str(res2[0]) + '\n\n')

Replace the placeholder connection details (host, database, user, password) and the output path with your own values. This script exports all the table DDLs. Run it for every database in the Redshift cluster.

Note:

Please remember to remove the limit 10 clause from the above script before executing it.

The Python script is also attached as a text file.

Download: table_ddl
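Since the script must be run once per database, a small wrapper can help. The sketch below is a hypothetical example that assumes you save the Python script as table_ddl.py and modify it to read the database name from the command line (for example via sys.argv); adjust it to your environment.

#!/bin/bash
# Hypothetical wrapper -- assumes table_ddl.py accepts the database name as an argument.
for DB in db1 db2 db3    # replace with the databases in your Redshift cluster
do
    echo "Exporting table DDLs for database: $DB..."
    python3 table_ddl.py "$DB"
done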

Table DDLs using Shell Script

Use the attached Shell script and put it on your local machine or server. Run it from there to get the table DDLs.

Note:

Run the script on a machine that has the psql client installed.

#!/bin/bash

# Database connection parameters
export PGHOST="redshift-host"
export PGDATABASE="database"
export PGPORT=5439
export PGUSER="username"
export PGPASSWORD="password"

# Output file
OUTPUT_FILE="/path/output/table_ddl_output.sql"

# Query to get table names
QUERY="SELECT DISTINCT schemaname || '.' || tablename FROM pg_tables WHERE schemaname NOT IN ('pg_catalog','information_schema') limit 10;"

# Execute the query and save the result in a variable
TABLES=$(psql -AXqtc "$QUERY")

# Loop over the tables
for TABLE in $TABLES
do
    # Query to get the table definition
    echo "Exporting Table DDL: $TABLE..."
    QUERY="SHOW TABLE $TABLE;"

    # Execute the query and save the result in a variable
    TABLE_DEFINITION=$(psql -AXqtc "$QUERY")

    # Write the table name and its definition to the output file
    echo "-- Table: $TABLE" >> $OUTPUT_FILE
    echo "$TABLE_DEFINITION" >> $OUTPUT_FILE
    echo "" >> $OUTPUT_FILE
done

Replace the placeholder connection details and output path with your own values, and remove the limit 10 clause before running. This script exports all the table DDLs. Run it for every database in the Redshift cluster.

The shell script is also attached as a text file.

Download: table_ddl
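As a usage example, assuming the attached shell script is saved as table_ddl.sh:

chmod +x table_ddl.sh
./table_ddl.sh

The exported DDLs are appended to the file set in OUTPUT_FILE.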

View DDLs

There are two options for extracting view DDL scripts: a Python script, or a shell script that uses psql.

Note:

To use the Python script, you need the redshift-connector module. For the shell script, you need a psql client.

View DDLs using Python

Use the attached Python script and put it on your local machine or server. Run it from there to get the view DDLs.

# Connect to the cluster
import redshift_connector

conn = redshift_connector.connect(
    host='redshift_host',
    database='database',
    port=5439,
    user='username',
    password='password'
)

output_file = r'/path/view_ddl_output.sql'

# Create a Cursor object
cursor = conn.cursor()

# Query the catalog for all user views
res = cursor.execute("""SELECT DISTINCT schemaname || '.' || viewname FROM pg_views
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema') limit 10""").fetchall()

for value in res:
    print(value[0])
    res2 = cursor.execute("""show view """ + value[0]).fetchone()
    # Write the result to the file
    with open(output_file, 'a') as f:
        f.write('-- View: ' + value[0] + '\n')
        f.write(str(res2[0]) + '\n\n')

Replace the placeholder connection details and output path with your own values, and remove the limit 10 clause before running. This script exports all the view DDLs. Run it for every database in the Redshift cluster.

The Python script is also attached as a text file.

Download: view_ddl

View DDLs using Shell Script

Copy the attached shell script to your local machine or server and run it from there to get the view DDLs.

Note:

Run the script on a machine that has the psql client installed.

#!/bin/bash

# Database connection parameters
export PGHOST="redshift-host"
export PGDATABASE="database"
export PGPORT=5439
export PGUSER="username"
export PGPASSWORD="password"

# Output file
OUTPUT_FILE="/path/output/view_ddl_output.sql"

# Query to get view names
QUERY="SELECT DISTINCT schemaname || '.' || viewname FROM pg_views WHERE schemaname NOT IN ('pg_catalog','information_schema') LIMIT 10;"

# Execute the query and save the result in a variable
VIEWS=$(psql -AXqtc "$QUERY")

# Loop over the views
for VIEW in $VIEWS
do
    # Query to get the view definition
    echo "Exporting view DDL: $VIEW..."
    QUERY="SHOW VIEW $VIEW;"

    # Execute the query and save the result in a variable
    VIEW_DEFINITION=$(psql -AXqtc "$QUERY")

    # Write the view name and its definition to the output file
    echo "-- View: $VIEW" >> $OUTPUT_FILE
    echo "$VIEW_DEFINITION" >> $OUTPUT_FILE
    echo "" >> $OUTPUT_FILE
done

Replace the placeholder connection details and output path with your own values, and remove the LIMIT 10 clause before running. This script exports all the view DDLs. Run it for every database in the Redshift cluster.

The shell script is also attached as a text file.

Download: view_ddl

Query Execution Logs

Similarly, there are two options for extracting the query execution logs: the UNLOAD command, or the AWS Redshift Query Editor.

Note:

We recommend using the UNLOAD command, especially when the file size is substantial.

Using UNLOAD Command

The UNLOAD command generates the required data files in an S3 bucket. See the prerequisites below.

  • 's3://bucket_name/path' – the S3 path where the files are generated.
  • iam_role 'arn:aws:iam::<aws acct num>:role/<redshift role>' – the IAM role of the Redshift cluster.

To export the required query execution logs, use the following UNLOAD command as a reference. Change the date range (cast(ss.starttime AS date) between '2023-12-01' and '2024-04-30') as required, and replace the S3 path and IAM role with your own values.

unload
(
$$
SELECT
    'IDWWM' || '~~' ||
    coalesce(sui.usename,'') || '~~' ||
    0 || '~~' ||
    'client_u' || '~~' ||
    NULLIF(sqlog.starttime,'9999-12-31') ::varchar || '~~' ||
    coalesce(sqm.query_cpu_time,0) ::varchar || '~~' ||
    0 || '~~' ||
    coalesce(x.byt,0) ::varchar || '~~' ||
    coalesce(datediff(millisecond, sc.endtime, sc.starttime),0) ::varchar || '~~' ||
    NULLIF(sqlog.starttime,'9999-12-31') ::varchar || '~~' ||
    NULLIF(sqlog.starttime,'9999-12-31') ::varchar || '~~' ||
    coalesce(sqlog.pid,999) ::varchar || '~~' ||
    coalesce(sqlog.query,999) ::varchar || '~~' ||
    'NA' ::VARCHAR || '~~' ||
    'NA' ::VARCHAR || '~~' ||
    coalesce(max(sqm.query_cpu_time) OVER (PARTITION BY sqm.query ORDER BY NULL ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),0)::varchar || '~~' ||
    coalesce(x.max_bytes,0)::varchar || '~~' ||
    coalesce(sqm.query_execution_time,0) ::varchar || '~~' ||
    'Not Present' || '~~' ||
    ss.sequence::varchar || '~~' ||
    coalesce(ss.type,'') || '~~' ||
    '' || '~~' ||
    coalesce(sqlog.label,'') || '~~' ||
    ss.text ::VARCHAR(MAX)
FROM SVL_STATEMENTTEXT ss
LEFT JOIN SVL_QLOG sqlog
    ON ss.xid = sqlog.xid
    AND ss.pid = sqlog.pid
    AND ss.userid = sqlog.userid
LEFT JOIN SVL_QUERY_METRICS sqm
    ON sqm.query = sqlog.query
LEFT JOIN svl_compile sc
    ON sqm.query = sc.query
    AND sqlog.pid = sc.pid
    AND sqlog.xid = sc.xid
LEFT JOIN SVL_USER_INFO sui
    ON sui.usesysid = sqm.userid
LEFT JOIN (
    SELECT query, sum(bytes) AS byt, coalesce(max(bytes),0) AS max_bytes
    FROM SVL_QUERY_SUMMARY
    WHERE userid > 1
    GROUP BY query
) x ON sqlog.query = x.query
WHERE ss.userid > 1
    AND sqlog.userid > 1
    AND cast(ss.starttime AS date) between '2023-12-01' and '2024-04-30'
$$)
to 's3://bucket_name/path'
iam_role 'arn:aws:iam::<aws acct num>:role/<redshift role>'
DELIMITER '|'
GZIP
ALLOWOVERWRITE;

Note:

The UNLOAD command runs in parallel, so it generates multiple files in the S3 bucket.
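To collect the generated parts locally, you can use the AWS CLI. The sketch below is an example only, assuming the same bucket and path used in the UNLOAD command above and a local query_logs directory.

# Download all generated parts and combine them into a single file (example only).
aws s3 cp s3://bucket_name/path/ ./query_logs/ --recursive
gunzip ./query_logs/*.gz          # the UNLOAD above uses GZIP
cat ./query_logs/* > query_execution_logs.txt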

Using AWS Redshift Query Editor

To export the required query execution logs, refer to the script below. Remember to change the date range (cast(ss.starttime AS date) between '2023-12-01' and '2024-04-30') as required.

The recommended timeframe is three months of end-to-end query logs. Execute the script below in the AWS Redshift Query Editor UI and download the output after execution.

SELECT
    'IDWWM' || '~~' ||
    coalesce(sui.usename,'') || '~~' ||
    0 || '~~' ||
    'client_u' || '~~' ||
    NULLIF(sqlog.starttime,'9999-12-31') ::varchar || '~~' ||
    coalesce(sqm.query_cpu_time,0) ::varchar || '~~' ||
    0 || '~~' ||
    coalesce(x.byt,0) ::varchar || '~~' ||
    coalesce(datediff(millisecond, sc.endtime, sc.starttime),0) ::varchar || '~~' ||
    NULLIF(sqlog.starttime,'9999-12-31') ::varchar || '~~' ||
    NULLIF(sqlog.starttime,'9999-12-31') ::varchar || '~~' ||
    coalesce(sqlog.pid,999) ::varchar || '~~' ||
    coalesce(sqlog.query,999) ::varchar || '~~' ||
    'NA' ::VARCHAR || '~~' ||
    'NA' ::VARCHAR || '~~' ||
    coalesce(max(sqm.query_cpu_time) OVER (PARTITION BY sqm.query ORDER BY NULL ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),0)::varchar || '~~' ||
    coalesce(x.max_bytes,0)::varchar || '~~' ||
    coalesce(sqm.query_execution_time,0) ::varchar || '~~' ||
    'Not Present' || '~~' ||
    ss.sequence::varchar || '~~' ||
    coalesce(ss.type,'') || '~~' ||
    '' || '~~' ||
    coalesce(sqlog.label,'') || '~~' ||
    ss.text ::VARCHAR(MAX)
FROM SVL_STATEMENTTEXT ss
LEFT JOIN SVL_QLOG sqlog
    ON ss.xid = sqlog.xid
    AND ss.pid = sqlog.pid
    AND ss.userid = sqlog.userid
LEFT JOIN SVL_QUERY_METRICS sqm
    ON sqm.query = sqlog.query
LEFT JOIN svl_compile sc
    ON sqm.query = sc.query
    AND sqlog.pid = sc.pid
    AND sqlog.xid = sc.xid
LEFT JOIN SVL_USER_INFO sui
    ON sui.usesysid = sqm.userid
LEFT JOIN (
    SELECT query, sum(bytes) AS byt, coalesce(max(bytes),0) AS max_bytes
    FROM SVL_QUERY_SUMMARY
    WHERE userid > 1
    GROUP BY query
) x ON sqlog.query = x.query
WHERE ss.userid > 1
    AND sqlog.userid > 1
    AND cast(ss.starttime AS date) between '2023-12-01' and '2024-04-30';

Other Database Objects

For a better assessment of your environment and workloads, we also recommend exporting additional database objects. Refer to the scripts below to export the data as separate delimited files.
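One possible way to capture each query's output as a delimited file is with psql. The sketch below is an example only, assuming the query has been saved to database_objects.sql and the PG* connection variables are exported as in the shell scripts above.

# Example only -- run a saved query and write pipe-delimited output to a file.
psql -AXqt -F '|' -f database_objects.sql -o database_objects.csv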

-- Database objects: run the below query for every database using superuser credentials
SELECT n.nspname AS schema_name
     , CASE WHEN c.relkind = 'v' THEN 'view' WHEN c.relkind = 'i' THEN 'index' ELSE 'table' END AS table_type
     , count(c.relname)
FROM pg_class AS c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
LEFT JOIN pg_description AS d ON (d.objoid = c.oid AND d.objsubid = 0)
WHERE n.nspname NOT IN ('information_schema', 'pg_catalog')
GROUP BY n.nspname, CASE WHEN c.relkind = 'v' THEN 'view' WHEN c.relkind = 'i' THEN 'index' ELSE 'table' END
UNION
SELECT n.nspname,
       'Stored_procedure' AS table_type,
       count(p.prosrc)
FROM pg_catalog.pg_namespace n
JOIN pg_catalog.pg_proc p ON pronamespace = n.oid
JOIN pg_catalog.pg_user b ON b.usesysid = p.proowner
WHERE nspname NOT IN ('information_schema', 'pg_catalog')
GROUP BY n.nspname, table_type

-- Databases and schemas
SELECT 'Databases', count(*) FROM pg_database
UNION
SELECT 'Schemas', count(*) FROM pg_namespace WHERE nspname NOT IN ('information_schema', 'pg_catalog')

-- High data volume tables
WITH tbl_ids AS
(
  SELECT DISTINCT oid
  FROM pg_class c
  WHERE --relowner > 1
     relkind = 'r'
),
pcon AS
(
  SELECT conrelid,
         CASE
           WHEN SUM(CASE WHEN contype = 'p' THEN 1 ELSE 0 END) > 0 THEN 'Y'
           ELSE 'N'
         END pk,
         CASE
           WHEN SUM(CASE WHEN contype = 'f' THEN 1 ELSE 0 END) > 0 THEN 'Y'
           ELSE 'N'
         END fk,
         conname
  FROM pg_constraint
  WHERE conrelid > 0
  AND   conrelid IN (SELECT oid FROM tbl_ids)
  GROUP BY conrelid, conname
)
SELECT
  database
  ,SCHEMA AS schemaname
  ,"table" AS tablename
  ,tbl_rows AS num_rows
  ,size AS size_mb
  ,pcon.pk
  ,pcon.conname
FROM svv_table_info ti
LEFT JOIN pcon ON pcon.conrelid = ti.table_id
WHERE ti.SCHEMA !~ '^information_schema|catalog_history|pg_' AND size_mb = 10000

-- Data for partitioning / bucketing
SELECT
  database
  ,SCHEMA AS schemaname
  ,"table" AS table_name
  ,size AS size_mb
  ,tbl_rows AS num_rows
  ,pg.attname AS column_name
  ,'' AS num_unique_values
FROM svv_table_info ti
INNER JOIN pg_attribute pg ON pg.attrelid = ti.table_id
WHERE ti.SCHEMA !~ '^information_schema|catalog_history|pg_'

-- Count of stored procedures and functions
SELECT database_name, schema_name, function_type, count(*)
FROM SVV_REDSHIFT_FUNCTIONS
WHERE schema_name !~ '^information_schema|catalog_history|pg_'
GROUP BY database_name, schema_name, function_type

-- List of stored procedures
SELECT
  database_name
  ,schema_name
  ,function_type
  ,function_name
FROM SVV_REDSHIFT_FUNCTIONS
WHERE schema_name NOT IN ('information_schema', 'pg_catalog')

-- Count of external tables/views in Redshift
SELECT
  redshift_database_name
  ,schemaname
  ,tabletype
  ,count(tablename)
FROM SVV_EXTERNAL_TABLES
GROUP BY
  redshift_database_name
  ,schemaname
  ,tabletype

-- List of external tables
SELECT
  redshift_database_name
  ,schemaname
  ,tabletype
  ,tablename
FROM SVV_EXTERNAL_TABLES

-- Total I/O usage by day
SELECT trunc(start_time) AS rundate, sum(local_read_IO + remote_read_IO) AS totalioreads
FROM SYS_QUERY_DETAIL
WHERE trunc(start_time) between '2024-03-28' and '2024-04-03'
GROUP BY trunc(start_time)

Note: Change the date range ('2024-03-28' to '2024-04-03') to a period of high usage spanning at least 30 days.

-- Database volume

Download: v_space_used_per_tbl

Create the above view in any schema and then execute the below query to pull the details.

SELECT
  dbase_name
  ,schemaname
  ,SUM(megabytes) AS total_mb
FROM public.v_space_used_per_tbl
GROUP BY
  dbase_name
  ,schemaname

-- Distinct application names
SELECT DISTINCT application_name
FROM pg_catalog.stl_connection_log
WHERE (recordtime between '2024-05-01' and '2024-05-30')
AND application_name IS NOT NULL

Note: Change the date range ('2024-05-01' to '2024-05-30') to a period of high usage spanning at least 30 days.

-- Distinct client IDs
SELECT DISTINCT client_id
FROM pg_catalog.stl_network_throttle
WHERE (log_time between '2024-05-01' and '2024-05-30')

Note: Change the date range ('2024-05-01' to '2024-05-30') to a period of high usage spanning at least 30 days.

-- The date range is subject to change as needed for the query log assessment (15 days / 1 month / 6 months, whatever is possible for extraction).

Stored Procedure and Function DDL Extraction

Stored Procedure DDLs

There are two options for extracting Redshift stored procedure DDL scripts: a Python script, or a shell script that uses psql.

Note:

To use the Python script, you need the redshift-connector module. For the shell script, you need a psql client.

Stored Procedure DDLs using Python

Copy the attached Python script to your local machine or server and run it from there to get the stored procedure DDLs.

# Connect to the cluster
import redshift_connector

conn = redshift_connector.connect(
    host='redshift_host',
    database='database',
    port=5439,
    user='username',
    password='password'
)

output_file = r'/path/sp_ddl_output.sql'

# Create a Cursor object
cursor = conn.cursor()

# Query the catalog for all stored procedures
res = cursor.execute("""select
    database_name || '.' ||
    schema_name || '.' ||
    function_name || '(' ||
    argument_type || ')'
from SVV_REDSHIFT_FUNCTIONS
where schema_name not in ('information_schema', 'pg_catalog')
    AND function_type = 'STORED PROCEDURE'""").fetchall()

for value in res:
    print(value[0])
    res2 = cursor.execute("""show procedure """ + value[0]).fetchone()
    # Write the result to the file
    with open(output_file, 'a') as f:
        f.write('-- Procedure: ' + value[0] + '\n')
        f.write(str(res2[0]) + '\n\n')

Replace the placeholder connection details and output path with your own values. This script exports all the stored procedure DDLs. Run it for every database in the Redshift cluster.

The Python script is also attached as a text file.

Download: proc_ddl_export

Stored Procedure DDLs using Shell Script

Copy the attached shell script to your local machine or server and run it from there to get the stored procedure DDLs.

Note:

Run the script on a machine that has the psql client installed.

#!/bin/bash

# Database connection parameters
export PGHOST="redshift-host"
export PGDATABASE="database"
export PGPORT=5439
export PGUSER="username"
export PGPASSWORD="password"

# Output file
OUTPUT_FILE="/path/output/sp_ddl_output.sql"

# Query to get stored procedure names (with normalized argument types)
QUERY="select database_name ||'.'|| schema_name ||'.'|| function_name ||'('|| replace(replace(replace(replace(replace(argument_type, ' ', ''), 'charactervarying','varchar'),'timestampwithouttimezone','timestamp'),'binaryvarying','varbyte'),'timestampwithtimezone','timestamptz') ||')' from SVV_REDSHIFT_FUNCTIONS where schema_name not in ('information_schema','pg_catalog') AND function_type = 'STORED PROCEDURE'"

# Execute the query and save the result in a variable
SPS=$(psql -AXqtc "$QUERY")

# Loop over the stored procedures
for SP in $SPS
do
    # Query to get the procedure definition
    echo "Exporting Procedure DDL: $SP..."
    QUERY="SHOW PROCEDURE $SP;"

    # Execute the query and save the result in a variable
    SP_DEFINITION=$(psql -AXqtc "$QUERY")

    # Write the procedure name and its definition to the output file
    echo "-- Stored Procedure: $SP" >> $OUTPUT_FILE
    echo "$SP_DEFINITION" >> $OUTPUT_FILE
    echo "" >> $OUTPUT_FILE
done

Replace the placeholder connection details and output path with your own values. This script exports all the stored procedure DDLs. Run it for every database in the Redshift cluster.

The shell script is also attached as a text file.

Download: sp_ddl

User Defined Function DDLs

There are two options for extracting user-defined function DDL scripts: the AWS Redshift console query editor, or any Redshift client that can execute the query and export the results.

Follow the steps below to export the DDL scripts using the AWS Query Editor.

  1. Click Amazon Redshift from the AWS Console.
  2. Select the Redshift cluster from the Cluster Overview tab.
  3. Click Query Data (highlighted in orange) on the right side.
  4. Click Query in query editor.
  5. The Query Editor opens, where the attached procedure can be compiled along with the other steps mentioned below.
  6. Execute the SELECT statement. When the query execution is complete, export the result as CSV.

To export the required DDLs, refer to the attached script.

Note:

Execute the query using superuser credentials; it can be run against any schema.

Download: generate_udf_ddls
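For the second option (any Redshift client), one possible approach is to run the attached query with psql and export the result. This is an example only, assuming the attached script is saved as generate_udf_ddls.sql and the PG* connection variables are exported as in the shell scripts above.

# Example only -- execute the attached UDF DDL query and export the result.
psql -AXqt -F '|' -f generate_udf_ddls.sql -o udf_ddls.csv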

Other Scripts and Artifacts

Copy any other scripts, such as DML scripts, from your Git instance and share them with the LeapLogic team to produce more extensive insights.


Getting Help

Contact LeapLogic technical support at info@leaplogic.io.

