DataStage Assessment Report

This topic describes the DataStage assessment report. The assessment analyzes workloads and produces in-depth insights that help you plan the migration. DataStage assessment uses XML or DSX files as input.

In This Topic:

  • Highlights
    • Summary
    • Complexity
  • Analysis
    • Source Analysis
    • Entities
    • Jobs
    • Artifacts
  • Lineage
  • Downloadable Reports
    • Insights and Recommendations
    • Source Inventory Analysis
    • Lineage Report

Highlights

The Highlights section gives you a high-level summary of the analysis performed on the selected workloads. It includes a graphical depiction of the complexity of the files as well as a summary of the files used.

Summary

This section highlights the input DataStage files that were analyzed across the various jobs. Here, you can see the number of files, entities, jobs, and components, as well as the analyzed percentage of the workloads.

Complexity

This section provides a summarized graphical representation of the complexity of the DataStage files, which helps in making decisions such as migration planning and budget estimation.


Analysis

This section provides a detailed examination of source files, entities, jobs, and artifacts.

Source Analysis

This section provides a comprehensive report on the source files. It includes information about the total number of files, jobs, components, complexity, percentage of analyzed files, and queries associated with each file.

  • Files: Displays the name of the file.
  • Components: Displays the number of components in each file. DataStage accommodates multiple components, such as jobs.
  • Analyzed (%): Provides the percentage of analyzed queries in the file.
  • Jobs: Displays the number of jobs. Jobs contain instructions to perform a set of activities. Click the dropdown arrow next to the job count to view the associated job types along with their counts. The job types include:
    • Parallel: Jobs run in parallel.
    • Sequential: Jobs run in sequence.
    • Others: Jobs that are not parallel or sequential.
  • Queries: Provides the total number of queries in the file.
  • Routines: Displays the number of routines in each file. A routine is a custom script or a pre-defined collection of functions used to create, update, or view jobs.

Entities

This section displays a detailed analysis of the entities. It includes information about the job types, jobs, and the associated source files.

  • Table Name: Name of the table.
  • Type: Displays the type of job such as parallel, sequential, and others.
    • Parallel: Jobs run in parallel.
    • Sequential: Jobs run in sequence.
    • Others: Jobs that are not parallel or sequential.
  • Job Name: Displays the associated jobs of each table.
  • File Name: Displays the associated source file of each table.

Jobs

This section provides a complete analysis of the jobs. Jobs are the orchestration scripts (such as AutoSys or Control-M) that run in a certain order to perform a set of activities. The section lists the jobs that run in parallel, in sequence, or otherwise, along with their associated files.

Parallel Jobs

This section provides details of jobs that run in parallel.

  • Jobs Name: Displays the name of the parallel job.
  • Associated File: Displays the files associated with the jobs.

Sequential Jobs

This section provides details of jobs that run in sequence.

  • Jobs Name: Displays the name of the sequential job.
  • Associated File: Displays the files associated with the jobs.
  • Dependent Jobs: Provides the jobs that are associated with the jobs in the Jobs Name field.

Others

This section provides details of jobs other than parallel and sequential jobs.

  • Jobs Name: Displays the name of the job.
  • Associated File: Displays the files associated with the jobs.
  • Dependent Jobs: Provides the jobs that are associated with the jobs in the Jobs Name field.

Artifacts

This section gives details about artifacts, which are collections of related server data. It lists the artifacts that are missing, the artifacts that appear additionally, and the artifacts that could not be parsed completely due to an error.

Missing Artifacts

This section provides the details of all the missing artifacts. Additionally, it categorizes the missing artifacts into files and entities that are missing.

  • Artifact Name: Displays the name of the artifact.
  • Type: Provides the type of the artifact, such as file or table.
  • Linkage: Provides the linked or associated file name.

Additional Artifacts

This section provides the details of all the artifacts that appear additionally. It also categorizes the additional artifacts into files and entities that appeared additionally.

  • Artifact Name: Displays the name of the artifact.
  • Type: Provides the type of the artifact, such as file or table.
  • Linkage: Provides the linked or associated file name.

Unparsed Artifacts

This section provides the details of all the artifacts that could not be parsed completely due to an error.

  • File Name: Displays the name of the artifact.
  • Message: Displays the error message.

Lineage

End-to-end data and process lineage identifies the complete dependency structure through interactive, drill-down options to the last level.

Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph adds tremendous value and helps define the roadmap to the modern data architecture. It dives deep into all the existing flows, such as AutoSys jobs, applications, ETL scripts, BTEQ/Shell (KSH) scripts, procedures, and input and output tables, and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can proactively leverage integrated analysis to mitigate the risks associated with migration and avoid business disruption.

Now, let’s see how you can efficiently manage lineage.

To view the required lineage:

  1. Select either the Process or Data tab.
  2. Enter the keywords in the Search Keywords field.
  3. Click the Search icon to generate the lineage.

Process lineage illustrates the dependencies between two or more processes such as files, jobs, etc., whereas data lineage depicts data flow between two or more data-holding components such as entities, flat files, etc. 

In addition, the filter search icon allows you to include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage. By default, the Dependency Direction is Left to Right Hierarchy. You can choose Right to Left Hierarchy or Bidirectional instead, as required. You can also increase the Hierarchy Levels to the nth level.

Lineage helps you visualize how your selected nodes are connected and depend on each other. The nodes and their connecting edges (relationships) help you understand the overall structure and dependencies.

  • Nodes: Tables, File, Job, Autosys Box, Script
  • Edges: Call, Read, Execute, Write, OTHER
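
As an illustration of how Dependency Direction and Hierarchy Levels shape the result, the following minimal sketch walks a small, hand-written edge list. It is not LeapLogic's implementation: the `lineage` function, the direction names, and every node name in `EDGES` are hypothetical and chosen only for this example.

```python
from collections import deque

# Hypothetical edges as (source, relation, target); all names are made up.
EDGES = [
    ("load_orders.dsx", "Execute", "job_load_orders"),
    ("job_load_orders", "Read", "stg_orders"),
    ("job_load_orders", "Write", "dw_orders"),
    ("job_build_report", "Read", "dw_orders"),
]


def lineage(start, edges, direction="left_to_right", levels=1):
    """Return the edges reachable from `start` within `levels` hops.

    direction: "left_to_right" follows source -> target,
               "right_to_left" follows target -> source,
               "bidirectional" follows both.
    """
    seen_nodes = {start}
    result = []
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == levels:
            continue  # hierarchy level limit reached for this branch
        for src, rel, dst in edges:
            forward = direction in ("left_to_right", "bidirectional") and src == node
            backward = direction in ("right_to_left", "bidirectional") and dst == node
            if not (forward or backward):
                continue
            if (src, rel, dst) not in result:
                result.append((src, rel, dst))
            neighbour = dst if forward else src
            if neighbour not in seen_nodes:
                seen_nodes.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return result


if __name__ == "__main__":
    for edge in lineage("dw_orders", EDGES, "bidirectional", levels=2):
        print(edge)
```

Raising `levels` widens the walk by one hop per level, which mirrors the idea of increasing the hierarchy level in the graph.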

Manage Lineage

This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: using the Complete Lineage report or using the Lineage Template.

Using Complete Lineage report

Follow the steps below to modify the lineage:

  1. Click the Manage Graph icon.
  2. Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.
  3. Once the complete lineage report is downloaded, make the necessary changes, such as updating, deleting, or adding nodes and their relationships.
  4. After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
  5. Click Apply to incorporate the updates into the dependency structure.
  6. Generate the required process or data lineage.

Using Lineage Template

Follow the steps below to add new nodes and their relationships to the current lineage report:

  1. Click the Manage Graph icon.
  2. Click Download Lineage Template.
  3. Once the lineage template is downloaded, add the new nodes and relationships to the template.
  4. After making the required changes, upload the template in Upload to Modify Lineage.
  5. Click Apply to incorporate the updates into the complete dependency structure.
  6. Generate the required process or data lineage.

    Important

    To effectively manage lineage, you must adhere to the following rules:

    • Do not modify the column headers or their order in the Complete Lineage report or the Lineage template. The following are the column headers: source_name, source_type, target_name, target_type, relation_type, and database_id.
    • When deleting a row, retain the value in the database_id column and clear all the other column values.
    • When inserting a row, add all the column values except in the database_id column.
    • When updating a row, ensure that all the columns have relevant data.
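
The row rules above can be checked before upload with a small script. The sketch below is only an assumption about how such a check might look: it treats each row as a dictionary keyed by the documented column headers, and the `classify_row` helper is hypothetical, not part of LeapLogic. How you read the rows from the downloaded file is left out.

```python
# Column headers, in the order documented for the report and the template.
HEADERS = ["source_name", "source_type", "target_name",
           "target_type", "relation_type", "database_id"]


def classify_row(row):
    """Classify one row (a dict keyed by HEADERS) as delete, insert, or update.

    Raises ValueError when the row fits none of the documented rules.
    """
    if list(row.keys()) != HEADERS:
        raise ValueError("column headers or their order were changed")

    # A cell counts as filled when it holds something other than whitespace.
    filled = {
        name: ("" if row[name] is None else str(row[name])).strip() != ""
        for name in HEADERS
    }
    others = [filled[name] for name in HEADERS if name != "database_id"]

    if filled["database_id"] and not any(others):
        return "delete"   # only database_id retained
    if not filled["database_id"] and all(others):
        return "insert"   # everything except database_id provided
    if filled["database_id"] and all(others):
        return "update"   # every column has data (also true of untouched rows)
    raise ValueError("row matches none of the delete, insert, or update rules")
```

Note that a row with every column filled looks the same whether it was updated or left unchanged, so the sketch reports both cases as "update".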

    You can also use the following features:

    • Filter: Used to filter the lineage.
    • Reload: Assists in reloading graphs.
    • Save: Used to save the lineage.
    • Download: Used to download the file.
    • Expand: Used to enlarge the screen.


    Downloadable Reports

    Downloadable reports allow you to export detailed assessment reports of your source data, which enables you to gain in-depth insights with ease. To access these assessment reports, click Reports.

    Types of Reports

    In the Reports section, you can see various types of reports such as Insights and Recommendations, Source Inventory Analysis, and Lineage reports. Each report type offers detailed information allowing you to explore your assessment results.

    Insights and Recommendations

    This report provides an in-depth insight into the source input files. It contains the final output including the details of queries, complexity, jobs, and so on.

    Here, you can see the datastage folder along with the DataStage Assessment.xlsx and Lineage Dependency Report.xlsx reports.

    DataStage Assessment.xlsx: This report provides insights about the source inventory. It helps you plan the next frontier of a modern data platform methodically. It includes a report summary, aggregated inventory, file summary, job summary, and more.

    This report contains the following information:

    • Report Summary: Provides information about all the generated artifacts.
    • Volumetric Info: Presents a summary of the aggregated inventory after analyzing the source files. For instance, it provides volumetric info about the total number of files, jobs, shared containers, stages, queries, and so on. It also provides job-level and query-level complexity.
    • File Summary: Lists all the input files along with the total number of jobs, components, complexity, and analyzed percentage. It also provides statistical information about the transformation components such as sequencer, Lookup, aggregator, and so on. These transformation components are transformed or converted to the target platform.
    • Job Summary: Lists all the jobs associated with the input file. It also provides information about the complexity, analyzed percentage, the total number of components, and a lot more.
    • Routines: Lists all the routines (Reusable custom script or pre-defined set of functions to perform specific operations or functionality in the DataStage job) present in the input files along with the file paths where the routine code resides.
    • UnSupported Component: Provides information about the unsupported component types along with their frequency.
    • Complexity Summary: Provides details about the complexity of jobs.
    • Transformation Patterns: Provides details about transformations with similar patterns. It also includes information about transformation types, patterns, and occurrences.
    • Job Patterns: Provides details about jobs with similar patterns. It also includes information about job types, patterns, and occurrences.
    • Missing Job: Lists all the missing jobs along with their parent job.

    Lineage Dependency Report.xlsx: This report contains information about views and job level lineages. It includes information about used and impacted tables, views, files, direct dependencies, dependency hierarchy and more.

    This report contains the following information:

    • view_report: Provides information about the views.
    • script_report: Provides information about job level lineage.

    Browse through the datastage folder to get the datastage_column_lineage.csv, datastage_job_details_param.csv, Job Analysis.xlsx, Routines.csv, and Query.csv (csv folder) reports, as well as the job-level column lineage (datastage_column_lineage folder) and stage-level column lineage (datastage_column_stage_lineage folder) reports.

    datastage_column_lineage.csv: This report provides information about column lineage. It includes information about source and target nodes (tables or files) along with associated columns, types, stages, and more.

    datastage_job_details_param.csv: This report provides information about jobs including parent jobs, child jobs, parameters, values, and more.

    Job Analysis.xlsx: This report provides information about jobs including the stages, stage types, container types, impacted and used tables along with a list of job parameters.

    This report contains the following information:

    • Job Analysis: Provides detailed information about jobs including the stage name, stage types, container types, impacted, used tables and more.
    • Job Parameters: Lists all the job parameters.

    Routines.csv: This report lists all the routines (Reusable custom script or pre-defined set of functions to perform specific operations or functionality in the DataStage job) present in the input files along with the file paths where the routine code resides.

    Browse through the csv folder to view Query.csv report.

    Query.csv: This report provides information about queries including the used and impacted tables, analyzed status, complexity, and more. If the analyzed status is Analyzed, it indicates that the query is analyzed successfully. Conversely, a Not Analyzed status indicates that the query is not analyzed.
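
To get a quick count of Analyzed and Not Analyzed queries from an exported Query.csv, a few lines of Python are enough. The sketch below is hedged: the column name `Analyzed Status`, the other column names, and the sample rows are assumptions made for illustration, so check the header row of your own export and adjust `status_column` accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical sample; the real Query.csv may use different column names.
SAMPLE = """Query,Analyzed Status,Complexity
SELECT 1,Analyzed,Simple
SELECT 2,Not Analyzed,Medium
SELECT 3,Analyzed,Complex
"""


def analyzed_summary(csv_text, status_column="Analyzed Status"):
    """Count rows per analyzed status and return (counts, analyzed percent)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row[status_column].strip() for row in rows)
    total = len(rows)
    percent = 100.0 * counts["Analyzed"] / total if total else 0.0
    return counts, percent


if __name__ == "__main__":
    counts, percent = analyzed_summary(SAMPLE)
    print(dict(counts), round(percent, 1))
```

For a real export, read the file contents with `open(path, newline="", encoding="utf-8").read()` and pass the text to `analyzed_summary`.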

    Browse through the datastage_column_lineage folder to view job-level column lineage details. Each file includes information about a specific job such as source and target, along with their types, columns, stages, and more.

    Browse through the datastage_column_stage_lineage folder to view stage-level column lineage details. Each file contains information about a specific stage (a component of a DataStage job), including associated jobs, source and target nodes along with their types, columns, and more.

    Source Inventory Analysis

    This is an intermediate report that helps to debug failures or calculate the final report. It includes all the generated CSV reports, including external executable files, external procedures files, job details, unparsed files, invalid queries, and more.

    assessment_unparsed Files.csv: This report lists all the unparsed DataStage files along with their type and the reason for parsing failure.

    datastage_column_lineage.csv: This report provides information about column lineage. It includes information about source and target nodes (tables or files) along with associated columns, types, stages, and more.

    datastage_job_details_param.csv: This report provides information about jobs including parent jobs, child jobs, parameters, values, and more.

    invalid_query.csv: Lists all the invalid queries.

    Routines: Lists all the routines (Reusable custom script or pre-defined set of functions to perform specific operations or functionality in the DataStage job) present in the input files along with the file paths where the routine code resides.

    Browse through the csv folder to access External Executable File.csv, External Procedures File.csv, and Job Details.csv reports.

    External Executable File.csv: Provides information about external executable files including jobs, executable file paths, commands, and more.

    External Procedures File.csv: Provides information about external procedures files. It includes information about jobs, procedures, and their availability.

    Job Details.csv: Provides information about jobs such as job types, child jobs, child job types, and more.

    Lineage Report

    This report provides complete dependency details for all nodes. It provides an end-to-end data and process lineage that helps to identify the complete dependency structure and the data flow.

    This report contains the following information:

    • Dependency (Process): Provides information about the process lineage.
    • Dependency (Data): Provides information about the data lineage.
    • Dependency (Data Model): Provides dependency details about the data models.
    • Nodes: Lists all the source and target nodes along with their types.
    • Volumetric Info (Summary): Provides volumetric information about the artifact types such as input tables, output tables, and DataStage jobs.

    To learn more, contact our support team or write to: info@leaplogic.io

    Copyright © 2025 Impetus Technologies Inc. All Rights Reserved
