Ab Initio Assessment Report

This topic describes the Ab Initio assessment report. The assessment analyzes workloads and produces in-depth insights that help you plan the migration. It accepts various file formats, such as XFR, DML, KSH, PLAN, and PSET. Among these, a PLAN file provides the detailed workflow for executing tasks, and a PSET file contains tasks or calls to other scripts, such as BTEQ.

In This Topic:

Highlights
- Summary
- Complexity
Analysis
- Source Analysis
- Plan File Information
Lineage
Downloadable Reports

Highlights

The Highlights section gives you a high-level summary of the analytics performed on the selected workloads. It includes a graphical depiction of file complexity as well as a summary of the files used.

Summary

This section showcases an overview of input Ab Initio graphs and the associated workload inventory. It provides insights into the total number of files, components, Plan files, tasks and more.

Files: Displays the total number of Ab Initio source files.
Supported Components: Displays the number of unique components identified within the source files.
Plan files: Displays the total number of Plan files. A PLAN file provides the detailed workflow for executing tasks.
Total tasks: Displays the number of tasks within the Plan files.
Components: Displays the total number of components.
Analyzed: Displays the percentage of analyzed Ab Initio workloads.

Complexity

This section provides a summarized graphical representation of the complexity of the Ab Initio ETL graphs. It supports decisions such as budget estimation and planning the effort required for migration.

Analysis

This section provides a detailed examination of source and Plan files.

Source Analysis

This section provides a comprehensive report of the source file including statistical information about file complexity, input/output sources, components, and more.

Files: Displays the name of the file.
Input Sources: Provides the number of input files identified in the attached artifacts.
Output Sources: Provides the number of target files for jobs.
Components: Displays the total number of components. Components are the building blocks used to perform a task. They fall into two main categories: dataset components, which store the data, and program components, which process the data.
Auto Conversion (%): Displays the percentage of auto-converted files.
Raw Loc: Displays the number of lines of queries in the attached input.
Complexity: Displays the complexity of files.

Plan File Information

This section provides detailed information about plan files, including associated PSET files, MP files, plan tasks, graph tasks, and more.

File Name: Displays the name of the Plan file.
Conditional Task: Displays the number of conditional tasks in each Plan file. A conditional task executes tasks based on specific conditions.
Graph Task: Displays the number of graph tasks in each Plan file. A graph task executes a graph.
Plan Task: Displays the number of plan tasks in each Plan file. A plan task executes another Plan.
Program Task: Displays the number of program tasks in each Plan file. A program task executes a program or script.
MP Files: Displays the number of MP files that are associated with the Plan file.
PSET Files: Displays the number of PSET files that are associated with the Plan file.

Lineage

End-to-end data and process lineage identifies the complete dependency structure through interactive and drill-down options to the last level.

Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph adds tremendous value and helps define the roadmap to the modern data architecture. It deep dives into all the existing flows, like Autosys jobs, applications, ETL scripts, BTEQ/Shell (KSH) scripts, procedures, input and output tables, and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can proactively leverage integrated analysis to mitigate the risks associated with migration and avoid business disruption.

The following steps show how to view and work with lineage.

To view the required lineage:

Select either the Process or Data tab.
Enter the keywords in the Search Keywords field.

Click the Search icon to generate the lineage.

Process lineage illustrates the dependencies between two or more processes, such as files and jobs, whereas data lineage depicts the data flow between two or more data-holding components, such as entities and flat files.

In addition, the filter search icon allows you to include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage. By default, the Dependency Direction is Left to Right Hierarchy. You can choose Right to Left Hierarchy or Bidirectional instead, as required. You can also increase the Hierarchy Levels to the nth level.

Lineage helps you visualize how your selected nodes are connected and depend on each other. The nodes and their connecting edges (relationships) help you understand the overall structure and dependencies.

Nodes	Edges
Tables	Call
File	Read
Job	Execute
Autosys Box	Write
Script	OTHER
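
The Dependency Direction and Hierarchy Levels options can be pictured as a depth-limited graph traversal. The following Python sketch is illustrative only and is not the product's implementation: the edge list, the node names, the `lineage` function, and the mapping of Left to Right Hierarchy to "follow edges from source to target" are all assumptions made for this example.

```python
from collections import deque

# Hypothetical edges in (source, relation, target) form; names are made up.
EDGES = [
    ("job_a", "Execute", "script_b"),
    ("script_b", "Write", "table_c"),
    ("table_c", "Read", "job_d"),
]


def lineage(edges, start, direction="left_to_right", levels=1):
    """Return {node: hop_count} for nodes within `levels` hops of `start`.

    Assumption: "left_to_right" follows edges source -> target,
    "right_to_left" follows them target -> source, and
    "bidirectional" follows both.
    """
    forward, backward = {}, {}
    for source, _relation, target in edges:
        forward.setdefault(source, []).append(target)
        backward.setdefault(target, []).append(source)

    def neighbours(node):
        found = []
        if direction in ("left_to_right", "bidirectional"):
            found += forward.get(node, [])
        if direction in ("right_to_left", "bidirectional"):
            found += backward.get(node, [])
        return found

    # Breadth-first search that stops expanding at the level limit.
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == levels:
            continue
        for nxt in neighbours(node):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    seen.pop(start)
    return seen


downstream = lineage(EDGES, "script_b", "left_to_right", levels=1)
upstream = lineage(EDGES, "script_b", "right_to_left", levels=1)
both = lineage(EDGES, "script_b", "bidirectional", levels=2)
```

Under these assumptions, raising `levels` widens the result one hop at a time, which mirrors how increasing the Hierarchy Levels reveals more of the dependency structure.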

Manage Lineage

This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: using the Complete Lineage report or using the Lineage Template.

Using Complete Lineage report

Follow the steps below to modify the lineage:

Click the Manage Graph icon.

Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.

Once the complete lineage report is downloaded, make the necessary changes, such as updating, deleting, or adding nodes and their relationships.
After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
Click Apply to incorporate the updates into the dependency structure.
Generate the required process or data lineage.

Using Lineage Template

Follow the steps below to add new nodes and their relationships to the current lineage report:

Click the Manage Graph icon.

Click Download Lineage Template.

Once the lineage template is downloaded, you can add new nodes and relationships in the template.
After making the required changes, upload the template in Upload to Modify Lineage.
Click Apply to incorporate the updates into the complete dependency structure.
Generate the required process or data lineage.

Important:

To effectively manage lineage, you must adhere to the following rules:

Do not modify the column headers or their order in the Complete Lineage report or the Lineage template. The following are the column headers: source_name, source_type, target_name, target_type, relation_type, and database_id.
When deleting a row, retain the value in the database_id column and clear all the other column values.
When inserting a row, add all the column values except in the database_id column.
When updating a row, ensure that all the columns have relevant data.
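
The rules above can be checked before uploading an edited file. The following Python sketch is a hedged illustration, not a tool shipped with the product: it assumes the edited report has been saved as CSV text (the actual download format is not stated here), and the `classify_row` and `check_sheet` helpers are hypothetical names introduced for this example. Only the six column headers come from the rules above.

```python
import csv
import io

# Column headers and their required order, as listed in the rules above.
HEADERS = ["source_name", "source_type", "target_name",
           "target_type", "relation_type", "database_id"]


def classify_row(row):
    """Classify one row as 'delete', 'insert', 'update', or 'invalid'.

    An unchanged existing row also classifies as 'update', because it
    has a database_id and all other columns filled in.
    """
    others = [(row.get(h) or "").strip() for h in HEADERS[:-1]]
    db_id = (row.get("database_id") or "").strip()
    if db_id and not any(others):
        return "delete"   # database_id kept, everything else cleared
    if not db_id and all(others):
        return "insert"   # everything filled in except database_id
    if db_id and all(others):
        return "update"   # every column has data
    return "invalid"      # partially filled row that matches no rule


def check_sheet(csv_text):
    """Verify the header row, then classify every data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != HEADERS:
        raise ValueError("column headers or their order were modified")
    return [classify_row(row) for row in reader]


# Hypothetical sample data; node names and IDs are made up.
sample = """source_name,source_type,target_name,target_type,relation_type,database_id
job_a,Job,script_b,Script,Execute,101
,,,,,102
script_b,Script,table_c,Table,Write,
job_a,,script_b,Script,Execute,103
"""
results = check_sheet(sample)
```

Rows classified as `invalid` are the ones worth reviewing by hand before you click Apply.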

You can also use the following features:

Feature	Icon	Use
Filter		Used to filter the lineage.
Reload		Assists in reloading graphs.
Save		Used to save the lineage.
Download		Used to download the file.
Expand		Used to enlarge the screen.

Downloadable Reports

Downloadable reports allow you to export detailed assessment reports of your source data so that you can gain in-depth insights with ease. To access these assessment reports, click Reports.

Types of Reports

In the Reports section, you can see various types of reports such as Insights and Recommendations, Source Inventory Analysis, and Lineage reports. Each report type offers detailed information allowing you to explore your assessment results.

Insights and Recommendations

This report provides an in-depth insight into the source input files. It contains the final output including the details of queries, complexity, components, and so on.

Here, you can see the abinitio folder, Lineage Dependency Report.xlsx, and updated_complexity_summary.csv.

Lineage Dependency Report.xlsx: This report contains information about view-level and script-level lineage. It includes information about used and impacted tables, views, files, direct dependencies, dependency hierarchy, and more.

This report contains the following information:

view_report: Provides information about the views.
script_report: Provides information about script-level lineage.

updated_complexity_summary.csv: This report provides information about the updated complexity of the Ab Initio ETL graphs. During the configuration of the Ab Initio Assessment, you can redefine the Complexity Distribution field to transfer the complexity of queries or files from a higher level to a lower level. This updated complexity is reflected in this report.

Detailed Ab Initio Assessment Reports

To access a detailed assessment report, open the abinitio folder.

abinitio_complexity_details.csv: This report contains information about the complexity of the Ab Initio ETL graphs, which helps in making decisions such as budget estimation and the effort required for migration.

abinitio_file_details.csv: This report provides information about the source files, including the total number of components, complexity, and conversion percentage of each file.

abinitio_query_details.csv: This report provides information about queries including the used and impacted tables, analyzed status, complexity, and more. If the analyzed status is TRUE, it indicates that the query is analyzed successfully. Conversely, a FALSE status indicates that the query is not analyzed.
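
As a rough illustration of how the analyzed status can be summarized, the Python sketch below computes the share of queries marked TRUE. It is a sketch under stated assumptions: the column names `query_id` and `analyzed` and the sample rows are hypothetical, so check the header row of your own abinitio_query_details.csv and adjust them.

```python
import csv
import io


def analyzed_percentage(csv_text, status_column="analyzed"):
    """Return the percentage of rows whose status column is TRUE.

    `status_column` is an assumed column name, not a documented one.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return 0.0
    analyzed = sum(
        1 for row in rows
        if (row.get(status_column) or "").strip().upper() == "TRUE"
    )
    return 100.0 * analyzed / len(rows)


# Hypothetical sample content; real reports contain more columns.
sample = """query_id,analyzed
q1,TRUE
q2,TRUE
q3,FALSE
q4,TRUE
"""
percent = analyzed_percentage(sample)
```

Rows with a FALSE status are the natural starting point when you investigate why part of a workload was not analyzed.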

abinitio_summary_details.csv: This report provides a summary of source input files including statistical information such as total number of files, analyzed percentage, components, and more.

AbinitioAssessmentDetailReport.csv: This report provides detailed insights about the source inventory. It includes information about components, conversion percentage, and more.

AbinitioComponentMetadataReport.csv: This report provides information about the Component Metadata including component types, queries, entities, and more.

AbinitioCustomComplexityReport.xlsx: This report provides information about the source inventory including DML files, XFR files, and complexity.

This report contains the following information:

Report Summary: Provides information about all the generated artifacts.
Volumetric Info: Presents a summary of the aggregated inventory after analyzing the source files. For instance, it provides the total number of files, components, XFRs, and so on. It also provides file-level complexity.
DML Summary: Provides information about internal DML files in the kab files.
XFR Summary: Provides information about XFR files that contain transformation logic.
Abinitio Complexity: Provides complexity details of the Ab Initio files. It includes statistical and complexity details of components, complex DML files, complex XFR files, and so on.

Source Inventory Analysis

This is an intermediate report that helps you debug failures or calculate the final report. It includes all the generated CSV reports.

abinitio_external_files.csv: This report provides information about external files.

AbinitioComponentMetadataReport.csv: This report provides information about the Component Metadata including component types, queries, entities, and more.

AbinitioFileAvailabilityReport.csv: This report provides information about the availability of each file.

assessment_unparsed_etl_files.csv: This report lists all the unparsed Ab Initio files along with their types and the reasons for the parsing failures.

invalid_query.csv: This report lists all the invalid queries.

Lineage Report

This report provides complete dependency details for all nodes. It provides an end-to-end data and process lineage that helps to identify the complete dependency structure and the data flow.

This report contains the following information:

Dependency (Process): Provides information about the process lineage.
Dependency (Data): Provides information about the data lineage.
Nodes: Lists all the source and target nodes along with their types.
Volumetric Info (Summary): Provides volumetric information about the artifact types such as input tables, output tables, and schedulers.