Oracle Data Integrator Assessment Report
This topic contains information about the ODI assessment report. The assessment analyzes your workloads and produces in-depth insights that help you plan the migration. The ODI assessment accepts input in XML format.
Highlights
The Highlights section gives you a high-level overview of the analytics performed on the selected workloads. It includes a graphical depiction of package complexity, along with summaries of the source input files, orchestrations, unpackaged files, and so on.
Summary
This section summarizes the input ODI scripts analyzed across the various components. Here, you can see the number of files, packages, unique entities, mappings, and more (an illustrative sketch follows this list).
- Files: Displays the number of source files.
- Packages: Displays the number of packages existing in the source files.
- Unique Entities: Displays the total number of unique entities (without duplicate entities) in the source files.
- Mappings: Displays the total number of mappings. It describes the flow of data between the source and the target.
- Scenario: Displays the total number of scenarios. A scenario is the code generated to implement ODI objects such as mappings, packages, etc., in a production environment.
- Procedures: Displays the total number of procedures. A procedure is a set of sequential commands.
- Variables: Displays the total number of variables.
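Since the assessment input is an ODI XML export, the object counts above are derived from the exported metadata. For illustration only, here is a minimal sketch of how such counts could be computed with a generic XML parser; the element name (Object) and the class values are assumptions for demonstration, not the exact export schema that the tool parses.

```python
# Illustrative sketch: count exported ODI objects by an assumed "class" attribute.
# Element and attribute names are assumptions; adjust them to the real export layout.
import xml.etree.ElementTree as ET
from collections import Counter

def count_odi_objects(xml_path: str) -> Counter:
    """Return a tally of object types found in a (hypothetical) ODI XML export."""
    tree = ET.parse(xml_path)
    counts = Counter()
    for obj in tree.getroot().iter("Object"):        # assumed element name
        counts[obj.get("class", "Unknown")] += 1     # assumed attribute name
    return counts

# Hypothetical usage:
# count_odi_objects("odi_export.xml")
# -> Counter({'Mapping': 48, 'Package': 12, 'Variable': 9})
```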
Complexity (Across Packages)
This section provides a summarized graphical representation of the ODI packages classified by complexity. This information supports various decisions, including migration planning and budget estimation.
Queries
This section displays a synopsis of the analyzed queries, with information about unique texts, unique patterns, and parsed and unparsed queries.
- Unique (Text): Duplicate query texts are eliminated, and only distinct query texts are counted.
- Unique (Pattern): Counts the distinct query structures; queries that differ only in literal values share the same pattern (see the sketch after this list).
- Parsed Queries: Displays the number of parsed queries.
- Unparsed Queries: Displays the number of unparsed queries.
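To make the distinction between Unique (Text) and Unique (Pattern) concrete, here is a minimal sketch. Masking literal values with regular expressions is an assumption used purely for illustration; the tool's actual query parsing is more sophisticated.

```python
# Unique (Text) vs. Unique (Pattern), illustrated with a toy normalization.
import re

queries = [
    "SELECT * FROM orders WHERE order_id = 10",
    "SELECT * FROM orders WHERE order_id = 99",
    "SELECT * FROM orders WHERE order_id = 10",
]

# Unique (Text): exact duplicates are eliminated; literal values still matter.
unique_text = set(queries)                          # 2 distinct query texts

def pattern(q: str) -> str:
    """Mask literals so queries differing only in values share one pattern."""
    q = re.sub(r"'[^']*'", "?", q)                  # mask string literals
    return re.sub(r"\b\d+\b", "?", q)               # mask numeric literals

# Unique (Pattern): queries that differ only in literal values collapse together.
unique_pattern = {pattern(q) for q in queries}      # 1 distinct query pattern

print(len(unique_text), len(unique_pattern))        # 2 1
```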
Statement Types
This section displays a summary of the analyzed queries by their distribution across the different statement types in the input source files. It also displays a visual representation of the complexity breakdown across the various statement types.
Query Complexity
This section provides a summarized graphical representation of query complexity, which helps in making different decisions, including budget estimation (a small bucketing sketch follows the table).
Complexity Type | Complexity Range | Description |
Low | 0-4 | Handled directly by the tool; requires no manual effort. |
Medium | 4-10 | Most queries can be converted directly by the tool; some require minimal manual effort. |
High | 10+ | Highly complex; requires more manual effort to handle. |
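As a rough illustration, the sketch below maps a numeric complexity score to the buckets in the table above. How the boundary values 4 and 10 are assigned is an assumption here, since the listed ranges overlap at those points.

```python
# Illustrative bucketing of a complexity score per the table above.
# Boundary handling (scores of exactly 4 or 10) is an assumption.
def complexity_bucket(score: float) -> str:
    if score < 4:
        return "Low"      # handled directly by the tool, no manual effort
    if score < 10:
        return "Medium"   # mostly automatic, some manual effort
    return "High"         # requires more manual effort

print([complexity_bucket(s) for s in (2, 4, 7, 12)])  # ['Low', 'Medium', 'Medium', 'High']
```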
Orchestration
This section provides a summary of load plans along with the step types.
Unpackaged Files
Here, you can see a summary of step types that are defined outside the package, for example, mappings, procedures, and so on.
Analysis
This section provides a detailed examination of packages, entities, jobs, orchestration, and unpackaged files.
Source Analysis
This section provides a comprehensive report of the source files with information about packages, mappings, procedures, variables, and more.
- File Name: Name of the file.
- Package Name: Displays the package name in the file.
- Mapping: Displays the number of mappings in each package. It describes the flow of data between the source and the target.
- Procedure: Displays the number of procedures in each package. A procedure is a set of sequential commands.
- Scenario: Scenarios are the code generated to implement ODI objects such as mappings, packages, etc., in a production environment.
- Variables: Displays the count of variables in each package.
- Complexity: Displays the complexity of each package.
Entities
This section displays a detailed analysis of the entities. It includes information about the schemas, frequency of tables used, and the associated source files.
- Table Name: Name of the table.
- Schema Name: Name of the schema.
- Frequency: Displays how frequently the table is used.
- File Name: Displays the associated source file of each table.
Jobs
This section displays detailed information about the job such as its type, package, and more.
- Job Name: Name of the job.
- Job Type: Displays the type of job, such as Mapping, Scenario, Procedure, Variable, or Others.
  - Mapping: Describes the flow of data between the source and the target.
  - Procedure: A set of sequential commands.
  - Scenario: The code generated to implement ODI objects such as mappings, packages, etc., in a production environment.
  - Variable: An object that holds only one value.
  - Others: Jobs that do not belong to Mapping, Procedure, Scenario, or Variable.
- Package Name: Displays the associated package.
- File Name: Displays the associated source file.
Orchestration
This section displays detailed information about the load plan such as its step type, status, and more.
- Load Plan Name: Name of the load plan.
- Step Type: Displays the step type, such as Mapping, Package, Scenario, Procedure, Variable, or Others.
- Active Flag: Indicates whether the step is active.
- File Name: Displays the associated source file.
Unpackaged Files
This section displays detailed information about the artifacts defined outside the package that are segregated into mappings and procedures.
- Artifact Name: Name of the artifact.
- Step Type: Displays step types such as Mapping or Package.
- Step Name: Displays the step name.
- File Name: Displays the associated source file.
- Complexity: Displays the complexity of each step.
Lineage
End-to-end data and process lineage identifies the complete dependency structure through interactive, drill-down options down to the last level.
Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph adds tremendous value and helps define the roadmap to the modern data architecture. It deep dives into all the existing flows, such as Autosys jobs, applications, ETL scripts, BTEQ/Shell (KSH) scripts, procedures, and input and output tables, and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can proactively leverage this integrated analysis to mitigate the risks associated with migration and avoid business disruptions.
LeapLogic’s lineage helps you trace the complete data flow in steps. The visual representation helps you track how the data hops and is transformed along the way, with details of parameter changes, etc. It provides backward and forward lineage, showing the hops and reference points of data lineage with a complete audit trail of the data points of interest. These features also help you temporarily ignore peripheral, unwanted data points, such as system tables and bridge tables, and establish useful correlations between data points to uncover hidden patterns and results. Users can deep dive into specific flows leveraging next-gen features like contextual search, advanced filters by workload type, an interactive visual interface, graph depth, and more.
Now, let’s see how you can efficiently manage lineage.
To view the required lineage:
- Select either the Process or Data tab.
- Enter the keywords in the Search Keywords field.
- Click the Search icon to generate the lineage.
Process lineage illustrates the dependencies between two or more processes such as packages, scenarios, procedures, etc., whereas data lineage depicts data flow between two or more data-holding components such as entities, flat files, etc.
In addition, the filter search icon allows you to include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage: by default, the dependency direction is Left to Right Hierarchy, and you can switch to Right to Left Hierarchy as required. Moreover, you can increase the Hierarchy Levels to the nth level.
Lineage helps you visualize how your selected nodes are connected and how they depend on each other. The nodes and their connecting edges (relationships) help you understand the overall structure and dependencies (a small graph sketch follows the table below).
Nodes | Edges |
Tables | Call |
File | Read |
Procedure | Execute |
View | Write |
Bridge Table | OTHERS |
Mapping | Contains |
Package | |
Scenarios | |
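For intuition, the sketch below treats the node and edge types above as a directed graph, which is one way to reason about forward (left-to-right) and backward (right-to-left) lineage. The node names and the use of the networkx library are illustrative assumptions; the actual lineage is generated and rendered by the tool.

```python
# Illustrative lineage graph built from hypothetical node names and the edge
# types listed above (Contains, Read, Write, Execute).
import networkx as nx

G = nx.DiGraph()
G.add_edge("PKG_LOAD_SALES", "MAP_STG_SALES", relation="Contains")   # Package -> Mapping
G.add_edge("SRC.SALES", "MAP_STG_SALES", relation="Read")            # Table -> Mapping
G.add_edge("MAP_STG_SALES", "STG.SALES", relation="Write")           # Mapping -> Table
G.add_edge("PKG_LOAD_SALES", "PROC_AUDIT", relation="Execute")       # Package -> Procedure

# Forward lineage: everything reachable from the package (left-to-right hierarchy).
print(nx.descendants(G, "PKG_LOAD_SALES"))   # {'MAP_STG_SALES', 'STG.SALES', 'PROC_AUDIT'}

# Backward lineage: everything a target table depends on (right-to-left hierarchy).
print(nx.ancestors(G, "STG.SALES"))          # {'PKG_LOAD_SALES', 'MAP_STG_SALES', 'SRC.SALES'}
```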
Manage Lineage
This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: using the Complete Lineage report or the Lineage Template.
Using Complete Lineage report
Follow these steps to modify the lineage:
- Click the Manage Graph icon.
- Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.
- Once the complete lineage report is downloaded, you can make the necessary changes, such as updating, deleting, or adding nodes and their relationships.
- After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the dependency structure.
- Generate the required process or data lineage.
Using Lineage Template
Follow these steps to add new nodes and their relationships to the current lineage report (a scripted example follows this list):
- Click the Manage Graph icon.
- Click Download Lineage Template.
- Once the lineage template is downloaded, you can add new nodes and relationships in the template.
- After making the required changes, upload the template in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the complete dependency structure.
- Generate the required process or data lineage.
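If you have many nodes to add, you could also fill the downloaded template programmatically before uploading it, as in the sketch below. The file name, column headers, and values here are assumptions for illustration; use the headers that actually appear in the template you download.

```python
# Illustrative sketch: append new (hypothetical) node relationships to the
# downloaded lineage template, then save a copy for upload.
import pandas as pd

template = pd.read_excel("Lineage_Template.xlsx")            # assumed file name

new_rows = pd.DataFrame([{
    "Source Node": "MAP_STG_SALES",   # assumed column headers and values
    "Source Type": "Mapping",
    "Target Node": "DWH.SALES_FACT",
    "Target Type": "Table",
    "Relation": "Write",
}])

pd.concat([template, new_rows], ignore_index=True).to_excel(
    "Lineage_Template_updated.xlsx", index=False
)
```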
You can also use the following options:
Feature | Icon | Use |
Filter | | Used to filter the lineage. |
Reload | | Used to reload the lineage graph. |
Save | | Used to save the lineage. |
Download | | Used to download the file. |
Expand | | Used to enlarge the screen. |
Downloadable Reports
Downloadable reports allow you to export detailed assessment reports of your source data which enables you to gain in-depth insights with ease. To access these assessment reports, click Reports.
Types of Reports
In the Reports section, you can see various types of reports such as Insights and Recommendations, Source Inventory Analysis, and Lineage reports. Each report type offers detailed information allowing you to explore your assessment results.
Insights and Recommendations
This report provides an in-depth insight into the source input files. It contains the final output including the details of queries, complexity, and so on.
Here, you can see the odi folder and the Lineage Dependency Report.xlsx file.
Lineage Dependency Report.xlsx: This report contains information about view-level and script-level lineages. It includes information about used and impacted tables, views, files, direct dependencies, the dependency hierarchy, and more.
This report contains the following information:
- view_report: Provides information about the views.
- script_report: Provides information about script level lineage.
Detailed ODI Assessments Reports
To access a detailed assessment report, open the ODI folder.
odi_complexity_summary.csv: This report contains information about the complexity of the ODI ETL scripts, which helps in making different decisions, including budget estimation and estimating the effort required for migration.
odi_scenario_highlights.csv: This report provides information about the source files including total number of mappings, scenarios, variables, and more.
odi_sql_highlights.csv: This report provides information about the number of queries segregated by statement types such as delete, merge, insert, and more.
Browse the output folder to access the ODI Assessment.xlsx and queries_detail.csv reports.
ODI Assessment.xlsx: This report provides insights about the source inventory. It helps you plan the next frontier of a modern data platform methodically. It includes a report summary, package summary, load plan summary, and a lot more.
This report contains the following information:
- Report Summary: Provides information about all the generated artifacts.
- Volumetric Info: Lists an aggregated inventory of every source file. For instance, it provides information about the total number of files, packages, mappings, and so on.
- Package Summary: Provides information about the packages in the file. It includes statistical information about the components, mappings, procedures, and more along with the complexity of each package.
- Load Plan Details: Provides a summary of load plan components, including information about scenarios, used and impacted tables, and so on.
- Mapping Summary: Lists all the mappings along with information about the number of join components, expression components, complexity, and so on.
- Procedure Details: Provides information about procedures including used and unused tables, complexity, and more.
queries_detail.csv: This report provides information about queries including the used and impacted tables, analyzed status, complexity, and more. If the analyzed status is TRUE, it indicates that the query is analyzed successfully. Conversely, a FAIL status indicates that the query is not analyzed.
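As an example of working with this report offline, the sketch below inspects queries_detail.csv with pandas. The column names used here (Analyzed Status, Complexity) are assumptions; check the header row of the CSV you download before running it.

```python
# Illustrative inspection of queries_detail.csv; column names are assumptions.
import pandas as pd

df = pd.read_csv("queries_detail.csv")

# Queries that were not analyzed successfully and may need manual review.
not_analyzed = df[df["Analyzed Status"].astype(str).str.upper() != "TRUE"]
print(len(not_analyzed), "queries to review")

# Distribution of query complexity across the workload.
print(df["Complexity"].value_counts())
```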
Source Inventory Analysis
This is an intermediate report that helps to debug failures or to calculate the final report. It includes all the generated CSV reports, such as Entities.csv, Load Plan Details.csv, package summary.csv, and more.
Entities.csv: This report provides information about entities along with associated logical schema, table aliases, and more.
invalid_query.csv: This report lists all the invalid queries.
Load Plan Details.csv: This report provides comprehensive information about load plans including scenarios, scenario steps, execution orders, used and unused tables, and more.
package summary.csv: This report provides comprehensive information about packages along with information about components, mappings, procedures, complexity, and more.
Lineage Report
This report provides complete dependency details for all nodes. It provides an end-to-end data and process lineage that helps to identify the complete dependency structure and the data flow.
This report contains the following information:
- Dependency (Process): Provides information about the process lineage.
- Dependency (Data): Provides information about the data lineage.
- Nodes: Lists all the source and target nodes along with their types.
- Volumetric Info (Summary): Provides volumetric information about the artifact types such as input tables, output tables, and schedulers.