Informatica Assessment Report

This topic contains information about the Informatica assessment report. The assessment assesses workloads and produces in-depth insights that help to plan the migration. The input format for Informatica assessment is XML, or PRM file format.

In This Topic:

Highlights
Analysis
Lineage
Downloadable Reports

Highlights

The highlights section gives you a high-level overview of your assessment summary of the analytics performed on the selected workloads. It includes a graphical depiction of the complexity of files as well as the summary of the files used.

Summary

This section illustrates the input informatica graphs that were analyzed throughout the various workflows and components. Here, you can see the number of files, transformations, mappings, workflows, analyzed percentage as well as auto-conversion percentage of the workloads.

Scheduler

Schedulers monitor and execute tasks or jobs at a specific time.

On Server Init: The jobs are triggered based on the availability of the server.
Continuous: Jobs that are continuously executed based on a scheduled time.
On Demand: Jobs that are triggered manually for execution.
Unknown: Jobs that do not belong to the above categories.

Complexity

This section provides a summarized graphical representation of the classification of Informatica files and workflows based on a detailed complexity assessment. This information is required to make different decisions, including migration planning, estimating budget etc.

Analysis

This topic provides a detailed examination of source files, entities, and artifacts.

Source Analysis

This section provides a comprehensive report of the source files with information about the total number of files, source databases, workflows, complexity, external files, and so on.

File Name: Displays name of the file.
Source Databases: Provides details about the source database.
Workflows: Displays the number of workflows. It is a set of instructions to perform various actions.
Mappings: Displays the number of mappings. It describes the flow of data between source and target.
Transformations: Displays the number of transformations. It is a set of instructions to create, modify, or transform data to the target.
Queries: Provides the total number of queries in the file.
Complexity: Displays the complexity of the files.
External Files: Displays the number of external files.

Entities

This section displays a detailed analysis of the entities comprising the available and missing tables.

Tables

This section provides details about the tables used in the Lookup transformation, source, and target. The Lookup transformation is used to calculate field values from the source, source qualifier, or target to retrieve required data.

Table Name: Displays name of the table.
Data Source: Provides details of data source.
Source Database: Provides the details of the source database.
Frequency of Use: Provides the frequency of table usage.

Missing Tables

This section provides a list of all the missing tables.

Table Name: Displays name of the table.
Frequency of Use: Provides the frequency of table usage.

Scheduler

This section provides information about the jobs including the types, status, start time, and more.

Scheduler Name: Displays the name of the scheduler.
Scheduler Type: Displays the type of the scheduler such as ONDEMAND, Continuous, etc.
Start Date & Time: Displays the job execution start date and time.
End Date & Time: Displays the job execution end date and time.
Frequency: Displays the frequency of the scheduled jobs such as daily, weekly, monthly, etc.
Interval: Displays the frequency interval of the scheduled jobs.
Filters: To filter the data based on specific requirements. For example: Run the job only for specific geography.
Associated Workflow: Displays information about the associated workflow.

Workflows

This section provides a summary of workflows with information about mappings, complexity, and associated files.

Workflows Name: Displays name of the workflow.
File Name: Displays the associated source file.
Mapping: Describes the flow of data between source and target.
Complexity: Displays the complexity of the workflows.

Artifacts

This page gives details about artifacts-collections of the related server data. It provides a list of missing artifacts, artifacts that appear additionally, or could not be parsed completely due to some error.

Missing Artifacts

This section provides the details of all the missing artifacts. Additionally, it categorizes the missing artifacts into files and entities that are missing.

Artifact Name: Displays name of the artifact.
Type: Provides the type of the artifacts such as file, table, etc.
Linkage: Provides the linked or associated file names.

Additional Artifacts

This section provides the details of all the artifacts that appear additionally. It also categorizes the additional artifacts into files and entities.

Artifact Name: Displays name of the artifact.
Type: Provides the type of the artifacts such as file, table, etc.
Linkage: Provides the linked or associated file names.

Unparsed Artifacts

This section provides the details of all the artifacts that could not be parsed completely due to some error.

Lineage

End-to-end data and process lineage identify the complete dependency structure through interactive and drill-down options to the last level.

Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph adds tremendous value and helps define the roadmap to the modern data architecture. It deep dives into all the existing flows, like Autosys jobs, applications, ETL scripts, BTEQ/Shell (KSH) scripts, procedures, input and output tables, and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can proactively leverage integrated analysis to mitigate the risks associated with migration and avoid business disruption.

Now, let’s see how you can efficiently manage lineage.

To view the required lineage:

Select either the Process or Data tab.
Enter the keywords in the Search Keywords field.

Click the Search icon to generate the lineage.

Process lineage illustrates the dependencies between two or more processes such as files, jobs, workflows, etc., whereas data lineage depicts data flow between two or more data-holding components such as entities, flat files, etc.

In addition, the filter search icon allows you to include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage. By default, the Dependency Direction is Left to Right Hierarchy. You can also choose Right to Left Hierarchy or Bidirectional dependency directions as required. Moreover, you can also increase the Hierarchy Levels to nth level.

Lineage facilitates you visualize how your selected nodes are connected and depend on each other. The nodes and their connecting edges (relationships) help you to understand the overall structure and dependencies.

Nodes	Edges
Tables	Call
File	Read
Job	Execute
Autosys Box	Write
Workflows	OTHERS
Flat File
DB Type

Manage Lineage

This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: either using Complete Lineage report or Lineage Template.

Using Complete Lineage report

Follow the below steps to modify the lineage:

Click the Manage Graph icon.

Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.

Once the complete lineage report is downloaded, you can make necessary updates such as updating, deleting or adding the nodes and its relationships.
After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
Click Apply to incorporate the updates into the dependency structure.
Generate the required process or data lineage.

Using Lineage Template

Follow the below steps to add new nodes and their relationships to the current lineage report:

Click the Manage Graph icon.

Click Download Lineage Template.

Once the lineage template is downloaded, you can add new nodes and relationships in the template.
After making the required changes, upload the template in Upload to Modify Lineage.
Click Apply to incorporate the updates into the complete dependency structure.
Generate the required process or data lineage.

Important

To effectively manage lineage, you must adhere to the following rules:

Do not modify the column headers or their order in the Complete Lineage report or the Lineage template. The following are the column headers: source_name, source_type, target_name, target_type, relation_type, and database_id.
When deleting a row, retain the value in the database_id column and clear all the other column values.
When inserting a row, add all the column values except in the database_id column.
When updating a row, ensure that all the columns have relevant data.

You can also apply:

Feature	Icon	Use
Filter		Used to filter the lineage.
Reload		Assists in reloading graphs.
Save		Used to save the lineage.
Download		Used to download the file.
Expand		Used to enlarge the screen.

Downloadable Reports

Downloadable reports allow you to export detailed assessment reports of your source data which enables you to gain in-depth insights with ease. To access these assessment reports, click Reports.

Types of Reports

In the Reports section, you can see various types of reports such as Insights and Recommendations, Source Inventory Analysis, and Lineage reports. Each report type offers detailed information allowing you to explore your assessment results.

Insights and Recommendations

This report provides an in-depth insight into the source input files. It contains the final output including the details of queries, complexity, workflows, mappings, and so on.

Here, you can see the informatica folder along with Informatica Assessment.xlsx and Lineage Dependency Report.xlsx.

Informatica Assessment.xlsx: This report provides insights about the source inventory. It helps you plan the next frontier of a modern data platform methodically. It includes a report summary, aggregated inventory, transformations, workflows, mappings, and more.

This report contains the following information:

Report Summary: Provides information about all the generated artifacts.
Volumetric Info: Lists an aggregated inventory of every source Informatica file. For instance, it provides information about the total number of files, transformations, workflows, mappings, and a lot more.
Effective Workflow Mapping: Eliminates any duplicate mappings coming across different workflows. Refer to Effective Mapping column for more details. Accordingly, it provides a suggestion for the complexity (Proposed Complexity column in Workflow Summary sheet) of the duplicate workflow. The matched conditions include:
- Unique: The workflow mappings are distinct from other workflow mappings.
- Partially Matched: When certain mappings are common in some other workflows as well.
- Matched: All the mappings of the two workflows are identical.

The following are the results of effective mappings based on matched conditions.

Is Matched Condition	Effective Mapping of 1st workflow	Effective Mapping of 2nd workflow
Matched	Displays all workflow mappings	None
Partially Matched	Displays all workflow mappings	Displays the remaining workflow mappings which are not present in the 1st workflow
Unique	Displays all workflow mappings	Displays all workflow mappings

Transformation Summary: Lists all the input source files along with the count of workflows, worklets, sessions, mappings, mapplets, and so forth available in the source files.
Workflow Summary: Lists all the workflows existing in the Informatica files. It also provides statistical information about the worklets, sessions, mappings, mapplets, expressions, transformation components such as sequences, filters, joiners, and so on.
Mapping Summary: Describes the flow of data between source and target. It lists all the mappings and the count of associated transformations such as expression input columns, expression output columns, aggregators, filters, joiners, and so on.
DB Summary: Provides information about the files available in the databases. It displays the count of workflows, mappings, source files, target files, and so on.
Workflow DB Summary: Provides workflow-level details. It includes the count of mappings, source files, target files, and so on.
Transformation Patterns: Provides details about transformations with similar patterns. It also includes information about transformation types, patterns, and occurrences.
Mapping Patterns: Provides details about mappings with similar patterns along with the number of occurrences.
Mapplet Patterns: Provides details about mapplets with similar patterns along with the number of occurrences.
Workflow Patterns: Provides details about workflows with similar patterns along with the number of occurrences.
Worklet Patterns: Provides details about worklets with similar patterns along with the number of occurrences.

Lineage Dependency Report.xlsx: This report contains information about views and workflow level lineages. It includes information about used and impacted tables, views, files, direct dependencies, dependency hierarchy and more.

This report contains the following information:

view_report: Provides information about the views.
script_report: Provides information about workflow level lineage.

Detailed Informatica Assessment Reports

To access a detailed assessment report, open the informatica folder. Here, you can see CSV, pattern, and query folders along with data lineage and missing artifacts reports.

Browse through the CSV folder to access the Updated Complexity.csv file.

Updated Complexity.csv: This report provides information about the updated complexity of Informatica scripts. During the configuration of the Informatica Assessment, you can redefine the Complexity Distribution field to transfer the complexity of ETL scripts from a higher level to a lower level. This updated complexity is reflected in this report.

Browse through the pattern folder to get information about similar patterns including transformations, mappings, mapplets, workflows, and worklets.

Mapping: In this folder you can see the CSV file which contains details about mappings with similar patterns.

Transformation: In the Transformation folder, you can see CSV files for various transformation types such as aggregators, sorters, and source qualifiers.

Each CSV file provides information about similar patterns within its respective transformation type.

Workflow: In this folder you can see the CSV file which contains details about workflows with similar patterns.

Browse through the query folder to access Entity.csv, Query.csv, and Statement Type Summary.csv reports.

Entity.csv: This report provides information about entities.

Query.csv: This report provides information about queries including its type, complexity, and more.

Statement Type Summary.csv: Provides information about queries segregated by statement types such as SELECT, UPDATE, INSERT, DELETE, etc.

informatica_dataLineage.csv: This report provides information about entity or file level lineage based on write operation.

Missing artifacts: This report provides information about missing artifacts.

Source Inventory Analysis

It is an intermediate report which helps to debug failures or calculate the final report. It includes all the generated CSV reports, query extraction output, procedure details, unparsed files, detailed analysis report, invalid queries, and more.

Browse through the CSV folder to access reports such as Entity.csv, External Executable File.csv, Macro.csv, Scheduler.csv, Source.csv, Target.csv, and more.

Entity.csv: This report provides information about entities. It includes information about table types, databases, connection types, and more.

External Executable File.csv: This report provides information about external executable files including workflows, executable file paths, commands, and more.

Macro.csv: This report provides information about macros in the source files.

Scheduler.csv: This report provides information about schedulers, its type, reusability, and workflow associated with the scheduler. It also provides other scheduling-related information like start date, end date, repeat type, repeat interval, occurrence, filters, etc.

Source.csv: This report provides information about source database type and folder name.

Target.csv: This report provides information about target database type and folder name.

assessment_unparsed_etl_files.csv: This report lists all the unparsed Informatica files along with its type and the reason for parsing failure.

external_procedure_detail.csv: This report provides information about the external procedures.

Informatica Detailed Analysis.xlsx: This report provides a detailed analysis report including workflows, mappings, connections, and more.

This report contains the following information:

Report Summary: Provides information about all the generated artifacts.
Workflows-L1: Set of instructions to perform various actions. It lists all the workflows along with information about the start and end nodes of the workflows, component type, mapping file, workflow command, and so on.
Workflows-Override: Workflows can modify some elements within the transformation during execution. It lists all the overridden or altered workflows. It also provides information about the override component type, override component name, mapping files, and a lot more.
Mappings: Describes the flow of data between source and target. Lists all the mapping files along with the component name, component type, and so on.
Workflow Connections: Lists all workflows along with connection details. It also includes information about the session, mapping files, component type, connection variables, connection values, and so on.
Worklets: Provides information about worklets including the associated workflows and their count.
Connections: Interface to connect nodes. It lists mapping files, session, source connection subtype, connection variables, and so on.

invalid_query.csv: This report lists all the invalid queries.

Lineage Report

This report provides complete dependency details for all nodes. It provides an end-to-end data and process lineage that helps to identify the complete dependency structure and the data flow.

This report contains the following information:

Dependency (Process): Provides information about the process lineage.
Dependency (Data): Provides information about the data lineage.
Nodes: Lists all the source and target nodes along with its type.
Volumetric Info (Summary): Provides volumetric information about the artifact types such as workflows, input tables, output tables, and schedulers.