Matillion Assessment Report
This topic describes the Matillion assessment report. The assessment analyzes workloads and produces in-depth insights that help you plan the migration. The Matillion assessment accepts input files in JSON format.
Highlights
The Highlights section gives a high-level summary of the analytics performed on the selected workloads. It includes information about jobs and query database types.
Summary
This section summarizes the input source scripts and the associated workload inventory. It includes information about files, jobs, components, and so on.
Jobs
This section provides an overview of jobs and components along with a detailed complexity assessment of transformation and orchestration jobs.
- Total Jobs: Displays the total number of jobs.
- Orchestration Jobs: Orchestration jobs deal with managing and executing other orchestration and transformation jobs.
- Transformation Jobs: Transformation jobs are used for data processing and transformation. This includes cleaning data, filtering data, transforming data, and more.
- Total Components: Displays the total number of components. Components are the building blocks of a workflow to perform a task or job.
- Orchestration Components: Orchestration components are building blocks that are used to perform orchestration jobs.
- Transformation Components: Transformation components are building blocks that are used to perform transformation jobs.
Query Database Types
This section provides an overview of different query database types within the entire inventory.
Analysis
This section provides a detailed examination of source files and jobs.
Files
This section provides a comprehensive report of the source files along with information about the databases, orchestration jobs, transformation jobs, and so on.
- File Name: Displays the name of the file.
- Database: Displays the number of associated databases.
- Orchestration Jobs: Displays the number of orchestration jobs. Orchestration jobs deal with managing and executing other orchestration and transformation jobs.
- Transformation Jobs: Displays the number of transformation jobs. Transformation jobs are used for data processing and transformation. This includes cleaning data, filtering data, transforming data, and more.
- External Jobs: Displays the number of external jobs. External jobs are jobs that are called but are not present in the input files, because they reside in external files.
- Queries: Displays the number of queries.
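The External Jobs count can be pictured as a set difference: the jobs that are called, minus the jobs that the input files actually contain. The following is a minimal sketch of that idea, not the assessment's own logic. The job names and the `find_external_jobs` helper are hypothetical and exist only for illustration.

```python
def find_external_jobs(defined_jobs, called_jobs):
    """Return the called job names that no input file defines (sorted for stable output)."""
    return sorted(set(called_jobs) - set(defined_jobs))


# Hypothetical inventory: job names found in the uploaded input files.
defined = {"load_orders", "transform_orders", "nightly_master"}

# Hypothetical job calls collected from the orchestration jobs in those files.
called = ["load_orders", "transform_orders", "archive_orders"]

external = find_external_jobs(defined, called)
print(external)  # ['archive_orders']
```

Here `archive_orders` is called but never defined in the input, so it would be counted as an external job under this reading of the definition.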
Jobs
This section provides detailed information about orchestration and transformation jobs. It includes details about components, queries, connections, job complexity, and more.
Orchestration Jobs
This section provides information about orchestration jobs including file names, components, queries, job complexity, and more. Orchestration jobs deal with managing and executing jobs.
- Job Name: Displays the name of the job.
- File Name: Displays the associated file.
- Components: Displays the number of orchestration components. Orchestration components are the building blocks of orchestration jobs.
- Queries: Displays the number of queries.
- Orchestration Jobs: Displays the number of orchestration jobs that are associated with the orchestration job.
- Transformations: Displays the number of transformation jobs that are associated with the orchestration job.
- Jobs: Displays the number of jobs.
- Complexity: Displays the job complexity.
Transformation Jobs
This section provides information about transformation jobs including file names, components, and queries. Transformation jobs are used for data processing and transformation. This includes cleaning data, filtering data, transforming data, and more.
- Job Name: Displays the name of the job.
- File Name: Displays the associated file.
- Components: Displays the number of transformation components. Transformation components are the building blocks of transformation jobs.
- Queries: Displays the number of queries.
Lineage
End-to-end data and process lineage identifies the complete dependency structure, with interactive drill-down options to the last level.
Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph adds tremendous value and helps define the roadmap to the modern data architecture. It deep dives into all the existing flows, like Autosys jobs, applications, ETL scripts, BTEQ/Shell (KSH) scripts, procedures, input and output tables, and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can proactively leverage integrated analysis to mitigate the risks associated with migration and avoid business disruption.
Now, let’s see how you can efficiently manage lineage.
To view the required lineage:
- Select either the Process or Data tab.
- Enter the keywords in the Search Keywords field.
- Click the Search icon to generate the lineage.
Process lineage illustrates the dependencies between two or more processes, such as files, jobs, and entities, whereas data lineage depicts the data flow between two or more data-holding components, such as entities and flat files.
In addition, the filter search icon lets you include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage: the default Dependency Direction is Left to Right Hierarchy, and you can switch to Right to Left Hierarchy as required. You can also increase the Hierarchy Levels up to the nth level.
Lineage helps you visualize how your selected nodes are connected and depend on each other. The nodes and their connecting edges (relationships) help you understand the overall structure and dependencies.
| Nodes | Edges |
| Tables | Call |
| File | Read |
| Job | Execute |
| Bridge Table | Write |
|  | OTHERS |
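The Dependency Direction and Hierarchy Levels options can be thought of as a bounded walk over a graph of nodes and typed edges. The sketch below illustrates that idea only and is not the product's implementation. The edge list, the node names, and the `lineage` function are hypothetical; only the edge types (Call, Read, Execute, Write) are taken from the legend above. It also assumes that each edge is stored in the direction of flow, so a Read edge runs from the table to the job that reads it.

```python
from collections import deque

# Hypothetical edges as (source, edge type, target), oriented in the direction of flow.
EDGES = [
    ("nightly_master", "Execute", "load_orders"),
    ("stg_orders", "Read", "load_orders"),
    ("load_orders", "Write", "orders"),
    ("nightly_master", "Call", "transform_orders"),
    ("orders", "Read", "transform_orders"),
    ("transform_orders", "Write", "orders_summary"),
]


def lineage(start, edges, max_levels, left_to_right=True):
    """Breadth-first walk from `start`, expanding at most `max_levels` levels.

    left_to_right=True follows edges from source to target;
    left_to_right=False follows them from target back to source.
    Returns the edges that were traversed, in visiting order.
    """
    step = {}
    for edge in edges:
        source, _edge_type, target = edge
        here, there = (source, target) if left_to_right else (target, source)
        step.setdefault(here, []).append((there, edge))

    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, level = queue.popleft()
        if level == max_levels:
            continue  # hierarchy limit reached, so this node is not expanded
        for neighbour, edge in step.get(node, []):
            found.append(edge)
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, level + 1))
    return found


# One level downstream of the master job.
for edge in lineage("nightly_master", EDGES, max_levels=1):
    print(edge)
```

Calling `lineage("orders_summary", EDGES, max_levels=2, left_to_right=False)` walks the same graph in the opposite direction, which is the sketch's counterpart of switching to Right to Left Hierarchy.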
Manage Lineage
This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: using either the Complete Lineage report or the Lineage Template.
Using Complete Lineage report
Follow the steps below to modify the lineage:
- Click the Manage Graph icon.
- Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.
- Once the complete lineage report is downloaded, make the necessary changes, such as updating, deleting, or adding nodes and their relationships.
- After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the dependency structure.
- Generate the required process or data lineage.
Using Lineage Template
Follow the steps below to add new nodes and their relationships to the current lineage report:
- Click the Manage Graph icon.
- Click Download Lineage Template.
- Once the lineage template is downloaded, you can add new nodes and relationships in the template.
- After making the required changes, upload the template in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the complete dependency structure.
- Generate the required process or data lineage.
Downloadable Reports
Downloadable reports let you export detailed Matillion assessment reports of your source data so that you can gain in-depth insights with ease. To access these assessment reports, click Reports.
Types of Reports
In the Reports section, you can see various reports such as Insights and Recommendations, Source Inventory Analysis, and Lineage Report. Each report type offers detailed information that lets you explore your assessment results.
Insights and Recommendations
This report provides an in-depth insight into the source input files. It contains the final output including the details of queries, complexity, jobs, components, and so on.
Job-Component-Name-Detail.csv: This report provides information about jobs and components along with their types.
Job-Flow.xlsx: This report provides information about jobs.
Matillion_Assessment.xlsx: This report provides information about file and job level assessment.
This report contains the following information:
- Report Summary: Provides information about all the generated artifacts.
- Volumetric Info: Presents a summary of the aggregated inventory after analyzing the source files. For instance, it provides volumetric information about the total number of files, orchestration jobs, transformation jobs, components, job complexity, and so on.
- File Level Assessment: Lists all the input files along with statistical information for orchestration jobs, transformation jobs, orchestration components, external jobs, and so on.
- Job Level Assessment: Lists all the jobs along with information for orchestration jobs, transformation jobs, components, external jobs, job complexity, and so on.
MissingArtifacts.xlsx: This report provides information about missing jobs.
queries_detail.csv: This report provides information about queries including the used and impacted tables, analyzed status, complexity, and more. An analyzed status of TRUE indicates that the query was analyzed successfully, whereas a FAILED status indicates that the query was not analyzed.
ScriptSummary_excel.xlsx: This report provides comprehensive information about scripts.
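Because queries_detail.csv is a plain CSV file, the analyzed status can be tallied with a few lines of Python. The following is a minimal sketch under stated assumptions: the column names (`query_id`, `analyzed_status`, `complexity`) and the sample rows are hypothetical, so check the header row of your own export before reusing it. The `summarize_status` helper is also illustrative and not part of the product.

```python
import csv
import io
from collections import Counter


def summarize_status(csv_file, status_column):
    """Count rows per value of `status_column` in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row[status_column].strip().upper() for row in reader)


# Hypothetical sample standing in for queries_detail.csv; real header names may differ.
sample = io.StringIO(
    "query_id,analyzed_status,complexity\n"
    "q1,TRUE,SIMPLE\n"
    "q2,FAILED,COMPLEX\n"
    "q3,TRUE,MEDIUM\n"
)

counts = summarize_status(sample, "analyzed_status")
print(counts["TRUE"], counts["FAILED"])  # 2 1
```

With a downloaded report, the same function could be given a file opened with `open("queries_detail.csv", newline="")` and the actual name of the status column.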
Source Inventory Analysis
This is an intermediate report that helps you debug failures or calculate the final report. It includes all the generated CSV reports such as InputOutput-Components.csv, Job-Component-Name-Detail.csv, missing_artifacts.csv, and more.
InputOutput-Components.csv: This report provides information about input and output components.
Job-Component-Name-Detail.csv: This report provides information about jobs and components along with their types.
missing_artifacts.csv: This report provides information about missing artifacts.
Lineage Report
This report provides complete dependency details for all nodes. It provides an end-to-end data and process lineage that helps to identify the complete dependency structure and the data flow.
This report contains the following information:
- Dependency (Process): Provides information about the process lineage.
- Dependency (Data): Provides information about the data lineage.
- Nodes: Lists all the source and target nodes along with their types.
- Volumetric Info (Summary): Provides volumetric information about the artifact types such as input tables, output tables, and schedulers.