SSIS Assessment Report
This topic contains information about the SSIS assessment report. The input format for SSIS assessment is DTSX, which is an XML-based file format. This file contains instructions for processing data flow from the source to the target.
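Because DTSX is XML, a package can be inspected with any XML parser. The following is a minimal sketch, not the assessment tool's own logic: it counts the executables in a package by type using Python's standard library. The sample package, its object names, and the `count_executables` helper are hypothetical, and the `ExecutableType` values are illustrative, since the exact strings vary across SSIS versions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# XML namespace used by DTSX package files.
DTS_NS = "www.microsoft.com/SqlServer/Dts"

# Hypothetical, heavily trimmed package used only for illustration.
SAMPLE_DTSX = """<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts"
    DTS:ExecutableType="Microsoft.Package" DTS:ObjectName="LoadOrders">
  <DTS:Executables>
    <DTS:Executable DTS:ExecutableType="Microsoft.ExecuteSQLTask"
        DTS:ObjectName="Truncate staging" />
    <DTS:Executable DTS:ExecutableType="Microsoft.Pipeline"
        DTS:ObjectName="Copy orders" />
  </DTS:Executables>
</DTS:Executable>
"""


def count_executables(dtsx_text):
    """Return a Counter of ExecutableType values found in a DTSX document."""
    root = ET.fromstring(dtsx_text)
    tag = f"{{{DTS_NS}}}Executable"
    attr = f"{{{DTS_NS}}}ExecutableType"
    # iter() also visits the root element, which represents the package itself.
    return Counter(el.get(attr, "unknown") for el in root.iter(tag))


counts = count_executables(SAMPLE_DTSX)
for exec_type, n in sorted(counts.items()):
    print(exec_type, n)
```

The root element is counted as well, so the tally for this sample includes the package in addition to its two tasks.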
Highlights
The highlights section gives you a high-level overview of your assessment based on the analysis of the selected workloads. It includes a graphical depiction of package complexity as well as a summary of the packages, queries, and entities used.
Summary
This section summarizes the input SSIS files, including the number of packages, executable processes, stored procedures, and so on. Each file in the SSIS assessment is called a package. For example, a report showing a total of eight packages means that eight input files were provided for the assessment.
- Total Packages: Total number of input files.
- Executable Processes: Displays the number of executable processes. An executable process runs external files, such as shell or Python scripts, from an SSIS package.
- Components: Data flow elements in SSIS packages. SSIS provides three types of components for data flow:
  - Sources: Extract the data from source data stores.
  - Transformations: Modify the data.
  - Destinations: Load the processed data back into data stores or store it in memory.
- Connections: Provides the number of connections used to implement integration services to move data from one place to another.
- Stored Procedures: Displays the number of stored procedures, which are predefined blocks of SQL code that can be reused.
- Executables: Displays the number of executables, which make up the control flow in SSIS packages. SSIS provides three types of control flow elements:
  - Containers: Provide a sequence or structure to run tasks.
  - Tasks: Functional work segments of control flow elements.
  - Precedence constraints: Specify the order of operations as well as the relationships between containers and tasks.
- Sub-Packages: Displays the total number of packages within another package.
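To illustrate how precedence constraints determine the order in which tasks run, the sketch below treats each constraint as a "runs before" pair and derives one valid execution order with a topological sort. The task names and the `execution_order` helper are hypothetical. Real precedence constraints also carry an evaluation condition (success, failure, or completion) and optional expressions, which this sketch ignores.

```python
from graphlib import TopologicalSorter

# Hypothetical precedence constraints as (from_task, to_task) pairs.
constraints = [
    ("Truncate staging", "Copy orders"),
    ("Copy orders", "Update audit log"),
    ("Copy orders", "Send notification"),
]


def execution_order(pairs):
    """Return one task order that satisfies every (before, after) pair."""
    sorter = TopologicalSorter()
    for before, after in pairs:
        # add(node, *predecessors): `after` may only run once `before` is done.
        sorter.add(after, before)
    return list(sorter.static_order())


order = execution_order(constraints)
print(order)
```

Tasks that do not depend on each other, such as the last two in this sample, may appear in either order.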
Complexity
This section provides a graphical representation of the SSIS files based on their complexity. This information supports decisions such as budget estimation.
Queries
This section displays a synopsis of the queries, with information about Execute SQL tasks, variables, and so on.
- Total Queries: Total number of queries in the input files.
- Execute SQL Tasks: Provides the number of queries in the Execute SQL tasks. An Execute SQL task is an SSIS task that runs SQL statements or stored procedures, for example to retrieve result sets.
- SQL Commands: Provides the number of SQL queries in the SQL Commands component. It is used to interact with the database.
- Commands Params: Provides the number of SQL queries in the Command Params component.
- Variable Tasks: Provides the number of SQL queries in the Variable Task component.
Entities
This section provides a synopsis of the analyzed entities including the total number of entities and missing tables in the input packages.
Analysis
This topic provides a detailed examination of packages, queries, and entities.
Source Information
This section provides a comprehensive report of the source files, with information about the total number of packages, executables, transformations, connections, and so on.
- Package Name: Name of the package.
- Executables: Displays the number of executables in each package. Executables make up the control flow in SSIS packages. SSIS provides three types of control flow elements:
  - Containers: Provide a sequence or structure to run tasks.
  - Tasks: Functional work segments of control flow elements.
  - Precedence constraints: Specify the order of operations as well as the relationships between containers and tasks.
- Transformations: Displays the number of data flow components that are used to perform various actions such as sorting, merging, data cleansing, and so on.
- Connections: Provides information about the connections, such as OLE DB and flat file connections, that are used to connect to the database.
- Stored Procedures: Displays the number of stored procedures, which are predefined blocks of SQL code that can be reused.
- Queries: Provides details of queries in the package.
- Complexity: Provides complexity of each package.
- Pipeline: A type of component that transforms data as it moves from source to target.
- Executable Processes: Displays the number of executable processes. An executable process runs external files, such as shell or Python scripts, from an SSIS package.
- Sub-Packages: Displays the total number of packages within another package.
- Dependent Packages: Displays the number of packages that depend on other packages.
- Components: Data flow elements in SSIS packages. SSIS provides three types of components for data flow:
  - Sources: Extract the data from source data stores.
  - Transformations: Modify the data.
  - Destinations: Load the processed data back into data stores or store it in memory.
- Containers: Provides a sequence or structure to run tasks.
Queries
This page displays a detailed analysis of the queries used in the input packages, including the package name and the query type of each query.
Entities
This section displays a detailed analysis of entities along with the missing table details.
- Table Name: Displays the name of the table.
- Package Name: Displays the name of the package.
- Data Volume: Displays the quantity of data.
- Materialized View: Materialized views store query results in physical tables. They can be configured to update automatically when the source tables change, or they can be updated by running a command.
- Frequency of Use: Displays how frequently the table is used.
- Primary Key: Uniquely identifies each row in the table. It does not accept NULL values.
- Foreign Key: Links two tables.
- Unique Key: Uniquely identifies records in a table.
- No. of Create: Displays the number of CREATE queries executed on the table.
- No. of Delete: Displays the number of DELETE queries executed on the table.
- No. of Insert: Displays the number of INSERT queries executed on the table.
- No. of Read: Displays the number of READ queries executed on the table.
- No. of Update: Displays the number of UPDATE queries executed on the table.
- No. of Rows: Displays the number of rows in the table.
- Primary Index: Indexes based on the primary key.
- Unique Primary Index: Allows only unique values in the column of the table.
- Index: Indexes are lookup tables that help to quickly retrieve data from the database.
- Queries Executed: Number of queries that are executed.
- Transactional: Displays the number of data flow components that are used to perform sorting, merging, data cleansing and so on.
- Source Database: Displays the name of the source database.
Lineage
End-to-end data and process lineage identify the complete dependency structure through interactive and drill-down options to the last level.
Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph helps define the roadmap to the modern data architecture. It dives deep into all the existing flows, such as Autosys jobs, applications, ETL scripts, BTEQ/Shell (KSH) scripts, procedures, and input and output tables, and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can use this integrated analysis to mitigate the risks associated with migration and avoid business disruption.
Now, let’s see how you can efficiently manage lineage.
To view the required lineage:
- Select either the Process or Data tab.
- Enter the keywords in the Search Keywords field.
- Click the Search icon to generate the lineage.
Process lineage illustrates the dependencies between two or more processes such as files, jobs, executables, etc., whereas data lineage depicts data flow between two or more data-holding components such as entities, flat files, etc.
In addition, the filter search icon allows you to include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage. By default, the Dependency Direction is Left to Right Hierarchy, and you can switch to Right to Left Hierarchy as required. You can also increase the Hierarchy Levels to the nth level.
Lineage helps you visualize how your selected nodes are connected and depend on each other. The nodes and their connecting edges (relationships) help you understand the overall structure and dependencies.
Nodes | Edges |
SSIS Executable | Call |
Tables | Read |
File | Execute |
Job | Write |
Autosys Box | OTHER |
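The effect of the Dependency Direction and Hierarchy Levels options can be sketched as a breadth-first walk over nodes and edges. This is an illustrative sketch only, not the product's implementation. The node names, the edge list, the direction in which each edge is recorded, and the `lineage` helper are all hypothetical.

```python
from collections import deque

# Hypothetical edges: (source node, relationship, target node).
EDGES = [
    ("nightly_job", "Execute", "LoadOrders.dtsx"),
    ("LoadOrders.dtsx", "Read", "stg_orders"),
    ("LoadOrders.dtsx", "Write", "dw_orders"),
    ("LoadOrders.dtsx", "Call", "AuditPackage.dtsx"),
    ("AuditPackage.dtsx", "Write", "audit_log"),
]


def lineage(start, max_levels, left_to_right=True):
    """Breadth-first walk from `start`, following at most `max_levels` hops."""
    adjacency = {}
    for src, rel, dst in EDGES:
        if left_to_right:
            adjacency.setdefault(src, []).append((rel, dst))
        else:
            # Right to left: follow each edge backwards.
            adjacency.setdefault(dst, []).append((rel, src))
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, level = queue.popleft()
        if level == max_levels:
            continue  # Do not expand beyond the requested hierarchy level.
        for rel, nxt in adjacency.get(node, []):
            found.append((level + 1, node, rel, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, level + 1))
    return found


for row in lineage("nightly_job", max_levels=2):
    print(row)
```

Calling `lineage("audit_log", 1, left_to_right=False)` walks the same edges in the opposite direction, which corresponds to asking what a node depends on rather than what depends on it.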
Manage Lineage
This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: using either the Complete Lineage report or the Lineage Template.
Using Complete Lineage report
Follow the steps below to modify the lineage:
- Click the Manage Graph icon.
- Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.
- Once the complete lineage report is downloaded, make the necessary changes, such as updating, deleting, or adding nodes and their relationships.
- After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the dependency structure.
- Generate the required process or data lineage.
Using Lineage Template
Follow the steps below to add new nodes and their relationships to the current lineage report:
- Click the Manage Graph icon.
- Click Download Lineage Template.
- Once the lineage template is downloaded, you can add new nodes and relationships in the template.
- After making the required changes, upload the template in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the complete dependency structure.
- Generate the required process or data lineage.
You can also use the following features:
Feature | Icon | Use |
Filter | | Used to filter the lineage. |
Reload | | Used to reload the graph. |
Save | | Used to save the lineage. |
Download | | Used to download the file. |
Expand | | Used to enlarge the screen. |
Downloadable Reports
Downloadable reports allow you to export detailed assessment reports of your source data, so that you can gain in-depth insights with ease. To access these assessment reports, click Reports.
Types of Reports
In the Reports section, you can see various types of reports such as Insights and Recommendations, Source Inventory Analysis, and Lineage reports. Each report type offers detailed information allowing you to explore your assessment results.
Insights and Recommendations
This report provides an in-depth insight into the source input files. It contains the final output including the details of queries, complexity, packages, and so on.
Here, you can see the ssis folder and the Lineage Dependency Report.xlsx.
Lineage Dependency Report.xlsx: This report contains information about views and package level lineages. It includes information about used and impacted tables, views, files, direct dependencies, dependency hierarchy and more.
This report contains the following information:
- view_report: Provides information about the views.
- script_report: Provides information about package level lineage.
Detailed SSIS Assessment Reports
To access a detailed assessment report, open the ssis folder.
SSIS Report.xlsx: This report provides insights about the source inventory. It helps you plan the next frontier of a modern data platform methodically. It includes information about files, sub-packages, precedence constraints, and more.
This report contains the following information:
- Report Summary: Provides information about all the generated artifacts.
- Volumetric Info: Presents a summary of the aggregated inventory after analyzing the source files. For instance, it provides the total number of packages, executables, transformations, sub-packages, and so on. It also provides package-level complexity.
- File Summary: Lists all the SSIS input files. It contains the total count of executables, pipelines, components, connections, sub-packages, and so on for every SSIS package.
- Sub-packages: Lists all the SSIS packages and their associated sub-packages, that is, the packages within another package.
- Precedence Constraints: Lists the precedence constraints, each of which defines the relationship between two containers or tasks.
- Procedure: Lists all the procedures.
- Query: Lists all the SQL queries along with the information about executable, executable type, component type, and so on.
- Transformation: Provides information about transformations.
- Connection: Lists all the connections used to move data from one place to another, along with the connection manager reference ID.
- Entity: Lists all the entities. It also includes information about the entity type, database name, statement type and so on.
- Parsed Query: Lists all the queries that are parsed.
- Unparsed Query: Lists all the queries that are not parsed.
- Event Handler: Helps to manage and respond to events that arise at run time. This sheet provides information about the event handlers along with the executables and types.
- Script Task: Helps to add new functionality that expands the capabilities of an SSIS package, using C# or Visual Basic. This sheet provides information about the executables, language, project items, and so on.
- Execute Process: Executes batch files or command files. This sheet provides information about executables, execute process data, arguments, and so on.
- Executables: Lists all the executables, which make up the control flow in SSIS packages, along with their type.
- Connection Manager: Provides information about the connection manager. It includes the creation name and connection string.
- Package Parameter: Lists all the SSIS packages along with package parameter details.
Browse through the query folder to access the Queries.csv file.
Queries.csv: This report provides information about queries including the used and impacted tables, analyzed status, complexity, and more. If the analyzed status is Analyzed, it indicates that the query is analyzed successfully. Conversely, a Not Analyzed status indicates that the query is not analyzed.
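As a rough illustration of how the analyzed status can be tallied once the file is downloaded, the sketch below counts queries per status with Python's standard `csv` module. The column names (`query_id`, `analyzed_status`, `complexity`), the sample rows, and the `status_counts` helper are assumptions made for this example. Check the header row of the actual Queries.csv and adjust the column name accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt; the real Queries.csv columns and values may differ.
SAMPLE = """query_id,analyzed_status,complexity
q1,Analyzed,Simple
q2,Not Analyzed,
q3,Analyzed,Complex
"""


def status_counts(csv_text, status_column="analyzed_status"):
    """Count rows per value of the (assumed) analyzed-status column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[status_column] for row in reader)


counts = status_counts(SAMPLE)
print(counts["Analyzed"], counts["Not Analyzed"])
```

For a real file, the same function can be given the file contents read with `open(path, newline="").read()`.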
Source Inventory Analysis
This is an intermediate report that helps to debug failures or calculate the final report. It includes all the generated CSV reports.
external_procedures_details.CSV: Provides information about external procedure files. It includes information about packages, procedures, and their availability.
invalid_query.csv: This report lists all the invalid queries.
Lineage Report
This report provides complete dependency details for all nodes. It provides an end-to-end data and process lineage that helps to identify the complete dependency structure and the data flow.
This report contains the following information:
- Dependency (Process): Provides information about the process lineage.
- Dependency (Data): Provides information about the data lineage.
- Nodes: Lists all the source and target nodes along with their types.
- Volumetric Info (Summary): Provides volumetric information about the artifact types such as input tables, output tables, and schedulers.