DataStage Conversion Report
This topic provides a comprehensive report on DataStage ETL conversion. The ETL conversion pipeline converts legacy DataStage workloads to modern cloud platforms.
This section provides an overview of the ETL conversion, which includes:
- Scripts Transformed: Displays the number of scripts and jobs that converted successfully or failed to convert.
- Status: Status of the ETL conversion stage.
- Automatic Conversion Rate: Displays the percentage of total units or components that are auto-converted.
- Auto-Validation (Query): Displays the percentage of queries that have undergone syntax validation successfully.
- User Feedback: Displays the automatic conversion percentage based on your feedback.
Scripts
This topic shows a comprehensive report of the converted artifacts. It includes the name of the graph, number of jobs, and file-wise transformation percentage.
Browse through each script to get more insights into the nodes. In each node, you can review the transformed data and provide feedback. If you find that a successfully converted stage is inaccurate, you can mark it as incorrect.
To do so, click Mark as Incorrect. You can also use the Add Comment field to describe the specific issues or discrepancies you observed. After providing your feedback, click (Recalculate) to recalculate the conversion automation based on your feedback. This ensures that the conversion results align more closely with the intended requirements or desired outcomes.
After clicking (Recalculate), you can see the updated conversion automation (User Feedback) based on your feedback in the Summary section.
This section also allows you to take a variety of actions, such as:
Feature | Icon | Description |
Download All | | To download all the graphs. You can also download a graph in an alternative way: click the preferred graph or node, click the icon, and then select Download Graph. |
Upload | | To upload a modified graph after making the required changes. You can also upload a graph in an alternative way: click the preferred graph or node, click the icon, and then select Change Graph. |
Regenerate | | To update and repackage the uploaded or changed graph. If the artifacts are updated successfully, the system generates a snackbar popup to notify the success. |
Sort | | To sort the nodes. You can sort the nodes based on: All Nodes, Success Nodes, Failed Nodes. |
Package
This topic provides a detailed report of the converted artifacts, which contain Python files, Java files, and so on. LeapLogic provides target-compatible packaged code that is ready to be orchestrated and executed as production-ready jobs on target platforms such as AWS Glue and Databricks.
In the Package section, you can also see the Transformation_Report.xlsx file, which contains a comprehensive report of the transformation and the associated deductible license quota. License quota is deducted for units when converting DataStage workloads to target equivalents, where a unit refers to a component or stage.
The license deduction is determined by two criteria: file similarity and complexity.
File similarity – File similarity checks whether the source file is related or similar to an already executed file. A higher file similarity value indicates that you have already executed a similar file. The license quota deduction depends on how the file similarity value compares with a predefined threshold. If the value is below the threshold, the system deducts the license quota. If the value exceeds the threshold, no license quota is deducted, because the file is considered to have been processed before. For instance:
- If the file similarity value is 0% (below the threshold) and the threshold is 80%, then the system deducts the unit quota.
- If the file similarity value is 92% (above the threshold) and the threshold is 80%, then the system will not deduct unit quota.
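The threshold rule above can be sketched in Python as follows. This is a minimal illustration and not LeapLogic's implementation: the function name `should_deduct_quota` is hypothetical, the 80% default is taken from the examples above, and the behavior when the value exactly equals the threshold is not stated in this topic, so the sketch assumes no deduction in that case.

```python
def should_deduct_quota(file_similarity: float, threshold: float = 80.0) -> bool:
    """Return True when license quota would be deducted for a file.

    Both arguments are percentages (0-100). The 80% default mirrors the
    examples in this topic; the real threshold is predefined by the product.
    """
    # Assumption: a value exactly equal to the threshold is treated as
    # "already processed" (no deduction); the topic does not cover this case.
    return file_similarity < threshold


print(should_deduct_quota(0))   # True: below the threshold, quota is deducted
print(should_deduct_quota(92))  # False: above the threshold, no deduction
```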
Complexity – The second criterion for the license quota deduction is the complexity of each unit. Unit complexity is categorized as TRIVIAL, SIMPLE, MEDIUM, COMPLEX, V. COMPLEX, or V.V COMPLEX. The complexity weightage for the deductible unit quota is as follows:
- If the unit complexity is TRIVIAL or SIMPLE, the deductible unit quota is 1.
- If the unit complexity is MEDIUM, the deductible unit quota is 3.
- If the unit complexity is COMPLEX, V. COMPLEX, or V.V COMPLEX, the deductible unit quota is 5.
For instance, consider the example provided in the image below for the file TSTCASE2.xml. This file contains a total of 20 stages, all of which are successfully transformed. Among these successfully transformed units:
- 6 stages are categorized as simple complexity with a weightage of 1. Consequently, the deductible unit quota for these units is 6.
- 14 stages are categorized as medium complexity with a weightage of 3. Consequently, the deductible unit quota for these units is 42.
So, the total deductible unit quota for the TSTCASE2.xml file is 6 + 42 = 48.
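The weightage rules and the TSTCASE2.xml arithmetic can be reproduced with a short Python sketch. The names `COMPLEXITY_WEIGHT` and `deductible_unit_quota` are hypothetical and chosen for illustration only; the weights and stage counts come from this topic.

```python
# Weight per complexity category, as listed in this topic.
COMPLEXITY_WEIGHT = {
    "TRIVIAL": 1,
    "SIMPLE": 1,
    "MEDIUM": 3,
    "COMPLEX": 5,
    "V. COMPLEX": 5,
    "V.V COMPLEX": 5,
}


def deductible_unit_quota(stage_counts: dict) -> int:
    """Sum weight * count over the successfully transformed stages.

    stage_counts maps a complexity category to the number of
    successfully transformed stages in that category.
    """
    return sum(COMPLEXITY_WEIGHT[category] * count
               for category, count in stage_counts.items())


# TSTCASE2.xml: 6 SIMPLE stages and 14 MEDIUM stages, all transformed.
tstcase2 = {"SIMPLE": 6, "MEDIUM": 14}
print(deductible_unit_quota(tstcase2))  # 6*1 + 14*3 = 48
```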
The License Summary sheet (refer to the above image) in the Transformation_Report.xlsx file shows detailed information about the deductible license quota.
- File Name: Displays the name of the file.
- Job Name: Displays the name of the job.
- Job Type: Displays the type of the job such as parallel job or sequence job.
- Conversion Percentage: Displays the auto-conversion percentage of each job.
- File Similarity: Indicates whether the file is similar to an already executed file. The value is true if the file is similar to an already executed file; otherwise, it is false. If the file is similar, the Deducted Script Quota and Deducted Unit Quota are zero and no quota is deducted.
- Complexity: Displays the complexity of each script.
- Total Stages: Displays the number of total stages.
- Success Stages: Displays the number of successfully transformed stages.
- Deductible Units: Displays the unit quota that needs to be deducted based on the complexity of transformed stages.
- Deducted Unit Quota: Displays the actual unit quota deducted.
The Translation_Report sheet in the Transformation_Report.xlsx file provides a comprehensive report of the conversion. It includes information about jobs, stages, and the conversion status of stages. If the value of the conversion status is true, the stage was converted successfully to the target equivalent; otherwise, the value is false.
- File Name: Displays the name of the file.
- Job Name: Displays the name of the job.
- Stage Name: Displays the name of the stage.
- Stage Type: Displays the type of each stage.
- Conversion Status: Indicates the conversion status of each stage. If the value is SUCCESS, the stage was converted successfully; otherwise, the value is FAIL.
The Transformation_License_Report sheet provides a comprehensive report of the conversion along with the unit quota deduction.
- File Name: Displays the name of the file.
- Job Name: Displays the name of the job.
- Stage Name: Displays the name of the stage.
- Stage Type: Displays the type of each stage.
- Conversion Status: Indicates the conversion status of each stage. If the value is SUCCESS, the stage was converted successfully; otherwise, the value is FAIL.
- Complexity: Displays the complexity of each stage.
- Deductible Units: Displays the unit quota that needs to be deducted based on the complexity of the transformed stages.
- Deducted Unit Quota: Displays the actual unit quota deducted.
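To illustrate how the stage-level rows of this sheet relate to the job-level columns of the License Summary sheet (Total Stages, Success Stages, Deductible Units), the following Python sketch rolls stage rows up per job. It is only an illustration under stated assumptions: the sample rows, the function name `summarize_by_job`, and the rule that failed stages contribute no deductible units are hypothetical and not confirmed by this topic.

```python
from collections import defaultdict

# Hypothetical rows shaped like the Transformation_License_Report sheet.
rows = [
    {"Job Name": "job_a", "Stage Name": "s1", "Conversion Status": "SUCCESS", "Deductible Units": 1},
    {"Job Name": "job_a", "Stage Name": "s2", "Conversion Status": "SUCCESS", "Deductible Units": 3},
    {"Job Name": "job_a", "Stage Name": "s3", "Conversion Status": "FAIL", "Deductible Units": 0},
    {"Job Name": "job_b", "Stage Name": "s1", "Conversion Status": "SUCCESS", "Deductible Units": 5},
]


def summarize_by_job(stage_rows):
    """Aggregate stage rows into per-job totals."""
    summary = defaultdict(
        lambda: {"total_stages": 0, "success_stages": 0, "deductible_units": 0}
    )
    for row in stage_rows:
        job = summary[row["Job Name"]]
        job["total_stages"] += 1
        # Assumption: only successfully converted stages count toward the quota.
        if row["Conversion Status"] == "SUCCESS":
            job["success_stages"] += 1
            job["deductible_units"] += row["Deductible Units"]
    return dict(summary)


print(summarize_by_job(rows)["job_a"])
# {'total_stages': 3, 'success_stages': 2, 'deductible_units': 4}
```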
When you try to convert units that exceed the available license quota, the units within the quota convert successfully. Any additional units beyond the quota fail to transform, and the system displays an error message indicating ‘License quota exhausted.’
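The quota behavior described above can be sketched in Python as follows. This is a simplified model and not the product's actual logic: the function name `convert_within_quota` is hypothetical, and the sketch assumes that units are processed in order and that a unit converts whenever its deductible quota still fits in the remaining balance.

```python
def convert_within_quota(unit_costs, available_quota):
    """Split units into converted and failed based on the remaining quota.

    unit_costs: list of (unit_name, deductible_quota) pairs in processing order.
    Returns (converted_names, failed_with_message, remaining_quota).
    """
    converted, failed = [], []
    remaining = available_quota
    for name, cost in unit_costs:
        if cost <= remaining:
            # The unit fits in the remaining quota, so it is converted.
            remaining -= cost
            converted.append(name)
        else:
            # Assumption: the unit fails with the message quoted in this topic.
            failed.append((name, "License quota exhausted"))
    return converted, failed, remaining


units = [("stage_1", 1), ("stage_2", 3), ("stage_3", 5)]
converted, failed, remaining = convert_within_quota(units, available_quota=4)
print(converted)  # ['stage_1', 'stage_2']
print(failed)     # [('stage_3', 'License quota exhausted')]
```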
While configuring the DataStage to Databricks Lakehouse conversion stage, if the selected output type is DBT, the transformed artifacts report includes dbt_packages, models, TODO files, workflows, and so on. In the models folder, you can find SQL files containing transformed code that can be run directly on the DBT platform. You can edit the required queries in the output files through an online notebook-based code editor and then repackage them.