On-premises to Cloud Transformation Report
This topic provides a comprehensive summary of the transformed scripts and of all the statement types in the Transformation stage. The Transformation stage contains two sections: Transformation and Assessment Report.
Transformation
This section provides details of the logic and scripts that were transformed, or not transformed, into target-compatible scripts.
- Total Scripts: Displays the number of scripts provided as input to the Transformation stage, with a breakdown of how many scripts were successfully transformed and how many failed.
- Status: Provides status of the Transformation stage.
- Description: A brief description of the stage is displayed.
- Total Queries: Displays the total number of queries present in the scripts along with the number of parsed and unparsed queries.
- Successfully Transformed Queries: Displays the number of queries that have been successfully transformed.
- Auto-Conversion: Displays the auto-conversion percentage.
- Auto-Validation: Displays the percentage of queries that have undergone syntax validation successfully.
- Transformation Time: The time taken for transformation.
- Validation Time: The time taken for validation.
The Transformation section contains three sub-sections: Report, Package, and Dependency.
Report
This topic shows a comprehensive report of the converted artifacts. The left panel lists all the input files along with the file-wise transformation percentage. The right panel shows information about the selected file. It includes information about the original and transformed queries, status and more.
When transforming legacy EDW workloads to Databricks Lakehouse with the Unity Catalog enabled, constraints present in the source data, such as primary key, foreign key, and check constraints, are transformed to Databricks-native equivalents. As depicted in the image below, original queries are converted to Databricks-equivalent code along with the constraints.
If the Unity Catalog is disabled, the default Hive metastore is used. In such cases, legacy EDW workloads are converted to Databricks-equivalent code without any constraints.
You can also take a variety of actions, including:
| Feature | Icon | Description |
| --- | --- | --- |
| Query Recommendation | | Provides instant recommendations for improving query performance, based on best practices for query construction. To apply Query Recommendations, choose Apply All to apply the recommendations to all the queries, or Custom Selection to apply them to specific queries. For Custom Selection: click Custom Selection, select the check boxes of the respective queries, and then click to get the recommendation for query optimization. If the recommendations are applied successfully, the system shows a snackbar popup to confirm the success. |
| Download | | Downloads the transformed artifacts (sh scripts, java files, etc.), the validation report, and the executable jar file as a zip. |
| Regenerate | | Updates and repackages your code in the Transformation stage after you edit the transformed queries using the notebook-based online editor. If the artifacts are updated successfully, the system shows a snackbar popup to confirm the success. |
| Create Pipeline | | Turns the transformed code into a visual pipeline on the Stream Analytic platform, where the transformed code can be orchestrated and executed. |
| Update | | Uploads updated query files to replace the auto-transformed queries with manually corrected queries. To do so: click the Update icon, upload the updated query file (.xlsx or .xls file) in the Update field, and enable the Includes Header toggle if the first row of the uploaded file contains column names. |
| Validate | | Validates the replaced queries. |
| Compare | | Shows a line-by-line comparison of the transformed queries and the source queries to identify the changes. |
| Edit | | Lets you manually edit failed queries using the notebook-based online editor and then repackage the updated code with the Regenerate option. To do so: click the Edit icon, make the necessary changes in the transformed queries, and then click to update the queries. |
| Sort | | Sorts the queries. You can sort the queries based on All Queries, Success Queries, Failed Queries, Manually Corrected, Recommendation: Yes, or Recommendation: No. |
| Filter | | Filters the queries based on the query type. |
| View Notebook | | Lets you manually edit failed queries using the notebook-based online editor and then repackage the updated code with the Regenerate option. To do so: click the View Notebook icon, make the necessary changes in the transformed queries, and then click to update the queries. |
Package
This section provides a detailed report of the converted artifacts, including Python files, SQL files, etc., along with an executable jar file.
In the Package section, you can also see the Transformation_Report.xlsx file, which contains a comprehensive report of the transformation and the associated deductible license quota. License quota is deducted per unit when converting EDW workloads such as Teradata, Oracle, etc., to target equivalents, where a unit refers to a query in the workload. The license deduction logic is based on File similarity and Complexity.
File similarity: Checks whether the source file is related or similar to an already executed file. A higher file similarity value indicates that you have already executed a similar file. If the file similarity value is below the predefined threshold, the system deducts the license quota. If the value exceeds the threshold, the file is treated as already processed and no license quota is deducted. For instance:
- If the file similarity value is 70% (below the threshold) and the threshold is 80%, then the system deducts the unit quota.
- If the file similarity value is 92% (above the threshold) and the threshold is 80%, then the system will not deduct unit quota.
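The threshold rule above can be sketched in a few lines of Python. This is only an illustration: the function name `deducts_quota` is hypothetical, the 80% default is taken from the example rather than from a documented product setting, and the behavior at exactly the threshold is an assumption because the text does not specify it.

```python
def deducts_quota(file_similarity: float, threshold: float = 80.0) -> bool:
    """Return True when license quota would be deducted for a file.

    Both arguments are percentages. The 80% default mirrors the example
    in the text; the real threshold is predefined by the product.
    """
    # Below the threshold: the file is treated as new, so quota is deducted.
    # Assumption: a value exactly equal to the threshold is NOT deducted,
    # since the text only covers "below" and "exceeds".
    return file_similarity < threshold


print(deducts_quota(70.0))  # True: 70% is below the 80% threshold
print(deducts_quota(92.0))  # False: 92% exceeds the 80% threshold
```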
Complexity: The license quota deduction also depends on the complexity of each unit. Unit complexity is categorized as TRIVIAL, SIMPLE, MEDIUM, COMPLEX, V. COMPLEX, or V.V COMPLEX. The complexity weightage for the deductible unit quota is as follows:
- If the unit complexity is TRIVIAL or SIMPLE, the deductible unit quota is 1.
- If the unit complexity is MEDIUM, the deductible unit quota is 3.
- If the unit complexity is COMPLEX, V. COMPLEX, or V.V COMPLEX, the deductible unit quota is 5.
For instance, consider the example provided in the image below for the file TD_TestBed_DML.sql. This file contains a total of 86 units, out of which 85 are successfully transformed. Among these successfully transformed units:
- 82 units are categorized as Simple complexity with a weightage of 1. Consequently, the deductible unit quota for these units is 82.
- 2 units are categorized as Medium complexity with a weightage of 3. Consequently, the deductible unit quota for these units is 6.
- 1 unit is categorized as Complex complexity with a weightage of 5. Consequently, the deductible unit quota for this unit is 5.
So, the total deductible unit quota for the TD_TestBed_DML.sql file is 82 + 6 + 5 = 93.
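The weighted calculation for this example can be reproduced with a short Python sketch. The weights come from the complexity list above. The names `COMPLEXITY_WEIGHT` and `deductible_unit_quota` are hypothetical, and the input is simply the per-complexity count of successfully transformed units, not the product's actual data format.

```python
from typing import Mapping

# Complexity weightage per successfully transformed unit, as listed above.
COMPLEXITY_WEIGHT = {
    "TRIVIAL": 1,
    "SIMPLE": 1,
    "MEDIUM": 3,
    "COMPLEX": 5,
    "V. COMPLEX": 5,
    "V.V COMPLEX": 5,
}


def deductible_unit_quota(success_units: Mapping[str, int]) -> int:
    """Sum weight * count over the successfully transformed units."""
    return sum(
        COMPLEXITY_WEIGHT[complexity.upper()] * count
        for complexity, count in success_units.items()
    )


# Successfully transformed units of TD_TestBed_DML.sql from the example.
td_testbed_dml = {"SIMPLE": 82, "MEDIUM": 2, "COMPLEX": 1}
print(deductible_unit_quota(td_testbed_dml))  # 93
```

Only successfully transformed units are counted, which is why the one failed unit out of 86 does not appear in the input.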
The Summary sheet (refer to the image above) in the Transformation_Report.xlsx file shows detailed information about the deductible license quota.
- File Name: Displays the name of the file.
- Auto Conversion Percentage: Displays the auto-conversion percentage of scripts.
- File Similarity: Indicates whether the file is similar to an already executed file. The value is Yes if the file is similar to an already executed file; otherwise, it is No. If the value is Yes, the Deducted Script Quota and Deducted Unit Quota are zero, and no quota is deducted.
- Script Complexity: Displays the complexity of each file.
- Total Blocks: Displays the total number of blocks.
- Success Blocks: Displays the number of successfully transformed blocks.
- Total Units: Displays the total number of units.
- Success Units: Displays the number of successfully transformed units.
- Deductible Unit Quota: Displays the unit quota that needs to be deducted based on the complexity of each successfully transformed unit.
- Deducted Unit Quota: Displays the actual unit quota deducted from license.
The Query Summary sheet provides information about the queries along with the count of total, auto-converted, and manually corrected queries.
- Statement: Displays the statement types.
- Total Count: Displays the number of queries for each statement type.
- Auto-Converted Count: Displays the number of queries that are automatically converted for each statement type.
- Manually Converted Count: Displays the number of queries that require manual intervention for each statement type.
The Transformation Report sheet lists all the queries along with their type, auto-generated query, status, complexity, and more.
- File Name: Displays the name of the file.
- Procedure Name: Displays the name of the procedure.
- Procedure Parameters: Displays the number of parameters in each procedure.
- Original Query: Displays the original query.
- Query Type: Displays the query type.
- Auto Generated Query: Displays the auto-generated query.
- Status: Indicates the conversion status of each query. If the value is SUCCESS, the query was converted successfully; otherwise, the value is FAILED.
- Validation Status: Displays the status of query validation.
- Complexity: Displays the query complexity.
- Deductible Quota: Displays the quota that needs to be deducted based on the complexity of each successfully transformed query.
- Deducted Quota: Displays the actual unit quota deducted from the license.
- UDF Used: Displays the user-defined functions used in each query.
- Script Line: Displays the number of lines present in each query or procedure.
- Script Column: Displays the number of columns that are used in the query.
- Error Message: Shows a brief message of auto-conversion or query validation failure.
- Error Details: Shows a detailed message of auto-conversion or query validation failure.
- Manually Corrected: Indicates whether the query needs manual intervention.
- Manual Query: Displays manually updated queries.
- Transformation Discrepancy: Indicates whether there is a difference between the source and target query constructs. If the value is Yes, there is a discrepancy between the source and target query constructs. For instance, if the source query has one join condition but the converted target query has two, this is flagged as a discrepancy. If the value is No, there is no difference in the query constructs between the source and target queries.
- Changed Model Applied: Provides information on any applied changes to the model. It is a mapping between source and target tables as well as columns along with a definition of the mapping condition such as Inner Join, etc.
- Query Construct: Displays the number of source and target query constructs such as Joins, sub-queries, and conditions.
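One way to picture how the Query Construct counts relate to the Transformation Discrepancy flag is the following Python sketch. It assumes a discrepancy simply means that some construct count differs between source and target, which is consistent with the join example above but is not confirmed as the product's exact rule. The function name and the construct keys are hypothetical.

```python
from typing import Mapping


def has_discrepancy(source: Mapping[str, int], target: Mapping[str, int]) -> bool:
    """Return True if any construct count differs between source and target.

    Constructs missing from one side are treated as a count of zero.
    """
    constructs = set(source) | set(target)
    return any(source.get(c, 0) != target.get(c, 0) for c in constructs)


# Hypothetical construct counts: the source has one join, the target has two.
source_constructs = {"joins": 1, "sub_queries": 0, "conditions": 2}
target_constructs = {"joins": 2, "sub_queries": 0, "conditions": 2}

print("Yes" if has_discrepancy(source_constructs, target_constructs) else "No")  # Yes
```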
When you try to convert more units than the available license quota allows, the units within the quota are converted successfully. Any additional units beyond the quota fail to transform and display the error message “License quota insufficient.”
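The quota-exhaustion behavior can be sketched as below. This is a simplified model under stated assumptions: units are processed in order, each unit consumes its deductible quota, and a unit fails when its cost no longer fits in the remaining quota. The function name and the unit names are hypothetical; only the error message comes from the text.

```python
from typing import List, Tuple

ERROR_MESSAGE = "License quota insufficient."


def convert_within_quota(
    units: List[Tuple[str, int]], available_quota: int
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split units into converted names and (name, error) failures.

    Assumption: units are handled in the given order and each one
    consumes its deductible quota from the remaining balance.
    """
    converted: List[str] = []
    failed: List[Tuple[str, str]] = []
    remaining = available_quota
    for name, cost in units:
        if cost <= remaining:
            converted.append(name)
            remaining -= cost
        else:
            failed.append((name, ERROR_MESSAGE))
    return converted, failed


# Hypothetical units with their deductible quota, and a quota of 4.
units = [("q1", 1), ("q2", 3), ("q3", 5), ("q4", 1)]
converted, failed = convert_within_quota(units, available_quota=4)
print(converted)                     # ['q1', 'q2']
print([name for name, _ in failed])  # ['q3', 'q4']
```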
You can apply:
Dependency
This sub-section illustrates the complex interdependencies between various enterprise workloads through a process lineage graph. Lineage or dependency graphs are generated whenever one database depends on another, or one table depends on another, to show the relationship.
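As a rough illustration of what a table-level dependency graph captures, the Python sketch below models dependencies as a mapping from each table to the tables it depends on and derives an order in which the tables could be processed. The table names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) is used here only for illustration; it is not the tool's actual implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical table-level dependencies: each key depends on the tables
# in its set (for example, sales_summary is built from orders and customers).
dependencies = {
    "sales_summary": {"orders", "customers"},
    "orders": {"customers"},
    "customers": set(),
}

# static_order() yields every table after all the tables it depends on.
lineage_order = list(TopologicalSorter(dependencies).static_order())
print(lineage_order)  # ['customers', 'orders', 'sales_summary']
```

A cyclic dependency (two tables depending on each other) would raise `graphlib.CycleError`, which is one reason a lineage graph is useful for spotting problematic interdependencies.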
Assessment Report
This section displays a summary of all the statement types in the Transformation stage. It presents the total number of statement types, the various types of statements used in queries, the count of each statement type, and the number of auto-converted queries.
You can download the Transformation report based on the different statement types used in the queries. To download the report, click the download icon.