SAS Assessment Report
This topic contains information about the SAS assessment report. SAS is one of the most popular analytics systems used in the industry. The assessment analyzes workloads and produces in-depth insights that help you plan the migration. The SAS assessment accepts SAS or EGP files as input.
Highlights
The highlights section gives you a high-level summary of the analytics performed on the selected workloads. It includes a graphical depiction of file complexity as well as a summary of the files used.
Summary
This section summarizes the SAS-based analytical scripts that were analyzed across the various elements. Here you can see the number of files, the percentage of procedural and statistical scripts, the percentage of code analyzed, and more.
- Files: Displays the total number of input files.
- SQL + SAS Procedural: Displays the percentage of workloads that consist of SQL and SAS Procedural code.
- SAS Statistical (Base Proc): Displays the percentage of workloads that consist of SAS Statistical (Base Proc) code.
- Automation: Displays the anticipated automation conversion percentage.
- Duplicate Files: Displays the number of duplicate files present in the input files.
- Partly Duplicate Files: Displays the number of partially duplicate files present in the input files.
Complexity
This section provides a summarized graphical representation of the complexity of SAS scripts, which supports decisions such as budget estimation. The complexity of a SAS script is calculated based on parameters such as external access, conditional procedural statements, external resources, total SQL statements, and procedures.
Analysis
This topic provides a detailed examination of source files and artifacts.
Source Analysis
This section provides a comprehensive report of the SAS script including file complexity, queries executed, and so on. It also shows the number of macros, proc SQL, data steps, etc. for every SAS script.
- File Name: Name of the file.
- Macros: Displays the number of macros in the file.
- Proc SQL: Displays the number of Proc SQL occurrences in the file.
- Data Steps: Displays the number of data steps in the file. A data step is a set of instructions that creates or modifies data sets.
- Complexity: Displays the complexity of the SAS script.
- Conditional Procedural Constructs: Group of conditional statements including do loop statement count, if statement count, where statement count, do over statements count, array statement count, let statement count, put statement count, symdel statement count, and else_cond count.
- External Data Access: Group of statements including lib name statement count, and ods count.
- Base SAS Procs: Procedural statements in SAS including proc append count, proc calendar count, proc catalog count, proc chart count, etc.
- Advanced SAS Procs: Procedural statements in SAS including proc access count, proc aggregation count, proc allele count, proc anom count, proc appsrv count and more.
- Analyzed Percentage: Percentage of analyzed files.
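To make the per-file counts above more concrete, here is a minimal Python sketch of how macros, Proc SQL occurrences, and data steps could be roughly counted in a SAS script. It is an illustration only and not the assessment engine's parser: the function name `count_constructs` is hypothetical, and the approach (splitting on semicolons and checking the first keyword) ignores semicolons inside string literals and other edge cases.

```python
import re


def count_constructs(sas_source: str) -> dict:
    """Roughly count macros, PROC SQL occurrences, and DATA steps in SAS text."""
    # Drop /* ... */ block comments so commented-out code is not counted.
    text = re.sub(r"/\*.*?\*/", "", sas_source, flags=re.DOTALL)
    counts = {"macros": 0, "proc_sql": 0, "data_steps": 0}
    # SAS statements end with a semicolon. This naive split does not
    # handle semicolons that appear inside quoted strings.
    for statement in text.split(";"):
        words = statement.lower().split()
        if not words:
            continue
        if words[0] == "%macro":
            counts["macros"] += 1
        elif words[0] == "proc" and len(words) > 1 and words[1] == "sql":
            counts["proc_sql"] += 1
        elif words[0] == "data":
            counts["data_steps"] += 1
    return counts
```

A real parser would also need to handle nested macros, line comments, and quoted text, so treat these numbers as a sanity check against the report rather than a replacement for it.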
Artifacts
This page gives details about artifacts, which are collections of related server data. It lists the files that could not be parsed completely because of an error.
- File Name: Displays the name of the unparsed file.
- Type: Displays the type of each file.
- Error on Line Number: Displays the line number where the error occurred.
Lineage
End-to-end process lineage identifies the complete dependency structure through interactive and drill-down options to the last level.
Typically, even within one line of business, multiple data sources, entry points, ETL tools, and orchestration mechanisms exist. Decoding this complex data web and translating it into a simple visual flow can be extremely challenging during large-scale modernization programs. The visual lineage graph adds tremendous value and helps define the roadmap to the modern data architecture. It deep dives into all the existing flows and provides integrated insights. These insights help data teams make strategic decisions with greater accuracy and completeness. Enterprises can proactively leverage integrated analysis to mitigate the risks associated with migration and avoid business disruption.
To view the Data Model lineage, enter keywords in the search field and then click the Search icon.
The data model shows the end-to-end relationships and dependencies between elements. In addition, the filter search icon allows you to include or exclude particular nodes to obtain the required dependency structure. You can also choose the direction of the lineage. By default, the Dependency Direction is Left to Right Hierarchy. You can choose Right to Left Hierarchy or Bidirectional instead, as required. You can also increase the Hierarchy Levels to the nth level.
Lineage helps you visualize how your selected nodes are connected and depend on each other. The nodes and their connecting edges (relationships) help you understand the overall structure and dependencies.
Nodes | Edges |
Tables | Call |
File | Read |
Stage | Execute |
Bridge Table | Write |
 | OTHERS |
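The dependency direction and hierarchy level options can be pictured as a bounded graph traversal over the nodes and edges listed above. The following Python sketch is an assumption-laden illustration, not the product's implementation: the edge list, the file and table names, and the `lineage` function are hypothetical, and "forward", "backward", and "both" are stand-ins for the Left to Right, Right to Left, and Bidirectional options.

```python
from collections import deque

# Hypothetical edge list: (source node, edge type, target node).
EDGES = [
    ("load.sas", "Read", "raw.orders"),
    ("load.sas", "Write", "stage.orders"),
    ("report.sas", "Read", "stage.orders"),
    ("driver.sas", "Call", "load.sas"),
]


def lineage(start, edges, direction="forward", max_levels=1):
    """Return the nodes reachable from `start` within `max_levels` hops."""
    # Build an adjacency map that follows edges as stored (forward),
    # reversed (backward), or in both directions.
    neighbors = {}
    for source, _edge_type, target in edges:
        if direction in ("forward", "both"):
            neighbors.setdefault(source, set()).add(target)
        if direction in ("backward", "both"):
            neighbors.setdefault(target, set()).add(source)
    # Breadth-first search that stops expanding at the requested level.
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, level = queue.popleft()
        if level == max_levels:
            continue
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, level + 1))
    seen.discard(start)
    return seen
```

For example, `lineage("driver.sas", EDGES, "forward", 2)` returns the called script plus the tables it reads and writes, while raising `max_levels` widens the view one hop at a time.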
Manage Lineage
This feature enables you to view and manage your lineage. You can add, modify, or delete nodes and their relationships to generate an accurate representation of the required dependency structure. There are two ways to update the lineage: using the Complete Lineage report or the Lineage Template.
Using Complete Lineage report
Follow the steps below to modify the lineage:
- Click the Manage Graph icon.
- Click Download Complete Lineage to update, add, or delete the nodes and their relationships in the current lineage.
- Once the complete lineage report is downloaded, make the necessary updates, such as updating, deleting, or adding nodes and their relationships.
- After making the required changes, upload the updated lineage report in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the dependency structure.
- Generate the required data model lineage.
Using Lineage Template
Follow the steps below to add new nodes and their relationships to the current lineage report:
- Click the Manage Graph icon.
- Click Download Lineage Template.
- Once the lineage template is downloaded, you can add new nodes and relationships in the template.
- After making the required changes, upload the template in Upload to Modify Lineage.
- Click Apply to incorporate the updates into the complete dependency structure.
- Generate the required data model lineage.
You can also use the following features:
Feature | Icon | Use |
Filter |  | Used to filter the lineage. |
Reload |  | Assists in reloading graphs. |
Save |  | Used to save the lineage. |
Download |  | Used to download the file. |
Expand |  | Used to enlarge the screen. |
Downloadable Reports
Downloadable reports allow you to export detailed SAS assessment reports of your source data, which helps you gain in-depth insights with ease. To access these assessment reports, click Reports.
Types of Reports
In the Reports section, you can see various types of reports such as Insights and Recommendations and Source Inventory Analysis. Each report type offers detailed information allowing you to explore your assessment results.
Insights and Recommendations
This report provides an in-depth insight into the source input files. It contains the final output including information about analyzed SAS scripts, complexity measures, functions, variable references, file duplicacy, and more.
Here, you can see the sas folder and the SAS_Code_Assessment.xlsx report.
SAS_Code_Assessment.xlsx: This report provides insights about the source inventory. It helps you plan the next frontier of a modern data platform methodically. It includes information about complexity measures, functions, variable references, duplicacy, and more.
This report contains the following information:
- Report Summary: Provides information about all the generated artifacts.
- Summary: Lists a detailed inventory for every SAS file. For instance, it provides information about the number of files with compilation errors, parsing errors, missing files, the percentage of auto-conversion and manual conversion, common functions used, and so on.
- Complexity Measures: Provides high-level complexity details of each SAS file. It includes information about the formula used to derive the complexity, total lines of code, the percentage of auto-conversion, the number of lines of code that are valid for conversion, the number of macros defined, and more.
- EGP Complexity Measures: Provides details about complexity measures for the EGP files. It includes the percentage of auto-conversion, the number of lines of code, the number of non-converted code lines, the count of proc SQL, and so on.
- Discovery: Lists key metrics related to auto-converted code. It includes the code parsing status, the number of non-converted code lines, the compilation status, the percentage of auto-conversion, and so on.
- Manual Conversion: Provides more granular auto-conversion metrics, such as the statements (format, date, length, increment, etc.) that need manual intervention. It also provides the count of statements, procs, etc. that need manual intervention in the format "ToDo Automatic Comment".
- Step 1: In the Discovery sheet, if the Compile and Parsing statuses are failed, consider the file for manual conversion.
- Step 2: In the Missing Source Code sheet, if there is an entry, do not proceed with the manual conversion. First obtain the missing files from the customer, and then perform the end-to-end conversion.
- Step 3: In the Manual Conversion sheet, if % auto-conversion is more than 20%, ask for code re-conversion and a statement-wise count for manual conversion.
- Step 4: In the Manual Patterns sheet, refer to the details of code blocks that need manual conversion.
- Manual Patterns: Provides details of code blocks that need manual conversion. It includes information about the statement types that are not auto-converted, the code for manual conversion, partially auto-converted code that requires minor changes, and the actions required for manual conversion.
- Inputs and Outputs: Lists all the inputs in the SAS file along with their types, such as READ, Lookup, New Dataset, etc. It also lists the outputs and their types, such as a file.
- Functions: Lists all the functions occurring in a SAS file along with their usage status.
- Variable Reference: Lists all the variables in a source SAS file along with their values as they appear in the code. It also lists the types of entities, such as macro calls, LET, etc.
- Common Functions: Lists all the common functions in the source code file along with their availability and known issues.
- Compilation Errors: Provides information about the compilation error in each source file.
- Parsing Errors: Provides information about unparsable code.
- Dependent Source Code: Lists dependent files for different source files.
- Complexity Calculation Logic: Provides the high-level details required for complexity calculations. It includes information about Group Statements (collections of statements) and lists all the SAS base functions. It also provides the complexity range: based on an established complexity calculation logic, the assessment engine calculates the complexity of SAS scripts and their components and shows it as a range for each SAS statement type and function.
- Duplicacy: Provides the file resemblance index by comparing each file with all the other input files. Files that meet or exceed the predefined similarity threshold are marked as duplicates. Files that the system compares successfully are marked as Success in the Status column; otherwise, an error message is displayed.
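The pairwise comparison described for the Duplicacy sheet can be sketched as follows. This is a minimal illustration under stated assumptions, not the assessment engine's algorithm: it uses Python's standard `difflib.SequenceMatcher` as a stand-in for the file resemblance index, and the function name `find_duplicates` and the default threshold of 0.9 are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicates(files: dict, threshold: float = 0.9) -> list:
    """Return (name_a, name_b, score) for file pairs at or above the threshold.

    `files` maps a file name to its text content.
    """
    matches = []
    # Compare every file with every other file exactly once.
    for (name_a, text_a), (name_b, text_b) in combinations(files.items(), 2):
        # Line-based similarity in [0, 1]; 1.0 means identical line sequences.
        score = SequenceMatcher(
            None, text_a.splitlines(), text_b.splitlines()
        ).ratio()
        if score >= threshold:
            matches.append((name_a, name_b, round(score, 2)))
    return matches
```

Lowering the threshold in a sketch like this would surface partially similar pairs as well, which is one way to think about the difference between duplicate and partly duplicate files.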
Browse through the sas folder to access the SAS Analysis DB.xlsx report.
SAS Analysis DB.xlsx: This report provides information about the analyzed SAS scripts.
This report contains the following information:
- Report Summary: Provides information about all the generated artifacts.
- SAS Analysis DB: Provides information about the analyzed SAS scripts. It includes information about the complexity, the number of macros, proc SQL, data steps, etc., for every SAS script.
Source Inventory Analysis
This is an intermediate report that helps to debug failures or calculate the final report. It includes all the generated CSV reports, such as entities_used.csv and querywise_entity_details.csv.
entities_used.csv: This report provides information about the tables associated with the analyzed or supported queries.
querywise_entity_details.csv: This report provides information about entities or tables associated with each query. It also includes information about database types, schemas, number of columns, and more.
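When debugging with these intermediate CSV reports, a quick first step is to check which columns a file contains and how many rows it has. The Python sketch below makes no assumptions about the column names in entities_used.csv or querywise_entity_details.csv; the helper name `summarize_csv` is hypothetical, and the only assumption is that the first row of the file is a header.

```python
import csv


def summarize_csv(lines):
    """Return (column names, data row count) for CSV content with a header row.

    `lines` can be an open text file or any iterable of CSV lines.
    """
    reader = csv.DictReader(lines)
    row_count = sum(1 for _ in reader)
    # `fieldnames` is None when the input is completely empty.
    return list(reader.fieldnames or []), row_count


# Example usage with a downloaded report (path is an assumption):
# with open("querywise_entity_details.csv", newline="", encoding="utf-8") as handle:
#     columns, rows = summarize_csv(handle)
#     print(columns, rows)
```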