The inputs required for each target type are described below.

Spark
- To convert the sequence jobs to their Airflow equivalents and generate the corresponding Python artifacts, turn on the Convert Sequence Jobs to Airflow toggle (see the sketch after this list). If you do not turn on this toggle, the sequence jobs are converted to equivalent Spark jobs.
- In Output Type, the default output type for the transformation is set to Python.
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.
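As an illustration of the Airflow conversion above, here is a minimal sketch of what a generated Airflow DAG for a converted sequence job might look like, assuming two converted Spark jobs that run in order; the DAG ID, task IDs, application paths, and connection ID are hypothetical.

```python
# Hypothetical sketch of an Airflow DAG artifact for a converted sequence job.
# DAG ID, task IDs, application paths, and connection ID are illustrative only.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="converted_sequence_job",               # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,                        # run on demand
    catchup=False,
) as dag:
    extract = SparkSubmitOperator(
        task_id="job_extract",
        application="/artifacts/job_extract.py",   # converted Spark job (assumed path)
        conn_id="spark_default",
    )
    load = SparkSubmitOperator(
        task_id="job_load",
        application="/artifacts/job_load.py",      # converted Spark job (assumed path)
        conn_id="spark_default",
    )

    # Preserve the ordering defined by the original sequence job.
    extract >> load
```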

Snowflake
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source (see the sketch after this list).
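To illustrate the validation prerequisite above, here is a minimal sketch that uses the Snowflake Python connector to pre-create a required input table and register a UDF on the target before validation; the connection values, table, and function are hypothetical.

```python
# Hypothetical sketch: prepare the Snowflake target for syntax validation by
# pre-creating an input table and registering a UDF referenced by the
# transformed queries. All connection values and object names are assumed.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
    warehouse="my_wh",
    database="MY_DB",
    schema="PUBLIC",
)
cur = conn.cursor()
try:
    # Required input table referenced by the transformed queries (hypothetical DDL).
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS CUSTOMERS (
            CUSTOMER_ID NUMBER,
            CUSTOMER_NAME VARCHAR
        )
        """
    )
    # Register a user-defined function used by the transformed queries.
    cur.execute(
        """
        CREATE OR REPLACE FUNCTION CIRCLE_AREA(radius FLOAT)
        RETURNS FLOAT
        AS 'pi() * radius * radius'
        """
    )
finally:
    cur.close()
    conn.close()
```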

Delta Live Tables
- Turn on the DLT Meta toggle to facilitate the creation of a bronze table within the Databricks Lakehouse. Rather than fetching data directly from the source, such as flat files, this feature creates a bronze table (an exact replica of the file) within Databricks and helps refine data during ingestion. With DLT Meta enabled, flat files are stored as tables within Databricks, ensuring efficient data retrieval directly from these tables and significantly boosting overall performance (see the sketch after this list).
- In DBFS Base Path, provide the DBFS base location where the source flat files and DDL files are stored. This information is required to create the bronze table in Databricks.
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.
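As a sketch of the bronze table that DLT Meta facilitates, the following is a minimal Delta Live Tables Python definition, assuming a CSV flat file stored under the DBFS base path; the table name, file name, and base path are hypothetical.

```python
# Hypothetical sketch of a bronze table in a Delta Live Tables pipeline:
# the flat file under the DBFS base path is landed as-is into a Databricks table.
# Runs inside a Databricks DLT pipeline, where `spark` is provided implicitly.
import dlt
from pyspark.sql.functions import current_timestamp

DBFS_BASE_PATH = "dbfs:/mnt/landing"              # assumed DBFS base path


@dlt.table(
    name="customers_bronze",                      # hypothetical bronze table name
    comment="Exact replica of the source flat file, ingested as a bronze table.",
)
def customers_bronze():
    return (
        spark.read.format("csv")
        .option("header", "true")
        .load(f"{DBFS_BASE_PATH}/customers.csv")  # hypothetical source flat file
        .withColumn("ingest_ts", current_timestamp())
    )
```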

Databricks Notebook
- In Output Type, select Python or Jupyter as the output format for the generated artifacts.
- In Data Interaction Technique, select your data interaction method. Following are the options:
- Databricks-Native: Select Databricks-Native to fetch, process, and store data in Databricks Notebook.
- Databricks: Unity Catalog: Select Databricks: Unity Catalog to access data via Databricks Unity Catalog. In Databricks, the Unity Catalog serves as a metadata repository from which data is fetched, processed, and stored within the catalog.
- Databricks: External: Select this data interaction technique to fetch input data from an external source such as Oracle, Netezza, or Teradata, process that data in Databricks, and then move the processed output to an external target. For instance, if the source input file contains data from an external source such as Oracle, select Oracle as the Source Database Connection to establish the database connection and load the input data. The data is then processed in Databricks, and the processed output is stored at the external target (Oracle). However, if you select Oracle as the Source Database Connection but the source input file contains data from a different external source, such as Teradata, the job runs on Databricks by default.
- If the selected data interaction technique is Databricks: External, specify the source database of your data. In Source Database Connection, select the database you want to connect to; this establishes the database connection for loading data from external sources such as Oracle or Teradata. If a database is selected, the converted code includes the related connection parameters in the output artifacts (see the sketch after this list). If no database is selected, you must add the database connection details manually to the parameter file to execute the dataset; otherwise, it executes on Databricks by default.
- To convert the DataStage sequence jobs to their Databricks Workflows equivalents and generate the corresponding JSON artifacts, turn on the Convert Sequence Jobs to Databricks Workflows toggle.
- In Default Database, select the appropriate default database (Teradata, Netezza, Oracle, etc.) for queries where the database type is not defined in the uploaded artifacts. If you select Not Sure, only queries whose database type is available are converted.
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.
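For the Databricks: External technique described in this list, here is a minimal sketch of how the connection parameters might be used in a Databricks notebook to read input from an external Oracle source over JDBC, process it in Databricks, and write the output back to the external target; the JDBC URL, credentials, and table names are all hypothetical.

```python
# Hypothetical sketch of the Databricks: External interaction technique.
# The connection parameters shown are the kind of values that would otherwise be
# supplied in the parameter file; all of them are assumed for illustration.
# Runs in a Databricks notebook, where `spark` is provided implicitly.
jdbc_url = "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1"
connection_props = {
    "user": "etl_user",
    "password": "etl_password",
    "driver": "oracle.jdbc.OracleDriver",
}

# Fetch the input data from the external source.
orders = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "SRC.ORDERS")            # hypothetical source table
    .options(**connection_props)
    .load()
)

# Process the data in Databricks.
daily_totals = orders.groupBy("ORDER_DATE").sum("AMOUNT")

# Store the processed output at the external target.
(
    daily_totals.write.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "TGT.DAILY_TOTALS")      # hypothetical target table
    .options(**connection_props)
    .mode("overwrite")
    .save()
)
```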

Databricks Lakehouse
- In Output Type, select Python or DBT as the output format for the generated artifacts.
- In Data Interaction Technique, select your data interaction method. Following are the options:
- Databricks-Native: Select Databricks-Native to fetch, process, and store data in Databricks Lakehouse.
- Databricks: Unity Catalog: Select Databricks: Unity Catalog to access data via Databricks Unity Catalog. In Databricks, the Unity Catalog serves as a metadata repository from which data is fetched, processed, and stored within the catalog (see the sketch after this list).
- Databricks: External: Select this data interaction technique to fetch input data from an external source such as Oracle, Netezza, or Teradata, process that data in Databricks, and then move the processed output to an external target. For instance, if the source input file contains data from an external source such as Oracle, select Oracle as the Source Database Connection to establish the database connection and load the input data. The data is then processed in Databricks, and the processed output is stored at the external target (Oracle). However, if you select Oracle as the Source Database Connection but the source input file contains data from a different external source, such as Teradata, the job runs on Databricks by default.
- If the selected data interaction technique is Databricks: External, specify the source database of your data. In Source Database Connection, select the database you want to connect to; this establishes the database connection for loading data from external sources such as Oracle or Teradata. If a database is selected, the converted code includes the related connection parameters in the output artifacts. If no database is selected, you must add the database connection details manually to the parameter file to execute the dataset; otherwise, it executes on Databricks by default.
- In Default Database, select the appropriate default database (Teradata, Netezza, Oracle, etc.) for queries where the database type is not defined in the uploaded artifacts. If you select Not Sure, only queries whose database type is available are converted.
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.
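Here is a minimal sketch of the Databricks: Unity Catalog interaction described in this list, reading and writing tables through the catalog's three-level namespace; the catalog, schema, and table names are hypothetical.

```python
# Hypothetical sketch of the Databricks: Unity Catalog interaction technique:
# data is fetched from, and written back to, tables addressed through the
# Unity Catalog three-level namespace (catalog.schema.table).
# Runs in a Databricks notebook or job, where `spark` is provided implicitly.
source = spark.table("main.sales.orders")       # fetch via Unity Catalog (assumed names)

# Process the data in the Databricks Lakehouse.
summary = source.groupBy("region").count()

# Store the result back in the catalog.
summary.write.mode("overwrite").saveAsTable("main.sales.orders_by_region")
```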

AWS Glue Studio
- In Target Database Details, provide the database name, schema name, and prefix. If a prefix is provided, the table name is displayed in the prefix_database_tablename format (see the sketch after this list).
- In AWS Glue Catalog Database, provide the AWS Glue Catalog database connection details to connect to the database and schema.
- In S3 Bucket Base Path, specify the S3 storage repository path to store the files.
- In UDF File Location and UDF Jar Location, specify the paths of the UDF file and JAR file, respectively, to define the new UDF location.
- In Target Connection Name, provide the connection name to add the predefined connection to Glue.
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.
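The following is a minimal sketch of how the values configured above (AWS Glue Catalog database, S3 bucket base path, and table-name prefix) might surface in a generated Glue script; all names and paths are hypothetical.

```python
# Hypothetical sketch showing where the configured values (Glue Catalog database,
# S3 bucket base path, table-name prefix) might appear in a generated Glue script.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

CATALOG_DATABASE = "analytics_db"           # assumed AWS Glue Catalog database
S3_BASE_PATH = "s3://my-bucket/migration"   # assumed S3 bucket base path
PREFIX = "stg"                              # assumed table-name prefix

# With a prefix configured, tables are named in the prefix_database_tablename format.
target_table = f"{PREFIX}_{CATALOG_DATABASE}_orders"

orders = glue_context.create_dynamic_frame.from_catalog(
    database=CATALOG_DATABASE,
    table_name="orders",                    # hypothetical source table
)

glue_context.write_dynamic_frame.from_options(
    frame=orders,
    connection_type="s3",
    connection_options={"path": f"{S3_BASE_PATH}/{target_table}"},
    format="parquet",
)

job.commit()
```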

AWS Glue Job
- In Data Interaction Technique, Glue: Redshift is selected by default. This data interaction method consumes input data from Amazon Redshift, processes it in Glue, and stores the processed output back in Redshift (see the sketch after this list).
- In Default Database, select the appropriate default database (Teradata, Netezza, Oracle, etc.) for queries where the database type is not defined in the uploaded artifacts. If you select Not Sure, only queries whose database type is available are converted.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.
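Here is a minimal sketch of the Glue: Redshift interaction technique described in this list: input is read from Amazon Redshift through a catalogued connection, processed in Glue, and written back to Redshift; the connection, database, and table names are hypothetical.

```python
# Hypothetical sketch of the Glue: Redshift interaction technique: consume input
# from Amazon Redshift, process it in Glue, and store the output back in Redshift.
# All connection, database, and table names are assumed for illustration.
import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME", "TempDir"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the input table from Redshift via a catalogued table and staging directory.
orders = glue_context.create_dynamic_frame.from_catalog(
    database="redshift_db",                     # assumed catalog database
    table_name="public_orders",                 # assumed catalogued Redshift table
    redshift_tmp_dir=args["TempDir"],
)

# Process the data in Glue (filter out rejected rows, for example).
orders_df = orders.toDF().filter("status <> 'REJECTED'")
valid_orders = DynamicFrame.fromDF(orders_df, glue_context, "valid_orders")

# Store the processed output back in Redshift.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=valid_orders,
    catalog_connection="redshift-connection",   # assumed Glue connection name
    connection_options={"dbtable": "public.valid_orders", "database": "dev"},
    redshift_tmp_dir=args["TempDir"],
)

job.commit()
```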

Matillion ETL
- In Output Type, the default output type for the transformation is set to JSON.
- To perform syntax validation, turn on the Validation toggle.
- In Source Data Source, select the data source (DDL) that contains the corresponding metadata to ensure accurate query conversion.
- In Target Data Source, select the target data source to perform syntax validation. For the syntax validation of the transformed queries to succeed, ensure that the required input tables already exist on the target side and that all user-defined functions (UDFs) are registered on the target data source.