Configuring Informatica BDM

This topic provides steps to configure the Informatica BDM conversion stage.

Select the ETL Type as Informatica BDM.
In Input Artifacts, upload the source data via:

Browse Files: To select the source files from the local system.
Select From Data Source: To select the source files from the data source. To do so, follow the steps below:

Click Select From Data Source.
Choose the repository.
Select the data source.
Select the entities.
Save the selected data source.

Select the Target Type to which you need to transform the source scripts, such as Spark or AWS Glue Studio.
Click Data Configuration.

The inputs required depend on the selected Target Type:

Target Type: Spark
In Output Type, the default output type for the transformation is set to Python.

Target Type: AWS Glue Studio
In Target Database Details, specify the database name, schema name, and prefix. If a prefix is provided, the table name is displayed in prefix_database_tablename format.
In AWS Glue Catalog Database, provide the AWS Glue catalog database connection details to connect to the database and schema.
In S3 Bucket Base Path, specify the S3 storage repository path in which to store the files.
In UDF File Location, specify the UDF file location.
In UDF Jar Location, specify the jar location path to define the new UDF location.

Target Type: AWS Glue Job
In Data Interaction Technique, select your data interaction method. The options are:

Glue-Redshift: Fetches input data from Amazon Redshift, processes it in Glue, and stores the processed or output data in Redshift. In this scenario, source data is converted to Redshift, whereas temporary or intermediate tables are converted to Spark.

Glue: Data Catalog: Accesses data through the data catalog, which serves as a metadata repository. The data is then processed in Glue, and the processed or output data is stored in the data catalog. In Storage Format, select the storage format of your data, such as Delta or Iceberg.

Glue: External: Fetches input data from an external source such as Oracle, Netezza, or Teradata, processes that data in Glue, and then moves the processed or output data to an external target. For instance, if the source input file contains data from an external source like Oracle, select Oracle as the source database to establish the database connection and load the input data. The data is then processed in Glue, and the processed or output data is stored at the external target (Oracle). However, if you select Oracle as the Source Database Connection but the source input file contains data from an external source other than Oracle, such as Teradata, then by default it runs on Redshift.
If the selected data interaction technique is Glue: External, you need to specify the source database of your data. In Source Database Connection, select the database you want to connect to. This establishes the database connection to load data from external sources like Oracle or Teradata.
If the database is selected, the converted code includes the connection parameters related to that database (in the output artifacts).
If the database is not selected, you need to add the database connection details manually to the parameter file to execute the dataset; otherwise, by default, it executes on Redshift.

Redshift ETL Orchestration via Glue: Accesses, processes, and executes data in Amazon Redshift and uses Glue for orchestration jobs. In this scenario, both source data and intermediate tables are converted to Redshift.

In Default Database, select the default database for queries whose database type is not defined in the uploaded artifacts. Selecting Not Sure converts only those queries whose database type is available.
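Two of the rules above behave like small functions, so a minimal Python sketch may make them easier to check: the prefixed table-name format for AWS Glue Studio, and where a Glue: External dataset ends up executing. This is not product code. The function and parameter names are hypothetical, and two behaviors are assumptions that the guide does not state: the name shown when no prefix is given, and the target used once connection details are added manually to the parameter file.

```python
from typing import Optional


def glue_table_name(database: str, table: str, prefix: str = "") -> str:
    """Return the displayed table name for an AWS Glue Studio target.

    With a prefix, the guide gives the format prefix_database_tablename.
    """
    if prefix:
        return f"{prefix}_{database}_{table}"
    # Assumption: the guide does not say what is shown without a prefix.
    return table


def external_execution_target(
    file_source_db: str,
    selected_db: Optional[str] = None,
    manual_params_added: bool = False,
) -> str:
    """Return where a Glue: External dataset runs, per the rules above."""
    if selected_db is None:
        # No Source Database Connection selected: connection details must be
        # added to the parameter file by hand, otherwise Redshift is used.
        # Assumption: the manual details point at the file's source database.
        return file_source_db if manual_params_added else "Redshift"
    if selected_db.lower() == file_source_db.lower():
        # Selected connection matches the data in the source input file.
        return selected_db
    # Selected connection does not match the input file: Redshift by default.
    return "Redshift"
```

For example, under these assumptions `glue_table_name("sales", "orders", "stg")` gives `stg_sales_orders`, and selecting Oracle for an input file that actually contains Teradata data resolves to Redshift.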

Click Save to update the changes.
An alert pop-up message appears, prompting you to refer to your respective assessment to determine the anticipated quota deduction required when converting your scripts to the target. Then click Ok.

Provide a preferred pipeline name.
Click (Execute) to execute the pipeline. You are taken to the listing page, which shows the pipeline status as Running. The status changes to Success when the pipeline completes successfully.
Click the pipeline card to see the reports.
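If you track the pipeline from a script rather than the listing page, the Running to Success transition can be sketched as a simple polling loop. This guide only describes the user interface, so `fetch_status` below is a hypothetical callable that you would have to supply yourself. The only status values taken from this guide are Running and Success.

```python
import time
from typing import Callable


def wait_for_success(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> bool:
    """Poll until the status is "Success"; give up after max_polls checks."""
    for _ in range(max_polls):
        status = fetch_status()
        if status == "Success":
            return True
        if status != "Running":
            # Any other state is outside what this guide documents.
            return False
        time.sleep(poll_seconds)
    # Still Running after the last allowed check.
    return False
```

The loop stops on any status other than Running so that an unexpected state is reported right away instead of being polled until the limit is reached.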

To view the Informatica BDM conversion report, visit Informatica BDM Conversion Report.