
Creating Data Source

A data source contains information about a database or schema. To build metadata, you can extract any data set or access data from a variety of data sources. Before you start extracting data or entities, make sure that you have already set up a repository for the source.

To add a data source, follow these steps:

  1. In Repository, select the repository with which you want to associate the data source.
  2. In Data Source Name, enter your preferred data source name, and then provide the input requirements based on the category and type selected for the repository.

The Input column in the table below describes the input requirements for each category and type chosen when creating the repository.

Category Type Input
Big data Databricks Lakehouse
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Workspace Name: Provide the domain name or creator's name used to log in to Databricks.
  • Cluster ID: Provide the cluster ID.
Additional information
  • Access Token: Provide an access token to authenticate with Databricks. In Databricks, authentication flows rely on tokens instead of passwords (see the connectivity sketch below).
  • Transport Mode: Provide the transport mode details. You can provide either a JDBC URL such as jdbc:spark://<address>;<transportMode>;<httpPath>;<Authentication Mechanism>;<UID>;<Password> or the transport mode itself; here, the transport mode is http.
  • SSL: Secure Sockets Layer (SSL) is used to establish secure communication between the client and the server.
  • HTTP Path: Provide the HTTP path used to identify and access the resource on the host.
  • Auth Mechanism: Provide the authentication mechanism to ensure security.
  • UID: Provide a unique identification number to ensure security.
Databricks
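For illustration, the sketch below shows how these Databricks inputs map onto a connection test. It assumes the databricks-sql-connector Python package, which is a tooling choice of this example rather than a LeapLogic requirement; all host, path, token, and database values are placeholders.

    from databricks import sql

    # All values are placeholders; substitute the Workspace Name, HTTP Path,
    # Access Token, and Database Name inputs described above.
    connection = sql.connect(
        server_hostname="<workspace-host>.cloud.databricks.com",
        http_path="<http-path>",
        access_token="<access-token>",
    )
    with connection.cursor() as cursor:
        cursor.execute("SHOW TABLES IN <database_name>")
        for table in cursor.fetchall():
            print(table)
    connection.close()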
Google Cloud BigQuery
  • Database Name: Provide the GCP database name from which to retrieve the metadata.
  • Bucket Name: Provide the Google Cloud bucket name, which can be used temporarily during data migration.
Hive
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Warehouse Directory: Provide the warehouse directory, which acts as the database directory containing the Hive tables.
  • Keytab Path: A keytab (also known as a key table) contains service keys that are mainly used to allow server applications to accept authentications from clients.
  • Client Principal: Provide the client principal. Clients are the users who initiate communication for a service request; the client principal identifies the user for authentication.
  • Service Principal: Provide the identity for the service that runs on the hosts. For example, hive/xxxxxx-xxxxxx.xxxxxx.co.in@IMPETUS.CO.IN.
  • Metastore Principal: Provide the metastore principal, which can remove files and directories within the hive/warehouse directory. For example, hive/xxxxxx-xxxxxx.xxxxxx.co.in@IMPETUS.CO.IN (see the Kerberized connection sketch below).
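As a rough illustration of how the keytab and principals fit together, the sketch below opens a Kerberized Hive connection with the PyHive package. PyHive, the port, and all host and principal values are assumptions for demonstration, not LeapLogic requirements.

    # Activate the keytab first, for example:
    #   kinit -kt /path/to/hive.keytab <client-principal>
    from pyhive import hive

    conn = hive.Connection(
        host="<hive-server-host>",
        port=10000,                    # default HiveServer2 port
        database="<database-name>",    # Database Name input
        auth="KERBEROS",
        kerberos_service_name="hive",  # service part of the Service Principal
    )
    cursor = conn.cursor()
    cursor.execute("SHOW TABLES")
    print(cursor.fetchall())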
Spark
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Keytab Path: A keytab (also known as a key table) contains service keys that are mainly used to allow server applications to accept authentications from clients.
  • Client Principal: Provide the client principal details. Clients are the users who initiate communication for a service request; the client principal identifies the user for authentication.
  • Service Principal: Provide the identity for the service that runs on the hosts.
  • Metastore Principal: Provide the metastore principal, which can remove files and directories within the database/warehouse directory.
  • Spark Service Principal: Provide the Spark service principal used to identify the Spark resources.
DDL Greenplum The data source is created automatically by extracting the data from the uploaded DDL file.
Netezza
Oracle
SQL Server
Teradata
Vertica
ETL AWS Glue
  • IAM Role: Provide the IAM role to ensure security.
  • Region: Refers to the physical location where the data centers are grouped.
  • Temp Directory: Used as a temporary directory for the job.
  • Schema Name: Provide the name of the schema.
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Bucket Name: Provide the bucket name.
  • Access ID: Provide the access ID used to access the AWS console.
  • Secret Key: Provide the secret key used to access the AWS console (see the sketch below).
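For illustration, the sketch below exercises the AWS Glue inputs by listing tables with boto3; boto3 is an assumed tooling choice, and every value shown is a placeholder.

    import boto3

    glue = boto3.client(
        "glue",
        region_name="<region>",                # Region input
        aws_access_key_id="<access-id>",       # Access ID input
        aws_secret_access_key="<secret-key>",  # Secret Key input
    )
    # List the tables of the Glue database named in the Database Name input.
    response = glue.get_tables(DatabaseName="<database-name>")
    print([table["Name"] for table in response["TableList"]])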
File System Amazon S3
  • Schema Name: Provide the name of the schema.
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Bucket Name: Provide the bucket name.
  • Access ID: Provide the access ID used to access the AWS console.
  • Secret Key: Provide the secret key used to access the AWS console (a verification sketch follows).
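Similarly, a short boto3 sketch (an assumed tooling choice; all values are placeholders) can confirm that the bucket and credentials are valid before the data source is created.

    import boto3

    s3 = boto3.client(
        "s3",
        aws_access_key_id="<access-id>",       # Access ID input
        aws_secret_access_key="<secret-key>",  # Secret Key input
    )
    # List a few objects from the bucket named in the Bucket Name input.
    response = s3.list_objects_v2(Bucket="<bucket-name>", MaxKeys=5)
    for obj in response.get("Contents", []):
        print(obj["Key"])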
Azure Data Lake Storage
  • Account Name: Provide the account name.
  • Version: Choose the version: Azure Data Lake Storage Gen1 or Azure Data Lake Storage Gen2.
If you choose Azure Data Lake Storage Gen1:
  • Username: Provide the username.
  • Password: Provide the password.
  • Client ID: Provide the client ID.
  • Credentials: Provide the credentials.
  • Refresh URL: Provide the refresh URL for authentication. To access the application, the refresh URL must be called with valid credentials.
  • Default Endpoints Protocol: The mounting protocol for Azure Data Lake Storage.
  • Account Key: Provide the account key.
If you choose Azure Data Lake Storage Gen2:
  • Container Name: Provide the container name, which organizes a set of blobs.
  • SAS Token: Provide the SAS token.
  • Storage Account Access Key: Provide the access key used to authorize data access (see the sketch below).
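For the Gen2 inputs, a hedged sketch with the azure-storage-blob package (an assumption of this example, not a LeapLogic dependency) shows how the account name, container name, and access key relate; all values are placeholders.

    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        account_url="https://<account-name>.blob.core.windows.net",
        credential="<storage-account-access-key>",  # Storage Account Access Key input
    )
    # Enumerate blobs in the container named in the Container Name input.
    container = service.get_container_client("<container-name>")
    for blob in container.list_blobs():
        print(blob.name)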
DBFS
  • Access Token: Provide the access token to authenticate and access DBFS.
  • Path: Provide the path.
File Transfer Protocol Provide the warehouse directory that contains the data.
Secured File Transfer Protocol
  • Warehouse Directory: Provide the warehouse directory that contains the data/entities for every file.
  • Authentication: Choose the credential-based or key-based authentication method (see the sketch below).
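As an illustration of credential-based access, the sketch below lists the warehouse directory over SFTP using paramiko (an assumed package; all values are placeholders).

    import paramiko

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Credential-based authentication; for key-based, pass key_filename
    # instead of a password.
    client.connect("<host-address>", username="<username>", password="<password>")
    sftp = client.open_sftp()
    print(sftp.listdir("<warehouse-directory>"))  # Warehouse Directory input
    sftp.close()
    client.close()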
Unix File System
If you have specified "localhost" in the Host Address when creating the repository,
  • Warehouse Directory:  Provide warehouse directory which contains data.
If you have specified an IP address in the Host Address when creating the repository,
  • Warehouse Directory:  Provide warehouse directory which contains data.
  • Authentication: Choose Credential based or Key based authentication method.
Google Cloud Storage Provide the bucket name.
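For illustration, the bucket can be checked with the google-cloud-storage package (an assumed tooling choice; the bucket name is a placeholder and Application Default Credentials are assumed):

    from google.cloud import storage

    # Uses Application Default Credentials; the bucket name is a placeholder.
    client = storage.Client()
    for blob in client.list_blobs("<bucket-name>", max_results=5):
        print(blob.name)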
HDFS
  • Warehouse Directory: Provide the warehouse directory that contains the data/entities for every file.
  • Keytab Path: A keytab (also known as a key table) contains service keys that are mainly used to allow server applications to accept authentications from clients.
  • Client Principal: Provide the client principal details.
MPP Netezza Provide the Database Name from which you want to fetch the entities.
Teradata
RDBMS Azure Synapse
  • Schema Name: Provide the name of the schema.
  • Database Name: Provide the database name from which you want to fetch the entities.
Greenplum
Oracle
Redshift
Vertica
SQL Server
Snowflake
  • Schema Name: Provide the name of the schema.
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Warehouse Name: Provide the warehouse name (see the connection sketch below).
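For illustration, the sketch below shows how the schema, database, and warehouse inputs combine in a connection built with snowflake-connector-python (an assumed package; all values are placeholders).

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="<account-identifier>",
        user="<username>",
        password="<password>",
        warehouse="<warehouse-name>",  # Warehouse Name input
        database="<database-name>",    # Database Name input
        schema="<schema-name>",        # Schema Name input
    )
    cursor = conn.cursor()
    cursor.execute("SELECT CURRENT_WAREHOUSE(), CURRENT_DATABASE(), CURRENT_SCHEMA()")
    print(cursor.fetchone())
    conn.close()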
Aurora PostgreSQL Provide the schema and database name.
Others
  • Driver Name: Provide the name of the driver.
  • Driver Class Name: Provide the driver class name.
  • Connection URI: Provide the connection URI.
  • Schema Name: Provide the name of the schema.
  • Database Name: Provide the database name from which you want to fetch the entities (see the sketch below).
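Because this option takes a raw driver and connection URI, a generic test can be sketched with the JayDeBeApi package (an assumption of this example; it needs a local JVM and the driver jar, and all values are placeholders).

    import jaydebeapi

    conn = jaydebeapi.connect(
        "<driver.class.Name>",         # Driver Class Name input
        "<connection-uri>",            # Connection URI input
        ["<username>", "<password>"],
        "/path/to/<driver>.jar",       # jar for the driver named in Driver Name
    )
    cursor = conn.cursor()
    cursor.execute("SELECT 1")         # use a probe query valid for your database
    print(cursor.fetchall())
    conn.close()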
Cloud SQL for Postgres
  • Database Name: Provide the database name from which you want to fetch the entities.
  • Auth Type: Select the authentication type: Password or IAM.
Business Intelligence Power BI No inputs are required.
Amazon QuickSight
  • AWS Account Id: Provide the AWS account ID.
  • Region: Refers to the physical location where the data centers are grouped.
  • Access Key: Provide the access key used to access Amazon QuickSight.
  • Secret Key: Provide the secret key used to access Amazon QuickSight.
  • Principal Username: Provide the username (see the sketch below).
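For illustration, the boto3 sketch below (assumed tooling; placeholder values) exercises the account ID, region, and key inputs by listing dashboards.

    import boto3

    quicksight = boto3.client(
        "quicksight",
        region_name="<region>",                # Region input
        aws_access_key_id="<access-key>",      # Access Key input
        aws_secret_access_key="<secret-key>",  # Secret Key input
    )
    # List dashboards in the account named in the AWS Account Id input.
    response = quicksight.list_dashboards(AwsAccountId="<aws-account-id>")
    for dashboard in response.get("DashboardSummaryList", []):
        print(dashboard["Name"])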
Looker
  • Looker EndPoint: Provide the URL to connect with the required Looker endpoint.
  • Client ID: Provide the client ID to access the Looker endpoint.
  • Client Secret: Provide the client secret to access the Looker endpoint (see the sketch below).
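For illustration, the client ID and client secret can be exchanged for an API token against the Looker endpoint; the sketch below uses the requests package, and the endpoint URL and credentials are placeholders.

    import requests

    # POST the Client ID and Client Secret to Looker's login endpoint to
    # obtain a short-lived API token.
    response = requests.post(
        "https://<looker-endpoint>/api/4.0/login",  # Looker EndPoint input
        data={"client_id": "<client-id>", "client_secret": "<client-secret>"},
        timeout=30,
    )
    response.raise_for_status()
    print(response.json()["access_token"])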
Version Control System Git
  • Endpoint: Provide the URL to connect with the required Git repository.
  • Client ID: Provide the client ID to access the endpoint.
  • Client Secret: Provide the client secret to access the endpoint.
  3. If the created repository is Teradata, in Database Name, enter the database name from which you want to fetch the entities.
  4. In Username and Password, provide a valid username and password to connect to the required database.
  5. In Connection Info, provide connection details, such as the input sources (for example, offline query logs or live query logs).
  6. Click Test to verify whether you can connect to the data source. When the tool connects to the schema/database successfully, a success icon is displayed; if not, an error icon is displayed.
  7. To facilitate importing and synchronizing the referred schema/database, turn on the Import Data Source toggle.
  8. In Tags and Description, provide tags and a description, respectively.
  9. In Email, provide your email address to receive system-generated notifications about updates made to the data source.
  10. Click the create button at the top right of the Data Source page to create the data source. When the data source is created successfully, the system displays a snackbar notification confirming the success.

To learn more, contact our support team or write to: info@leaplogic.io
