Salesforce Ingestion Source

Salesforce is a customer relationship management (CRM) application built on the Force.com platform. It manages an organization's customer interactions across channels such as phone calls, site email inquiries, communities, and social media.

Once configured, the Salesforce source channel reads data from a valid Salesforce account. It does so by reading the Salesforce objects specified in a Salesforce Object Query Language (SOQL) query.

Data Source Configuration

Fetch From Source/Upload Data File

To read data, you can either fetch it directly from the Salesforce source by providing the data source connection details, or upload a sample data file in one of the supported formats to view the schema details while designing the ingestion application.

If Upload Data File is selected to fetch sample data, provide the following details.

File Format: Select the sample file format (file type) depending on the data type.

Gathr supports the CSV, JSON, TEXT, Parquet, and ORC file formats for the Salesforce data source.

For CSV file format, select its corresponding delimiter.

Header Included: Enable this option to read the first row as a header if your Salesforce data is in CSV format.

Upload: Upload the sample file in the file format selected above.

👉 Make sure that the file size does not exceed 10 MB.
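
A sample file can be checked locally against these limits before it is uploaded. The following Python sketch is not part of Gathr: the helper name check_sample_file, the extension-to-format mapping, and the reading of 10 MB as 10 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
import os

# Assumed mapping from file extension to the supported formats listed above.
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".json": "JSON",
    ".txt": "TEXT",
    ".parquet": "Parquet",
    ".orc": "ORC",
}
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB


def check_sample_file(path):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_SIZE_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_SIZE_BYTES}-byte limit")
    return problems
```

Calling check_sample_file on a small .csv file would return an empty list, while a file with an unlisted extension would return one message describing the problem.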

If Fetch From Source is selected, continue configuring the data source.

Connection Name: Connections are the service identifiers. Select a connection name from the list if you have already created and saved connection details for Salesforce. Otherwise, create one as explained in the topic Salesforce Connection →

Use the Test Connection option to ensure that the connection with the Salesforce channel is established successfully.

A success message confirms that the connection is available. If the connection test fails, edit the connection to resolve the issue before proceeding.

Table Name: Select the source table whose metadata you want to view.

Add Configuration: Additional properties can be added using this option as key-value pairs.

More Configurations

Query: The Salesforce Object Query Language (SOQL) query used to search your organization’s Salesforce data for specific information. SOQL is similar to the SELECT statement in the widely used Structured Query Language (SQL) but is designed specifically for Salesforce data. A query is mandatory for reading objects such as Opportunity.

👉 SOQL does not support the ‘*’ wildcard, so name each field to be selected explicitly.
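
Because ‘*’ is not available, every field has to be named in the SELECT clause. The following Python sketch shows one way to assemble such a query string. build_soql is a hypothetical helper, not a Gathr or Salesforce function, and the Opportunity fields are used only as an example.

```python
import re

# Simple check for object and field names: letters, digits, underscores,
# and dots (dots allow relationship fields such as Account.Name).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_.]*$")


def build_soql(sobject, fields, where=None):
    """Assemble a SOQL SELECT statement with an explicit field list."""
    if not fields:
        raise ValueError("SOQL needs at least one field; '*' is not supported")
    for name in [sobject, *fields]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    query = f"SELECT {', '.join(fields)} FROM {sobject}"
    if where:
        query += f" WHERE {where}"
    return query


query = build_soql(
    "Opportunity",
    ["Id", "Name", "StageName", "Amount"],
    where="Amount > 10000",
)
print(query)
```

The resulting string can be pasted into the Query field. Passing ‘*’ as a field raises a ValueError in this sketch, which mirrors the restriction noted above.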

Infer Schema: (Optional) Infers the schema from the query results. Sample rows are used to determine the data type of each field specified in the SOQL query. This works only when the query returns five or more records.

Date Format: A string, following the java.text.SimpleDateFormat pattern syntax, that specifies the format to use when reading timestamps. It applies to the Timestamp type. The default is null, in which case timestamps are parsed with java.sql.Timestamp.valueOf().
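
To judge whether a Date Format is needed, it helps to know what the default parser accepts. java.sql.Timestamp.valueOf() documents the JDBC timestamp escape format, yyyy-[m]m-[d]d hh:mm:ss[.f...]. The Python sketch below approximates that check with a regular expression. The helper name and the sample values are illustrative assumptions, and the regular expression is not an exact re-implementation of the Java method.

```python
import re

# Approximation of the JDBC timestamp escape format:
# yyyy-[m]m-[d]d hh:mm:ss[.f...] with up to nine fractional digits.
_JDBC_TIMESTAMP = re.compile(
    r"^\d{4}-\d{1,2}-\d{1,2} \d{2}:\d{2}:\d{2}(\.\d{1,9})?$"
)


def default_parser_would_accept(value):
    """Return True if value looks like the format Timestamp.valueOf() documents."""
    return bool(_JDBC_TIMESTAMP.match(value))


# A space-separated value in the JDBC escape format.
print(default_parser_would_accept("2023-01-15 10:30:00.0"))
# An ISO 8601-style value (assumed here purely for illustration).
print(default_parser_would_accept("2023-01-15T10:30:00.000Z"))
```

For a value in the second form, a SimpleDateFormat pattern such as yyyy-MM-dd'T'HH:mm:ss.SSSX would be one candidate for the Date Format field. Verify any pattern against the values your query actually returns.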

Bulk: (Optional) Flag to enable bulk query, the preferred method for loading large sets of data. The Bulk API is based on REST principles and lets you query many records asynchronously by submitting batches, which Salesforce processes in the background. Default value is false.

Salesforce Object: (Conditional) Salesforce objects are database tables that store data specific to your organization. This parameter is mandatory when Bulk is true, and it must match the object specified in the SOQL query.
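
The requirement that the Salesforce Object agrees with the query can be checked before the application is run. The Python sketch below extracts the object name from the first FROM clause and compares it with the configured value. Both helper names are hypothetical, the comparison ignores case as an assumption, and the regular expression only handles simple queries without subqueries in the SELECT list.

```python
import re

# Captures the object name that follows the first FROM keyword (case-insensitive).
_FROM_CLAUSE = re.compile(r"\bFROM\s+([A-Za-z][A-Za-z0-9_]*)", re.IGNORECASE)


def soql_object(query):
    """Return the object named in the first FROM clause, or None if none is found."""
    match = _FROM_CLAUSE.search(query)
    return match.group(1) if match else None


def object_matches_query(salesforce_object, query):
    """Compare the configured object with the one in the query, ignoring case."""
    found = soql_object(query)
    return found is not None and found.lower() == salesforce_object.lower()


query = "SELECT Id, Name FROM Opportunity WHERE Amount > 0"
print(soql_object(query))
print(object_matches_query("Account", query))
```

In this example the query reads from Opportunity, so a Salesforce Object value of Account would be flagged as a mismatch.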

Pk Chunking: (Optional) Flag to enable automatic primary key chunking for a bulk query job.

This splits a bulk query into separate batches of the size defined by the chunkSize option, which makes large queries manageable when using the Bulk API. By default, it is false.

Pk stands for Primary Key, that is, the object’s record ID, which is always indexed. This feature is supported for all custom objects, many standard objects, and their sharing tables.

Chunk size: The number of records to include in each batch. Default value is 100,000 and the maximum is 250,000. This option can only be used when Pk Chunking is true.
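
The rules that tie Bulk, Salesforce Object, Pk Chunking, and Chunk size together can be summarized in a small validation routine. This is a Python sketch under stated assumptions: the function and parameter names are hypothetical and do not correspond to Gathr configuration keys, and the lower bound of 1 for the chunk size is assumed because the text only states the maximum.

```python
DEFAULT_CHUNK_SIZE = 100_000  # default stated above
MAX_CHUNK_SIZE = 250_000      # maximum stated above


def validate_bulk_options(bulk=False, salesforce_object=None,
                          pk_chunking=False, chunk_size=None):
    """Return a list of rule violations for the bulk-related options."""
    errors = []
    if bulk and not salesforce_object:
        errors.append("Salesforce Object is mandatory when Bulk is true")
    if chunk_size is not None and not pk_chunking:
        errors.append("Chunk size can only be used when Pk Chunking is true")
    # Lower bound of 1 is an assumption; only the maximum is documented.
    if chunk_size is not None and not 1 <= chunk_size <= MAX_CHUNK_SIZE:
        errors.append(f"Chunk size must be between 1 and {MAX_CHUNK_SIZE}")
    return errors


def effective_chunk_size(pk_chunking, chunk_size=None):
    """Chunk size that applies: the given value, the default, or None without Pk Chunking."""
    if not pk_chunking:
        return None
    return chunk_size if chunk_size is not None else DEFAULT_CHUNK_SIZE


print(validate_bulk_options(bulk=True, salesforce_object="Opportunity",
                            pk_chunking=True, chunk_size=300_000))
```

An empty list means the combination is consistent with the rules described in this section. The call shown passes a chunk size above the maximum, so it returns a single violation.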

Schema

Check the populated schema details. For more details, see Schema Preview →

Advanced Configuration

Optionally, you can enable incremental read. For more details, see Salesforce Incremental Configuration →
