HTTPV2 Ingestion Source

The HTTPV2 data source can fetch data based on requests sent to HTTP and HTTPS resources via Gathr.

Information that will be useful while configuring the data source:

Type of the source data (for example, CSV, JSON, Text, or XML)
Resource URI to access the source data.
Authentication type of the resource and the details Gathr needs to access the data (for example, if Auth Type is Basic, a Username and Password are needed to access the resource).

The HTTPV2 data source configuration is equipped to handle SSL-enabled resources.

In addition to the features that the HTTP Data Source supports, the HTTPV2 channel supports the XML Data Type and enhanced authentication methods, can fetch paginated responses from the HTTP data resource, and can read the data incrementally.

Data Source Configuration

Configure the data source by providing request parameters that are explained below.

Design Application Using

Select the method to design the application.

Sample Data: Fetch sample data from the data source. You can provide connection details in the next section.
Upload Sample File: Upload a file with sample records.

💡

- If Source Data is selected, the File Format field will appear at the end of the configuration.
- If Upload Sample File is chosen, the File Format field will be the next field to configure in the application design.

File Format

Select the File Format matching the format of data in the HTTP resource to be requested.

Gathr supports CSV, JSON, TEXT, and XML file formats for HTTPV2 data sources.

For CSV file format, select its corresponding delimiter.
Header Included: Specify if the first row should be considered a header row while reading the sample file.
For JSON file format, there will be an additional field. To know more, see Path to Data.
For XML file format, there will be an additional field. To know more, see XML XPath.

Upload

Please upload the sample data file as per the file format selected. The schema of data items in the sample file should be the same as the HTTP resource that is to be read.

👉

Make sure that the file size does not exceed 10 MB.

If Fetch From Source is selected, continue configuring the data source.

Request Method

Select the HTTP request method used to fetch data from the source: GET or POST.

The method selected determines how the configured parameters in the data source will be submitted to the HTTP resource.

GET: Use the GET method if form data is included in the URI, appended as query string parameters.
POST: Use the POST method if the form data is to be given in the request body of the HTTPV2 data source.

Design Time Attributes (Optional)

Use this option to limit the volume of data fetched from a source during the application design time. Here, the main objective is to see the schema details of the incoming source data.

Same as Runtime Attributes: Select this checkbox to keep the design time attributes the same as the URI of the request method.

Path to Data (For JSON Data Type)

👉

Path to Data field will appear only when the type of source data is specified as JSON in the File Format option.

Path to Data is a JSON path expression that points to an array or an object within the JSON data.

To derive a JSON path expression, follow the hierarchy and keys within the JSON data:

Start at the root: Use $ to signify the root of the JSON structure.
Navigate through objects: Traverse the nested objects using dot notation (.).
Access arrays (If Required): If you encounter arrays, use square brackets [] with an appropriate index, or the wildcard [*], to access array elements.

Example 1

Sample JSON data:

{
  "data": {
    "userlist": [
      {"name": "john", "age": 33, "department": "ICU"},
      {"name": "Mike", "age": 28, "department": "Oncology"},
      {"name": "Den", "age": 30, "department": "Medicine"}
    ]
  },
  "metadata": {
    "has_more_records": false
  }
}

Given this JSON structure, let’s derive the path to the userlist array:

Root: $
Navigate to data object: .data
Access userlist array: .userlist

Therefore, the path to data is: $.data.userlist

The output corresponding to this path would be:

[
  {"name": "john", "age": 33, "department": "ICU"},
  {"name": "Mike", "age": 28, "department": "Oncology"},
  {"name": "Den", "age": 30, "department": "Medicine"}
]

This output represents the array of user objects within the userlist key of the data object.

Example 2

Consider fetching the schema details of a source from a sample JSON file.

If you provide the Path to Data field value as $.data, then only the attributes of the data element, along with their values, will be fetched from the source.

If you keep the Path to Data field value as $ (the default value), then the entire JSON file will be fetched as per the data source configuration.

Example 3

Sample JSON data:

{
  "books": [
    {
      "book": {
        "title": "The Great Gatsby",
        "author": {
          "name": "F. Scott Fitzgerald",
          "birth_year": 1896
        },
        "genre": "Classic Fiction"
      }
    },
    {
      "book": {
        "title": "To Kill a Mockingbird",
        "author": {
          "name": "Harper Lee",
          "birth_year": 1926
        },
        "genre": "Southern Gothic"
      }
    },
    {
      "book": {
        "title": "1984",
        "author": {
          "name": "George Orwell",
          "birth_year": 1903
        },
        "genre": "Dystopian Fiction"
      }
    }
  ]
}

We provide the path to data as: $.books[*].book

This JSON path expression starts at the root of the JSON structure, goes to the books array, and, for each element in the array, extracts the book object.

The output corresponding to this path would be:

[
  {
    "title": "The Great Gatsby",
    "author": {
      "name": "F. Scott Fitzgerald",
      "birth_year": 1896
    },
    "genre": "Classic Fiction"
  },
  {
    "title": "To Kill a Mockingbird",
    "author": {
      "name": "Harper Lee",
      "birth_year": 1926
    },
    "genre": "Southern Gothic"
  },
  {
    "title": "1984",
    "author": {
      "name": "George Orwell",
      "birth_year": 1903
    },
    "genre": "Dystopian Fiction"
  }
]
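
The derivation steps used in the examples above can be sketched in a few lines of Python. This is only an illustration of how a path such as $.data.userlist or $.books[*].book selects data; it is not how Gathr evaluates paths internally. The function name evaluate_path is hypothetical, and only the $, .key, [index], and [*] forms are handled.

```python
import re

def evaluate_path(document, path):
    """Evaluate a small JSON path subset: $, .key, [index], and [*]."""
    if not path.startswith("$"):
        raise ValueError("path must start at the root ($)")
    # Each token is either a .key step or an [index] / [*] step.
    tokens = re.findall(r"\.([^.\[\]]+)|\[(\*|\d+)\]", path[1:])
    current = [document]   # nodes matched so far
    fanned_out = False     # becomes True once a wildcard has been applied
    for key, index in tokens:
        if key:
            current = [node[key] for node in current]
        elif index == "*":
            current = [item for node in current for item in node]
            fanned_out = True
        else:
            current = [node[int(index)] for node in current]
    # A wildcard yields a list of matches; otherwise return the single match.
    return current if fanned_out else current[0]

library = {
    "books": [
        {"book": {"title": "The Great Gatsby", "genre": "Classic Fiction"}},
        {"book": {"title": "1984", "genre": "Dystopian Fiction"}},
    ]
}
titles = [book["title"] for book in evaluate_path(library, "$.books[*].book")]
print(titles)
```

With the default path $, the function returns the whole document unchanged, which mirrors the default behavior described in Example 2.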

XML XPath (For XML Data Type)

👉

The XML XPath field will appear only when the type of source data is specified as XML in the File Format option.

XML XPath is an XML path expression that points to the repeating elements (row tags) in the XML data.

Define the XML node path to retrieve data, guiding the application to locate and extract relevant information within the XML structure.

💡

Start the XML node path with / only if a sample file is uploaded to design the application.

Example 1: XML with Row Tag book:

For the path “/information/data/books/book” in the XML extract below, specifying the value as book will retrieve the data within the “book” element:

<information>
  <data>
    <books>
      <book>
        <name>"poetry"</name>
        <price>100</price>
      </book>
      <book>
        <name>"science"</name>
        <price>200</price>
      </book>
    </books>
  </data>
</information>

Example 2: XML with Row Tag student and attributes:

For the path “/class/student” in the XML extract below, specifying the value as student will retrieve the data within the “student” element:

<class>
  <student id="1">
    <name>Alice</name>
    <grade>A</grade>
  </student>
  <student id="2">
    <name>Bob</name>
    <grade>B</grade>
  </student>
</class>
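
To make the row tag idea concrete, the sketch below reads the rows that a path such as /class/student points to, using Python's standard xml.etree.ElementTree module. It is an illustration only and makes no claim about how Gathr parses XML; the helper name rows_at is hypothetical, and attributes and child elements are simply merged into one record per row.

```python
import xml.etree.ElementTree as ET

def rows_at(xml_text, path):
    """Return one dict per element found at an absolute path like /class/student."""
    root = ET.fromstring(xml_text)
    parts = [part for part in path.split("/") if part]
    # The first path segment must match the document's root element.
    if not parts or parts[0] != root.tag:
        return []
    elements = [root] if len(parts) == 1 else root.findall("/".join(parts[1:]))
    rows = []
    for element in elements:
        row = dict(element.attrib)  # attributes such as id="1"
        for child in element:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows

xml_text = """
<class>
  <student id="1"><name>Alice</name><grade>A</grade></student>
  <student id="2"><name>Bob</name><grade>B</grade></student>
</class>
"""
students = rows_at(xml_text, "/class/student")
print(students)
```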

JSON Path Evaluator

Use the JSON Path Evaluator as follows:

Provide a sample JSON response.
Navigate to the specific key in the JSON and click it; the path will appear in the search bar.
You can also write the path directly in the search bar. Click the EVALUATE button to evaluate the path. If the path and the provided JSON are correct, the evaluated data will appear in the Evaluation data section.
Optionally, enable the toggle button to generate the path using Gathr natural language.

Parameters

You can specify path and query parameters for a request using the URL box or the Parameters tab.

Query Parameters

Query parameters are added after the ? in the URL, like this: ?id=1&type=new.

To specify a query parameter, add it directly to the URL or select the Parameters tab and enter the name and value.

When you enter your query parameters in either the URL or the Parameters tab, these values will update everywhere they’re used in the HTTPV2 connector.
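
As a small illustration of the query string format described above, the following Python sketch builds and then parses a URL with the standard urllib.parse module. The base URL is a placeholder, not a real Gathr or customer endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

base_url = "https://api.example.com/items"  # hypothetical resource URI
params = {"id": 1, "type": "new"}

# Query parameters are appended after the ? and separated by &.
url = f"{base_url}?{urlencode(params)}"
print(url)

# Reading the parameters back out of the URL gives the same names and values.
parsed = dict(parse_qsl(urlsplit(url).query))
print(parsed)
```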

Design Time Parameters

The parameters provided here will only be used during application design time.

Authentication

Specify the authorization type that should be used to authenticate with the HTTP resource.

The supported authentication types are: None, Basic, Token, OAuth2, OAuth2 Client Credentials, and Custom Token. Each type is explained below in detail.

Auth Type - None

Choose this option to access an HTTP resource without needing any authentication.

Auth Type - Basic

If Auth Type is selected as Basic, proceed after providing the below parameters:

Username: Enter the user name for accessing the HTTP resource.

Password: Enter the password for accessing the HTTP resource.
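
For reference, HTTP Basic authentication sends the user name and password as a Base64-encoded Authorization header. The sketch below shows that standard encoding in Python; the helper name and the credentials are placeholders, and whether Gathr builds the header in exactly this way is an assumption.

```python
import base64

def basic_auth_header(username, password):
    """Build the standard HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}

headers = basic_auth_header("user", "pass")  # placeholder credentials
print(headers)
```

Because Base64 is an encoding and not encryption, Basic authentication is normally used only over HTTPS.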

Auth Type - Token Based

💡

Token-based authentication is a security technique that authenticates the users who attempt to log in to a server, a network, or any other secure system, using a security token provided by the server.

If Auth Type is selected as Token, proceed after providing the below parameters:

Token ID: The key with which the token is referred in the request.

Token: Token to access the HTTP resource.

Auth Type - OAuth2

💡

OAuth2 is an authorization technique in which the application gets a token that authorizes access to the user's account.

If Auth Type is selected as OAuth2, proceed after providing the below parameters:

Auth Headers: The headers associated with Auth URL should be provided as key-value pairs, through which the authorization code is generated.

Client ID: The client identifier given during the application registration process should be provided.

Secret Key: The secret key given to the client during the application registration process should be provided.

Auth URL: The endpoint of the authorization server that retrieves the authorization code should be provided.

Auth Type - OAuth2 (Client Credential)

💡

The grant_type is client_credentials in the case of OAuth2 (Client Credential). You can also pass parameters to the authentication URL using this authentication method.

If Auth Type is selected as OAuth2 (Client Credential), proceed after providing the below parameters:

Auth Headers: Header’s parameter name and value can be provided.

Auth Params: Auth parameter name and value can be provided.

Client ID: The client identifier that is given to the client during the application registration process should be provided.

Secret Key: The secret key that is given to the client during the application registration process should be provided.

Auth URL: The endpoint of the authorization server that retrieves the authorization code.

Use Token: Use generated token in the URL parameter or header of the request.
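
The fields above map naturally onto a standard OAuth2 client credentials token request. The Python sketch below only builds such a request (it does not send it) so the pieces are visible. The URL, client values, and the scope parameter are placeholders, and sending the client ID and secret as form fields is an assumption; some authorization servers expect them in a Basic Authorization header instead.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_token_request(auth_url, client_id, secret_key, auth_params=None, auth_headers=None):
    """Build (but do not send) a client credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": secret_key,
    }
    form.update(auth_params or {})    # extra Auth Params, if any
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    headers.update(auth_headers or {})  # extra Auth Headers, if any
    body = urlencode(form).encode("ascii")
    return Request(auth_url, data=body, headers=headers, method="POST")

token_request = build_token_request(
    "https://auth.example.com/oauth/token",  # hypothetical Auth URL
    client_id="my-client",
    secret_key="s3cret",
    auth_params={"scope": "read"},
)
print(token_request.get_method(), token_request.data.decode("ascii"))
```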

Auth Type - Custom Token

💡

Use this auth method to first get a token from a custom API; the token can then be used as an authentication key in the main data API to fetch the data.

👉

Make sure that the response content type of the custom token request is JSON.

If Auth Type is selected as Custom Token, a modal window Token Generation HTTP Configuration will appear. Proceed after providing the below parameters:

URI: HTTP or HTTPS URI to send a request to a resource.

Request Method: HTTP request method for the URI to be selected out of GET or POST.

Request Body: Request body to send a data payload to an HTTP resource in the body of the request.

👉

The Request Body field will appear only for POST requests.

Header: Header’s parameter name and value.

Example:

The main data API needs a token in the header, with the key as Authorization and the value as BEARER ${token}.

${token} can be used in the value of a header key; it will be replaced by the token generated from the custom token API.

Path to Token: JSON path expression that points to tokens.

Auth Type: Used to specify the authorization type associated with the URL. The supported auth types for token generation are None, Basic, and Token.

None: This option specifies that the URL can be accessed without any authentication.
Basic: This option specifies that accessing the URL requires Basic Authorization. Provide a user name and password for accessing the URL.
Token: Token-based authentication is a security technique that authenticates the users who attempt to log in to a server, a network, or other secure systems, using a security token provided by the server.
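
The custom token flow described above (read a token from a JSON response using Path to Token, then substitute it for ${token} in a header value) can be sketched as follows. This is an illustration under stated assumptions, not Gathr's implementation: the response field access_token, the helper names, and the simple dotted-path handling are all hypothetical.

```python
import json
from string import Template

def token_from_response(response_text, path_to_token):
    """Follow a dotted path such as $.access_token through a JSON response."""
    node = json.loads(response_text)
    for key in path_to_token.lstrip("$").strip(".").split("."):
        if key:
            node = node[key]
    return node

def fill_headers(header_templates, token):
    """Replace ${token} in every header value with the generated token."""
    return {
        name: Template(value).safe_substitute(token=token)
        for name, value in header_templates.items()
    }

# Hypothetical JSON response from the custom token API.
response_text = '{"access_token": "abc123", "expires_in": 3600}'
token = token_from_response(response_text, "$.access_token")
data_api_headers = fill_headers({"Authorization": "BEARER ${token}"}, token)
print(data_api_headers)
```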

Headers

Use the header field(s) to send additional information with the request to the HTTP resource via the HTTPV2 data source.

💡

HTTP headers have a key-value pair format.

Body

In this tab, you can provide the request body for both runtime and design time. This option allows you to transmit a data payload to an HTTP resource within the body of the request.

Request Body

👉

The Request Body field will appear only for POST requests.

Using the request body option, a data payload can be sent to an HTTP resource in the body of the request.

Example

{
  "from":1,
  "size":10,
  "query":{"match_all":{}}
}

Design Time Request Body

This parameter is applicable for POST request methods. The request body provided here will only be used during application design time.

Example

{
  "from":1,
  "size":5,
  "query":{"match_all":{}}
}

Settings

Advanced settings can be configured using this tab.

Encode URI (Optional)

Select this option to encode the URI.

All the characters in URI will be converted into a format that can be transmitted by the HTTPV2 channel.

Example: Parameters in a request body with {} need encoding for the request to pass.
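
To show what encoding does to characters such as { and }, the sketch below percent-encodes a sample query string with Python's urllib.parse.quote. The parameter name and value are made up, and the exact set of characters Gathr encodes is not stated in this document, so treat the output as an example of percent-encoding in general.

```python
from urllib.parse import quote

# A query string whose value contains characters that are not URI-safe.
raw_query = 'filter={"status":"new"}'

# Keep = and & as separators; percent-encode everything else that is unsafe.
encoded_query = quote(raw_query, safe="=&")
print(encoded_query)
```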

SSL Configuration

The SSL Configuration page provides essential settings for managing Secure Sockets Layer (SSL) connections within the HTTP data source.

Enable SSL

It is set to False by default.

Set this option to True, if the resource that is to be requested using the HTTP data source is SSL-enabled.

If set to True, choose how the SSL-enabled HTTP resource should be verified.

Either a keystore file or a certificate file needs to be uploaded based on the chosen verification method.

The Keystore Password or Certificate Alias should then be provided as per the type of file uploaded for verification.

Retry Configuration

Define how many times the URL should be attempted in case of failure and specify the interval between retry attempts.

Retry Count

The number of times the URL is run again if a request to it fails.

Retry Delay

A retry delay interval (in seconds) should be provided.
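
The interaction between Retry Count and Retry Delay can be sketched as a simple loop. This is a minimal illustration, assuming that the retry count means the number of additional attempts after the first failure and that the delay is fixed between attempts; the document does not specify either detail, and the function names are hypothetical.

```python
import time

def call_with_retries(send_request, retry_count, retry_delay_sec, sleep=time.sleep):
    """Call send_request, retrying up to retry_count times after a failure."""
    attempts = 0
    while True:
        try:
            return send_request()
        except OSError:  # urllib's URLError and socket timeouts derive from OSError
            if attempts >= retry_count:
                raise    # retries exhausted: surface the last failure
            attempts += 1
            sleep(retry_delay_sec)

# Example: a request that fails twice before succeeding.
calls = []
def flaky_request():
    calls.append(1)
    if len(calls) < 3:
        raise OSError("temporary failure")
    return "ok"

delays = []
result = call_with_retries(flaky_request, retry_count=3, retry_delay_sec=2, sleep=delays.append)
print(result, len(calls), delays)
```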

TimeOut Configuration

Set timeout values for managing server connections and data retrieval processes. Specify the maximum time allowed for accessing a server and generating output. Additionally, set the maximum time permitted for establishing a connection.

Read Timeout (sec)

If the server takes a long time to generate output after the URL has accessed it, you can specify a read timeout (in seconds).

A timeout value of zero is interpreted as an infinite timeout. A negative value is interpreted as undefined (system default). Default value is 5.

Connection Timeout (sec)

If a connection cannot be established in certain scenarios, you can specify a connection timeout (in seconds).

A timeout value of zero is interpreted as an infinite timeout.

A negative value is interpreted as undefined (system default).

Default value is 5.

Add Configuration: Additional properties can be added using this option as key-value pairs.

Schema

Check the populated schema details. For more details, see Schema Preview →

Pagination

To know more about the pagination options in HTTPV2, see HTTPV2 Pagination →

Advanced Configuration

Optionally, you can enable incremental read. For more details, see HTTPV2 Incremental Configuration →
