HTTPV2 Ingestion Source

The HTTPV2 data source can fetch data based on requests sent to HTTP and HTTPS resources via Gathr.

Information that will be useful while configuring the data source:

  • Type of the source data (for example, CSV, JSON, Text, or XML)

  • Resource URI to access the source data.

  • Authentication type of the resource and the details that Gathr needs to access the data (for example, if Auth Type is Basic, a Username and Password are needed to access the resource)

The HTTPV2 data source configuration is equipped to handle SSL-enabled resources.

In addition to the features that the HTTP data source supports, the HTTPV2 channel supports the XML data type and enhanced authentication methods, can fetch paginated responses from the HTTP resource, and can read data incrementally.


Data Source Configuration

Configure the data source by providing request parameters that are explained below.

Design Application Using

Select the method to design the application.

  • Source Data: Fetch sample data from the data source. You can provide connection details in the next section.

  • Upload Sample File: Upload a file with sample records.

File Format

Select the File Format matching the format of data in the HTTP resource to be requested.

Gathr supports CSV, JSON, TEXT, and XML file formats for HTTPV2 data sources.

  • For CSV file format, select its corresponding delimiter.

  • Header Included: Specify if the first row should be considered a header row while reading the sample file.

  • For JSON file format, there will be an additional field. To know more, see Path to Data.

  • For XML file format, there will be an additional field. To know more, see XML Path.

Upload

Upload a sample data file that matches the selected file format. The schema of the data items in the sample file should be the same as that of the HTTP resource to be read.

If Fetch From Source is selected, continue configuring the data source.


Request Method

Select the HTTP request method, GET or POST, used to fetch data from the source.

The method selected determines how the configured parameters in the data source will be submitted to the HTTP resource.

  • GET: Use the GET method if form data is included in the URI, appended as query string parameters.

  • POST: Use the POST method if the form data is to be given in the request body of the HTTPV2 data source.
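
The difference between the two methods can be sketched in Python as follows. This is only an illustration of where the parameters travel, not a description of how Gathr builds its requests; the endpoint and the parameter names are hypothetical, and no request is actually sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and parameters, used only for illustration.
BASE_URI = "https://example.com/api/records"
params = {"page": 1, "size": 10}

# GET: the parameters travel in the URI as a query string.
get_request = Request(f"{BASE_URI}?{urlencode(params)}", method="GET")

# POST: the parameters travel in the request body instead.
post_request = Request(
    BASE_URI,
    data=json.dumps(params).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```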


URI

The HTTP or HTTPS URI should be provided to send requests to a resource via the HTTPV2 data source.

The URI should contain all the details that the HTTPV2 data source will need to find the API resource it is requesting.

Example: https://<hostName>:<portNumber>/path/_queryString


Design Time Attributes (Optional)

Use this option to limit the volume of data fetched from a source during the application design time. Here, the main objective is to see the schema details of the incoming source data.

Based on the request method, the Design Time Attributes can be provided by selecting URI, Body, or a combination of both.

  • Same as Runtime Attributes: Select this checkbox to keep the design time attributes the same as the runtime attributes (the URI of the request).

  • Copy Runtime Attributes: Option to copy the Runtime Attributes into the Design Time Request URI field.


Design Time Request URI

The request URI provided here will only be used during application design time.

Example: https://example.com?page=1&date=12-01-1999


Path to Data (For JSON Data Type)

Path to Data is a JSON path expression that points to an array or an object within the JSON data.

To derive a JSON path expression, follow the hierarchy and keys within the JSON data:

  1. Start at the root: Use $ to signify the root of the JSON structure.

  2. Navigate through objects: Traverse through the nested objects using dot notation (.).

  3. Access arrays (If Required): If you encounter arrays, use square brackets [] with appropriate index or wildcard [*] to access array elements.

Example 1

Sample JSON data:

{
	"data": {
		"userlist": [
			{"name": "john", "age": 33, "department": "ICU"},
			{"name": "Mike", "age": 28, "department": "Oncology"},
			{"name": "Den", "age": 30, "department": "Medicine"}
		]
	},
	"metadata": {
		"has_more_records": false
	}
}

Given this JSON structure, let’s derive the path to the userlist array:

  • Root: $
  • Navigate to data object: .data
  • Access userlist array: .userlist

Therefore, the path to data is: $.data.userlist

The output corresponding to this path would be:

[
	{"name": "john", "age": 33, "department": "ICU"},
	{"name": "Mike", "age": 28, "department": "Oncology"},
	{"name": "Den", "age": 30, "department": "Medicine"}
]

This output represents the array of user objects within the userlist key of the data object.
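
The derivation above can be reproduced with a few lines of Python. The helper below is a minimal sketch that only understands the root symbol and dot notation; `resolve_dotted_path` is a hypothetical name introduced for this illustration and is not part of Gathr.

```python
import json

def resolve_dotted_path(document, path):
    """Follow a simple '$.key1.key2' path through nested JSON objects."""
    if not path.startswith("$"):
        raise ValueError("path must start at the root '$'")
    node = document
    # Skip the leading '$' and walk one key per dot-separated segment.
    for key in path[1:].split("."):
        if key:
            node = node[key]
    return node

sample = json.loads("""
{
  "data": {"userlist": [
    {"name": "john", "age": 33, "department": "ICU"},
    {"name": "Mike", "age": 28, "department": "Oncology"},
    {"name": "Den", "age": 30, "department": "Medicine"}
  ]},
  "metadata": {"has_more_records": false}
}
""")

rows = resolve_dotted_path(sample, "$.data.userlist")
print(len(rows), rows[0]["name"])
```

Passing `$` alone returns the whole document, which corresponds to the default value of the Path to Data field.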


Example 2

Consider fetching the schema details of a source for the sample JSON file illustrated below:

HTTP_Source_Example

If you provide the Path to data field value as $.data, then only the attributes of the data element along with their values will get fetched from the source as shown in the image below:

HTTP_Source_Example_2

If you keep the Path to Data field value as $ (i.e., default value), then as per the data source configuration the entire JSON file will get fetched.

HTTP_Source_Example_1


Example 3

Sample JSON data:

{
  "books": [
    {
      "book": {
        "title": "The Great Gatsby",
        "author": {
          "name": "F. Scott Fitzgerald",
          "birth_year": 1896
        },
        "genre": "Classic Fiction"
      }
    },
    {
      "book": {
        "title": "To Kill a Mockingbird",
        "author": {
          "name": "Harper Lee",
          "birth_year": 1926
        },
        "genre": "Southern Gothic"
      }
    },
    {
      "book": {
        "title": "1984",
        "author": {
          "name": "George Orwell",
          "birth_year": 1903
        },
        "genre": "Dystopian Fiction"
      }
    }
  ]
}

Provide the path to data as: $.books[*].book

This JSON path expression starts at the root of the JSON structure, goes to the books array, and for each element in the array, extracts the book object.

The output corresponding to this path would be:

[
  {
    "title": "The Great Gatsby",
    "author": {
      "name": "F. Scott Fitzgerald",
      "birth_year": 1896
    },
    "genre": "Classic Fiction"
  },
  {
    "title": "To Kill a Mockingbird",
    "author": {
      "name": "Harper Lee",
      "birth_year": 1926
    },
    "genre": "Southern Gothic"
  },
  {
    "title": "1984",
    "author": {
      "name": "George Orwell",
      "birth_year": 1903
    },
    "genre": "Dystopian Fiction"
  }
]
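
To make the wildcard step concrete, here is a small Python sketch of what $.books[*].book selects. It uses an abbreviated copy of the sample (the author object is left out for brevity) and plain Python indexing rather than a JSON path library, so it illustrates the meaning of the expression only.

```python
books_doc = {
    "books": [
        {"book": {"title": "The Great Gatsby", "genre": "Classic Fiction"}},
        {"book": {"title": "To Kill a Mockingbird", "genre": "Southern Gothic"}},
        {"book": {"title": "1984", "genre": "Dystopian Fiction"}},
    ]
}

# $        -> books_doc
# .books   -> the list stored under "books"
# [*]      -> every element of that list
# .book    -> the "book" object inside each element
selected = [element["book"] for element in books_doc["books"]]
titles = [book["title"] for book in selected]
print(titles)
```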

XML Path (For XML Data Type)

XML Path is an XML path expression that points to the repeating elements (rows) to be read.

Define the XML node path to retrieve data, guiding the application to locate and extract relevant information within the XML structure.

Example 1: XML with Row Tag book:

For the path “/information/data/books/book” in the XML extract below, specifying the value as book will retrieve the data within the “book” element:

<information>
  <data>
    <books>
      <book>
        <name>"poetry"</name>
        <price>100</price>
      </book>
      <book>
        <name>"science"</name>
        <price>200</price>
      </book>
    </books>
  </data>
</information>

Example 2: XML with Row Tag student and attributes:

For the path “/class/student” in the XML extract below, specifying the value as student will retrieve the data within the “student” element:

<class>
  <student id="1">
    <name>Alice</name>
    <grade>A</grade>
  </student>
  <student id="2">
    <name>Bob</name>
    <grade>B</grade>
  </student>
</class>
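
As a rough illustration of reading rows by tag, the Python sketch below parses the extract above with the standard library and turns each student element into one record. How Gathr itself maps attributes such as id to columns is not stated here, so merging attributes and child elements into one dictionary is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

xml_text = """
<class>
  <student id="1">
    <name>Alice</name>
    <grade>A</grade>
  </student>
  <student id="2">
    <name>Bob</name>
    <grade>B</grade>
  </student>
</class>
""".strip()

root = ET.fromstring(xml_text)  # root is the <class> element

# Each <student> element becomes one record; attributes and child
# elements are merged into a single dictionary (an assumption).
records = []
for student in root.findall("student"):
    record = dict(student.attrib)
    for child in student:
        record[child.tag] = child.text
    records.append(record)

print(records)
```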

Parameters

You can specify path and query parameters for a request using the URL box or the Parameters tab.

Query Parameters

Query parameters are added after the ? in the URL, like this: ?id=1&type=new.

To specify a query parameter, add it directly to the URL or select the Parameters tab and enter the name and value.

When you enter your query parameters in either the URL or the Parameters tab, these values will update everywhere they’re used in the HTTPV2 connector.
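
The relationship between the URL form and the name/value form of a query parameter can be sketched in Python. The helper `with_query_params` and the example URL are hypothetical; the sketch only shows that both forms describe the same query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_query_params(uri, extra_params):
    """Return uri with extra_params merged into its query string."""
    parts = urlsplit(uri)
    merged = dict(parse_qsl(parts.query))
    merged.update(extra_params)
    return urlunsplit(parts._replace(query=urlencode(merged)))

# Adding type=new as a name/value pair is equivalent to typing it in the URL.
uri = with_query_params("https://example.com/items?id=1", {"type": "new"})
print(uri)
print(parse_qsl(urlsplit(uri).query))
```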

Design Time Parameters

The parameters provided here will only be used during application design time.


Authentication

Specify the authorization type that should be used to authenticate the HTTP resource.

The supported authentication types are: None, Basic, Token, OAuth2, OAuth2 Client Credentials, and Custom Token. Each type is explained below in detail.

Auth Type - None

Choose this option to access an HTTP resource without needing any authentication.

Auth Type - Basic

If Auth Type is selected as Basic, proceed after providing the below parameters:

Username: Enter the user name for accessing the HTTP resource.

Password: Enter the password for accessing the HTTP resource.
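
For reference, HTTP Basic authentication sends the user name and password as a Base64-encoded `username:password` pair in the Authorization header. The sketch below shows that encoding in Python; the credentials are placeholders, and the helper name `basic_auth_header` is hypothetical.

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}

# Placeholder credentials, for illustration only.
header = basic_auth_header("demo_user", "demo_pass")
print(header)
```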

Auth Type - Token Based

If Auth Type is selected as Token, proceed after providing the below parameters:

Token ID: The key with which the token is referred in the request.

Token: Token to access the HTTP resource.

Auth Type - OAuth2

If Auth Type is selected as OAuth2, proceed after providing the below parameters:

Auth Headers: Provide the headers for the Auth URL, through which the authorization code is generated, as key-value pairs.

Client ID: The client identifier given during the application registration process should be provided.

Secret Key: The secret key given to the client during the application registration process should be provided.

Auth URL: Provide the endpoint of the authorization server that retrieves the authorization code.

Auth Type - OAuth2 (Client Credential)

If Auth Type is selected as OAuth2 (Client Credential), proceed after providing the below parameters:

Auth Headers: Header’s parameter name and value can be provided.

Auth Params: Auth parameter name and value can be provided.

Client ID: The client identifier that is given to the client during the application registration process should be provided.

Secret Key: The secret key that is given to the client during the application registration process should be provided.

Auth URL: The endpoint of the authorization server that retrieves the authorization code.

Use Token: Use generated token in the URL parameter or header of the request.
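
For orientation, a token request in the standard OAuth 2.0 client credentials grant is typically a form-encoded POST to the Auth URL. The Python sketch below builds such a request without sending it. It assumes the client ID and secret key are passed in the request body; whether Gathr sends them in the body or in a header is not stated here. The URL, credentials, and the `scope` parameter are placeholders.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

def build_token_request(auth_url, client_id, secret_key, auth_params=None):
    """Build (but do not send) a client credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": secret_key,
    }
    # Extra name/value pairs, comparable to the Auth Params field.
    form.update(auth_params or {})
    return Request(
        auth_url,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

token_request = build_token_request(
    "https://auth.example.com/oauth/token",  # placeholder Auth URL
    "my-client-id",
    "my-secret",
    {"scope": "read"},
)
print(token_request.get_method(), token_request.full_url)
print(parse_qs(token_request.data.decode("ascii")))
```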

Auth Type - Custom Token

If Auth Type is selected as Custom Token, a modal window Token Generation HTTP Configuration will appear. Proceed after providing the below parameters:

URI: HTTP or HTTPS URI to send a request to a resource.

Request Method: HTTP request method for the URI, either GET or POST.

Request Body: Request body to send a data payload to an HTTP resource in the body of the request.

Header: Header’s parameter name and value.

Example:

HTTPV2_Custom_Auth

In the snapshot below, the main data API needs a token in the header, with the key Authorization and the value BEARER ${token}.

HTTPV2_Custom_Auth_Header

When ${token} is used in the value of a header key, it is replaced with the token generated by the custom token API.

Path to Token: JSON path expression that points to tokens.

Auth Type: Used to specify the authorization type associated with the URL. The supported auth types for token generation are None, Basic, and Token.

  • None: This option specifies that the URL can be accessed without any authentication.

  • Basic: This option specifies that accessing the URL requires Basic Authorization.

    Provide a user name and password for accessing the URL.

  • Token: Token-based authentication is a security technique that authenticates the users who attempt to log in to a server, a network, or other secure systems, using a security token provided by the server.
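
The ${token} substitution described for the Header parameter can be sketched in Python. The response body and the path $.auth.access_token are hypothetical and stand in for whatever the custom token API returns; the sketch shows the idea of extracting a token and placing it into the configured header, not Gathr's internal implementation.

```python
import json
from string import Template

# Hypothetical response body returned by a token-generation API.
token_response = json.loads(
    '{"auth": {"access_token": "abc123", "expires_in": 3600}}'
)

# Equivalent of a Path to Token of $.auth.access_token for this response.
token = token_response["auth"]["access_token"]

# Header configured for the main data API, with the ${token} placeholder.
configured_headers = {"Authorization": "BEARER ${token}"}

# Replace the placeholder in every header value with the generated token.
resolved_headers = {
    key: Template(value).substitute(token=token)
    for key, value in configured_headers.items()
}
print(resolved_headers)
```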


Headers

Use the header field(s) to send additional information with the request to the HTTP resource via the HTTPV2 data source.

Example

HTTP_Header_Example


Body

In this tab, you can provide the request body for both runtime and design time. This option allows you to transmit a data payload to an HTTP resource within the body of the request.

Request Body

Using the request body option, a data payload can be sent to an HTTP resource in the body of the request.

Example

{
  "from":1,
  "size":10,
  "query":{"match_all":{}}
}

Design Time Request Body

This parameter is applicable to the POST request method. The request body provided here will only be used during application design time.

Example

{
  "from":1,
  "size":5,
  "query":{"match_all":{}}
}

Settings

Advanced settings can be configured using this tab.

Encode URI (Optional)

Select this option to encode the URI.

All characters in the URI will be converted into a format that can be transmitted by the HTTPV2 channel.

Example: Parameters in a request body with {} need encoding for the request to pass.
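
As an illustration of what encoding does to characters such as {}, the Python sketch below percent-encodes a value before placing it in a URI. Exactly which characters Gathr encodes is not specified here; this uses standard percent-encoding from the Python standard library, and the example URI is hypothetical.

```python
from urllib.parse import quote, unquote

raw_value = '{"match_all":{}}'

# Percent-encode every character that is not an unreserved URI character.
encoded_value = quote(raw_value, safe="")
uri = "https://example.com/search?query=" + encoded_value

print(encoded_value)
print(uri)
```

Decoding the value with `unquote` returns the original text, so no information is lost by encoding.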


SSL Configuration

The SSL Configuration page provides essential settings for managing Secure Sockets Layer (SSL) connections within the HTTP data source.

Enable SSL

It is set to False by default.

Set this option to True, if the resource that is to be requested using the HTTP data source is SSL-enabled.

If set to True, choose how the SSL-enabled HTTP resource should be verified.

Either a keystore file or a certificate file needs to be uploaded based on the chosen verification method.

The Keystore Password or Certificate Alias should then be provided as per the type of file uploaded for verification.


Retry Configuration

Define how many times a request to the URL should be retried in case of failure, and specify the interval between retry attempts.

Retry Count

Retries the request to the URL up to the specified number of times if it fails.

Retry Delay

A retry delay interval (in seconds) should be provided.
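
The interaction between Retry Count and Retry Delay can be sketched as a simple loop in Python. This sketch assumes the retry count is the number of additional attempts made after the first failure, which is not confirmed here; `call_with_retries` and `flaky_request` are hypothetical names, and the failure is simulated.

```python
import time

def call_with_retries(operation, retry_count, retry_delay_seconds):
    """Run operation once, then retry up to retry_count times on failure."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return operation()
        except Exception:
            # Give up once the allowed number of retries is used up.
            if attempts > retry_count:
                raise
            time.sleep(retry_delay_seconds)

calls = []

def flaky_request():
    """Simulated request that fails twice and then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("simulated failure")
    return "ok"

result = call_with_retries(flaky_request, retry_count=3, retry_delay_seconds=0)
print(result, len(calls))
```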


TimeOut Configuration

Set timeout values for managing server connections and data retrieval processes. Specify the maximum time allowed for accessing a server and generating output. Additionally, set the maximum time permitted for establishing a connection.

Read Timeout (sec)

Specify the maximum time (in seconds) to wait for the server to return output after the URL has accessed it.

A timeout value of zero is interpreted as an infinite timeout. A negative value is interpreted as undefined (system default). Default value is 5.

Connection Timeout (sec)

Specify the maximum time (in seconds) allowed for establishing a connection, for scenarios in which the connection is not established.

A timeout value of zero is interpreted as an infinite timeout.

A negative value is interpreted as undefined (system default).

Default value is 5.


Add Configuration: Additional properties can be added using this option as key-value pairs.


Schema

Check the populated schema details. For more details, see Schema Preview.

Pagination

To know more about the pagination options in HTTPV2, see HTTPV2 Pagination.

Advanced Configuration

Optionally, you can enable incremental read. For more details, see HTTPV2 Incremental Configuration.
