HTTPV2 Ingestion Source
The HTTPV2 data source can fetch data based on requests sent to HTTP and HTTPS resources via Gathr.
Information that will be useful while configuring the data source:
Type of the source data (for example, CSV, JSON, Text, or XML)
Resource URI to access the source data.
Authentication type of the resource and the details Gathr requires to access the data (for example, if Auth Type is Basic, a Username and Password are needed to access the resource)
The HTTPV2 data source configuration is equipped to handle SSL-enabled resources.
In addition to the features of the HTTP data source, the HTTPV2 channel supports the XML data type and enhanced authentication methods, can fetch paginated responses from the HTTP resource, and can read data incrementally.
Data Source Configuration
Configure the data source by providing request parameters that are explained below.
Design Application Using
Select the method to design the application.
Sample Data: Fetch sample data from the data source. You can provide connection details in the next section.
Upload Sample File: Upload a file with sample records.
- If Upload Sample File is chosen, the fields below (File Format and Upload) are configured next in the application design.
File Format
Select the File Format matching the format of data in the HTTP resource to be requested.
Gathr supports CSV, JSON, TEXT, and XML file formats for HTTPV2 data sources.
For CSV file format, select its corresponding delimiter.
Header Included: Specify if the first row should be considered a header row while reading the sample file.
For JSON file format, there will be an additional field. To know more, see Path to Data.
For XML file format, there will be an additional field. To know more, see XML Path.
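The effect of the delimiter and Header Included options on a CSV sample can be sketched as follows. This is only an illustration using Python's standard csv module; the sample text, the pipe delimiter, and the placeholder column names (col0, col1) are assumptions and do not describe how Gathr parses files internally.

```python
import csv
import io


def read_sample(text, delimiter=",", header_included=True):
    """Parse delimited sample text into a list of dict records."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if header_included:
        # First row supplies the field names.
        header, data = rows[0], rows[1:]
    else:
        # No header row: generate placeholder names (hypothetical naming).
        header = [f"col{i}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]


sample = "name|age\njohn|33\nMike|28\n"
records = read_sample(sample, delimiter="|", header_included=True)
print(records)
```

With header_included=False the same text yields three records, and the first one holds the literal values "name" and "age".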
Upload
Please upload the sample data file as per the file format selected. The schema of data items in the sample file should be the same as the HTTP resource that is to be read.
If Fetch From Source is selected, continue configuring the data source.
Request Method
Select the HTTP request method used to fetch data from the source: GET or POST.
The method selected determines how the configured parameters in the data source will be submitted to the HTTP resource.
GET: Use the GET method if form data is included in the URI, appended as query string parameters.
POST: Use the POST method if the form data is to be given in the request body of the HTTPV2 data source.
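The difference between the two methods can be illustrated with a short sketch that builds, but does not send, one request of each kind using Python's standard library. The base URL and the form fields are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.com/records"  # hypothetical resource
form = {"id": "1", "type": "new"}

# GET: the form data is appended to the URI as query string parameters.
get_req = Request(f"{BASE_URL}?{urlencode(form)}", method="GET")

# POST: the form data travels in the request body instead.
post_req = Request(BASE_URL, data=urlencode(form).encode("utf-8"), method="POST")

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```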
Design Time Attributes (Optional)
Use this option to limit the volume of data fetched from a source during the application design time. Here, the main objective is to see the schema details of the incoming source data.
- Same as Runtime Attributes: Select this checkbox to keep the design time attributes the same as the URI of the request method.
Path to Data (For JSON Data Type)
Path to Data is a JSON path expression that points to the array or JSON object to be read.
To derive a JSON path expression, you can follow a structured approach based on the hierarchy and keys within the JSON data. To do it:
- Start at the root: Use $ to signify the root of the JSON structure.
- Navigate through objects: Traverse the nested objects using dot notation (.).
- Access arrays (if required): If you encounter arrays, use square brackets [] with an appropriate index, or the wildcard [*], to access array elements.
Example 1
Sample JSON data:
{
"data": {
"userlist": [
{"name": "john", "age": 33, "department": "ICU"},
{"name": "Mike", "age": 28, "department": "Oncology"},
{"name": "Den", "age": 30, "department": "Medicine"}
]
},
"metadata": {
"has_more_records": false
}
}
Given this JSON structure, let's derive the path to the userlist array:
- Root: $
- Navigate to the data object: .data
- Access the userlist array: .userlist
Therefore, the path to data is: $.data.userlist
The output corresponding to this path would be:
[
{"name": "john", "age": 33, "department": "ICU"},
{"name": "Mike", "age": 28, "department": "Oncology"},
{"name": "Den", "age": 30, "department": "Medicine"}
]
This output represents the array of user objects within the userlist key of the data object.
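The derivation in Example 1 can be checked with a minimal sketch that follows a dot-only path step by step. The resolve function is a hypothetical helper written for this illustration; it handles only $ and dot notation, and it is not the evaluator Gathr uses.

```python
import json
from functools import reduce

payload = json.loads("""
{
  "data": {
    "userlist": [
      {"name": "john", "age": 33, "department": "ICU"},
      {"name": "Mike", "age": 28, "department": "Oncology"},
      {"name": "Den", "age": 30, "department": "Medicine"}
    ]
  },
  "metadata": {"has_more_records": false}
}
""")


def resolve(document, path):
    """Follow a dot-only path such as '$.data.userlist' (no arrays or wildcards)."""
    if not path.startswith("$"):
        raise ValueError("path must start at the root '$'")
    keys = [key for key in path[1:].split(".") if key]
    # Each key descends one level into the nested objects.
    return reduce(lambda node, key: node[key], keys, document)


users = resolve(payload, "$.data.userlist")
print([user["name"] for user in users])
```

Passing the default path $ returns the whole document, and $.data returns only the data object.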
Example 2
Consider fetching the schema details of a source for the sample JSON file illustrated below:
If you provide the Path to Data field value as $.data, then only the attributes of the data element along with their values will get fetched from the source as shown in the image below:
If you keep the Path to Data field value as $ (i.e., default value), then as per the data source configuration the entire JSON file will get fetched.
Example 3
Sample JSON data:
{
"books": [
{
"book": {
"title": "The Great Gatsby",
"author": {
"name": "F. Scott Fitzgerald",
"birth_year": 1896
},
"genre": "Classic Fiction"
}
},
{
"book": {
"title": "To Kill a Mockingbird",
"author": {
"name": "Harper Lee",
"birth_year": 1926
},
"genre": "Southern Gothic"
}
},
{
"book": {
"title": "1984",
"author": {
"name": "George Orwell",
"birth_year": 1903
},
"genre": "Dystopian Fiction"
}
}
]
}
We provide the path to data as: $.books[*].book
This JSON path expression starts at the root of the JSON structure, goes to the books array, and for each element in the array, extracts the book object.
The output corresponding to this path would be:
[
{
"title": "The Great Gatsby",
"author": {
"name": "F. Scott Fitzgerald",
"birth_year": 1896
},
"genre": "Classic Fiction"
},
{
"title": "To Kill a Mockingbird",
"author": {
"name": "Harper Lee",
"birth_year": 1926
},
"genre": "Southern Gothic"
},
{
"title": "1984",
"author": {
"name": "George Orwell",
"birth_year": 1903
},
"genre": "Dystopian Fiction"
}
]
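The wildcard behaviour in Example 3 can be sketched with a small evaluator for a subset of JSON path syntax ($, .key, [n] and [*]). The evaluate function is a hypothetical helper for illustration only; it always returns a list of matches, and real JSON path engines support far more syntax.

```python
import re

library = {
    "books": [
        {"book": {"title": "The Great Gatsby",
                  "author": {"name": "F. Scott Fitzgerald", "birth_year": 1896},
                  "genre": "Classic Fiction"}},
        {"book": {"title": "To Kill a Mockingbird",
                  "author": {"name": "Harper Lee", "birth_year": 1926},
                  "genre": "Southern Gothic"}},
        {"book": {"title": "1984",
                  "author": {"name": "George Orwell", "birth_year": 1903},
                  "genre": "Dystopian Fiction"}},
    ]
}


def evaluate(document, path):
    """Evaluate a small JSON path subset and return the list of matched nodes."""
    if not path.startswith("$"):
        raise ValueError("path must start at the root '$'")
    tokens = re.findall(r"\.([A-Za-z_][\w-]*)|\[(\*|\d+)\]", path[1:])
    nodes = [document]
    for key, index in tokens:
        if key:
            nodes = [node[key] for node in nodes]            # .key step
        elif index == "*":
            nodes = [item for node in nodes for item in node]  # [*] fans out
        else:
            nodes = [node[int(index)] for node in nodes]     # [n] picks one element
    return nodes


books = evaluate(library, "$.books[*].book")
print([book["title"] for book in books])
```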
XML Path (For XML Data Type)
XML Path is an XML path expression that points to the repeating element (row tag) to be read.
Define the XML node path to retrieve data, guiding the application to locate and extract relevant information within the XML structure.
Example 1: XML with Row Tag book:
For the path “/information/data/books/book” in the XML extract below, specifying the value as book will retrieve the data within the “book” element:
<information>
<data>
<books>
<book>
<name>"poetry"</name>
<price>100</price>
</book>
<book>
<name>"science"</name>
<price>200</price>
</book>
</books>
</data>
</information>
Example 2: XML with Row Tag student and attributes:
For the path “/class/student” in the XML extract below, specifying the value as student will retrieve the data within the “student” element:
<class>
<student id="1">
<name>Alice</name>
<grade>A</grade>
</student>
<student id="2">
<name>Bob</name>
<grade>B</grade>
</student>
</class>
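How a row tag selects records can be sketched with Python's standard xml.etree.ElementTree module. The rows_for_tag helper is hypothetical, and merging element attributes (such as id) into the same record as the child elements is an assumption made for this illustration; the source does not state how Gathr represents attributes.

```python
import xml.etree.ElementTree as ET

xml_text = """<class>
  <student id="1"><name>Alice</name><grade>A</grade></student>
  <student id="2"><name>Bob</name><grade>B</grade></student>
</class>"""


def rows_for_tag(xml_string, row_tag):
    """Return one dict per element named row_tag, in document order."""
    root = ET.fromstring(xml_string)
    rows = []
    for element in root.iter(row_tag):
        row = dict(element.attrib)  # attributes, e.g. id
        row.update({child.tag: child.text for child in element})
        rows.append(row)
    return rows


rows = rows_for_tag(xml_text, "student")
print(rows)
```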
JSON Path Evaluator
Use the JSON Path Evaluator to build and verify a path:
- Provide a sample JSON response.
- Navigate to the required key in the JSON and click it; its path appears in the search bar.
- Alternatively, write the path in the search bar and click the EVALUATE button. If the path and the provided JSON are correct, the evaluated data appears in the Evaluation data section.
- Alternatively, enable the toggle button to generate the path using Gathr natural language.
Parameters
You can specify path and query parameters for a request using the URL box or the Parameters tab.
Query Parameters
Query parameters are added after the ? in the URL, like this: ?id=1&type=new.
To specify a query parameter, add it directly to the URL or select the Parameters tab and enter the name and value.
When you enter your query parameters in either the URL or the Parameters tab, these values will update everywhere they’re used in the HTTPV2 connector.
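The relationship between the URL and its name/value pairs can be sketched with Python's urllib.parse module: the query string is split into parameters, one value is changed, and the URL is rebuilt. The host and path are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

url = "https://api.example.com/items?id=1&type=new"  # hypothetical URL

parts = urlsplit(url)
original_params = dict(parse_qsl(parts.query))  # name/value pairs after the ?

# Change one parameter and rebuild the URL with the updated query string.
updated_params = dict(original_params, type="old")
updated_url = urlunsplit(parts._replace(query=urlencode(updated_params)))

print(original_params)
print(updated_url)
```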
Design Time Parameters
The parameters provided here will only be used during application design time.
Authentication
Specify the authorization type that should be used to authenticate the HTTP resource.
The supported authentication types are: None, Basic, Token, OAuth2, OAuth2 Client Credentials, and Custom Token. Each type is explained below in detail.
Auth Type - None
Choose this option to access an HTTP resource without needing any authentication.
Auth Type - Basic
If Auth Type is selected as Basic, proceed after providing the below parameters:
Username: Enter the user name for accessing the HTTP resource.
Password: Enter the password for accessing the HTTP resource.
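For reference, HTTP Basic authentication sends the user name and password as a Base64-encoded "username:password" pair in the Authorization header. The sketch below shows that encoding; the helper name and the credentials are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}


header = basic_auth_header("Aladdin", "open sesame")
print(header)
```

Base64 is an encoding, not encryption, so Basic authentication is normally paired with an HTTPS resource.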
Auth Type - Token Based
If Auth Type is selected as Token, proceed after providing the below parameters:
Token ID: The key with which the token is referred in the request.
Token: Token to access the HTTP resource.
Auth Type - OAuth2
If Auth Type is selected as OAuth2, proceed after providing the below parameters:
Auth Headers: The headers associated with Auth URL should be provided as key-value pairs, through which the authorization code is generated.
Client ID: The client identifier given during the application registration process should be provided.
Secret Key: The secret key given to the client during the application registration process should be provided.
Auth URL: The endpoint for the authorization server, which retrieves the authorization code should be provided.
Auth Type - OAuth2 (Client Credential)
If Auth Type is selected as OAuth2 (Client Credential), proceed after providing the below parameters:
Auth Headers: Header parameter names and values can be provided.
Auth Params: Auth parameter name and value can be provided.
Client ID: The client identifier that is given to the client during the application registration process should be provided.
Secret Key: The secret key that is given to the client during the application registration process should be provided.
Auth URL: The endpoint for the authorization server, that retrieves the authorization code.
Use Token: Use generated token in the URL parameter or header of the request.
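As a rough sketch, a standard OAuth 2.0 client credentials token request could be assembled from these fields as shown below. It assumes the credentials are sent as form fields in a POST body; the source does not say how Gathr transmits them, and the helper name, URL, and values are hypothetical. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(auth_url, client_id, secret_key,
                        auth_params=None, auth_headers=None):
    """Assemble (without sending) a client credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": secret_key,
    }
    form.update(auth_params or {})    # extra Auth Params, e.g. scope
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    headers.update(auth_headers or {})  # extra Auth Headers
    return Request(auth_url, data=urlencode(form).encode("utf-8"),
                   headers=headers, method="POST")


token_request = build_token_request(
    "https://auth.example.com/oauth/token",  # hypothetical Auth URL
    client_id="my-client",
    secret_key="s3cret",
    auth_params={"scope": "read"},
)
print(token_request.get_method(), token_request.full_url)
print(token_request.data)
```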
Auth Type - Custom Token
If Auth Type is selected as Custom Token, a modal window Token Generation HTTP Configuration will appear. Proceed after providing the below parameters:
URI: HTTP or HTTPS URI to send a request to a resource.
Request Method: HTTP request method for the URI to be selected out of GET or POST.
Request Body: Request body to send a data payload to an HTTP resource in the body of the request.
Header: Header's parameter name and value.
Example:
In the below snapshot you can see that the main data API needs a token in the Header with the key as Authorization and value as BEARER ${token}.
${token} can be used in the value of a header key and will be replaced by the token generated from the custom token API.
Path to Token: JSON path expression that points to tokens.
Auth Type: Used to specify the authorization type associated with the URL. The supported auth types for token generation are None, Basic, and Token.
None: This option specifies that the URL can be accessed without any authentication.
Basic: This option specifies that accessing the URL requires Basic Authorization.
Provide a user name and password for accessing the URL.
Token: Token-based authentication is a security technique that authenticates the users who attempt to log in to a server, a network, or other secure systems, using a security token provided by the server.
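The ${token} substitution described above can be sketched as follows: a token is read from a token API response using a dot-only path, then placed into the header value. The response body, the path, and the token value are hypothetical, and Python's string.Template is used only because it happens to share the ${name} placeholder syntax.

```python
import json
from string import Template

# Hypothetical response returned by the custom token API.
token_response = json.loads('{"auth": {"access_token": "abc123"}}')
path_to_token = "$.auth.access_token"  # Path to Token (dot-only path)

# Walk the path from the root to the token value.
token = token_response
for key in path_to_token.lstrip("$").strip(".").split("."):
    token = token[key]

# Header configured for the main data API, with the ${token} placeholder.
header_template = {"Authorization": "BEARER ${token}"}
headers = {
    name: Template(value).substitute(token=token)
    for name, value in header_template.items()
}
print(headers)
```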
Headers
Use the header field(s) to send additional information with the request to the HTTP resource via the HTTPV2 data source.
Example
Body
In this tab, you can provide the request body for both runtime and design time. This option allows you to transmit a data payload to an HTTP resource within the body of the request.
Request Body
Using the request body option, a data payload can be sent to an HTTP resource in the body of the request.
Example
{
"from":1,
"size":10,
"query":{"match_all":{}}
}
Design Time Request Body
This parameter is applicable for POST request methods. The request body provided here will only be used during application design time.
Example
{
"from":1,
"size":5,
"query":{"match_all":{}}
}
Settings
Advanced settings can be configured using this tab.
Encode URI (Optional)
Select this option to encode the URI.
All the characters in URI will be converted into a format that can be transmitted by the HTTPV2 channel.
Example: Parameters in a request body with {} need encoding for the request to pass.
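Percent-encoding of this kind can be illustrated with Python's urllib.parse.quote. Which characters Gathr actually encodes is not stated in the source; this sketch encodes every reserved character in a sample value so that braces, quotes, and colons become safe to transmit.

```python
from urllib.parse import quote, unquote

raw_value = '{"match_all":{}}'

# safe="" asks quote() to encode every reserved character, including "/".
encoded = quote(raw_value, safe="")
print(encoded)

# Decoding restores the original text.
decoded = unquote(encoded)
print(decoded)
```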
SSL Configuration
The SSL Configuration page provides essential settings for managing Secure Socket Layer (SSL) connections within the HTTP data source.
Enable SSL
It is set to False by default.
Set this option to True, if the resource that is to be requested using the HTTP data source is SSL-enabled.
If set to True, choose how the SSL-enabled HTTP resource should be verified.
Either a keystore file or a certificate file needs to be uploaded based on the chosen verification method.
The Keystore Password or Certificate Alias should then be provided as per the type of file uploaded for verification.
Retry configuration
Define how many times the URL should be attempted in case of failure and specify the interval between retry attempts.
Retry Count
In case of failure, the URL is run again up to the number of times specified here.
Retry Delay
A retry delay interval (in seconds) should be provided.
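The interplay of Retry Count and Retry Delay can be sketched with a simple loop. This sketch assumes the retry count is the number of extra attempts made after the first failure, which the source does not state explicitly; the function names and the simulated failure are hypothetical.

```python
import time


def call_with_retries(operation, retry_count, retry_delay):
    """Run operation, retrying up to retry_count times with a fixed delay."""
    attempts = retry_count + 1  # first attempt plus the configured retries
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # retries exhausted: surface the last error
            time.sleep(retry_delay)  # wait before the next attempt


calls = {"count": 0}


def flaky():
    """Simulated request that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, retry_count=2, retry_delay=0)
print(result, calls["count"])
```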
TimeOut Configuration
Set timeout values for managing server connections and data retrieval processes. Specify the maximum time allowed for accessing a server and generating output. Additionally, set the maximum time permitted for establishing a connection.
Read Timeout (sec)
If the server takes a long time to generate output after the connection is established, the request waits up to the read timeout (in seconds) provided here.
A timeout value of zero is interpreted as an infinite timeout. A negative value is interpreted as undefined (system default). Default value is 5.
Connection Timeout (sec)
If the connection to the server cannot be established, the attempt is abandoned after the connection timeout (in seconds) provided here.
A timeout value of zero is interpreted as an infinite timeout.
A negative value is interpreted as undefined (system default).
Default value is 5.
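The three cases for a timeout value (positive, zero, negative) can be summarized in a small sketch. The resolve_timeout helper and the system default of 30 seconds are hypothetical; None is used here because Python socket APIs treat a timeout of None as waiting indefinitely.

```python
def resolve_timeout(configured_seconds, system_default):
    """Translate a configured timeout into the value a client would apply."""
    if configured_seconds == 0:
        return None  # zero: wait indefinitely (infinite timeout)
    if configured_seconds < 0:
        return system_default  # negative: undefined, fall back to the system default
    return float(configured_seconds)  # positive: use the configured seconds


for value in (5, 0, -1):
    print(value, "->", resolve_timeout(value, system_default=30.0))
```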
Add Configuration: Additional properties can be added using this option as key-value pairs.
Schema
Check the populated schema details. For more details, see Schema Preview.
Pagination
To know more about the pagination options in HTTPV2, see HTTPV2 Pagination.
Advanced Configuration
Optionally, you can enable incremental read. For more details, see HTTPV2 Incremental Configuration.