Web Studio
Note: Some of the properties listed here are not available in the Multi-Cloud version of Gathr. These properties are marked with **.
Configuration properties related to the application server, i.e. the Gathr web studio.
This category is further divided into various sub-categories.
Platform
Field | Description |
---|---|
Gathr UI Host | The IP address of gathr. |
Gathr Installation Directory | The installation directory of gathr. |
Gathr Web URL | The URL of gathr web studio. |
Gathr UI Port | The UI port of gathr. |
LogMonitoring UI Host | The host address of LogMonitoring. |
LogMonitoring UI Port | The port of LogMonitoring. |
Messaging Type | Specifies the message queuing system that the application uses internally for messaging. The possible value is RABBITMQ (for RabbitMQ). |
Gathr Monitoring Reporters Supported | The monitoring reporter types, given as a comma-separated list. Possible values are graphite, console and logger. |
Metric Server | The monitoring metric server (Graphite or Ambari). |
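
As an illustration of the comma-separated format expected by Gathr Monitoring Reporters Supported, the following minimal sketch splits and validates such a value. The function name `parse_reporters` is hypothetical, and treating the names as case-insensitive is an assumption, not documented Gathr behavior.

```python
# Reporter names taken from the field description above.
SUPPORTED_REPORTERS = {"graphite", "console", "logger"}


def parse_reporters(value):
    """Split a comma-separated reporter list and reject unknown names."""
    # Assumption: surrounding whitespace and letter case are not significant.
    reporters = [item.strip().lower() for item in value.split(",") if item.strip()]
    unknown = [name for name in reporters if name not in SUPPORTED_REPORTERS]
    if unknown:
        raise ValueError("Unsupported reporter(s): " + ", ".join(unknown))
    return reporters


print(parse_reporters("graphite, console,logger"))
```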
RDBMS
Field | Description |
---|---|
Password | The database password. |
Driver Class | The database driver class name. |
Connection URL | The connection URL of the database. |
User | The database username. |
Database Dialect | The type of database in which the gathr database is created. Possible values are MySQL, PostgreSQL, Oracle. |
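
To show how Driver Class and Connection URL typically vary with the Database Dialect, here is a small sketch. The driver class names and URL shapes are the ones commonly used with the standard JDBC drivers for these databases; they are not taken from the Gathr documentation, and the helper `build_rdbms_settings`, the host name and the database name are hypothetical.

```python
# Commonly used JDBC driver classes and URL templates (assumption: Gathr
# connects through the standard JDBC drivers for these databases).
JDBC_DEFAULTS = {
    "mysql": ("com.mysql.cj.jdbc.Driver", "jdbc:mysql://{host}:{port}/{database}"),
    "postgresql": ("org.postgresql.Driver", "jdbc:postgresql://{host}:{port}/{database}"),
    # For Oracle, `database` is the service name in this URL form.
    "oracle": ("oracle.jdbc.OracleDriver", "jdbc:oracle:thin:@//{host}:{port}/{database}"),
}


def build_rdbms_settings(dialect, host, port, database):
    """Return illustrative Driver Class and Connection URL values for a dialect."""
    key = dialect.lower()
    if key not in JDBC_DEFAULTS:
        raise ValueError("Unknown dialect: " + dialect)
    driver, template = JDBC_DEFAULTS[key]
    return {
        "Driver Class": driver,
        "Connection URL": template.format(host=host, port=port, database=database),
    }


print(build_rdbms_settings("PostgreSQL", "db.example.com", 5432, "gathrdb"))
```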
Zookeeper
Field | Description |
---|---|
Host List | The comma-separated list of <IP>:<PORT> entries for all nodes in the ZooKeeper cluster where the configuration will be stored. |
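
The `<IP>:<PORT>` list format described for Host List can be checked before it is saved. The sketch below is one way to do that; `parse_host_list` is a hypothetical helper and the addresses are example values only.

```python
def parse_host_list(value):
    """Parse 'ip1:port1,ip2:port2' into a list of (host, port) tuples."""
    nodes = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError("Expected <IP>:<PORT>, got %r" % entry)
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError("Port out of range in %r" % entry)
        nodes.append((host, port_number))
    if not nodes:
        raise ValueError("Host list is empty")
    return nodes


print(parse_host_list("10.0.0.1:2181, 10.0.0.2:2181,10.0.0.3:2181"))
```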
Indexing
Field | Description |
---|---|
Indexer Type | The default indexer type, e.g. Solr or ElasticSearch. |
Index Default Replication Factor | Number of additional copies of data to be saved. |
Enable Index Default is Batch | Default value for the Batch parameter of indexing. |
Index Default Batch Size | Default batch size for the indexing store. |
Enable Index Default Across Field Search | Enables search without specifying column names. This takes extra space and time. |
Index Default Number of Shards | Number of shards to be created in index store. |
Index Default Routing Required | The default value for the Routing parameter of indexing. |
Indexer Default Source | The default value for the Source parameter of indexing. |
Index Retries | The number of retries for indexing. |
Index Retries Interval(in ms) | The interval, in milliseconds, between indexing retries when ingestion fails. |
Indexer time to live in seconds | Indexed data older than the specified time, in seconds counted back from the current time, will not be fetched. |
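
The interaction between Index Retries and Index Retries Interval(in ms) can be pictured as a fixed-interval retry loop. This is only a sketch of that general pattern, not Gathr's implementation; `index_with_retries` and `index_fn` are hypothetical names, and it assumes "retries" means additional attempts after the first failure.

```python
import time


def index_with_retries(index_fn, retries, interval_ms, sleep=time.sleep):
    """Call index_fn, retrying up to `retries` times with a fixed pause."""
    attempts = 0
    while True:
        try:
            return index_fn()
        except Exception:
            if attempts >= retries:
                raise  # retries exhausted: surface the last failure
            attempts += 1
            sleep(interval_ms / 1000.0)  # the interval is configured in milliseconds


# Example: an indexing call that succeeds on the first attempt.
print(index_with_retries(lambda: "indexed", retries=3, interval_ms=500))
```

The `sleep` parameter exists only so the pause can be replaced in tests.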
Persistence
Field | Description |
---|---|
Persistence Store | The default persistence type, e.g. HBase or Cassandra. |
Persistence Default Is batch Enable | Defines whether batching is enabled by default in persistence. |
Persistence Default Batch Size | The batch size for the persistence store. |
Persistence Default Compression | The default compression type for the persistence store. |
Security
Field | Description |
---|---|
User Authentication Source | The options available for logging in to the Gathr application: LDAP, Okta, or Gathr Metastore. The user's login credentials are authenticated against the selected source. Note: If configured with LDAP, a user trying to log in to the application must exist in the LDAP server. To enable SSO (Single Sign-On) in Gathr, choose Okta as the User Authentication Source (service provider) to verify the login credentials for the application. Note: Upon logging in to the environment, you will be redirected to the Okta sign-in page. Provide the username and password configured in LDAP. |
The prerequisites for Okta authentication are:
In the LDAP server, the group-user mapping must be done accurately.
For example, emailID, givenName.
For the Okta account, the application configuration, the application's access management, and the LDAP directory integration must be completed.
Initially, map the Gathr roles to the LDAP roles by logging in to the application in embedded mode with superuser credentials and making the required changes.
All the parameters, including cn, sn, givenName, mail, uid, and userPassword, must be configured.
Mapping in LDAP is therefore done by configuring the LDAP Active Directory in Okta via the Okta agent, which fetches the details from LDAP so that they can be mapped in Okta.
SSO in Okta
Notes:
Single Sign-On authentication allows users to log in with a single ID within the software application without managing multiple accounts and passwords. This makes it easier for admins to manage all users and privileges from one centralized admin dashboard.
Configure the Gathr application in Okta by providing client credentials, including the Client ID and secret key. The user must also provide the Access Management details for the Gathr application in Okta.
The user must install the Okta Agent to establish interaction between Okta and the LDAP server. (The Okta agent can be installed in the gathr/client end application, from where it can access the details of the Gathr application.)
Once Okta Agent is configured, provide LDAP server details.
User Authorization Source | Specifies the user's authorization mechanism, according to which the user is assigned the appropriate role in the gathr web studio. Possible values are LDAP and gathr Metastore. The default value is gathr Metastore. If Okta is used for authentication, choose LDAP as the User Authorization Source (it defines the roles assigned within the application, for example superuser, custom user, workspace admin). |
Superuser(Seed User) Authentication Source | The authentication source for the superuser. Currently, only gathr Metastore is supported as the authentication source for the superuser. |
RT Dashboard
Field | Description |
---|---|
SuperAdmin Password | The super admin password (Required to access the Dashboard UI). |
ReportClient path | The path of ReportClient.properties required to connect with Report Engine. |
Connection Name | The connection name created for gathr in Dashboard. |
Organization ID | The name of the organization for gathr in Intellicus. |
SuperAdmin User ID | The Dashboard superuser username, required to access Intellicus via UI. |
SuperAdmin Organization | The Dashboard superuser organization name, required to access Intellicus via UI. |
Gathr URL | The dashboard web admin URL, used for showing Dashboard UI from within gathradmin. |
Databricks
Field | Description |
---|---|
Databricks Enabled | Enables Databricks on this environment. |
Databricks Instance URL | The Databricks instance URL used to connect to the Databricks account and access it over REST calls. |
Databricks Authentication Token | The Databricks access token provided here will be associated with the superuser account for gathr. It will be saved as encrypted text. |
Databricks DBFS Upload Jar Path | The DBFS path for the gathr specific jars and files. |
Maximum Polling Time (in minutes) | The maximum polling time, in minutes. |
Polling Interval | The polling interval, in seconds. |
Databricks Mediator Service URL | The gathr web service URL for Databricks. |
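
Maximum Polling Time (in minutes) and Polling Interval (in seconds) describe a bounded polling loop. The sketch below shows that general pattern under stated assumptions: what Gathr actually polls is not specified here, so `check_status` is a hypothetical callback and the terminal states `SUCCESS` and `FAILED` are placeholders.

```python
import time


def poll_until_done(check_status, max_polling_minutes, polling_interval_seconds,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll check_status() until a terminal state or until the time budget ends."""
    deadline = clock() + max_polling_minutes * 60  # the maximum is given in minutes
    while True:
        status = check_status()
        if status in ("SUCCESS", "FAILED"):  # hypothetical terminal states
            return status
        if clock() + polling_interval_seconds > deadline:
            raise TimeoutError("Maximum polling time exceeded")
        sleep(polling_interval_seconds)  # the interval is given in seconds


# Example: a status check that reports completion immediately.
print(poll_until_done(lambda: "SUCCESS", max_polling_minutes=30,
                      polling_interval_seconds=10))
```

The `clock` and `sleep` parameters are there so the loop can be exercised without real waiting.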
EMR
Field | Description |
---|---|
EMR Enabled | Enables EMR on this environment. |
Jar Upload Path | The S3 path for the gathr specific jars and files. |
Log URI | The S3 path for creating logs for the EMR cluster launched by gathr. |
EMR Mediator Service URL | The gathr web service URL for EMR. |
Connect Using | Connect to the EMR cluster by using either AWS Keys or with the Instance Profile. |
AWS Key | The AWS access key associated with the gathr superuser account. |
AWS Secret Key | The AWS secret key associated with the superuser account for gathr. It will be saved as encrypted text. |
Instance Profile | The AWS instance profile associated with the superuser account for gathr. |
AWS Region | The region in which the AWS EMR cluster is to be launched. |
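
Because Connect Using selects between AWS keys and an instance profile, the fields that must be filled in depend on that choice. The following sketch illustrates the dependency; the dictionary keys and the mode names `aws_keys` and `instance_profile` are hypothetical and do not correspond to actual Gathr property names.

```python
def missing_emr_fields(settings):
    """Return the names of required EMR fields that are empty or absent."""
    mode = settings.get("connect_using")
    if mode == "aws_keys":
        required = ["aws_key", "aws_secret_key", "aws_region"]
    elif mode == "instance_profile":
        required = ["instance_profile", "aws_region"]
    else:
        raise ValueError("connect_using must be 'aws_keys' or 'instance_profile'")
    return [name for name in required if not settings.get(name)]


# Example: the instance profile mode needs no access keys.
print(missing_emr_fields({
    "connect_using": "instance_profile",
    "instance_profile": "example-profile",
    "aws_region": "us-east-1",
}))
```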
GCP
Field | Description |
---|---|
GCP Enabled | Option to enable GCP on the environment. |
GCS Jar Upload Path | GCS path URI where gathr jars and files are stored. |
GCS Config Bucket | GCS bucket where gathr configurations are stored. |
Dataproc Service URL | URL for Gathr Dataproc REST service. |
GCP Regions | Provide comma-separated region name(s) where the GCP cluster is to be launched. Provide values in lower case. |
GCP ServiceAccount Key JSON File Path | The path to the GCP service account key JSON file. |
GCS Log URI | GCS path URI where gathr pipeline logs are stored. |
IBM Conductor
Field | Description |
---|---|
IBM Enabled | Option to enable IBM conductor spark execution on this environment. |
Gathr IBM Service URL | The Gathr web service URL for IBM. |