Default
Note: Some of the properties listed here are not supported in the Multi-Cloud version of Gathr. These properties are marked with **.
All default or shared configuration properties fall under this category, which is further divided into sub-categories.
Platform
Field | Description |
---|---|
Application Logging Level | The logging level to be used for gathr logs. |
Gathr HTTPs Enabled | Whether the Gathr application supports the HTTPS protocol. |
Spark HTTPs Enabled | Whether the Spark server supports the HTTPS protocol. |
Test Connection Time Out | Timeout for test connection (in ms). |
Java Temp Directory | The temp directory location. |
Gathr Reporting Period | Whether to enable View Data link in application or not. |
View Data Enabled | Whether to enable View Data link in application or not. |
TraceMessage Compression | The type of compression used on emitted TraceMessage from any component. |
Message Compression | The type of compression used on emitted object from any component. |
Enable Gathr Monitoring Flag | Flag to tell if monitoring is enabled or not. |
CEP Type | Defines the name of the CEP engine used. The only possible value at present is esper. |
Enable Esper HA Global | To enable or disable HA. |
CepHA Wait Interval | The wait interval of primary CEP task node. |
Gathr Scheduler Interval | The topology stopped alert scheduler’s time interval in seconds. |
Enable Gathr Scheduler | Flag to enable or disable the topology stopped alert. |
Gathr Session Timeout | The timeout for a login session in gathr. |
Enable dashboard | Defines whether the dashboard is enabled or disabled. |
Enable Log Agent | Defines if Agent Configuration option should be visible on gathr GUI or not. |
Enable Storm Error Search | Enable showing pipeline Application Errors tab using LogMonitoring search page. |
Gathr Pipeline Error Search Tenant Token | Tenant token for Pipeline Error Search. |
Gathr Storm Error Search Index Expression | Pipeline application error index expression (a time-based JS expression used to create indexes in ES or Solr; it is also used during retrieval). |
Kafka Spout Connection Retry Sleep Time | Time between consecutive Kafka spout connection retries. |
Cluster Manager Home URL | The URL of the Gathr Cluster Manager. |
Gathr Pipeline Log Location | gathr Pipeline Log Location. |
HDFS Location for Pipeline Jars | HDFS Location for Pipeline Jars. |
Scheduler Table Prefix | Prefix of the names of the tables that store the scheduler’s state. |
Scheduler Thread Pool Class | Class used to implement thread pool for the scheduler. |
Scheduler Thread Pool Thread Count | The number of threads available for concurrent execution of jobs. This count can be any positive integer, although only numbers between 1 and 100 are practical. If only a few jobs run a few times a day, 1 thread is plenty. If there are many jobs, most of them running every minute, you probably want a thread count like 50 or 100 (this depends on the nature of the jobs and the available resources). |
Scheduler Datasource Max Connections | The maximum number of connections that the scheduler datasource can create in its pool of connections. |
Scheduler Misfire Threshold Time | The number of milliseconds by which a trigger may pass its next-fire-time before the scheduler considers it misfired. |
HDP Version | Version of HDP ecosystem. |
CDH Version | Version of CDH ecosystem. |
Audit Targets | Defines the audit logging implementation to be used in the application. Default is file. |
Enable Audit | Defines the value (true/false) for enabling audit in application. |
Persistence Encryption Key | Specifies the encryption key used to encrypt data in persistence. |
Ambari HTTPs Enabled | Whether the Ambari server supports the HTTPS protocol. |
Graphite HTTPs Enabled | Whether the Graphite server supports the HTTPS protocol. |
Elastic Search HTTPs Enabled | Whether the Elasticsearch engine supports the HTTPS protocol. |
SQL Query Execution Log File Path | File location for logging gathr SQL query execution statistics. |
SQL Query Execution Threshold Time (in ms) | Defines the maximum execution time (in ms) for SQL queries. Queries that exceed this limit are logged. |
Lineage Persistence Store | The data store that will be used by data lineage feature. |
Aspectjweaver jar location | The absolute path of the aspectjweaver jar required for inspect pipeline or data lineage. |
Is Apache Environment | Default value is false. Set it to true for all Apache environments. |
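
The Scheduler Misfire Threshold Time description above can be illustrated with a short sketch. This is not Gathr's scheduler code: the function name `is_misfired` and the 60000 ms threshold are hypothetical and chosen only to show how a tolerance window of this kind is usually evaluated.

```python
from datetime import datetime, timedelta


def is_misfired(next_fire_time: datetime, now: datetime, threshold_ms: int) -> bool:
    """Return True when a trigger is later than the tolerated threshold."""
    delay = now - next_fire_time
    return delay > timedelta(milliseconds=threshold_ms)


scheduled = datetime(2024, 1, 1, 12, 0, 0)
threshold_ms = 60_000  # illustrative value, not a documented Gathr default

# 30 seconds late: still inside the tolerated window.
print(is_misfired(scheduled, scheduled + timedelta(seconds=30), threshold_ms))
# 90 seconds late: past the threshold, so the trigger counts as misfired.
print(is_misfired(scheduled, scheduled + timedelta(seconds=90), threshold_ms))
```

A larger threshold tolerates longer delays, for example when all scheduler threads are busy, before a trigger is treated as misfired.
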
Zookeeper
Field | Description |
---|---|
Zookeeper Retry Count | Zookeeper connection retry count. |
Zookeeper Retry Delay Interval | Defines the retry interval for the zookeeper connection. |
Zookeeper Session Timeout | Zookeeper’s session timeout time. |
Spark
Field | Description |
---|---|
Model Registration Validation Timeout(in seconds) | The time, in seconds, after which the MLlib, ML or H2O model registration and validation process fails if it has not completed. |
Spark Fetch Schema Timeout(in seconds) | The time, in seconds, after which the fetch schema process of register table fails if it has not completed. |
Spark Failover Scheduler Period(in ms) | Regular intervals to run scheduler tasks. Only applicable for testing connection of Data Sources in running pipeline. |
Spark Failover Scheduler Delay(in ms) | Delay after which a scheduler task can run once it is ready. Only applicable for testing connection of Data Sources in running pipeline. |
Refresh Superuser Pipelines and Connections | Whether to refresh Superuser Pipelines and Default Connections in the database when the web studio restarts. |
Gathr SparkErrorSearchPipeline Index Expression ** | Pipeline application error index expression (a time-based JS expression used to create indexes in ES or Solr; it is also used during retrieval). |
Enable Spark Error Search ** | Enable to index and search Spark pipeline errors in LogMonitoring. |
Register Model Minimum Memory | Minimum memory required for web studio to register tables, MLlib, ML or H2O models. Example -Xms512m. |
Register Model Maximum Memory | Maximum memory required for web studio to register tables, MLlib, ML or H2O models. Example -Xmx2048m. |
H2O Jar Location | Local file system’s directory location at which H2O model jar will be placed after model registration. |
H2O Model HDFS Jar Location | HDFS path location at which H2O model jar will be placed after model registration. |
Spark Monitoring Scheduler Delay(in ms) ** | Specifies the Spark monitoring scheduler delay in milliseconds. |
Spark Monitoring Scheduler Period(in ms) ** | Specifies the Spark monitoring scheduler period in milliseconds. |
Spark Monitoring Enable ** | Specifies the flag to enable the spark monitoring. |
Spark Executor Java Agent Config | Spark Executor Java Agent configuration to monitor executor process, the command includes jar path, configuration file path and Name of the process. |
Spark JVM Monitoring Enable ** | Specifies the flag to enable Spark JVM monitoring. |
ES query monitoring index name | Provide the ES query monitoring index name which is required for indexing the data of query streaming. |
Scheduler period for es monitoring purging | Scheduler period for es monitoring purging in seconds. |
Rotation policy of ES monitoring graph | Specify the rotation policy for index creation for the ES monitoring graph (daily for a period of one day, weekly for 7 days). |
Purging duration of ES monitoring index | Purge duration for ES in seconds for es monitoring graph index. Index created before this duration will be deleted. |
Enable purging scheduler for ES Graph monitoring | Check the checkbox to enable purging scheduler for ES Graph monitoring. |
Spark Version ** | By default the version is set to 2.3. Note: Set the Spark version to 2.2 for HDP 2.6.3. |
Livy Supported JARs Location ** | HDFS location where the Livy-related jar files and the application streaming jar file are kept. |
Livy Session Driver Memory ** | Minimum memory that will be allocated to driver while creating livy session. |
Livy Session Driver Vcores ** | Minimum virtual cores that will be allocated to driver while creating Livy session. |
Livy Session Executor Memory ** | Minimum memory that will be allocated to the executor while creating a Livy session. |
Livy Session Executor Vcores ** | Minimum virtual cores that will be allocated to the executor while creating a Livy session. |
Livy Session Executor Instances ** | Minimum executor instances that will be allocated while creating a Livy session. |
Livy Custom Jar HDFS Path ** | The fully qualified HDFS path where the uploaded custom jar is kept while creating a pipeline. |
Livy Data Fetch Timeout ** | The query time interval, in seconds, for fetching data during data inspection. |
isMonitoringGraphsEnabled | Whether monitoring graph is enabled or not. |
ES query monitoring index name | The monitoring data is stored in this index of the default ES connection. |
Scheduler period for ES monitoring purging | Interval (in seconds) at which the purging scheduler runs and checks whether the above index is eligible for purging. Requires a Tomcat restart. |
Rotation policy of ES monitoring graph | Accepts two values, daily or weekly. With daily, the index is rotated every day, so each index stores a single day of data. With weekly, each index stores a week of data. |
Purging duration of ES monitoring index | Duration after which an index is deleted. Default is 604800 seconds, that is, an index is deleted after 1 week. Requires a Tomcat restart. |
Enable purging scheduler for ES Graph monitoring | Controls whether indexes are purged. Purging does not take place if the flag is disabled. Requires a restart of the Tomcat server. |
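
The rotation and purging properties above interact as follows: the rotation policy decides how much data shares one index, and the purging duration decides when an index becomes eligible for deletion. The sketch below illustrates this under stated assumptions. The bucket label format and both function names are hypothetical (the source does not document how Gathr names rotated indexes); only the 604800-second default comes from the table.

```python
from datetime import datetime, timedelta

DEFAULT_PURGE_SECONDS = 604800  # default purging duration from the table (1 week)


def rotation_bucket(ts: datetime, policy: str) -> str:
    """Return a label for the index that would hold data written at ts.

    The label format is an assumption made for illustration only.
    """
    if policy == "daily":
        return ts.strftime("%Y-%m-%d")
    if policy == "weekly":
        year, week, _ = ts.isocalendar()
        return f"{year}-W{week:02d}"
    raise ValueError(f"unsupported rotation policy: {policy}")


def is_eligible_for_purge(created: datetime, now: datetime,
                          purge_seconds: int = DEFAULT_PURGE_SECONDS) -> bool:
    """An index created longer ago than the purging duration can be deleted."""
    return (now - created) > timedelta(seconds=purge_seconds)


wednesday = datetime(2024, 1, 3)
friday = datetime(2024, 1, 5)

# Daily rotation puts the two days in different indexes, weekly in the same one.
print(rotation_bucket(wednesday, "daily") == rotation_bucket(friday, "daily"))
print(rotation_bucket(wednesday, "weekly") == rotation_bucket(friday, "weekly"))

# An index created 8 days ago is older than the default 1-week purging duration.
print(is_eligible_for_purge(datetime(2024, 1, 1), datetime(2024, 1, 9)))
```
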
RabbitMQ
Field | Description |
---|---|
RabbitMQ Max Retries | Defines maximum number of retries for the RabbitMQ connection. |
RabbitMQ Retry Delay Interval | Defines the retry delay intervals for RabbitMQ connection. |
RabbitMQ Session Timeout | Defines session timeout for the RabbitMQ connection. |
Real-time Alerts Exchange Name | Defines the RabbitMQ exchange name for real time alert data. |
Kafka
Field | Description |
---|---|
Kafka Message Fetch Size Bytes | The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request. |
Kafka Producer Type | Defines whether Kafka produces data in async or sync mode. |
Kafka Zookeeper Session Timeout(in ms) | The Kafka Zookeeper Connection timeout. |
Kafka Producer Serializer Class | The class name of the Kafka producer serializer used. |
Kafka Producer Partitioner Class | The class name of the Kafka producer partitioner used. |
Kafka Key Serializer Class | The class name of the Kafka producer key serializer used. |
Kafka 0.9 Producer Serializer Class | The class name of the Kafka 0.9 producer serializer used. |
Kafka 0.9 Producer Partitioner Class | The class name of the Kafka 0.9 producer partitioner used. |
Kafka 0.9 Key Serializer Class | The class name of the Kafka 0.9 producer key serializer used. |
Kafka Producer Batch Size | The batch size of data produced to Kafka from the log agent. |
Kafka Producer Topic Metadata Refresh Interval(in ms) | The metadata refresh time taken by Kafka when there is a failure. |
Kafka Producer Retry Backoff(in ms) | The amount of time that the Kafka producer waits before refreshing the metadata. |
Kafka Producer Message Send Max Retry Count | The number of times the producer will automatically retry a failed send request. |
Kafka Producer Request Required Acks | The acknowledgment of when a produce request is considered completed. |
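
The retry properties above (Kafka Producer Message Send Max Retry Count and Kafka Producer Retry Backoff) describe a retry-with-backoff pattern. The following is a generic sketch of that pattern, not Kafka client code: `send_with_retry` and `flaky_send` are hypothetical names, the values are illustrative, and the metadata refresh that a real producer performs before each retry is omitted.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[], None], max_retries: int, backoff_ms: int,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send(); on failure wait backoff_ms and retry up to max_retries times.

    Returns the number of attempts made. Re-raises the last error once the
    retries are exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send()
            return attempts
        except ConnectionError:
            if attempts > max_retries:
                raise
            sleep(backoff_ms / 1000.0)  # wait before the next attempt


# Simulated send that fails twice and then succeeds.
outcomes = iter([True, True, False])


def flaky_send() -> None:
    if next(outcomes):
        raise ConnectionError("broker unavailable")


# The sleep function is replaced with a no-op so the example finishes instantly.
attempts_made = send_with_retry(flaky_send, max_retries=3, backoff_ms=100,
                                sleep=lambda seconds: None)
print(attempts_made)
```

With a retry count of N, a send is attempted at most N + 1 times, and the worst-case added latency is roughly N times the backoff.
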
Security
Field | Description |
---|---|
Kerberos Sections | Section names in keytab_login.conf for which keytabs must be extracted from pipeline if krb.config.override is set to true. |
Hadoop Security Enabled | Set to true if Hadoop in use is secured with Kerberos Authentication. |
Kafka Security Enabled | Set to true if Kafka in use is secured with Kerberos Authentication. |
Solr Security Enabled | Set to true if Solr in use is secured with Kerberos Authentication. |
Keytab login conf file Path | Specify path for keytab_login.conf file. |
CloudTrial
Field | Description |
---|---|
Cloud Trial | The flag for Cloud Trial. Possible values are True/False. |
Cloud Trial Max Datausage Monitoring Size (in bytes) | The maximum data usage limit for cloud trial. |
Cloud Trial Day Data Usage Monitoring Size (in bytes) | The maximum data usage for FTP User. |
Cloud Trial Data Usage Monitoring From Time | The time from where to enable the data usage monitoring. |
Cloud Trial Workers Limit | The maximum number of workers for FTP user. |
FTP Service URL | The URL of FTP service to create the FTP directory for logged in user (required only for cloud trial). |
FTP Disk Usage Limit | The disk usage limit for FTP users. |
FTP Base Path | The base path for the FTP location. |
Monitoring
Field | Description |
---|---|
Enable Monitoring Graphs | Set to True to enable Monitoring and to view monitoring graphs. |
QueryServer Monitoring Flag | Defines the flag value (true/false) for enabling the query monitoring. |
QueryServer Monitoring Reporters Supported | Defines the comma-separated list of appenders where metrics will be published. Valid values are graphite, console, logger. |
QueryServer Metrics Conversion Rate Unit | Specifies the unit of rates for calculating the queryserver metrics. |
QueryServer Metrics Duration Rate Unit | Specifies the unit of duration for the queryserver metrics. |
QueryServer Metrics Report Duration | Time period after which query server metrics should be published. |
Query Retries | Specifies the number of retries to make a query in indexing. |
Query Retry Interval (in ms) | Defines query retry interval in milliseconds. |
Error Search Scroll Size | Number of records to fetch in each page scroll. Default value is 10. |
Error Search Scroll Expiry Time (in secs) | Time after which search results will expire. Default value is 300 seconds. |
Index Name Prefix | Prefix to use for error search system index creation. The prefix will be used to evaluate exact index name with partitioning. Default value is sax_error_. |
Index number of shards | Number of shards to create in the error search index. Default value is 5. |
Index Replication Factor | Number of replica copies to maintain for each index shard. Default value is 0. |
Index Scheduler Frequency (in secs) | Interval (in secs) after which scheduler will collect error data and index in index store. |
Index Partitioning Duration (in hours) | Time duration after which a new index will be created using partitioning. Default value is 24 hours. |
Data Retention Time (in days) | Time duration for retaining old data. Data older than this threshold will be deleted by the scheduler. Default value is 60 days. |
Audit
Field | Description | Default Value |
---|---|---|
Enable Event Auditing | Defines the value for enabling events auditing in the application. | true |
Events Collection Frequency (in secs) | Time interval (in seconds) in which batch of captured events will be processed for indexing. | 10 |
Events Search Scroll size | Number of records to fetch in each page scroll on result table. | 100 |
Events Search Scroll Expiry (in secs) | Time duration (in seconds) for search scroll window to expire. | 300 |
Events Index Name Prefix | Prefix string for events index name. The prefix will be used to evaluate exact target index name while data partitioning process. | sax_audit_ |
Events Index Number of Shards | Number of shards to create for events index. | 5 |
Events Index Replication Factor | Number of replica copies to maintain for each index shard. | 0 |
Index Partitioning Duration (in hours) | Time duration (in hours) after which a new index will be created for events data. A partition number will be calculated based on this property. This calculated partition number prefixed with Events Index Name Prefix value will make target index name. | 24 |
Events Retention Time (in days) | Retention time (in days) of data after which it will be auto deleted. | 60 |
Events Indexing Retries | Number of retries to index events data before sending it to a WAL file. | 5 |
Events Indexing Retries Interval (in milliseconds) | Interval (in milliseconds) between subsequent indexing retries. | 3000 |
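
The Index Partitioning Duration row states that a partition number is calculated from the duration and prefixed with the Events Index Name Prefix to form the target index name. The source does not give the formula, so the sketch below assumes one plausible scheme: the number of whole partition-length windows elapsed since the Unix epoch. The function name `target_index_name` is hypothetical; the defaults `sax_audit_` and 24 come from the table.

```python
from datetime import datetime, timezone


def target_index_name(event_time: datetime, prefix: str = "sax_audit_",
                      partition_hours: int = 24) -> str:
    """Build an index name from the prefix and an assumed partition number.

    Assumption: partition number = whole partition windows since the Unix epoch.
    """
    epoch_hours = int(event_time.timestamp()) // 3600
    partition = epoch_hours // partition_hours
    return f"{prefix}{partition}"


morning = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
evening = datetime(2024, 1, 1, 20, 0, tzinfo=timezone.utc)
next_day = datetime(2024, 1, 2, 8, 0, tzinfo=timezone.utc)

# Events within the same 24-hour window map to the same index.
print(target_index_name(morning) == target_index_name(evening))
# An event in the next window maps to a new index.
print(target_index_name(morning) == target_index_name(next_day))
```

Under this assumed scheme, a shorter partitioning duration produces more, smaller indexes, which interacts with the retention time because whole indexes are the natural unit of deletion.
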
Query Server
Field | Description |
---|---|
QueryServer Monitoring Flag | The flag value (true/false) for enabling the query monitoring. |
QueryServer Monitoring Reporters Supported | The comma-separated list of appenders where metrics will be published. Valid values are graphite, console, logger. |
QueryServer Metrics Conversion Rate Unit | Specifies the unit of rates for calculating the queryserver metrics. |
QueryServer Metrics Duration Rate Unit | Specifies the unit of duration for the queryserver metrics. |
QueryServer Metrics Report Duration | Time after which query server metrics should be published. |
QueryServer Metrics Report Duration Unit | The units for reporting query server metrics. |
Query Retries | The number of retries to make a query in indexing. |
Query Retry Interval (in ms) | Defines query retry interval in milliseconds. |
Others
Field | Description |
---|---|
Audit Targets | Defines the audit logging implementation to be used in the application. Default is file. |
ActiveMQ Connection Timeout(in ms) | Defines the ActiveMQ connection timeout interval in ms. |
MQTT Max Retries | Max retries of MQTT server. |
MQTT Retry Delay Interval | Retry interval, in milliseconds, for MQTT retry mechanism. |
JMS Max Retries | Max retries of JMS server. |
JMS Retry Delay Interval | Retry interval, in milliseconds, for JMS retry mechanism. |
Metrics Conversion Rate Unit | Specifies the unit of rates for calculating the queryserver metrics. |
Metrics Duration Rate Unit | Specifies the unit of duration for the metrics. |
Metrics Report Duration | Specifies the duration at interval of which reporting of metrics will be done. |
Metrics Report Duration Unit | Specifies the unit of the duration at which queryserver metrics will be reported. |
Gathr Default Tenant Token | Token of user for HTTP calls to LogMonitoring for adding/modifying system info. |
LogMonitoring Dashboard Interval(in min) | Log monitoring application refresh interval. |
Logmonitoring Supervisors Servers | Servers dedicated to run LogMonitoring pipeline. |
Export Search Raw Field | Comma separated fields to export LogMonitoring search result. |
Elasticsearch Keystore download path prefix | Elasticsearch keystore download path prefix in case of uploading keystore. |
Tail Logs Server Port | Port on which the tail command listens for incoming streams of logs. Default is 9001. |
Tail Logs Max Buffer Size | Maximum number of lines that can be stored in the browser. Default is 1000. |
sax.datasets.profile.frequency.distribution.count.limit | Defines the number of distinct values to be shown in the frequency distribution graph of a column in a Dataset. |
sax.datasets.profile.generator.json.template | Template of the Spark job used to generate the profile of a Dataset, for example common/templates/DatasetProfileGenerator.json. |
Pipeline Error Notification Email IDs | Provide comma separated email IDs for pipeline error notification. |
Pipeline Test Connection Enabled | Select the checkbox to enable the email notification when a pipeline component is down. |
Maintenance mode enabled | Provide a true or false value for enabling the email notification in case a pipeline component stops working. |
Contextual Logs | Detailed contextual information (e.g. userName, roles, projectName) will be appended to the logs once this option is enabled. |
Enable Event Notifier | Check this option to enable event notification based on the provided event notifier type. For example: SNS. |
Event Notifier Type | Provide the event notifier type, i.e., SNS. |
SNS Authentication Type | Select the AWS authentication type from the available options: AWS Keys, Instance Profile, or Role ARN. |
AWS Key ID | Provide the AWS account access key. |
AWS Secret Key | Provide the AWS account secret key. |
SNS Topic Region | Provide the AWS SNS topic region. |
SNS Topic Type | Select the SNS topic type from the available options: Standard or FIFO. |
SNS Topic ARN | Provide the ARN of the SNS topic where you want to publish alert data, for example arn:aws:sns:us-east-1:123456789012:Test. |
Role ARN | Provide the AWS account Role ARN. |
Message Group ID | Provide message group ID for the FIFO SNS topic. |
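
The SNS properties above depend on each other: the topic ARN encodes the region, and a FIFO topic needs a message group ID. The sketch below checks a set of values for consistency before they are entered. It is a standalone helper, not part of Gathr or the AWS SDK; the function name `validate_sns_settings` and the returned dictionary are hypothetical. It relies on two AWS conventions: SNS ARNs have the form `arn:partition:sns:region:account-id:topic-name`, and FIFO topic names end with `.fifo`.

```python
from typing import Optional


def validate_sns_settings(topic_arn: str, topic_type: str,
                          message_group_id: Optional[str] = None) -> dict:
    """Check that an SNS topic ARN, topic type and message group ID agree."""
    parts = topic_arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sns":
        raise ValueError(f"not an SNS topic ARN: {topic_arn}")
    region, account_id, topic_name = parts[3], parts[4], parts[5]

    is_fifo_name = topic_name.endswith(".fifo")
    if topic_type == "FIFO":
        if not is_fifo_name:
            raise ValueError("FIFO topic names must end with .fifo")
        if not message_group_id:
            raise ValueError("a message group ID is required for FIFO topics")
    elif topic_type == "Standard":
        if is_fifo_name:
            raise ValueError("topic name ends with .fifo but type is Standard")
    else:
        raise ValueError(f"unknown topic type: {topic_type}")

    return {"region": region, "account_id": account_id, "topic_name": topic_name}


# The sample ARN from the table describes a Standard topic in us-east-1.
print(validate_sns_settings("arn:aws:sns:us-east-1:123456789012:Test", "Standard"))
```

The region returned by the helper should match the value entered for SNS Topic Region.
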