On-Prem Installation Introduction

The objective of this topic is to help users install Gathr and configure the infrastructure components that Gathr pipelines interact with.

Assumptions

  1. The user of this document is well acquainted with Linux systems and has a fair knowledge of UNIX commands.

  2. The user has sudo rights or is working as the root user.

  3. The user has installed the yum, rpm, unzip, tar, and wget tools.
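
The tool and privilege assumptions above can be checked before starting the installation. The following is a minimal sketch, not part of the Gathr installer: the names REQUIRED_TOOLS, missing_tools, and is_root are hypothetical helpers introduced here for illustration, and the script assumes a POSIX system with Python 3 available.

```python
import os
import shutil

# Tools named in the assumptions above.
REQUIRED_TOOLS = ["yum", "rpm", "unzip", "tar", "wget"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def is_root():
    """Return True when the current process runs as root (POSIX only)."""
    return os.geteuid() == 0


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found.")
    if not is_root():
        # Sudo rights cannot be confirmed without prompting, so only remind.
        print("Not running as root; make sure this user has sudo rights.")
```

The script only reports what it finds and does not install anything, so it is safe to run as a regular user before switching to root or using sudo.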

Overview

Gathr web studio provides a web interface to create, deploy, and manage data processing and analytical flows. These data flows use services that are part of the big data cluster. A cluster manager, such as Cloudera Manager for CDP, manages these services. The web studio must be configured correctly so that data pipelines can interact with the services.

Gathr web studio provides a simple way to configure these service properties as part of a post-deployment setup process.

Managed services such as YARN and ZooKeeper can be configured by providing the Cloudera Manager information on the setup screen. Properties for services that are not part of the managed cluster can be configured by entering their values manually.
