Sandbox is a Gathr offering that provides the features and tools users need to create and run their code in isolation. Users can work in their preferred development environment through Gathr with minimal setup.
Currently, Gathr supports JupyterLab, RStudio and VS Code as the IDEs.
These environments, along with their chosen dependencies, can be set up on bare metal servers. Sandbox environments created this way are known as non-container-based sandboxes.
Alternatively, these environments and their chosen dependencies can be encapsulated in a container that runs independently on any Gathr-supported container-based infrastructure. Sandbox environments created this way are known as container-based sandboxes.
A sandbox can be created and launched in a project. The steps to create a sandbox with each approach, and the actions that can be performed on it, are explained further in this topic.
With the non-container-based approach, users can create and launch a sandbox by selecting a preferred IDE, a kernel and additional packages (where applicable to the selected kernel).
When Python is used as the kernel, the user has multiple options for providing packages during sandbox configuration. For more details, see the end-to-end workflow in Creating Non-Container-Based Sandbox.
Users can host a container-based sandbox on a cluster that is registered in the Gathr database. Currently, Gathr supports Kubernetes clusters.
Kubernetes clusters and Docker images can be registered in the Gathr application. Once registered, they can be used during sandbox configuration for a container-based deployment.
For details on registering clusters in Gathr, see Register Cluster; for registering Docker images, see Register Container Image.
The IDE selected during container image registration (for the container-based approach) is launched on the selected cluster using the selected Docker image.
The container-based and non-container-based sandboxes created within a project appear on the Sandbox listing page. The configuration options for creating a sandbox can be accessed using the plus icon. For more details, see Creating Sandbox.
The information and actions displayed for each listed sandbox are explained below:
Field | Description |
---|---|
Name | Displays the name and state of the listed sandbox. Hover over the state icon to check whether the environment is container-based or non-container-based. On the listing page, sandboxes are classified as container-based, non-container-based (Python, PySpark and Scala), non-container-based (VS Code) or non-container-based (RStudio). The states possible for all sandbox types are represented by the following color codes: Green: Active state. Yellow: In-Progress/Launching state. Orange/Red: Failed state. Grey: Stopped state. |
User | Displays the username of the user who created the sandbox. |
Hardware Tier | Applicable only to a container-based sandbox. Displays the hardware tier (Small, Medium, Large or Custom) that was specified during sandbox configuration. |
CPU/Memory | Displays the sandbox's CPU and memory utilization out of the allocated hardware tier. |
Launched | Displays the time since the sandbox was last launched. |
Accessed | Displays the time since the sandbox was last accessed. |
Actions | Provides the set of actions that can be performed on the listed container-based and non-container-based sandboxes. Note: Only a superuser or the sandbox owner can perform these actions. |
Actions for a container-based sandbox: | |
View Details | Opens the logs for the sandbox. The logs tab provides options to list a required number of lines, filter the log results based on user input and clear the log results. |
Stop/Relaunch | Note: The stop button is visible when a sandbox is in an active state; the relaunch button is visible when it is not. The stop action stops an active sandbox. When a sandbox is terminated, the user is notified by email. The relaunch action relaunches an inactive sandbox. The configuration can be edited before the sandbox is relaunched. |
Delete | Deletes the existing sandbox. |
Actions for a non-container-based sandbox: | |
Generate Executable | Note: This action is applicable only to a non-container-based Python sandbox. Initiates generation of the PEX file for the sandbox. It is enabled only when the sandbox is in an active state. Once Generate Executable is clicked, the following options appear for further user action: |
Export Status | In-Progress: PEX file generation is in progress. Completed: The PEX file was generated successfully and is ready for download. Failed: PEX file generation has failed. |
Download | Downloads or exports the generated PEX file in one of the following ways: NFS: Select the NFS option and export the PEX file by providing the desired NFS path. HDFS: Select the HDFS option and export the PEX file by providing the desired HDFS connection and path. Desktop: Select the Desktop option and download the PEX file. |
Note: Users can also perform several actions, such as adding or modifying packages, stopping the sandbox and pushing files to a Git repository, from within the JupyterLab IDE.
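The state color codes and the Stop/Relaunch visibility rule described above can be summarized in a short Python sketch. This is only an illustration: the `SandboxState` enum and the `visible_action` function are hypothetical names and are not part of any Gathr API.

```python
from enum import Enum


class SandboxState(Enum):
    """Sandbox states mapped to the color shown on the listing page."""
    ACTIVE = "green"
    IN_PROGRESS = "yellow"   # In-Progress/Launching
    FAILED = "orange/red"
    STOPPED = "grey"


def visible_action(state: SandboxState) -> str:
    """Return which of the two Stop/Relaunch buttons is visible (hypothetical helper)."""
    # The stop button is shown only while the sandbox is active;
    # in any other state the relaunch button is shown instead.
    return "stop" if state is SandboxState.ACTIVE else "relaunch"


for state in SandboxState:
    print(f"{state.name}: color={state.value}, button={visible_action(state)}")
```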
Sandboxes that are relaunched from the Sandbox listing page appear in this section. The user can find a relaunched sandbox using the search bar.
The information displayed for a relaunched sandbox is explained below:
Field | Description |
---|---|
Name | Name of the relaunched sandbox. |
Hardware Tier | Displays the hardware tier (Small, Medium, Large or Custom) that was specified in the sandbox configuration. |
Start Time | Displays the time since the sandbox was started. |
User | Displays the username of the user who created the sandbox. |
Duration | Displays the time elapsed since the sandbox was created. |
The configuration options for creating a sandbox can be accessed using the plus icon from the Sandbox listing page.
The user can create either a non-container-based or a container-based sandbox. The basic configuration required to create and launch each type of sandbox is explained below.
Creating Non-Container-Based Sandbox
The configuration details for creation of a non-container-based sandbox are described in the table given below:
Field | Description |
---|---|
New Sandbox: | |
Name | Unique name of the sandbox to be created. |
Container-Based Deployment | This option will only be visible if the containerEnabled property is enabled in the configuration settings. Option to choose between non-container-based or container-based environment creation. Check the box to opt for container-based sandbox configuration options. |
Sandbox IDE | The IDEs that are supported by Gathr for a non-container-based sandbox are JupyterLab, RStudio and VS Code. Any of these can be selected from the drop-down list as the preferred development environment. Depending on the IDE selected, the option to select a kernel is displayed. |
Kernel | The kernels that can be selected for JupyterLab are PySpark, Scala and Python. Note: When Python is selected as the kernel, the preferred version must be selected from Python 2.7 and Python 3.8. For the Python kernel, an additional tab appears for selecting Sandbox Python Packages, as described later in this table. If RStudio is the preferred IDE, the kernel that can be selected is R, and an additional Version field is displayed to select the R version for the environment. |
Email alerts: To receive email notifications about a long-running or idle sandbox, use the configuration below. | |
Enable Email Alerts | Option to get email notifications if the sandbox remains idle or keeps running for longer than a defined time. This field can only be set during sandbox creation. |
Email ID | An email ID, multiple comma-separated email IDs or a distribution list email address to which sandbox alerts are sent. |
Long Run Alert | An email alert will be sent if the sandbox is running for more than the selected number of days. |
Idle Time Alert | An email alert will be sent if the sandbox is idle for more than the selected number of hours. |
Sandbox Python Packages: Option to select the desired package manager for adding Python package(s). The user can select multiple options from nfs, conda and pip to provide Python packages. The Python packages can be provided in two ways: - By listing the package names as newline-separated entries in the Packages parameter. - By uploading a text file using the UPLOAD option. The template for the text file can be downloaded using the DOWNLOAD option for conda and pip. The Package Management options are further described as follows: | |
nfs | The user needs to provide the complete location of the desired Python package(s) that are available in the NFS directory. |
conda | This option is disabled in the current version. The user can provide package names for the desired Python package(s) that are available in the global Anaconda repository. |
pip | The user needs to provide package names for the desired Python package(s) that are available in the global Python repository. |
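As an illustration of the newline-separated format accepted by the Packages parameter, the following Python sketch splits such text into individual entries. The `parse_packages` function is a hypothetical helper, not a Gathr function; skipping blank lines and duplicate entries is an assumption made here for tidiness, and whether Gathr accepts version specifiers such as `pandas==1.5.3` is likewise assumed rather than documented above.

```python
from typing import List


def parse_packages(text: str) -> List[str]:
    """Split newline-separated package entries into a clean list (hypothetical helper)."""
    seen = set()
    packages = []
    for line in text.splitlines():
        entry = line.strip()
        # Assumption: blank lines and repeated entries are ignored.
        if not entry or entry in seen:
            continue
        seen.add(entry)
        packages.append(entry)
    return packages


# Example text as it might be typed into the Packages parameter.
entries = parse_packages("numpy\npandas==1.5.3\n\nnumpy\n")
print(entries)
```

The same parsing would apply to the contents of a text file provided through the UPLOAD option, assuming that file also holds one package per line.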
Once all the configuration values are specified, the user can Save and Launch the sandbox.
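The Long Run Alert and Idle Time Alert settings described in the table can be pictured as two threshold checks. The sketch below is an assumption-laden illustration: `due_alerts` is a hypothetical function, and measuring idle time from the last access timestamp is an assumption, since the exact way Gathr determines idleness is not stated here.

```python
from datetime import datetime, timedelta
from typing import List


def due_alerts(launched_at: datetime, last_accessed_at: datetime, now: datetime,
               long_run_days: int, idle_hours: int) -> List[str]:
    """Return the alert types whose thresholds have been exceeded (hypothetical helper)."""
    alerts = []
    # Long Run Alert: running for more than the selected number of days.
    if now - launched_at > timedelta(days=long_run_days):
        alerts.append("long_run")
    # Idle Time Alert: idle for more than the selected number of hours.
    # Assumption: idleness is measured from the last access time.
    if now - last_accessed_at > timedelta(hours=idle_hours):
        alerts.append("idle")
    return alerts


now = datetime(2024, 1, 10, 12, 0)
print(due_alerts(datetime(2024, 1, 1), datetime(2024, 1, 10, 11, 0), now,
                 long_run_days=7, idle_hours=2))
```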
Creating Container-Based Sandbox
The configuration details for creation of a container-based sandbox are described in the table given below:
Field | Description |
---|---|
The configuration parameters for a container-based sandbox span three tabs. Each tab is described in this table in the order in which it appears in the configuration sequence. | |
New Sandbox: | |
Name | Unique name of the sandbox to be created. |
Container-Based Deployment | This option will only be visible if the containerEnabled property is enabled in the configuration settings. Option to choose between non-container-based or container-based environment creation. Check the box to opt for container-based sandbox configuration options. |
Sandbox Environment | The container image (the set of tools, packages, libraries and other dependencies) registered on Gathr that is to be used for sandbox deployment must be selected. The drop-down options consist of all the container images registered on Gathr (at superuser as well as workspace level), depending on the Container Cluster that is selected. |
Sandbox IDE | The IDE for this sandbox defaults to the one that was specified during container image registration. |
Container Cluster | Cluster registered on Gathr that is to be used for sandbox deployment must be selected. Drop-down options will consist of all the clusters registered on Gathr. |
Hardware Tier | The hardware tier (Small, Medium, Large or Custom) has to be specified. Based on the user input, the number of cores and memory will be allocated to the sandbox. The possible values for the hardware tier option are as follows: Small: 4 cores and 4 GB memory. Medium: 8 cores and 16 GB memory. Large: 16 cores and 64 GB memory. Custom: If the user selects the custom option, the hardware tier values can be specified in the custom fields for memory and cores. Note: A minimum of 1 GB memory and 1 core is supported for custom configuration. |
Sandbox Packages: Option to select the desired package manager for adding Python package(s). The user can select multiple options from nfs, conda, pip and artifactory to provide Python packages. The Python packages can be provided in two ways: - By listing the package names as newline-separated entries in the Packages parameter. - By uploading a text file using the UPLOAD option. The template for the text file can be downloaded using the DOWNLOAD option for conda, pip and artifactory. The Package Management options are further described as follows: | |
nfs | The user needs to provide the complete location of the desired Python package(s) that are available in the NFS directory. The user also needs to provide a valid YAML configuration for the PersistentVolumeClaim (PVC) and/or PersistentVolume (PV) to mount the NFS packages within the container. |
conda | The user needs to provide package names for the desired python package(s) that are available in the global anaconda repository. |
pip | The user needs to provide package names for the desired python package(s) that are available in the Global Python repository. |
artifactory | The user needs to provide package names for the desired Python package(s) that are available in the artifactory repository. The artifactory configuration can be provided at multiple levels in the Gathr application. Gathr follows the precedence given below while uploading the Python packages when the artifactory option is used: - The first priority is given to an artifactory configuration that is set in the Manage Users option within a Workspace in a non-superuser login. - The second priority is given to an artifactory configuration that is set within a Workspace in a superuser login. - The least priority is given to an artifactory configuration that is set using the Configuration>Others>Artifactory option in a superuser login. |
Sandbox Configuration: The custom expressions that were used in the YAML files (uploaded while registering the container image) will be displayed here as required parameters. | |
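The artifactory precedence listed in the table can be read as a first-match lookup across three levels. The following Python sketch illustrates that reading; `resolve_artifactory` and its parameter names are hypothetical, and representing each configuration as a dictionary (or `None` when unset) is an assumption made purely for illustration.

```python
from typing import Optional


def resolve_artifactory(user_level: Optional[dict],
                        workspace_level: Optional[dict],
                        global_level: Optional[dict]) -> Optional[dict]:
    """Pick the artifactory configuration with the highest priority (hypothetical helper).

    user_level      -- set in Manage Users within a Workspace (non-superuser login)
    workspace_level -- set within a Workspace (superuser login)
    global_level    -- set in Configuration>Others>Artifactory (superuser login)
    """
    # Walk the levels from highest to lowest priority and return the first one set.
    for config in (user_level, workspace_level, global_level):
        if config is not None:
            return config
    return None


# Example: no user-level configuration, so the workspace-level one is chosen.
chosen = resolve_artifactory(None, {"url": "https://workspace.example"},
                             {"url": "https://global.example"})
print(chosen)
```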
Once all the configuration values are specified, the user can Save and Launch the sandbox.
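The Hardware Tier values listed in the table can be expressed as a small lookup with validation for the Custom option. This is a sketch under stated assumptions: `Resources`, `PRESET_TIERS` and `resolve_tier` are hypothetical names, and treating custom cores and memory as whole numbers is an assumption, since the accepted granularity is not stated in the table.

```python
from typing import NamedTuple, Optional


class Resources(NamedTuple):
    cores: int
    memory_gb: int


# Preset tiers as listed in the Hardware Tier row of the table.
PRESET_TIERS = {
    "Small": Resources(cores=4, memory_gb=4),
    "Medium": Resources(cores=8, memory_gb=16),
    "Large": Resources(cores=16, memory_gb=64),
}


def resolve_tier(tier: str, cores: Optional[int] = None,
                 memory_gb: Optional[int] = None) -> Resources:
    """Return the cores and memory for a hardware tier (hypothetical helper)."""
    if tier in PRESET_TIERS:
        return PRESET_TIERS[tier]
    if tier != "Custom":
        raise ValueError(f"Unknown hardware tier: {tier}")
    if cores is None or memory_gb is None:
        raise ValueError("Custom tier requires both cores and memory")
    # Custom configuration supports a minimum of 1 core and 1 GB memory.
    if cores < 1 or memory_gb < 1:
        raise ValueError("Custom tier needs at least 1 core and 1 GB memory")
    return Resources(cores=cores, memory_gb=memory_gb)


print(resolve_tier("Medium"))
print(resolve_tier("Custom", cores=2, memory_gb=8))
```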
The launch status for a container-based sandbox will be displayed as shown in the image given below:
Once the sandbox is launched successfully, the user is redirected to JupyterLab or to whichever IDE is the default for the sandbox.