Formulas adding computed columns

When looking for answers to basic business questions, your data may not contain the fields/columns that you need for your analysis. This is when computed columns are required.

For example, your dataset contains release start date and end date. But you need to know how many days it takes for the release. In such scenarios, you can create computed fields or columns in your data.

A computed column is an additional column added to your dataset can be used for even more advanced calculations and visualizations.

The value of rows of this column is calculated by the system based on a formula. You can create these formulas by using columns (or fields) from one or more datasets.

Gathr Analytics supports following arithmetic operators:

Arithmetic OperationGathr Analytics’s Operator
Addition+
Subtraction-
Multiplication*
Division/
Exponent(Power)^

Gathr Analytics also has a rich set of functions that can be used in a formula. Gathr Analytics has logical, text, date, aggregation and various advanced analytical functions for unstructured text analytics, Machine learning, and custom scripting.

Add a computed column

Suppose that you have a sample dataset that contains all the sales data for a chain of stores such as transaction date, product, price, payment type, country etc. This dataset has a column called Price, which is the purchase price to the retailer. To calculate the Sales prices with 15% profit for each product, you can create a computed column called Selling_price that contains the following formula:

Selling Price=Price +Price*0.15

The value of price will be taken from the corresponding row of the ‘Price’ column.

To add this new computed column based on the above formula, follow the steps mentioned below.

Step 1: Open the Data panel.

Under ‘Current Page’, click on the Overflow Menu for your dataset. Select ‘Add Formula’ from the overflow menu.

Step 2: Write the expression in the formula editor.

In the formula editor, on the left of ‘=’ sign, is the default name of the new computed column, which you can edit. Type Selling Price.

To the right of ‘=’ sign, is where you can write your formula. Type the name of the dataset or column in the expression (price- in this example). As you start typing, Gathr Analytics will suggest you the available datasets and columns in this format “<DataSet_Name>.<Column_Name>”. Select the appropriate column from the list.

Alternatively, you can also select the column from the Data Panel by clicking on the column name.

Complete the expression using mathematical operations.

Step 3: When finished, click on icon to create a computed column to your dataset.

You might get errors for any incomplete arithmetic expressions. Correct the errors and click icon.

Step 4: Add Business Type to newly created column. if business types are defined on the columns present in the formula/function, Gathr Analytics will provide suggestions. Select Business Type(s) from the suggested options and click Ok.

Note: You can define multiple business types for your column.

To learn more, refer Section Basic Concepts and Terminology.

New computed column “Selling Price” is saved to your dataset and can be used to create more calculated columns and visualizations.

Step 5: Add this column to your visualization (Optional Step).

Note: Not every custom column you create needs to be added to the Visualizations. Sometimes, you want to use it in another calculation without visualizing.

To add, open the Visualization Panel. Drag & drop the newly added column from the Data Panel under ‘Columns’ in the Visualization Panel.

Edit a Computed Column

To edit a computed column, follow the steps given below:

  1. Open The Data panel. For your dataset where the formula is applied, click on icon to display all the columns.

  2. Under Formula, hover over a computed column and click on the icon. In the formula editor, edit the formula and select icon.

Remove a Computed Column

To remove a computed column, follow the steps given below:

  1. Open The Data panel.

For your dataset where the formula is applied, click on icon to display all the columns.

  1. Under Formula, hover over a computed column to be deleted. Click on the Overflow Menu icon and select ‘Delete Column’.

  2. Click ‘Yes’ to confirm the removal of the column.

  3. If the deleted column was added to your visualization then you need to reconfigure the visualization. Open the Visualizations Panel and remove the column from the view as shown in the following figure.

The column will be successfully removed from the dataset and visualizations.

Gathr Analytics In-built Functions

A Gathr Analytics in-built functions are off-the-shelf formulas to perform complex calculations with ease. e.g. SUM, MAX, AVERAGE, VLOOKUP, COUNTIF etc.

All in-built functions have a standard syntax as shown below:

<Function Name> ( [Param1] , [Param2] , …)

Parameters are inputs provided to the function. Multiple arguments are separated by a comma.

e.g.: AVG(12, 17, 19)

You can use fields of a dataset as arguments as well.

e.g.: AVG(‘SalesJan2019.Selling Price’)

Categories

Gathr Analytics in-built functions are categorized as under

  • Number Functions (ABSOLUTE, ROUND etc.)

  • Text Functions (TOKENIZE, TOUPPER, TRIM etc.)

  • Aggregate Functions (SUM, AVERAGE, MAX etc.)

  • Lookup Functions (VLOOKUP)

  • Date & Time Functions (DATEDIFF, DATEADD etc.)

  • Logical Functions (AND, OR)

  • Type Conversion (TOINT, TODATE etc.)

  • Analytical functions (SENTIMENT, APPLYCLSUTERING etc.)

Gathr Analytics’s complete function list is available here - [Gathr Analytics In-built Function reference].

How to Use Gathr Analytics’s In-built Functions

Suppose “Releases” dataset has Start date and Release date but you need time taken for the release. Use the DATEDIFF function to calculate the difference between two dates.

  1. Open the formula editor, and name the new computed column.

  2. Type-in few letters of the function name (say date.) Select the function you need, from the drop downs list.

  1. A function help box will provide you description, syntax, and few examples to help you complete the formula.

  1. Type-in first few letters of the name of the column (to be provided as arguments to the function) and Gathr Analytics’s autocomplete feature will provide you the list of matching columns and Data Sets. You can choose from the drop-down list.

Note: You can select columns from the Data Panel as well.

  1. Provide all the arguments to complete your formula. In the example, you need to provide interval as ‘Day’.

  1. When finished, click icon. Fix the errors (if any) for missing parameters,an incorrect number of required parameters or misplaced closing parenthesis.

  2. Add a business type for the newly created column from the suggested options and click Ok.

To learn more, refer Section Basic Concepts and Terminology.

Gathr Analytics’s Functions List

List of all the in-built functions available in Gathr Analytics.

#FunctionDescriptionSyntax
1.ABSOLUTEReturns the absolute value of a number.ABSOLUTE (numeric column)

Example:

ABSOLUTE(EMPLOYEE.SALARY)
2.ANDReturns TRUE if ALL logical expressions are ‘TRUE’ else return ‘FALSE’.AND (logical expression, …)

Example:

AND(EMPLOYEE.SALARY>5000, EMPLOYEE.SALARY<10000)
3.AVERAGEIFSReturns the average of a range depending on multiple criteria.AVERAGEIFS (average_range, critera_range1, critera1, [critera_range2], [criteria2], ….)

Example:

AVERAGEIFS(EMPLOYEE.SALARY,EMPLOYEE.DEPARTMENT,”ENGINNERING”, EMPLOYEE.GRADE,”10)
4.AVERSAGEIFReturns the average of a column, filtered by a criterion.AVERAGEIF (range, criteria, [avg_range])

Example:

AVERAGEIF(EMPLOYEE.AGE,”>30,EMPLOYEE.SALARY)
5.AVGReturns the average values of value from the column(s)AVG (Column1, [Column2], …)

Here, Column should be numeric type only

Example:

AVG(DS1.StoryPoints)
6.CHARConvert a number into a character according to the universal Unicode table.CHAR (Column or Number)

Example:

CHAR(126)
7.CONCATReturns the concatenation of multiple valuesCONCAT(Column1,…)

Example:

CONCAT(EMPLOYEE.FIRST_NAME,EMPLOYEE.LAST_NAME)

Here, column should be from same DS
8.COUNTCounts the values from the column(s)COUNT (Column, [Column2], …)

Here, Column can be of any data type

Example: COUNT(DS1.StoryPoints), COUNT(DS1.id)
9.COUNTIFConditional count across a rangeCOUNTIF (criteria_range, criterion)

Example:

COUNTIF (DS1.StoryPoints,”>30”)

COUNTIF (DS1.StoryPoints,” >DS1.minPoints”)
10.COUNTIFSReturns a conditional count across a rangeCOUNTIFcriteria_range1, criterion1, [criteria_range2], [criterion2] …)

Same as COUNTIF, with multiple conditions

Example:

COUNTIF(DS1.StoryPoints,”>30”,DS1.status,”=Success”)
11.DATEConverts a year, month and day into a dateDATE(Year, Month, Day, [Time Zone])

Example :

DATE(2018,2,2)

DATE(1978,6,23, ”GMT+05:30”)

DATE(1978,6,23, ”UTC+02:00”)
12.DATEADDReturns a date to which a specific interval is addedDATEADD(DATE, Number_To_Add, Interval)

Example:

DATEADD(TODAY(),10,”DAY”)

Supported value of Param3: DAY,MONTH,YEAR

Supported value of Param1: Date Column or Date Value
13.DAYReturns the day of the month that a specific date falls on, in numeric format.DAY (date column, [TIME_ZONE])

Example:

DAY(EMPLOYEE.JOINING_DATE)

DAY(TODAY(),”GMT+12”)
14.IFReturns one value if a logical expression is ‘TRUE’ or another value if it is ‘FALSE’IF(Condition,Column1,Column2)

Example: IF(CreatedDate >TODAY(),DS1.Column1,DS1.Column2)

Here, Column1 and Column2 should be of same type

IF(EMPLOYEE.STATUS=1,”ACTIVE”,”DIASABLE”)
15.ISNULLReturns true if column value is null otherwise false.ISNULL(column)

Example:

ISNULL(EMPLOYEE.RELEASING_DATE)
16.ISNUMBERreturns TRUE if this is a number or Column containing a numeric value and FALSE otherwiseISNUMBER(Column)

Example:

ISNUMBER(EMPLOYEE.SALARY)
17.LARGEReturns the nth largest element from a data set, where n is user-defined.LARGE (numeric column, n)

Example:

LARGE(EMPLOYEE.SALARY,5)
18.LEFTReturns a substring from the beginning of a specified string.LEFT (column, [number_of_characters])

Example:

LEFT(EMPLOYEE.ADDRESS,4)
19.LENGTHreturns the length of a stringLENGTH(column)

Example:

LENGTH(EMPLOYEE.ADDRESS)
20.MAXReturns the max value from column(s)MAX(Column)

Here, Column can be numeric or date time type

Example:

MAX(DS1.CreatedDate)
21.MAXIFReturns the max value from the column, filtered by a criterionMAXIF (column, criteria_range, criteria1))

Here, criterion1 can be constant as well e.g. “KLERA-123”

Example:

MAXIF(EMPLOYEE.SALARY,EMPLOYEE.AGE,”30”)

MAXIF(EMPLOYEE.SALARY,EMPLOYEE.AGE,”>=50”)
22.MAXIFSReturns the max value from the column, filtered by multiple criteriaMAXIFS(column, criteria_range1, criteria1, critera_range2, criteria2], …)

Same as MAXIF, with multiple conditions

Example:

MAXIFS(EMPLOYEE.SALARY,EMPLOYEE.AGE,”30”,EMPLOYEE.GRADE,”C”)
23.MEDIANReturns the median value in a numeric dataset.MEDIAN (numeric column)

Example:

MEDIAN(EMPLOYEE.SALARY)
24.MIDExtracts A segmentMID (Column OR String, starting_at, extract_length)

Example:

MID(“Hello Word”,4,4)
25.MINReturns the min value from column(s)MIN(Column)

Here, Column can be numeric or date time type

Example:

MIN(DS1.CreatedDate)
26.MINIFReturns the min value from the column, filtered by a criterionMINIF (column, criteria_range, criteria)

Here, criterion can be constant as well e.g. “KLERA-123”

Example:

MINIF(EMPLOYEE.SALARY,EMPLOYEE.AGE,”30”)
27.MINIFSReturns the min value from the column, filtered by multiple criteriaMINIFS(column, criteria_range1, criteria1, critera_range2, criteria2], …)

Same as MINIF, with multiple conditions

Example:

MINIFS(EMPLOYEE.SALARY,EMPLOYEE.AGE,”30”,EMPLOYEE.GRADE,”C”)
28.MONTHReturns the month of the year a specific date falls in, in numeric format.MONTH (date column, [TIME_ZONE])

Example:

MONTH(EMPLOYEE.JOINING_DATE)

MONTH(NOW(),”GMT+12”)
29.NOTNULLReturns true if column value is not null otherwise false.NOTNULL(column)

Example:

NOTNULL(EMPLOYEE.RELEASING_DATE)
30.ORReturns TRUE if any logical expressions is ‘TRUE’ else return ‘FALSE’OR(logical expression)

Example:

OR (EMPLOYEE.SALARY<5000, EMPLOYEE.SALARY>10000)
31.RANKReturns the rank of a specified value in a datasetRANK (value, numeric column, [is_ascending)

IS_ASCENDING-0 BY DEFAULT, means the greatest value in data will have Rank 1

RANK(5,EMPLOYEE.SALARY,1)
32.REPTReturns a specified text repeated several timesReturns specified text repeated several times.

RANK (column OR String, number_of_repetitions)

Example:

REPT(“John”,3)
33.RIGHTReturns a substring from the end of a specified string.RIGHT (column, [number_of_characters])

Example:

RIGHT(EMPLOYEE.ADDRESS,4)
34.ROUNDRounds a number to a certain number of decimal places according to standard rules.ROUND (column1, column2)

Example:

ROUND(EMPLOYEE.SALARY,2)
35.SEARCHReturns the position at which a string is first found within cell and ignores capitalization of letters. Returns null if the string is not found.SEARCH (search_for, text_to_search, [starting_at])

Example:

SEARCH(street,EMPLOYEE.ADDRESS,1)
36.SMALLReturns the nth smallest element from a data set, where n is user-defined.SMALL (numeric column,n)

Example:

SMALL(EMPLOYEE.SALARY,4)
37.STDDEVReturns standard deviation of a numeric dataset.STDDEV (numeric column)

Example:

STDDEV(EMPLOYEE.SALARY)
38.SUMReturns the sum of values from column(s)SUM(Column1,[Column2],….)

Here, Column should be numeric type only

Example:

SUM (DS1.StoryPoints)

SUM (EMP.SALARY,EMP.INCENTIVE)
39.SUMIFReturns the sum of a column, filtered by a columnSUMIF(criteria_range, criterion,[sum_range])

Example:

SUMIF(EMP.AGE,”>30”,EMP.SALARY)

Here, criterion can be constant as well e.g. “KLERA-123”

Sum_range should be of numeric only
40.SUMIFSReturns the sum of columns depending on multiple criterionSUMIFS(criteria_range1, criterion1, [criteria_range2, …], [criterion2, …])

Same as SUMIF, with multiple conditions
41.TODATEConverts a date/datetime column to a date column.TODATE(date_time_column)

Example:

TODATE(EMPLOYEE.JOINING_DATE)
42.TODAYReturns the current date as a Date valueTODAY()
43.TOINTConverts value to integer valueTOINT(Column)

Example:

TOINT(EMPLOYEE.AGE),

TOINT(“25”)

Input can be column or Value
44.TOKENIZEDivides text around a specified character or string and puts each fragment as comma separated valuesTOKENIZE (Column, delimiter,[SpiltByEach,StopWordList],[include_system_stopwords])

Here, SplitByEach is an option and is Boolean. If user provides more than one separator and make this flag true then the string will be split wherever any of these splitter is present in the text and stowords is word list which will be ignored

Example:

TOKENIZE (EMPLOYEE.ADDRESS,”,- “, TRUE, ”#$%&”)
45.TOLOWERConverts a specified string to lower case.TOLOWER (Column OR String)

Example:

TOLOWER(EMPLOYEE.ADDRESS)
46.TOSTRINGReturns STRING VALUES OF A COLUMNTOSTRING(Column)

Example:

TOSTRING(EMPLOYEE.SALARY), TOSTRING(“25”)

Input can be column or Value
47.TOUPPERConverts a specified string to uppercase.TOUPPER (Column OR String)

Example:

TOUPPER(EMPLOYEE.ADDRESS)
48.TRIMRemoves leading and trailing spaces in a specified string.TRIM (Column OR String)

Example:

TRIM(EMPLOYEE.ADDRESS)
49.VARReturns the variance of a numeric dataset.VAR (numeric column)

Example:

VAR(EMPLOYEE.SALARY)
50.VLOOKUPSearches down the search column for a search key and returns the value of a specified output columnVLOOKUP(search_key, search column, ouput_column)

Example:

VLOOKUP(EMPLOYEE.DEPARTMENT_CODE,DEPARTMENT.DEPARTMENT_CODE,DEPARTMENT.DEPARTMENT_NAME)

Lookup will be done on first match
51.YEARReturns the year specified by a given date.YEAR (date column, [TIME_ZONE])

Example:

YEAR(DATEVALUE(“06/12/2017”))
Top