Join and Union
You can derive powerful insights from datasets fetched from different tools by combining them on the basis of some common fields. Join and Union are well-known methods to combine rows of two datasets (called tables in SQL nomenclature).
Gathr Analytics provides the following Joins and Unions.
Inner Join
Left Join
Right Join
Full Join
Cartesian Join
Union
Union All
Inner Join
Inner join results in a dataset containing the rows from both the data sets. Matching fields with the same value in both datasets are joined together.
- Take two datasets; ‘Boards” and “Projects associated with Boards”.
Projects and Issues
Select the common “Board” in both the containers one-by-one by keeping the Ctrl key pressed.
Right-click and select Join > Inner Join.
Inner Join
- Following is the Inner Joined dataset. Rows where “Board” field was same, are there in the resultant dataset.
Inner Join Result
Full Join
A Full Join results in all the rows from both Datasets, whether they are matched on the Join columns or not. The rows for which there are no matches, the resultant DataSet will contain null values.
Take two datasets; ‘Boards” and “Projects associated with Boards”.
Select “Board” Columns in both the datasets using Ctrl key
Right-click and Select Join > Full Join.
Full Join
- Following is the Full Joined dataset. Notice that the resultant dataset has two “Board” columns, one from each joined dataset. All matching and non-matching rows are included.
Full Join Result
Cartesian Join
A Cartesian Join (Also called Cross Join) creates the resultant DataSet by combining (or crossing) each row of one DataSet with every row of the other DataSet.
Take two datasets; ‘Boards” and “Projects associated with Boards”.
Select “Board” Columns in both the datasets using Ctrl key
Right-click and Select Join > Full Join
Following is the result of Cartesian Join. Notice that the number of rows in the resultant dataset is 24, which is a cross product of the rows of joined DataSets Boards(12 Rows) and Projects( 2 Rows) respectively.
Left Join
Left join results in a dataset containing all the rows of the left dataset whether there is a matching row in the right dataset or not. The rows for which there are no matches on the right side, the resultant columns will contain null.
Take two datasets; ‘Boards” and “Projects associated with Boards”.
Select “Board” Columns in both the datasets using Ctrl key
Right-click and Select Join > Left Join
Left Join
- Following the resultant Dataset of Left Join. Notice that the rows of Left Dataset where there is no matching row in right dataset, all the columns of Right Dataset have null values.
Left Join Result
Right Join
Right join results in a dataset containing all the rows of the right dataset whether there is a matching row in the left dataset or not. The rows for which there are no match on the left side, the resultant columns will contain null.
Take two datasets; ‘Boards” and “Projects associated with Boards”.
Select “Board” Columns in both the datasets using Ctrl key.
To have “Boards” as the right Dataset and Projects as the Left, Right-click on Projects container and Select Join > Right Join
Note: The Container on which you open the context menu is considered as the Left container and the other one the right container)
Right Join
- Following is the resultant Dataset of Right Join. Notice all the records from the right Dataset (Boards) are retained, and null values are populated for rows of Left Dataset (Projects) where the join column didn’t match.
Right Join Results
Union
Union is combining selected columns from 2 different Datasets such that the resultant Dataset will have the selected column only (unlike joins where the resultant Dataset contains All the Columns, Join Columns as well as the other Columns of the datasets). The rows of both the containers are appended in the resultant container based on the selected columns.
Union will work only when:
The selected columns from both the containers are of the same types. e.g. If you select 2 Numeric and 3 Text Columns in the first container then you need to select the same number and type of columns in the second container as well.
The union will only keep unique values in the resultant Dataset on the basis of the selected Columns. As shown below:
Take two datasets; ”Boards” and “Projects associated with Boards”.
Select “Board” Columns in both the datasets using Ctrl key.
Let’s take two datasets: “*Brooklyn Active Sprint Issues” and “Murex Active Sprint Issues”.
select the columns from both the containers using Ctrl key.
Right-click select Join > Union.
Map the columns to match for both the Datasets.
Click on UNION DATASETS.
- The following figure shows the result of Union:
Union All
Union All results in all the rows from both sets of data even if there are duplicate data as per the matching columns, as shown below:
Joins on Containers
You can perform a Join operation without selecting individual columns. You just need to select the containers and Right Click on container header as shown below. In this case the primary key of the two datasets is considered as the Join column. This is useful when you need to join on primary key, but it is not shown in the visualization.
Join on Containers
The result of Full Join performed on the containers. As Primary keys are unique, the result of this operation contains 14 records with other columns as NULL.
If you have any feedback on Gathr documentation, please email us!