Resource Analyzer Report
The Resource Analyzer report provides profiling and analysis of your pipeline so that you can evaluate the efficiency of the allocated resources.
This helps you identify the optimum resources required for the pipeline and its scalability limits.
The Resource Analyzer report has four pages, each showing different parameters of pipeline performance.
Efficiency Statistics
The graphs under Efficiency Statistics show the current performance of the pipeline.
There are three types of graphs:
Driver vs Executor graph
The Driver vs Executor graph shows the wall clock time of the driver and the executors. It has the following properties:
Field | Description |
---|---|
Driver wall clock time | Time spent in driver alone. |
Executor wall clock time | Time spent in executors. |
Total wall clock time | The total Spark application wall clock time, which can be divided into time spent in the driver and time spent in the executors. |
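The relationship between these three values can be illustrated with a small sketch. The numbers and the job-interval structure below are hypothetical, not Gathr's internal format; a real report derives them from the Spark event log:

```python
# Minimal sketch: splitting total wall clock time into driver and executor time.
# The application window and job intervals below are illustrative only.

app_start, app_end = 0.0, 600.0                   # application wall clock window (seconds)
job_intervals = [(60.0, 250.0), (300.0, 520.0)]   # periods when executor-side jobs were running

total_wall_clock = app_end - app_start
executor_wall_clock = sum(end - start for start, end in job_intervals)
driver_wall_clock = total_wall_clock - executor_wall_clock  # time spent in the driver alone

print(f"Total:    {total_wall_clock:6.1f} s")
print(f"Executor: {executor_wall_clock:6.1f} s")
print(f"Driver:   {driver_wall_clock:6.1f} s")
```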
Critical and Ideal graph
The Critical and Ideal graph shows the time taken by the pipeline in different scenarios.
Field | Description |
---|---|
Actual runtime | Actual time taken by the pipeline when executed with a single executor and one core.
Critical path time | The minimum time that the application will take even if it is given infinite executors.
Ideal Application Time | Ideal application time is computed by assuming ideal partitioning (tasks = cores and no skew) of data in all stages of the application. |
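A rough sketch of how these two estimates relate to task-level data is shown below, assuming stages on the critical path run one after another; the stage data, driver time, and core count are illustrative, not taken from a real report:

```python
# Rough sketch of critical path time and ideal application time,
# assuming per-stage task durations are known. All numbers are illustrative.

stage_task_durations = {          # stage id -> task durations in seconds
    0: [10, 12, 11, 30],          # one skewed task
    1: [5, 5, 6, 5, 5, 5],
}
driver_time = 20                  # seconds spent purely in the driver
cores = 4

# Critical path: even with infinite executors, each stage cannot finish faster
# than its longest task, and stages on the path run sequentially.
critical_path_time = driver_time + sum(max(d) for d in stage_task_durations.values())

# Ideal time: perfect partitioning (tasks = cores, no skew) means each stage
# takes its total compute divided evenly across the available cores.
ideal_app_time = driver_time + sum(sum(d) / cores for d in stage_task_durations.values())

print(f"Critical path time: {critical_path_time} s")
print(f"Ideal application time: {ideal_app_time:.1f} s")
```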
OCCH Graph (One Core Compute Hours)
A core-hour is a measure of computational time: if you run one CPU core for one hour, that is one core-hour; if you run 1000 cores for one hour, that is 1000 core-hours.
Field | Description |
---|---|
OCCH Driver wastage | OCCH wasted by the driver, as a percentage. |
OCCH Executor wastage | OCCH wasted by the executors, as a percentage. |
Total wastage | Driver wastage + executor wastage. |
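A small worked example of the core-hour arithmetic, with illustrative numbers:

```python
# Worked example of core compute hours (OCCH) and wastage; all numbers are illustrative.

executors, cores_per_executor = 10, 4
runtime_hours = 2.0

available_core_hours = executors * cores_per_executor * runtime_hours  # 80 core-hours
used_core_hours = 50.0                              # time cores actually spent running tasks

wastage_pct = 100 * (available_core_hours - used_core_hours) / available_core_hours
print(f"Available: {available_core_hours} core-hours, wasted: {wastage_pct:.1f}%")
```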
Aggregate Metrics
It reports aggregated data for each task in the application.
Shown below are parameters such as the min, max, sum, and mean of these metrics across all the stages and jobs in the application.
Field | Description |
---|---|
Disk Bytes Spilled | Size of spilled bytes on disk (can be different if compressed) |
Executor Runtime | Total time spent by executor core running this task |
Input Bytes Read | Number of bytes read by a task (using read APIs)
JvmGCTime | Amount of time spent in GC while this task was in progress |
Output Bytes Written | Number of bytes written by a task (using write APIs)
Peak Execution Memory | Maximum execution memory used by a task |
Result Size | The number of bytes sent by the task back to driver |
Shuffle Read Bytes Read | Total bytes read by a task for shuffle data |
Shuffle Read Fetch Wait Time | Time spent by the task waiting for shuffle data
Shuffle Read Records Read | Total records read by a task for shuffle data
Shuffle Read Local Blocks | Shuffle blocks fetched from local machine (disk access) |
Shuffle Read Remote Blocks | Shuffle blocks fetched from remote machine (network access) |
Memory Bytes Spilled | Number of in-memory bytes spilled to disk by the task
Shuffle Write Bytes Written | Total shuffle bytes written by a task |
Shuffle Write Records Written | Total shuffle records written by a task |
Shuffle Write Time | Amount of time spent in a task writing shuffle data |
Task Duration | Total time spent by the task starting from its creation |
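A minimal sketch of this aggregation, using a few hypothetical task records (a real report reads these metrics from the Spark event log):

```python
# Minimal sketch of aggregating task-level metrics into min/max/sum/mean.
# The sample records and values below are illustrative only.

from statistics import mean

tasks = [
    {"executorRunTime": 1200, "jvmGCTime": 40,  "peakExecutionMemory": 64_000_000},
    {"executorRunTime": 900,  "jvmGCTime": 15,  "peakExecutionMemory": 48_000_000},
    {"executorRunTime": 2500, "jvmGCTime": 120, "peakExecutionMemory": 96_000_000},
]

for metric in ("executorRunTime", "jvmGCTime", "peakExecutionMemory"):
    values = [t[metric] for t in tasks]
    print(f"{metric}: min={min(values)} max={max(values)} "
          f"sum={sum(values)} mean={mean(values):.1f}")
```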
Simulation
Using the fine-grained task-level data and the relationships between stages, the report can simulate how the application will behave when the number of executors is changed. Specifically, it predicts wall clock time and cluster utilization.
Note that cluster utilization is not cluster CPU utilization; it only means that some task was scheduled on a core. The CPU utilization further depends on whether the task is CPU bound or IO bound.
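The sketch below illustrates the idea behind such a simulation under simplifying assumptions (stages run sequentially, tasks are greedily scheduled on the earliest-free core); it is not Gathr's actual simulation algorithm, and all task durations and core counts are illustrative:

```python
# Rough sketch of simulating wall clock time and cluster utilization
# for different executor counts. Assumptions: sequential stages, greedy scheduling.
import heapq

stage_task_durations = {0: [10, 12, 11, 30], 1: [5] * 12}   # hypothetical stage data

def simulate(executors, cores_per_executor):
    total_cores = executors * cores_per_executor
    wall_clock, busy_core_time = 0.0, 0.0
    for durations in stage_task_durations.values():
        # Greedy scheduling: place each task on the core that frees up earliest.
        core_free_times = [0.0] * min(total_cores, len(durations))
        heapq.heapify(core_free_times)
        for d in sorted(durations, reverse=True):
            earliest = heapq.heappop(core_free_times)
            heapq.heappush(core_free_times, earliest + d)
        wall_clock += max(core_free_times)
        busy_core_time += sum(durations)
    # Utilization here only means a task occupied the core, not CPU utilization.
    utilization = 100 * busy_core_time / (wall_clock * total_cores)
    return wall_clock, utilization

for n in (1, 2, 4, 8):
    wc, util = simulate(n, cores_per_executor=2)
    print(f"{n} executors: wall clock {wc:.0f} s, cluster utilization {util:.0f}%")
```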
Stage Metrics
It shows complete information about each stage.
Field | Description |
---|---|
Stage Id | The Id of the stage. |
WallClock% | Percentage of time taken by the stage with respect to the overall pipeline time. |
Task Count | Number of tasks in the stage. |
WallClock Time | Total time taken by the stage, in milliseconds. |
Maximum Task Memory | Maximum memory taken by any task of the stage. |
PRatio | Number of tasks in the stage divided by the number of cores. Represents the degree of parallelism in the stage. |
TaskSkew | Duration of the largest task in the stage divided by the duration of the median task. Represents the degree of skew in the stage. |
IO% | Total memory used by the stage as a percentage of the total memory used by the pipeline during execution. |
Task Runtime% | Total time spent by the stage's tasks during execution on the executors. |
Input | Input bytes read by the stage. |
Output | Output bytes written by the stage. |
Shuffle-Input | Shuffle data read by the stage. |
Shuffle-Output | Shuffle data written by the stage. |
WallClockTime Ideal | Time required by the stage in the ideal application. |
OneCoreComputeHours Available | Total compute hours available for this stage. |
OneCoreComputeHours Used% | Compute hours used in this stage, as a percentage. |
OneCoreComputeHours Wasted% | Compute hours wasted in this stage, as a percentage. |
Task StageSkew | Duration of the largest task in the stage divided by the total duration of the stage. Represents the impact of the largest task on stage time. |
OIRatio | Output-to-input ratio: total output of the stage (results + shuffle write) divided by total input (input data + shuffle read). |
ShuffleWrite% | Amount of time spent in shuffle writes across all tasks in the given stage as a percentage. |
ReadFetch% | Amount of time spent in shuffle read across all tasks in the given stage as a percentage. |
GC% | Amount of time spent in GC across all tasks in the given stage as a percentage. |
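The ratio metrics defined in the table above can be illustrated with a short sketch; all per-stage figures below are hypothetical:

```python
# Minimal sketch of the per-stage ratio metrics (PRatio, TaskSkew, StageSkew, OIRatio).
# The stage figures are illustrative, not taken from a real report.
from statistics import median

task_durations = [10, 12, 11, 30]          # task durations in the stage (seconds)
cores = 4
stage_wall_clock = 33                      # wall clock time of the stage (seconds)
input_bytes, shuffle_read = 2_000_000_000, 500_000_000
output_bytes, shuffle_write = 400_000_000, 600_000_000

p_ratio    = len(task_durations) / cores                   # tasks per available core
task_skew  = max(task_durations) / median(task_durations)  # largest task vs median task
stage_skew = max(task_durations) / stage_wall_clock        # largest task vs stage time
oi_ratio   = (output_bytes + shuffle_write) / (input_bytes + shuffle_read)

print(f"PRatio={p_ratio:.2f}  TaskSkew={task_skew:.2f}  "
      f"StageSkew={stage_skew:.2f}  OIRatio={oi_ratio:.2f}")
```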