Miscellaneous Functions

BitwiseNOT

Computes bitwise NOT of the given integer.

Input Parameters: Column arg0

  • arg0 (required) - An integer or a double column.
Output TypeInteger Column
Configuration ParametersNo
ReturnsThe one’s complement of the integer
ThrowsApplication Exception

Use Case:

bitwise

Example:

bitwiseNOT(7) will return 8


Coalesce

Returns the first non-null argument if exists. Otherwise, null.

Input Parameters:

arg0:columns representing expressions.

Output TypeInteger Column
Configuration ParametersNo
ReturnsReturns the first non-null argument if exists. Otherwise, null.
ThrowsApplication Exception

Example:

coalesce(NULL, 1, NULL) will return 1


from_json

Returns a struct value with the given jsonStr and schema.

Input Parameters:

arg0: The Json string column.

arg1: The schema column.

arg2: The properties map.

Configuration ParametersNo
ReturnsReturns the struct value.
ThrowsApplication Exception

Example:

from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE') will return {"a":1, "b":0.8} and from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy')) will return {“time”:“2015-08-26 00:00:00.0”}.


hash

Returns a hash value of the arguments.

Input Parameters:

arg0:The columns for which hash to be calculated.

Configuration ParametersNo
ReturnsReturns a hash value of the arguments.
ThrowsApplication Exception

Example:

hash('Spark', array(123), 2) will return -1321691492


input_file_name

Returns the name of the file being read, or empty string if not available.

Configuration ParametersNo
ReturnsReturns the name of the file being read, or empty string if not available.
ThrowsApplication Exception

Example:

input_file_name() - will return the name of the file being read


map

Creates a map with the given key/value pairs.

Input Parameters:

arg0:The columns for key and value.

Configuration ParametersNo
ReturnsReturns the map.
ThrowsApplication Exception

Example:

map(1.0, '2', 3.0, '4') will return {1.0:“2”,3.0:“4”}


map_concat

Returns the union of all the given maps

Input Parameters:

arg0:The map columns.

Configuration ParametersNo
ReturnsReturns the union of all the given maps.
ThrowsApplication Exception

Example:

map_concat(map(1, 'a', 2, 'b'), map(2, 'c', 3, 'd')) will return {1:“a”,2:“c”,3:“d”}


map_from_arrays

Creates a map with a pair of the given key/value arrays. All elements in keys should not be null.

Input Parameters:

arg0:Array of keys.

arg1:Array of values.

Configuration ParametersNo
ReturnsReturns the map.
ThrowsApplication Exception

Example:

map_from_arrays(array(1.0, 3.0), array('2', '4')) will return {1.0:“2”,3.0:“4”}.


map_from_entries

Returns a map created from the given array of entries.

Input Parameters:

arg0:Array of entries.

Configuration ParametersNo
ReturnsReturns the map.
ThrowsApplication Exception

Example:

map_from_entries(array(struct(1, 'a'), struct(2, 'b'))) will return {1:“a”,2:“b”}


map_keys

Returns an unordered array containing the keys of the map.

Input Parameters:

arg0:Map column.

Configuration ParametersNo
ReturnsReturns the array.
ThrowsApplication Exception

Example:

map_keys(map(1, 'a', 2, 'b')) will return [1,2]


map_values

Returns an unordered array containing the values of the map.

Input Parameters:

arg0:Map column.

Configuration ParametersNo
ReturnsReturns the array.
ThrowsApplication Exception

Example:

map_values(map(1, 'a', 2, 'b')) will return [“a”,“b”]


nanvl

R Returns expr1 if it’s not NaN, or expr2 otherwise.

Input Parameters:

arg0:expr1 column.

arg1:expr2 column.

Configuration ParametersNo
ReturnsReturns expr1 if it’s not NaN, or expr2 otherwise.
ThrowsApplication Exception

Example:

nanvl(cast('NaN' as double), 123) will return 123.0


not

Perform logical not of given column.

Input Parameters:

arg0:Given boolean column

Configuration ParametersNo
ReturnsReturns logical not of given column.
ThrowsApplication Exception

Example:

not(false) will return true


schema_of_json

Returns schema in the DDL format of JSON string.

Input Parameters:

arg0:Given json string column

Configuration ParametersNo
ReturnsReturns schema of the json.
ThrowsApplication Exception

Example:

schema_of_json('[{"col":0}]') will return array.


sequence

Generates an array of elements from start to stop (inclusive), incrementing by step.

The type of the returned elements is the same as the type of argument expressions. Supported types are: byte, short, integer, long, date, timestamp.The start and stop expressions must resolve to the same type.

If start and stop expressions resolve to the ‘date’ or ’timestamp’ type then the step expression must resolve to the ‘interval’ type, otherwise to the same type as the start and stop expressions.

Input Parameters:

arg0:start - an expression. The start of the range.

arg1:stop - an expression. The end the range (inclusive).

arg2:step - an optional expression. The step of the range. By default step is 1 if start is less than or equal to stop, otherwise -1. For the temporal sequences it’s 1 day and -1 day respectively. If start is greater than stop then the step must be negative, and vice versa.

Configuration ParametersNo
ReturnsReturns the sequence
ThrowsApplication Exception

Example:

sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month) will return [2018-01-01,2018-02-01,2018-03-01]


size

Returns the size of an array or a map.

The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true.

If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input.

By default, the spark.sql.legacy.sizeOfNull parameter is set to true.

Input Parameters:

arg0: array or map column.

Configuration ParametersNo
ReturnsReturns the size
ThrowsApplication Exception

Example:

size(array('b', 'd', 'c', 'a')) will return 4


struct

Creates a struct with the given field values.

Input Parameters:

arg0:columns using which the struct will be created.

Configuration ParametersNo
ReturnsReturns the struct column
ThrowsApplication Exception

to_json

Returns a JSON string with a given struct value.

Input Parameters:

arg0:struct column.

arg1:additional options map.

Configuration ParametersNo
ReturnsReturns a JSON string with a given struct value.
ThrowsApplication Exception

Example:

to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')) will return {“time”:“26/08/2015”}


unhex

Converts hexadecimal expr to binary.

Input Parameters:

arg0:The hexadecimal column.

Configuration ParametersNo
ReturnsReturns the binary.
ThrowsApplication Exception

Example:

decode(unhex('537061726B2053514C'), 'UTF-8') will return Spark SQL.


Crc32

Computes a cyclic redundancy check value for string.

Input Parameters: Column arg0

  • arg0 (required) - A string column.
Output TypeInteger Column
Configuration ParametersNo
ReturnsBigint value.
ThrowsApplication Exception

Use Case:

cr32

Example:

crc32("ABC") will return 2743272264


Isnan

Returns true if the input is Not a Number.

Input Parameters: Column arg0

  • arg0 (required) - A column whose values needs to be checked.
Output TypeBoolean Column
Configuration ParametersNo
ReturnsA boolean true if the value is not a number.
ThrowsApplication Exception

Use Case:

isnan

Example:

isnan("abc") will return true


Isnull

Returns true if “a” is NULL and false otherwise.

Input Parameters: Column arg0

  • arg0 (required) - A column whose values needs to be checked.
Output TypeBoolean Column
Configuration ParametersNo
ReturnsA boolean true if the value is null
ThrowsApplication Exception

Use Case:

isnull

Example:

isnull("abc") will return false.


Lit

Creates a Column of literal value.

Input Parameters: Object arg0

  • arg0 (required) - A constant value.
Output TypeColumn
Configuration ParametersNo
ReturnsA column with the constant value
ThrowsApplication Exception

Use Case:

lit

Example:

lit(1) will return a column having 1 in all the rows of the column.


Md5

Calculates an MD5 128-bit checksum for the string.

Input Parameters: Column arg0

  • arg0 (required) - A string column
Output TypeString Column
Configuration ParametersNo
ReturnsThe value is returned as a string of 32 hex digits, or NULL if the argument was NULL.
ThrowsApplication Exception

Use Case:

Md5

Example:

md5('ABC') will give you an output ‘902fbdd2b1df0c4f70b4a5d23525e932’.


Monotonically_increasing_id

A column expression that generates monotonically increasing 64-bit integers.

Input Parameters: No input arguments.

Output TypeInteger Column
Configuration ParametersNo
ReturnsMonotonically increasing integers.
ThrowsApplication Exception

Example:

monotonically_increasing_id() will return rows as 0,1,2…


Sha1

Calculates the SHA-1 digest for string and returns the value as a hex string.

Input Parameters: Column arg0

  • arg0 (requird) - A string column
Output TypeString Column
Configuration ParametersNo
ReturnsA hex string.
ThrowsApplication Exception

Use Case:

Sha1

Example:

sha1("ABC") will return ‘3c01bdbb26f358bab27f267924aa2c9a03fcfdb8’


Sha2

Calculates the SHA-2 family of hash functions of a string value and returns the value as a hex string. NumBits controls the number of bits in the message digest.

Input Parameters: Column arg0,int numBits

  • arg0 (required) - A string column

  • arg1 (required) - one of 224, 256, 384, or 512.

Output TypeString Column
Configuration ParametersNo
ReturnsA hex String.
ThrowsApplication Exception

Use Case:

Sha2

Example:

sha2("Sam",256) will return ‘4ecde249d747d51d8..


ShiftLeft

Bitwise left shift, Shifts a b positions to the left.

Input Parameters: Column arg0,int numBits

  • arg0 (required) - A number column

  • arg1 (required) - An integer column

Output TypeNumber Column
Configuration ParametersNo
ReturnsIf the given value is a long value, this function will return a long value else it will return an integer value
ThrowsApplication Exception

Use Case:

ShiftLeft

Example:

shiftLeft(258,2) will return 1032


ShiftRight

Bitwise right shift, Shifts a b positions to the right.

Input Parameters: Column arg0,int numBits

  • arg0 (required) - A number column

  • arg1 (required) - An integer column

Output TypeNumber Column
Configuration ParametersNo
ReturnsIf the given value is a long value, this function will return a long value else it will return an integer value
ThrowsApplication Exception

Use Case:

shiftright

Example:

shiftRight(258,2) will return 64.


ShiftRightUnsigned

Bitwise unsigned right shift, Shifts a b positions to the right.

Input Parameters: Column arg0,int numBits

  • arg0 (required) - A number column

  • arg1 (required) - An integer column.

Output TypeNumber Column
Configuration ParametersNo
ReturnsIf the given value is a long value, this function will return a long value else it will return an integer value
ThrowsApplication Exception

Use Case:

shift_right

Example:

shiftRightUnsigned(258,2) will return 64

Top