String Functions
Ascii
Computes the numeric value of the first character of the string column, and returns the result as an int column.
Input Parameters: Column arg0
- arg0(required) - The string column for which the first character’s numeric value to be calculated.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The ASCII value as an int column. |
Throws | Application Exception |
Use Case:
Example:
ascii(“An apple”)
will return 65
base64
Converts the argument from a binary bin to a base 64 string.
Input Parameters:
arg0: The Column to be converted to base64 string.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the base64 string. |
Throws | Application Exception |
Example:
base64('Spark SQL')
will return U3BhcmsgU1FM
bin
Returns the string representation of the long value expr represented in binary.
Input Parameters:
arg0: The numerical Column to be converted to represented in binary.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the binary representation. |
Throws | Application Exception |
Example:
bin(13)
will return 1101
decode
Decodes the first argument using the second argument character set.
Input Parameters:
arg0:Column to be decoded.
arg1: The charset
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the decoded column. |
Throws | Application Exception |
Example:
decode(encode('abc', 'utf-8'), 'utf-8')
will return abc
encode
Encodes the first argument using the second argument character set.
Input Parameters:
arg0:Column to be encoded.
arg1: The charset
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the encoded column. |
Throws | Application Exception |
Example:
encode('abc', 'utf-8')
will return abc
itrim
Removes the leading string contains the characters from the trim string.
Input Parameters:
arg0:the trim string characters to trim, the default value is a single space.
arg1: a string expression.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the trimed string. |
Throws | Application Exception |
Example:
ltrim('Sp','SsparkSQLS')
will return ArkSQLS
soundex
Returns Soundex code of the string.
Input Parameters:
arg0:String column
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns Soundex code of the string. |
Throws | Application Exception |
Example:
soundex('Miller')
will return M460
split
Splits str around occurrences that match regex
Input Parameters:
arg0: str string column.
arg1:the regex string
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the spilts. |
Throws | Application Exception |
Example:
split('oneAtwoBthreeC', '[ABC]')
will return [“one”,“two”,“three”,""]
trim
Remove the leading and trailing trimStr characters from str
Input Parameters:
arg0:The trimStr String column.
arg1:The str string
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the trimed string. |
Throws | Application Exception |
Example:
trim('SL', 'SsparkSQLS')
will return parkSQ
unbase64
Converts the argument from a base 64 string str to a binary.
Input Parameters:
arg0:The base 64 String column
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the unbase64 of string. |
Throws | Application Exception |
Example:
unbase64('U3BhcmsgU1FM')
will return Spark SQL
Concat
Concatenates multiple input string columns together into a single string column.
Input Parameters: Column… arg0
arg0 (required) - The String columns to be concatenated.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The concatenated string as a single string column. |
Throws | Application Exception |
Use Case:
Example:
concat("format","string")
will return “formatstring”
Concat_ws
Concatenates multiple input string columns together into a single string column, using the given separator.
Input Parameters: Object arg0, Column… arg1
arg0*(required)* - The separator to be used.
arg1 (required) - The String columns to be concatenated.
Output Type | Column |
---|---|
Configuration Parameters | No |
Returns | The concatenated strings using the given separator as a single string column |
Throws | Application Exception |
Example:
concat_ws("-","format","string")
will return “format-string”.
Expr
Parses the expression string into the column that it represents.
Input Parameters: Object arg0
- arg0 (required) - The expression string to be parsed
Output Type | Column |
---|---|
Configuration Parameters | No |
Returns | The column return by the expression evaluation. |
Throws | Application Exception |
Use Case:
Example:
expr("colA \* 3")
will return 6 if the value of column ‘colA’ is 2 for a particular row.
Format_string
Formats the arguments in printf-style and returns the result as a string column.
Input Parameters: Object arg0, Column arg1
arg0 (required) - The printf style format.
arg1 (required) - The columns to be formatted.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The formatted arguments as a string column. |
Throws | Application Exception |
Example:
format_string("the %s jumped over the %s, %d times","cow","moon",2)
will return “the cow jumped over the moon 2 times”.
Initcap
Computes a new string column by converting the first letter of each word to uppercase.
Input Parameters: Column arg0
- arg0 (required) - The input string column
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The converted string column. |
Throws | ApplicationException |
Use Case:
Example:
initcap("apple")
will return (“Apple”)
Instr
Locate the position of the first occurrence of given substring in the given string column.
Input Parameters: Column arg0, Object arg1
arg0 (required) - The string column in which the location to be determined.
arg1 (required) - The substring for which the position to be determined.
Output Type | Integer Column |
---|---|
Configuration Parameters | No |
Returns | Returns the computed last day. |
Throws | Application Exception |
Example:
instr("apple","le")
will return 4
Length
Computes the length of a given string or binary column.
Input Parameters: Column arg0
arg0 (required) - The string column for which the length to be determined.
Output Type | Integer Column |
---|---|
Configuration Parameters | No |
Returns | The computed length. |
Use Case:
Example:
length("apple")
will return 5
Levenshtein
Computes the Levenshtein distance of the two given string columns.
Input Parameters: Column arg0, Column arg1
arg0 (required) - The first string column from which the Levenshtein distance from the second string column to be determined.
arg1 (required) - The second string column
Output Type | Integer Column |
---|---|
Configuration Parameters | No |
Returns | The computed Levenshtein distance. |
Throws | Application Exception |
Use Case:
Example:
levenshtein("kitten", "sitting")
will return 3
Locate
Locate the position of the first occurrence of given substring in a string column, after the given position.
Input Parameters: Object arg0, Column arg1, int arg2
arg0 (required) - The String for which the location to be determined.
arg1 (required) - The String column in which the location to be determined.
arg2 (required) - The position after which the location to be determined
Output Type | Integer Column |
---|---|
Configuration Parameters | No |
Returns | The position of the first occurrence of substring. |
Throws | Application Exception |
Use Case:
Example:
locate("apple","An apple",1)
will return 3
Lower
Converts a string column to lower case.
Input Parameters: Column arg0
- arg0 (required) - The string column to be converted to lower case
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | String in lower case. |
Throws | Application Exception |
Use Case:
Example:
lower("Apple")
will return “apple”
Lpad
Left-pad the string column with the given string, to a given length.
Input Parameters: Column arg0, int arg1, Object arg2
arg0 (required) - The string column to be left-padded.
arg1 (required) - The length for the padding.
arg2 (required) - The string to use for left-pad.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The Left-padded string. |
Throws | Application Exception |
Use Case:
Example:
lpad("SQL Tutorial", 20, "ABC")
will return “ABCABCABSQL Tutorial”
Ltrim
Trim the spaces from left end for the specified string column.
Input Parameters: Column arg0
- arg0 (required) - The string column from which left spaces to be trimmed.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The trimmed string column. |
Use Case:
Example:
ltrim(“apple”)
will return “apple”
Regexp_extract
Extracts a specific group matched by a Java regex, from the specified string column.
Arguments: Column arg0, Object arg1, int arg2
arg0 (required) - The string column from which the group to be extracted.
arg1 (required) - The string specifying the regex.
arg2 (required) - The regex group id.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The extracted group. |
Throws | Application Exception |
Use Case:
Example:
regexp_extract("foothebar","foo(.\*?)(bar)", 2)
will return “bar”
Regexp_replace
Replace all substrings of the specified string column that match regexp with the given replacement.
Input Parameters: Column arg0, Object arg1, Object arg2
arg0 (required) - The string column from which substrings to be replaced.
arg1 (required) - The pattern to be used.
arg2 (required) - The replacement.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The string column after replacement of substrings that match regexp. |
Throws | Application Exception |
Use Case:
Example:
regexp_replace("foobar", "oo|ar", "")
will return “fb”
Rename_column
Rename the given column.
Input Parameters: Object arg0
arg0 (required) - The column name to be renamed.
Output Type | Column |
---|---|
Configuration Parameters | No |
Returns | The renamed column. |
Throws | Application Exception |
Use Case:
Example:
rename_column("age")
will rename “age” to given output column name.
Repeat
Repeats each value in ‘arg0’ column ‘arg1’ times.
Input Parameters: Column arg0,int arg1
arg0 (required) - A column needs to be repeated
arg1 (required) - Integer value representing no of times arg0 is to be repeated.
Output Type | Column |
---|---|
Configuration Parameters | No |
Returns | A repeated value. |
Throws | Application Exception |
Use Case:
Example:
repeat("str",2)
will return ‘strstr’
Reverse
Reverses the string column and returns it as a new string column.
Input Parameters: Column arg0
arg0 (required) - The string column to be reversed.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The reversed string column. |
Throws | Application Exception |
Use Case:
Example:
reverse("apple")
will return “elppa”
Rpad
Right-pad the string column with the given string, to a given length.
Input Parameters: Column arg0, int arg1, Object arg2
arg0 (required) - The string column to be right-padded.
arg1 (required) - The length for the padding.
arg2 (required) - The string to use for right-pad.
Output Type | StringColumn |
---|---|
Configuration Parameters | No |
Returns | The Right-padded string. |
Throws | Application Exception |
Use Case:
Example:
rpad("SQL Tutorial", 20, "ABC")
will return “SQL TutorialABCABCAB”
Rtrim
Trim the spaces from right end for the specified string column.
Input Parameters: Column arg0
arg0 (required) - The string column from which right spaces to be trimmed.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The trimmed string column. |
Use Case:
Example:
rtrim("apple ")
will return “apple”
Substring
Substring starts at given position and is of given length when given argument is String type or returns the slice of byte array that starts at given position in byte and is of given length when given argument is Binary type.
Input Parameters: Column arg0, int arg1, int arg2
arg0 (required) - The String column from which substring to be extracted.
arg1 (required) - The start position for the substring.
arg2 (required) - The end position for the substring.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | Returns the computed last day. |
Use Case:
Example:
substring("foo bar",4,6)
will return “bar”
Substring_index
Computes the substring from given string column before given count occurrences of the given delimiter.
Input Parameters: Column arg0, Object arg1, int arg2
arg0 (required) - The String column from which substring to be extracted.
arg1 (required) - The delimiter.
arg2 (required) - The count occurrences for the delimiter.
Output Type | Column |
---|---|
Configuration Parameters | No |
Returns | The result substring. |
Use Case:
![substrign _index](./images/18_Functions/substrign _index.JPG)
Example:
substring_index("www.xyz.com",".",2)
will return www.xyz
Translate
Translate any character in the given string by a given character in given replaceString.
Input Parameters: Column arg0, Object arg1, Object arg2
arg0 (required) - The string column in which the translation to be done.
arg1 (required) - The matching character.
arg2 (required) - The replacement character.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The translated string. |
Example:
translate("The foo bar","f","t")
will return “The too bar”
Trim
Trim the spaces from both ends for the specified string column.
Input Parameters: Column arg0
arg0 (required) - The string column from which spaces to be trimmed.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The trimmed string value. |
Throws | Application Exception |
Use Case:
Example:
trim(" An apple ")
will return “An apple”
Upper
Converts a string column to upper case.
Input Parameters: Column arg0
arg0 (required) - The string column to be converted to upper case.
Output Type | String Column |
---|---|
Configuration Parameters | No |
Returns | The converted string column. |
Use Case:
Example:
upper("aPPle")
will return “APPLE”
If you have any feedback on Gathr documentation, please email us!