Apache Pig - SIZE() Function
What is the SIZE() Function in Apache Pig?
- The SIZE() function in Apache Pig computes the number of elements in a value; what counts as an element depends on the Pig data type of that value.
- The SIZE() function includes NULL values in the size computation.
Syntax
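The syntax statement itself appears to be missing from this page. In the Pig Latin built-in function reference, SIZE() takes a single argument, an expression of any Pig data type:

```pig
SIZE(expression)
```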
- The table below lists the value that SIZE() returns for each Pig data type.
Data type | Return value |
---|---|
int, long, float, double | For all of these types, SIZE() returns 1. |
chararray | For a chararray, SIZE() returns the number of characters in the array. |
bytearray | For a bytearray, SIZE() returns the number of bytes in the array. |
tuple | For a tuple, SIZE() returns the number of fields in the tuple. |
bag | For a bag, SIZE() returns the number of tuples in the bag. |
map | For a map, SIZE() returns the number of key/value pairs in the map. |
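The rows of the table can be tried out in one statement. The following is a minimal sketch, not taken from the original tutorial: the file name `sample.txt` and the aliases `lines` and `sizes` are hypothetical, and the built-in functions TOTUPLE, TOBAG and TOMAP are used only to construct values of each complex type.

```pig
-- Hypothetical input: any text file with at least one line.
lines = LOAD 'sample.txt' AS (line:chararray);

sizes = FOREACH lines GENERATE
    SIZE(1)                             AS int_size,       -- numeric scalar
    SIZE('pig')                         AS chararray_size, -- characters in the string
    SIZE(TOTUPLE(line, line))           AS tuple_size,     -- fields in the tuple
    SIZE(TOBAG(line, line, line))       AS bag_size,       -- tuples in the bag
    SIZE(TOMAP('k1', line, 'k2', line)) AS map_size;       -- key/value pairs in the map

DUMP sizes;
```

Following the table, each output row should report sizes of 1, 3, 2, 3 and 2 respectively, regardless of what the input line contains.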
Example
Suppose an input file of employee records has been loaded into Pig as a relation named employee_data.
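The input file and the load statement are not shown on this page. A load of this kind might look like the sketch below, where the path `employee.txt`, the comma delimiter and every field in the schema except `name` are assumptions made for illustration:

```pig
-- Hypothetical path, delimiter and schema; only the relation name
-- employee_data and the field name come from the text.
employee_data = LOAD 'employee.txt' USING PigStorage(',')
    AS (id:int, name:chararray, age:int, city:chararray);

-- Print the schema to confirm that name was loaded as a chararray.
DESCRIBE employee_data;
```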
Calculating the Size of the Type
Next, we calculate the size of the name field for each record in employee_data.
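The statement for this step is also missing from the page. One way to write it is sketched here. The load is repeated so that the script runs on its own. As before, the path, delimiter and schema are hypothetical, and so are the aliases `name_size` and `name_length`.

```pig
-- Hypothetical path, delimiter and schema, repeated so this script is self-contained.
employee_data = LOAD 'employee.txt' USING PigStorage(',')
    AS (id:int, name:chararray, age:int, city:chararray);

-- For a chararray, SIZE() returns the number of characters.
name_size = FOREACH employee_data GENERATE name, SIZE(name) AS name_length;

DUMP name_size;
```

With this schema, a record whose name is `Robin` would produce the tuple `(Robin,5)`, because name is a chararray and SIZE() counts its characters.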