[Solved-2 Solutions] How to access an array element in Pig?
What is an array?
- An array is a data structure that contains a group of elements. Typically these elements are all of the same data type, such as an integer or string.
- Arrays are commonly used in computer programs to organize data so that a related set of values can be easily sorted or searched.
Problem:
How to access an array element in Pig?
Solution 1:
- Pig Latin is a procedural data-flow scripting language rather than a declarative relational one like SQL, so it is well suited to working on groups with operators nested inside a FOREACH block.
- The elements of each group can be reached by combining nested operators such as ORDER and LIMIT inside a FOREACH statement.
Example (for each id, return the v2 value of the row with the second-highest v1):
A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:float, v2:float);
B = GROUP A BY id; -- isolate all rows for the same id
C = FOREACH B { -- here comes the scripting bit
    elems = ORDER A BY v1 DESC; -- sort the rows belonging to this id, highest v1 first
    two = LIMIT elems 2; -- keep the top 2 rows
    two_invers = ORDER two BY v1 ASC; -- sort in the opposite order to bubble the second value to the top
    second = LIMIT two_invers 1; -- keep only that row
    GENERATE group AS id, FLATTEN(second.v2); -- group is already a scalar id, so only the bag is flattened
};
DUMP C;
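The nested block above relies on two basic ways of reaching into Pig data: addressing a field by its zero-based position ($0, $1, ...) and projecting a column out of the bag that GROUP produces. The following is a minimal sketch of both, assuming the same 'input' file and schema as the example; the aliases P, G, V and the field name v2_values are illustrative only.

```pig
-- Assumes the same comma-separated 'input' file and schema as the example above.
A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:float, v2:float);

-- Positional access: $0 is id, $1 is v1, $2 is v2 (positions are zero-based).
P = FOREACH A GENERATE $0 AS id, $2 AS v2;

-- After GROUP, each row has the form (group, bag of matching A rows).
G = GROUP A BY id;

-- A.v2 projects the v2 column out of the bag, giving one bag of v2 values per id.
V = FOREACH G GENERATE group AS id, A.v2 AS v2_values;

DUMP V;
```

Positional references are handy when no schema is declared, while named fields such as v2 are usually easier to read when one is available.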
Solution 2:
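The code below sorts the rows by id and, within each id, by v1 in descending order, then projects id and v2, so the elements for each id are listed in rank order: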
A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:int, v2:int);
B = ORDER A BY id ASC, v1 DESC;
C = FOREACH B GENERATE id, v2;
DUMP C;