[Solved-1 Solution] Generate multiple outputs with Hadoop Pig?
Problem:
How can we generate multiple outputs for multiple files loaded in Pig?
Solution 1:
Why do we need SPLIT?
- The SPLIT operator partitions the contents of a relation into two or more relations based on conditional expressions. Because each condition is evaluated independently for every tuple, both of the following can happen:
- A tuple may be assigned to more than one relation (it satisfies several conditions)
- A tuple may not be assigned to any relation (it satisfies none of the conditions)
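The two cases above can be seen in a small sketch. The file name `numbers.txt`, the field `n`, and the relation names are hypothetical and chosen only for illustration.

```pig
-- Hypothetical input: one integer per line in 'numbers.txt'.
nums = LOAD 'numbers.txt' AS (n:int);

-- Each condition is checked independently for every tuple.
SPLIT nums INTO
    small IF n < 5,
    odd   IF n % 2 == 1,
    big   IF n > 100;

-- A tuple with n = 3 satisfies two conditions, so it lands in both small and odd.
-- A tuple with n = 50 satisfies none of them, so it appears in no output relation.
DUMP small;
DUMP odd;
```

If no tuple should be dropped, recent Pig versions also accept a final `alias OTHERWISE` branch that collects the tuples matching none of the conditions.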
Suppose a directory contains multiple files that Pig needs to load, flatten, and store.
Load the directory:
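A minimal sketch of the load step follows. The original listing is not available, so the HDFS path, the comma delimiter, and the three-field schema are assumptions made for illustration.

```pig
-- Hypothetical directory holding several comma-delimited files.
-- Pointing LOAD at a directory reads every file inside it into one relation.
raw = LOAD '/user/hypothetical/input_dir' USING PigStorage(',')
      AS (id:int, category:chararray, items:chararray);

-- Print the schema to confirm the fields were picked up as expected.
DESCRIBE raw;
```

If the later split has to depend on which file a row came from, recent Pig versions let `PigStorage` take a `-tagFile` option that prepends the source file name as the first field; the schema then needs an extra leading column for it.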
Format the data based on your requirements.
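What "formatting" means depends on the data. As one possible sketch, assume the hypothetical `items` field holds space-separated values that should become one row each, and that `category` needs normalizing.

```pig
-- Same hypothetical input and schema as assumed for the load step.
raw = LOAD '/user/hypothetical/input_dir' USING PigStorage(',')
      AS (id:int, category:chararray, items:chararray);

-- TOKENIZE turns the items string into a bag of tokens;
-- FLATTEN then emits one output row per token.
formatted = FOREACH raw GENERATE
    id,
    LOWER(TRIM(category)) AS category,
    FLATTEN(TOKENIZE(items)) AS item;

DESCRIBE formatted;
```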
Use the SPLIT operator to split the relation into multiple relations based on the conditions, then store each resulting relation separately.
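Put together, the split and store steps might look like the sketch below. The category values, relation names, and output paths are hypothetical, and the `OTHERWISE` branch assumes a Pig version that supports it.

```pig
-- Same hypothetical input and schema as assumed for the load step.
raw = LOAD '/user/hypothetical/input_dir' USING PigStorage(',')
      AS (id:int, category:chararray, items:chararray);

-- Route each tuple to a relation according to its category.
SPLIT raw INTO
    fruit     IF category == 'fruit',
    vegetable IF category == 'vegetable',
    other     OTHERWISE;

-- One STORE per relation gives one output directory per split.
STORE fruit     INTO '/user/hypothetical/output/fruit'     USING PigStorage(',');
STORE vegetable INTO '/user/hypothetical/output/vegetable' USING PigStorage(',');
STORE other     INTO '/user/hypothetical/output/other'     USING PigStorage(',');
```

Each `STORE` writes its own directory of part files, and the job fails if an output directory already exists, so the paths should be removed or changed between runs.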
Output: