[Solved - 2 Solutions] Skipping the header while loading a text file using Pig Latin
What is FILTER?
- The FILTER operator is used to select the required tuples from a relation based on a condition.
Syntax
- Here is the syntax of the FILTER operator.
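As a rough sketch, the general form is shown in the comment below, followed by a small example. The file name `students.txt`, the relation names, and the field names are assumptions made up for illustration only.

```pig
-- General form:
--   Relation2_name = FILTER Relation1_name BY (condition);

-- Hypothetical example: keep only the tuples whose city is 'Chennai'.
students = LOAD 'students.txt' USING PigStorage(',')
           AS (name:chararray, city:chararray);
chennai_students = FILTER students BY city == 'Chennai';
DUMP chennai_students;
```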
Problem:
- Suppose you have a text file whose first row contains the header. When the file is loaded using PigStorage, the header row is loaded along with the data, so any operation on the data also picks up the header. Is it possible to skip the header?
- Here is the command used to load the data:
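A load statement of this kind might look like the following sketch. The file name `input.txt`, the comma delimiter, and the two-column schema are assumptions for illustration.

```pig
-- Hypothetical file and schema; PigStorage(',') splits each line on commas.
A = LOAD 'input.txt' USING PigStorage(',')
    AS (state:chararray, name:chararray);

-- The first tuple printed is the header row, because PigStorage
-- treats every line of the file as data.
DUMP A;
```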
Solution 1:
Using Filter:
- The usual way to solve this problem is to apply a FILTER on a value that we know appears only in the header.
For example, consider the following sample data:
We can filter out the header as shown below:
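A minimal sketch of this approach, assuming a hypothetical file `states.csv` whose first line is the header `STATE,NAME` (the sample rows in the comments are illustrative only):

```pig
-- Assumed contents of states.csv:
--   STATE,NAME
--   MN,Minnesota
--   CA,California

A = LOAD 'states.csv' USING PigStorage(',')
    AS (state:chararray, name:chararray);

-- Drop the tuple whose first field is the literal header text 'STATE'.
B = FILTER A BY state != 'STATE';
DUMP B;
```

This only works when the header value cannot also occur as a real data value in that column.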
Solution 2:
Here is another way to achieve this:
- Load the complete file, including the header record, into a relation.
- Use the Linux tail command to stream only the data records.
- To verify that the header record has been removed, use the following command:
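The three steps above could be sketched as follows. The input path, relation names, and schema are assumptions for illustration; `STREAM ... THROUGH` and `tail -n +2` are standard, but the exact script is only one possible way to wire them together.

```pig
-- Step 1: load the whole file, header included (hypothetical path and schema).
input_file = LOAD 'input.csv' USING PigStorage(',')
             AS (col1:chararray, col2:chararray);

-- Step 2: tail -n +2 prints everything from the second line onward,
-- so the header record is not passed through.
no_header = STREAM input_file THROUGH `tail -n +2`
            AS (col1:chararray, col2:chararray);

-- Step 3: verify that the header record is gone.
DUMP no_header;
```

Because the streamed command runs separately in each task, this approach is probably only safe when the input is small enough to be read by a single task. With several input splits, the first record of each split would likely be dropped as well.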