[Solved-1 Solution] Applying TRIM() in Pig to all fields in a tuple
TRIM()
- The TRIM() function in Pig accepts a string (chararray) and returns a copy of it with the leading and trailing whitespace removed.
Syntax
- The syntax of the TRIM() function is TRIM(expression), where expression evaluates to a chararray.
Example
- Assume we have some unwanted spaces before and after the names of the employees in the records of the pers_data relation.
- By using the TRIM() function in a FOREACH ... GENERATE statement, we can remove these leading and trailing spaces from the names. Assuming the name field is called name, the statement takes the form trim_data = FOREACH pers_data GENERATE TRIM(name);
- This statement returns a copy of each person's name with the leading and trailing spaces removed. The result is stored in the relation named trim_data.
Problem:
Suppose you are loading a CSV file with 56 fields and need to apply the TRIM() function in Pig to every field in the tuple.
But the attempt fails with an error.
Solution 1:
- To trim every field of a tuple in Pig, we can create a UDF. Register the UDF, then apply it in a FOREACH statement to the fields of the tuple that need trimming.
Below is the code for trimming the tuple with UDF.
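As a rough illustration of what such a UDF does internally, the plain-Java sketch below walks over the fields of one record and trims every string value, leaving nulls and non-string values alone. It deliberately avoids Pig's own classes so that it can run on its own; the names TrimFields and trimAll are hypothetical, and a plain List stands in for the Pig tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the per-field logic a "trim every field" Pig UDF could apply.
// TrimFields and trimAll are hypothetical names, not part of Pig.
final class TrimFields {

    // Returns a new list in which every String field is trimmed;
    // nulls and non-String values are copied unchanged.
    static List<Object> trimAll(List<?> fields) {
        List<Object> out = new ArrayList<>(fields.size());
        for (Object field : fields) {
            if (field instanceof String) {
                // String.trim() strips leading and trailing whitespace
                out.add(((String) field).trim());
            } else {
                out.add(field);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A short stand-in for one row of the 56-field CSV
        List<Object> row = trimAll(Arrays.asList("  John ", null, " Smith", 42));
        System.out.println(row);
    }
}
```

Inside a real UDF, the same loop would typically sit in the exec(Tuple input) method of a class extending Pig's EvalFunc<Tuple>, reading each field with input.get(i) and returning a new tuple. Handling all fields in one loop avoids writing TRIM() 56 times in the script.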