[Solved-1 Solution] Datetime parsing in Apache Pig
What is parsing?
- Parsing methods convert the string representation of a date and time to an equivalent DateTime object.
- Parsing is influenced by the properties of a format provider that supplies information such as the strings used for date and time separators, and the names of months, days, and eras.
- The format provider is the current DateTimeFormatInfo object, which is provided implicitly by the current thread culture or explicitly by the IFormatProvider parameter of a parsing method.
- For the IFormatProvider parameter, specify a CultureInfo object, which represents a culture, or a DateTimeFormatInfo object.
Problem:
We are trying to parse a date in a Pig script, and the job fails with the message "Hadoop does not return any error message". Here is an example of the date format: 16/7/18 11:00 AM
The error appears to be triggered by the STORE command on the alias "times".
If we run DUMP instead, we get this error:
ERROR 1066: Unable to open iterator for alias times
It happens only when we use the ToDate function.
Solution 1:
- Specify the loader explicitly in the LOAD statement.
- Always declare the schema in the LOAD statement, with a type for each field.
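A minimal sketch of such a LOAD statement, assuming a comma-separated input file. The path input/times.csv, the alias raw, and the field names id and ts are hypothetical. The format pattern assumes the sample date is written as day/month/two-digit year; if the first number is actually the year, 'yy/M/d h:mm a' would be the pattern to try instead.

```pig
-- Hypothetical path and field names; adjust them to the real data.
-- Name the loader explicitly and give every field a type.
raw = LOAD 'input/times.csv' USING PigStorage(',')
      AS (id:int, ts:chararray);

-- ts is declared as chararray, so it can be passed to ToDate with a pattern.
-- Assumed layout: day/month/two-digit year, 12-hour clock with AM/PM marker.
times = FOREACH raw GENERATE id, ToDate(ts, 'd/M/yy h:mm a') AS dt;

DUMP times;
```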
After this, the date conversion works fine:
- (2016-03-09T23:55:00.000Z) (2016-03-09T23:55:00.000Z) (2016-03-09T23:55:00.000Z)
Use the code below:
- PigStorage is the default load function for the LOAD operator.
- The original issue was caused by the missing data type.
- If you don't assign types, fields default to type bytearray, whereas ToDate with a format string expects its date argument as a chararray.
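To illustrate these points, the sketch below loads the same file twice, once without types and once with them, and uses DESCRIBE to print the schema Pig infers in each case. The tab-separated file input/times.tsv, the aliases, the field names, and the output path are all assumptions made for illustration, and the date pattern again assumes a day/month/two-digit-year layout. The explicit (chararray) cast at the end is one possible way to handle a relation that was loaded untyped, not necessarily what the original answer used.

```pig
-- No USING clause and no types: Pig falls back to PigStorage (tab-delimited)
-- and treats every field as bytearray.
untyped = LOAD 'input/times.tsv' AS (id, ts);
DESCRIBE untyped;

-- Same hypothetical file with the loader and the field types spelled out.
typed = LOAD 'input/times.tsv' USING PigStorage('\t')
        AS (id:int, ts:chararray);
DESCRIBE typed;

-- Alternative for an untyped relation: cast the bytearray field to
-- chararray before handing it to ToDate.
times = FOREACH untyped GENERATE
            (int)id AS id,
            ToDate((chararray)ts, 'd/M/yy h:mm a') AS dt;

STORE times INTO 'output/times' USING PigStorage('\t');
```

Comparing the two DESCRIBE results is a quick way to check whether a field reached ToDate as a bytearray or as a chararray.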