On 13.06.23 16:52, James McMahon wrote:
Hello. I have a task to parse dates out of incoming raw content. Of course the date patterns can take any number of forms - YYYY-MM-DD, YYYY/MM/DD, YYYYMMDD, MMDDYYYY, etc. I can build myself a robust regex to match a broad set of such patterns in the raw data, but I wonder if there is a project or library available for Groovy that already offers this?
I always wanted to try https://github.com/joestelmach/natty/tree/master or at least https://github.com/sisyphsu/dateparser one day... never got around to it ;)
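For the regex route mentioned in the question, a first pass that only pulls out candidate strings could look roughly like the sketch below. It uses plain java.util.regex from the JDK (callable from Groovy), not either of the libraries above. The class name DateCandidates and the sample text are made up for illustration, and the pattern covers only the four layouts named in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects substrings that look like dates, without validating them.
public class DateCandidates {
    // Either 4 digits, separator, 2 digits, separator, 2 digits (YYYY-MM-DD, YYYY/MM/DD),
    // or exactly 8 digits (YYYYMMDD, MMDDYYYY).
    // The lookarounds reject matches that sit inside a longer run of digits.
    private static final Pattern CANDIDATE = Pattern.compile(
        "(?<!\\d)(?:\\d{4}[-/]\\d{2}[-/]\\d{2}|\\d{8})(?!\\d)");

    public static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // The 9-digit id is skipped because it is not a standalone 8-digit token.
        System.out.println(find("shipped 2023-06-13, due 2023/07/01, ref 20230615, id 123456789"));
    }
}
```

The matches are only candidates: something like 99999999 passes the regex, so real validation has to happen when the string is actually parsed as a date.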
Assuming I get pattern matches parsed out of my raw data, I will have a collection of strings representing year-month-days in a variety of formats. I'd then like to normalize them to a standard form so that I can sort and compare them. I intend to identify the range of dates in the raw data as a sorted Groovy list.
Once the library has identified the format, this is the easy step [...]
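As a rough sketch of that step without any third-party library: try a fixed list of java.time formatters in order, keep whatever parses as a LocalDate, and sort. The class name DateNormalizer, the format list and the sample strings are assumptions for illustration only. Note that an 8-digit string can be ambiguous (10111213 is valid both as YYYYMMDD and as MMDDYYYY); this sketch simply takes the first format that parses.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: normalizes date strings in a few known layouts to LocalDate.
public class DateNormalizer {
    // Tried in this order; the first formatter that parses wins (assumed priority).
    private static final List<DateTimeFormatter> FORMATS = Arrays.asList(
        strict("uuuu-MM-dd"), strict("uuuu/MM/dd"), strict("uuuuMMdd"), strict("MMdduuuu"));

    // STRICT rejects impossible dates such as month 20 or February 30.
    private static DateTimeFormatter strict(String pattern) {
        return DateTimeFormatter.ofPattern(pattern).withResolverStyle(ResolverStyle.STRICT);
    }

    // Returns the parsed date, or null if no known format fits.
    public static LocalDate parse(String raw) {
        for (DateTimeFormatter f : FORMATS) {
            try {
                return LocalDate.parse(raw, f);
            } catch (DateTimeParseException e) {
                // Not this layout; fall through to the next formatter.
            }
        }
        return null;
    }

    // Parses every string, drops the ones that fit no format, and sorts chronologically.
    public static List<LocalDate> sortedDates(List<String> raws) {
        List<LocalDate> out = new ArrayList<>();
        for (String raw : raws) {
            LocalDate d = parse(raw.trim());
            if (d != null) {
                out.add(d);
            }
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedDates(
            Arrays.asList("2023/07/01", "06152023", "20230613", "not a date")));
    }
}
```

LocalDate prints in ISO form (YYYY-MM-DD) and compares chronologically, so the first and last elements of the sorted list give the date range of the input.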
I intend to write a Groovy script that will run from an Apache NiFi ExecuteScript processor. I'll read in my data flowfile content using a buffered reader so I can handle flowfiles that may be large.
What does large mean? 1TB? Then BufferedReader may not be the right choice ;)

bye Jochen