Re: Reading and updating rule-sets from a file

2020-07-03 Thread Lorenzo Nicora
Thanks Till, I understand that making my FileInputFormat "unsplittable" guarantees that a file is always read by a single task. But how can I produce a single record for the entire file? As my file is a CSV with some idiosyncrasies, I am extending CsvInputFormat so as not to reinvent the wheel of CSV parsing
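One possible way to get a single record per file is to drain the whole input inside one `nextRecord` call and report `reachedEnd` only after that record has been emitted. The sketch below shows just that pattern in plain Java, without any Flink dependency. `WholeFileRuleReader` is a hypothetical name, the method names only mirror Flink's `reachedEnd()`/`nextRecord(...)` contract, and the comma split is a naive placeholder for the real CSV parsing. How this maps onto a subclass of CsvInputFormat is an assumption, not something confirmed in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an unsplittable format: the whole file becomes one record. */
final class WholeFileRuleReader {
    private final BufferedReader reader;
    private boolean emitted = false;

    WholeFileRuleReader(Reader source) {
        this.reader = new BufferedReader(source);
    }

    /** True only after the single whole-file record has been handed out. */
    boolean reachedEnd() {
        return emitted;
    }

    /** Reads every remaining line and returns all parsed rows as one record. */
    List<String[]> nextRecord() {
        if (emitted) {
            return null; // nothing left: the file was already emitted as one record
        }
        List<String[]> rules = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                // Naive split; the idiosyncratic CSV parsing would go here instead.
                rules.add(line.split(",", -1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        emitted = true;
        return rules;
    }
}
```

Called on a two-line input, `nextRecord()` returns one list holding both rows, and `reachedEnd()` is true afterwards, so a downstream operator would see the complete rule-set in a single element rather than one element per row.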

Re: Reading and updating rule-sets from a file

2020-07-01 Thread Till Rohrmann
Hi Lorenzo,
you could try deriving your own InputFormat (extending FileInputFormat) in which you set the field `unsplittable` to true. That way, an InputSplit is the whole file, and you can handle the set of new rules as a single record.
Cheers,
Till
On Mon, Jun 29, 2020 at 3:52 PM Lo