Re: How to read files in distributed way from a pcollection

2017-08-22 Thread Chamikara Jayalath
TextIO doesn't retain filenames. It looks like the proposed API for reading whole files [1] retains filenames, so you should be able to use that to produce a PCollection of KV once it's available.

- Cham

[1] https://issues.apache.org/jira/browse/BEAM-2750

On Mon, Aug 21, 2017 at 9:59 PM Siddharth Mittal

How to read files in distributed way from a pcollection

2017-08-21 Thread Siddharth Mittal
Hi Team,

I have a use case where I will get a PCollection of file names. The files are present on NFS, and file sizes may vary from a few KBs to a few GBs. We want to transform the PCollection of file names to a PCollection of

Please suggest how to handle this type of use case.

Thanks & Regards
Siddharth Mi
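
For illustration only, here is a minimal sketch of the mapping discussed in this thread: one file name goes in, and (file name, line) pairs come out, so the name is not lost the way it is with TextIO. It is plain Python with no Beam dependency. The function `read_lines_with_name`, the `demo` helper, and the sample file names are hypothetical and not part of any Beam API. In a real pipeline this per-element logic would presumably live inside a flat-map style step applied to the collection of file names.

```python
import os
import tempfile
from typing import Iterator, List, Tuple


def read_lines_with_name(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (file name, line) pairs for one file (hypothetical helper)."""
    # Iterating over the file object reads one line at a time, so a file of
    # several GBs is never loaded into memory in one piece.
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            yield path, line.rstrip("\n")


def demo() -> List[Tuple[str, str]]:
    """Run the mapping over a local list standing in for the file-name collection."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        # Assumed sample data, created only so the sketch is self-contained.
        for name, text in [("a.txt", "x\ny\n"), ("b.txt", "z\n")]:
            full = os.path.join(tmp, name)
            with open(full, "w", encoding="utf-8") as out:
                out.write(text)
            paths.append(full)

        # Flat-map: every file name expands into zero or more keyed lines.
        return [
            (os.path.basename(name), line)
            for path in paths
            for name, line in read_lines_with_name(path)
        ]


if __name__ == "__main__":
    for pair in demo():
        print(pair)
```

In this sketch each file is read by a single call, so work is distributed across files but not within one large file; splitting a multi-GB file across workers would need support from the runner or the IO connector.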