Hi Ken,

Yep, correct. Thank you.

On Wed, Oct 31, 2018 at 7:24 PM Ken Krugler <kkrugler_li...@transpac.com> wrote:

> Hi Madan,
>
> If your source has a parallelism > 1, then when the CSV file is split,
> only one of the operators will get the split with the header row.
>
> So in that case, how would you communicate the column name->index
> information to the other operators?
>
> If you force a parallelism of 1 for the source, then I’m pretty sure
> you’re guaranteed that the file will be processed in order.
>
> — Ken
>
> On Oct 31, 2018, at 12:50 AM, madan <madan.yella...@gmail.com> wrote:
>
> Hi,
>
> When we are splitting a csv file into multiple parts we are not sure which
> part is read first. Is there any way to make sure first part with header is
> read first ? I need to read header line first to store column name vs index
> and use this index for processing next records.
>
> I could read header line from the file before submitting job to the flink,
> but that way we are opening the file 2 times. Is there any better way to do
> this? Please suggest.
>
> --
> Thank you.
>
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> Custom big data solutions & training
> Flink, Solr, Hadoop, Cascading & Cassandra
>
>

--
Thank you,
Madan.
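
For reference, the option mentioned in the quoted thread (reading the header line once before submitting the job, then using a column name vs index map for the remaining records) could look roughly like the sketch below. This is plain Java with no Flink calls. The class name CsvHeaderIndex and its methods are hypothetical names made up for illustration, and it assumes a simple comma-delimited header with no quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: builds a column name -> index map
// from the header line of a CSV file.
class CsvHeaderIndex {

    // Splits the header on plain commas. Assumes no quoted fields or embedded
    // delimiters; a real CSV parser would be needed for those cases.
    static Map<String, Integer> parseHeader(String headerLine) {
        Map<String, Integer> index = new LinkedHashMap<>();
        String[] names = headerLine.split(",", -1);
        for (int i = 0; i < names.length; i++) {
            index.put(names[i].trim(), i);
        }
        return index;
    }

    // Reads only the first line from the reader and parses it as the header.
    static Map<String, Integer> readHeader(BufferedReader reader) throws IOException {
        String first = reader.readLine();
        if (first == null) {
            throw new IOException("empty input, no header line");
        }
        return parseHeader(first);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the real file; for a file on disk,
        // java.nio.file.Files.newBufferedReader(path) could be used instead.
        String sample = "id,name,amount\n1,foo,10\n2,bar,20\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            Map<String, Integer> columns = readHeader(reader);
            System.out.println(columns);
        }
    }
}
```

Only the first line is read here, so the second open of the file is cheap. The resulting map is small and serializable, so one possible way to get it to all parallel operators (an assumption on my side, not something confirmed in this thread) is to pass it into the user function's constructor before the job is submitted.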