Re: splitIntoBundles vs. generateInitialSplits

2017-04-13 Thread Etienne Chauchot
to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi, A short terminology question regarding "bundle", and particularly splitIntoBundles vs. generateInitialSplits. In *BoundedSource* we have: List> *splitIntoBundles*(...) In *UnboundedSource* we have: List>

Re: splitIntoBundles vs. generateInitialSplits

2017-04-13 Thread Jean-Baptiste Onofré
readers rather than after creating readers and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi, A short terminology question regarding "bundle", and particularly splitIntoBundles vs. generateInitialSplits. In *BoundedSource* we have: List> *

Re: splitIntoBundles vs. generateInitialSplits

2017-04-13 Thread Etienne Chauchot
startup to be able to split up the work before creating readers rather than after creating readers and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi, A short terminology question regarding "bundle", and particularly splitInto

Re: splitIntoBundles vs. generateInitialSplits

2017-03-20 Thread Stas Levin
o think of them as occupying the same niche. I'll let someone else who was around for naming discuss whether

Re: splitIntoBundles vs. generateInitialSplits

2017-03-20 Thread Jean-Baptiste Onofré
ces & Reader methods are called in, but a runner trying to get efficiency would be able to use splitIntoBundles during job startup to be able to split up the work before creating readers rather than after creating readers and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM

Re: splitIntoBundles vs. generateInitialSplits

2017-03-20 Thread Ismaël Mejía
ey are doing slightly different things: a bounded source is really kind of creating physical chunks of the data, whereas the streaming source is creating conceptual divisions of

Re: splitIntoBundles vs. generateInitialSplits

2017-01-11 Thread Jean-Baptiste Onofré
g to get efficiency would be able to use splitIntoBundles during job startup to be able to split up the work before creating readers rather than after creating readers and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi, A short terminology question regarding

Re: splitIntoBundles vs. generateInitialSplits

2017-01-11 Thread Stas Levin
les during job startup to be able to split up the work before creating readers rather than after creating readers and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi,

Re: splitIntoBundles vs. generateInitialSplits

2017-01-10 Thread Eugene Kirpichov
wn order the Sources & Reader methods are called in, but a runner trying to get efficiency would be able to use splitIntoBundles during job startup to be able to split up the work before creating readers rather than after creating readers and waiting to use

Re: splitIntoBundles vs. generateInitialSplits

2017-01-09 Thread Stas Levin
e to use splitIntoBundles during job startup to be able to split up the work before creating readers rather than after creating readers and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi, A sho

Re: splitIntoBundles vs. generateInitialSplits

2017-01-09 Thread Stephen Sisk
and waiting to use splitAtFraction. S On Sun, Jan 8, 2017 at 6:06 AM Stas Levin wrote: Hi, A short terminology question regarding "bundle", and particularly splitIntoBundles vs. generateInitialSplits. In *BoundedSource* we have: List> *s

splitIntoBundles vs. generateInitialSplits

2017-01-08 Thread Stas Levin
Hi, A short terminology question regarding "bundle", and particularly splitIntoBundles vs. generateInitialSplits. In *BoundedSource* we have: List> *splitIntoBundles*(...) In *UnboundedSource* we have: List> *generateInitialSplits*(...) I was wondering if the names were in
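
For readers skimming the thread, the contrast between the two methods can be sketched with a toy model. This is only an illustration and not Beam code: SplitSketch, Range, splitBySize and splitByCount are hypothetical names made up here, and the parameter names desiredBundleSizeBytes and desiredNumSplits are an assumption about how the SDK of that period named its split hints, so check them against the Javadoc. The first method cuts a bounded input into physical chunks of a requested size; the second divides a set of partitions conceptually into a requested number of groups.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these types or methods are Beam classes. */
final class SplitSketch {

    /** Half-open byte range [start, end) of a hypothetical bounded input. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Bounded style: the caller asks for a chunk size, and the number of
     * pieces falls out of the total input size.
     */
    static List<Range> splitBySize(long totalBytes, long desiredBundleSizeBytes) {
        if (desiredBundleSizeBytes <= 0) {
            throw new IllegalArgumentException("desiredBundleSizeBytes must be positive");
        }
        List<Range> bundles = new ArrayList<>();
        for (long start = 0; start < totalBytes; start += desiredBundleSizeBytes) {
            long end = Math.min(start + desiredBundleSizeBytes, totalBytes);
            bundles.add(new Range(start, end));
        }
        return bundles;
    }

    /**
     * Unbounded style: the caller asks for a number of splits, and partition
     * ids are dealt round-robin into at most that many groups.
     */
    static List<List<Integer>> splitByCount(int partitionCount, int desiredNumSplits) {
        if (desiredNumSplits <= 0) {
            throw new IllegalArgumentException("desiredNumSplits must be positive");
        }
        int numSplits = Math.min(desiredNumSplits, partitionCount);
        List<List<Integer>> splits = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            splits.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            splits.get(p % numSplits).add(p);
        }
        return splits;
    }

    public static void main(String[] args) {
        // 100 bytes in chunks of at most 30 bytes gives four ranges.
        System.out.println(splitBySize(100, 30));
        // Five partitions dealt into two groups.
        System.out.println(splitByCount(5, 2));
    }
}
```

In this toy version the size-based split returns however many ranges the input needs, while the count-based split never returns more groups than were asked for, which is one way to read the "physical chunks" versus "conceptual divisions" distinction made in the replies.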