As the parameter's name (parallelism_hint) suggests, it is the number of parallel spout instances you want to start. Of course, the UDF code must be parallelizable, e.g., different instances should emit different data. If the UDF code is not parallelizable (e.g., the UDF reads a single file), using a parallelism greater than one will result in duplicated input. In this case, just set the value to one (i.e., don't specify it at all; one is the default value anyway).
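To make the duplication point concrete, here is a minimal plain-Java sketch. It does not use any Storm classes; `ParallelismSketch`, `emitAll`, `emitShare`, and `runAll` are hypothetical names made up for this illustration, and the round-robin split by instance index is just one assumed way of making instances emit different data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of parallel spout instances; no Storm API is involved.
class ParallelismSketch {

    // Hypothetical non-parallelizable instance: it ignores its index
    // and reads the whole input, like a UDF that reads a single file.
    static List<String> emitAll(List<String> input) {
        return new ArrayList<>(input);
    }

    // Hypothetical parallelizable instance: it emits only the records
    // at positions instanceIndex, instanceIndex + parallelism, ...
    static List<String> emitShare(List<String> input, int instanceIndex, int parallelism) {
        List<String> out = new ArrayList<>();
        for (int i = instanceIndex; i < input.size(); i += parallelism) {
            out.add(input.get(i));
        }
        return out;
    }

    // Starts `parallelism` simulated instances one after another
    // and collects everything they emit.
    static List<String> runAll(List<String> input, int parallelism, boolean partitioned) {
        List<String> emitted = new ArrayList<>();
        for (int idx = 0; idx < parallelism; idx++) {
            emitted.addAll(partitioned
                    ? emitShare(input, idx, parallelism)
                    : emitAll(input));
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d");
        // Two instances that each read everything: the input is duplicated.
        System.out.println(runAll(input, 2, false)); // [a, b, c, d, a, b, c, d]
        // Two instances that each take their own share: every record once.
        System.out.println(runAll(input, 2, true));  // [a, c, b, d]
    }
}
```

In a real spout, the instance index and the total number of instances would have to come from the runtime rather than from method arguments; how you obtain them depends on the API you are using.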
-Matthias

On 06/09/2015 01:55 PM, Rakeshsharma PR wrote:
> What the purpose of the third argument in setSpout(). It’s a numeric
> parameter.
>
> Please explain its purpose
>
> Thanks
>
> rakesh
