On Jun 5, 2013, at 03:35 , Sky Zhao wrote:

> Thanks Matthieu and Daniel,
>
> Just like benchmark project, using thread pool to send events with
> parallelism, I am ready to test this,
> I remember Matthieu said one way of improviment is to use setParallelism in
> PE, I don't know whether can improve the events throughput?
Shipped benchmarks provide a baseline, but throughput is very dependent on
the application. It may or may not help to use the "setParallelism" method
for a given stream; you'll have to try. The javadoc for that method has more
info.

Matthieu

>
> /Sky
>
>
> -----Original Message-----
> From: Daniel Gómez Ferro [mailto:danie...@yahoo-inc.com]
> Sent: Tuesday, June 04, 2013 9:37 PM
> To: s4-user@incubator.apache.org
> Cc: Matthieu Morel
> Subject: Re: Congradulat for s4 0.6 incubating version released
>
>
>
> On Tue Jun 4 15:15:54 2013, Matthieu Morel wrote:
>>
>> On Jun 4, 2013, at 10:48 , Sky Zhao wrote:
>>
>>> Thanks Mattieu, see below pls
>>> *From:*Matthieu Morel [mailto:mmo...@apache.org] *Sent:*Tuesday, June
>>> 04, 2013 4:32 PM *To:*s4-user@incubator.apache.org
>>> <mailto:s4-user@incubator.apache.org>
>>> *Subject:*Re: Congradulat for s4 0.6 incubating version released On
>>> Jun 4, 2013, at 04:23 , Sky Zhao wrote:
>>>
>>>
>>> Here is one question,
>>> S4 can send/receive 200,000 events/s, how s4 does it, more memory?
>>> more nodes? more machines?
>>> How get this big figure?
>>> Have a look at the benchmarks subproject. (you'll need to slightly
>>> adapt the code since it uses the old API though).
>>>
>>>
>>> I try an example by read files and send events, only 1000 events/s in
>>> one node one machine 2g memory.
>>> This looks related to your application and the way you inject events.
>>> The reference benchmarks don't do I/O except for passing messages.
>>> How is your code like? Are you outputting on the console?
>>> How long is the run? Have you tried generating without reading from
>>> the file system? Are you using a separate injector application?
>>> Yes, I use new File(filename), then readline, and use adapter put the
>>> stream into events(like getRemoteStream().put(event);) for every
>>> line.
>>> My PE output to console, also export to csv by Metrics, 88M
>>> data(1237632 events) for 20 minutes sending in only one stream, no
>>> try without reading, no other inject application, only one
>>> application, what is the reason do you think?
>>
>> Outputting to the console from the PE is very likely to impact the
>> performance.
>>
>> Matthieu
>>
>
> In addition to removing the writes to the console, I'd suggest you to
> generate the events instead of reading them, and maybe use more than one
> injector in order to saturate the processing PE.
>
> Cheers,
>
> Daniel
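
For reference, the figures quoted above (1237632 events in 20 minutes) work out to roughly 1237632 / 1200 s ≈ 1031 events/s, which matches the "only 1000 events/s" observation. The suggestions in this thread (generate events in memory instead of reading a file, use more than one injector, and avoid per-event console output) could be tried with something like the sketch below. It is plain Java, not S4 code: the class name `InjectorSketch`, the `run` method and the `Consumer<String>` sink are hypothetical names made up for illustration, and the sink only stands in for the place where an adapter would call something like getRemoteStream().put(event).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch, not part of S4: several injector threads generate
// synthetic events in memory and hand them to a sink.
public class InjectorSketch {

    /**
     * Runs `injectors` threads, each generating `eventsPerInjector` synthetic
     * events and passing them to `sink`. Returns the number of events delivered.
     * The sink is called from several threads, so it must be thread-safe.
     */
    public static long run(int injectors, int eventsPerInjector, Consumer<String> sink) {
        AtomicLong delivered = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(injectors);
        for (int i = 0; i < injectors; i++) {
            final int id = i;
            pool.submit(() -> {
                for (int n = 0; n < eventsPerInjector; n++) {
                    // Events are generated here, with no file system access.
                    // Assumption: in a real adapter, the sink would wrap the
                    // call that puts the event on the remote stream.
                    sink.accept("injector-" + id + "-event-" + n);
                    delivered.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("injectors did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for injectors", e);
        }
        return delivered.get();
    }

    public static void main(String[] args) {
        AtomicLong received = new AtomicLong();
        long start = System.nanoTime();
        // The sink only counts events; nothing is printed per event.
        long sent = run(4, 250_000, event -> received.incrementAndGet());
        double seconds = (System.nanoTime() - start) / 1e9;
        // A single summary line at the end of the run.
        System.out.printf("sent %d events in %.3f s (%.0f events/s)%n",
                sent, seconds, sent / seconds);
    }
}
```

Comparing the rate reported with a counting sink against the rate obtained when the sink prints every event gives a rough idea of how much console output costs on a given machine. The numbers only describe the injection side and say nothing about what the PE itself can process.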