I think that I am getting close to that, but the code is currently trying out different options, so it would be more a review of other ideas and techniques prior to cleanup. Also, I don't know how to do a review with this toolset :-(
I would welcome some review, particularly of the zipwriter, and I have some questions on the zip file format options.

I still have some more digging to do:
1. Some of the numbers are a little surprising, so I want to play with the size and queues and threads to see why, for instance, Z2 is slower than Z1 when using stored.
2. I thought that I would see the impact of using SeekableByteChannel for the output stage reading and writing the file.
3. I need to do some verification that the content is valid (any suggestions welcome).

Mike
Sent from my BlackBerry® wireless device

-----Original Message-----
From: Alan Bateman <[email protected]>
Date: Thu, 12 May 2011 13:23:30
To: <[email protected]>
Cc: core-libs-dev Libs<[email protected]>
Subject: Re: proposal to optimise the performance of the Jar utility

[email protected] wrote:
> Hi,
> I have an update of the optimisations to date
>
> In summary jar can be 3 to 4 times faster and becomes CPU bound on all 4
> cores of my dev system
>
> the results are as a CSV below
> I have included tests of the 1.6 and 1.7 code and runtime for comparison
>
> the optimisation that I have completed are
> 1. increase output buffer size
> 2. add an option (D) to omit file dates (date loading was a significant
>    overhead before (4), less so afterwards, but still measurable
>    improvements and may not be useful in some circumstances (e.g. my use
>    cases))
> 3. pipeline the scanning for the files with the output, queuing file info
>    via a BlockingQueue
> 4. rewrite the scanning to use the FileVisitor
> 5. temporary option (Zn) to specify the parallel option used
>    Z0 runs the load of the file in a single thread
>    Z1 runs parallel load of small files into memory caches, and runs the
>       load of the zip file in another thread
>    Z2 uses a (mostly) parallel version of Zip/JarOutputStream called
>       Zip/JarWriter
> 6. decreased the calls to BufferedOutputStream.write(int) (in Z2 mode) to
>    limit the overheads of synchronisation
> 7. modified some Jar internal data structures
> 8. eliminated double reading of a file in STORED mode for 'small' files
>    (was once for CRC and again for data)
> 9. probably a few other tweaks that I have forgotten
>
> I have only looked at the create path. I have not modified the update or
> extract paths
>
Mike - the results look very good. Are you at the point yet where you have a patch to discuss and review?

-Alan.
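
For anyone following the thread without the patch, items 3 and 4 in the quoted list (pipelining the file scan with the output via a BlockingQueue, and scanning with a FileVisitor) could look roughly like the sketch below. This is not the code from the proposed change: the class name PipelinedZip, the END sentinel, the queue capacity and the 64 KB buffer size are all assumptions made for illustration, and it assumes the output file lives outside the tree being scanned.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: one thread scans the tree, the caller's thread writes the zip.
public class PipelinedZip {
    // Sentinel (compared by identity) telling the writer that the scan has finished.
    private static final Path END = Paths.get("");

    public static int create(final Path root, Path zipFile) throws Exception {
        // Bounded queue so the scanner cannot run arbitrarily far ahead (capacity is an assumption).
        final BlockingQueue<Path> queue = new LinkedBlockingQueue<>(1024);
        ExecutorService scanner = Executors.newSingleThreadExecutor();
        Future<?> scan = scanner.submit(() -> {
            try {
                Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        try {
                            queue.put(file); // blocks while the writer is behind
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return FileVisitResult.TERMINATE;
                        }
                        return FileVisitResult.CONTINUE;
                    }
                });
            } finally {
                queue.put(END); // always release the writer, even if the scan failed
            }
            return null;
        });

        int count = 0;
        try (ZipOutputStream out = new ZipOutputStream(
                new BufferedOutputStream(Files.newOutputStream(zipFile), 1 << 16))) {
            for (Path file = queue.take(); file != END; file = queue.take()) {
                // Zip entry names always use '/' as the separator.
                String name = root.relativize(file).toString().replace(File.separatorChar, '/');
                out.putNextEntry(new ZipEntry(name));
                Files.copy(file, out);
                out.closeEntry();
                count++;
            }
            scan.get(); // rethrows any scan failure as an ExecutionException
        } finally {
            scanner.shutdownNow();
        }
        return count;
    }
}
```

The writer here is still single-threaded, so this corresponds only to the scan/output overlap, not to the Z1 or Z2 modes described above.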

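
On the open question of verifying that the content is valid, one possible approach is to read every entry back through java.util.zip.ZipFile and compare the recomputed CRC-32 and byte count against the values recorded in the archive. The sketch below is only a suggestion under that assumption; ZipVerifier and verify are hypothetical names, and it checks internal consistency of the archive, not that the bytes match the original source files.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: re-reads each entry and checks it against its recorded CRC and size.
public class ZipVerifier {
    /** Returns a list of human-readable problems; an empty list means no mismatch was found. */
    public static List<String> verify(File zip) throws IOException {
        List<String> problems = new ArrayList<>();
        byte[] buf = new byte[1 << 16];
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                CRC32 crc = new CRC32();
                long size = 0;
                // CheckedInputStream updates the checksum as the (uncompressed) data is read.
                try (InputStream in = new CheckedInputStream(zf.getInputStream(entry), crc)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        size += n;
                    }
                }
                if (entry.getCrc() != -1 && crc.getValue() != entry.getCrc()) {
                    problems.add(entry.getName() + ": CRC mismatch");
                }
                if (entry.getSize() != -1 && size != entry.getSize()) {
                    problems.add(entry.getName() + ": size mismatch");
                }
            }
        }
        return problems;
    }
}
```

Running the same check over archives produced by the Z0, Z1 and Z2 paths, and comparing the entry lists between them, would give a basic cross-check of the parallel writers against the single-threaded one.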