Daniel,
I think you missed a very important paragraph at the top of the page:
"See www.hdfgroup.org for more information on HDF. The Nabble interface here is 
used for read-only access to the list archives. If you'd like to send messages 
to hdf-forum, you must be subscribed to the actual mailing list and send your 
messages through that email interface."
http://hdf-forum.184993.n3.nabble.com/

The Nabble.com site only displays the messages. You can't use that site to 
reply to anything on the mailing list. You must send an email to the list 
address itself, [email protected], from your own mail client 
(Gmail, Outlook, Apple Mail, etc.).

From your previous emails and your use of the word "forum", it seems you think 
nabble.com is an online forum for HDF (where users can post and respond). That 
is not the case. The hdf-forum is a mailing list, not an online forum.

I hope this clears things up. 
Regards,
-Corey

On Sep 3, 2013, at 12:14 PM, Mohamad Chaarawi wrote:

> Hi Daniel,
> 
> I'm not sure what the issue with the mailing list is, but nobody else seems 
> to have this problem. Just make sure you always send your messages and 
> replies to [email protected], not to another address.
> I'll ask the sysadmins to look into this further.
> 
> Now to your results: the multiple-file strategy is almost always going to be 
> the fastest. There is no lock contention and no inter-process communication 
> overhead.
> The difference in performance with the single file strategy still seems a bit 
> high in your case, but again I'm saying this with a total lack of knowledge 
> on how your benchmark/application is accessing the file. I do not believe 
> chunking will help here.
> 
> One thing worth trying is varying the number of MPI aggregators. Which MPI 
> library are you using? The MPI-IO layer is most probably ROMIO, so it 
> should accept info hints (the top-level implementation might ignore those 
> hints, but you can check anyway).
> So use an MPI info object, passed to H5Pset_fapl_mpio(), to set the number 
> of MPI aggregators (cb_nodes) and the collective buffer size 
> (cb_buffer_size). A full list of ROMIO hints can be found here:
> http://www.mcs.anl.gov/research/projects/romio/doc/users-guide.pdf
> I would set cb_nodes to the stripe count and try cb_buffer_size equal to the 
> stripe size. Those are not necessarily the ideal values, but they are a good 
> place to start.
> 
> I know that all this tuning is a burden for an HDF5 application user, but 
> that is what needs to be done today to get good performance. There has been 
> some work aimed at auto-tuning this parameter space with a separate tool, 
> but the architecture is not yet user-friendly enough for someone to simply 
> grab, deploy, and run.
> 
> Thanks,
> Mohamad
> 
> 
> 
> -----Original Message-----
> From: Hdf-forum [mailto:[email protected]] On Behalf Of 
> Daniel Langr
> Sent: Tuesday, September 03, 2013 10:38 AM
> To: [email protected]
> Subject: Re: [Hdf-forum] Very poor performance of pHDF5 when using single 
> (shared) file
> 
> Mohamad,
> 
> I really do not understand how to reply to this forum :(. I tried to reply to 
> your post, which I received via e-mail. In this e-mail, there was the 
> following note:
> 
> "
> If you reply to this email, your message will be added to the discussion 
> below:
> http://hdf-forum.184993.n3.nabble.com/Very-poor-performance-of-pHDF5-when-using-single-shared-file-tp4026443p4026449.html
> "
> 
> So, I replied to this e-mail, and received another one:
> 
> Subject: Delivery Status Notification (Failure)
> 
> "
> Delivery to the following recipient failed permanently:
> [email protected]
> 
> Your email to [email protected] has been rejected 
> because you are not allowed to post to 
> http://hdf-forum.184993.n3.nabble.com/Very-poor-performance-of-pHDF5-when-using-single-shared-file-tp4026443p4026449.html
>  
> . Please contact the owner about permissions or visit the Nabble Support 
> forum.
> "
> 
> What the hell... why does it say I should reply and then that I am not 
> allowed to post to my own thread???
> 
> Anyway, I tried to post the following information:
> 
> I did some experiments yesterday using the BlueWaters cluster. The 
> stripe count is limited there to 160. For runs with 256 MPI 
> processes/cores and fixed datasets, the writing times were:
> 
> separate files: 1.36 [s]
> single file, 1 stripe: 133.6 [s]
> single file, best result: 17.2 [s]
> 
> (I did multiple runs with various combinations of stripe count and size; 
> the results above are the best I obtained.)
> 
> Increasing the number of stripes obviously helped a lot, but compared 
> with the separate-files strategy, writing is still more than 
> ten times slower. Do you think this is "normal"?
> 
> Might chunking help here?
> 
> Thanks,
> Daniel
> 
> 
> 
> On 30. 8. 2013 at 16:05, Daniel Langr wrote:
>> I've run a benchmark in which, within an MPI program, each process wrote
>> 3 plain 1D arrays to 3 datasets of an HDF5 file. I used the following
>> writing strategies:
>> 
>> 1) each process writes to its own file,
>> 2) each process writes to its own dataset in a shared file,
>> 3) all processes write to the same dataset in a shared file.
>> 
>> I've tested 1)-3) for both fixed/chunked datasets (chunk size 1024), and
>> I've tested 2)-3) for both independent/collective options of the MPI
>> driver. I've also used 3 different clusters for measurements (all quite
>> modern).
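>> For reference, strategy 3 can be sketched as follows (an untested, 
>> simplified sketch, assuming a 1-D dataset of doubles; the name "data" and 
>> the per-process count N are illustrative, not from my actual benchmark):

```c
/* Untested sketch of strategy 3: all ranks write disjoint hyperslabs of
 * one shared 1-D dataset, with collective transfer enabled on the
 * dataset transfer property list.  Assumes `file` was opened with an
 * MPI-IO file access property list (H5Pset_fapl_mpio). */
#include <hdf5.h>
#include <mpi.h>

void write_shared(hid_t file, const double *buf, hsize_t N,
                  int rank, int nprocs)
{
    /* One dataset sized for all processes' data combined. */
    hsize_t total = N * (hsize_t)nprocs;
    hid_t filespace = H5Screate_simple(1, &total, NULL);
    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Each rank selects its own contiguous slab of the dataset. */
    hsize_t start = (hsize_t)rank * N;
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL, &N, NULL);
    hid_t memspace = H5Screate_simple(1, &N, NULL);

    /* Independent vs collective I/O is chosen here; switching this flag
     * to H5FD_MPIO_INDEPENDENT gives the other variant I tested. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl);
    H5Sclose(memspace);
    H5Sclose(filespace);
    H5Dclose(dset);
}
```

>> Strategy 2 differs only in that each rank creates its own dataset and 
>> writes the whole dataspace, with no hyperslab selection needed.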
>> 
>> As a result, the running (storage) times of the same-file strategies, i.e.
>> 2) and 3), were orders of magnitude longer than the running times of
>> the separate-files strategy. For illustration:
>> 
>> cluster #1, 512 MPI processes, each process stores 100 MB of data, fixed
>> data sets:
>> 
>> 1) separate files: 2.73 [s]
>> 2) single file, independent calls, separate data sets: 88.54[s]
>> 
>> cluster #2, 256 MPI processes, each process stores 100 MB of data,
>> chunked data sets (chunk size 1024):
>> 
>> 1) separate files: 10.40 [s]
>> 2) single file, independent calls, shared data sets: 295 [s]
>> 3) single file, collective calls, shared data sets: 3275 [s]
>> 
>> Any idea why the single-file strategy gives so poor writing performance?
>> 
>> Daniel
> 
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
> 

-- 
Corey Bettenhausen
Science Systems and Applications, Inc
NASA Goddard Space Flight Center
301 614 5383
[email protected]

