It's likely a BIOS bug.
But I can't say more until you send the relevant data as explained earlier.
Brice



On 20/12/2014 18:10, Sergio Manzetti wrote:
> Dear Brice, the BIOS is the latest. However, I wonder if this
> could be a hardware error, as the Open MPI sources claim. Is there any
> way to find out if this is a hardware error?
>
> Thanks
>
>
> > From: users-requ...@open-mpi.org
> > Subject: users Digest, Vol 3074, Issue 1
> > To: us...@open-mpi.org
> > Date: Sat, 20 Dec 2014 12:00:02 -0500
> >
> > Send users mailing list submissions to
> > us...@open-mpi.org
> >
> > To subscribe or unsubscribe via the World Wide Web, visit
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > or, via email, send a message with subject or body 'help' to
> > users-requ...@open-mpi.org
> >
> > You can reach the person managing the list at
> > users-ow...@open-mpi.org
> >
> > When replying, please edit your Subject line so it is more specific
> > than "Re: Contents of users digest..."
> >
> >
> > Today's Topics:
> >
> > 1. Re: Deadlock in OpenMPI 1.8.3 and PETSc 3.4.5
> > (Jeff Squyres (jsquyres))
> > 2. Hwloc error with Openmpi 1.8.3 on AMD 64 (Sergio Manzetti)
> > 3. Re: Hwloc error with Openmpi 1.8.3 on AMD 64 (Brice Goglin)
> > 4. best function to send data (Diego Avesani)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Fri, 19 Dec 2014 19:26:58 +0000
> > From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
> > To: "Open MPI User's List" <us...@open-mpi.org>
> > Cc: "petsc-ma...@mcs.anl.gov" <petsc-ma...@mcs.anl.gov>
> > Subject: Re: [OMPI users] Deadlock in OpenMPI 1.8.3 and PETSc 3.4.5
> > Message-ID: <027ab453-de85-4f08-bdd7-a676ca90e...@cisco.com>
> > Content-Type: text/plain; charset="us-ascii"
> >
> > On Dec 19, 2014, at 10:44 AM, George Bosilca <bosi...@icl.utk.edu>
> wrote:
> >
> > > Regarding your second point, while I do tend to agree that such an
> issue is better addressed in the MPI Forum, the last attempt to fix
> this was certainly not a resounding success.
> >
> > Yeah, fair enough -- but it wasn't a failure, either. It could
> definitely be moved forward, but it will take time/effort, which I
> unfortunately don't have. I would be willing, however, to spin up
> someone who *does* have time/effort available to move the proposal
> forward.
> >
> > > Indeed, there is a slight window of opportunity for
> inconsistencies in the recursive behavior.
> >
> > You're right; it's a small window in the threading case, but a)
> that's the worst kind :-), and b) the non-threaded case is actually
> worse (because the global state can change from underneath the loop).
> >
> > > But the inconsistencies were already in the code, especially in
> the single-threaded case. As we never received any complaints related
> to this topic, I did not deem it interesting to address them with my
> last commit. Moreover, the specific behavior needed by PETSc is
> available in Open MPI when compiled without thread support, as the
> only thing that "protects" the attributes is that global mutex.
> >
> > Mmmm. Ok, I see your point. But this is a (very) slippery slope.
> >
> > > For example, in ompi_attr_delete_all(), it gets the count of all
> attributes and then loops <count> times to delete each attribute. But
> each attribute callback can now insert or delete attributes on that
> entity. This can mean that the loop can either fail to delete an
> attribute (because some attribute callback already deleted it) or fail
> to delete *all* attributes (because some attribute callback added more).
> > >
> > > To be extremely precise, the deletion part is always correct
> >
> > ...as long as the hash map is not altered by the application
> (e.g., by adding or deleting another attribute during a callback).
> >
> > I understand that you said above that you're not worried about
> this case. I'm just nit-picking here because there is quite definitely a
> case where the loop is *not* correct. PETSc apparently doesn't trigger
> this badness, but... like I said above, it's a (very) slippery slope.
> >
> > > as it copies the values to be deleted into a temporary array
> before calling any callbacks (and before releasing the mutex), so we
> only remove what was in the object attribute hash when the function
> was called. Don't misunderstand: we have an extremely good reason to do
> it this way; we need to call the callbacks in the order in which they
> were created (as mandated by the MPI standard).
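
To picture the pattern described just above (copy the values to be deleted into a temporary array before calling any callbacks, then invoke the callbacks in creation order), here is a minimal C sketch. The type and function names are hypothetical illustrations, not Open MPI's actual internals:

    #include <stdlib.h>

    /* Hypothetical attribute entry: a keyval, its value, and its delete callback. */
    typedef void (*attr_delete_fn_t)(int keyval, void *value);

    typedef struct {
        int              keyval;
        void            *value;
        attr_delete_fn_t delete_fn;
    } attr_entry_t;

    /* Run the delete callbacks for every attribute present when this function
     * was called, in creation order.  The loop walks a private snapshot, so
     * later changes to the live table cannot perturb which entries are visited
     * or the order in which their callbacks run. */
    static void delete_all_attrs(attr_entry_t *table, size_t count)
    {
        attr_entry_t *snapshot = malloc(count * sizeof(*snapshot));
        if (snapshot == NULL)
            return;

        for (size_t i = 0; i < count; i++)      /* 1. copy what exists now */
            snapshot[i] = table[i];

        for (size_t i = 0; i < count; i++)      /* 2. callbacks, in order  */
            snapshot[i].delete_fn(snapshot[i].keyval, snapshot[i].value);

        free(snapshot);
    }
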
> > >
> > > ompi_attr_copy_all() has similar problems -- in general, the hash
> that it is looping over can change underneath it.
> > >
> > > For the copy it is a little trickier, as the calling order
> is not imposed. Our peculiar implementation of the hash table (with an
> array) makes the code work, with a single (possibly minor) exception
> when the hash table itself is grown between two calls. However, as
> stated before, this issue was already present in the code in the
> single-threaded case for years. Addressing it is another two-line patch, but I
> leave this exercise to an interested reader.
> >
> > Yeah, thanks for that. :-)
> >
> > To be clear: both the copy and the delete code could be made thread
> safe. I just don't think we should be encouraging users to be
> exercising undefined / probably not-portable MPI code.
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Fri, 19 Dec 2014 20:58:46 +0100
> > From: Sergio Manzetti <sergio.manze...@outlook.com>
> > To: "us...@open-mpi.org" <us...@open-mpi.org>
> > Subject: [OMPI users] Hwloc error with Openmpi 1.8.3 on AMD 64
> > Message-ID: <dub126-w2190e22e21596a1b834cf4e3...@phx.gbl>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> >
> >
> >
> >
> >
> > Dear all, when trying to run NWChem with Open MPI, I get this error.
> >
> >
> >
> >
> > ****************************************************************************
> > * Hwloc has encountered what looks like an error from the operating system.
> > *
> > * object intersection without inclusion!
> > * Error occurred in topology.c line 594
> > *
> > * Please report this error message to the hwloc user's mailing list,
> > * along with the output from the hwloc-gather-topology.sh script.
> >
> > Is there any rationale for solving this?
> >
> > Thanks
> >
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Fri, 19 Dec 2014 21:13:19 +0100
> > From: Brice Goglin <brice.gog...@inria.fr>
> > To: Open MPI Users <us...@open-mpi.org>
> > Subject: Re: [OMPI users] Hwloc error with Openmpi 1.8.3 on AMD 64
> > Message-ID: <549486df.50...@inria.fr>
> > Content-Type: text/plain; charset="windows-1252"
> >
> > Hello,
> >
> > The rationale is to read the message and do what it says :)
> >
> > Have a look at
> > www.open-mpi.org/projects/hwloc/doc/v1.10.0/a00028.php#faq_os_error
> > Try upgrading your BIOS and kernel.
> >
> > Otherwise install hwloc and send the output (tarball) of
> > hwloc-gather-topology to hwloc-users (not to OMPI users).
> >
> > thanks
> > Brice
> >
> >
> >
> > On 19/12/2014 20:58, Sergio Manzetti wrote:
> > >
> > >
> > > Dear all, when trying to run NWChem with Open MPI, I get this error.
> > >
> > >
> > >
> > >
> > > ****************************************************************************
> > > * Hwloc has encountered what looks like an error from the operating system.
> > > *
> > > * object intersection without inclusion!
> > > * Error occurred in topology.c line 594
> > > *
> > > * Please report this error message to the hwloc user's mailing list,
> > > * along with the output from the hwloc-gather-topology.sh script.
> > >
> > > Is there any rationale for solving this?
> > >
> > > Thanks
> > >
> > >
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > > Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/12/26045.php
> >
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Fri, 19 Dec 2014 23:56:36 +0100
> > From: Diego Avesani <diego.aves...@gmail.com>
> > To: Open MPI Users <us...@open-mpi.org>
> > Subject: [OMPI users] best function to send data
> > Message-ID:
> > <cag8o1y4b0uwydtrb+swdbra4tbk6ih5toeypga8b6vs-oty...@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > Dear all,
> > I am new to the MPI world. I would like to know the best choice among
> > the different functions and what each one means.
> >
> > In my program, I would like each process to send a vector of data to
> > all the other processes. What do you suggest?
> > Is MPI_Bcast the correct choice, or am I missing something?
> >
> > Thanks a lot
> >
> > Diego
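
For reference, here is a minimal sketch of what MPI_Bcast does: a single root rank's buffer is copied to every other rank in the communicator. The vector length and root rank below are arbitrary illustrative choices; whether this one-root-to-all pattern matches "each process sends a vector to all the other processes" is exactly the question being asked above.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        double vec[4] = {0.0, 0.0, 0.0, 0.0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                 /* only the root fills the data */
            for (int i = 0; i < 4; i++)
                vec[i] = i + 1.0;
        }

        /* After this call, every rank holds root's copy of vec. */
        MPI_Bcast(vec, 4, MPI_DOUBLE, 0 /* root */, MPI_COMM_WORLD);

        printf("rank %d: vec[0] = %g\n", rank, vec[0]);

        MPI_Finalize();
        return 0;
    }
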
> >
> > ------------------------------
> >
> > Subject: Digest Footer
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > ------------------------------
> >
> > End of users Digest, Vol 3074, Issue 1
> > **************************************
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/26048.php
