Just a quick interjection: I also have a dual quad-core Nehalem system, HT on,
24GB RAM, hand-compiled 1.3.4 with options: --enable-mpi-threads
--enable-mpi-f77=no --with-openib=no

With v1.3.4 I see roughly the same behavior: hello and ring work, but
connectivity fails randomly with np >= 8. Turning on -v increased the success
rate, but it still hangs. np = 16 fails more often, and which pair of
processes hangs is random.

However, it seems to be related to a problem in the shared-memory layer.
Running with -mca btl ^sm works consistently through np = 128.
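
For reference, the workaround command looks like this (the np value and the
test program are just examples):

  mpirun -mca btl ^sm -np 16 connectivity_c

The "^" negates the list, so Open MPI loads every BTL except sm (shared
memory); on-node traffic between processes then goes through the tcp and
self components instead.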

Hope this helps.

Mark

On Wed, Dec 9, 2009 at 8:03 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Hi Matthew
>
> Barring any misinterpretation I may have made of the code:
>
> Hello_c has no real communication, except for a final Barrier
> synchronization.
> Each process prints "hello world" and that's it.
>
> Ring probes a little more, with processes Send(ing) and
> Recv(eiving) messages.
> Ring just passes a message sequentially along all process
> ranks, then back to rank 0, and repeats the game 10 times.
> Rank 0 is in charge of counting turns, decrementing the counter,
> and printing it (nobody else prints).
> With 4 processes:
> 0->1->2->3->0->1... 10 times
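>
> (A minimal sketch of that pattern, written from memory -- not the
> actual ring_c.c source, but the same idea:)
>
> /* ring.c -- sketch of the ring pattern; compile with mpicc ring.c */
> #include <mpi.h>
> #include <stdio.h>
>
> int main(int argc, char **argv)
> {
>     int rank, size, msg;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     if (rank == 0) {                     /* rank 0 injects the counter */
>         msg = 10;
>         MPI_Send(&msg, 1, MPI_INT, 1 % size, 201, MPI_COMM_WORLD);
>     }
>     while (1) {
>         MPI_Recv(&msg, 1, MPI_INT, (rank + size - 1) % size, 201,
>                  MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>         if (rank == 0) {                 /* only rank 0 decrements/prints */
>             msg--;
>             printf("Process 0 decremented value: %d\n", msg);
>         }
>         MPI_Send(&msg, 1, MPI_INT, (rank + 1) % size, 201, MPI_COMM_WORLD);
>         if (msg == 0)
>             break;                       /* zero has gone all the way round */
>     }
>     if (rank == 0)                       /* absorb the final zero */
>         MPI_Recv(&msg, 1, MPI_INT, size - 1, 201,
>                  MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>     MPI_Finalize();
>     return 0;
> }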
>
> In connectivity, every pair of processes exchanges a message.
> It therefore probes all pairwise connections.
> In verbose mode you can see that.
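>
> (Again a from-memory sketch of the idea, not the actual connectivity_c
> source -- every pair (i,j) exchanges one message:)
>
> /* connectivity.c -- sketch: probe every pairwise connection */
> #include <mpi.h>
> #include <stdio.h>
>
> int main(int argc, char **argv)
> {
>     int rank, size, i, j, token = 0;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     for (i = 0; i < size - 1; i++) {
>         for (j = i + 1; j < size; j++) {
>             if (rank == i) {             /* lower rank sends, then receives */
>                 MPI_Send(&token, 1, MPI_INT, j, 0, MPI_COMM_WORLD);
>                 MPI_Recv(&token, 1, MPI_INT, j, 0, MPI_COMM_WORLD,
>                          MPI_STATUS_IGNORE);
>             } else if (rank == j) {      /* higher rank receives, replies */
>                 MPI_Recv(&token, 1, MPI_INT, i, 0, MPI_COMM_WORLD,
>                          MPI_STATUS_IGNORE);
>                 MPI_Send(&token, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
>             }
>         }
>     }
>     MPI_Barrier(MPI_COMM_WORLD);         /* everyone done before declaring */
>     if (rank == 0)
>         printf("Connectivity test on %d processes PASSED.\n", size);
>     MPI_Finalize();
>     return 0;
> }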
>
> These programs shouldn't hang at all on a sane system.
> Actually, they should even run with a significant level of
> oversubscription; say,
> -np 128 should work easily for all three programs on a powerful
> machine like yours.
>
>
> **
>
> Suggestions
>
> 1) Stick to the OpenMPI you compiled.
>
> **
>
> 2) You can run connectivity_c in verbose mode:
>
> /home/macmanes/apps/openmpi1.4/bin/mpirun -np 8 connectivity_c -v
>
> (Note the trailing "-v".)
>
> It should tell more about who's talking to whom.
>
> **
>
> 3) I wonder if there are any BIOS settings that may be required
> (and perhaps not in place) to make Nehalem hyperthreading
> work properly on your computer.
>
> You reach the BIOS settings by typing <DEL> or <F2>
> when the computer boots up.
> The key varies by
> BIOS and computer vendor, but shows quickly on the bootup screen.
>
> You may ask the computer vendor about the recommended BIOS settings.
> If you haven't done this before, be careful to change and save only
> what really needs to change (if anything really needs to change),
> or the result may be worse.
> (Overclocking is for gamers, not for genome researchers ... :) )
>
> **
>
> 4) What I read about Nehalem DDR3 memory is that it is optimal
> in configurations that are multiples of 3GB per CPU.
> Common configs in dual-CPU machines like yours are
> 6, 12, 24 and 48GB.
> The sockets where you install the memory modules also matter.
>
> Your computer has 20GB.
> Did you build the computer or upgrade the memory yourself?
> Do you know how the memory is installed, in which memory sockets?
> What does the vendor have to say about it?
>
> See this:
>
> http://en.community.dell.com/blogs/dell_tech_center/archive/2009/04/08/nehalem-and-memory-configurations.aspx
>
> **
>
> 5) As I said before, typing "f" then "j" in "top" will add
> a column (labeled "P") that shows which core each process is running on.
> This will let you observe how the Linux scheduler is distributing
> the MPI load across the cores.
> Hopefully it is load-balanced, and different processes go to different
> cores.
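>
> (If you prefer a one-shot view, a standard ps invocation shows the same
> thing; PSR is the processor each process last ran on, and the program
> name here is just an example:)
>
> ps -o pid,psr,pcpu,comm -C connectivity_c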
>
> ***
>
> It is very disconcerting when MPI processes hang.
> You are not alone.
> The reasons are not always obvious.
> At least in your case there is no network involved to troubleshoot.
>
>
> **
>
> I hope it helps,
>
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
>
>
>
> Matthew MacManes wrote:
>
>> Hi Gus and List,
>>
>> First of all, Gus, I want to say thanks... you have been a huge help, and when
>> I get this fixed, I owe you big time!
>>
>> However, the problems continue...
>>
>> I formatted the HD and reinstalled the OS to make sure that I was working from
>> scratch.  I did your step A, which seemed to go fine:
>>
>> macmanes@macmanes:~$ which mpicc
>> /home/macmanes/apps/openmpi1.4/bin/mpicc
>> macmanes@macmanes:~$ which mpirun
>> /home/macmanes/apps/openmpi1.4/bin/mpirun
>>
>> Good stuff there...
>>
>> I then compiled the example files:
>>
>> macmanes@macmanes:~/Downloads/openmpi-1.4/examples$
>> /home/macmanes/apps/openmpi1.4/bin/mpirun -np 8 ring_c
>> Process 0 sending 10 to 1, tag 201 (8 processes in ring)
>> Process 0 sent to 1
>> Process 0 decremented value: 9
>> Process 0 decremented value: 8
>> Process 0 decremented value: 7
>> Process 0 decremented value: 6
>> Process 0 decremented value: 5
>> Process 0 decremented value: 4
>> Process 0 decremented value: 3
>> Process 0 decremented value: 2
>> Process 0 decremented value: 1
>> Process 0 decremented value: 0
>> Process 0 exiting
>> Process 1 exiting
>> Process 2 exiting
>> Process 3 exiting
>> Process 4 exiting
>> Process 5 exiting
>> Process 6 exiting
>> Process 7 exiting
>> macmanes@macmanes:~/Downloads/openmpi-1.4/examples$
>> /home/macmanes/apps/openmpi1.4/bin/mpirun -np 8 connectivity_c
>> Connectivity test on 8 processes PASSED.
>> macmanes@macmanes:~/Downloads/openmpi-1.4/examples$
>> /home/macmanes/apps/openmpi1.4/bin/mpirun -np 8 connectivity_c
>> ..HANGS..NO OUTPUT
>>
>> This is maddening, because ring_c works... and connectivity_c worked the first
>> time, but not the second. I ran it 10 times, and it worked twice. Here is
>> the top screenshot:
>>
>>
>> http://picasaweb.google.com/macmanes/DropBox?authkey=Gv1sRgCLKokNOVqo7BYw#5413382182027669394
>>
>> What is the difference between connectivity_c and ring_c? Under what
>> circumstances should one fail and not the other?
>>
>> I'm off to the Linux forums to see about the Nehalem kernel issues..
>>
>> Matt
>>
>>
>>
>> On Wed, Dec 9, 2009 at 13:25, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>
>>    Hi Matthew
>>
>>    There is no point in trying to troubleshoot MrBayes and ABySS
>>    if not even the OpenMPI test programs run properly.
>>    You must straighten them out first.
>>
>>    **
>>
>>    Suggestions:
>>
>>    **
>>
>>    A) While you are at OpenMPI, do yourself a favor
>>    and install it from source in a separate directory.
>>    Who knows if the OpenMPI package distributed with Ubuntu
>>    works right on Nehalem?
>>    Better to install OpenMPI yourself from source code.
>>    It is not a big deal, and it may save you further trouble.
>>
>>    Recipe:
>>
>>    1) Install gfortran and g++, if you don't have them, using apt-get.
>>    2) Put the OpenMPI tarball in, say, /home/matt/downloads/openmpi
>>    3) Make another install directory *not in the system directory tree*.
>>    Something like "mkdir /home/matt/apps/openmpi-X.Y.Z/" (X.Y.Z=version)
>>    will work.
>>    4) cd /home/matt/downloads/openmpi
>>    5) ./configure CC=gcc CXX=g++ F77=gfortran FC=gfortran  \
>>    --prefix=/home/matt/apps/openmpi-X.Y.Z
>>    (Use the prefix flag to install in the directory of item 3.)
>>    6) make
>>    7) make install
>>    8) At the bottom of your /home/matt/.bashrc or .profile file
>>    put these lines:
>>
>>    export PATH=/home/matt/apps/openmpi-X.Y.Z/bin:${PATH}
>>    export MANPATH=/home/matt/apps/openmpi-X.Y.Z/share/man:`man -w`
>>    export
>>    LD_LIBRARY_PATH=/home/matt/apps/openmpi-X.Y.Z/lib:${LD_LIBRARY_PATH}
>>
>>    (If you use csh/tcsh use instead:
>>    setenv PATH /home/matt/apps/openmpi-X.Y.Z/bin:${PATH}
>>    etc)
>>
>>    9) Log out and log back in to freshen up the environment variables.
>>    10) Do "which mpicc"  to check that it is pointing to your newly
>>    installed OpenMPI.
>>    11) Recompile and rerun the OpenMPI test programs
>>    with 2, 4, 8, 16, .... processors.
>>    Use full path names to mpicc and to mpirun,
>>    if the change of PATH above doesn't work right.
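>>
>>    For example, a quick sanity check after step 10 might look like this
>>    (paths as in step 3, run from the examples directory of the tarball;
>>    a.out is whatever the compile produced):
>>
>>    which mpicc mpirun
>>    /home/matt/apps/openmpi-X.Y.Z/bin/mpicc hello_c.c
>>    /home/matt/apps/openmpi-X.Y.Z/bin/mpirun -np 8 ./a.out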
>>
>>    ********
>>
>>    B) Nehalem is quite new hardware.
>>    I don't know if the Ubuntu kernel 2.6.31-16 fully supports all
>>    of the Nehalem features, particularly hyperthreading and NUMA,
>>    which are used by MPI programs.
>>    I am not the right person to give you advice about this.
>>    I googled around but couldn't find clear information about
>>    the minimal kernel requirements to have Nehalem fully supported.
>>    Some Nehalem owner on the list could come forward and tell us.
>>
>>    **
>>
>>    C) On the top screenshot you sent me, please try it again
>>    (after you do item A) but type "f" and "j" to show the processors
>>    that are running each process.
>>
>>    **
>>
>>    D) Also, the screenshot shows 20GB of memory.
>>    This does not sound like an optimal memory size for Nehalem,
>>    which tends to be 6GB, 12GB, 24GB, or 48GB.
>>    Did you put together the system or upgrade the memory yourself,
>>    or did you buy the computer as is?
>>    However, this should not break MPI anyway.
>>
>>    **
>>
>>    E) Answering your question:
>>    It is true that using different flavors of MPI
>>    to compile (mpicc) and run (mpiexec) a program would probably
>>    break right away, regardless of the number of processes.
>>    However, when it comes to different versions of the
>>    same MPI flavor (say OpenMPI 1.3.4 and OpenMPI 1.3.3),
>>    I am not sure it would break.
>>    I would guess it may run, but not in a reliable way.
>>    Problems may appear as you stress the system with more cores, etc.
>>    But this is just a guess.
>>
>>    **
>>
>>    I hope this helps,
>>
>>    Gus Correa
>>    ---------------------------------------------------------------------
>>    Gustavo Correa
>>    Lamont-Doherty Earth Observatory - Columbia University
>>    Palisades, NY, 10964-8000 - USA
>>    ---------------------------------------------------------------------
>>
>>
>>    Matthew MacManes wrote:
>>
>>        Hi Gus,
>>
>>        Interestingly, the results for the connectivity_c test... it works
>>        fine with -np <8. For -np >8 it works some of the time; other
>>        times it HANGS. I have got to believe that this is a big clue!!
>>        Also, when it hangs, sometimes I get the message "mpirun was
>>        unable to cleanly terminate the daemons on the nodes shown
>>        below". Note that NO nodes are shown below.  Once, I got -np 250
>>        to pass the connectivity test, but I was not able to replicate
>>        this reliably, so I'm not sure if it was a fluke or what.  Here
>>        is a link to a screenshot of top when connectivity_c is hung
>>        with -np 14. I see that 2 processes are only at 50% CPU usage...
>>        Hmmmm
>> http://picasaweb.google.com/lh/photo/87zVEucBNFaQ0TieNVZtdw?authkey=Gv1sRgCLKokNOVqo7BYw&feat=directlink
>>
>>
>>        The other tests, ring_c and hello_c, as well as the cxx versions of
>>        these guys, work with all values of -np.
>>
>>        Using -mca mpi_paffinity_alone 1 I get the same behavior.
>>        I agree that I should worry about the mismatch between where
>>        the libraries are installed versus where I am telling my
>>        programs to look for them. Would this type of mismatch cause
>>        behavior like what I am seeing, i.e. working with a small
>>        number of processors but failing with larger? It seems like a
>>        mismatch would have the same effect regardless of the number of
>>        processors used. Maybe I am mistaken. Anyway, to address this:
>>        which mpirun gives me /usr/local/bin/mpirun, so I configure with
>>        ./configure --with-mpi=/usr/local/bin/mpirun and run with
>>        /usr/local/bin/mpirun -np X ...
>>        uname -a gives me: Linux macmanes 2.6.31-16-generic #52-Ubuntu
>>        SMP Thu Dec 3 22:07:16 UTC 2009 x86_64 GNU/Linux
>>
>>        Matt
>>
>>        On Dec 8, 2009, at 8:50 PM, Gus Correa wrote:
>>
>>            Hi Matthew
>>
>>            Please see comments/answers inline below.
>>
>>            Matthew MacManes wrote:
>>
>>                Hi Gus, Thanks for your ideas.. I have a few questions,
>>                and will try to answer yours in hopes of solving this!!
>>
>>
>>            A simple way to test OpenMPI on your system is to run the
>>            test programs that come with the OpenMPI source code,
>>            hello_c.c, connectivity_c.c, and ring_c.c:
>>            http://www.open-mpi.org/
>>
>>            Get the tarball from the OpenMPI site, gzip and untar it,
>>            and look for it in the "examples" directory.
>>            Compile it with /your/path/to/openmpi/bin/mpicc hello_c.c
>>            Run it with /your/path/to/openmpi/bin/mpiexec -np X a.out
>>            using X = 2, 4, 8, 16, 32, 64, ...
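>>
>>            For instance, with a plain bash loop (path as above, a.out
>>            from the mpicc compile):
>>
>>            for np in 2 4 8 16 32 64; do
>>                /your/path/to/openmpi/bin/mpiexec -np $np ./a.out
>>            done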
>>
>>            This will tell if your OpenMPI is functional,
>>            and if you can run on many Nehalem cores,
>>            even with oversubscription perhaps.
>>            It will also set the stage for further investigation of your
>>            actual programs.
>>
>>
>>                Should I worry about setting things like --num-cores
>>                --bind-to-cores?  This, I think, gets at your questions
>>                about processor affinity. Am I right? I could not
>>                exactly figure out the -mca mpi_paffinity_alone stuff...
>>
>>
>>            I use the simple-minded -mca mpi_paffinity_alone 1.
>>            This is probably the easiest way to assign a process to a core.
>>            There are more complex ways in OpenMPI, but I haven't tried them.
>>            Indeed, -mca mpi_paffinity_alone 1 does improve performance of
>>            our programs here.
>>            There is a chance that without it the 16 virtual cores of
>>            your Nehalem get confused with more than 3 processes
>>            (you reported that -np > 3 breaks).
>>
>>            Did you try adding just -mca mpi_paffinity_alone 1 to
>>            your mpiexec command line?
>>
>>
>>                1. Additional load: nope. nothing else, most of the time
>>                not even firefox.
>>
>>
>>            Good.
>>            Turn off firefox, etc, to make it even better.
>>            Ideally, use runlevel 3, no X, like a computer cluster node,
>>            but this may not be required.
>>
>>                2. RAM: no problems apparent when monitoring through
>>                top. Interestingly, I did wonder about oversubscription,
>>                so I tried the option --nooversubscription, but this
>>                gave me an error message.
>>
>>
>>            Oversubscription from your program would only happen if
>>            you asked for more processes than available cores, i.e.,
>>            -np > 8 (or "virtual" cores, in case of Nehalem hyperthreading,
>>            -np > 16).
>>            Since you have -np=4 there is no oversubscription,
>>            unless you have other external load (e.g. Matlab, etc),
>>            but you said you don't.
>>
>>            Yet another possibility would be if your program is threaded
>>            (e.g. using OpenMP along with MPI), but considering what you
>>            said about OpenMP I would guess the programs don't use it.
>>            For instance, you launch the program with 4 MPI processes,
>>            and each process decides to start, say, 8 OpenMP threads.
>>            You end up with 32 threads and 8 (real) cores (or 16
>>            hyperthreaded
>>            ones on Nehalem).
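>>
>>            (To illustrate, a hypothetical hybrid program: compiled with
>>            mpicc -fopenmp and launched with -np 4, it would create
>>            4 x 8 = 32 threads:)
>>
>>            #include <mpi.h>
>>            #include <omp.h>
>>            #include <stdio.h>
>>
>>            int main(int argc, char **argv)
>>            {
>>                int rank;
>>                MPI_Init(&argc, &argv);
>>                MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>                /* each MPI process starts 8 OpenMP threads */
>>                #pragma omp parallel num_threads(8)
>>                printf("rank %d, thread %d\n",
>>                       rank, omp_get_thread_num());
>>                MPI_Finalize();
>>                return 0;
>>            }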
>>
>>
>>            What else does top say?
>>            Any hog processes (memory- or CPU-wise)
>>            besides your program processes?
>>
>>                3. I have not tried other MPI flavors.. Ive been
>>                speaking to the authors of the programs, and they are
>>                both using openMPI.
>>
>>            I was not trying to convince you to use another MPI.
>>            I use MPICH2 also, but OpenMPI reigns here.
>>            The idea of trying it with MPICH2 was just to check whether
>>            OpenMPI is causing the problem, but I don't think it is.
>>
>>                4. I don't think that this is a problem, as I'm
>>                specifying --with-mpi=/usr/bin/...  when I compile the
>>                programs. Is there any other way to be sure that this is
>>                not a problem?
>>
>>
>>            Hmmm ....
>>            I don't know about your Ubuntu (we have CentOS and Fedora on
>>            various
>>            machines).
>>            However, most Linux distributions come with their own MPI flavors,
>>            and so do compilers, etc.
>>            Oftentimes they install these goodies in unexpected places,
>>            and this has caused a lot of frustration.
>>            There are tons of postings on this list that eventually
>>            boiled down to mismatched versions of MPI in unexpected places.
>>
>>
>>            The easy way is to use full path names to compile and to run.
>>            Something like this:
>>            /my/openmpi/bin/mpicc (in your program configuration script),
>>
>>            and something like this:
>>            /my/openmpi/bin/mpiexec -np ... bla, bla ...
>>            when you submit the job.
>>
>>            You can check your version with "which mpicc", "which mpiexec",
>>            and (perhaps using full path names) with
>>            "ompi_info", "mpicc --showme", "mpiexec --help".
>>
>>
>>                5. I had not been, and you could see some shuffling when
>>                monitoring the load on specific processors. I have tried
>>                to use --bind-to-cores to deal with this. I don't
>>                understand how to use the -mca options you asked about.
>>                6. I am using Ubuntu 9.10, gcc 4.4.1, and g++ 4.4.1.
>>
>>
>>            I am afraid I won't be of much help, because I don't have Nehalem.
>>            However, I read about Nehalem requiring quite recent kernels
>>            to get all of its features working right.
>>
>>            What is the output of "uname -a"?
>>            This will tell the kernel version, etc.
>>            Other list subscribers may give you a suggestion if you post
>>            the information.
>>
>>                MrBayes is a program for Bayesian phylogenetics:
>>                 http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page
>>                ABySS is a program for assembly of DNA sequence data:
>>                http://www.bcgsc.ca/platform/bioinfo/software/abyss
>>
>>
>>            Thanks for the links!
>>            I had found the MrBayes link.
>>            I eventually found what your ABySS was about, but no links.
>>            Amazing that it is about DNA/gene sequencing.
>>            Our abyss here is the deep ocean ... :)
>>            Abysmal difference!
>>
>>                    Do the programs mix MPI (message passing) with
>>                    OpenMP (threads)?
>>
>>                Im honestly not sure what this means..
>>
>>
>>            Some programs mix the two.
>>            OpenMP only works in a shared memory environment (e.g. a single
>>            computer like yours), whereas MPI can use both shared memory
>>            and work across a network (e.g. in a cluster).
>>            There are other differences too.
>>
>>            Unlikely that you have this hybrid type of parallel program,
>>            otherwise there would be some reference to OpenMP
>>            on the very program configuration files, program
>>            documentation, etc.
>>            Also, in general the configuration scripts of these hybrid
>>            programs can turn on MPI only, or OpenMP only, or both,
>>            depending on how you configure.
>>
>>            Even to compile with OpenMP you would need a proper compiler
>>            flag, but that one might be hidden in a Makefile too, making
>>            it a bit hard to find. "grep -n mp Makefile" may give a clue.
>>            Anything in the documentation that mentions threads or OpenMP?
>>
>>            FYI, here is OpenMP:
>>            http://openmp.org/wp/
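>>
>>            (A quick way to check whether a compiler flag enables OpenMP
>>            is a tiny probe like this -- the file name is hypothetical;
>>            build it with and without gcc's -fopenmp flag:)
>>
>>            /* omp_probe.c */
>>            #include <stdio.h>
>>
>>            int main(void)
>>            {
>>            #ifdef _OPENMP
>>                /* _OPENMP is predefined whenever OpenMP is enabled */
>>                printf("compiled with OpenMP (_OPENMP = %d)\n", _OPENMP);
>>            #else
>>                printf("compiled without OpenMP\n");
>>            #endif
>>                return 0;
>>            }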
>>
>>                Thanks for all your help!
>>
>>             > Matt
>>
>>            Well, so far it didn't really help. :(
>>
>>            But let's hope to find a clue,
>>            maybe with a little help of
>>            our list subscriber friends.
>>
>>            Gus Correa
>>
>>  ---------------------------------------------------------------------
>>            Gustavo Correa
>>            Lamont-Doherty Earth Observatory - Columbia University
>>            Palisades, NY, 10964-8000 - USA
>>
>>  ---------------------------------------------------------------------
>>
>>                    Hi Matthew
>>
>>                    More guesses/questions than anything else:
>>
>>                    1) Is there any additional load on this machine?
>>                    We had problems like that (on different machines) when
>>                    users started listening to streaming video, doing
>>                    Matlab calculations,
>>                    etc, while the MPI programs are running.
>>                    This tends to oversubscribe the cores, and may lead
>>                    to crashes.
>>
>>                    2) RAM:
>>                    Can you monitor the RAM usage through "top"?
>>                    (I presume you are on Linux.)
>>                    It may show unexpected memory leaks, if they exist.
>>
>>                    In "top", type "1" (one) to see all cores, and "f"
>>                    then "j"
>>                    to see the core number associated with each process.
>>
>>                    3) Do the programs work right with other MPI flavors
>>                    (e.g. MPICH2)?
>>                    If not, then it is not OpenMPI's fault.
>>
>>                    4) Any possibility that the MPI versions/flavors of
>>                    mpicc and
>>                    mpirun that you are using to compile and launch the
>>                    program are not the
>>                    same?
>>
>>                    5) Are you setting processor affinity on mpiexec?
>>
>>                    mpiexec -mca mpi_paffinity_alone 1 -np ... bla, bla ...
>>
>>                    Context switching across the cores may also cause
>>                    trouble, I suppose.
>>
>>                    6) Which Linux are you using (uname -a)?
>>
>>                    On other mailing lists I read reports that only
>>                    quite recent kernels
>>                    support all the Intel Nehalem processor features well.
>>                    I don't have Nehalem, I can't help here,
>>                    but the information may be useful
>>                    for other list subscribers to help you.
>>
>>                    ***
>>
>>                    As for the programs, some require specific setup
>>                    (and even specific compilation) when the number of
>>                    MPI processes varies.
>>                    It may help if you give us a link to the program sites.
>>
>>                    Bayesian statistics is not totally out of our business,
>>                    but phylogenetic trees are not really my league,
>>                    so forgive me any bad guesses, please,
>>                    but would it need specific compilation or a different
>>                    set of input parameters to run correctly on a different
>>                    number of processors?
>>                    Do the programs mix MPI (message passing) with
>>                    OpenMP (threads)?
>>
>>                    I found this MrBayes, which seems to do the above:
>>
>>                    http://mrbayes.csit.fsu.edu/
>>                    http://mrbayes.csit.fsu.edu/wiki/index.php/Main_Page
>>
>>                    As for ABySS, what is it, and where can it be found?
>>                    It doesn't look like a deep-ocean circulation model, as
>>                    the name suggests.
>>
>>                    My $0.02
>>                    Gus Correa
>>
>>
>>        _________________________________
>>        Matthew MacManes
>>        PhD Candidate
>>        University of California- Berkeley
>>        Museum of Vertebrate Zoology
>>        Phone: 510-495-5833
>>        Lab Website: http://ib.berkeley.edu/labs/lacey
>>        Personal Website: http://macmanes.com/
>>