Re: /proc/vmcore mmap() failure issue

2013-11-25 Thread chaow...@redhat.com
On 11/25/13 at 08:09am, Atsushi Kumagai wrote:
> Hello WANG,
> 
> On 2013/11/21 16:15:22, kexec  wrote:
> > > How about this fallback structure instead of such an extra option?
> > > 
> > > Thanks
> > > Atsushi Kumagai
> > > 
> > > From: Atsushi Kumagai 
> > > Date: Wed, 20 Nov 2013 14:10:19 +0900
> > > Subject: [PATCH] Fall back to read() when mmap() fails.
> > > 
> > > Signed-off-by: Atsushi Kumagai 
> > > ---
> > >  makedumpfile.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/makedumpfile.c b/makedumpfile.c
> > > index ca03440..f583602 100644
> > > --- a/makedumpfile.c
> > > +++ b/makedumpfile.c
> > > @@ -324,7 +324,15 @@ read_from_vmcore(off_t offset, void *bufptr, unsigned long size)
> > >   if (!read_with_mmap(offset, bufptr, size)) {
> > >   ERRMSG("Can't read the dump memory(%s) with mmap().\n",
> > >  info->name_memory);
> > > - return FALSE;
> > > +
> > > + ERRMSG("This kernel might have some problems about mmap().\n");
> > > + ERRMSG("read() will be used instead of mmap() from now.\n");
> > > +
> > > + /*
> > > +  * Fall back to read().
> > > +  */
> > > + info->flag_usemmap = FALSE;
> > > + read_from_vmcore(offset, bufptr, size);
> > 
> > Hi, Atsushi
> > 
> > I've got such a workstation too. And I confirm this patch works for me.
> 
> Thanks for your testing !
> 
> > However, I have a question:
> > > Why not switch back to mmap() after read()?
> 
> I made this patch as a general safety net, not only for the partial page
> issue.
> When we face unknown issues related to mmap(), the kernel may have bugs
> and mmap() can fail for every page. In the worst case, almost all mmap()
> calls will fail, and each one would retry with read() after printing error
> messages. This patch prevents that flapping between the two methods and
> the resulting flood of error messages.

Thanks for your explanation. I agree with you. Since mmap() is error-prone
after the first mmap() failure, using read() instead as a fail-safe makes
a lot of sense to me.
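
To make the behaviour agreed on here concrete, below is a minimal, self-contained sketch of a "sticky" fallback. It is not makedumpfile's code: struct dump_reader, read_dump(), and the force_failure hook are hypothetical names used only for illustration, and the sketch assumes the requested range lies inside a regular file (touching a mapping past end of file would raise SIGBUS).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* pread() and fileno() under strict -std=c11 */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical reader state; not makedumpfile's real "info" structure. */
struct dump_reader {
	int fd;            /* open descriptor of the dump source */
	int use_mmap;      /* 1: try mmap() first, 0: always use read() */
	int force_failure; /* illustration-only hook: pretend mmap() fails */
	int fallbacks;     /* number of switches from mmap() to read() */
};

/* Copy size bytes at offset through a temporary page-aligned mapping. */
static int read_with_mmap(struct dump_reader *r, off_t offset,
			  void *buf, size_t size)
{
	long page = sysconf(_SC_PAGESIZE);
	off_t start = offset - (offset % page); /* mmap() needs an aligned offset */
	size_t len = (size_t)(offset - start) + size;
	void *map;

	if (r->force_failure)
		return 0;
	map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, r->fd, start);
	if (map == MAP_FAILED)
		return 0;
	memcpy(buf, (char *)map + (offset - start), size);
	munmap(map, len);
	return 1;
}

/* Read exactly size bytes at offset with pread(), retrying short reads. */
static int read_with_read(struct dump_reader *r, off_t offset,
			  void *buf, size_t size)
{
	size_t done = 0;

	while (done < size) {
		ssize_t n = pread(r->fd, (char *)buf + done, size - done,
				  offset + (off_t)done);
		if (n <= 0)
			return 0;
		done += (size_t)n;
	}
	return 1;
}

/* Try mmap() while it is enabled; after one failure use read() for good. */
static int read_dump(struct dump_reader *r, off_t offset,
		     void *buf, size_t size)
{
	assert(r != NULL && buf != NULL);
	if (r->use_mmap) {
		if (read_with_mmap(r, offset, buf, size))
			return 1;
		fprintf(stderr, "mmap() failed, using read() from now on.\n");
		r->use_mmap = 0; /* sticky: never retry mmap() */
		r->fallbacks++;
	}
	return read_with_read(r, offset, buf, size);
}
```

Because use_mmap is cleared only once, a kernel on which every mmap() fails costs one warning and one wasted attempt in total, rather than one per page.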

WANG Chao
> 
> 
> Thanks
> Atsushi Kumagai
> 
> > Thanks
> > WANG Chao
> > 
> > >   }
> > >   } else {
> > >   if (lseek(info->fd_memory, offset, SEEK_SET) == failed) {
> > > -- 
> > > 1.8.0.2
> > 
> > ___
> > kexec mailing list
> > ke...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> > 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: /proc/vmcore mmap() failure issue

2013-11-20 Thread chaow...@redhat.com
On 11/20/13 at 05:27am, Atsushi Kumagai wrote:
> On 2013/11/19 18:56:21, kexec  wrote:
> > (2013/11/18 9:51), Atsushi Kumagai wrote:
> > > (2013/11/15 23:26), Vivek Goyal wrote:
> > >> On Fri, Nov 15, 2013 at 06:41:52PM +0900, HATAYAMA Daisuke wrote:
> > >>
> > >> [..]
> > >>>> Given the fact that hpa does not like fixing it in the kernel, we are
> > >>>> left with the option of fixing it in one of the following places.
> > >>>>
> > >>>> - Drop partial pages in kexec-tools
> > >>>> - Drop partial pages in makedumpfile.
> > >>>> - Read partial pages using read() interface in makedumpfile
> > >>>> - Modify /proc/vmcore to copy partial pages in second kernel's memory.
> > >>>>
> > >>>> It is not clear to me that partial pages are really useful. So I
> > >>>> want to avoid modifying /proc/vmcore to deal with partial pages and
> > >>>> increasing complexity.
> > >>>>
> > >>>> So fixing makedumpfile (either option 2 or option 3) seems least
> > >>>> risky to me. In fact I would say let us keep it simple and truncate
> > >>>> partial pages in makedumpfile, and look at option 3 once we have a
> > >>>> strong use case for partial pages.
> > >>>>
> > >>>> What do you think?
> > >>>>
> > >>>
> > >>> As you say, it's not clear that partial pages are really useful, but
> > >>> on the other hand, it is not clear to me that they are really
> > >>> useless either.
> > >>> I think we should get them as long as we have access to them.
> > >>>
> > >>> Option 3) seems best to me. Switching between read() and mmap()
> > >>> would not be so complex, and it is far more flexible to do in
> > >>> makedumpfile than in the kernel.
> > >>
> > >> Ok, I am fine with option 3. It is the more complicated option but a
> > >> safe one.
> > > 
> > > It also sounds reasonable to me.
> > > 
> > >> Is there any chance that you could look into fixing this? I have no
> > >> experience writing code for makedumpfile.
> > > 
> > > I'll send a patch to fix this soon.
> > > 
> > 
> > Thanks.
> > 
> > BTW, the following patch has now been applied on top of makedumpfile in
> > the kexec-tools package on Fedora in order to avoid the issue.
> > 
> > https://lists.fedoraproject.org/pipermail/kexec/2013-November/000254.html
> > 
> > I remember that a prototype version of the mmap patch implemented a kind of
> > --no-mmap option that we could use to disable mmap() and use read()
> > instead. I think that would be useful when we face this kind of issue.
> 
> How about this fallback structure instead of such an extra option?
> 
> Thanks
> Atsushi Kumagai
> 
> From: Atsushi Kumagai 
> Date: Wed, 20 Nov 2013 14:10:19 +0900
> Subject: [PATCH] Fall back to read() when mmap() fails.
> 
> Signed-off-by: Atsushi Kumagai 
> ---
>  makedumpfile.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index ca03440..f583602 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -324,7 +324,15 @@ read_from_vmcore(off_t offset, void *bufptr, unsigned long size)
>   if (!read_with_mmap(offset, bufptr, size)) {
>   ERRMSG("Can't read the dump memory(%s) with mmap().\n",
>  info->name_memory);
> - return FALSE;
> +
> + ERRMSG("This kernel might have some problems about mmap().\n");
> + ERRMSG("read() will be used instead of mmap() from now.\n");
> +
> + /*
> +  * Fall back to read().
> +  */
> + info->flag_usemmap = FALSE;
> + read_from_vmcore(offset, bufptr, size);

Hi, Atsushi

I've got such a workstation too. And I confirm this patch works for me.

However, I have a question:
Why not switch back to mmap() after read()?
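
For comparison, the alternative this question hints at (read() only for the ranges that mmap() cannot handle, mmap() everywhere else) could be sketched roughly as below. This is not taken from makedumpfile: struct region, page_is_whole(), and choose_method() are hypothetical names, and the sketch assumes a "partial page" means a page only partly covered by a dumpable region.

```c
#include <assert.h>

/* Hypothetical description of one dumpable region, in file offsets. */
struct region {
	unsigned long long start; /* first byte of the region */
	unsigned long long end;   /* one past the last byte of the region */
};

enum access_method { USE_MMAP, USE_READ };

/* Return 1 if the page containing offset lies entirely inside the region. */
static int page_is_whole(const struct region *rg, unsigned long long offset,
			 unsigned long long page_size)
{
	unsigned long long page_start, page_end;

	assert(rg != 0 && page_size != 0);
	page_start = offset - (offset % page_size);
	page_end = page_start + page_size;
	return page_start >= rg->start && page_end <= rg->end;
}

/* Whole pages can be mapped; partial pages at the edges go through read(). */
static enum access_method choose_method(const struct region *rg,
					unsigned long long offset,
					unsigned long long page_size)
{
	return page_is_whole(rg, offset, page_size) ? USE_MMAP : USE_READ;
}
```

With a region covering 0x1800 to 0x4400 and 4 KiB pages, only the two pages at the edges would use read(); the pages in between could still be mapped.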

Thanks
WANG Chao

>   }
>   } else {
>   if (lseek(info->fd_memory, offset, SEEK_SET) == failed) {
> -- 
> 1.8.0.2