On Sun, 2008-08-31 at 13:50 +0200, Zhou Rui wrote:
> Hi, all:
>     My problem seems basically solved.
>     We used to call vmalloc() in the memory management part of our
> source, but it seems to be the key unreliable point resulting in the
> problem. vmalloc() always assigns some virtual addresses whose
> corresponding physical addresses are beyond the memory size (there is
> only 32MB DRAM in our 405 board). Once instructions try to access these
> illegal physical addresses, a machine check happens.

That should -never- happen.

Have you verified, as I asked you a while ago, that you are actually
passing the right amount of memory to your kernel from the device-tree
or the bootloader ?

Ben.
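
For readers following the address arithmetic in the quoted analysis
below, here is a minimal sketch in plain user-space C. It assumes the
usual ppc32 layout in which the kernel's linear mapping starts at
PAGE_OFFSET = 0xC0000000, and it assumes the 32MB of DRAM sits at
physical 0x00000000..0x01FFFFFF as discussed in the thread. The helper
names are hypothetical and are not taken from the kernel or from
XtratuM. The virtual address 0xC3072DA0 is inferred from the physical
address 0x03072da0 quoted below. The point it illustrates: subtracting
PAGE_OFFSET is only meaningful for addresses inside the linear mapping.
vmalloc() and module addresses such as 0xc3xxxxxx are mapped page by
page through the page tables, so the same subtraction yields a number
that need not correspond to any RAM.

```c
#include <stdint.h>

/* Assumptions for illustration only, not taken from the XtratuM source. */
#define PAGE_OFFSET 0xC0000000u              /* assumed ppc32 kernel base  */
#define RAM_SIZE    (32u * 1024u * 1024u)    /* 32MB board from the thread */

/*
 * Linear-map conversion (hypothetical helper). The result is only a real
 * physical address if vaddr lies in the linear mapping of RAM; for a
 * vmalloc() or module address it is just a number.
 */
uint32_t linear_virt_to_phys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* True if paddr is backed by DRAM at 0 .. RAM_SIZE - 1. */
int phys_in_ram(uint32_t paddr)
{
    return paddr < RAM_SIZE;
}

/* True if vaddr lies inside the kernel's linear mapping of RAM. */
int virt_in_linear_map(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET && phys_in_ram(linear_virt_to_phys(vaddr));
}
```

Under these assumptions, linear_virt_to_phys(0xC3072DA0) gives
0x03072DA0, which phys_in_ram() rejects because it lies above
0x01FFFFFF. The NIP from the oops, 0xC028BF20, passes
virt_in_linear_map(), while the domain entry point 0x100000a0 does not,
since it is below PAGE_OFFSET.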

>     Afterwards, we call kmalloc() instead and it basically works as we
> want. But problems with the memory management still exist, because
> there is sometimes a program check exception and always a bad page:
> ....
> -bash-3.2# PROGRAM: reason: 0x8000000, nip: 0xc028bf20
> Oops: Exception in kernel mode, sig: 4 [#1]
> NIP: C028BF20 LR: C028BF20 CTR: C31C6078
> REGS: c028be80 TRAP: 0700   Not tainted  (2.6.19.2-eldk-xm.1.0)
> MSR: 00029030 <EE,ME,IR,DR>  CR: 00000000  XER: 00000000
> TASK = c0228a30[0] 'swapper' THREAD: c028a000
> GPR00: 00000000 C028BF30 C0228A30 C034B7B0 C028BF20 00000000 00000001 00000000
> GPR08: 00000003 C31D0000 22000082 00029030 2BDD9FE1 C03B3164 0000066F 2B1F1DC8
> GPR16: C03B3050 0FFEA478 10010000 C31D0000 C028BEF0 C31CA2E4 00021030 C028A000
> GPR24: C028BEF0 C0228B44 C0228468 C03B3050 C028BF10 C31C60C4 00029030 C03B3050
> NIP [C028BF20] init_thread_union+0x1f20/0x2000
> LR [C028BF20] init_thread_union+0x1f20/0x2000
> Call Trace:
> [C028BF30] [0FFEA478] 0xffea478 (unreliable)
> Instruction dump:
> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
> Kernel panic - not syncing: Attempted to kill the idle task!
>  <0>Rebooting in 180 seconds..
> 
> And there is bad page:
> Message from syslogd@ at Thu Jan  1 01:32:00 1970 ...
> 405 kernel: Backtrace:
> Message from syslogd@ at Thu Jan  1 01:32:00 1970 ...
> 405 kernel: Bad page state in process 'loader.xm'
> Message from syslogd@ at Thu Jan  1 01:32:00 1970 ...
> 405 kernel: Trying to fix it up, but a reboot is needed
> Message from syslogd@ at Thu Jan  1 01:32:00 1970 ...
> 405 kernel: Bad page state in process 'loader.xm'
> Message from syslogd@ at Thu Jan  1 01:32:00 1970 ...
> 405 kernel: Trying to fix it up, but a reboot is needed
> Message from syslogd@ at Thu Jan  1 01:32:00 1970 ...
> 405 kernel: page:c02f0e60 flags:0x00000400 mapping:00000000 mapcount:0 count:1
> 
> I will do some traces for fixing those problems.
> 
> Could anyone explain the difference between vmalloc() and kmalloc()?
> Based on our work, there seems to be a great difference.
> 
> Thank you very much!
> 
> Best Wishes
> 
> Zhou Rui
> 2008-08-31
> 
> On Mon, 2008-08-25 at 21:16 +0200, Zhou Rui wrote:
> > Hi,
> > I think you may already know the project named XtratuM
> > (http://www.xtratum.org). I'm porting it from x86 to PPC405. The
> > implementation on PPC440 is basically finished
> > (ftp://dslab.lzu.edu.cn/pub/xtratum/xtratum-ppc/snapshots/xtratum-ppc-20071205.tar.bz2)
> > and I know it was discussed on this mailing list before. XtratuM
> > is an ADEOS-based nanokernel. It aims at real-time use and is designed to
> > provide virtual timers, virtual interrupts and memory space separation for
> > domains. Each domain is loaded by a userspace program (instead of the root
> > domain as a kernel module). The loader loads the PT_LOAD segments of the
> > domain (a statically linked ELF executable) into memory and then makes the
> > appropriate system call (passing the structured loaded data as arguments)
> > to load the domain via load_domain_sys() of XtratuM. As the last step of
> > loading the domain, XtratuM jumps to the entry code of the new domain (an
> > asm-wrapped start() routine), and then everything should be fine.
> > 0x100000a0 is the entry point of the test domain, and that is why I need
> > to start execution from it.
> > 
> > I think I can share my analysis so far of the cause of my
> > problem. Thanks for mentioning the memory size. Once the XtratuM kernel
> > module is loaded, its symbols are placed at virtual addresses
> > like 0xc3xxxxxx. Because address translation is enabled in the normal
> > state (MSR[IR, DR] = [1, 1]), these addresses are okay. However, when
> > loading the domain, because the entry point 0x100000a0 is not in the TLB
> > and has to be reloaded, a Data TLB Miss Exception arises and DTLBMiss is
> > called. The exception clears MSR[IR, DR], so address translation is
> > disabled and physical addresses must be used at this moment. If we want
> > something at a virtual address like 0xc3xxxxxx, we must access a
> > physical address like 0x03xxxxxx. However, the 32MB memory limit
> > makes the valid physical address range 0x0 to 0x1ffffff.
> > Therefore, during exception handling, addresses outside this range
> > should not be accessed, but the instructions cannot know the memory
> > limit in advance and try to access addresses such as
> > 0x03072da0 based on the address translation mechanism, which leads to
> > a machine check.
> > I have tried appending "mem=32M" to the kernel command line, but it did
> > not help. I think this is because when the kernel is loaded in the normal
> > state, address translation is enabled and the virtual addresses are okay.
> > The kernel cannot foresee that there is going to be a TLB miss exception
> > and that illegal physical addresses like 0x03xxxxxx may be accessed.
> > 
> > So any ideas for this problem are welcome.
> > 
> > Thank you very much for taking care.
> > 
> > Best Wishes
> > 
> > Zhou Rui
> > 2008-08-25
> > 
> > On Sun, 2008-08-24 at 20:55 +0200, Wolfgang Denk wrote:
> > > Dear Zhou Rui,
> > > 
> > > In message <[EMAIL PROTECTED]> you wrote:
> > > >
> > > > > >    I am running a kernel module which will execute a user space
> > > > > >application. The entry point of the application is 0x100000a0. At the
> > > > > 
> > > > > That should be the first clue that you are doing it wrong.  Don't do
> > > > > stuff like that in modules...
> > > > 
> > > > Oh, but our project needs a function like that ...
> > > 
> > > You should really think about this. Why do you think you  need  this?
> > > What  exactly  are  you  trying  to  do?  [Probably  there are better
> > > approaches to solve your problem...]
> > 
> > > > It is physical address at this moment. Address translation is disabled
> > > > automatically (MSR[IR, DR] = [0, 0]) because of TLB Miss Exception and
> > > > Instruction Storage Exception.
> > > 
> > > Hm.. are you absolutely sure that the 0x100000a0 mentioned above is a
> > > physical address?
> > > 
> > > > > Do you have enough DRAM to cover that?  Some of those boards only come
> > > > > with 32MiB of DRAM.
> > > > 
> > > > My board only has 32MB DRAM. Do you mean 32MB is not enough for that?
> > > 
> > > Well, 0x1000'00A0 is above 256 MB, while you  have  only  32  MB  RAM
> > > which is most probably mapped from 0x0000'0000...0x01FF'FFFF... So
> > > what you claim to be a physical address (and I think your claim is
> > > wrong) is far outside available physical memory.
> > > 
> > > > The same code runs well on a PPC440EP (Yosemite board), which has
> > > > 256MB DRAM. At the beginning of my work, I thought memory size might be
> > > > the cause of the failure, but I did not know how to demonstrate it. So if
> > > > the 32MB DRAM limit leads to the failure, is there any way for the
> > > > code to work around it?
> > > 
> > > I think you got lost on the wrong track. Please describe  which  task
> > > you  want  to  implement, and there might be another, better approach
> > > for it.
> > > 
> > > Best regards,
> > > 
> > > Wolfgang Denk
> > 
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@ozlabs.org
> > https://ozlabs.org/mailman/listinfo/linuxppc-dev
> 
