Hello,

my brother got ill this weekend, so I'll continue this task:

> > See http://download.hennerich.de/kallsyms_20050312_1630.gz
> 
> Great, just so that there is no confusion, I still need a new run
> of /proc/page_owner, the shorter time before the lockup the better.

The machine locked up this morning again. See

    http://download.hennerich.de/page_owner_sorted_20050314_0740.bz2

for one of the last results of /proc/page_owner. It seems to be
obvious that the memory-leak seems to be the first entry:

    $ less page_owner_sorted_20050314_0740.bz2
    881397 times:
    Page allocated via order 0
    [0xc013962b] find_or_create_page+91
    [0xf8aa9955] +613
    [0xf8aaa606] +1366
    [0xc015765c] vfs_write+172
    [0xc015776c] sys_write+60
    [0xc0103879] sysenter_past_esp+82
     
    13358 times:
    Page allocated via order 0
    [0xc014817a] do_wp_page+282
    [0xc014914e] handle_mm_fault+302
    [0xc0113625] do_page_fault+501
    [0xc0104a7b] error_code+43

The sorted table of /proc/kallsyms looks like this:

    ...
    f8aa90e0 t reiserfs_commit_page [reiserfs]
    f8aa92e0 t reiserfs_submit_file_region_for_write        [reiserfs]
    f8aa9550 t reiserfs_check_for_tail_and_convert  [reiserfs]
    f8aa96f0 t reiserfs_prepare_file_region_for_write       [reiserfs]
    f8aaa0b0 t reiserfs_file_write  [reiserfs]
    f8aaa770 t reiserfs_aio_write   [reiserfs]
    f8aaa779 t .text.lock.file      [reiserfs]
    f8aaa7c0 t reiserfs_dir_fsync   [reiserfs]
    f8aaa7f0 t reiserfs_readdir     [reiserfs]
    f8aaad70 t make_empty_dir_item_v1       [reiserfs]
    f8aaae50 t make_empty_dir_item  [reiserfs]
    f8aaafc0 t create_virtual_node  [reiserfs]
    f8aab520 t check_left   [reiserfs]
    f8aab670 t check_right  [reiserfs]
    f8aab7c0 t get_num_ver  [reiserfs]
    ...

So I guess that we have a problem with the reiser filesystem??
We are using reiserfs 3.6...

Perhaps it's important to notice that the operating system

- is no fresh installation of SuSE 9.2, but was updated from SuSE 8.2
- is installed on that special hardware via a restore with the software
  Mondo-Rescue v2.03

Unfortunately we were not able up to now to reproduce the bug with
identical hardware and simulated disk-io. Only the production environment
triggers the bug.


0xf8aa9955 -  613 = 0xf8aa96f0: reiserfs_prepare_file_region_for_write
0xf8aaa606 - 1366 = 0xf8aaa0b0: reiserfs_file_write  

Wild guessing: Is "reiserfs_prepare_file_region_for_write" used
to append new output to existent files only - and do we have a memory
leak inside of this function? The production machine is used as loghost 
and syslog is writing several 100MB logfiles each day - which would be
a difference to the test hardware.



Best regards         Timo
-- 
T+T Hennerich GmbH --- Zettachring 12a --- 70567 Stuttgart
Fon:+49(711)720714-0  Fax:+49(711)720714-44  Vanity:+49(700)HENNERICH
UNIX - Linux - Java - C  Entwicklung/Beratung/Betreuung/Schulung
http://www.hennerich.de/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to