Hello Frieder,
Am 10.04.2019 um 15:31 schrieb Schrempf Frieder:
Hi Heiko,
On 10.04.19 14:44, Heiko Schocher wrote:
Hello Frieder,
Am 10.04.2019 um 12:49 schrieb Schrempf Frieder:
Hi,
I have a customer who has a NAND device with two MTD partitions and
each > of the partitions contains one UBI volume with a UBIFS filesystem.
Bad idea ... why?
You may loose lifetime of the board, as UBI cannot use PEBs between the
2 MTD partitions on the nand ... better you would have one big MTD
Partition
with n ubi volumes in it ...
But ... okay... this must work also.
Yeah, I only recently learned about the disadvantages of this setup.
Maybe we can change it in the future, but for now they are using
separate partitions.
Ok.
Now U-Boot can mount the UBIFS from the first partition just fine, but
if the UBIFS from the second partition is mounted afterwards this fails
in some cases.
:-(
I can reproduce the error and tracked it down to uboot_ubifs_mount() in
fs/ubifs/super.c. If this function is run for the second mount, the
struct ubifs_fs_type is reused and it contains a list fs_supers, that
still holds one entry for the first mount.
Sure?
fs_supers in struct file_system_type seems used only in none
U-Boot code...
Right, I had a closer look and fs_supers seems to be unused indeed, but
somehow it causes corruption in my case. When I apply 5a08cfee3967
(ubifs: remove useless code) the problem disappears.
Ah, 5a08cfee3967 is in mainline ... good ;-)
Without this patch there still is hlist_add_head(&s->s_instances,
&type->fs_supers) and this line somehow seems to cause the error.
I applied this debug diff:
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2428,7 +2428,15 @@ retry:
#else
strncpy(s->s_id, type->name, sizeof(s->s_id));
#endif
+ printf("%s:%d: ubi_num: %d, vol_id: %d\n",
+ __func__, __LINE__,
+ ((struct ubifs_info *)s->s_fs_info)->vi.ubi_num,
+ ((struct ubifs_info *)s->s_fs_info)->vi.vol_id);
hlist_add_head(&s->s_instances, &type->fs_supers);
+ printf("%s:%d: ubi_num: %d, vol_id: %d\n",
+ __func__, __LINE__,
+ ((struct ubifs_info *)s->s_fs_info)->vi.ubi_num,
+ ((struct ubifs_info *)s->s_fs_info)->vi.vol_id);
#ifndef __UBOOT__
spin_unlock(&sb_lock);
get_filesystem(type);
And I'm getting this for the first mount:
sget:2431: ubi_num: 0, vol_id: 0
sget:2433: ubi_num: 0, vol_id: 0
And this for the second mount:
sget:2431: ubi_num: 0, vol_id: 0
sget:2433: ubi_num: -1678121552, vol_id: -1678120656
Hmm...
I guess, that if the second mount would happen on a volume that is on
the same MTD partition as the first volume, than this will work. The
second entry is added to ubifs_fs_type.fs_supers.
I cannot see this from looking into code ... so hard to say, but
I only looked into mainline code ...
Yeah, I was probably wrong with these first wild guesses...
;-)
In my case however, the second entry being added to
ubifs_fs_type.fs_supers is invalid and causes the mount error.
Reinitializing the list in uboot_ubifs_mount() before each mount, solves
the problem, but I guess that it will cause failures in other setups,
where there are actually multiple volumes on one MTD device.
So how can I solve this properly? Do we need one instance of struct
ubifs_fs_type for each MTD device?
Hmm.. without digging into it, it is difficult to say...
I tested this on an old version (2017.03), but looking at the current
code, it looks like the same problem applies to current mainline.
Is there any chance to try it with current mainline ?
The problem is a bit strange and this what I'm actually worried about.
It is persistent in a certain environment: U-Boot loaded from SPI NOR,
environment set to certain values, data written to UBIFS partition in
Linux and then power-cut.
If one of these conditions changes, the error usually disappears, for
example if I use the exact same setup, but load the Bootloader from MMC
or RAM. Or if no write access with power-cut happens.
So I wonder if there's some memory corruption somewhere else. Though,
the error happens always at the same place. Debug prints or other code
changes have no influence.
I really would like to understand what's going on so I can make sure
that 5a08cfee3967 actually solves the real issue or just hides it.
Please keep me informed ... I try to reproduce it here on a hw I
have access to, but need some "free" time ...
bye,
Heiko
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: h...@denx.de
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot