Hi, > -----Original Message----- > From: openwrt-devel [mailto:openwrt-devel-boun...@lists.openwrt.org] > On Behalf Of John Thomson > Sent: Mittwoch, 5. August 2020 23:14 > To: openwrt-devel@lists.openwrt.org > Cc: John Thomson <g...@johnthomson.fastmail.com.au> > Subject: [PATCH] kernel: fix mtd partition erase<parent_erasesize writing to > wrong address > > This bug applied where mtd partition end address, or erase start address, > was not cleanly divisible by parent mtd erasesize. > > This error would cause the bits following the end of the partition to the next > erasesize block boundary to be erased, and this partition-overflow data to be > written to the partition erase address (missing additional partition offset > address) of the mtd (top) parent device.
We had a longer discussion about that on IRC yesterday, where we also discussed (stylistically) different solutions. As it appears, kernel did a major overhaul of mtdpart.c in a more recent major kernel version: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=46b5889cc2c54bac7d7e727a44d28a298df23cef So, any cosmetic improvement would probably be in vain anyway. Therefore, I decided to merge the patch in its current (tested) state. However, this essentially means we will meet the same problem again with the next kernel bump, and it won't become easier. In the discussion, jow had the idea to circumvent these problems and handle the non-aligned partitions in user-space. Maybe one can exploit the mtd-rw tool (available as kmod-mtd-rw in packages repo) for unlocking the relevant areas and then do everything else in user-space. If that seems suitable, we could use the merged patch for the 20.xx release, and then migrate master to a better solution when we have one. I put the discussion below in case somebody feels inspired to play around with that (first half is the general discussion, second half mentions the user-space solution). Best Adrian --- [...] 12:31 jow: adrianschmutzler: I took a look at the difference and in hindsight it seems to make a lot of sense 12:33 jow: adrianschmutzler: plain diff: (Link: https://pastebin.com/AqE89TXB)https://pastebin.com/AqE89TXB - code context: (Link: https://pastebin.com/Hn2ny4Lt)https://pastebin.com/Hn2ny4Lt 12:34 jow: adrianschmutzler: so the "instr->addr -= part->offset;" address manipulation is moved after the _write() which makes sense since we're supposed to both erase and (partially) re-write the same block 12:34 jow: (my second paste shows the old, not the patched code) 12:35 jow: the way the code is right now, it would erase the entire block partly occupied by a partition, then overwrite the first block of the mtd device 12:35 jow: this is my interpretation at least, I am not an mtd subsystem expert 12:38 adrianschmutzler: okay, thanks. will have a look at the older kernel now, as in the PR it is called "a merge bug in 4.19 & 5.4" ... 12:39 jow: yeah, my first thought was the same, probably a merge/patch rebase quirk 12:41 jow: tbh, I'd refactor the code while at it, directly substract instr->addr after part->parent->_erase(part->parent, instr); like it is done in the current broken code and explicitly pass instr->addr + part->offset and instr->addr + instr->len + part->offset respectively to part->parent->_write() to mirror what is done for mtd_read() a few further lines up 12:41 adrianschmutzler: unfortunately upstream also doesn't seem to be interested in commenting on the patches 12:43 jow: mtd_read() and part->parent->_write() take explicit offsets as arguments (part->offset + instr->addr + ...) while part->parent->_erase() uses the ->addr member from instr, hence the address manipulation 12:43 jow: and I'd revert the address to its original value as soon as possible and continue passing calculated offsets later 12:44 adrianschmutzler: jow: So, it's actually bad design after all? 12:45 jow: this would be my fix on top of the current broken code: (Link: https://pastebin.com/diff/EjUdXshw)https://pastebin.com/diff/EjUdXshw 12:45 adrianschmutzler: the need to increase/decrease instr->addr ... 12:47 adrianschmutzler: jow: okay. IIRC, they stirred around mtd in one of the recent kernel anyway, and also did some fairly similarly looking changes with offsets there as well 12:47 jow: I do not know enough about the struct erase_info and part->parent->_erase() semantics to understand why _erase() apparently uses instr->addr relative to the partition offset while part->parent->_write() expects an absolute offset relative to the mtd block 12:48 jow: that seems inconsitent to me, maybe the _erase() operates on a wrong offset as well 12:53 jow: adrianschmutzler: I took at the upstream code as well: (Link: https://lxr.missinglinkelectronics.com/linux/drivers/mtd/mtdpart.c#L197)https://lxr.missinglinkelectronics.com/linux/drivers/mtd/mtdpart.c#L197 and I think I got a better idea now. So the various part_*() functions expect an erase_info struct with partition-relative addresses, they then interally manipulate those addresses to be mtd-relative, invoke the lowlevel whole mtd ops, then revert the address back to their 12:53 jow: original partition-relative value 12:54 jow: so the originally proposed patch which simply moves the reversal (instr->addr -= part->offset;) directly before the `return ret;` is fine and matches the style of the kernel part_*() functions 12:58 jow: adrianschmutzler: just for completeness, and even better fix would be (Link: https://pastebin.com/diff/mFSb8Uqp)https://pastebin.com/diff/mFSb8Uqp 13:02 adrianschmutzler: jow: why is the additional mtd_to_part(mtd) superior? 13:03 jow: it should be optimized away by the compiler, the code is superior because the intention is clearer 13:03 jow: likewise the mtd_read() at the beginning could be changed to part_read() and the `part->offset + ` removed in each invocation 13:04 jow: this way the only "address fiddling" that remains is in the original function body of part_erase() 13:04 jow: (we can't obviously call part_erase() from within part_erase() since that'd recurse forever) 13:05 jow: but this is all just a matter of style and taste, not sure what the subsystem maintainers would prefer 13:06 jow: I guess however that this partial erase will likely will never go upstream 13:06 adrianschmutzler: okay, I see your point now 13:06 adrianschmutzler: probably that's why they remained silent on the other issues 13:07 jow: I actually wonder why this can't be done in userspace 13:08 jow: or in wahtever code using the mtd api 13:08 jow: after all it is just a matter of reading an entire block, holding it in memory, erasing the entire block and then simply partially write back the parts we did not want to erase 13:09 jow: the only obstacle would be the write protection for non-eb aligned parts I guess 13:10 adrianschmutzler: no idea; but I will throw that idea towards the affected people, maybe they want to go into that direction 13:11 jow: in the meanwhile I'd say we should apply the proposed fix 13:11 jow: it makes sense 13:11 jow: and its not the submitters fault that the orginal code is not beautiful to begin with 13:11 adrianschmutzler: sysupgrade won't be a problem for a user-space solution, as other stuff there happens in user-space as well? 13:12 jow: if I remember correctly, that partial erase thing was introduced years ago to support sysupgrade on Redboot systems 13:13 adrianschmutzler: that's one of currently broken systems and the patch fixes them apparently 13:13 jow: there we wanted to write accross the FIS table dictated kernel/rootfs boundary using the mtd util and nbd implemented a kernel patch and corresponding functionality in mtd to do exactly that 13:13 jow: and I believe one of (the?) problem was that the kernel mtd subsystem write-protects all non-aligned partitions by default 13:14 adrianschmutzler: but that sounds more like the subsequent patch: 13:14 adrianschmutzler: 412-mtd-partial_eraseblock_unlock.patch 13:15 adrianschmutzler: which states it's for ixp4xx only, which probably isn't true anymore ... 13:16 jow: adrianschmutzler: I think both is needed 13:19 jow: adrianschmutzler: I think the partial erase was needed to rewrite the FIS partition while keeping the Redboot config (often both share one eraseblock) 13:20 adrianschmutzler: okay, then since this is tested and waiting some time already, and since part_write is removed with newer kernel anyway, I merge the existing patch. 13:20 jow: adrianschmutzler: and the partial erase unlock patch covers cases where the partition you manipulate is locked because its not spanning an entire eb 13:21 jow: at least this is my understanding 13:22 adrianschmutzler: jow: okay. but then the 412 patch is probably relevant to all devices with redboot partitions and the ixp4xx references should be removed from the description ... 13:22 jow: or rather, it allows unlocking non-aligned parts by simply tweaking the flash area begin and end offsets to start at the previous eb boundary and end on the next 13:22 jow: so you're not only unlocking the target area but a little bit of stuff before and after too 13:23 jow: otherwise the mtd subsystem would reject the unlock 13:24 adrianschmutzler: so, it's effectively a hack 13:24 jow: yes 13:24 adrianschmutzler: btw just found that RafaĆ wanted to remove these scripts already in March: (Link: https://patchwork.ozlabs.org/project/openwrt/patch/20200309114302.14383-1-zaj...@gmail.com/)https://patchwork.ozlabs.org/project/openwrt/patch/20200309114302.14383-1-zaj...@gmail.com/ 13:25 adrianschmutzler: scripts->patches 13:26 jow: I think that if we somehow have a mechanism to forcibly unlock all flash areas (which could be done by a much leaner and more isolated patch I suppose) we could handle all the other edge cases in userspace 13:26 jow: by disregarding any partitioning and simply treating the entire mtd as continuous block device with offsets 13:27 jow: like you would e.g. dd an x86 image to the blockdev 13:28 jow: ah, just remebered this: (Link: https://github.com/jclehner/mtd-rw)https://github.com/jclehner/mtd-rw 13:28 jow: something like this could be a solution for sysupgrade 13:28 jow: not sure if its still compatible to 5.4 but if yes the we wouldn't even need a patch at all 13:29 adrianschmutzler: mtd-rw is in packages repo and works fine AFAIK 13:29 jow: a slightly less dangerous solution would be a kernel module or some other facility which would allow us to create mtd partitions on demand, not sure if something like this is possible 13:30 jow: one could thne simply "spwan" a new mtd partition for the flash area relevant for sysupgrade, then "mtd write" in there 13:30 adrianschmutzler: I just hesitate to include and enable a tool like that by default 13:30 jow: while "masking" other sensible parts like bootloader etc. to avoid bricks if something goes wrong 13:33 jow: hm well, yeah I understand your concerns 13:33 jow: but on x86 for example you can also happily dd over your entire disk 13:33 jow: granted, it is way easier to recover, but still... 13:34 adrianschmutzler: I will push that forward to the Mikrotik people anyway, as they seem to be the most affected at the moment. Maybe they have time to play around with idea going into that direction 13:34 adrianschmutzler: "Mikrotik people" -> contributors working with Mikrotik devices, not the vendor of course 13:37 jow: I could imagine that adding mtd-rw by defualt and adding some kind of sysctl to trigger it would be good enough 13:37 adrianschmutzler: together with Tomasz on the Redboot front there should be enough people interested 13:37 adrianschmutzler: hmm, I really have to think about that 13:37 jow: that sysctl could then be toggled by "mtd unlock" which never actually worked for me when I needed it 13:38 jow: or some other equivalent commend to not break existing "mtd unlock" users 13:38 jow: *command 13:38 adrianschmutzler: I fear this may end up in a lot of scripts then, where people will use it to take shortcuts ... 13:39 adrianschmutzler: but if it solves the problem, we should consider it 13:40 adrianschmutzler: because maintaining these patches won't become easier 13:41 jow: I agree 14:12 rmilecki: adrianschmutzler: jow: omg, that parial erase size crap again? 14:13 jow: rmilecki: we've been discussing its original intnetion and possible, non-invasive alternatives 14:14 jow: rmilecki: current working idea is to ship some kind of mtd unlock mechanism and handling the overlapping write/erase things in userspace 14:15 jow: rmilecki: (Link: https://github.com/jclehner/mtd-rw)https://github.com/jclehner/mtd-rw plus some clever scripting should already achive that 14:15 adrianschmutzler: rmilecki: the problem is that this partial erase stuff is needed for the redboot devices and all of the mikrotik ones as well (which currently all have "broken" sysupgrade) 14:16 rmilecki: the problem is it exists in a current form 14:16 rmilecki: afair it enables those partial wries for all ops, without checking if access is aligned or not 14:18 rmilecki: adrianschmutzler: i see jow posted few improvement ideas, do you want to implement them? if so, as part of submitted patch? or on top of that (later)? 14:21 adrianschmutzler: rmilecki: most of jow's improvement were about style. however, in (Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=46b5889cc2c54bac7d7e727a44d28a298df23cef)https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=46b5889cc2c54bac7d7e727a44d28a298df23cef kernel changes partition handling anyway 14:21 rmilecki: oh, that one is a hure change 14:21 adrianschmutzler: so, I don't think it's worth to care too much about style when we have to do a different implementation for next stable kernel anyway. 14:21 rmilecki: it's in some recent kernel though, right? 14:21 adrianschmutzler: and the existing patch is tested and known to be working on the affected devices 14:22 rmilecki: adrianschmutzler: ok, i agree, let's just make it work 14:22 adrianschmutzler: haven't checked explicitly, but somewhere after 5.4 of course
openpgp-digital-signature.asc
Description: PGP signature
_______________________________________________ openwrt-devel mailing list openwrt-devel@lists.openwrt.org https://lists.openwrt.org/mailman/listinfo/openwrt-devel