On 03/21/2016 06:56 PM, Eric Nelson wrote:
> Hi Marek,

Hi!
> On 03/21/2016 09:49 AM, Marek Vasut wrote:
>> On 03/21/2016 02:48 PM, Eric Nelson wrote:
>>> On 03/20/2016 06:59 PM, Marek Vasut wrote:
>>>> On 03/21/2016 02:45 AM, Eric Nelson wrote:
>>>>> Here's a more full-featured implementation of a cache for block
>>>>> devices that uses a small linked list of cache blocks.
>>>>
>>>> Why do you use a linked list? You have four entries, so you could
>>>> just as well use a fixed array. Maybe you should implement an
>>>> adaptive cache which would use the unpopulated malloc area and hash
>>>> the sector number(s) into that area?
>>>>
>>>
>>> I was looking for a simple implementation that would allow tweaking
>>> of the max entries/size per entry.
>>>
>>> We could get higher performance through hashing, but with such a
>>> small cache, it's probably not worth the extra code.
>>
>> The hashing function can be a simple modulo on the sector number ;-)
>> That'd be less code than linked lists.
>>
>
> I'm not seeing how.
>
> I'm going to look first at a better way to integrate than the approach
> taken in patch 3.

I will dive into the code itself and comment if applicable.

>>> Using an array and re-allocating on changes to the max entries
>>> variable is feasible, but I think it would be slightly more code.
>>
>> That would indeed be more code.
>>
>>>>> Experimentation loading a 4.5 MiB kernel from the root directory of
>>>>> a FAT filesystem shows that a single cache entry of a single
>>>>> block is the only
>>>>
>>>> only ... what? This is where things started to get interesting, but
>>>> you leave us hanging :)
>>>>
>>>
>>> Oops.
>>>
>>> ... I was planning on re-wording that.
>>>
>>> My testing showed no gain in performance (additional cache hits) past
>>> a single entry of a single block. This was done on a small (32 MiB)
>>> partition with a small number of files (~10), and only a single
>>> read is skipped.
>>
>> I'd kinda expect that, indeed.
>>
>
> Yeah, and the single-digit-ms improvement isn't worth much.
>>> => blkc c ; blkc i ; blkc 0 0 ;
>>> changed to max of 0 entries of 0 blocks each
>>> => load mmc 0 10008000 /zImage
>>> reading /zImage
>>> 4955304 bytes read in 247 ms (19.1 MiB/s)
>>> => blkc
>>> block cache:
>>> 0 hits
>>> 7 misses
>>> 0 entries in cache
>>> trace off
>>> max blocks/entry 0
>>> max entries 0
>>> => blkc c ; blkc i ; blkc 1 1 ;
>>> changed to max of 1 entries of 1 blocks each
>>> => load mmc 0 10008000 /zImage
>>> reading /zImage
>>> 4955304 bytes read in 243 ms (19.4 MiB/s)
>>> => blkc
>>> block cache:
>>> 1 hits
>>> 6 misses
>>> 1 entries in cache
>>> trace off
>>> max blocks/entry 1
>>> max entries 1
>>>
>>> I don't believe that enabling the cache is worth the extra code
>>> for this use case.
>>>
>>> By comparison, a load of a 150 MiB compressed disk image from
>>> ext4 showed a 30x speedup with the V1 patch (single block,
>>> single entry), from ~150s to 5s.
>>>
>>> Without some form of cache, the 150s was long enough to make
>>> a user (me) think something was broken.
>>
>> I'm obviously loving this improvement.
>>
>
> Glad to hear it.
>

:)

-- 
Best regards,
Marek Vasut
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot