Building Linaro-netbook filesystem from sources
Hi,

I want to build the linaro-netbook filesystem from *sources*, as I need a filesystem built without VFP support. I checked the wiki and other links but could not find a related document.

Can anyone please point me to directions on how to do this?

--
Thanks
Amit Mahajan
Re: Building Linaro-netbook filesystem from sources
On 17/12/10 08:01, Amit Mahajan wrote:
> I want to build the linaro-netbook filesystem from *sources*, as I need a
> filesystem built without VFP support. I checked the wiki and other links
> but could not find a related document. Can anyone please point me to
> directions on how to do the same?

I don't know if there's an official way to do that, but here's what I would do:

1. Find a board you can run the existing installation on. Cross-building packages is hard, so it'll be easier to bootstrap it this way.

2. On this board, download the compiler sources:

   apt-get source gcc

3. Tweak the compiler configuration flags in the debian directory so that they set up the VFP as you want it, and build the compiler:

   sudo apt-get build-dep gcc
   cd gcc-*/
   dpkg-buildpackage -b -us -uc

4. Install the new gcc onto the build board (I would recommend doing this work in a chroot in case something goes wrong ...):

   dpkg -i gcc*.deb

5. Squirrel away the newly built .deb files.

6. Repeat for all packages until you have no packages depending on VFP in your system. Most won't require any reconfiguration, but you never know. It's probably best to start with glibc, and then other libraries, just to make sure the headers and configure tests are right. (Maybe non-VFP kernel headers also, but you'll need to run a VFP-enabled kernel until you're done rebuilding everything.)

Your rebuilt file-system should now run on your netbook, although it'll be chock full of -dev packages, so you might have to clean it up a bit.

This should work because the default ARM EABI is the same whether you use VFP or not. If/when we choose to switch to the hard-fp EABI variant (in theory, it's more efficient), then tricks like this will be more difficult.

Hope that helps

Andrew
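As a hedged illustration of step 3: on the gcc packaging of that era the FPU defaults are assembled from configure options in the debian rules files. The exact files and option spellings vary between gcc packaging versions, so treat the names below as assumptions to verify by grepping first.

    cd gcc-*/                  # source directory name depends on the gcc version
    # Find where the packaging sets the FPU / float-ABI configure options
    grep -n -- "--with-fpu\|--with-float" debian/rules*
    # Typical armel defaults of that era looked something like:
    #   --with-fpu=vfpv3-d16 --with-float=softfp
    # For a no-VFP build, drop the --with-fpu option and use --with-float=soft,
    # then rebuild the package as in step 3.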
Re: Building Linaro-netbook filesystem from sources
On Fri, 2010-12-17 at 11:30, Andrew Stubbs wrote:
> [procedure for natively rebuilding gcc and the other packages on a board,
> quoted in full above - snipped]

Hi Andrew,

Thanks for your help. The procedure you mentioned looks good, but I have two points here:

1. Right now I do not have access to a board. I think I can probably use QEMU to simulate my hardware.

2. Compiling each package individually will be a long process. I wonder if Ubuntu has something like ALIP (ARM Linux Internet Platform), which can be readily used with scratchbox.

--
Thanks
Amit Mahajan
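One possible route for point 1 above (an emulated ARM build environment without hardware) is an armel chroot driven by user-mode QEMU. The package names, suite and mirror below are assumptions for Ubuntu of that era, not details from the thread, and may need adjusting.

    # Sketch only: create and enter an armel chroot on an x86 host.
    sudo apt-get install qemu-user-static debootstrap   # package name may differ on older releases
    sudo qemu-debootstrap --arch=armel maverick armel-chroot \
        http://ports.ubuntu.com/ubuntu-ports
    sudo chroot armel-chroot /bin/bash    # ARM binaries run via per-process emulation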
Linaro Infrastructure Team Weekly Report (2010-12-10 to 2010-12-16)
All,

The weekly report for the Linaro Infrastructure team may be found at:

Status report: https://wiki.linaro.org/Platform/Infrastructure/Status/2010-12-16
Burndown chart: This link is awaiting the production of new burndown charts.

The Infrastructure-related blueprints from the maverick cycle, of which there are currently 4 active ones (4 from the last report), are showing 8 work items in progress (8 last report) and 11 work items to undertake (11 last report). These are to be moved into the natty cycle if still required.

* arm-m-validation-dashboard: 1 work item completed; 3 in progress; 6 to do (7 last report); 1 work item added
* arm-m-image-building-console: no change in status from last report; 3 in progress; 3 to do
* arm-m-automated-testing-framework: no change in status from last report; 1 in progress; 0 to do
* arm-m-testsuites-and-profilers: no change in status from last report; 1 in progress; 1 to do

In the natty cycle, the following lists the currently active blueprints, or blueprints planned but not started. Currently there are 4 active blueprints (4 from the last report), showing 8 work items in progress (8 last report), 41 work items to undertake (41 last report), 0 work items postponed (0 last report) and 1 work item added (8 items added last report).

* other-linaro-n-improve-linaro-media-create: 7 work items completed in total (5 last week); 3 work items in progress (4 last week); 8 work items to do (8 last week); 1 work item added this week (2 last week)
* other-linaro-n-test-result-display-in-launch-control: 0 work items completed (0 last week); 1 work item in progress (1 last week); 10 work items to do (10 last week); 0 work items added (0 last week)
* other-linaro-n-patch-tracking: 0 work items completed (0 last week); 2 work items in progress (2 last week); 9 work items to do (9 last week)
* other-linaro-n-image-building: 2 work items in progress (2 last week); 5 work items to do (5 last week); 2 work items postponed (2 last week); 0 work items added (0 last week)
* other-linaro-n-continuous-integration: not started - awaiting a Hudson build server (RT#42278)
* other-linaro-n-package-development-tools: not started; 9 work items to do

Specifics accomplished this week include:

* Set up hardware in Cambridge for automated testing; worked on images to boot a stable environment and to be used to flash the test image
* Progress on blueprint tracking - code changes seem to be sufficient to get burndown charts going again
* other-linaro-n-image-building: "4.6 Multiple hwpacks": DONE
* ImproveLinaroMediaCreate: port setup_sizes, calculatesize and get_device_by_id to python: DONE
* Working on the extended test plan for ux500; three new test cases added to the plan this week

Please let me know if you have any comments or questions.

Kind Regards,
Ian
Re: Building Linaro-netbook filesystem from sources
> 1. Right now I do not have access to a board. I think probably I can use
> QEMU for simulating my hardware.

Yes, I expect so, but it won't be fast!

Does the kernel not have a VFP emulation mode that might make the existing binaries work on your netbook, at least well enough for bootstrapping purposes? Just a thought ... it might be better than QEMU?

> 2. Compiling each package individually will be a long process. I wonder
> if Ubuntu has something like ALIP (ARM Linux Internet Platform), which
> can be readily used with scratchbox.

That would be nice, but I don't know of such a thing, and would it work for builds in a custom environment?

Anyway, here's another top tip: use distcc. This is a tool that gives you a 'virtual' compiler. It does all the preprocessing and linking using the native tools (in QEMU, in your case), but sends the preprocessed source to a distcc server on another machine for the actual compilation job.

The distcc server can be another ARM machine, but equally it can be a PC with a suitable ARM cross compiler.

You can set up multiple distcc servers, each configured to run multiple compile jobs, if you wish, and then run the build with DEB_BUILD_OPTIONS=parallel=2 (or whatever) in the environment, and maybe get a performance boost, depending on how QEMU performs at the preprocessing.

Back in a former job, I used to have 6 SH-64 boards running package builds via distcc, with the compilers running on 8 x86 build servers, and I could rebuild the entire distro in one night. Of course, that was only a small distro I put together myself - nothing on the scale of Ubuntu - and the boards were probably faster than QEMU.

Distcc is often mentioned in conjunction with ccache, but caching object files isn't really very interesting if you only build each package once. There might be some small advantage in speeding up repeated configure tests, I suppose, but I suggest not bothering with it.

Hope that helps

Andrew
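A rough sketch of the distcc arrangement described above, assuming a single x86 helper with an ARM cross compiler; 'buildhost', the subnet and the package names are placeholders, not details from the thread.

    # On the x86 helper: install distcc and an ARM cross compiler, and make the
    # cross compiler answer to the plain name the ARM client will send
    # (a masquerade symlink is one common way). All names are illustrative.
    sudo apt-get install distcc gcc-arm-linux-gnueabi
    sudo ln -s /usr/bin/arm-linux-gnueabi-gcc /usr/local/bin/gcc
    distccd --daemon --allow 192.168.0.0/24        # restrict job submission to your LAN

    # In the ARM build environment (board or QEMU chroot):
    sudo apt-get install distcc
    export DISTCC_HOSTS="buildhost/8"              # placeholder host name, up to 8 jobs
    export CC="distcc gcc" CXX="distcc g++"
    DEB_BUILD_OPTIONS=parallel=2 dpkg-buildpackage -b -us -uc

Whether a given package's debian/rules honours CC from the environment varies from package to package, so this will not parallelise every build, but it covers the common autotools-style ones.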
Re: Building Linaro-netbook filesystem from sources
+++ Amit Mahajan [2010-12-17 17:19 +0530]:
> On Fri, 2010-12-17 at 11:30, Andrew Stubbs wrote:
> > > I checked the wiki and other links but could not find a related
> > > document.

You are right that this process is not properly documented. That's largely because it's either very slow (emulated) or very difficult (cross). If you do have some fast-enough hardware of the right type then it's fairly tractable, although toolchain-defaulting is still an awkward process requiring a flavoured rebuild of the toolchain. We do at least have that in place now though.

I'll add this to my list of 'missing wiki pages'. We should at least document the flavoured-native-rebuild process, as that is in place, even if it doesn't help in your case.

> Thanks for your help. The procedure you mentioned looks good. But I have
> 2 points here:
> 1. Right now I do not have access to a board. I think probably I can use
> QEMU for simulating my hardware.
>
> 2. Compiling each package individually will be a long process. I wonder
> if Ubuntu has something like ALIP (ARM Linux Internet Platform), which
> can be readily used with scratchbox.

We are working on this. Debian/Ubuntu was never designed for cross-building, so a fair amount of work is needed to make this an easy, slick process.

What you probably actually want is for someone else to have already built a no-VFP flavour of the distro that you could just use. Presumably that would be fine? (Debian's existing armel port might be of use, unless you also need everything to be built for v7, or to be actually using the Ubuntu sources for some reason.)

My belief is that in practice most people would be satisfied if a small number of flavours (v5, v6, v7 noVFP) were pre-built. Shout if that's not true for you.

Nevertheless, there are enough awkward cases that making it easy to bootstrap a new flavour without relevant hardware is an important goal (a la ALIP/AEL). For this to work we need the main core of 200-odd packages to cross-build properly, for circular build-dependencies to be breakable, and for cross-build tools to be able to reliably do the right thing with dependencies. About half of those 200 packages do now cross-build (sometimes with not-yet-merged patches), circular build-dependencies are being removed (with staged builds), and the cross-building tools and metadata are being improved so that the process is automatable and reliable. That last part depends on multiarch being fully implemented, in order to have cross-dependencies correctly described in package metadata.

For this stuff to stay working we also need continuous cross-build testing, otherwise it will bitrot.

I expect this to be a working, demonstrable process within the next couple of months, but from a patched set of sources, because various aspects can't go into the main repo yet.

I am trying to keep this page https://wiki.linaro.org/CrossBuilding as a good overview. It has a currently more-or-less empty 'Rebuilding everything for a new ABI/flavour' section. I'll fill that out now.

Wookey
--
Principal hats: Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/
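For a feel of the cross route discussed above, here is a heavily hedged sketch of cross-building one simple source package for armel on an x86 host. Satisfying cross build-dependencies and fixing packages that ignore the cross compiler is exactly the hard work described in the post, so this should not be expected to scale beyond well-behaved packages; 'hello' and the toolchain package name are placeholders.

    # Sketch only - most real packages need cross build-deps and patches.
    sudo apt-get install dpkg-dev gcc-arm-linux-gnueabi    # package names may differ
    apt-get source hello
    cd hello-*/
    CC=arm-linux-gnueabi-gcc dpkg-buildpackage -aarmel -b -us -uc -d
    # -aarmel sets the target architecture; -d skips the (native) build-dep check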
Re: MMC double buffering
Hi again,

I made a mistake in my double buffering implementation. I assumed dma_unmap did not do any cache operations. Well, it does. Due to L2 read prefetch, the L2 needs to be invalidated at dma_unmap.

I made a quick test to see how much throughput would be improved if dma_unmap could be run in parallel. In this run dma_unmap is removed.

Then the figures for read become:
* 7-16 % gain with double buffering in the ideal case, closing on the same performance as for PIO.

Relative diff: MMC-VANILLA-DMA-LOG -> MMC-MMCI-2-BUF-DMA-LOG-NO-UNMAP
(CPU is abs diff)

                                                  random  random
  KB     reclen   write   rewrite  read   reread  read    write
  51200       4   +0%     +0%      +7%    +8%     +2%     +0%
  cpu:            +0.0    +0.0     +0.7   +0.7    -0.0    +0.0

  51200       8   +0%     +0%      +10%   +10%    +6%     +0%
  cpu:            -0.1    +0.1     +0.6   +0.9    +0.3    +0.0

  51200      16   +0%     +0%      +11%   +11%    +8%     +0%
  cpu:            -0.0    -0.1     +0.9   +1.0    +0.3    +0.0

  51200      32   +0%     +0%      +13%   +13%    +10%    +0%
  cpu:            -0.1    +0.0     +1.0   +0.5    +0.8    +0.0

  51200      64   +0%     +0%      +13%   +13%    +12%    +1%
  cpu:            +0.0    +0.0     +0.4   +1.0    +0.9    +0.1

  51200     128   +0%     +5%      +14%   +14%    +14%    +1%
  cpu:            +0.0    +0.2     +1.0   +0.9    +1.0    +0.0

  51200     256   +0%     +2%      +13%   +13%    +13%    +1%
  cpu:            +0.0    +0.1     +0.9   +0.3    +1.6    -0.1

  51200     512   +0%     +1%      +14%   +14%    +14%    +8%
  cpu:            -0.0    +0.3     +2.5   +1.8    +2.4    +0.3

  51200    1024   +0%     +2%      +14%   +15%    +15%    +0%
  cpu:            +0.0    +0.3     +1.3   +1.4    +1.3    +0.1

  51200    2048   +2%     +2%      +15%   +15%    +15%    +4%
  cpu:            +0.3    +0.1     +1.6   +2.1    +0.9    +0.3

  51200    4096   +5%     +3%      +15%   +16%    +16%    +5%
  cpu:            +0.0    +0.4     +1.1   +1.7    +1.7    +0.5

  51200    8192   +5%     +3%      +16%   +16%    +16%    +2%
  cpu:            +0.0    +0.4     +2.0   +1.3    +1.8    +0.1

  51200   16384   +1%     +1%      +16%   +16%    +16%    +4%
  cpu:            +0.1    -0.2     +2.3   +1.7    +2.6    +0.2

I will work on adding unmap to double buffering next week.

/Per

On 16 December 2010 15:15, Per Forlin wrote:
> Hi,
>
> I am working on the blueprint
> https://blueprints.launchpad.net/linux-linaro/+spec/other-storage-performance-emmc.
> Currently I am investigating performance for DMA vs PIO on eMMC.
>
> Pros and cons for DMA on MMC:
> + Offloads the CPU
> + Fewer interrupts; one single interrupt for each transfer, compared to
>   100s or even 1000s
> + Power save; DMA consumes less power than the CPU
> - Less bandwidth / throughput compared to PIO-CPU
>
> The reason for introducing double buffering in the MMC framework is to
> address the throughput issue for DMA on MMC. The assumption is that the
> CPU and DMA have higher throughput than the MMC / SD-card.
>
> My hypothesis is that the difference in performance between PIO mode
> and DMA mode for MMC is due to the latency of preparing a DMA job. If
> the next DMA job could be prepared while the current job is ongoing,
> this latency would be reduced. The biggest part of preparing a DMA job
> is maintenance of caches.
>
> In my case I run on U5500 (mach-ux500), which has both L1 and L2
> caches. The host mmc driver in use is the mmci driver (PL180).
>
> I have done a hack in both the MMC framework and mmci in order to make
> a proof of concept. I have run IOZone to get measurements to prove my
> case worthy. The next step, if the results are promising, will be to
> clean up my work and send out patches for review.
>
> The DMAC in ux500 supports two modes, LOG and PHY:
> LOG - many logical channels are multiplexed on top of one physical channel
> PHY - only one channel per physical channel
>
> DMA modes LOG and PHY have different latency, both HW- and SW-wise. One
> could almost treat them as two different DMACs. To get a wider test
> scope I have tested using both modes.
>
> Summary of the results:
> * It is optional for the mmc host driver to utilize the 2-buf
>   support. 2-buf in the framework requires no change in the host drivers.
> * IOZone shows no performance hit on existing drivers* if adding 2-buf
>   to the framework but not in the host driver.
>   (* So far I have only tested one driver)
> * The performance gain for DMA using 2-buf is probably proportional to
>   the cache maintenance time. The faster the card is, the more
>   significant the cache maintenance part becomes, and vice versa.
> * For U5500 with 2-buf, performance for DMA is:
>   Throughput: DMA vanilla vs DMA 2-buf
>   * read +5-10 %
>   * write +0-3 %
>   CPU load: CPU vs D
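For reference, a guess at the kind of IOZone invocation that yields a table like the one above: a 50 MB file, record sizes from 4 KB to 16 MB, covering sequential write/rewrite, read/reread and random read/write. The thread does not give the actual command line, so every option and path here is an assumption.

    # Sketch only - not the poster's actual command.
    # -a auto mode; -i 0/1/2 select write/rewrite, read/reread, random read/write;
    # -s file size; -y/-q minimum/maximum record size; -f test file on the MMC card.
    iozone -a -i 0 -i 1 -i 2 -s 51200k -y 4k -q 16384k -f /mnt/mmc/iozone.tmp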
Re: Building Linaro-netbook filesystem from sources
On Fri, 2010-12-17 at 12:20, Andrew Stubbs wrote:
> > 1. Right now I do not have access to a board. I think probably I can use
> > QEMU for simulating my hardware.
>
> Yes, I expect so, but it won't be fast!
>
> Does the kernel not have a VFP emulation mode that might make the
> existing binaries work on your netbook, at least well enough for
> bootstrapping purposes? Just a thought ... it might be better than QEMU?

Yes, my kernel right now does not support VFP at all.

> > 2. Compiling each package individually will be a long process. I wonder
> > if Ubuntu has something like ALIP (ARM Linux Internet Platform), which
> > can be readily used with scratchbox.
>
> That would be nice, but I don't know of such a thing, and would it work
> for builds in a custom environment?

ALIP is a bit tricky, but if you understand its build process, it's very easy to customize. Right now I use its v7-without-VFP variant.

> Anyway, here's another top tip: use distcc.
> [distcc setup and experience, quoted in full above - snipped]

This looks very nice. I will try to establish this environment over the weekend. Thanks for the great help!

--
Thanks
Amit Mahajan
Re: Building Linaro-netbook filesystem from sources
On Fri, 2010-12-17 at 13:24, Wookey wrote:
> You are right that this process is not properly documented. That's
> largely because it's either very slow (emulated) or very difficult
> (cross). If you do have some fast-enough hardware of the right type
> then it's fairly tractable, although toolchain-defaulting is still an
> awkward process requiring a flavoured rebuild of the toolchain. We do
> at least have that in place now though.
>
> I'll add this to my list of 'missing wiki pages'. We should at least
> document the flavoured-native-rebuild process, as that is in place,
> even if it doesn't help in your case.

Hi Wookey,

If you can give me some short comments on how you do the flavoured-native-rebuild, I can possibly customize it for my scenario and possibly get back with a good document that might be helpful for others too. Just a thought.

> We are working on this. Debian/Ubuntu was never designed for
> cross-building, so a fair amount of work is needed to make this an
> easy, slick process.
>
> What you probably actually want is for someone else to have already
> built a no-VFP flavour of the distro that you could just use.
> Presumably that would be fine? (Debian's existing armel port might be
> of use, unless you also need everything to be built for v7, or to be
> actually using the Ubuntu sources for some reason.)

Yes, if someone has a prebuilt noVFP flavour, I can use that, but it needs to be built for ARMv7 only, as my target is a v7 platform :) Ubuntu is not a special requirement for me, so I have already explored the armel port of Debian too, but found the same restrictions as you mentioned.

> My belief is that in practice most people would be satisfied if a
> small number of flavours (v5, v6, v7 noVFP) were pre-built. Shout if
> that's not true for you.

As an end user I totally agree with you. But from a developer's perspective I would like to have a system which I can build from scratch and see how it works, so that I can readily customize it in future. A common example is ALIP. It was well maintained, with some good preliminary tutorials, and it can be customized easily for various configs.

> Nevertheless there are enough awkward cases that making it easy to
> bootstrap a new flavour without relevant hardware is an important goal
> (a la ALIP/AEL).
> [details of the cross-building and multiarch work, quoted in full
> above - snipped]
>
> I am trying to keep this page https://wiki.linaro.org/CrossBuilding as
> a good overview. That has a currently more-or-less empty 'Rebuilding
> everything for a new ABI/flavour' section. I'll fill that out now.
>
> Wookey

Hmm, ok, let me see this page. Thanks for the help!

--
Thanks
Amit Mahajan
Re: MMC double buffering
Hi,

It's interesting. Can you send us your working code so we can test it in our environment (Samsung SoC)?

Thank you,
Kyungmin Park

On Sat, Dec 18, 2010 at 12:38 AM, Per Forlin wrote:
> Hi again,
>
> I made a mistake in my double buffering implementation.
> I assumed dma_unmap did not do any cache operations. Well, it does.
> Due to L2 read prefetch the L2 needs to be invalidated at dma_unmap.
>
> [measurements and the original 16 December message, quoted in full
> above - snipped]