Building Linaro-netbook filesystem from sources

2010-12-17 Thread Amit Mahajan
Hi,

I want to build the linaro-netbook filesystem from *sources*, as I need a
filesystem built without VFP support.

I checked the wiki and other links but could not find a related
document.

Can anyone please point me to directions on how to do the same?

-- 
Thanks
Amit Mahajan



___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: Building Linaro-netbook filesystem from sources

2010-12-17 Thread Andrew Stubbs

On 17/12/10 08:01, Amit Mahajan wrote:

I want to build the linaro-netbook filesystem from *sources*, as I need a
filesystem built without VFP support.

I checked the wiki and other links but could not find a related
document.

Can anyone please point me to directions on how to do the same?


I don't know if there's an official way to do that, but here's what I 
would do:


1. Find a board you can run the existing installation on. Cross building 
packages is hard, so it'll be easier to bootstrap it this way.


2. On this board, download the compiler sources:

  apt-get source gcc

3. Tweak the compiler configuration flags in the debian directory so 
that they set up the VFP as you want it, and build the compiler:


  sudo apt-get build-dep gcc
  cd gcc-*/ && dpkg-buildpackage -b -uc

4. Install the new gcc into the build board (I would recommend doing 
this work in a chroot in case something goes wrong ...):


  dpkg -i gcc*.deb

5. Squirrel away the newly built .deb files.

6. Repeat for all packages until you have no packages depending on VFP 
in your system. Most won't require any reconfiguration, but you never 
know. It's probably best to start with glibc, and then other libraries, 
just to make sure the headers and configure tests are right. (Maybe 
non-VFP kernel headers also, but you'll need to run a VFP-enabled kernel 
until you're done rebuilding everything.)


Your build file-system should now run on your netbook, although it'll be 
chock full of -dev packages, so you might have to clean it up a bit.
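
To make steps 2-6 concrete, here is a rough sketch of one iteration of that 
loop for the compiler itself. The configure option and directory names are 
illustrative guesses rather than the exact knobs the gcc packaging uses, so 
check debian/rules for the real ones:

  # inside the chroot on the build board
  apt-get source gcc             # or the versioned gcc-4.x source package
  sudo apt-get build-dep gcc
  cd gcc-*/
  # edit the configure flags in the debian/ packaging so the compiler
  # defaults to soft-float / no VFP (something like --with-float=soft)
  dpkg-buildpackage -b -uc -us   # build unsigned binary packages
  cd ..
  sudo dpkg -i *.deb             # install the rebuilt packages
  mkdir -p ~/no-vfp-debs && cp *.deb ~/no-vfp-debs/   # squirrel them away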


This should work because the default ARM EABI is the same whether you 
use VFP, or not. If/when we choose to switch to the hard-fp EABI variant 
(in theory, it's more efficient), then tricks like this will be more 
difficult.
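
For reference, this maps onto GCC's -mfloat-abi options (the compiler name 
here is just an example cross compiler; natively it would be plain gcc):

  # soft-float ABI, FP done entirely in software - what a no-VFP system needs
  arm-linux-gnueabi-gcc -mfloat-abi=soft -c foo.c
  # same calling convention, but uses VFP instructions internally;
  # link-compatible with the above, won't run without VFP hardware
  arm-linux-gnueabi-gcc -mfloat-abi=softfp -mfpu=vfp -c foo.c
  # hard-float ABI: FP arguments passed in VFP registers - the different,
  # incompatible "hard-fp" EABI variant mentioned above
  arm-linux-gnueabi-gcc -mfloat-abi=hard -mfpu=vfp -c foo.c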


Hope that helps

Andrew

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: Building Linaro-netbook filesystem from sources

2010-12-17 Thread Amit Mahajan
On Fri, 2010-12-17 at 11:30 +, Andrew Stubbs wrote:
> On 17/12/10 08:01, Amit Mahajan wrote:
> > I want to build the linaro-netbook filesystem from *sources*, as I need a
> > filesystem built without VFP support.
> >
> > I checked the wiki and other links but could not find a related
> > document.
> >
> > Can anyone please point me to directions on how to do the same?
> 
> I don't know if there's an official way to do that, but here's what I 
> would do:
> 
> 1. Find a board you can run the existing installation on. Cross building 
> packages is hard, so it'll be easier to bootstrap it this way.
> 
> 2. On this board, download the compiler sources:
> 
>apt-get source gcc
> 
> 3. Tweak the compiler configuration flags in the debian directory so 
> that they set up the VFP as you want it, and build the compiler:
> 
>sudo apt-get build-dep gcc
>cd gcc-*/ && dpkg-buildpackage -b -uc
> 
> 4. Install the new gcc into the build board (I would recommend doing 
> this work in a chroot in case something goes wrong ...):
> 
>dpkg -i gcc*.deb
> 
> 5. Squirrel away the newly built .deb files.
> 
> 6. Repeat for all packages until you have no packages depending on VFP 
> in your system. Most won't require any reconfiguration, but you never 
> know. It's probably best to start with glibc, and then other libraries, 
> just to make sure the headers and configure tests are right. (Maybe 
> non-VFP kernel headers also, but you'll need to run a VFP-enabled kernel 
> until you're done rebuilding everything.)
> 
> Your build file-system should now run on your netbook, although it'll be 
> chock full of -dev packages, so you might have to clean it up a bit.
> 
> This should work because the default ARM EABI is the same whether you 
> use VFP, or not. If/when we choose to switch to the hard-fp EABI variant 
> (in theory, it's more efficient), then tricks like this will be more 
> difficult.
> 
> Hope that helps
> 
> Andrew

Hi Andrew,

Thanks for your help. The procedure you mentioned looks good. But I have
2 points here:
1. Right now I do not have access to a board. I think I can probably use
QEMU to simulate my hardware.

2. Compiling each package individually will be a long process. I wonder
if Ubuntu has something like ALIP (ARM Linux Internet Platform), which
can be readily used with scratchbox.

-- 
Thanks
Amit Mahajan


___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Linaro Infrastructure Team Weekly Report (2010-12-10 to 2010-12-16)

2010-12-17 Thread Ian Smith
All,

The weekly report for the Linaro Infrastructure team may be found at:-
Status report: https://wiki.linaro.org/Platform/Infrastructure/Status/2010-12-16
Burndown chart: This link is awaiting the production of new burndown charts.

The Infrastructure-related blueprints from the maverick cycle, of which 4 are
currently active (4 from the last report), show 8 work items in progress
(8 last report) and 11 work items still to undertake (11 last report). These
are to be moved into the natty cycle if still required.

* arm-m-validation-dashboard; 1 work item completed; 3 in progress; 6 to do (7 
last report); 1 work item added
* arm-m-image-building-console; no change in status from last report; 3 in 
progress; 3 to do
* arm-m-automated-testing-framework; no change in status from last report; 1 in 
progress; 0 to do
* arm-m-testsuites-and-profilers; no change in status from last report; 1 in 
progress; 1 to do

For the natty cycle, the following lists the currently active Blueprints and
those planned but not yet started. Currently there are 4 active Blueprints
(4 from the last report), which show 8 work items in progress (8 last
report), 41 work items to undertake (41 last report), 0 work items postponed
(0 last report) and 1 work item added (8 items added last report).

 * other-linaro-n-improve-linaro-media-create: 7 work items completed in
   total (5 last week); 3 work items in progress (4 last week); 8 work items
   to do (8 last week); 1 work item added this week (2 last week).
 * other-linaro-n-test-result-display-in-launch-control: 0 work items
   completed (0 last week); 1 work item in progress (1 last week); 10 work
   items to do (10 last week); 0 work items added (0 last week)
 * other-linaro-n-patch-tracking: 0 work items completed (0 last week);
   2 work items in progress (2 last week); 9 work items to do (9 last week)
 * other-linaro-n-image-building: 2 work items in progress (2 last week);
   5 work items to do (5 last week); 2 work items postponed (2 last week);
   0 work items added (0 last week)
 * other-linaro-n-continuous-integration: Not started - awaiting a Hudson
   build server (RT#42278).
 * other-linaro-n-package-development-tools: Not started; 9 work items to do

Specifics accomplished this week include:-

 * Set up hardware in Cambridge for automated testing; worked on images to boot 
a stable environment, and be used to flash the test image
 * Progress on blueprint tracking - Code changes seem to be sufficient to get 
burndown charts going again
 * other-linaro-n-image-building: "4.6 Multiple hwpacks": DONE
 * ImproveLinaroMediaCreate: port setup_sizes, calculatesize and 
get_device_by_id to python: DONE
 * Working on the extended test plan for ux500, three new test cases added to 
the plan this week.

Please let me know if you have any comments or questions.

Kind Regards,


Ian

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: Building Linaro-netbook filesystem from sources

2010-12-17 Thread Andrew Stubbs

1. Right now I do not have access to board. I think probably I can use
QEMU for simulating my hardware.


Yes, I expect so, but it won't be fast!

Does the kernel not have a VFP emulation mode that might make the 
existing binaries work on your netbook, at least well enough for 
bootstrapping purposes? Just a thought ... it might be better than QEMU?
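
If you do end up on QEMU, one way to avoid booting a full emulated system is 
a user-mode-emulated chroot on a PC - roughly along these lines, assuming the 
host has debootstrap and qemu-user-static (with binfmt support) installed; 
the suite name and paths here are just placeholders:

  sudo debootstrap --foreign --arch=armel maverick /srv/armel-chroot \
      http://ports.ubuntu.com/ubuntu-ports
  sudo cp /usr/bin/qemu-arm-static /srv/armel-chroot/usr/bin/
  sudo chroot /srv/armel-chroot /debootstrap/debootstrap --second-stage
  sudo chroot /srv/armel-chroot   # ARM binaries now run via qemu-arm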



2. Compiling each package individually will be a long process. I wonder
if Ubuntu has something like ALIP (ARM linux internet platform), which
can be readily used with scratchbox.


That would be nice, but I don't know of such a thing, and would it work 
for builds in a custom environment?


Anyway, here's another top-tip: use distcc. This is a tool that gives 
you a 'virtual' compiler. It does all the preprocessing and linking 
using the native tools (in QEMU, in your case), but sends the 
preprocessed source to a distcc server on another machine for the actual 
compilation job.


The distcc server can be another ARM machine, but equally it can be a PC 
with a suitable ARM cross compiler.


You can set up multiple distcc servers, each configured to run multiple 
compile jobs, if you wish, and then run the build with 
DEB_BUILD_OPTIONS=parallel=2 (or whatever) in the environment, and maybe 
get a performance boost, depending on how QEMU performs at the preprocessing.
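
As a rough illustration of that setup (host names, the network range and the 
job count are placeholders, and the cross gcc on the PCs needs to match the 
target's compiler version):

  # on each helper PC, with an ARM cross compiler installed
  # (arrange for 'gcc' as seen by distccd to be the ARM cross compiler,
  #  e.g. via a symlink directory placed first in distccd's PATH):
  distccd --daemon --allow 192.168.1.0/24

  # on the ARM (or QEMU) build machine:
  export DISTCC_HOSTS="buildpc1 buildpc2"
  export PATH=/usr/lib/distcc:$PATH    # Debian/Ubuntu distcc masquerade dir
  DEB_BUILD_OPTIONS=parallel=4 dpkg-buildpackage -b -uc -us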


Back in a former job, I used to have 6 SH-64 boards running package 
builds via distcc, with the compilers running on 8 x86 build servers, 
and I could rebuild the entire distro in one night. Of course, that was 
only a small distro I put together myself - nothing on the scale of 
Ubuntu, and the boards were faster than QEMU, probably.


Distcc is often mentioned in conjunction with ccache, but caching object 
files isn't really very interesting if you only build each package once. 
There might be some small advantage in speeding up repeated configure 
tests, I suppose, but I suggest not bothering with it.


Hope that helps

Andrew

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: Building Linaro-netbook filesystem from sources

2010-12-17 Thread Wookey
+++ Amit Mahajan [2010-12-17 17:19 +0530]:
> On Fri, 2010-12-17 at 11:30 +, Andrew Stubbs wrote:
> > > I checked the wiki and other links but could not find a related
> > > document.

You are right that this process is not properly documented. That's
largely because it's either very slow (emulated) or very
difficult (cross). If you do have some fast-enough hardware of the
right type then it's fairly tractable, although toolchain-defaulting
is still an awkward process requiring a flavoured rebuild of the
toolchain. We do at least have that in place now though.

I'll add this to my list of 'missing wiki pages'. We should at least
document the flavoured-native-rebuild process as that is in place,
even if it doesn't help in your case.

> Thanks for your help. The procedure you mentioned looks good. But I have
> 2 points here:
> 1. Right now I do not have access to board. I think probably I can use
> QEMU for simulating my hardware.
> 
> 2. Compiling each package individually will be a long process. I wonder
> if Ubuntu has something like ALIP (ARM linux internet platform), which
> can be readily used with scratchbox.

We are working on this. Debian/Ubuntu was never designed for
cross-building so a fair amount of work is needed to make this an
easy, slick process.

What you probably actually want is for someone else to have already
built a no-VFP flavour of the distro that you could just use.
Presumably that would be fine? (Debian's existing armel port might be
of use, unless you also need everything to be built for v7, or to be
actually using the ubuntu sources for some reason?)

My belief is that in practice most people would be satisfied if a
small number of flavours (v5, v6, v7 noVFP) were pre-built. Shout if
that's not true for you.

Nevertheless there are enough awkward cases that making it easy to
bootstrap a new flavour without relevant hardware is an important goal
(a-la ALIP/AEL). For this to work we need the main core of 200-odd
packages to cross-build properly, for circular build-dependencies to
be breakable, and for cross-build tools to be able to reliably do the
right thing with dependencies. About half of those 200 packages do now
cross (sometimes with not-yet-merged patches), circular
build-dependencies are being removed (with staged builds), and the
cross-building tools and meta-data are being improved so that the
process is automatable and reliable. That last part depends on
multiarch being fully implemented in order to have cross-dependencies
correctly described in package metadata.

For this stuff to stay working we also need continuous cross-build
testing, otherwise it will bitrot. 

I expect this to be a working, demonstrable, process within the next
couple of months, but from a patched set of sources because various
aspects can't go in the main repo yet.

I am trying to keep this page https://wiki.linaro.org/CrossBuilding as
a good overview. That has a currently more-or-less empty 'Rebuilding
everything for a new ABI/flavour' section. I'll fill that out now.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: MMC double buffering

2010-12-17 Thread Per Forlin
Hi again,

I made a mistake in my double buffering implementation.
I assumed dma_unmap did not do any cache operations. Well, it does.
Due to L2 read prefetch the L2 needs to be invalidated at dma_unmap.

I made a quick test to see how much throughput would improve if
dma_unmap could be run in parallel.
In this run dma_unmap is removed.

Then the figures for read become:
* 7-16 % gain with double buffering in the ideal case, closing in on
the same performance as PIO.

Relative diff: MMC-VANILLA-DMA-LOG -> MMC-MMCI-2-BUF-DMA-LOG-NO-UNMAP
CPU is abs diff
                                                        random  random
        KB      reclen  write   rewrite read    reread  read    write
        51200   4       +0%     +0%     +7%     +8%     +2%     +0%
        cpu:            +0.0    +0.0    +0.7    +0.7    -0.0    +0.0

        51200   8       +0%     +0%     +10%    +10%    +6%     +0%
        cpu:            -0.1    +0.1    +0.6    +0.9    +0.3    +0.0

        51200   16      +0%     +0%     +11%    +11%    +8%     +0%
        cpu:            -0.0    -0.1    +0.9    +1.0    +0.3    +0.0

        51200   32      +0%     +0%     +13%    +13%    +10%    +0%
        cpu:            -0.1    +0.0    +1.0    +0.5    +0.8    +0.0

        51200   64      +0%     +0%     +13%    +13%    +12%    +1%
        cpu:            +0.0    +0.0    +0.4    +1.0    +0.9    +0.1

        51200   128     +0%     +5%     +14%    +14%    +14%    +1%
        cpu:            +0.0    +0.2    +1.0    +0.9    +1.0    +0.0

        51200   256     +0%     +2%     +13%    +13%    +13%    +1%
        cpu:            +0.0    +0.1    +0.9    +0.3    +1.6    -0.1

        51200   512     +0%     +1%     +14%    +14%    +14%    +8%
        cpu:            -0.0    +0.3    +2.5    +1.8    +2.4    +0.3

        51200   1024    +0%     +2%     +14%    +15%    +15%    +0%
        cpu:            +0.0    +0.3    +1.3    +1.4    +1.3    +0.1

        51200   2048    +2%     +2%     +15%    +15%    +15%    +4%
        cpu:            +0.3    +0.1    +1.6    +2.1    +0.9    +0.3

        51200   4096    +5%     +3%     +15%    +16%    +16%    +5%
        cpu:            +0.0    +0.4    +1.1    +1.7    +1.7    +0.5

        51200   8192    +5%     +3%     +16%    +16%    +16%    +2%
        cpu:            +0.0    +0.4    +2.0    +1.3    +1.8    +0.1

        51200   16384   +1%     +1%     +16%    +16%    +16%    +4%
        cpu:            +0.1    -0.2    +2.3    +1.7    +2.6    +0.2

I will work on adding unmap to double buffering next week.

/Per

On 16 December 2010 15:15, Per Forlin  wrote:
> Hi,
>
> I am working on the blueprint
> https://blueprints.launchpad.net/linux-linaro/+spec/other-storage-performance-emmc.
> Currently I am investigating performance for DMA vs PIO on eMMC.
>
> Pros and cons for DMA on MMC
> + Offloads CPU
> + Fewer interrupts, one single interrupt for each transfer compared to
> 100s or even 1000s
> + Power save, DMA consumes less power than CPU
> - Less bandwidth / throughput compared to PIO-CPU
>
> The reason for introducing double buffering in the MMC framework is to
> address the throughput issue for DMA on MMC.
> The assumption is that the CPU and DMA have higher throughput than the
> MMC / SD-card.
> My hypothesis is that the difference in performance between PIO-mode
> and DMA-mode for MMC is due to latency for preparing a DMA-job.
> If the next DMA-job could be prepared while the current job is ongoing
> this latency would be reduced. The biggest part of preparing a DMA-job
> is maintenance of caches.
> In my case I run on U5500 (mach-ux500) which has both L1 and L2
> caches. The host mmc driver in use is the mmci driver (PL180).
>
> I have done a hack in both the MMC framework and mmci in order to make
> a proof of concept. I have run IOZone to get measurements to support my
> case.
> The next step, if the results are promising, will be to clean up my
> work and send out patches for review.
>
> The DMAC in ux500 supports two modes, LOG and PHY.
> LOG - many logical channels are multiplexed on top of one physical channel
> PHY - only one channel per physical channel
>
> DMA modes LOG and PHY have different latency, both HW- and SW-wise. One
> could almost treat them as two different DMACs. To get a wider test
> scope I have tested using both modes.
>
> Summary of the results.
> * It is optional for the mmc host driver to utilize the 2-buf
> support. 2-buf in the framework requires no change in the host drivers.
> * IOZone shows no performance hit on existing drivers* if adding 2-buf
> to the framework but not in the host driver.
>  (* So far I have only tested one driver)
> * The performance gain for DMA using 2-buf is probably proportional to
> the cache maintenance time.
>  The faster the card is the more significant the cache maintenance
> part becomes and vice versa.
> * For U5500 with 2-buf performance for DMA is:
> Throughput: DMA vanilla vs DMA 2-buf
>  * read +5-10 %
>  * write +0-3 %
> CPU load: CPU vs D

Re: Building Linaro-netbook filesystem from sources

2010-12-17 Thread Amit Mahajan
On Fri, 2010-12-17 at 12:20 +, Andrew Stubbs wrote:
> > 1. Right now I do not have access to board. I think probably I can use
> > QEMU for simulating my hardware.
> 
> Yes, I expect so, but it won't be fast!
> 
> Does the kernel not have a VFP emulation mode that might make the 
> existing binaries work on your netbook, at least well enough for 
> bootstrapping purposes? Just a thought ... it might be better than QEMU?

Yes, my kernel right now does not support VFP at all.

> > 2. Compiling each package individually will be a long process. I wonder
> > if Ubuntu has something like ALIP (ARM linux internet platform), which
> > can be readily used with scratchbox.
> 
> That would be nice, but I don't know of such a thing, and would it work 
> for builds in a custom environment?


ALIP is a bit of a tricky thing, but if you understand its build process,
it's very easy to customize. Right now I use its v7-without-VFP
variant.


> Anyway, here's another top-tip: use distcc. This is a tool that gives 
> you a 'virtual' compiler. It does all the preprocessing and linking 
> using the native tools (in QEMU, in your case), but sends the 
> preprocessed source to a distcc server on another machine for the actual 
> compilation job.
> 
> The distcc server can be another ARM machine, but equally it can be a PC 
> with a suitable ARM cross compiler.
> 
> You can set up multiple distcc servers, each configured to run multiple 
> compile jobs, if you wish, and then run the build with 
> DEB_BUILD_OPTIONS=parallel=2 (or whatever) in the environment, and maybe 
> get a performance boost, depending how QEMU performs at the preprocessing.
> 
> Back in a former job, I used to have 6 SH-64 boards running package 
> builds via distcc, with the compilers running on 8 x86 build servers, 
> and I could rebuild the entire distro in one night. Of course, that was 
> only a small distro I put together myself - nothing on the scale of 
> Ubuntu, and the boards were faster than QEMU, probably.
> 
> Distcc is often mentioned in conjunction with ccache, but caching object 
> files isn't really very interesting if you only build each package once. 
> There might be some small advantage in speeding up repeated configure 
> tests, I suppose, but I suggest not bothering with it.
> 
> Hope that helps
> 
> Andrew

This looks very nice. I will try to establish this environment over the
weekend. 

Thanks for great help!

-- 
Thanks
Amit Mahajan


___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: Building Linaro-netbook filesystem from sources

2010-12-17 Thread Amit Mahajan
On Fri, 2010-12-17 at 13:24 +, Wookey wrote:
> +++ Amit Mahajan [2010-12-17 17:19 +0530]:
> > On Fri, 2010-12-17 at 11:30 +, Andrew Stubbs wrote:
> > > > I checked the wiki and other links but could not find a related
> > > > document.
> 
> You are right that this process is not properly documented. That's
> largely because it's either very slow (emulated) or very
> difficult (cross). If you do have some fast-enough hardware of the
> right type then it's fairly tractable, although toolchain-defaulting
> is still an awkward process requiring a flavoured rebuild of the
> toolchain. We do at least have that in place now though.
> 
> I'll add this to my list of 'missing wiki pages'. We should at least
> document the flavoured-native-rebuild process as that is in place,
> even if it doesn't help in your case.

Hi Wookey,

If you can give me some short comments on how you do a
flavoured-native-rebuild, I can possibly customize it for my scenario
and possibly get back with a good document that might be helpful for
others too. Just a thought.

> > Thanks for your help. The procedure you mentioned looks good. But I have
> > 2 points here:
> > 1. Right now I do not have access to board. I think probably I can use
> > QEMU for simulating my hardware.
> > 
> > 2. Compiling each package individually will be a long process. I wonder
> > if Ubuntu has something like ALIP (ARM linux internet platform), which
> > can be readily used with scratchbox.
> 
> We are working on this. Debian/Ubuntu was never designed for
> cross-building so a fair amount of work is needed to make this an
> easy, slick process.
> 
> What you probably actually want is for someone else to have already
> built a no-VFP flavour of the distro that you could just use.
> Presumably that would be fine? (Debian's existing armel port might be
> of use, unless you also need everything to be built for v7, or to be
> actually using the ubuntu sources for some reason?)

Yes, if someone has a prebuilt noVFP flavour, I can use that, but it needs
to be built for ARMv7 only, as my target is a v7 platform :)

Ubuntu is not a special requirement for me, so I have already explored
the armel port of Debian too, but found the same restrictions as you
mentioned.

> My belief is that in practice most people would be satisfied if a
> small number of flavours (v5, v6, v7 noVFP) were pre-built. Shout if
> that's not true for you.

As an end user I totally agree with you. But from a developer's
perspective I would like to have a system which I can build from scratch
and see how it works, so that I can customize it readily in the future.

A common example is ALIP. It was well maintained, with some good
preliminary tutorials. It can be customized easily for various configs.

> Nevertheless there are enough awkward cases that making it easy to
> bootstrap a new flavour without relevant hardware is an important goal
> (a-la ALIP/AEL). For this to work we need the main core of 200-odd
> packages to cross-build properly, for circular build-dependencies to
> be breakable, and for cross-build tools to be able to reliably do the
> right thing with dependencies. About half of those 200 packages do now
> cross (sometimes with not-yet-merged patches), circular
> build-dependencies are being removed (with staged builds), and the
> cross-building tools and meta-data are being improved so that the
> process is automatable and reliable. That last part depends on
> multiarch being fully implemented in order to have cross-dependencies
> correctly described in package metadata.
> 
> For this stuff to stay working we also need continuous cross-build
> testing, otherwise it will bitrot. 
> 
> I expect this to be a working, demonstrable, process within the next
> couple of months, but from a patched set of sources because various
> aspects can't go in the main repo yet.
> 
> I am trying to keep this page https://wiki.linaro.org/CrossBuilding as
> a good overview. That has a currently more-or-less empty 'Rebuilding
> everything for a new ABI/flavour' section. I'll fill that out now.
> 
> Wookey

Hmm, OK, let me look at this page. Thanks for the help!

-- 
Thanks
Amit Mahajan


___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: MMC double buffering

2010-12-17 Thread Kyungmin Park
Hi,

It's interesting.

Can you send your working code so we can test it in our environment (Samsung SoC)?

Thank you,
Kyungmin Park

On Sat, Dec 18, 2010 at 12:38 AM, Per Forlin  wrote:
> Hi again,
>
> I made a mistake in my double buffering implementation.
> I assumed dma_unmap did not do any cache operations. Well, it does.
> Due to L2 read prefetch the L2 needs to be invalidated at dma_unmap.
>
> I made a quick test to see how much throughput would improve if
> dma_unmap could be run in parallel.
> In this run dma_unmap is removed.
>
> Then the figures for read become:
> * 7-16 % gain with double buffering in the ideal case, closing in on
> the same performance as PIO.
>
> Relative diff: MMC-VANILLA-DMA-LOG -> MMC-MMCI-2-BUF-DMA-LOG-NO-UNMAP
> CPU is abs diff
>                                                        random  random
>        KB      reclen  write   rewrite read    reread  read    write
>        51200   4       +0%     +0%     +7%     +8%     +2%     +0%
>        cpu:            +0.0    +0.0    +0.7    +0.7    -0.0    +0.0
>
>        51200   8       +0%     +0%     +10%    +10%    +6%     +0%
>        cpu:            -0.1    +0.1    +0.6    +0.9    +0.3    +0.0
>
>        51200   16      +0%     +0%     +11%    +11%    +8%     +0%
>        cpu:            -0.0    -0.1    +0.9    +1.0    +0.3    +0.0
>
>        51200   32      +0%     +0%     +13%    +13%    +10%    +0%
>        cpu:            -0.1    +0.0    +1.0    +0.5    +0.8    +0.0
>
>        51200   64      +0%     +0%     +13%    +13%    +12%    +1%
>        cpu:            +0.0    +0.0    +0.4    +1.0    +0.9    +0.1
>
>        51200   128     +0%     +5%     +14%    +14%    +14%    +1%
>        cpu:            +0.0    +0.2    +1.0    +0.9    +1.0    +0.0
>
>        51200   256     +0%     +2%     +13%    +13%    +13%    +1%
>        cpu:            +0.0    +0.1    +0.9    +0.3    +1.6    -0.1
>
>        51200   512     +0%     +1%     +14%    +14%    +14%    +8%
>        cpu:            -0.0    +0.3    +2.5    +1.8    +2.4    +0.3
>
>        51200   1024    +0%     +2%     +14%    +15%    +15%    +0%
>        cpu:            +0.0    +0.3    +1.3    +1.4    +1.3    +0.1
>
>        51200   2048    +2%     +2%     +15%    +15%    +15%    +4%
>        cpu:            +0.3    +0.1    +1.6    +2.1    +0.9    +0.3
>
>        51200   4096    +5%     +3%     +15%    +16%    +16%    +5%
>        cpu:            +0.0    +0.4    +1.1    +1.7    +1.7    +0.5
>
>        51200   8192    +5%     +3%     +16%    +16%    +16%    +2%
>        cpu:            +0.0    +0.4    +2.0    +1.3    +1.8    +0.1
>
>        51200   16384   +1%     +1%     +16%    +16%    +16%    +4%
>        cpu:            +0.1    -0.2    +2.3    +1.7    +2.6    +0.2
>
> I will work on adding unmap to double buffering next week.
>
> /Per
>
> On 16 December 2010 15:15, Per Forlin  wrote:
>> Hi,
>>
>> I am working on the blueprint
>> https://blueprints.launchpad.net/linux-linaro/+spec/other-storage-performance-emmc.
>> Currently I am investigating performance for DMA vs PIO on eMMC.
>>
>> Pros and cons for DMA on MMC
>> + Offloads CPU
>> + Fewer interrupts, one single interrupt for each transfer compared to
>> 100s or even 1000s
>> + Power save, DMA consumes less power than CPU
>> - Less bandwidth / throughput compared to PIO-CPU
>>
>> The reason for introducing double buffering in the MMC framework is to
>> address the throughput issue for DMA on MMC.
>> The assumption is that the CPU and DMA have higher throughput than the
>> MMC / SD-card.
>> My hypothesis is that the difference in performance between PIO-mode
>> and DMA-mode for MMC is due to latency for preparing a DMA-job.
>> If the next DMA-job could be prepared while the current job is ongoing
>> this latency would be reduced. The biggest part of preparing a DMA-job
>> is maintenance of caches.
>> In my case I run on U5500 (mach-ux500) which has both L1 and L2
>> caches. The host mmc driver in use is the mmci driver (PL180).
>>
>> I have done a hack in both the MMC framework and mmci in order to make
>> a proof of concept. I have run IOZone to get measurements to support my
>> case.
>> The next step, if the results are promising, will be to clean up my
>> work and send out patches for review.
>>
>> The DMAC in ux500 supports two modes, LOG and PHY.
>> LOG - many logical channels are multiplexed on top of one physical channel
>> PHY - only one channel per physical channel
>>
>> DMA modes LOG and PHY have different latency, both HW- and SW-wise. One
>> could almost treat them as two different DMACs. To get a wider test
>> scope I have tested using both modes.
>>
>> Summary of the results.
>> * It is optional for the mmc host driver to utilize the 2-buf
>> support. 2-buf in the framework requires no change in the host drivers.
>> * IOZone shows no performance hit on existing drivers* if adding 2-buf
>> to the framework but not in the host driver.
>>  (* So far I have only tested one driver)
>> * The performance gain for DMA using 2