Re: RFC: Dynamic hwcaps

2010-12-05 Thread Mark Mitchell
On 12/3/2010 11:35 AM, Dave Martin wrote:

> What you describe is one of two mechanisms currently in use--- the
> other is for a single library to contain two implementations of
> certain functions and to choose between them based on the hwcaps.
> Typically, one set of functions is chosen at library initialisation
> time.  Some libraries, such as libpixman, are implemented this way;
> and it's often preferable since the proportion of functions in a
> library which get significant benefit from special instruction set
> extensions is often pretty small.

I've believed for a long time that we should try to encourage this
approach.  The current approach (different libraries for each hardware
configuration) is prevalent, both in the toolchain ("multilibs") and in
other libraries -- but it seems to me premised on the idea that one is
building everything from source for one's particular hardware.  In the
earlier days of FOSS, the typical installation model was to download a
source tarball, build it, and install it on your local machine.  In that
context, tuning the library "just so" for your machine made sense.  But,
to enable binary distribution, having to have N copies of a library (let
alone an application) for N different ARM core variants just doesn't
make sense to me.

So, I certainly think that things like STT_GNU_IFUNC (which enable
determination of which routine to use at application start-up) make a
lot of sense.
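
To make the single-library approach concrete, here is a minimal sketch of
choosing between two implementations once, at initialisation time.  It uses
an ordinary function pointer rather than STT_GNU_IFUNC itself; with IFUNC,
the dynamic linker makes the equivalent choice when it resolves the symbol.
All names here (demo_lib_init, demo_add, DEMO_HWCAP_NEON) are hypothetical,
and the "NEON" routine is a plain C stand-in so that the sketch builds on
any host.  On a real ARM Linux system the capability mask would presumably
come from getauxval(AT_HWCAP) and the bit from <asm/hwcap.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit only (assumption): real code would use HWCAP_NEON
   from <asm/hwcap.h> and read the mask with getauxval(AT_HWCAP). */
#define DEMO_HWCAP_NEON (1UL << 12)

typedef void (*add_fn)(int *dst, const int *src, size_t n);

/* Portable fallback implementation. */
static void add_generic(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Stand-in for a NEON implementation; plain C so the sketch is portable. */
static void add_neon_standin(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Selected implementation; defaults to the portable one. */
static add_fn add_impl = add_generic;

/* Called once at library initialisation with the hardware capability mask. */
void demo_lib_init(unsigned long hwcaps)
{
    add_impl = (hwcaps & DEMO_HWCAP_NEON) ? add_neon_standin : add_generic;
}

/* Public entry point: one indirect call, no per-call capability test. */
void demo_add(int *dst, const int *src, size_t n)
{
    assert(add_impl != NULL);
    add_impl(dst, src, n);
}

/* Reports which implementation is currently selected (for inspection). */
int demo_using_neon(void)
{
    return add_impl == add_neon_standin;
}
```

Only the handful of routines that benefit from the extension need a second
implementation; the rest of the library is built once.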

I think your idea of exposing whether a unit is "ready", to allow even
more fine-grained choices as an application runs, is clever.  I don't
really know enough to say whether most applications could take advantage
of that.  One of the problems I see is that you need global information,
not local information.  In particular, if I'm using NEON to implement
the inner loop of some performance-critical application, then when the
unit is not ready, I want the kernel to wake it up already!  But, if I'm
just using NEON to do some random computation off the critical path, I'm
probably happy to do it slowly if that's more efficient than waking up
the NEON unit.  But, which of these cases I'm in isn't always locally
known at the point I'm doing the computation; the computation may be
buried in a small library routine.

Do we have good examples of applications that could profit from this
capability?

-- 
Mark Mitchell
CodeSourcery
m...@codesourcery.com
(650) 331-3385 x713

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev


Re: RFC: Dynamic hwcaps

2010-12-06 Thread Mark Mitchell
On 12/6/2010 5:07 AM, Dave Martin wrote:

>> But,
>> to enable binary distribution, having to have N copies of a library (let
>> alone an application) for N different ARM core variants just doesn't
>> make sense to me.
> 
> Just so, and as discussed before improvements to package managers
> could help here to avoid installing duplicate libraries.  (I believe
> that rpm may have some capability here (?) but deb does not at
> present).

Yes, a smarter package manager could help a device builder automatically
get the right version of a library.  But, something more fundamental has
to happen to avoid the library developer having to *produce* N versions
of a library.  (Yes, in theory, you just type "make" with different
CFLAGS options, but in practice of course it's often more complex than
that, especially if you need to validate the library.)

> Currently, I don't have many examples-- the main one is related to the
> discussions around using NEON for memcpy().  This can be a performance
> win on some platforms, but except when the system is heavily loaded,
> or when NEON happens to be turned on anyway, it may not be
> advantageous for the user or overall system performance.

How good of a proxy would the length of the copy be, do you think?  If
you want to copy 1G of data, and NEON makes you 2x-4x faster, then it
seems to me that you probably want to use NEON, almost independent of
overall system load.  But, if you're only going to copy 16 bytes, even
if NEON is faster, it's probably OK not to use it -- the function-call
overhead to get into memcpy at all is probably significant relative to
the time you'd save by using NEON.  In between, it's harder, of course
-- but perhaps if memcpy is the key example, we could get 80% of the
benefit of your idea simply by a test inside memcpy as to the length of
the data to be copied?
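
The length test suggested above can be sketched roughly as follows.  This
is an illustration only: demo_memcpy, demo_choose_path and the 512-byte
cut-over are hypothetical, the threshold would have to be measured on real
hardware, and the standard memcpy stands in for both copy loops so that the
sketch builds on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-over point (assumption); a real value must be measured. */
#define DEMO_NEON_MIN_BYTES 512

enum demo_path { DEMO_PATH_CORE, DEMO_PATH_NEON };

/* Use NEON only when it is available and the copy is long enough for the
   expected speed-up to outweigh the cost of using the unit. */
enum demo_path demo_choose_path(size_t n, int have_neon)
{
    if (have_neon && n >= DEMO_NEON_MIN_BYTES)
        return DEMO_PATH_NEON;
    return DEMO_PATH_CORE;
}

void *demo_memcpy(void *dst, const void *src, size_t n, int have_neon)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    if (demo_choose_path(n, have_neon) == DEMO_PATH_NEON) {
        /* A NEON copy loop would go here; memcpy is a stand-in. */
        return memcpy(dst, src, n);
    }
    /* Short copies stay on the integer core; memcpy is a stand-in. */
    return memcpy(dst, src, n);
}
```

The test costs one comparison per call and needs no information from the
kernel, which is what makes length attractive as a proxy.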

-- 
Mark Mitchell
CodeSourcery
m...@codesourcery.com
(650) 331-3385 x713



Re: Adding packages to image seeds

2010-08-10 Thread Mark Mitchell
Amit Kucheria wrote:

> Is security that much of an issue for linaro images, given that the images
> are not meant as end-user/product-focused images but rather technology demos.

Demos tend to get turned into products rather quickly in some cases. :-)
I wouldn't be at all amazed to find Linaro images forming the core of
small products.

Thanks,

-- 
Mark Mitchell
CodeSourcery
m...@codesourcery.com
(650) 331-3385 x713
