Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Bernhard Praschinger
Hello

> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
> checking if sse2 is available.  Using CPUID isn't enough, because sse
> requires OS support that might not be there.  I.e., the cpu supports
> sse2 but you're not actually able to use it.  Probably not much of an issue
> on OSX.
>
> Easy fix would just be to change the sse detection asm to save and
> restore ebx.
>
> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
> "ecx");
>
> or better
>
> uint32_t tmp;
> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
> "=&g"(tmp) : "a"(1) : "ecx");
>
> The latter is safer in general, as you can't use push or pop around any
> asm code that has a parameter with a constraint that allows memory
> references.  The memory reference might be relative to esp, in which
> case the push/pop would move it.  Or it might not be relative to esp, in
> which case the push/pop doesn't move it.  So there's no way to adjust
> for it.
I tested your better version. It compiles here on both my Linux box and my
Intel OS X box. I also ran a quick test of the new version on the Linux box,
and it works well.

I would appreciate feedback on whether it works on a Mac.
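
For reference, here is a minimal sketch of the ebx-preserving approach quoted above, assuming GCC-style inline asm. The names cpuid1_edx and has_sse2 are hypothetical and do not come from the mjpegtools source. The sketch differs from the quoted snippet in two ways, both of which are assumptions of this sketch: eax is declared read-write ("+a") because cpuid overwrites it, and the scratch operand uses "r" instead of "g" so that the saved value always sits in a register. On 32-bit x86, where ebx may hold the PIC base, it is saved and restored by hand. On x86-64 it is simply listed as an output. Bit 26 of EDX from CPUID leaf 1 is the SSE2 feature flag. As the quoted text points out, this reports CPU capability only, not OS support.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
/* Run CPUID leaf 1 and return EDX (the basic feature flags). */
static uint32_t cpuid1_edx(void)
{
    uint32_t eax = 1, ecx, edx;
#if defined(__i386__)
    uint32_t saved_ebx;
    /* ebx may be the PIC register, so keep a copy in a scratch register. */
    __asm__ volatile(
        "movl %%ebx, %[save]\n\t"
        "cpuid\n\t"
        "movl %[save], %%ebx"
        : "+a"(eax), "=c"(ecx), "=d"(edx), [save] "=&r"(saved_ebx));
    (void)saved_ebx;
#else
    uint32_t ebx;
    /* On x86-64, ebx can be named as an ordinary output. */
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx));
    (void)ebx;
#endif
    (void)ecx;
    return edx;
}
#else
/* Not an x86 target: report no feature flags. */
static uint32_t cpuid1_edx(void)
{
    return 0;
}
#endif

/* 1 if the CPU advertises SSE2 (EDX bit 26), otherwise 0. */
static int has_sse2(void)
{
    return (int)((cpuid1_edx() >> 26) & 1u);
}
```

A caller would use has_sse2() once at startup to choose between the SSE2 and plain C code paths.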

Hope to see you soon,

Berni the Chaos of Woodquarter

Email: shadowl...@utanet.at
www: http://www.lysator.liu.se/~gz/bernhard

___
Mjpeg-users mailing list
Mjpeg-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mjpeg-users


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Christian Ebert
* Bernhard Praschinger on Thursday, October 14, 2010 at 19:06:33 +0200
> I tested your better version. And it compiles here on my linux and Intel 
> osx box. I did also a quick test with the new version on the linux box. 
> And it works well.
> 
> So I would appreciate a feedback if it works on a mac.

Thanks for looking into this, but I get:

gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
-ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
-D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
main.o main.c
main.c: In function ‘main’:
main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
make: *** [main.o] Error 1

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.5.8
BuildVersion:   9L30
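
Since this build is on Mac OS X, the OS-level check mentioned earlier in the thread could look roughly like the sketch below. It assumes the sysctl key "hw.optional.sse2", which Intel Macs expose. Whether it is the right check for this code base is an assumption, and the function name os_reports_sse2 is hypothetical. On non-Apple systems the sketch just reports "unknown".

```c
#include <stddef.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Returns 1 if the OS reports SSE2 as usable, 0 if not, -1 if unknown. */
static int os_reports_sse2(void)
{
#ifdef __APPLE__
    int value = 0;
    size_t len = sizeof(value);

    /* Assumed key: hw.optional.sse2 (absent on non-Intel Macs). */
    if (sysctlbyname("hw.optional.sse2", &value, &len, NULL, 0) != 0)
        return -1;
    return value != 0;
#else
    /* No equivalent query is attempted on other systems in this sketch. */
    return -1;
#endif
}
```

A result of -1 would mean falling back to some other detection method, such as the CPUID test.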

c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Christian Ebert
* Christian Ebert on Thursday, October 14, 2010 at 23:43:20 +0200
> Thanks for looking into this, but I get:
> 
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
> main.o main.c
> main.c: In function ‘main’:
> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
> make: *** [main.o] Error 1
> 
> $ sw_vers
> ProductName:  Mac OS X
> ProductVersion:   10.5.8
> BuildVersion: 9L30

Just came across this:

http://lists.mplayerhq.hu/pipermail/mplayer-users/2010-October/081276.html

c
-- 
\black\trash movie   _COWBOY  CANOE  COMA_
Ein deutscher Western/A German Western

--->> http://www.blacktrash.org/underdogma/ccc.php



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Trent Piepho
On Thu, Oct 14, 2010 at 2:43 PM, Christian Ebert  wrote:

> Thanks for looking into this, but I get:
>
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o
> main.o main.c
> main.c: In function ‘main’:
> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
> make: *** [main.o] Error 1
>

Looks like you didn't actually change the needed lines.