* Trent Piepho on Thursday, October 14, 2010 at 17:38:39 -0700
> On Thu, Oct 14, 2010 at 2:43 PM, Christian Ebert <blacktr...@gmx.net> wrote:
> 
>> * Bernhard Praschinger on Thursday, October 14, 2010 at 19:06:33 +0200
>>>> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
>>>> checking if sse2 is available.  Using CPUID isn't enough, because sse
>>>> requires OS support that might not be there.  I.e., the cpu supports
>>>> sse2 but you're not actually able to use it.  Probably not much of an issue
>>>> on OSX.
>>>> 
>>>> Easy fix would just be to change the sse detection asm to save and
>>>> restore ebx.
>>>> 
>>>> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
>>>> "ecx");
>>>> 
>>>> or better
>>>> 
>>>> uint32_t tmp;
>>>> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
>>>> "=&g"(tmp) : "a"(1) : "ecx");
>>>> 
>>>> The latter is safer in general, as you can't use push or pop around any
>>>> asm code that has a parameter with a constraint that allows memory
>>>> references.  The memory reference might be relative to esp, in which
>>>> case the push/pop would move it.  Or it might not be relative to esp, in
>>>> which case the push/pop doesn't move it.  So there's no way to adjust
>>>> for it.
>>> I tested your better version. It compiles here on my Linux and Intel
>>> OS X boxes. I also did a quick test with the new version on the Linux
>>> box, and it works well.
>>> 
>>> So I would appreciate feedback on whether it works on a Mac.
>> 
>> Thanks for looking into this, but I get:
>> 
>> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops
>> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp
>> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o
>> main.o main.c
>> main.c: In function ‘main’:
>> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
>> make: *** [main.o] Error 1
> 
> Looks like you didn't actually change the needed lines.

No, I didn't, but Bernhard did:

$ cvs status yuvdenoise/main.c
===================================================================
File: main.c            Status: Up-to-date

   Working revision:    1.71
   Repository revision: 1.71    /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
   Sticky Tag:          HEAD (revision: 1.71)
   Sticky Date:         (none)
   Sticky Options:      (none)

$ cvs diff -r 1.70 yuvdenoise/main.c
Index: yuvdenoise/main.c
===================================================================
RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
retrieving revision 1.70
retrieving revision 1.71
diff -u -r1.70 -r1.71
--- yuvdenoise/main.c   10 Oct 2010 13:01:55 -0000      1.70
+++ yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000      1.71
@@ -810,8 +810,8 @@
 /* 4 to 5 times faster */
 void filter_plane_median_sse2(uint8_t *plane, int w, int h, int level) {
        int i;
-       int avg;
-       int cnt;
+       /* int avg; should not be needed any more */
+       /* int cnt; should not be needed any more */
        uint8_t * p;
        uint8_t * d;
        
@@ -1326,10 +1326,12 @@
 static void init_accel() {
        filter_plane_median = filter_plane_median_p;
        temporal_filter_planes = temporal_filter_planes_p;
-       
+       uint32_t tmp;
+
 #if defined(__SSE2__)
        int d = 0;
-       __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx");
+/*     __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx"); */
+       __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d), "=&g"(tmp) : "a"(1) : "ecx");
        if ((d & (1 << 26))) {
                mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
                temporal_filter_planes = temporal_filter_planes_sse2;
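For reference, here is a minimal sketch of the same check written so that ebx never appears in the clobber list on 32-bit builds. It is not the code in CVS: the helper name cpu_reports_sse2 is hypothetical, and the xchgl idiom is swapped in for the movl pair in the diff above. It also marks eax as read-write, since cpuid overwrites it. As noted earlier in the thread, CPUID only says what the CPU supports, not whether the OS lets you use it.

```c
#include <stdint.h>

/* Hypothetical helper, not part of yuvdenoise/main.c.
 * Returns 1 if CPUID leaf 1 sets EDX bit 26 (SSE2), otherwise 0.
 * Assumes GCC-style extended asm (gcc or clang). */
int cpu_reports_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t eax = 1;            /* leaf 1 = feature flags; cpuid overwrites eax */
    uint32_t ebx_out, ecx_out;   /* written by cpuid, unused here */
    uint32_t edx = 0;
#if defined(__i386__)
    /* ebx may be the PIC register, so it must not be named as an operand
     * or clobber. Swap it into a scratch register around cpuid instead. */
    __asm__ volatile("xchgl %%ebx, %1\n\t"
                     "cpuid\n\t"
                     "xchgl %%ebx, %1"
                     : "+a"(eax), "=&r"(ebx_out), "=c"(ecx_out), "=d"(edx));
#else
    /* On x86-64, ebx can normally be used directly as an output. */
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx_out), "=c"(ecx_out), "=d"(edx));
#endif
    (void)ebx_out;
    (void)ecx_out;
    return (int)((edx >> 26) & 1u);
#else
    return 0;                    /* not an x86 target: report no SSE2 */
#endif
}
```

In init_accel() the test would then read if (cpu_reports_sse2()) instead of if ((d & (1 << 26))). The scratch operand uses "r" rather than "g" so that it is always a register, which xchgl handles without an implicit bus lock.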


c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions

_______________________________________________
Mjpeg-users mailing list
Mjpeg-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mjpeg-users
