> static inline int new_find_first_bit(const unsigned long *b, unsigned size)
> {
> int x = 0;
> do {
> unsigned long v = *b++;
> if (v)
> return __ffs(v) + x;
> if (x >= size)
> break;
>
Hi,
FWIW the following routine is consistently slightly faster using
Steven's test harness, with a big win when no bit is set.
static inline int new_find_first_bit(const unsigned long *b, unsigned size)
{
int x = 0;
do {
unsigned long v = *b++;
if (v)
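
[Editor's sketch] The routine quoted above is cut off in this listing, so the following is not a reconstruction of it. It is a minimal, self-contained illustration of the same word-at-a-time idea, under stated assumptions: `sketch_find_first_bit` and `SKETCH_BITS_PER_LONG` are hypothetical names, the GCC/Clang builtin `__builtin_ctzl()` stands in for the kernel's `__ffs()`, and returning `size` when no bit is set is a convention chosen for this example.

```c
#include <limits.h>   /* CHAR_BIT */

/* Bits in one bitmap word; stands in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: index of the first set bit in a bitmap of `size`
 * bits, or `size` if no bit is set. __builtin_ctzl() is undefined for 0,
 * so each word is tested before it is scanned.
 */
static inline unsigned sketch_find_first_bit(const unsigned long *b,
                                             unsigned size)
{
    unsigned x = 0;

    while (x < size) {
        unsigned long v = *b++;

        if (v) {
            unsigned bit = x + (unsigned)__builtin_ctzl(v);
            return bit < size ? bit : size;  /* ignore bits past the end */
        }
        x += SKETCH_BITS_PER_LONG;
    }
    return size;
}
```

Testing `x < size` before loading a word means a zero-length bitmap is never dereferenced, and the empty-bitmap case costs one compare per word.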
On Fri, 29 Jul 2005, Linus Torvalds wrote:
> It has another downside too: it's extra complexity and potential for bugs
> in the compiler. And if you tell me gcc people never have bugs, I will
> laugh in your general direction.
You mean these that have been sitting in their Bugzilla for some th
On Fri, 29 Jul 2005, David Woodhouse wrote:
>
> On Thu, 2005-07-28 at 10:25 -0700, Linus Torvalds wrote:
> > Basic rule: inline assembly is _better_ than random compiler extensions.
> > It's better to have _one_ well-documented extension that is very generic
> > than it is to have a thousand sp
On Fri, 29 Jul 2005, Maciej W. Rozycki wrote:
>
> Hmm, that's what's in the GCC info pages for the relevant functions
> (I've omitted the "l" and "ll" variants):
>
> "-- Built-in Function: int __builtin_ffs (unsigned int x)
> Returns one plus the index of the least significant 1-bit of X
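
[Editor's sketch] The quoted description is truncated here. As documented by GCC, `__builtin_ffs()` is one-based and returns 0 for a zero argument, whereas `__builtin_ctz()` is zero-based and undefined for 0 (like the kernel's `__ffs()`). The hypothetical helper below, not taken from the thread, shows how the two relate.

```c
/*
 * Hypothetical helper: one-based index of the least significant set bit,
 * or 0 if x == 0. Built from the zero-based __builtin_ctz(), which must
 * not be called with 0, hence the explicit test.
 */
static inline int sketch_ffs_from_ctz(unsigned int x)
{
    return x ? __builtin_ctz(x) + 1 : 0;
}
```

For any input this should agree with `__builtin_ffs()`; the off-by-one between the two conventions is easy to get wrong when swapping one for the other.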
On Thu, 28 Jul 2005, Linus Torvalds wrote:
> > Since you're considering GCC-generated code for ffs(), ffz() and friends,
> > how about trying __builtin_ffs(), __builtin_clz() and __builtin_ctz() as
> > appropriate?
>
> Please don't. Try again in three years when everybody has them.
Well, __bu
On Fri, 29 Jul 2005 [EMAIL PROTECTED] wrote:
>> OK, I guess when I get some time, I'll start testing all the i386 bitop
>> functions, comparing the asm with the gcc versions. Now could someone
>> explain to me what's wrong with testing hot cache code. Can one
>> instruction retrieve from memory
On Fri, 29 Jul 2005, David Woodhouse wrote:
> Builtins are more portable and their implementation will improve to
> match developments in the target CPU. Inline assembly, as we have seen,
> remains the same for years while the technology moves on.
>
> Although it's often the case that inline asse
On Thu, 28 Jul 2005, Linus Torvalds wrote:
> There may be more upsides on other architectures (*cough*ia64*cough*) that
> have strange scheduling issues and other complexities, but on x86 in
> particular, the __builtin_xxx() functions tend to be a lot more pain than
> they are worth. Not only d
> OK, I guess when I get some time, I'll start testing all the i386 bitop
> functions, comparing the asm with the gcc versions. Now could someone
> explain to me what's wrong with testing hot cache code. Can one
> instruction retrieve from memory better than others?
To add one to Linus' list, not
On Thu, 2005-07-28 at 10:25 -0700, Linus Torvalds wrote:
> Basic rule: inline assembly is _better_ than random compiler extensions.
> It's better to have _one_ well-documented extension that is very generic
> than it is to have a thousand specialized extensions.
Counterexample: FR-V and its __bu
On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> OK, I guess when I get some time, I'll start testing all the i386 bitop
> functions, comparing the asm with the gcc versions. Now could someone
> explain to me what's wrong with testing hot cache code. Can one
> instruction retrieve from memory bet
On Thu, 2005-07-28 at 17:34 +0100, Maciej W. Rozycki wrote:
> On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> > I've been playing with different approaches (still all hot cache
> > though), and inspecting the generated code. It's not that the gcc
> > generated code is always better for the normal
Steven Rostedt wrote:
> In the thread "[RFC][PATCH] Make MAX_RT_PRIO and MAX_USER_RT_PRIO
> configurable" I discovered that a C version of find_first_bit is faster
> than the asm version
There are probably other cases of this in asm-i386/bitops.h. For instance
I think the "btl" instruction is pr
On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> I can change the find_first_bit to use __builtin_ffs, but how would you
> implement the ffz?
The thing is, there are basically _zero_ upsides to using the __builtin_xx
functions on x86.
There may be more upsides on other architectures (*cough*ia6
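
[Editor's sketch] Purely to illustrate the question quoted above, and not as a recommendation (the reply argues against builtins on x86): one common way to express "find first zero" is to complement the word and look for the first set bit. `sketch_ffz` is a hypothetical name, and `__builtin_ctzl()` is assumed to be available (GCC/Clang).

```c
/*
 * Hypothetical helper: zero-based index of the least significant clear
 * bit in `word`. The caller must ensure at least one bit is clear
 * (word != ~0UL), because __builtin_ctzl(0) is undefined.
 */
static inline unsigned long sketch_ffz(unsigned long word)
{
    return (unsigned long)__builtin_ctzl(~word);
}
```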
On Thu, 28 Jul 2005, Maciej W. Rozycki wrote:
>
> Since you're considering GCC-generated code for ffs(), ffz() and friends,
> how about trying __builtin_ffs(), __builtin_clz() and __builtin_ctz() as
> appropriate?
Please don't. Try again in three years when everybody has them.
On Thu, 28 Jul 2005, Steven Rostedt wrote:
> I've been playing with different approaches (still all hot cache
> though), and inspecting the generated code. It's not that the gcc
> generated code is always better for the normal case. But since it sees
> more and everything is not hidden in asm, it
On Thu, 2005-07-28 at 17:34 +0100, Maciej W. Rozycki wrote:
> Since you're considering GCC-generated code for ffs(), ffz() and friends,
> how about trying __builtin_ffs(), __builtin_clz() and __builtin_ctz() as
> appropriate? Reasonably recent GCC may actually be good enough to use the
> faste
On Thu, 2005-07-28 at 08:30 -0700, Linus Torvalds wrote:
>
> I suspect the old "rep scas" has always been slower than
> compiler-generated code, at least under your test conditions. Many of the
> old asm's are actually _very_ old, and some of them come from pre-0.01
> days and are more about me
On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> The 32 looks like it may be problematic. Are there any 64-bit i386
> machines? Or is hard-coding 32 OK?
We have BITS_PER_LONG exactly for this usage, but the sizeof also works.
Linus
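
[Editor's sketch] A small illustration of the point above, assuming a userspace build where the kernel's BITS_PER_LONG is not available: the word width can be derived from `sizeof` instead of hard-coding 32. The names `SKETCH_BITS_PER_LONG` and `sketch_bits_to_longs` are hypothetical.

```c
#include <limits.h>   /* CHAR_BIT */

/* Width of unsigned long in bits, derived rather than hard-coded. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: number of words needed to hold `nbits` bits. */
static inline unsigned long sketch_bits_to_longs(unsigned long nbits)
{
    return (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
}
```

Written this way, the same source gives the right stride on both 32-bit and 64-bit targets.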
On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> In the thread "[RFC][PATCH] Make MAX_RT_PRIO and MAX_USER_RT_PRIO
> configurable" I discovered that a C version of find_first_bit is faster
> than the asm version now when compiled against gcc 3.3.6 and gcc 4.0.1
> (both from versions of Debian unsta
[snip]
> static inline int find_first_bit(const unsigned long *addr, unsigned size)
> {
[snip]
> + int x = 0;
> + do {
> + if (*addr)
> + return __ffs(*addr) + x;
> + addr++;
> + if (x >= size)
> + break;
> +
In the thread "[RFC][PATCH] Make MAX_RT_PRIO and MAX_USER_RT_PRIO
configurable" I discovered that a C version of find_first_bit is faster
than the asm version now when compiled against gcc 3.3.6 and gcc 4.0.1
(both from versions of Debian unstable). I wrote a benchmark (attached)
that runs the cod