On Mon, 2005-01-31 at 20:30, Matt Mackall wrote: > On Mon, Jan 31, 2005 at 06:16:23PM +0100, Andreas Gruenbacher wrote: > > Hello, > > > > On Mon, 2005-01-31 at 08:34, Matt Mackall wrote: > > > This patch adds a generic array sorting library routine. This is meant > > > to replace qsort, which has two problem areas for kernel use. > > > > looks reasonable. > > > > > Note that this function has an extra parameter for passing in an > > > optimized swapping function. This is worth 10% or more over the > > > typical byte-by-byte exchange functions. > > > > I would appreciate a version without the swap callback. > > Why? To eliminate an argument?
Yes, because a custom swap routine isn't very useful generally. It's over-engineered IMHO. > > The optimized version of swap should use the machine word size > > instead of u32. > > That occurred to me, but most instances I found in my audit were using > u32 rather than long. Alright, that may be the case. > > How about this approach instead, if you think we > > must really optimize swapping? > > > > static inline void swap(void *a, void *b, int size) > > { > > if (size % sizeof(long)) { > > char t; > > do { > > t = *(char *)a; > > *(char *)a++ = *(char *)b; > > *(char *)b++ = t; > > } while (--size > 0); > > } else { > > long t; > > do { > > t = *(long *)a; > > *(long *)a = *(long *)b; > > *(long *)b = t; > > size -= sizeof(long); > > } while (size > sizeof(long)); > > } > > } > > This makes things worse. Sort isn't inlined, so we don't know size > until we're called and then we're branching in the innermost loop Well, the example code I referred to had an inlined __sort, and the size == sizeof(long) case ended up being perfectly optimized. It bloats the code somewhat though, but it's still tiny. > and growing the code footprint to boot. You mean the object footprint? I don't care much whether we use some 350 bytes more or not. > Function pointer wins in my benchmarks. I had a slowdown in the low percentage range when doing bytewise swap instead of wordwise swap as well, but function pointers were about as fast as wordwise swap. So no significant gain. > Note that there are callers like IA64 extable that want a custom swap already: > > - * stack may be more than 2GB away from the exception-table). > + * Sort the exception table. It's usually already sorted, but there > + * may be unordered entries due to multiple text sections (such as the > + * .init text section). Note that the exception-table-entries contain > + * location-relative addresses, which requires a bit of care during > + * sorting to avoid overflows in the offset members (e.g., it would > + * not be safe to make a temporary copy of an exception-table entry on > + * the stack, because the stack may be more than 2GB away from the > + * exception-table). > */ We could either leave the current insertion sort in place in arch/ia64/mm/extable.c, or transform the exception table into sortable form first, as in the below mock-up. + /* Exception-table-entries contain location-relative addresses. Convert + to addresses relative to START before sorting, and convert back to + location-relative addresses afterwards. This allows us to use the + general-purpose sort function. */ + for (p = start+1; p < finish; p++) { + int n = (void *)p - (void *)start; + p->addr += n; + p->cont += n; + } + sort(start, finish - start, sizeof(struct exception_table_entry), + compare_entries); + for (p = start+1; p < finish; p++) { + int n = (void *)p - (void *)start; + p->addr -= n; + p->cont -= n; + } Switching to the generic sort probably isn't worth it in this case. > There are a bunch of other potential users that sort parallel arrays > a[] and b[] with keys in a[] that want this too. Where? Thanks, -- Andreas Gruenbacher <[EMAIL PROTECTED]> SUSE Labs, SUSE LINUX GMBH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/