ges people from adding inefficient code. Note that the
shifts in (unsigned, and by a compile-time constant amount)
are simpler and generated inline.
Signed-off-by: George Spelvin
Acked-By: Andrey Ryabinin
Feedback-from: Rasmus Villemoes
Cc: linux-s...@vger.kernel.org
Cc: Heiko Carstens
---
include
bsence discourages people from adding inefficient code. Note that the
shifts in (unsigned, and by a compile-time constant amount)
are simpler and generated inline.
Signed-off-by: George Spelvin
Acked-By: Andrey Ryabinin
Cc: linux-s...@vger.kernel.org
Cc: Heiko Carstens
---
lib/ubsan
s in different subsystems, and that needs handling.
Option 1: Submit this for 5.2 and turn on INT128 for s390 in 5.3.
Option 2: Let the arches cherry-pick this patch pre-5.2.
My preference is for option 2, but that requires permission from
ubsan's owner. Andrey?
Signed-off-by: George Spelvin
't actually help GCC 8.3's code generation, so just
omit it.
Signed-off-by: George Spelvin
Fixes: 820c81be5237 ("lib/list_sort: simplify and remove MAX_LIST_LENGTH_BITS")
Cc: Andrew Morton
Cc: Stephen Rothwell
---
lib/list_sort.c | 14 +-
1 file changed, 9 insert
Thank you all for the build warning report.
The warning produced by gcc versions 4.9, 7.3, 8.1, whatever version
Stephen Rothwell is running, is:
lib/list_sort.c:17:36: warning: __pure__ attribute ignored [-Wattributes]
The relevant code is:
10: /*
11: * By declaring the compare function with the
By the way, I just noticed that my fallback get_random_max64()
algorithm (if there's no __int128 type) is completely broken and
will need rewriting.
It would work if I rejected and regenerated the high half
if the low half were out of range, but that's not what it does.
The worst case is a range
P.S. The cited paper calls your algorithm the "OpenBSD algorithm"
and has a bunch of benchmarks comparing it to others in Fisher-Yates
shuffles of sizes 1e3..1e9.
Including all overhead (base PRNG, shuffle), it's 3x slower for
32-bit operations and 8x slower for 64-bit up to arrays of size
1e6, af
On Sun, 24 Mar 2019 at 21:47:50 +0100, Jason A. Donenfeld wrote:
> I generally use a slightly simpler algorithm in various different projects:
>
> //[0, bound)
> static unsigned long random_bounded(unsigned long bound)
> {
>unsigned long ret;
>const unsigned long max_mod_bound = (1
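For illustration, a minimal standalone sketch of this kind of rejection loop
(illustrative only, not the exact code quoted above; get_random_long() stands
in for whatever base generator is actually used):

#include <linux/random.h>	/* get_random_long() */

static unsigned long random_bounded_sketch(unsigned long bound)
{
	unsigned long ret, limit;

	if (bound < 2)
		return 0;
	/*
	 * limit = 2^BITS_PER_LONG % bound.  Rejecting raw values below it
	 * leaves a range whose size is an exact multiple of bound, so the
	 * final "ret % bound" is exactly uniform over [0, bound).
	 */
	limit = -bound % bound;
	do {
		ret = get_random_long();
	} while (ret < limit);
	return ret % bound;
}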
ction, there are a few extra predicted branches on the code path.)
This actually *shrinks* the x86-64 code, because it inlines the various
swap functions inside do_swap, eliding function prologues & epilogues.
x86-64 code size 767 -> 703 bytes (-64)
Signed-off-by: George Spelvin
Acke
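To make the mechanism concrete, here is a self-contained sketch of the idea
(the names and sentinel values are invented for illustration; this is not the
exact lib/sort.c code):

#include <stddef.h>

typedef void (*swap_func_t)(void *a, void *b, int size);

/* Sentinel "function pointers" standing for the built-in swaps. */
#define SWAP_WORDS	((swap_func_t)0)
#define SWAP_BYTES	((swap_func_t)1)

static void swap_words(void *a, void *b, size_t n)
{
	unsigned long *x = a, *y = b, t;

	for (n /= sizeof(long); n; n--) {
		t = *x; *x++ = *y; *y++ = t;
	}
}

static void swap_bytes(void *a, void *b, size_t n)
{
	unsigned char *x = a, *y = b, t;

	while (n--) {
		t = *x; *x++ = *y; *y++ = t;
	}
}

static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
{
	if (swap_func == SWAP_WORDS)
		swap_words(a, b, size);		/* direct call: can be inlined */
	else if (swap_func == SWAP_BYTES)
		swap_bytes(a, b, size);		/* direct call: can be inlined */
	else
		swap_func(a, b, (int)size);	/* caller-supplied: stays indirect */
}

Because the built-in swaps are reached by direct calls, the compiler can fold
them into do_swap(), which is where the size saving quoted above comes from.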
I'm running this code right now, with CONFIG_TEST_SORT and
CONFIG_TEST_LIST_SORT, but I confess I haven't rebooted since
the last round of minor edits to quell checkpatch. I figure there
will be at least one round of comments and final testing.
George Spelvin (5):
lib/sort: Make swap
the great tradition of bikeshedding, the names were
by far the most contentious issue during review of this patch series.
x86-64 code size 872 -> 886 bytes (+14)
Signed-off-by: George Spelvin
Acked-by: Andrey Abramov
Feedback-from: Andy Shevchenko
Feedback-from: Rasmus Villemoes
Feedbac
feature.
x86-64 code size 1086 -> 739 bytes (-347)
(Yes, I see checkpatch complaining about no space after comma in
"__attribute__((nonnull(2,3,4,5)))". Checkpatch is wrong.)
Signed-off-by: George Spelvin
Acked-by: Andrey Abramov
Feedback-from: Rasmus Villemoes
Feedback-from: Andy
6
https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.5380
Queue-Mergesort
Mordecai J. Golin, Robert Sedgewick
Information Processing Letters, 48(5):253--259, 10 December 1993
https://doi.org/10.1016/0020-0190(93)90088-q
https://sci-hub.tw/10.1016/0020-0190(93)90088-Q
Signed-off-by:
ode size 885 -> 767 bytes (-118)
(I see the checkpatch complaint about "else if (n -= size)".
The alternative is significantly uglier.)
Signed-off-by: George Spelvin
Acked-by: Andrey Abramov
---
lib/sort.c | 110 ++---
1 file changed, 8
s to quell checkpatch. I figure there
will be at least one round of comments and final testing.
George Spelvin (5):
lib/sort: Make swap functions more generic
lib/sort: Use more efficient bottom-up heapsort variant
lib/sort: Avoid indirect calls to built-in swap
lib/list_sort: Simplify
Indeed, thanks to everyone who commented. The extra conceptual
complexity and reduced readability is Just Not Worth It.
v2 (and final, as far as I'm concerned) follows.
On Fri, 15 Mar 2019 at 13:57:05 +0100, Geert Uytterhoeven wrote:
> On Fri, Mar 15, 2019 at 11:23 AM George Spelvin wrote:
>> On Fri, 15 Mar 2019 at 09:20:58 +0100, Geert Uytterhoeven wrote:
>>> On Fri, Mar 15, 2019 at 5:33 AM George Spelvin wrote:
>>>> One question
On Fri, 15 Mar 2019 at 09:20:58 +0100, Geert Uytterhoeven wrote:
> On Fri, Mar 15, 2019 at 5:33 AM George Spelvin wrote:
>> On Thu, 14 Mar 2019 at 11:10:41 +0200, Andy Shevchenko wrote:
>>> On Tue, Mar 05, 2019 at 03:06:44AM +0000, George Spelvin wrote:
>>>> +
On Thu, 14 Mar 2019 at 11:10:41 +0200, Andy Shevchenko wrote:
> On Tue, Mar 05, 2019 at 03:06:44AM +0000, George Spelvin wrote:
>> +for (bit = 1; count & bit; bit <<= 1) {
>> +cur = merge(priv, (cmp_func)cmp, pending, cur);
>> +
>> swap_bytes / swap_4byte_words / swap_8byte_words
>> swap_bytes / swap_ints / swap_longs
>> swap_1 / swap_4 / swap_8
>> Pistols at dawn?
On Thu, 14 Mar 2019 at 22:59:55 +0300, Andrey Abramov wrote:
> Yes, in my opinion, swap_bytes / swap_ints / swap_longs are the
> most readable because we have
On Thu, 14 Mar 2019 at 11:41:26 +0100, Geert Uytterhoeven wrote:
> On Thu, Mar 14, 2019 at 11:10 AM George Spelvin wrote:
>> On Sat, 09 Mar 2019 at 23:19:49 +0300, Andrey Abramov wrote:
>>>> How about one of:
>>>> swap_bytes / swap_ints / swap_longs
>>>>
On Sat, 09 Mar 2019 at 23:19:49 +0300, Andrey Abramov wrote:
>> Although I'm thinking of:
>>
>> static bool __attribute_const__
>> is_aligned(const void *base, size_t size, unsigned char align)
>> {
>> unsigned char lsbits = (unsigned char)size;
>>
>> (void)base;
>> #ifndef CONFIG_HAVE_EF
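For context, a sketch of how that helper plausibly completes (assuming the
truncated guard is CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, which the
"CONFIG_HAVE_EF" above suggests):

static bool __attribute_const__
is_aligned(const void *base, size_t size, unsigned char align)
{
	unsigned char lsbits = (unsigned char)size;

	(void)base;
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
	lsbits |= (unsigned char)(uintptr_t)base;
#endif
	return (lsbits & (align - 1)) == 0;
}

The point of the check: if either the size or (on strict-alignment
architectures) the base address has any of the low align-1 bits set, the
caller must fall back to the byte-by-byte swap.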
On Thu, 14 Mar 2019 at 11:10:41 +0200, Andy Shevchenko wrote:
> On Tue, Mar 05, 2019 at 03:06:44AM +0000, George Spelvin wrote:
>> +/* Do merges corresponding to set lsbits in count */
>
>> +for (bit = 1; count & bit; bit <<= 1) {
>> +
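To illustrate what that loop does, here is a self-contained sketch of the
counting trick, using an explicit array of pending runs and a made-up node
type rather than the encoded list lib/list_sort.c really uses.  After each
insertion, the trailing set bits of count say exactly which equal-sized runs
must be merged, just like carries in a binary increment, so pending[k] always
holds a sorted run of exactly 2^k nodes (or is empty):

#include <stddef.h>

struct node { int key; struct node *next; };

static struct node *merge(struct node *a, struct node *b)
{
	struct node head, *tail = &head;

	while (a && b) {
		if (a->key <= b->key) {		/* ties keep 'a' first: stable */
			tail->next = a; a = a->next;
		} else {
			tail->next = b; b = b->next;
		}
		tail = tail->next;
	}
	tail->next = a ? a : b;
	return head.next;
}

static struct node *bottom_up_sort(struct node *list)
{
	struct node *pending[8 * sizeof(size_t)] = { NULL };
	struct node *run, *result = NULL;
	size_t count = 0, bit;
	unsigned int k;

	while (list) {
		run = list;
		list = list->next;
		run->next = NULL;		/* new run of size 1 */

		/* Do merges corresponding to set lsbits in count. */
		for (k = 0, bit = 1; count & bit; k++, bit <<= 1) {
			run = merge(pending[k], run);
			pending[k] = NULL;
		}
		pending[k] = run;
		count++;
	}
	/* Merge leftover runs, newest (smallest) first, keeping stability. */
	for (k = 0; k < 8 * sizeof(size_t); k++)
		result = merge(pending[k], result);
	return result;
}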
On Thu, 14 Mar 2019 at 00:28:16 +0100, Rasmus Villemoes wrote:
> On 05/03/2019 06.58, George Spelvin wrote:
>> This patch avoids badly unbalanced merges on unlucky input sizes.
>> It slightly increases the code size, but saves an average of 0.2*n
>> calls to cmp().
>>
On Wed, 13 Mar 2019 at 23:29:40 +0100, Rasmus Villemoes wrote:
> On 21/02/2019 09.21, George Spelvin wrote:
>> +/**
>> + * parent - given the offset of the child, find the offset of the parent.
>> + * @i: the offset of the heap element whose parent is sought. Non-z
Thank you for your thoughtful comments!
On Wed, 13 Mar 2019 at 23:23:44 +0100, Rasmus Villemoes wrote:
> On 21/02/2019 07.30, George Spelvin wrote:
> + * @align: required aignment (typically 4 or 8)
>
> typo aLignment
Thanks; fixed!
>> + * Returns true if elements can be copie
Rasmus Villemoes wrote:
> On 05/03/2019 04.06, George Spelvin wrote:
>> + * (Actually, it is always called with @a being the element which was
>> + * originally first, so it is not necessary to to distinguish the @a < @b
>> + * and @a == @b cases; the return value may be a
Andy Shevchenko wrote:
> Shouldn't simple memcpy cover these case for both 32- and 64-bit
> architectures?
Speaking of replacing swap with copying via temporary buffers, one
idea that did come to mind was avoiding swap for sufficiently small
objects.
Every sift-down is actually a circular shift.
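As a concrete illustration (a generic sketch for word-sized elements, not the
lib/sort.c code): the displaced root rides in a temporary while each winning
child is copied up once, so a sift-down of depth d costs about d+2 element
moves instead of the 3d that d explicit swaps would.

#include <stddef.h>

/* Max-heap sift-down expressed as one rotation instead of repeated swaps. */
static void siftdown_rotate(unsigned long *heap, size_t n, size_t root)
{
	unsigned long tmp = heap[root];		/* hole starts at the root */
	size_t i = root, child;

	while ((child = 2 * i + 1) < n) {
		if (child + 1 < n && heap[child + 1] > heap[child])
			child++;		/* pick the larger child */
		if (heap[child] <= tmp)
			break;
		heap[i] = heap[child];		/* move child up; no swap */
		i = child;
	}
	heap[i] = tmp;				/* drop the saved element in */
}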
, very few users of sort() sort pointers (or pointer-sized
objects); most sort structures containing at least two words.
(E.g. drivers/acpi/fan.c:acpi_fan_get_fps() sorts an array of 40-byte
struct acpi_fan_fps.)
x86-64 code size 872 -> 885 bytes (+8)
Signed-off-by: George Spelvin
---
lib
n
"__attribute__((nonnull(2,3,4,5)))". Checkpatch is wrong.)
Signed-off-by: George Spelvin
---
include/linux/list_sort.h | 1 +
lib/list_sort.c | 152 --
2 files changed, 96 insertions(+), 57 deletions(-)
diff --git a/include/linux/l
1998.0986
https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.5380
Queue-Mergesort
Mordecai J. Golin, Robert Sedgewick
Information Processing Letters, 48(5):253--259, 10 December 1993
https://doi.org/10.1016/0020-0190(93)90088-q
https://sci-hub.tw/10.1016/0020-0190(93)90088-Q
770 bytes (-115)
(I see the checkpatch complaint about "else if (n -= size)".
The alternative is significantly uglier.)
Signed-off-by: George Spelvin
---
lib/sort.c | 102 +++--
1 file changed, 75 insertions(+), 27 deletions(-)
diff
ction, there are a few additional (highly predictable) conditional
branches on the code path.)
This actually *shrinks* the x86-64 code, because it inlines the various
swap functions inside do_swap, eliding function prologues & epilogues.
x86-64 code size 770 -> 709 bytes (-61)
Signed-of
heckpatch. I figure there
will be at least one round of comments and final testing.
George Spelvin (5):
lib/sort: Make swap functions more generic
lib/sort: Use more efficient bottom-up heapsort variant
lib/sort: Avoid indirect calls to built-in swap
lib/list_sort: Simplify and rem
around.
> On Fri, 2016-12-23 at 20:17 -0500, George Spelvin wrote:
>> For example, two mix-backs of 64 bits gives you 65 bit security, not 128.
>> (Because each mixback can be guessed and verified separately.)
> Exactly, but the full reseed after running out of entropy is strong
&g
Hannes Frederic Sowa wrote:
> On 24.12.2016 00:39, George Spelvin wrote:
>> We just finished discussing why 8 bytes isn't enough. If you only
>> feed back 8 bytes, an attacker who can do 2^64 computation can find it
>> (by guessing and computing forward to verify
Hannes Frederic Sowa wrote:
> In general this looks good, but bitbuffer needs to be protected from
> concurrent access, thus needing at least some atomic instruction and
> disabling of interrupts for the locking if done outside of
> get_random_long. Thus I liked your previous approach more where yo
(Cc: list trimmed slightly as the topic is wandering a bit.)
Hannes Frederic Sowa wrote:
> On Thu, 2016-12-22 at 19:07 -0500, George Spelvin wrote:
>> Adding might_lock() annotations will improve coverage a lot.
>
> Might be hard to find the correct lock we take later down the code
Hannes Frederic Sowa wrote:
> A lockdep test should still be done. ;)
Adding might_lock() annotations will improve coverage a lot.
> Yes, that does look nice indeed. Accounting for bits instead of bytes
> shouldn't be a huge problem either. Maybe it gets a bit more verbose in
> case you can't sat
> I do tend to like Ted's version in which we use batched
> get_random_bytes() output. If it's fast enough, it's simpler and lets
> us get the full strength of a CSPRNG.
With the ChaCha20 generator, that's fine, although note that this abandons
anti-backtracking entirely.
It also takes locks, so
> Having slept on this, I like it less. The problem is that a
> backtracking attacker doesn't just learn H(random seed || entropy_0 ||
> secret || ...) -- they learn the internal state of the hash function
> that generates that value. This probably breaks any attempt to apply
> security propertie
> True, but it's called get_random_int(), and it seems like making it
> stronger, especially if the performance cost is low to zero, is a good
> thing.
If it's cheap enough, I don't mind. But it's documented as being
marginal-quality, used where speed is more important.
In particular, it's *not*
Andy Lutomirski wrote:
> I don't even think it needs that. This is just adding a
> non-destructive final operation, right?
It is, but the problem is that SipHash is intended for *small* inputs,
so the standard implementations aren't broken into init/update/final
functions.
There's just one big f
> Plus the benchmark was bogus anyway, and when I built a more specific
> harness -- actually comparing the TCP sequence number functions --
> SipHash was faster than MD5, even on register starved x86. So I think
> we're fine and this chapter of the discussion can come to a close, in
> order to mov
As a separate message, to disentangle the threads, I'd like to
talk about get_random_long().
After some thinking, I still like the "state-preserving" construct
that's equivalent to the current MD5 code. Yes, we could just do
siphash(current_cpu || per_cpu_counter, global_key), but it's nice to
pr
Theodore Ts'o wrote:
> On Wed, Dec 21, 2016 at 01:37:51PM -0500, George Spelvin wrote:
>> SipHash annihilates the competition on 64-bit superscalar hardware.
>> SipHash dominates the field on 64-bit in-order hardware.
>> SipHash wins easily on 32-bit hardware *wit
Eric Dumazet wrote:
> Now I am quite confused.
>
> George said :
>> Cycles per byte on 1024 bytes of data:
>>                  Pentium 4  Core 2 Duo  Ivy Bridge
>> SipHash-2-4         38.9        8.3        5.8
>> HalfSipHash-2-4     12.7        4.5        3.2
>> MD5
Linus wrote:
>> How much does kernel_fpu_begin()/kernel_fpu_end() cost?
>
> It's now better than it used to be, but it's absolutely disastrous
> still. We're talking easily many hundreds of cycles. Under some loads,
> thousands.
I think I've been thoroughly dissuaded, but just to clarify one thing
Actually, DJB just made a very relevant suggestion.
As I've mentioned, the 32-bit performance problems are an x86-specific
problem. ARM does very well, and other processors aren't bad at all.
SipHash fits very nicely (and runs very fast) in the MMX registers.
They're 64 bits, and there are 8 of
Eric Dumazet wrote:
> On Tue, 2016-12-20 at 22:28 -0500, George Spelvin wrote:
>> Cycles per byte on 1024 bytes of data:
>>                  Pentium 4  Core 2 Duo  Ivy Bridge
>> SipHash-2-4         38.9        8.3        5.8
>>
> I do not see why SipHash, if faster than MD5 and more secure, would be a
> problem.
Because on 32-bit x86, it's slower.
Cycles per byte on 1024 bytes of data:
                 Pentium 4  Core 2 Duo  Ivy Bridge
SipHash-2-4         38.9        8.3        5.8
Theodore Ts'o wrote:
> On Mon, Dec 19, 2016 at 06:32:44PM +0100, Jason A. Donenfeld wrote:
>> 1) Anything that requires actual long-term security will use
>> SipHash2-4, with the 64-bit output and the 128-bit key. This includes
>> things like TCP sequence numbers. This seems pretty uncontroversial
David Laight wrote:
> From: George Spelvin
...
>> uint32_t
>> hsiphash24(char const *in, size_t len, uint32_t const key[2])
>> {
>> uint32_t c = key[0];
>> uint32_t d = key[1];
>> uint32_t a = 0x6c796765 ^ 0x736f6d65;
>> uint32_t
To follow up on my comments that your benchmark results were peculiar,
here's my benchmark code.
It just computes the hash of all n*(n+1)/2 possible non-empty substrings
of a buffer of n (called "max" below) bytes. "cpb" is "cycles per byte".
(The average length is (n+2)/3, c.f. https://oeis.org
BTW, here's some SipHash code I wrote for Linux a while ago.
My target application was ext4 directory hashing, resulting in different
implementation choices, although I still think that a rolled-up
implementation like this is reasonable. Reducing I-cache impact speeds
up the calling code.
One th
> I already did this. Check my branch.
Do you think it should return "u32" (as you currently have it) or
"unsigned long"? I thought the latter, since it doesn't cost any
more and makes more
> I wonder if this could also lead to a similar aliasing
> with arch_get_random_int, since I'm pretty sur
> 64-bit security for an RNG is not reasonable even with rekeying. No no
> no. Considering we already have a massive speed-up here with the
> secure version, there's zero reason to start weakening the security
> because we're trigger happy with our benchmarks. No no no.
Just to clarify, I was disc
An idea I had which might be useful:
You could perhaps save two rounds in siphash_*u64.
The final word with the length (called "b" in your implementation)
only needs to be there if the input is variable-sized.
If every use of a given key is of a fixed-size input, you don't need
a length suffix.
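To make that concrete, a standalone sketch of a fixed-input SipHash-2-4 over
exactly two 64-bit words (illustrative only, not the kernel's siphash.c; the
dropped length block is where the two c-rounds are saved, and the shortcut is
only safe if a given key is never used with any other input length):

#include <stdint.h>

#define ROTL64(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
#define SIPROUND(v0, v1, v2, v3) do { \
	v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
	v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
} while (0)

static uint64_t siphash24_fixed_2u64(uint64_t a, uint64_t b,
				     const uint64_t key[2])
{
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];

	v3 ^= a; SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3); v0 ^= a;
	v3 ^= b; SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3); v0 ^= b;
	/* No final length block here -- those are the two rounds saved. */
	v2 ^= 0xff;
	SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3);
	return v0 ^ v1 ^ v2 ^ v3;
}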
> What should we do with get_random_int() and get_random_long()? In
> some cases it's being used in performance sensitive areas, and where
> anti-DoS protection might be enough. In others, maybe not so much.
This is tricky. The entire get_random_int() structure is an abuse of
the hash function
Jason A. Donenfeld wrote:
> I saw that jiffies addition in there and was wondering what it was all
> about. It's currently added _after_ the siphash input, not before, to
> keep with how the old algorithm worked. I'm not sure if this is
> correct or if there's something wrong with that, as I haven'
Tom Herbert wrote:
> Tested this. Distribution and avalanche effect are still good. Speed
> wise I see about a 33% improvement over siphash (20 nsecs/op versus 32
> nsecs). That's about 3x of jhash speed (7 nsecs). So that might closer
> to a more palatable replacement for jhash. Do we lose any sec
>> On a 64-bit machine, 64-bit SipHash is *always* faster than 32-bit, and
>> should be used always. Don't even compile the 32-bit code, to prevent
>> anyone accidentally using it, and make hsiphash an alias for siphash.
> Fascinating! Okay. So I'll alias hsiphash to siphash on 64-bit then. I
> l
> It appears that hsiphash can produce either 32-bit output or 64-bit
> output, with the output length parameter as part of the hash algorithm
> in there. When I code this for my kernel patchset, I very likely will
> only implement one output length size. Right now I'm leaning toward
> 32-bit.
A 1
Jason A. Donenfeld wrote:
> Isn't that equivalent to:
> v0 = key[0];
> v1 = key[1];
> v2 = key[0] ^ (0x736f6d6570736575ULL ^ 0x646f72616e646f6dULL);
> v3 = key[1] ^ (0x646f72616e646f6dULL ^ 0x7465646279746573ULL);
(Pre-XORing key[] with the first two constants which, if the
Jean-Philippe Aumasson wrote:
> If a halved version of SipHash can bring significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.
It would be fairly significant, a 2x speed benefit on a lot of 32-b
> If a halved version of SipHash can bring significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.
I was thinking if the key could be pushed to 80 bits, that would be nice,
but honestly 64 bits is
> While SipHash is extremely fast for a cryptographically secure function,
> it is likely a tiny bit slower than the insecure jhash, and so replacements
> will be evaluated on a case-by-case basis based on whether or not the
> difference in speed is negligible and whether or not the current jhash u
> There's a 32-bit secret random salt (inet_ehash_secret) which means
> that in practice, inet_ehashfn() will select 1 out of 2^32 different
> hash functions at random each time you boot the kernel; without
> knowing which one it selected, how can a local or remote attacker can
> force IPv4 connect
> As these patches were meant for generic non-m68k code, I didn't plan
> to take them through my tree.
Well, that explains everything. Sorry for letting it fall through the
cracks. I assumed that since you send pull requests to Linus regularly,
you'd send it directly.
That's why I only acked th
Arnd Bergmann wrote:
>> Warnings:
>> lib/test_hash.c:224:7: warning: "HAVE_ARCH__HASH_32" is not defined
>> [-Wundef]
>> lib/test_hash.c:229:7: warning: "HAVE_ARCH_HASH_32" is not defined
>> [-Wundef]
>> lib/test_hash.c:234:7: warning: "HAVE_ARCH_HASH_64" is not defined
>> [-Wundef]
+#define external_array(type, name) \
+ ({ \
+ extern type name[]; \
+ type *name_ptr = name; \
+ asm ("" : "+r" (name_ptr)); \
+ name_ptr; \
+ })
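A hypothetical caller, just to show the intent (my_weights is a made-up symbol
defined elsewhere; the empty asm with its "+r" constraint makes GCC treat the
returned pointer as unknown, so the access below stays a real indexed load):

int weight_of(unsigned int i)
{
	int *w = external_array(int, my_weights);

	return w[i];
}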
I've had to pull similar tricks to persuade GCC to generate
the code I wanted (in m
Cyril Hrubis wrote:
> Thomas Gleixner wrote:
>> Err. You know that the timer expired because sigtimedwait() returns
>> EAGAIN. And the only thing you can reliably check for is that the timer did
>> not expired to early. Anything else is guesswork and voodoo programming.
> But seriously is there a
> With that being said, wouldn't it make sense to:
>
> - Get rid of the entropy heuristic entirely and just assume a fixed value of
> entropy for a given event?
What does that gain you? You can always impose an upper bound, but *some*
evidence that it's not a metronome is nice to have.
> - rem
I want to read this even more, but here's a dump of my comments so far...
> 1) Cascading is avoided (except for extreme long time timers)
> + * Note: This implementation might be suboptimal vs. timers enqueued in the
> + *cascade level because we do not look at the timers to figure out when
>
Madhavan Srinivasan wrote:
> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64)
> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))]
> + ^ invert;
> +#else
> tmp = addr[start / BITS_PER_LONG] ^ invert
Thomas Gleixner wrote:
> On Tue, 14 Jun 2016, George Spelvin wrote:
>> If you need to shrink TIMER_ARRAYMASK to fit another flag bit,
>
> We can accommodate wheel with 512 buckets with the current ARRAYMASK and that
> really should be enough.
You're absolutely correct, but
Peter Zijlstra wrote:
> That did occur to me as well; however I think it would be best to
> eradicate all forms of cascading entirely -- if at all possible.
Agreed.
>> to be replaced with __builtin_clz or similar:
> Problem is for the archs that don't have that, the 5 layer branch is
> trivial f
On Tue, 14 Jun 2016, Thomas Gleixner wrote:
> I thought about that and when looking at those long timeout thingies
> I came to the conclusion that it's simply not worth the trouble.
Okay. A comment might be nice, just to stop someone else wasting
brain power on it. E.g.
/*
* If the timer happe
Boris Brezillon wrote:
> On 12 Jun 2016 16:24:53 George Spelvin wrote:
>> Boris Brezillon wrote:
>> My problem is that I don't really understand MLC programming.
> I came to the same conclusion: we really have these 2 cases in the
> wild, which makes it even more compli
Nice cleanup!
I think I see a buglet in your level-5 cascading.
Suppose a timer is requested far in the future for a time
that is an exact multiple of 32768 jiffies.
collect_expired_timers() scans level 5 after all the previous ones,
and will cascade it to level 0, in a level-0 bucket which has
Shouldn't this function be called hlist_is_only_node()?
hlist_is_last_node would be simply !n->next, as "last" doesn't
imply the lack of others.
I understand how the name might make sense in the context of removing
entries from the list, but in other contexts, it's confusing.
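In other words (a minimal sketch of the distinction, using the usual hlist
fields; these helpers are the names being proposed, not existing kernel API):

#include <linux/types.h>

static inline bool hlist_is_last_node(const struct hlist_node *n)
{
	return !n->next;			/* nothing follows n */
}

static inline bool hlist_is_only_node(const struct hlist_node *n,
				      const struct hlist_head *h)
{
	return h->first == n && !n->next;	/* n is the list's sole entry */
}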
;, but it's quite noticeably slower than the new
streamlined __d_lookup_rcu(). And as you can tell, that means that we
must have a 90%+ hitrate in the new L1 dcache lookup, since we only see
10% as much time in the slow routine as in the L1 front-end.
[George Spelvin speaking]
I fixed th
Boris Brezillon wrote:
> On 12 Jun 2016 08:25:49 -0400
> "George Spelvin" wrote:
>> (In fact, an interesting
>> question is whether bad pages should be skipped or not!)
>
> There's no such thing. We have bad blocks, but when a block is bad all
> the page
>> (Another thing I thought of, but am less sure of, is packing the group
>> and pair numbers into a register-passable int rather than a structure.
>> Even 2 bits for the group is probably the most that will ever be needed,
>> but it's easy to say the low 4 bits are the group and the high 28 are
>>
>> It also applies an offset of +1, to avoid negative numbers and the
>> problems of signed divides.
> It seems to cover all cases.
I wasn't sure why you used a signed int for the interface.
(Another thing I thought of, but am less sure of, is packing the group
and pair numbers into a register-p
lying all of this, the corrected code looks like the following:
(Note the addition of "const" and "__pure" annotations, which should
get copied to the "struct mtd_pairing_scheme" declaration.)
Signed-off-by: George Spelvin
/*
* In distance-3 pairing, odd pages i ar
Andy Shevchenko wrote:
> Something wrong with mail configuration?
Oops, sorry, I forgot to delete the header.
> On Sun, 2016-06-05 at 15:25 -0400, George Spelvin wrote:
>> It's also faster, as hex_to_bin() *is* inlined within hex2bin()
>> (if you compile with -O).
r
From andriy.shevche...@linux.intel.com Sun Jun 05 14:21:40 2016
Subject: Re: [PATCH v2 1/2] lib/vsprintf.c: Simplify uuid_string()
From: Andy Shevchenko
To: Ge
From andriy.shevche...@linux.intel.com Sun Jun 05 14:19:48 2016
Subject: Re: [PATCH v2 3/2] lib/uuid.c: Silence an unchecked return value warning
From: Andy
Joe Perches wrote:
> And Ingo commented last month:
> https://lkml.org/lkml/2016/4/29/69
>
> Maybe this __uuid_to_bin function should be made public and
> the acpi version in drivers/acpi/acpica/utuuid.c should be
> removed.
I agree with the second part, but not the first; the uuid_le_to_bin()
wra
-24.4%
arm     92  104  +12  13.0%  116  -12  -10.3%
thumb   50   62  +12  24.0%  100  -38  -38.0%
arm64  116  124   +8   6.9%  148  -24  -16.2%
Signed-off-by: George Spelvin
---
lib/uuid.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
: George Spelvin
---
include/linux/uuid.h | 5 +++--
lib/uuid.c | 24 +---
lib/vsprintf.c | 26 ++
3 files changed, 22 insertions(+), 33 deletions(-)
diff --git a/include/linux/uuid.h b/include/linux/uuid.h
index 2d095fc6..238f16f7 100644
         Before  After  Delta  Percentage
x86-32      245    199    -46      -18.8%
x86-64      246    186    -60      -24.4%
arm         292    264    -28       -9.6%
thumb       220    160    -60      -27.3%
arm64       296    244    -52      -17.6%
Signed-off-by: George Spelvin
---
lib/vsprintf.c | 16
1 file changed, 8