On 09/14/16 at 09:23am, Mickaël Salaün wrote:
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 9aa01d9d3d80..36c3e482239c 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -85,6 +85,8 @@ enum bpf_arg_type {
>
> ARG_PTR_TO_CTX, /* pointer to context
On 09/14/16 at 09:23am, Mickaël Salaün wrote:
> @@ -155,6 +163,7 @@ union bpf_attr {
> __u32 log_size; /* size of user buffer */
> __aligned_u64 log_buf;/* user supplied buffer */
> __u32 kern_version; /* checked when
On 09/14/16 at 09:23am, Mickaël Salaün wrote:
> This fixes a pointer leak when an unprivileged eBPF program reads a pointer
> value from the context. Even if is_valid_access() returns a pointer
> type, the eBPF verifier replaces it with UNKNOWN_VALUE. The register
> value containing an address is then
On 07/07/16 at 10:36pm, Jiri Kosina wrote:
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index f45929c..630838e 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -52,6 +52,7 @@
> #include
> #include
> #include
> +#include
>
> struct n
On 12/09/15 at 10:38am, Herbert Xu wrote:
> On Wed, Dec 09, 2015 at 03:36:32AM +0100, Thomas Graf wrote:
> >
> > Without knowing your exact implementation plans: introducing an
> > additional reference indirection for every lookup will have a
> > huge performance penalt
On 12/09/15 at 10:24am, Herbert Xu wrote:
> On Wed, Dec 09, 2015 at 03:18:26AM +0100, Thomas Graf wrote:
> >
> > Assuming that we only encounter this scenario with very large
> > table sizes, it might be OK to assume that deferring the actual
> > resize via the worker
On 12/05/15 at 03:06pm, Herbert Xu wrote:
> Unless we can make __vmalloc work with BH disabled, I guess we'll
> have to go back to multi-level lookups unless someone has a better
> suggestion.
Assuming that we only encounter this scenario with very large
table sizes, it might be OK to assume that
On 12/05/15 at 03:06pm, Herbert Xu wrote:
> On Fri, Dec 04, 2015 at 07:15:55PM +0100, Phil Sutter wrote:
> >
> > > Only one should really do this, while others are waiting.
> >
> > Sure, that was my previous understanding of how this thing works.
>
> Yes that's clearly how it should be. Unfortun
On 10/21/15 at 05:17pm, Daniel Borkmann wrote:
> On 10/20/2015 08:56 PM, Eric W. Biederman wrote:
> ...
> >Just FYI: Using a device for this kind of interface is pretty
> >much a non-starter as that quickly gets you into situations where
> >things do not work in containers. If someone gets a vers
On 10/16/15 at 10:32am, Alexei Starovoitov wrote:
> On 10/16/15 9:43 AM, Hannes Frederic Sowa wrote:
> >Oh, tracing does not allow daemons. Why? I can only imagine embedded
> >users, no?
>
> yes and for networking: restartability and HA.
> cannot really do that with fuse/daemons.
Right, the small
On 10/08/15 at 08:20pm, Hannes Frederic Sowa wrote:
> Hi Alexei,
>
> On Thu, Oct 8, 2015, at 07:23, Alexei Starovoitov wrote:
> > The feature is controlled by sysctl kernel.unprivileged_bpf_disabled.
> > This toggle defaults to off (0), but can be set true (1). Once true,
> > bpf programs and map
>
> entry->next = base + off
> entry->next <<= 1
> entry->next |= 1
>
> Which will break concurrent readers.
>
> NULLS value recomputation is not needed here, so just remove
> the complex logic.
>
> The data race was found with KernelThreadSanit
On 09/21/15 at 04:03pm, Eric Dumazet wrote:
> What I said is :
>
> In @head you already have the correct nulls value, from hash table.
>
> You do not need to recompute this value, and/or test if hash table chain
> is empty.
>
> If hash bucket is empty, it contains the appropriate NULLS value.
>
On 09/21/15 at 07:51am, Eric Dumazet wrote:
> The important part here is that we rehash an item, so we need to make
> sure to maintain consistent ->next field, and need to prevent compiler
> from using ->next as a temporary variable.
>
> ptr->next = 1UL | ((base + offset) << 1);
>
> Is dangerous
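To make the hazard concrete, here is a minimal user-space sketch of the safer pattern: build the complete nulls value in a local variable first, then publish it to ->next with one store. The names (ex_node, ex_make_nulls, ex_set_nulls) are made up for illustration, and the volatile store is only a user-space stand-in for a single write; this is not the kernel code under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a node on a nulls-terminated chain. */
struct ex_node {
	uintptr_t next;	/* real pointer, or a nulls marker if bit 0 is set */
};

/* Encode a nulls marker: bit 0 set, payload in the remaining bits. */
static uintptr_t ex_make_nulls(uintptr_t value)
{
	return (uintptr_t)1 | (value << 1);
}

static int ex_is_nulls(uintptr_t next)
{
	return (int)(next & 1);
}

static uintptr_t ex_nulls_value(uintptr_t next)
{
	assert(ex_is_nulls(next));
	return next >> 1;
}

/*
 * Compute the final value in a local, then store it once.  The
 * volatile access keeps the compiler from using n->next as scratch
 * space for the intermediate steps (base + offset, shift, or).
 */
static void ex_set_nulls(struct ex_node *n, uintptr_t base, uintptr_t offset)
{
	uintptr_t v = ex_make_nulls(base + offset);

	*(volatile uintptr_t *)&n->next = v;
}
```

On architectures where an aligned word-sized store is not torn, a concurrent reader then sees either the old ->next or the fully formed marker, never an intermediate such as base + offset without the low bit set.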
e socket
> has been bound.
>
> Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure")
> Reported-by: Tejun Heo
> Reported-by: Linus Torvalds
> Signed-off-by: Herbert Xu
> Reviewed-by: Cong Wang
Acked-by: Thomas Graf
--
To unsubscribe from this li
getting rid of the pointer intermediary and referring directly to the
> ip_tunnel_info structure.
>
> Signed-off-by: Geert Uytterhoeven
Acked-by: Thomas Graf
On 09/02/15 at 10:00am, Herbert Xu wrote:
> On Tue, Sep 01, 2015 at 04:51:24PM +0200, Thomas Graf wrote:
> >
> > 1. The current in-kernel self-test
> > 2. bind_netlink.c: https://github.com/tgraf/rhashtable
>
> Thanks, I will try to reproduce this.
Th
On 09/01/15 at 10:16pm, Herbert Xu wrote:
> On Tue, Sep 01, 2015 at 04:13:05PM +0200, Thomas Graf wrote:
> >
> > You can easily trigger this outside of the testsuite as well. Open
> > 10K Netlink sockets in a loop and the creation of the sockets will
> > fail way befo
On 09/01/15 at 10:03pm, Herbert Xu wrote:
> On Tue, Sep 01, 2015 at 03:56:18PM +0200, Phil Sutter wrote:
> >
> > Looking at rhashtable_test.c, I see the initial table size is 8 entries.
> > 70% of that is 5.6 entries, so background expansion is started after the
> > 6th entry has been added, right
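The arithmetic in the quoted question can be checked with a small sketch. It assumes that expansion is triggered once the element count strictly exceeds the given percentage of the table size; ex_first_count_above() is a hypothetical helper, not a function from rhashtable.

```c
#include <assert.h>

/*
 * Return the smallest element count that strictly exceeds
 * `percent` percent of `size`.  Integer math only, so 70% of 8
 * is compared as 6 * 100 > 8 * 70 rather than 6 > 5.6.
 */
static unsigned int ex_first_count_above(unsigned int size,
					 unsigned int percent)
{
	unsigned int n = 0;

	assert(percent <= 100);
	while (n * 100 <= size * percent)
		n++;
	return n;
}
```

For size 8 and 70 percent this returns 6, matching the "after the 6th entry" reading in the quoted text.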
On 08/28/15 at 03:34pm, Phil Sutter wrote:
> Quite ugly, IMHO: rhashtable_insert_fast() may return -ENOMEM as
> non-permanent error, if allocation in GFP_ATOMIC failed. In this case,
> allocation in GFP_KERNEL is retried by rht_deferred_worker(). Sadly,
> there is no way to determine if that has al
size unless overridden by parameter.
>
> Note that specifying the exact number of objects upon table init won't
> suffice, as that value is rounded down to a power of two. Anticipate
> this by rounding up to the next power of two beforehand.
>
> Signed-off
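The rounding issue described in the quoted text can be illustrated with a short sketch. ex_rounddown_pow2() and ex_roundup_pow2() are hypothetical user-space helpers (the kernel has its own in linux/log2.h), and 50000 below is just an example object count.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that is <= n (n must be non-zero). */
static uint64_t ex_rounddown_pow2(uint32_t n)
{
	uint64_t p = 1;

	assert(n > 0);
	while ((p << 1) <= n)
		p <<= 1;
	return p;
}

/* Smallest power of two that is >= n (n must be non-zero). */
static uint64_t ex_roundup_pow2(uint32_t n)
{
	uint64_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}
```

With 50000 objects, rounding down yields 32768 slots, which is too small, while rounding up first yields 65536, which survives a later round-down unchanged.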
On 08/28/15 at 12:28pm, Phil Sutter wrote:
> After adding cond_resched() calls to threadfunc(), a surprisingly high
> rate of insert failures occurred probably due to table resizes getting a
> better chance to run in background. To not soften up the remaining
> tests, retry inserts until they eithe
On 08/28/15 at 12:28pm, Phil Sutter wrote:
> This should fix soft lockup bugs triggered on slow systems.
>
> Signed-off-by: Phil Sutter
Acked-by: Thomas Graf
ot subject to this patch but we may want to revisit the syntax of the
state flags. It's neatly compressed like this but ct_state=untracked
ct_state=related might be more readable. The +trk should be implicit
for anything !untracked
> Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
ached to the connection. Label modification occurs after
> lookup, and will only persist when the conntrack entry is committed by
> providing the COMMIT flag to the CT action. Labels are currently fixed
> to 128 bits in size.
>
> Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
nel state to dst_entry")
> Signed-off-by: Sasha Levin
Acked-by: Thomas Graf
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
ff-by: Joe Stringer
Acked-by: Thomas Graf
; CT action. The maximum received unit (MRU) size is tracked so that
> refragmentation can occur during output.
>
> IP frag handling contributed by Andy Zhou.
>
> Signed-off-by: Joe Stringer
> Signed-off-by: Justin Pettit
> Signed-off-by: Andy Zhou
Acked-by
On 08/24/15 at 05:32pm, Joe Stringer wrote:
> This variation on skb_dst_copy() doesn't require two skbs.
>
> Signed-off-by: Joe Stringer
> Acked-by: Pravin B Shelar
Acked-by: Thomas Graf
the original actions length when
> de-serializing and re-use the original length when serializing.
>
> Signed-off-by: Joe Stringer
> Acked-by: Pravin B Shelar
Acked-by: Thomas Graf
Signed-off-by: Joe Stringer
> Acked-by: Florian Westphal
Acked-by: Thomas Graf
On 08/18/15 at 04:39pm, Joe Stringer wrote:
> The following patches will reuse this code from OVS.
>
> Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
andled by transparently assembling them as part of the
> ct action. The maximum received unit (MRU) size is tracked so that
> refragmentation can occur during output.
>
> IP frag handling contributed by Andy Zhou.
>
> Signed-off-by: Joe Stringer
> Signed-off-by: Justin Pettit
en set to zero so this extended test does not run at all by
> default.
>
> Signed-off-by: Phil Sutter
Looks great. A default of 10 makes sense as well. Thanks a lot!
Acked-by: Thomas Graf
On 07/31/15 at 10:51am, Joe Stringer wrote:
> On 31 July 2015 at 07:34, Hannes Frederic Sowa wrote:
> > In general, this shouldn't be necessary as the packet should already be
> > scrubbed before they arrive here.
> >
> > Could you maybe add a WARN_ON and check how those skbs with conntrack
> > da
On 07/30/15 at 04:16pm, Joe Stringer wrote:
> On 30 July 2015 at 11:40, Thomas Graf wrote:
> > On 07/30/15 at 11:12am, Joe Stringer wrote:
> >> Signed-off-by: Joe Stringer
> >
> > Can you write a few lines on why this is needed? I have flows which
> > use th
On 07/30/15 at 11:12am, Joe Stringer wrote:
> Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
On 07/30/15 at 11:12am, Joe Stringer wrote:
> This will allow the ovs-conntrack code to reuse these macros.
>
> Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
the original actions length when
> de-serializing and re-use the original length when serializing.
>
> Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
On 07/30/15 at 11:12am, Joe Stringer wrote:
> Signed-off-by: Joe Stringer
Can you write a few lines on why this is needed? I have flows which
use the mark to communicate with netfilter through internal ports.
On 07/23/15 at 01:28am, David Miller wrote:
> From: Thomas Graf
> Date: Thu, 23 Jul 2015 10:08:44 +0200
>
> > Convert the module_init() to an invocation from inet_init() since
> > ip_tunnel_core is part of the INET built-in.
> >
> > Fixes: 3093fbe7ff4 ("r
Convert the module_init() to an invocation from inet_init() since
ip_tunnel_core is part of the INET built-in.
Fixes: 3093fbe7ff4 ("route: Per route IP tunnel metadata via lightweight tunnel")
Signed-off-by: Thomas Graf
---
Compiles for me with:
make ARCH=arm CROSS_COMPILE=arm
On 07/17/15 at 12:26pm, Phil Sutter wrote:
> On Fri, Jul 17, 2015 at 10:04:56AM +0200, Thomas Graf wrote:
> > On 07/02/15 at 10:09pm, Meelis Roos wrote:
> > > [ 33.425061] Running rhashtable test nelem=8, max_size=65536, shrinking=0
> > > [ 33.
On 07/02/15 at 10:09pm, Meelis Roos wrote:
> [ 33.425061] Running rhashtable test nelem=8, max_size=65536, shrinking=0
> [ 33.425154] Test 00:
> [ 33.534470] Adding 5 keys
> [ 34.743553] Info: encountered resize
> [ 34.743698] Info: encountered resize
> [ 34.743838] Info: encounte
On 07/16/15 at 02:15pm, Denys Vlasenko wrote:
> On 07/16/2015 12:41 PM, Thomas Graf wrote:
> > On 07/16/15 at 12:02pm, Denys Vlasenko wrote:
> >> +/* jhash - hash an arbitrary key
> >> + * @k: sequence of bytes as key
> >> + * @length: the length of the key
>
On 07/16/15 at 12:02pm, Denys Vlasenko wrote:
> +/* jhash - hash an arbitrary key
> + * @k: sequence of bytes as key
> + * @length: the length of the key
> + * @initval: the previous hash, or an arbitray value
> + *
> + * The generic version, hashes an arbitrary sequence of bytes.
> + * No alignmen
On 07/15/15 at 12:35am, mr...@linux.ee wrote:
> Yes, this fixes the error, thank you.
>
> The new problem with the test - soft lockup - CPU#0 stuck for 22s! is
> still there on 360 MHz UltraSparc IIi. I understand it is harmless but
> is there some easy way to make the test avoid NMI watchdog?
>
On 07/13/15 at 10:11pm, Cong Wang wrote:
> Caused by:
>
> commit 21e4902aea80ef35afc00ee8d2abdea4f519b7f7
> Author: Thomas Graf
> Date: Fri Jan 2 23:00:22 2015 +0100
>
> netlink: Lockless lookup with RCU grace period in socket release
>
> Defers the rele
^^
> [ 32.022828] Deleting 2048 keys
Thanks for the report. I think this is already fixed. Can you try with the
following commit:
commit 246b23a7695bd5a457aa51a36a948cce53d1d477
Author: Thomas Graf
Date: Thu Apr 30 22:37:44 2015 +
rhashtable-test: Use walker to test bucket statis
Remove bogus max_size setting
>
> Now that resizing is completely automatic, we need to remove
> the max_size setting or the test will fail.
>
> Reported-by: Fengguang Wu
> Signed-off-by: Herbert Xu
Acked-by: Thomas Graf
Had the same fix queued up in an upcoming series ;-)
On 02/18/15 at 03:52pm, Shachar Raindel wrote:
> Hi,
>
> I'm running trinity inside a VM running linux-next tagged next-20150204.
>
> The kernel debugging infrastructure detected a use-after-free situation,
> probably in netlink:
This is most likely rhashtable related. The fixes for the
use-aft
Policy extension")
> Signed-off-by: Geert Uytterhoeven
Acked-by: Thomas Graf
On 01/29/15 at 03:40pm, Geert Uytterhoeven wrote:
> Allow the selftest on the resizable hash table to be built modular, just
> like all other tests that do not depend on DEBUG_KERNEL.
>
> Signed-off-by: Geert Uytterhoeven
LGTM
Acked-by: Thomas Graf
On 01/29/15 at 09:57am, Valdis Kletnieks wrote:
> For the past few kernels, I've seen occasional crashes in rhashtable_shrink.
> Looks like the last thing to touch that code was the patch series:
>
> Subject [PATCH 0/9 net-next v2] rhashtable: Per bucket locks & deferred
> table resizing
>
On 01/16/15 at 03:37pm, Patrick McHardy wrote:
> On 02.01, Thomas Graf wrote:
> > +{
> > + struct nft_hash_elem *he = ptr;
> > + struct nft_compare_arg *x = arg;
> > +
> > + if (!nft_data_cmp(&he->key, &x->elem->key, x->set-
On 01/13/15 at 11:14am, Cong Wang wrote:
> On Tue, Jan 13, 2015 at 12:41 AM, Thomas Graf wrote:
> > I can't reproduce it in my KVM box either so far. It looks like a
> > mutex_lock() on an uninitialized mutex or use after free but I can't
> > find such a code pat
On 01/13/15 at 03:06pm, David Laight wrote:
> OK, ht->mutex saves the day.
> Might be worth a comment to save people looking at the code in isolation
> from worrying and doing a bit search.
> OTOH it might be obvious from a slightly larger fragment than the diff.
Good idea. Will do this. Also, tha
On 01/13/15 at 09:49am, David Laight wrote:
> From: Thomas Graf
> > Each per bucket lock covers a configurable number of buckets. While
> > shrinking, two buckets in the old table contain entries for a single
> > bucket in the new table. We need to lock down both while linki
On 01/13/15 at 03:50pm, Ying Xue wrote:
> On 01/12/2015 08:42 PM, Thomas Graf wrote:
> > On 01/12/15 at 09:38am, Ying Xue wrote:
> >> Hi Thomas,
> >>
> >> I really cannot see what is wrong that leads to the warnings
> >> below. Can you please
("rhashtable: Per bucket locks & deferred expansion/shrinking")
Reported-by: Fengguang Wu
Signed-off-by: Thomas Graf
---
lib/rhashtable.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 8023b55..45477f7 10
On 01/12/15 at 09:38am, Ying Xue wrote:
> Hi Thomas,
>
> I really cannot see what is wrong that leads to the warnings
> below. Can you please help me check it?
Not sure yet. It's not your patch that introduced the issue though.
It merely exposed the affected code path.
Just wondering,
On 01/03/15 at 11:02am, Stephen Hemminger wrote:
> As a follow-on to Thomas's patch I think this would complete the
> transition to RCU for netlink.
> Compile tested only.
>
>
>
> This patch gets rid of the reader/writer nl_table_lock and replaces it
> with exclusively using RCU for reading, an
Hash the key inside of rhashtable_lookup_compare() like
rhashtable_lookup() does. This allows simplifying the hashing
functions and keeping them private.
Signed-off-by: Thomas Graf
Cc: netfilter-de...@vger.kernel.org
---
include/linux/rhashtable.h | 5 +--
lib/rhashtable.c | 91
-off-by: Thomas Graf
Cc: netfilter-de...@vger.kernel.org
---
include/linux/rhashtable.h | 2 --
lib/rhashtable.c | 34 +++---
net/netfilter/nft_hash.c | 11 +++
3 files changed, 10 insertions(+), 37 deletions(-)
diff --git a/include/linux
compiler from caching the first element.
The lockdep verifier is introduced as a stub which always succeeds
and is properly implemented in the next patch, when the locks are
introduced.
Signed-off-by: Thomas Graf
---
include/linux/rhashtable.h | 173 ++---
lib
ithout the side effect of severely
delayed socket destruction.
Signed-off-by: Thomas Graf
---
net/netlink/af_netlink.c | 32
net/netlink/af_netlink.h | 1 +
2 files changed, 17 insertions(+), 16 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af
set through the rhashtable_params structure
like this:
	struct rhashtable_params params = {
		[...]
		.nulls_base = (1U << RHT_BASE_SHIFT),
	};
This reduces the hash length from 32 bits to 27 bits.
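One way to see where the 27 bits come from is to lay out a 32-bit nulls marker as 1 marker bit, 27 hash bits and 4 base bits. The sketch below does that in user space. The EX_* constants and helper names are hypothetical; the 4-bit base width and the position of the fields are my assumptions about how the 32 bits are split, not something stated in the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of a 32-bit marker: | base (4) | hash (27) | 1 | */
#define EX_BASE_BITS	4
#define EX_HASH_BITS	27
#define EX_BASE_SHIFT	EX_HASH_BITS
#define EX_HASH_MASK	((1u << EX_HASH_BITS) - 1)

/* Build a marker from a small table id and a 32-bit hash. */
static uint32_t ex_marker(uint32_t base_id, uint32_t hash)
{
	uint32_t value;

	assert(base_id < (1u << EX_BASE_BITS));
	/* Only the low 27 bits of the hash fit next to the base. */
	value = (base_id << EX_BASE_SHIFT) | (hash & EX_HASH_MASK);
	/* Shift left by one and set bit 0 as the nulls marker. */
	return (value << 1) | 1u;
}

static uint32_t ex_marker_hash(uint32_t marker)
{
	return (marker >> 1) & EX_HASH_MASK;
}

static uint32_t ex_marker_base(uint32_t marker)
{
	return marker >> (1 + EX_BASE_SHIFT);
}
```

Under these assumptions, a base id of 1 corresponds to the (1U << EX_BASE_SHIFT) style of value shown in the params example, and any hash bits above bit 26 are dropped.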
Signed-off-by: Thomas Graf
---
include/linux/list_nulls.h | 3 ++-
include
insertions performed during the RCU grace period which would at that
point land in the future table. The lookup will see them as it searches
both tables if needed.
Having multiple insertions and removals occur in parallel requires nelems
to become an atomic counter.
Signed-off-by: Thomas Graf
Signed-off-by: Thomas Graf
---
include/linux/spinlock.h | 8
include/linux/spinlock_api_smp.h | 2 ++
include/linux/spinlock_api_up.h | 1 +
kernel/locking/spinlock.c| 8
4 files changed, 19 insertions(+)
diff --git a/include/linux/spinlock.h b/include/linux
Subsequent patches will require access to the bucket tail. Access
to the tail is relatively cheap as the automatic resizing of the
table should keep the number of entries per bucket to no more
than 0.75 on average.
Signed-off-by: Thomas Graf
---
lib/rhashtable.c | 23 ++-
1
Signed-off-by: Thomas Graf
---
lib/rhashtable.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 1ee0eb6..b658245 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -427,7 +427,7 @@ void *rhashtable_lookup(const struct
the chain linked list to be terminated with a special
nulls marker to allow entries to move between multiple lists.
Last but not least, reintroduces lockless netlink_lookup() with
deferred Netlink socket destruction to avoid the side effect of
increased netlink_release() runtime.
Thomas Graf (9
On 12/18/14 at 06:02pm, Varlese, Marco wrote:
> Roopa, one of the comments I got from Thomas Graf on my v1 patch was that
> your patch and mine were supplementary ("I think Roopa's patches are
> supplementary. Not all switchdev users will be backed with a Linux Bridge. I
On 12/18/14 at 08:03am, John Fastabend wrote:
> On 12/18/2014 07:30 AM, Varlese, Marco wrote:
> Could you also document the attributes. I think they are mostly
> clear but what is IFLA_SW_LOOPBACK? It will help later when we
> try to read the code in 6 months and implement drivers.
>
> I am thinkin
On 12/18/14 at 11:29am, Varlese, Marco wrote:
> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> index f7d0d2d..19cb51a 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -146,6 +146,7 @@ enum {
> IFLA_PHYS_PORT_ID,
> IFLA_CAR
On 12/15/14 at 02:29pm, Varlese, Marco wrote:
> > All of these are highly generic and should *not* be passed through from user
> > space to the driver directly but rather be properly abstracted as Roopa
> > proposed. The value of this API is abstraction.
> How would you let the user enable/disable
On 12/11/14 at 09:59am, Varlese, Marco wrote:
> Examples of attributes are:
> * enabling/disabling of learning of source addresses on a given port (you can
> imagine the attribute called LEARNING for example);
> * internal loopback control (i.e. LOOPBACK) which will control how the flow
> of tr
On 12/10/14 at 04:23pm, Varlese, Marco wrote:
> +#ifdef CONFIG_NET_SWITCHDEV
> +static int do_setswcfg(struct net_device *dev, struct nlattr *attr)
> +{
> + int rem, err = -EINVAL;
> + struct nlattr *v;
> + const struct net_device_ops *ops = dev->netdev_ops;
> +
> + nla_for_each_nes
On 12/04/14 at 11:29pm, Herbert Xu wrote:
> On Thu, Dec 04, 2014 at 03:26:37PM +0000, Thomas Graf wrote:
> >
> > As Daniel pointed out, this work originated for the OVS edge use
> > case where security is of less concern and the rehashing is
> > sufficient. Identi
On 12/04/14 at 04:11pm, Herbert Xu wrote:
> Hi:
>
> While working on rhashtable it came to me that this whole concept
> of arch_fast_hash is flawed. CRCs are linear functions so it's
> fairly easy for an attacker to identify collisions or at least
> eliminate a large amount of search space (e.g.,
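The linearity claim can be checked directly. The sketch below is a bitwise CRC-32C (reflected polynomial 0x82F63B78) with zero initial value and no final XOR, plus a helper that verifies crc(a ^ b) == crc(a) ^ crc(b) for two equal-length inputs. All names are hypothetical, and this is a plain software CRC for illustration, not the arch_fast_hash implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C, zero init, no final XOR (the "raw" CRC). */
static uint32_t ex_crc32c_raw(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int k = 0; k < 8; k++)
			/* XOR in the polynomial only if the low bit is set. */
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Return 1 if crc(a ^ b) == crc(a) ^ crc(b) for these inputs. */
static int ex_crc_is_linear(const uint8_t *a, const uint8_t *b, size_t len)
{
	uint8_t x[64];

	assert(len <= sizeof(x));
	for (size_t i = 0; i < len; i++)
		x[i] = a[i] ^ b[i];
	return ex_crc32c_raw(x, len) ==
	       (ex_crc32c_raw(a, len) ^ ex_crc32c_raw(b, len));
}
```

Because the relation holds for every pair of equal-length inputs, knowing the CRC of a few messages gives away the CRC of all their XOR combinations, which is what makes collisions easy to construct.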
| __GFP_NORETRY to allow
for silent fall back to vzalloc() without the OOM killer jumping in as
pointed out by Eric Dumazet and Eric W. Biederman.
Signed-off-by: Thomas Graf
---
include/linux/rhashtable.h | 10 +-
lib/rhashtable.c | 41 +
net/netfilter
Signed-off-by: Thomas Graf
---
net/netlink/af_netlink.c | 37 +
1 file changed, 25 insertions(+), 12 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 7a186e7..f1de72d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink
On 10/21/14 at 03:25pm, Steinar H. Gunderson wrote:
> On Fri, Oct 17, 2014 at 07:25:17AM +0100, Thomas Graf wrote:
> > I think the only option at this point is to re-add the nltable lock to
> > netlink_lookup() so we can drop the synchronize_net() until we find a
> >
Heiko,
Can you test the following patch:
The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.
Signed-off-by: Thomas Graf
---
net/netlink/af_netlink.c | 37
On 10/20/14 at 04:10pm, Michael S. Tsirkin wrote:
> On Mon, Oct 20, 2014 at 01:07:50PM +0100, Thomas Graf wrote:
> > On 10/13/14 at 10:50am, Michael S. Tsirkin wrote:
> > > virtio spec requires drivers to set DRIVER_OK before using VQs.
> > > This is set automatically
On 10/20/14 at 02:42pm, Cornelia Huck wrote:
> On Mon, 20 Oct 2014 13:07:50 +0100
> Thomas Graf wrote:
>
> > On 10/13/14 at 10:50am, Michael S. Tsirkin wrote:
> > > virtio spec requires drivers to set DRIVER_OK before using VQs.
> > > This is set automatically af
On 10/13/14 at 10:50am, Michael S. Tsirkin wrote:
> virtio spec requires drivers to set DRIVER_OK before using VQs.
> This is set automatically after probe returns, virtio console violated this
> rule by adding inbufs, which causes the VQ to be used directly within
> probe.
>
> To fix, call virtio
On 10/17/14 at 02:34am, Steinar H. Gunderson wrote:
> On Fri, Oct 17, 2014 at 02:21:32AM +0200, Steinar H. Gunderson wrote:
> > Hi,
> >
> > We recently upgraded a machine from 3.14.5 to 3.17.1, and a Perl script we're
> > running to poll SNMP suddenly needed ten times as much time to complete
On 10/11/14 at 12:09pm, Sasha Levin wrote:
> On 10/11/2014 11:50 AM, Paul E. McKenney wrote:
> >>> > > I am guessing that this happens only when running the resizable hashtable
> >>> > > tests -- if that guess is incorrect, please let me know.
> >> >
> >> > I've seen it a few times only
On 10/11/14 at 12:32pm, Eric Dumazet wrote:
> On Sat, 2014-10-11 at 10:36 +0200, Heiko Carstens wrote:
> > Hi all,
> >
> > it just came to my attention that commit e341694e3eb5
> > "netlink: Convert netlink_lookup() to use RCU protected hash table"
> > causes network latencies for me on s390.
> >
On 09/16/14 at 08:04pm, Fabian Frederick wrote:
> linux/log2.h was included twice.
>
> Signed-off-by: Fabian Frederick
Acked-by: Thomas Graf
On 09/15/14 at 08:23am, Eric Dumazet wrote:
> You need to round up to the next power of two.
Fixed, thanks!
On 09/15/14 at 07:49am, Eric Dumazet wrote:
> On Mon, 2014-09-15 at 13:49 +0100, Thomas Graf wrote:
>
> > Agreed. Will introduce this through a table parameter option when
> > converting the inet hash table.
>
> I am not sure you covered the /proc/net/tcp problem yet ? (
On 09/15/14 at 05:35am, Eric Dumazet wrote:
> On Mon, 2014-09-15 at 14:18 +0200, Thomas Graf wrote:
> > As the expansion/shrinking is moved to a worker thread, no allocations
> > will be performed anymore.
> >
>
> You meant : no GFP_ATOMIC allocations ?
>
> I wo
during the RCU grace period which would at that
point land in the future table. The lookup will see them as it searches
both tables if needed.
Signed-off-by: Thomas Graf
---
include/linux/rhashtable.h | 42 +++--
lib/rhashtable.c | 375 +++--
net
In order to allow wider usage of rhashtable, use a special nulls marker
to terminate each chain. The reason for not using the existing
nulls_list is that the pprev pointer usage would not be valid as entries
can be linked in two different buckets at the same time.
Signed-off-by: Thomas Graf
Signed-off-by: Thomas Graf
---
lib/rhashtable.c | 19 ++-
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index c133d82..c10df45 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -648,15 +648,14 @@ static int __init
hat during that RCU grace period, a bucket
traversal using any rht_for_each() variant on the main table will not see
any insertions performed during the RCU grace period which would at that
point land in the future table. The lookup will see them as it searches
both tables if needed.
Thom
As the expansion/shrinking is moved to a worker thread, no allocations
will be performed anymore.
Signed-off-by: Thomas Graf
---
include/linux/rhashtable.h | 10 +-
lib/rhashtable.c | 41 +
net/netfilter/nft_hash.c | 4 ++--
net