On 2020-05-05 22:03, Andrew Morton wrote:
On Tue, 05 May 2020 10:42:05 +0200 Roman Penyaev
wrote:
May I ask you to remove "epoll: ensure ep_poll() doesn't miss wakeup
events" from your -mm queue? Jason lately found out that the patch
does not fully solve the problem and this
_pending() check was added:
c257a340ede0 ("fs, epoll: short circuit fetching events if thread
has been killed").
Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Roman Penyaev
Reported-by: Jason Baron
Reviewed-by: Jason Baron
Cc: Andre
the final check under the lock). Previous changes are not needed.
Thanks.
--
Roman
On 2020-05-05 10:40, Roman Penyaev wrote:
The original problem was described here:
https://lkml.org/lkml/2020/4/27/1121
There is a possible race when ep_scan_ready_list() leaves ->rdllist
and ->ovflist emp
hort circuit fetching events if thread
has been killed").
Signed-off-by: Roman Penyaev
Reported-by: Jason Baron
Cc: Andrew Morton
Cc: Khazhismel Kumykov
Cc: Alexander Viro
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: sta...@vger.kernel.or
On 2020-05-04 06:59, Jason Baron wrote:
On 5/4/20 12:29 AM, Jason Baron wrote:
On 5/3/20 6:24 AM, Roman Penyaev wrote:
On 2020-05-02 00:09, Jason Baron wrote:
On 5/1/20 5:02 PM, Roman Penyaev wrote:
Hi Jason,
That is indeed a nice catch.
Seems we need smp_rmb() pair between
On 2020-05-02 00:09, Jason Baron wrote:
On 5/1/20 5:02 PM, Roman Penyaev wrote:
Hi Jason,
That is indeed a nice catch.
Seems we need an smp_rmb() pair between list_empty_careful(&ep->rdllist)
and READ_ONCE(ep->ovflist) for ep_events_available(), don't we?
Hi Roman,
Good point, even i
Hi Jason,
That is indeed a nice catch.
Seems we need an smp_rmb() pair between list_empty_careful(&ep->rdllist)
and READ_ONCE(ep->ovflist) for ep_events_available(), don't we?
Other than that:
Reviewed-by: Roman Penyaev
--
Roman
On 2020-05-01 21:15, Jason Baron wrote:
No
n ep_poll_callback")
the other problem is when several sequential events hit the same
waiting thread, thus other waiters get no wakeups. The problem is
fixed in the following patch.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Khazhismel Kumykov
Cc: Alexander Viro
Cc: Heiher
Cc: Jaso
" represents overall time spent
doing the benchmark, thus lower is better)
[1] tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
[2] https://github.com/rouming/test-tools/blob/master/stress-epoll.c
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Khazhismel Kumykov
Cc: Alex
On 2020-04-29 06:12, Jason Baron wrote:
On 4/28/20 2:10 PM, Roman Penyaev wrote:
On 2020-04-27 22:38, Jason Baron wrote:
On 4/25/20 4:59 PM, Khazhismel Kumykov wrote:
On Sat, Apr 25, 2020 at 9:17 AM Jason Baron
wrote:
On 4/24/20 3:00 PM, Khazhismel Kumykov wrote:
In the event that we add
On 2020-04-27 22:38, Jason Baron wrote:
On 4/25/20 4:59 PM, Khazhismel Kumykov wrote:
On Sat, Apr 25, 2020 at 9:17 AM Jason Baron wrote:
On 4/24/20 3:00 PM, Khazhismel Kumykov wrote:
In the event that we add to ovflist, before 339ddb53d373 we would be
woken up by ep_scan_ready_list, and di
up, i.e. selftests/epoll?
Reviewed-by: Roman Penyaev
--
Roman
ll-wakeup
Cc: Al Viro
Cc: Andrew Morton
Cc: Davide Libenzi
Cc: Davidlohr Bueso
Cc: Dominik Brodowski
Cc: Eric Wong
Cc: Jason Baron
Cc: Linus Torvalds
Cc: Roman Penyaev
Cc: Sridhar Samudrala
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Signed-off-by: hev
---
fs/
On 2019-10-07 20:43, Jason Baron wrote:
[...]
But what if we make this wakeup explicit when we have more events to
process?
(nothing is tested, just a guess)
@@ -255,6 +255,7 @@ struct ep_pqueue {
struct ep_send_events_data {
int maxevents;
struct epoll_event __user *events;
+
On 2019-10-07 20:43, Jason Baron wrote:
On 10/7/19 2:30 PM, Roman Penyaev wrote:
On 2019-10-07 18:42, Jason Baron wrote:
On 10/7/19 6:54 AM, Roman Penyaev wrote:
On 2019-10-03 18:13, Jason Baron wrote:
On 9/30/19 7:55 AM, Roman Penyaev wrote:
On 2019-09-28 04:29, Andrew Morton wrote:
On
On 2019-10-07 18:42, Jason Baron wrote:
On 10/7/19 6:54 AM, Roman Penyaev wrote:
On 2019-10-03 18:13, Jason Baron wrote:
On 9/30/19 7:55 AM, Roman Penyaev wrote:
On 2019-09-28 04:29, Andrew Morton wrote:
On Wed, 25 Sep 2019 09:56:03 +0800 hev wrote:
From: Heiher
Take the case where we
On 2019-10-03 18:13, Jason Baron wrote:
On 9/30/19 7:55 AM, Roman Penyaev wrote:
On 2019-09-28 04:29, Andrew Morton wrote:
On Wed, 25 Sep 2019 09:56:03 +0800 hev wrote:
From: Heiher
Take the case where we have:
t0
| (ew)
e0
| (et)
e1
On 2019-09-28 04:29, Andrew Morton wrote:
On Wed, 25 Sep 2019 09:56:03 +0800 hev wrote:
From: Heiher
Take the case where we have:
t0
| (ew)
e0
| (et)
e1
| (lt)
s0
t0: thread 0
e0: epoll fd 0
e1: epoll fd 1
s0: socket fd 0
ew: epoll
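A minimal sketch of the topology above, assuming a Linux system: a socketpair stands in for s0, e1 watches it level-triggered, and e0 watches e1 edge-triggered. The function name nested_epoll_first_wait() is made up for this illustration and does not come from the patch or the selftests. It only reports what the first epoll_wait() on e0 returns after a single write; it does not try to reproduce the wakeup behaviour under discussion.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Builds t0 -> e0 (et) -> e1 (lt) -> s0 and returns the result of the
 * first epoll_wait() on e0 after one byte is written to the socket.
 * Returns -1 if any setup step fails.
 */
int nested_epoll_first_wait(void)
{
    int sfd[2];
    int e0 = -1, e1 = -1, n = -1;
    struct epoll_event ev, got;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sfd) < 0)
        return -1;

    e0 = epoll_create1(0);
    e1 = epoll_create1(0);
    if (e0 < 0 || e1 < 0)
        goto cleanup;

    /* e1 watches s0 in level-triggered mode (lt). */
    ev.events = EPOLLIN;
    ev.data.fd = sfd[0];
    if (epoll_ctl(e1, EPOLL_CTL_ADD, sfd[0], &ev) < 0)
        goto cleanup;

    /* e0 watches e1 in edge-triggered mode (et). */
    ev.events = EPOLLIN | EPOLLET;
    ev.data.fd = e1;
    if (epoll_ctl(e0, EPOLL_CTL_ADD, e1, &ev) < 0)
        goto cleanup;

    /* Make s0 readable by writing to the other end of the pair. */
    if (write(sfd[1], "w", 1) != 1)
        goto cleanup;

    /* ew: thread t0 waits on e0; timeout 0 means "do not block". */
    n = epoll_wait(e0, &got, 1, 0);

cleanup:
    if (e1 >= 0)
        close(e1);
    if (e0 >= 0)
        close(e0);
    close(sfd[0]);
    close(sfd[1]);
    return n;
}
```

Calling epoll_wait() on e0 a second time, without any new write, is the case the thread is concerned with; the sketch stops before that on purpose.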
On 2019-09-28 04:29, Andrew Morton wrote:
On Wed, 25 Sep 2019 09:56:03 +0800 hev wrote:
From: Heiher
Take the case where we have:
t0
| (ew)
e0
| (et)
e1
| (lt)
s0
t0: thread 0
e0: epoll fd 0
e1: epoll fd 1
s0: socket fd 0
ew: epoll
On 2019-09-24 19:34, Jason Baron wrote:
On 9/23/19 3:23 PM, Roman Penyaev wrote:
On 2019-09-23 17:43, Jason Baron wrote:
On 9/4/19 4:22 PM, Jason Baron wrote:
Currently, ep_poll_safewake() in the CONFIG_DEBUG_LOCK_ALLOC case uses
ep_call_nested() in order to pass the correct subclass
for epoll depth and loops that are already verified when doing
EPOLL_CTL_ADD. This mirrors a conversion that was done for
!CONFIG_DEBUG_LOCK_ALLOC in: commit 37b5e5212a44 ("epoll: remove
ep_call_nested() from ep_eventpoll_poll()")
Signed-off-by: Jason Baron
Cc: Davidlohr Bueso
Cc: Rom
:02 PM Heiher wrote:
>
> Hi,
>
> On Wed, Sep 4, 2019 at 8:02 PM Jason Baron wrote:
> >
> >
> >
> > On 9/4/19 5:57 AM, Roman Penyaev wrote:
> > > On 2019-09-03 23:08, Jason Baron wrote:
> > >> On 9/2/19 11:36 AM, Roman Penyaev wrote:
>
On 2019-09-03 23:08, Jason Baron wrote:
On 9/2/19 11:36 AM, Roman Penyaev wrote:
Hi,
This is indeed a bug. (Quick side note: could you please remove efd[1]
from your test, because it is not related to the reproduction of the
current bug.)
Your patch lacks a good description, what exactly you
se(efd[1]);
close(efd[0]);
close(sfd[0]);
close(sfd[1]);
printf("PASS\n");
return 0;
out:
printf("FAIL\n");
return -1;
}
Cc: Al Viro
Cc: Andrew Morton
Cc: Davide Libenzi
Cc: Davidlohr Bueso
Cc: Dominik Brodows
On 2019-06-24 22:38, Linus Torvalds wrote:
On Mon, Jun 24, 2019 at 10:42 PM Roman Penyaev
wrote:
So harvesting events from userspace gives a 15% gain. bench_http is not
an ideal benchmark, but at least it is part of libevent and was easy to
modify.
Worth mentioning that uepoll is
On 2019-06-25 02:24, Eric Wong wrote:
Roman Penyaev wrote:
Hi all,
+cc Jason Baron
** Limitations
4. No support for EPOLLEXCLUSIVE
If a device does not pass pollflags to wake_up(), there is no way to
call poll() from a context under a spinlock, thus special work is
On 2019-06-24 18:14, Arnd Bergmann wrote:
On Mon, Jun 24, 2019 at 4:42 PM Roman Penyaev wrote:
epoll_create2() is needed to accept EPOLL_USERPOLL flags
and size, i.e. this patch wires up polling from userspace.
Can you explain in the patch description more what it's needed for?
Sure.
Those helpers will access private eventpoll structure in future patches,
so keep those helpers close to callers.
Nothing important here.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs
This one introduces structures of user items array:
struct epoll_uheader -
describes inserted epoll items.
struct epoll_uitem -
single epoll item visible to userspace.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc
d.
* Is there any testing app available?
There is a small app [2] which starts many threads with many event fds and
produces many events, while single consumer fetches them from userspace
and goes to kernel from time to time in order to wait.
Also a libevent modification [1] is available, see "
ng patches.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c | 131 -
1 file changed, 107 insertions(+), 24 deletions(-)
diff
When epfd is polled from userspace and item is being modified:
1. Update user item with new pointer or poll flags.
2. Add event to user ring if needed.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel
On ep_remove() simply mark a user item with EPOLLREMOVED if the item was
ready (i.e. has some bits set). That will prevent further user index
entry creation on item ->bit reuse.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
) for epfd, created with EPOLL_USERPOLL flag, accepts events
as NULL and maxevents as 0. No other values are accepted.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c
():
o user item is marked as EPOLLREMOVED only if it was ready,
thus userspace will observe previously added entry in index
uring and correct "removed" state of the item.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: Peter Zi
epoll_create2() is needed to accept EPOLL_USERPOLL flags
and size, i.e. this patch wires up polling from userspace.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Arnd Bergmann
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
arch
When epfd is polled by userspace and a new item is inserted, a new bit
should be taken from a bitmap and then the user item is set accordingly.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs
User has to mmap the vmalloc'ed user_header and user_index pointers in order
to consume events from userspace. Also we do not allow any copies of the vma
on fork().
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-k
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/uepoll/.gitignore | 1 +
tools/testing/selftests/uepoll
call from another cpu.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c | 38 ++
1 file changed, 38 insertions(+)
diff --git a/fs/eventpoll.c b
Rule of thumb for epfd polled from userspace is simple: epfd has
events if ->head != ->tail, no traversing of each item is performed.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
-
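The head/tail rule above can be sketched in a few lines of C11. This is only an illustration: struct ring_header and ring_has_events() are hypothetical names, and the layout is an assumption, not the header structure from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a ring header shared by producer and consumer. */
struct ring_header {
    _Atomic uint32_t head; /* advanced by the consumer */
    _Atomic uint32_t tail; /* advanced by the producer */
};

/*
 * The ring has pending events whenever head and tail differ.
 * No per-item traversal is needed to answer the question.
 */
static bool ring_has_events(struct ring_header *hdr)
{
    assert(hdr != NULL);

    uint32_t head = atomic_load_explicit(&hdr->head, memory_order_acquire);
    uint32_t tail = atomic_load_explicit(&hdr->tail, memory_order_acquire);

    return head != tail;
}
```

Because both counters only ever move forward, comparing them for inequality stays correct across 32-bit wraparound as long as the ring never holds 2^32 entries.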
be corrupted or observed correctly. For these archs -EOPNOTSUPP
is returned.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c |
On 2019-06-17 16:44, Arnd Bergmann wrote:
On Mon, Jun 17, 2019 at 4:12 PM Uladzislau Rezki
wrote:
On Mon, Jun 17, 2019 at 02:14:11PM +0200, Arnd Bergmann wrote:
> gcc points out some obviously broken code in linux-next
>
> mm/vmalloc.c: In function 'pcpu_get_vm_areas':
> mm/vmalloc.c:991:4: er
On 2019-06-17 16:04, Arnd Bergmann wrote:
On Mon, Jun 17, 2019 at 3:49 PM Roman Penyaev wrote:
> augment_tree_propagate_from(va);
>
> - if (type == NE_FIT_TYPE)
> - insert_vmap_area_augment(lva,
On 2019-06-17 14:14, Arnd Bergmann wrote:
gcc points out some obviously broken code in linux-next
mm/vmalloc.c: In function 'pcpu_get_vm_areas':
mm/vmalloc.c:991:4: error: 'lva' may be used uninitialized in this
function [-Werror=maybe-uninitialized]
insert_vmap_area_augment(lva, &va->rb_nod
On 2019-06-13 10:12, Anshuman Khandual wrote:
vmap_pte_range() returns -EBUSY when it encounters a non-empty PTE. But
currently vmap_pmd_range() unifies both -EBUSY and -ENOMEM return codes as
-ENOMEM and sends it up the call chain, which is wrong. Interestingly enough
vmap_page_range_noflush()
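The difference between collapsing and propagating an error code can be shown with a small userspace sketch. All three function names below are hypothetical and only mimic the shape of the call chain described above; none of this is the mm/vmalloc.c code.

```c
#include <errno.h>

/* Hypothetical leaf helper: pretends to map one entry. */
static int map_leaf(int slot_busy, int alloc_fails)
{
    if (alloc_fails)
        return -ENOMEM;
    if (slot_busy)
        return -EBUSY;
    return 0;
}

/* Collapses every failure into -ENOMEM, so the real reason is lost. */
static int map_range_collapsing(int slot_busy, int alloc_fails)
{
    if (map_leaf(slot_busy, alloc_fails))
        return -ENOMEM;
    return 0;
}

/* Passes the leaf's error code up the call chain unchanged. */
static int map_range_propagating(int slot_busy, int alloc_fails)
{
    int err = map_leaf(slot_busy, alloc_fails);

    if (err)
        return err;
    return 0;
}
```

With a busy slot, the collapsing variant reports -ENOMEM while the propagating variant reports -EBUSY, which is the distinction the caller needs.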
This one introduces structures of user items array:
struct epoll_uheader -
describes inserted epoll items.
struct epoll_uitem -
single epoll item visible to userspace.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc
When epfd is polled by userspace and a new item is inserted, a new bit
should be taken from a bitmap and then the user item is set accordingly.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs
ng patches.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c | 131 -
1 file changed, 107 insertions(+), 24 deletions(-)
diff
This one allocates the user header and the user events ring according to the
maximum number of items, passed as a parameter. The size of the user events
(index) ring is a power of two. Pages, which will be shared between kernel and
userspace, are accounted through the user->locked_vm counter.
Signed-off-by: Roman Penyaev
Cc: And
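A power-of-two ring size is convenient because a free-running counter can be turned into a slot index with a single mask. The helpers below are a userspace sketch under that assumption; round_up_pow2() and ring_slot() are illustrative names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds n up to the next power of two (n itself if already one). */
static uint32_t round_up_pow2(uint32_t n)
{
    /* 0 has no sensible answer and values above 2^31 would overflow. */
    assert(n > 0 && n <= (UINT32_C(1) << 31));

    n--;
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    return n + 1;
}

/* Maps a free-running counter onto a slot of a power-of-two sized ring. */
static uint32_t ring_slot(uint32_t counter, uint32_t ring_size)
{
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return counter & (ring_size - 1);
}
```

For example, a request for 5 items yields a ring of 8 slots, and counter value 9 lands in slot 1.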
On ep_remove() simply mark a user item with EPOLLREMOVED if the item was
ready (i.e. has some bits set). That will prevent further user index
entry creation on item ->bit reuse.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Rule of thumb for epfd polled from userspace is simple: epfd has
events if ->head != ->tail, no traversing of each item is performed.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
-
call from another cpu.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c | 38 ++
1 file changed, 38 insertions(+)
diff --git a/fs/eventpoll.c b
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/uepoll/.gitignore | 1 +
tools/testing/selftests/uepoll
epoll_create2() is needed to accept EPOLL_USERPOLL flags
and size, i.e. this patch wires up polling from userspace.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Arnd Bergmann
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Hi Arnd
User has to mmap the vmalloc'ed user_header and user_index pointers in order
to consume events from userspace. Also we do not allow any copies of the vma
on fork().
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-k
():
o user item is marked as EPOLLREMOVED only if it was ready,
thus userspace will observe previously added entry in index
uring and correct "removed" state of the item.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: Peter Zi
epfd, created with EPOLL_USERPOLL flag, accepts events
as NULL and maxevents as 0. No other values are accepted.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c
Those helpers will access private eventpoll structure in future patches,
so keep those helpers close to callers.
Nothing important here.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs
ernel from time to time in order to wait.
Also libevent modification [1] is available, see "measurements" section
above.
[1] https://github.com/libevent/libevent/pull/801
[2] https://github.com/rouming/test-tools/blob/master/userpolled-epoll.c
Roman Penyaev (14):
epoll: move private
When epfd is polled from userspace and item is being modified:
1. Update user item with new pointer or poll flags.
2. Add event to user ring if needed.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel
Hi Renzo,
On 2019-06-03 17:00, Renzo Davoli wrote:
Hi Roman,
I am sorry for the delay in my answer, but I needed to set up a minimal
tutorial to show what I am working on and why I need a feature like the
one I am proposing.
Please, have a look at the README.md page here:
https://github.
On 2019-05-31 23:09, Jens Axboe wrote:
On 5/31/19 1:45 PM, Roman Penyaev wrote:
On 2019-05-31 18:54, Jens Axboe wrote:
On 5/31/19 10:02 AM, Roman Penyaev wrote:
On 2019-05-31 16:48, Jens Axboe wrote:
On 5/16/19 2:57 AM, Roman Penyaev wrote:
Hi all,
This is v3 which introduces pollable
On 2019-06-03 11:09, Peter Zijlstra wrote:
On Fri, May 31, 2019 at 08:58:19PM +0200, Roman Penyaev wrote:
On 2019-05-31 18:51, Peter Zijlstra wrote:
> But like you show, it can be done. It also makes the thing wait-free, as
> opposed to merely lockless.
You think it's better?
On 2019-05-31 18:54, Jens Axboe wrote:
On 5/31/19 10:02 AM, Roman Penyaev wrote:
On 2019-05-31 16:48, Jens Axboe wrote:
On 5/16/19 2:57 AM, Roman Penyaev wrote:
Hi all,
This is v3 which introduces pollable epoll from userspace.
v3:
- Measurements made, represented below.
- Fix
On 2019-05-31 18:51, Peter Zijlstra wrote:
On Fri, May 31, 2019 at 04:21:30PM +0200, Roman Penyaev wrote:
The ep_add_event_to_uring() is lockless, thus I can't increase tail after,
I need to reserve the index slot, where to write to. I can use shadow tail,
which is not seen by userspace
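One generic way to combine slot reservation with a private "shadow" tail is sketched below in C11. It is an assumption-laden illustration, not the code from the patch: struct event_ring, ring_publish() and RING_SIZE are hypothetical, ring overflow is not handled, and the in-order publish step spins, so this variant is lockless but not wait-free.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* must be a power of two */

_Static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be pow2");

struct event_ring {
    _Atomic uint32_t shadow_tail; /* producers only: next slot to reserve */
    _Atomic uint32_t tail;        /* visible to the consumer */
    uint32_t slots[RING_SIZE];
};

/* Reserve a slot, fill it, then make it visible to the consumer. */
static void ring_publish(struct event_ring *r, uint32_t value)
{
    /* Step 1: reserve a unique index on the private shadow tail. */
    uint32_t idx = atomic_fetch_add_explicit(&r->shadow_tail, 1,
                                             memory_order_relaxed);

    /* Step 2: write the entry into the reserved slot. */
    r->slots[idx & (RING_SIZE - 1)] = value;

    /* Step 3: wait until all earlier reservations have been published. */
    while (atomic_load_explicit(&r->tail, memory_order_acquire) != idx)
        ;

    /* Step 4: advance the visible tail past our own slot. */
    atomic_store_explicit(&r->tail, idx + 1, memory_order_release);
}
```

The consumer only ever reads tail, so it never observes a slot that has been reserved but not yet written.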
On 2019-05-31 18:33, Peter Zijlstra wrote:
On Thu, May 16, 2019 at 10:57:57AM +0200, Roman Penyaev wrote:
When a new event comes for some epoll item, the kernel does the following:
struct epoll_uitem *uitem;
/* Each item has a bit (index in user items array), discussed later */
uitem
On 2019-05-31 16:48, Jens Axboe wrote:
On 5/16/19 2:57 AM, Roman Penyaev wrote:
Hi all,
This is v3 which introduces pollable epoll from userspace.
v3:
- Measurements made, represented below.
- Fix alignment for epoll_uitem structure on all 64-bit archs except
x86-64. epoll_uitem
On 2019-05-31 15:05, Peter Zijlstra wrote:
On Fri, May 31, 2019 at 01:22:54PM +0200, Roman Penyaev wrote:
On 2019-05-31 11:56, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 10:58:04AM +0200, Roman Penyaev wrote:
> > +static inline bool ep_clear_public_event_bits(struct ep
On 2019-05-31 14:53, Peter Zijlstra wrote:
On Fri, May 31, 2019 at 01:15:21PM +0200, Roman Penyaev wrote:
On 2019-05-31 11:56, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 10:58:03AM +0200, Roman Penyaev wrote:
> > + i = __atomic_fetch_add(&ep->user
On 2019-05-31 14:56, Peter Zijlstra wrote:
On Fri, May 31, 2019 at 01:15:21PM +0200, Roman Penyaev wrote:
On 2019-05-31 11:56, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 10:58:03AM +0200, Roman Penyaev wrote:
> > +static inline bool ep_add_event_to_uring(struct epitem *epi,
>
On 2019-05-31 12:45, Renzo Davoli wrote:
Hi Roman,
On Fri, May 31, 2019 at 11:34:08AM +0200, Roman Penyaev wrote:
On 2019-05-27 15:36, Renzo Davoli wrote:
> Unfortunately this approach cannot be applied to
> poll/select/ppoll/pselect/epoll.
If you have to override other systemcalls, w
On 2019-05-31 11:55, Peter Zijlstra wrote:
On Thu, May 16, 2019 at 10:58:03AM +0200, Roman Penyaev wrote:
+#define atomic_set_unless_zero(ptr, flags) \
+({ \
+ typeof(ptr) _ptr = (ptr
On 2019-05-31 11:56, Peter Zijlstra wrote:
On Thu, May 16, 2019 at 10:58:04AM +0200, Roman Penyaev wrote:
Each ep_poll_callback() is called when fd calls wakeup() on epfd.
So account new event in user ring.
The tricky part here is EPOLLONESHOT. Since we are lockless we
have to deal with
On 2019-05-31 11:56, Peter Zijlstra wrote:
On Thu, May 16, 2019 at 10:58:03AM +0200, Roman Penyaev wrote:
+static inline bool ep_add_event_to_uring(struct epitem *epi, __poll_t pollflags)
+{
+ struct eventpoll *ep = epi->ep;
+ struct epoll_uitem *uitem;
+ bool added = fa
Hi Renzo,
On 2019-05-27 15:36, Renzo Davoli wrote:
On Mon, May 27, 2019 at 09:33:32AM +0200, Greg KH wrote:
On Sun, May 26, 2019 at 04:25:21PM +0200, Renzo Davoli wrote:
> This patch implements an extension of eventfd to define file descriptors
> whose I/O events can be generated at user level.
On 2019-05-21 09:51, Eric Wong wrote:
Roman Penyaev wrote:
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 81da4571f1e0..9d3905c0afbf 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -44,6 +44,7 @@
#include
#include
#include
+#include
#include
/*
@@ -185,6 +186,9 @@ struct
On 2019-05-22 04:33, Andrew Morton wrote:
On Thu, 16 May 2019 12:20:50 +0200 Roman Penyaev
wrote:
On 2019-05-16 12:03, Arnd Bergmann wrote:
> On Thu, May 16, 2019 at 10:59 AM Roman Penyaev
> wrote:
>>
>> epoll_create2() is needed to accept EPOLL_USERPOLL flags
>> a
On 2019-05-16 12:03, Arnd Bergmann wrote:
On Thu, May 16, 2019 at 10:59 AM Roman Penyaev
wrote:
epoll_create2() is needed to accept EPOLL_USERPOLL flags
and size, i.e. this patch wires up polling from userspace.
Could you add the system call to all syscall*.tbl files at the same
time here
ads with many event fds and
produces many events, while single consumer fetches them from userspace
and goes to kernel from time to time in order to wait.
[1] https://github.com/libevent/libevent/pull/801
[2] https://github.com/rouming/test-tools/blob/master/userpolled-epoll.c
Roman Penyaev (1
This one introduces structures of user items array:
struct epoll_uheader -
describes inserted epoll items.
struct epoll_uitem -
single epoll item visible to userspace.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc
call from another cpu.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 2f551c005640..55612da9651e 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
epfd, created with EPOLL_USERPOLL flag, accepts events
as NULL and maxevents as 0. No other values are accepted.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff --git a/fs/eventpoll.c
():
o user item is marked as EPOLLREMOVED only if it was ready,
thus userspace will observe previously added entry in index
uring and correct "removed" state of the item.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc:
User has to mmap the vmalloc'ed user_header and user_index pointers in order
to consume events from userspace. Also we do not allow any copies of the vma
on fork().
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-k
On ep_remove() simply mark a user item with EPOLLREMOVED if the item was
ready (i.e. has some bits set). That will prevent further user index
entry creation on item ->bit reuse.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Rule of thumb for epfd polled from userspace is simple: epfd has
events if ->head != ->tail, no traversing of each item is performed.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff -
When epfd is polled from userspace and item is being modified:
1. Update user item with new pointer or poll flags.
2. Add event to user ring if needed.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel
This one allocates the user header and the user events ring according to the
maximum number of items, passed as a parameter. The size of the user events
(index) ring is a power of two. Pages, which will be shared between kernel and
userspace, are accounted through the user->locked_vm counter.
Signed-off-by: Roman Penyaev
Cc: And
When epfd is polled by userspace and a new item is inserted, a new bit
should be taken from a bitmap and then the user item is set accordingly.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff --git a
Those helpers will access private eventpoll structure in future patches,
so keep those helpers close to callers.
Nothing important here.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff
epoll_create2() is needed to accept EPOLL_USERPOLL flags
and size, i.e. this patch wires up polling from userspace.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff --git a/arch/x86/entry
k and then to
call ep_poll_callback() with pollflags in a hand.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Al Viro
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 81da4571f1e0..9d3905c0afbf 10064
On 2019-03-08 09:14, Peng Wang wrote:
When ep_busy_loop() is called, timed_out is always zero,
otherwise ep_poll() would return first.
Yes, that's correct.
Reviewed-by: Roman Penyaev
On 2019-03-11 17:37, Dmitry Vyukov wrote:
On Mon, Mar 11, 2019 at 5:36 PM syzbot
wrote:
> On Mon, Mar 11, 2019 at 2:53 PM Roman Penyaev wrote:
>> On 2019-03-11 14:45, Dmitry Vyukov wrote:
>> > On Mon, Mar 11, 2019 at 2:37 PM Roman Penyaev wrote:
>> >>
>>
On 2019-03-11 14:45, Dmitry Vyukov wrote:
On Mon, Mar 11, 2019 at 2:37 PM Roman Penyaev wrote:
Hi Andrew,
I thought "epoll: loosen irq safety in ep_poll_callback()" patch was
removed from your tree, at least I got a notification on the 9th of
January,
also I do not see it in the
syzbot has bisected this bug to:
commit f92cacf118171208f62519d92502a8dd0341286d
Author: Roman Penyaev
Date: Tue Jan 8 01:15:44 2019 +
epoll: loosen irq safety in ep_poll_callback()
bisection log:
https://syzkaller.appspot.com/x/bisect.txt?x=107ae15f20
start commit: f92cacf1 epo
On 2019-01-21 22:34, Linus Torvalds wrote:
So I'm not entirely convinced, but I guess actual numbers and users
might convince me otherwise.
However, a quick comment:
On Tue, Jan 22, 2019 at 9:15 AM Roman Penyaev wrote:
+struct epoll_uitem {
+ __poll_t ready_events;
+ s
Those helpers will access private eventpoll structure in future patches,
so keep those helpers close to callers.
Nothing important here.
Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Davidlohr Bueso
Cc: Jason Baron
Cc: Al Viro
Cc: "Paul E. McKenney"
Cc: Linus Torvalds
[1] which starts many threads with many event fds and
produces many events, while single consumer fetches them from userspace
and goes to kernel from time to time in order to wait.
[1] https://github.com/rouming/test-tools/blob/master/userpolled-epoll.c
Roman Penyaev (13):
epoll: move private