* Eric Dumazet <[EMAIL PROTECTED]> wrote:
> I tried your bench and found two problems :
> - You scan half of the bitmap
[...]
> Try to close not a 'middle fd', but a really low one (10 for example),
> and latency is doubled.
that was intentional. I really didn't want to fabricate a worst-case
On Thu, 31 May 2007 11:02:52 +0200
Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > it's both a flexibility and a speedup thing as well:
> >
> > flexibility: for libraries to be able to open files and keep them open
> > comes up regularly. For example c
* Albert Cahalan <[EMAIL PROTECTED]> wrote:
> Ingo Molnar writes:
>
> >looking over the list of our new generic APIs (see further below) i
> >think there are three important things that are needed for an API to
> >become widely used:
> >
> > 1) it should solve a real problem (ha ;-), it should b
On Thu, May 31 2007, Ingo Molnar wrote:
>
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> > (i definitely remember having written code for that too, but i cannot
> > find that in the archives. hm.) In theory we could avoid _all_
> > fd-bitmap overhead as well and use a per-process list/pool of s
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> (i definitely remember having written code for that too, but i cannot
> find that in the archives. hm.) In theory we could avoid _all_
> fd-bitmap overhead as well and use a per-process list/pool of struct
> file buffers plus a maximum-fd field as the
* Eric Dumazet <[EMAIL PROTECTED]> wrote:
> > speedup: i suggested O_ANY 6 years ago as a speedup to Apache -
> > non-linear fds are cheaper to allocate/map:
> >
> > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg23820.html
> >
> > (i definitely remember having written code for
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> it's both a flexibility and a speedup thing as well:
>
> flexibility: for libraries to be able to open files and keep them open
> comes up regularly. For example currently glibc is quite wasteful in a
> number of common networking related functions (U
Ingo Molnar writes:
looking over the list of our new generic APIs (see further below) i
think there are three important things that are needed for an API to
become widely used:
1) it should solve a real problem (ha ;-), it should be intuitive to
humans and it should fit into existing thing
On Thu, 31 May 2007 08:13:03 +0200
Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> > > I agree. What would be a good interface to allocate fds in such
> > > area? We don't want to replicate syscalls, so maybe a special new
> > > dup function?
> >
> >
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > I agree. What would be a good interface to allocate fds in such
> > area? We don't want to replicate syscalls, so maybe a special new
> > dup function?
>
> I'd do it with something like "newfd = dup2(fd, NONLINEAR_FD)" or
> similar, and just hav
On Wed, May 30, 2007 at 01:00:30PM -0700, Linus Torvalds wrote:
>> Which *could* be something as simple as saying "bit 30 in the file
>> descriptor specifies a separate fd space" along with some flags to make
>> open and friends return those separate fd's. That makes them useless for
>> "select(
On Wed, May 30, 2007 at 01:00:30PM -0700, Linus Torvalds wrote:
> Which *could* be something as simple as saying "bit 30 in the file
> descriptor specifies a separate fd space" along with some flags to make
> open and friends return those separate fd's. That makes them useless for
> "select()" (
On Wed, May 30, 2007 at 02:27:52PM -0700, Linus Torvalds wrote:
> Well, don't think of it as a special case at all: think of bit 30 as a
> "the user asked for a non-linear fd".
> In fact, to make it effective, I'd suggest literally scrambling the low
> bits (using, for example, some silly per-boo
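The "bit 30" idea quoted above can be illustrated with a small user-space sketch. Everything here is hypothetical: the macro NONLINEAR_FD_BIT and the helper names are invented for illustration and are not an existing kernel or glibc interface, and the low-bit scrambling mentioned in the message is not modelled. The sketch only shows how such a flag bit would separate the two descriptor spaces, and why the resulting values cannot be used with select().

```c
#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical marker: bit 30 set means "descriptor from the non-linear space". */
#define NONLINEAR_FD_BIT (1 << 30)

/* True if fd carries the (hypothetical) non-linear marker bit. */
static inline bool fd_is_nonlinear(int fd)
{
    return fd >= 0 && (fd & NONLINEAR_FD_BIT) != 0;
}

/* Build a non-linear descriptor value from a small table index. */
static inline int fd_make_nonlinear(int index)
{
    return index | NONLINEAR_FD_BIT;
}

/* Recover the table index from a non-linear descriptor value. */
static inline int fd_nonlinear_index(int fd)
{
    return fd & ~NONLINEAR_FD_BIT;
}

/* select() can only describe descriptors below FD_SETSIZE. */
static inline bool fd_fits_select(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}
```

With these definitions an ordinary descriptor such as 5 stays usable with select(), while fd_make_nonlinear(5) is far above FD_SETSIZE and would have to be used with poll() or epoll instead.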
On Wed, 30 May 2007 14:27:52 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Well, don't think of it as a special case at all: think of bit 30 as
> a "the user asked for a non-linear fd".
If the sole point is to protect an fd from being closed or operated on
outside of a certain context,
Davide Libenzi wrote:
On Wed, 30 May 2007, Linus Torvalds wrote:
And then the semantics: should these descriptors show up in
/proc/self/fd? Are there separate directories for each namespace? Do
they count against the rlimit?
Oh, absolutely. They'd be real fd's in every way. People could
On Wed, 30 May 2007, Ulrich Drepper wrote:
> You also have to be aware that open() is just one piece of the puzzle.
> What about socket()? I've cursed this interface many times before and
> now it's biting you: there is no parameter to pass a flag. What about
> transferring file descriptors via Uni
On Wed, 30 May 2007, Davide Libenzi wrote:
>
> I agree. What would be a good interface to allocate fds in such area? We
> don't want to replicate syscalls, so maybe a special new dup function?
I'd do it with something like "newfd = dup2(fd, NONLINEAR_FD)" or similar,
and just have NONLINEAR_F
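The dup2(fd, NONLINEAR_FD) call proposed above is a suggestion in this thread, not an existing interface. For comparison, here is a minimal sketch of the closest thing available with the existing API, fcntl(F_DUPFD), which duplicates a descriptor onto the lowest free number at or above a chosen minimum. The helper name move_fd_at_or_above is an assumption made for illustration. Note that the result is still an ordinary linear descriptor: it is still closed by a close-everything loop and still visible in /proc/self/fd.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate fd onto the lowest free descriptor >= min_fd and close the
 * original. Returns the new descriptor, or -1 on failure (in which case
 * the original descriptor is left open).
 */
int move_fd_at_or_above(int fd, int min_fd)
{
    int newfd = fcntl(fd, F_DUPFD, min_fd);
    if (newfd == -1)
        return -1;
    close(fd);
    return newfd;
}
```

A library could call this right after open() or socket() to keep its private descriptors out of the low range that applications expect to control.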
Linus Torvalds wrote:
On Wed, 30 May 2007, Eric Dumazet wrote:
No, Davide, the problem is that some applications depend on getting
_specific_ file descriptors.
Fix the application, and not adding kernel bloat ?
No. The application is _correct_. It's how file descriptors are defined to
wo
Linus Torvalds wrote:
> Side note: it might not even be a "close-on-exec by default" thing: it
> might well be a *always* close-on-exec.
>
> That COE is pretty horrid to do, we need to scan a bitmap of those things
> on each exec. So it might be totally sensible to just declare that the
> non-li
On Wed, 30 May 2007, Linus Torvalds wrote:
>
> Sure. I think there are things we can do (like make the non-linear fd's
> appear somewhere else, and make them close-on-exec by default etc).
Side note: it might not even be a "close-on-exec by default" thing: it
might well be a *always* close-on
Linus Torvalds wrote:
> Well, don't think of it as a special case at all: think of bit 30 as a
> "the user asked for a non-linear fd".
This sounds easy but doesn't really solve all the issues. Let me repeat
your example and the solution currently in
On Wed, 30 May 2007, Linus Torvalds wrote:
> > And then the semantics: should these descriptors show up in
> > /proc/self/fd? Are there separate directories for each namespace? Do
> > they count against the rlimit?
>
> Oh, absolutely. They'd be real fd's in every way. People could use them
>
On Wed, 30 May 2007, Jeremy Fitzhardinge wrote:
>
> Some programs - legitimately, I think - scan /proc/self/fd to close
> everything. The question is whether the glibc-private fds should appear
> there. And something like a "close-on-fork" flag might be useful,
> though I guess glibc can keep
On Wed, 30 May 2007, Ulrich Drepper wrote:
> Linus Torvalds wrote:
> > for (i = 0; i < NR_OPEN; i++)
> > close(i);
> >
> > to clean up all file descriptors before doing something new. And yes, I
> > think it was bash that used
Ulrich Drepper wrote:
> I don't like special cases. For me things better come in quantities 0,
> 1, and unlimited (well, reasonable high limit). Otherwise, who gets to
> use that special namespace? The C library is not the only body of code
> which would want to use descriptors.
Valgrind could
Linus Torvalds wrote:
> Which *could* be something as simple as saying "bit 30 in the file
> descriptor specifies a separate fd space" along with some flags to make
> open and friends return those separate fd's. That makes them useless for
> "select()" (which assumes a flat address space, of cou
On Wed, 30 May 2007, Eric Dumazet wrote:
> > So library routines *must not* open file descriptors in the normal space.
> >
> > (The same is true of real applications doing the equivalent of
> >
> > for (i = 0; i < NR_OPEN; i++)
> > close(i);
>
> Quite buggy IMHO
Looking at it n
On Wed, 30 May 2007, Ulrich Drepper wrote:
>
> I don't like special cases. For me things better come in quantities 0,
> 1, and unlimited (well, reasonable high limit). Otherwise, who gets to
> use that special namespace? The C library is not the only body of code
> which would want to use des
Linus Torvalds wrote:
> for (i = 0; i < NR_OPEN; i++)
> close(i);
>
> to clean up all file descriptors before doing something new. And yes, I
> think it was bash that used to *literally* do something like that a long
> time ago.
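A minimal sketch of the close-everything idiom quoted above, written as a user-space helper. The names close_all_from and fd_is_open are illustrative and not from the thread, and the upper bound is passed in by the caller rather than taken from NR_OPEN. It shows why a descriptor that a library opened privately in the normal fd space does not survive such a loop.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd currently refers to an open descriptor, 0 otherwise. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/*
 * Close every descriptor in [lowfd, limit), whoever opened it.
 * Returns the number of descriptors that were actually open.
 */
int close_all_from(int lowfd, int limit)
{
    int closed = 0;

    for (int fd = lowfd; fd < limit; fd++) {
        /* close() fails with EBADF for unused slots; those are skipped. */
        if (close(fd) == 0)
            closed++;
    }
    return closed;
}
```

An application would typically call something like close_all_from(3, limit) before exec, with limit taken from sysconf(_SC_OPEN_MAX). Any descriptor a library was holding in that range is closed along with the application's own.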
Linus Torvalds wrote:
On Wed, 30 May 2007, Eric Dumazet wrote:
So yes, reimplement sendfile() should help to find last splice() bugs, and as
a bonus it could add non blocking disk io, (O_NONBLOCK on input file ->
socket)
Well, to get those kinds of advantages, you'd have to use splice dire
On Wed, 30 May 2007, Eric Dumazet wrote:
> >
> > No, Davide, the problem is that some applications depend on getting
> > _specific_ file descriptors.
>
> Fix the application, and not adding kernel bloat ?
No. The application is _correct_. It's how file descriptors are defined to
work.
> The
Linus Torvalds wrote:
On Wed, 30 May 2007, Davide Libenzi wrote:
Here I think we are forgetting that glibc is userspace and there's no
separation between the application code and glibc code. An application
linking to glibc can break glibc in a thousand ways, independently from fds
or not fds
On Wed, 30 May 2007, Eric Dumazet wrote:
>
> So yes, reimplement sendfile() should help to find last splice() bugs, and as
> a bonus it could add non blocking disk io, (O_NONBLOCK on input file ->
> socket)
Well, to get those kinds of advantages, you'd have to use splice directly,
since sendfi
On Wed, 30 May 2007, Linus Torvalds wrote:
> On Wed, 30 May 2007, Davide Libenzi wrote:
> >
> > Here I think we are forgetting that glibc is userspace and there's no
> > separation between the application code and glibc code. An application
> > linking to glibc can break glibc in a thousand ways,
Linus Torvalds wrote:
On Wed, 30 May 2007, Mark Lord wrote:
I wonder how useful it would be to reimplement sendfile()
using splice(), either in glibc or inside the kernel itself?
I'd like that, if only because right now we have two separate paths that
kind of do the same thing, and splice
On Wed, 30 May 2007, Davide Libenzi wrote:
>
> Here I think we are forgetting that glibc is userspace and there's no
> separation between the application code and glibc code. An application
> linking to glibc can break glibc in a thousand ways, independently from fds
> or not fds. Like complain
Davide Libenzi wrote:
> An application
> linking to glibc can break glibc in a thousand ways, independently from fds
> or not fds.
It's not (only/mainly) about breaking. File descriptors are a resource
which has to be used under the control of the p
On Wed, 30 May 2007, Ingo Molnar wrote:
>
> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> > > To echo Uli and paraphrase an ad, "it's the interface, silly."
> >
> > THERE IS NO INTERFACE! You're just making that up, and glossing over
> > the most important part of the whole thing!
> >
> > I
On Wed, 30 May 2007, Ingo Molnar wrote:
> yeah - this is a fundamental design question for Linus i guess :-) glibc
> (and other infrastructure libraries) have a fundamental problem: they
> cannot (and do not) presently use persistent file descriptors to make
> use of kernel functionality, due t
On Wed, May 30 2007, Linus Torvalds wrote:
>
>
> On Wed, 30 May 2007, Mark Lord wrote:
> >
> > I wonder how useful it would be to reimplement sendfile()
> > using splice(), either in glibc or inside the kernel itself?
>
> I'd like that, if only because right now we have two separate paths that
On Wed, 30 May 2007, Mark Lord wrote:
>
> I wonder how useful it would be to reimplement sendfile()
> using splice(), either in glibc or inside the kernel itself?
I'd like that, if only because right now we have two separate paths that
kind of do the same thing, and splice really is the only on
On Wed, May 30 2007, Mark Lord wrote:
> Ingo Molnar wrote:
> >
> > - sendfile(). This API mainly failed on #2. It partly failed on #1 too.
> >   (couldn't be used in certain types of scenarios so was unintuitive.)
> > splice() fixes this almost completely.
> >
> > - KAIO. It fails on #2 and #3.
>
Ingo Molnar wrote:
- sendfile(). This API mainly failed on #2. It partly failed on #1 too.
(couldn't be used in certain types of scenarios so was unintuitive.)
splice() fixes this almost completely.
- KAIO. It fails on #2 and #3.
I wonder how useful it would be to reimplement sendfile(
On Wed, May 30 2007, Ingo Molnar wrote:
> - splice. (a bit too early to tell but it's looking good so far. Would
>   be nice if someone did a brute-force memcpy() based vmsplice to user
>   memory, just to make usage fully symmetric.)
Heh, I actually agree, at least then the interface is comple
On Wed, May 30 2007, Zach Brown wrote:
> > Yeah, it'll confuse CFQ a lot actually. The threads either need to share
> > an io context (clean approach, however will introduce locking for things
> > that were previously lockless), or CFQ needs to get better support for
> > cooperating processes.
>
>
> due to the added syscall. (Maybe we can just get that reserved
> upstream now?)
Maybe, but we'd have to agree on the bare syslet interface that is being
supported :).
Personally, I'd like that to be the simplest thing that works for people
and I'm not convinced that the current syslet-specific
> Yeah, it'll confuse CFQ a lot actually. The threads either need to share
> an io context (clean approach, however will introduce locking for things
> that were previously lockless), or CFQ needs to get better support for
> cooperating processes.
Do let me know if I can be of any help in this.
>
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > To echo Uli and paraphrase an ad, "it's the interface, silly."
>
> THERE IS NO INTERFACE! You're just making that up, and glossing over
> the most important part of the whole thing!
>
> If you could actually point to something specific that match
Ingo Molnar wrote:
> we should perhaps enable glibc to have its separate fd namespace (or
> 'hidden' file descriptors at the upper end of the fd space) so that it
> can transparently listen to netlink events (or do epoll),
Something like this would
On Wed, 30 May 2007, Jeff Garzik wrote:
>
> You snipped the key part of my response, so I'll say it again:
>
> Event rings (a) most closely match what is going on in the hardware and (b)
> often closely match what is going on in multi-socket, event-driven software
> application.
I have rather
On Wed, 30 May 2007, Ingo Molnar wrote:
>
> * Ulrich Drepper <[EMAIL PROTECTED]> wrote:
> >
> > I'm not going to judge your tests but saying there are no significant
> > advantages is too one-sided. There is one huge advantage: the
> > interface. A memory-based interface is simply the best f
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> epoll is very much capable of doing it - but why bother if
> something more flexible than a ring can be used and the performance
> difference is negligible? (Read my other reply in this thread for
> further points.)
in particular i'd like to (re-)
* Jeff Garzik <[EMAIL PROTECTED]> wrote:
> >>You should pick up the kevent work :)
> >
> >3 months ago i verified the published kevent vs. epoll benchmark and
> >found that benchmark to be fatally flawed. When i redid it properly
> >kevent showed no significant advantage over epoll. Note that i
On Wed, May 30, 2007 at 10:54:00AM +0200, Ingo Molnar ([EMAIL PROTECTED]) wrote:
>
> * Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
>
> > I did not want to start with another round of ping-pong insults :),
> > but, Ingo, you did not show that kevent works worse. I did show that
> > sometimes it
Ingo Molnar wrote:
* Jeff Garzik <[EMAIL PROTECTED]> wrote:
You should pick up the kevent work :)
3 months ago i verified the published kevent vs. epoll benchmark and
found that benchmark to be fatally flawed. When i redid it properly
kevent showed no significant advantage over epoll. Note
* Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> On Wed, May 30, 2007 at 10:42:52AM +0200, Ingo Molnar ([EMAIL PROTECTED])
> wrote:
> > it is a serious flexibility issue that should not be ignored. The
> > unified fd space is a blessing on one hand because it's simple and
> > powerful, but it's
On Wed, May 30, 2007 at 10:42:52AM +0200, Ingo Molnar ([EMAIL PROTECTED]) wrote:
> it is a serious flexibility issue that should not be ignored. The
> unified fd space is a blessing on one hand because it's simple and
> powerful, but it's also a curse because nested use of the fd space for
> lib
* Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> I did not want to start with another round of ping-pong insults :),
> but, Ingo, you did not show that kevent works worse. I did show that
> sometimes it works better. It flawed from 0 to 30% win in that tests,
> in results Johann Bork presented
* Ulrich Drepper <[EMAIL PROTECTED]> wrote:
> Ingo Molnar wrote:
> > 3 months ago i verified the published kevent vs. epoll benchmark and
> > found that benchmark to be fatally flawed. When i redid it properly
> > kevent showed no significant advantage over epoll.
>
> I'm not going to judge yo
Hi Ingo, developers.
On Wed, May 30, 2007 at 09:20:55AM +0200, Ingo Molnar ([EMAIL PROTECTED]) wrote:
>
> * Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
> > You should pick up the kevent work :)
>
> 3 months ago i verified the published kevent vs. epoll benchmark and
> found that benchmark to be f
On Tue, May 29 2007, Zach Brown wrote:
Thanks for picking this up, Zach!
> - cfq gets confused, share io_context amongst threads?
Yeah, it'll confuse CFQ a lot actually. The threads either need to share
an io context (clean approach, however will introduce locking for things
that were previousl
Ingo Molnar wrote:
> 3 months ago i verified the published kevent vs. epoll benchmark and
> found that benchmark to be fatally flawed. When i redid it properly
> kevent showed no significant advantage over epoll.
I'm not going to judge your tests bu
* Zach Brown <[EMAIL PROTECTED]> wrote:
> > Having async request and response rings would be quite useful, and
> > most closely match what is going on under the hood in the kernel and
> > hardware.
>
> Yeah, but I have lots of competing thoughts about this.
note that async request and respons
* Jeff Garzik <[EMAIL PROTECTED]> wrote:
> You should pick up the kevent work :)
3 months ago i verified the published kevent vs. epoll benchmark and
found that benchmark to be fatally flawed. When i redid it properly
kevent showed no significant advantage over epoll. Note that i did those
me
On Tue, May 29, 2007 at 04:20:04PM -0700, Ulrich Drepper wrote:
> Zach Brown wrote:
> > That todo item
> > about producing documentation and distro kernels is specifically to bait
> > Uli into trying to implement posix aio on top of syslet
Zach Brown wrote:
> That todo item
> about producing documentation and distro kernels is specifically to bait
> Uli into trying to implement posix aio on top of syslets in glibc.
Get DaveJ to pick up the code for Fedora kernels and I'll get to it.
> You should pick up the kevent work :)
I haven't looked at it in a while but yes, it's "on the radar" :).
> Having async request and response rings would be quite useful, and most
> closely match what is going on under the hood in the kernel and hardware.
Yeah, but I have lots of competing tho
> .. so don't keep us in suspense. Do you have any numbers for anything
> (like Oracle, to pick a random thing out of thin air ;) that might
> actually indicate whether this actually works or not?
I haven't gotten to running Oracle's database against it. It is going
to be Very Cranky if O_DIREC
Zach Brown wrote:
I'm pleased to announce the availability of version 6 of the syslet subsystem.
Ingo and I agreed that I'll handle syslet releases while he's busy with CFS. I
copied the cc: list from Ingo's v5 announcement. If you'd like to be dropped
(or added), please let me know.
The v6 pa
I'm pleased to announce the availability of version 6 of the syslet subsystem.
Ingo and I agreed that I'll handle syslet releases while he's busy with CFS. I
copied the cc: list from Ingo's v5 announcement. If you'd like to be dropped
(or added), please let me know.
The v6 patch series against 2
On Tue, 29 May 2007, Zach Brown wrote:
>
> Included in this patch series is an experimental patch which reworks fs/aio.c
> to reuse the syslet subsystem to process iocb requests from user space. The
> intent of this work is to simplify the code and broaden aio functionality.
.. so don't keep