Going through all the discussions once again and trying to look at this
from the point of view of the basic requirements for data structures and
the mechanisms they imply.
1. Should have a data structure that represents a memory chain, which may
not be contiguous in physical memory, and whi
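A minimal sketch of what such a memory chain might look like (all names here are hypothetical, not from any posted patch): each link describes one physically contiguous segment, and the chain as a whole describes a logical buffer that need not be contiguous.

```c
#include <stddef.h>

/* Hypothetical "memory chain": a linked list of physically
 * contiguous segments that together describe one logical buffer. */
struct mem_seg {
    void *addr;               /* start of this contiguous segment */
    unsigned long len;        /* bytes in this segment */
    struct mem_seg *next;     /* next segment, or NULL at end of chain */
};

/* Total length of the logical buffer the chain describes. */
static unsigned long chain_bytes(const struct mem_seg *s)
{
    unsigned long total = 0;
    for (; s != NULL; s = s->next)
        total += s->len;
    return total;
}

/* Example: a 512-byte head segment chained to a 4096-byte segment. */
static unsigned long demo_chain(void)
{
    static struct mem_seg tail = { NULL, 4096, NULL };
    static struct mem_seg head = { NULL, 512, &tail };
    return chain_bytes(&head);
}
```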
Hi!
> > So you consider inability to select() on regular files _feature_?
>
> select on files is unimplementable. You can't do background file IO the
> same way you do background receiving of packets on socket. Filesystem is
> synchronous. It can block.
You can use helper threads in the VFS layer
Linus Torvalds wrote:
> Absolutely. This is exactly what I mean by saying that low-level drivers
> may not actually be able to handle new cases that they've never been asked
> to do before - they just never saw anything like a 64kB request before or
> something that crossed its own alignment.
>
>
Linus Torvalds wrote:
>
> On Thu, 8 Feb 2001, Rik van Riel wrote:
>
> > On Thu, 8 Feb 2001, Mikulas Patocka wrote:
> >
> > > > > You need aio_open.
> > > > Could you explain this?
> > >
> > > If the server is sending many small files, disk spends huge
> > > amount time walking directory tree and
Hi,
On Thu, Feb 08, 2001 at 03:52:35PM +0100, Mikulas Patocka wrote:
>
> > How do you write high-performance ftp server without threads if select
> > on regular file always returns "ready"?
>
> No, it's not really possible on Linux. Use SYS$QIO call on VMS :-)
Ahh, but even VMS SYS$QIO is sync
On Thu, 8 Feb 2001, Rik van Riel wrote:
> On Thu, 8 Feb 2001, Mikulas Patocka wrote:
>
> > > > You need aio_open.
> > > Could you explain this?
> >
> > If the server is sending many small files, disk spends huge
> > amount time walking directory tree and seeking to inodes. Maybe
> > opening
On Thu, 8 Feb 2001, Marcelo Tosatti wrote:
>
> On Thu, 8 Feb 2001, Stephen C. Tweedie wrote:
>
>
>
> > > How do you write high-performance ftp server without threads if select
> > > on regular file always returns "ready"?
> >
> > Select can work if the access is sequential, but async IO is
On Thu, 8 Feb 2001, Martin Dalecki wrote:
> >
> > But you'll have a bitch of a time trying to merge multiple
> > threads/processes reading from the same area on disk at roughly the same
> > time. Your higher levels won't even _know_ that there is merging to be
> > done until the IO requests hit
On Thu, 8 Feb 2001, Pavel Machek wrote:
> >
> > There are currently no other alternatives in user space. You'd have to
> > create whole new interfaces for aio_read/write, and ways for the kernel to
> > inform user space that "now you can re-try submitting your IO".
>
> Why is current select()
On Thu, 8 Feb 2001, Mikulas Patocka wrote:
> > > You need aio_open.
> > Could you explain this?
>
> If the server is sending many small files, disk spends huge
> amount time walking directory tree and seeking to inodes. Maybe
> opening the file is even slower than reading it
Not if you have a
On Thu, 8 Feb 2001, Mikulas Patocka wrote:
> > > The problem is that aio_read and aio_write are pretty useless for ftp or
> > > http server. You need aio_open.
> >
> > Could you explain this?
>
> If the server is sending many small files, disk spends huge amount time
> walking directory tree
> > The problem is that aio_read and aio_write are pretty useless for ftp or
> > http server. You need aio_open.
>
> Could you explain this?
If the server is sending many small files, the disk spends a huge amount of
time walking the directory tree and seeking to inodes. Maybe opening the
file is even slowe
On Thu, Feb 08 2001, Mikulas Patocka wrote:
> > Even async IO (ie aio_read/aio_write) should block on the request queue if
> > its full in Linus mind.
>
> This is not problem (you can create queue big enough to handle the load).
Well in theory, but in practice this isn't a very good idea. At som
On Thu, 8 Feb 2001, Mikulas Patocka wrote:
> > > > How do you write high-performance ftp server without threads if select
> > > > on regular file always returns "ready"?
> > >
> > > Select can work if the access is sequential, but async IO is a more
> > > general solution.
> >
> > Even async
> > > How do you write high-performance ftp server without threads if select
> > > on regular file always returns "ready"?
> >
> > Select can work if the access is sequential, but async IO is a more
> > general solution.
>
> Even async IO (ie aio_read/aio_write) should block on the request queue
On Thu, 8 Feb 2001, Marcelo Tosatti wrote:
>
> On Thu, 8 Feb 2001, Ben LaHaise wrote:
>
>
>
> > > (besides, latency would suck. I bet you're better off waiting for the
> > > requests if they are all used up. It takes too long to get deep into the
> > > kernel from user space, and you cannot
On Thu, 8 Feb 2001, Ben LaHaise wrote:
> > (besides, latency would suck. I bet you're better off waiting for the
> > requests if they are all used up. It takes too long to get deep into the
> > kernel from user space, and you cannot use the exclusive waiters with its
> > anti-herd behaviour et
On Tue, 6 Feb 2001, Linus Torvalds wrote:
> There are currently no other alternatives in user space. You'd have to
> create whole new interfaces for aio_read/write, and ways for the kernel to
> inform user space that "now you can re-try submitting your IO".
>
> Could be done. But that's a big thi
Hi!
> So you consider inability to select() on regular files _feature_?
select on files is unimplementable. You can't do background file IO the
same way you do background receiving of packets on socket. Filesystem is
synchronous. It can block.
> It can be a pretty serious problem with slow blo
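The behaviour being argued over here is easy to observe from user space: select() reports a regular file as readable unconditionally, even when an actual read would block on the disk. A small POSIX sketch (assuming tmpfile() succeeds):

```c
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Poll an empty regular file with a zero timeout; returns select()'s
 * count of ready descriptors.  For regular files this is always 1:
 * the kernel reports "ready" unconditionally, whether or not the
 * data is actually in the page cache. */
static int regular_file_select_ready(void)
{
    FILE *f = tmpfile();          /* empty, anonymous regular file */
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    fd_set rset;
    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    struct timeval tv = { 0, 0 }; /* zero timeout: pure poll */
    int n = select(fd + 1, &rset, NULL, NULL, &tv);
    fclose(f);
    return n;
}
```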
On Thu, 8 Feb 2001, Pavel Machek wrote:
> Hi!
>
> > > Its arguing against making a smart application block on the disk while its
> > > able to use the CPU for other work.
> >
> > There are currently no other alternatives in user space. You'd have to
> > create whole new interfaces for aio_read/wr
On Thu, 8 Feb 2001, Stephen C. Tweedie wrote:
> > How do you write high-performance ftp server without threads if select
> > on regular file always returns "ready"?
>
> Select can work if the access is sequential, but async IO is a more
> general solution.
Even async IO (ie aio_read/aio_wri
Hi,
On Thu, Feb 08, 2001 at 12:15:13AM +0100, Pavel Machek wrote:
>
> > EAGAIN is _not_ a valid return value for block devices or for regular
> > files. And in fact it _cannot_ be, because select() is defined to always
> > return 1 on them - so if a write() were to return EAGAIN, user space woul
Linus Torvalds wrote:
>
> On Tue, 6 Feb 2001, Ben LaHaise wrote:
> >
> > On Tue, 6 Feb 2001, Stephen C. Tweedie wrote:
> >
> > > The whole point of the post was that it is merging, not splitting,
> > > which is troublesome. How are you going to merge requests without
> > > having chains of scatt
Hi!
> > Its arguing against making a smart application block on the disk while its
> > able to use the CPU for other work.
>
> There are currently no other alternatives in user space. You'd have to
> create whole new interfaces for aio_read/write, and ways for the kernel to
> inform user space t
Hi!
> > > Reading write(2):
> > >
> > >EAGAIN Non-blocking I/O has been selected using O_NONBLOCK and there was
> > > no room in the pipe or socket connected to fd to write the data
> > > immediately.
> > >
> > > I see no reason why "aio function have to
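The write(2) semantics quoted above are easy to demonstrate: with O_NONBLOCK set, writes to a full pipe fail with EAGAIN, which is exactly the retry signal that regular files never produce. A sketch:

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Fill a non-blocking pipe until write() fails, then report errno.
 * The final error is EAGAIN: "no room in the pipe ... to write the
 * data immediately", matching the write(2) text quoted above. */
static int errno_when_pipe_is_full(void)
{
    int p[2];
    char buf[4096];
    if (pipe(p) != 0)
        return -1;
    memset(buf, 0, sizeof(buf));
    fcntl(p[1], F_SETFL, O_NONBLOCK);
    while (write(p[1], buf, sizeof(buf)) > 0)
        ;                         /* keep writing until the pipe buffer fills */
    int saved = errno;
    close(p[0]);
    close(p[1]);
    return saved;
}
```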
On Tue, Feb 06, 2001 at 10:14:21AM -0800, Linus Torvalds wrote:
> I will claim that you CANNOT merge at higher levels and get good
> performance.
>
> Sure, you can do read-ahead, and try to get big merges that way at a high
> level. Good for you.
>
> But you'll have a bitch of a time trying to m
On Wednesday February 7, [EMAIL PROTECTED] wrote:
>
>
> On Wed, 7 Feb 2001, Christoph Hellwig wrote:
>
> > On Tue, Feb 06, 2001 at 12:59:02PM -0800, Linus Torvalds wrote:
> > >
> > > Actually, they really aren't.
> > >
> > > They kind of _used_ to be, but more and more they've moved away from
Hi,
On Wed, Feb 07, 2001 at 12:12:44PM -0700, Richard Gooch wrote:
> Stephen C. Tweedie writes:
> >
> > Sorry? I'm not sure where communication is breaking down here, but
> > we really don't seem to be talking about the same things. SGI's
> > kiobuf request patches already let us pass a large
Stephen C. Tweedie writes:
> Hi,
>
> On Tue, Feb 06, 2001 at 06:37:41PM -0800, Linus Torvalds wrote:
> > Absolutely. And this is independent of what kind of interface we end up
> > using, whether it be kiobuf of just plain "struct buffer_head". In that
> > respect they are equivalent.
>
> Sorry?
On Wed, Feb 07, 2001 at 10:36:47AM -0800, Linus Torvalds wrote:
>
>
> On Wed, 7 Feb 2001, Christoph Hellwig wrote:
>
> > On Tue, Feb 06, 2001 at 12:59:02PM -0800, Linus Torvalds wrote:
> > >
> > > Actually, they really aren't.
> > >
> > > They kind of _used_ to be, but more and more they've m
On Wed, 7 Feb 2001, Christoph Hellwig wrote:
> On Tue, Feb 06, 2001 at 12:59:02PM -0800, Linus Torvalds wrote:
> >
> > Actually, they really aren't.
> >
> > They kind of _used_ to be, but more and more they've moved away from that
> > historical use. Check in particular the page cache, and as
On Tue, Feb 06, 2001 at 09:35:58PM +0100, Ingo Molnar wrote:
> caching bmap() blocks was a recent addition around 2.3.20, and i suggested
> some time ago to cache pagecache blocks via explicit entries in struct
> page. That would be one solution - but it creates overhead.
>
> but there isnt anyth
On Tue, Feb 06, 2001 at 12:59:02PM -0800, Linus Torvalds wrote:
>
>
> On Tue, 6 Feb 2001, Christoph Hellwig wrote:
> >
> > The second is that bh's are two things:
> >
> > - a cacheing object
> > - an io buffer
>
> Actually, they really aren't.
>
> They kind of _used_ to be, but more and mo
Hi,
On Tue, Feb 06, 2001 at 06:37:41PM -0800, Linus Torvalds wrote:
> >
> However, I really _do_ want to have the page cache have a bigger
> granularity than the smallest memory mapping size, and there are always
> special cases that might be able to generate IO in bigger chunks (ie
> in-kernel s
Hi,
On Wed, Feb 07, 2001 at 09:10:32AM +0000, David Howells wrote:
>
> I presume that correct_size will always be a power of 2...
Yes.
--Stephen
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Actually, I'd rather leave it in, but speed it up with the saner and
> faster
>
> if (bh->b_size & (correct_size-1)) {
I presume that correct_size will always be a power of 2...
David
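The reason the power-of-two assumption matters: for power-of-two sizes, `x & (size - 1)` equals `x % size`, so the test in the quoted fragment checks size alignment with a single AND instead of a division. A sketch of the equivalence:

```c
/* For power-of-two sizes, masking by (size - 1) gives the remainder,
 * so (x & (size - 1)) != 0 means "x is not a multiple of size".
 * This is the faster form of the check in the quoted fragment. */
static int misaligned(unsigned long x, unsigned long size)
{
    return (x & (size - 1)) != 0;
}
```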
On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
>
> > "struct buffer_head" can deal with pretty much any size: the only thing it
> > cares about is bh->b_size.
>
> Right now, anything larger than a page is physically non-contiguous,
> and sorry if I didn't make that explicit, but I thought that w
On Tue, Feb 06 2001, Linus Torvalds wrote:
> > > [...] so I would be _really_ nervous about just turning it on
> > > silently. This is all very much a 2.5.x-kind of thing ;)
> >
> > Then you might want to apply this :-)
> >
> > --- drivers/block/ll_rw_blk.c~ Wed Feb 7 02:38:31 2001
> > +++
Hi,
On Tue, Feb 06, 2001 at 04:50:19PM -0800, Linus Torvalds wrote:
>
>
> On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
> >
> > That gets us from 512-byte blocks to 4k, but no more (ll_rw_block
> > enforces a single blocksize on all requests but that relaxing that
> > requirement is no big dea
On Wed, 7 Feb 2001, Jens Axboe wrote:
>
> > [...] so I would be _really_ nervous about just turning it on
> > silently. This is all very much a 2.5.x-kind of thing ;)
>
> Then you might want to apply this :-)
>
> --- drivers/block/ll_rw_blk.c~ Wed Feb 7 02:38:31 2001
> +++ drivers/blo
On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
> >
> > The fact is, if you have problems like the above, then you don't
> > understand the interfaces. And it sounds like you designed kiobuf support
> > around the wrong set of interfaces.
>
> They used the only interfaces available at the time..
On Tue, Feb 06 2001, Linus Torvalds wrote:
> > I don't see anything that would break doing this, in fact you can
> > do this as long as the buffers are all at least a multiple of the
> > block size. All the drivers I've inspected handle this fine, noone
> > assumes that rq->bh->b_size is the same
Hi,
On Tue, Feb 06, 2001 at 04:41:21PM -0800, Linus Torvalds wrote:
>
> On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
> > No, it is a problem of the ll_rw_block interface: buffer_heads need to
> > be aligned on disk at a multiple of their buffer size.
>
> Ehh.. True of ll_rw_block() and submit_
On Wed, 7 Feb 2001, Ingo Molnar wrote:
>
> most likely some coding error on your side. buffer-size mismatches should
> show up as filesystem corruption or random DMA scribble, not in-driver
> oopses.
I'm not sure. If I was a driver writer (and I'm happy those days are
mostly behind me ;), I wo
On Wed, 7 Feb 2001, Jens Axboe wrote:
>
> I don't see anything that would break doing this, in fact you can
> do this as long as the buffers are all at least a multiple of the
> block size. All the drivers I've inspected handle this fine, noone
> assumes that rq->bh->b_size is the same in all t
On Wed, Feb 07, 2001 at 02:06:27AM +0100, Ingo Molnar wrote:
>
> On Tue, 6 Feb 2001, Jeff V. Merkey wrote:
>
> > > I don't see anything that would break doing this, in fact you can
> > > do this as long as the buffers are all at least a multiple of the
> > > block size. All the drivers I've insp
On Wed, 7 Feb 2001, Jens Axboe wrote:
> > > Adaptec drivers had an oops. Also, AIC7XXX also had some oops with it.
> >
> > most likely some coding error on your side. buffer-size mismatches should
> > show up as filesystem corruption or random DMA scribble, not in-driver
> > oopses.
>
> I would
On Wed, Feb 07, 2001 at 02:08:53AM +0100, Jens Axboe wrote:
> On Tue, Feb 06 2001, Jeff V. Merkey wrote:
> > Adaptec drivers had an oops. Also, AIC7XXX also had some oops with it.
>
> Do you still have this oops?
>
I can recreate. Will work on it tomorrow. SCI testing today.
Jeff
On Tue, Feb 06 2001, Jeff V. Merkey wrote:
> Adaptec drivers had an oops. Also, AIC7XXX also had some oops with it.
Do you still have this oops?
--
Jens Axboe
On Wed, Feb 07 2001, Ingo Molnar wrote:
> > > So I would appreciate pointers to these devices that break so we
> > > can inspect them.
> > >
> > > --
> > > Jens Axboe
> >
> > Adaptec drivers had an oops. Also, AIC7XXX also had some oops with it.
>
> most likely some coding error on your side. bu
On Tue, 6 Feb 2001, Jeff V. Merkey wrote:
> > I don't see anything that would break doing this, in fact you can
> > do this as long as the buffers are all at least a multiple of the
> > block size. All the drivers I've inspected handle this fine, noone
> > assumes that rq->bh->b_size is the same
On Wed, Feb 07, 2001 at 02:02:21AM +0100, Jens Axboe wrote:
> On Tue, Feb 06 2001, Jeff V. Merkey wrote:
> > I remember Linus asking to try this variable buffer head chaining
> > thing 512-1024-512 kind of stuff several months back, and mixing them to
> > see what would happen -- result. About
On Wed, Feb 07, 2001 at 02:01:54AM +0100, Ingo Molnar wrote:
>
> On Tue, 6 Feb 2001, Jeff V. Merkey wrote:
>
> > I remember Linus asking to try this variable buffer head chaining
> > thing 512-1024-512 kind of stuff several months back, and mixing them
> > to see what would happen -- result. Abo
On Tue, 6 Feb 2001, Jeff V. Merkey wrote:
> I remember Linus asking to try this variable buffer head chaining
> thing 512-1024-512 kind of stuff several months back, and mixing them
> to see what would happen -- result. About half the drivers break with
> it. [...]
time to fix them then - inste
On Tue, Feb 06 2001, Jeff V. Merkey wrote:
> I remember Linus asking to try this variable buffer head chaining
> thing 512-1024-512 kind of stuff several months back, and mixing them to
> see what would happen -- result. About half the drivers break with it.
> The interface allows you to do i
On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
>
> That gets us from 512-byte blocks to 4k, but no more (ll_rw_block
> enforces a single blocksize on all requests but that relaxing that
> requirement is no big deal). Buffer_heads can't deal with data which
> spans more than a page right now.
S
On Wed, Feb 07, 2001 at 12:36:29AM +, Stephen C. Tweedie wrote:
> Hi,
>
> On Tue, Feb 06, 2001 at 07:25:19PM -0500, Ingo Molnar wrote:
> >
> > On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
> >
> > > No, it is a problem of the ll_rw_block interface: buffer_heads need to
> > > be aligned on d
On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
>
> > No, it is a problem of the ll_rw_block interface: buffer_heads need to
> > be aligned on disk at a multiple of their buffer size. Under the Unix
> > raw IO interface it is perfectly legal to begin a
On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
>
> On Tue, Feb 06, 2001 at 08:57:13PM +0100, Ingo Molnar wrote:
> >
> > [overhead of 512-byte bhs in the raw IO code is an artificial problem of
> > the raw IO code.]
>
> No, it is a problem of the ll_rw_block interface: buffer_heads need to
> be
Hi,
On Tue, Feb 06, 2001 at 07:25:19PM -0500, Ingo Molnar wrote:
>
> On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
>
> > No, it is a problem of the ll_rw_block interface: buffer_heads need to
> > be aligned on disk at a multiple of their buffer size. Under the Unix
> > raw IO interface it is p
On Wed, Feb 07 2001, Stephen C. Tweedie wrote:
> > [overhead of 512-byte bhs in the raw IO code is an artificial problem of
> > the raw IO code.]
>
> No, it is a problem of the ll_rw_block interface: buffer_heads need to
> be aligned on disk at a multiple of their buffer size. Under the Unix
> r
Hi,
On Tue, Feb 06, 2001 at 08:57:13PM +0100, Ingo Molnar wrote:
>
> [overhead of 512-byte bhs in the raw IO code is an artificial problem of
> the raw IO code.]
No, it is a problem of the ll_rw_block interface: buffer_heads need to
be aligned on disk at a multiple of their buffer size. Under
On Wed, 7 Feb 2001, Stephen C. Tweedie wrote:
> No, it is a problem of the ll_rw_block interface: buffer_heads need to
> be aligned on disk at a multiple of their buffer size. Under the Unix
> raw IO interface it is perfectly legal to begin a 128kB IO at offset
> 512 bytes into a device.
then
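Concretely, under a rule that a buffer_head must sit on a multiple of its own size, the usable buffer size for an IO is bounded by the alignment of its starting offset. A sketch of that bound (the 4kB ceiling is an assumption for illustration):

```c
/* Largest power-of-two buffer size, up to max_size, that divides the
 * starting byte offset.  Under the alignment rule discussed above, a
 * 128kB IO beginning 512 bytes into the device is forced down to
 * 512-byte buffer_heads, regardless of its total length. */
static unsigned long max_aligned_bh_size(unsigned long byte_offset,
                                         unsigned long max_size)
{
    unsigned long size = max_size;
    while (size > 512 && (byte_offset & (size - 1)) != 0)
        size >>= 1;               /* halve until the offset is aligned */
    return size;
}
```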
On Tue, 6 Feb 2001, Marcelo Tosatti wrote:
>
> Its arguing against making a smart application block on the disk while its
> able to use the CPU for other work.
There are currently no other alternatives in user space. You'd have to
create whole new interfaces for aio_read/write, and ways for th
On Tue, 6 Feb 2001, Linus Torvalds wrote:
> Remember: in the end you HAVE to wait somewhere. You're always going to be
> able to generate data faster than the disk can take it. SOMETHING has to
> throttle - if you don't allow generic_make_request() to throttle, you have
> to do it on your own at
On Tue, 6 Feb 2001, Linus Torvalds wrote:
>
>
> On Tue, 6 Feb 2001, Manfred Spraul wrote:
> > >
> > > The aio functions should NOT use READA/WRITEA. They should just use the
> > > normal operations, waiting for requests.
> >
> > But then you end with lots of threads blocking in get_request()
On Tue, 6 Feb 2001, Jens Axboe wrote:
> On Tue, Feb 06 2001, Marcelo Tosatti wrote:
> >
> > Reading write(2):
> >
> >EAGAIN Non-blocking I/O has been selected using O_NONBLOCK and there was
> > no room in the pipe or socket connected to fd to write the data
> >
On Tue, 6 Feb 2001, Manfred Spraul wrote:
> >
> > The aio functions should NOT use READA/WRITEA. They should just use the
> > normal operations, waiting for requests.
>
> But then you end with lots of threads blocking in get_request()
So?
What the HELL do you expect to happen if somebody wri
On Tue, Feb 06 2001, Marcelo Tosatti wrote:
> > > > We don't even need that, non-blocking is implicitly applied with READA.
> > > >
> > > READA just returns - I doubt that the aio functions should poll until
> > > there are free entries in the request queue.
> >
> > The aio functions should NOT u
On Tue, 6 Feb 2001, Linus Torvalds wrote:
>
>
> On Tue, 6 Feb 2001, Manfred Spraul wrote:
> > Jens Axboe wrote:
> > >
> > > > Several kernel functions need a "dontblock" parameter (or a callback, or
> > > > a waitqueue address, or a tq_struct pointer).
> > >
> > > We don't even need that, no
Linus Torvalds wrote:
>
> On Tue, 6 Feb 2001, Manfred Spraul wrote:
> > Jens Axboe wrote:
> > >
> > > > Several kernel functions need a "dontblock" parameter (or a callback, or
> > > > a waitqueue address, or a tq_struct pointer).
> > >
> > > We don't even need that, non-blocking is implicitly ap
On Tue, 6 Feb 2001, Manfred Spraul wrote:
> Jens Axboe wrote:
> >
> > > Several kernel functions need a "dontblock" parameter (or a callback, or
> > > a waitqueue address, or a tq_struct pointer).
> >
> > We don't even need that, non-blocking is implicitly applied with READA.
> >
> READA just
Jens Axboe wrote:
>
> > Several kernel functions need a "dontblock" parameter (or a callback, or
> > a waitqueue address, or a tq_struct pointer).
>
> We don't even need that, non-blocking is implicitly applied with READA.
>
READA just returns - I doubt that the aio functions should poll until
t
>
> On Tue, 6 Feb 2001, Marcelo Tosatti wrote:
>
> > Think about a given number of pages which are physically contiguous on
> > disk -- you dont need to cache the block number for each page, you
> > just need to cache the physical block number of the first page of the
> > "cluster".
>
> ranges
On Tue, 6 Feb 2001, Christoph Hellwig wrote:
>
> The second is that bh's are two things:
>
> - a cacheing object
> - an io buffer
Actually, they really aren't.
They kind of _used_ to be, but more and more they've moved away from that
historical use. Check in particular the page cache, and
On Tue, 6 Feb 2001, Marcelo Tosatti wrote:
> Think about a given number of pages which are physically contiguous on
> disk -- you dont need to cache the block number for each page, you
> just need to cache the physical block number of the first page of the
> "cluster".
ranges are a hell of a lo
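The range idea above can be sketched as collapsing a sorted list of per-page block numbers into (start, length) extents: one extent replaces an arbitrarily long run of contiguous block numbers (struct and function names here are made up for illustration):

```c
#include <stddef.h>

struct extent {
    unsigned long start;      /* first on-disk block of the run */
    unsigned long len;        /* number of contiguous blocks */
};

/* Collapse sorted block numbers into extents; returns the extent count.
 * out must have room for n entries (worst case: nothing is contiguous). */
static size_t blocks_to_extents(const unsigned long *blocks, size_t n,
                                struct extent *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (m > 0 && out[m - 1].start + out[m - 1].len == blocks[i]) {
            out[m - 1].len++;     /* contiguous with the previous run */
        } else {
            out[m].start = blocks[i];
            out[m].len = 1;
            m++;
        }
    }
    return m;
}

/* Example: five pages that form two contiguous runs on disk, so the
 * cache needs two extents instead of five block numbers. */
static size_t demo_extents(void)
{
    unsigned long blocks[] = { 100, 101, 102, 200, 201 };
    struct extent out[5];
    return blocks_to_extents(blocks, 5, out);
}
```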
On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> On Tue, 6 Feb 2001, Christoph Hellwig wrote:
>
> > The second is that bh's are two things:
> >
> > - a cacheing object
> > - an io buffer
> >
> > This is not really an clean appropeach, and I would really like to get
> > away from it.
>
> caching
On Tue, Feb 06 2001, Ben LaHaise wrote:
> =) This is what I'm seeing: lots of processes waiting with wchan ==
> __get_request_wait. With async io and a database flushing lots of io
> asynchronously spread out across the disk, the NR_REQUESTS limit is hit
> very quickly.
You can't do async I/O t
On Tue, Feb 06 2001, Manfred Spraul wrote:
> > =) This is what I'm seeing: lots of processes waiting with wchan ==
> > __get_request_wait. With async io and a database flushing lots of io
> > asynchronously spread out across the disk, the NR_REQUESTS limit is hit
> > very quickly.
> >
> Has that
Ben LaHaise wrote:
>
> On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> >
> > On Tue, 6 Feb 2001, Ben LaHaise wrote:
> >
> > > This small correction is the crux of the problem: if it blocks, it
> > > takes away from the ability of the process to continue doing useful
> > > work. If it returns -EAGAIN
On Tue, 6 Feb 2001, Christoph Hellwig wrote:
> The second is that bh's are two things:
>
> - a cacheing object
> - an io buffer
>
> This is not really an clean appropeach, and I would really like to get
> away from it.
caching bmap() blocks was a recent addition around 2.3.20, and i suggested
On Tue, Feb 06, 2001 at 11:32:43AM -0800, Linus Torvalds wrote:
> Traditionally, a "bh" is only _used_ for small areas, but that's not a
> "bh" issue, that's a memory management issue. The code should pretty much
> handle the issue of a single 64kB bh pretty much as-is, but nothing
> creates them:
On Tue, 6 Feb 2001, Ben LaHaise wrote:
>
> This small correction is the crux of the problem: if it blocks, it takes
> away from the ability of the process to continue doing useful work. If it
> returns -EAGAIN, then that's okay, the io will be resubmitted later when
> other disk io has complet
On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> On Tue, 6 Feb 2001, Ben LaHaise wrote:
>
> > This small correction is the crux of the problem: if it blocks, it
> > takes away from the ability of the process to continue doing useful
> > work. If it returns -EAGAIN, then that's okay, the io will be
> >
On Tue, 6 Feb 2001, Ben LaHaise wrote:
> Sure. General parameters will be as follows (since I think we both have
> access to these machines):
>
> - 4xXeon, 4GB memory, 3GB to be used for the ramdisk (enough for a
> base install plus data files.
> - data to/from the ram block
On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> On Tue, 6 Feb 2001, Ben LaHaise wrote:
>
> > > > You mentioned non-spindle base io devices in your last message. Take
> > > > something like a big RAM disk. Now compare kiobuf base io to buffer
> > > > head based io. Tell me which one is going to perfor
On Tue, Feb 06 2001, Ingo Molnar wrote:
> > This small correction is the crux of the problem: if it blocks, it
> > takes away from the ability of the process to continue doing useful
> > work. If it returns -EAGAIN, then that's okay, the io will be
> > resubmitted later when other disk io has com
On Tue, 6 Feb 2001, Ben LaHaise wrote:
> This small correction is the crux of the problem: if it blocks, it
> takes away from the ability of the process to continue doing useful
> work. If it returns -EAGAIN, then that's okay, the io will be
> resubmitted later when other disk io has completed.
On Tue, 6 Feb 2001, Linus Torvalds wrote:
>
>
> On Tue, 6 Feb 2001, Ben LaHaise wrote:
> >
> > s/impossible/unpleasant/. ll_rw_blk blocks; it should be possible to have
> > a non blocking variant that does all of the setup in the caller's context.
> > Yes, I know that we can do it with a kernel
On Tue, 6 Feb 2001, Ben LaHaise wrote:
> > > You mentioned non-spindle base io devices in your last message. Take
> > > something like a big RAM disk. Now compare kiobuf base io to buffer
> > > head based io. Tell me which one is going to perform better.
> >
> > roughly equal performance when u
On Tue, 6 Feb 2001, Linus Torvalds wrote:
> (Small correction: it doesn't block on anything else than allocating a
> request structure if needed, and quite frankly, you have to block
> SOMETIME. You can't just try to throw stuff at the device faster than
> it can take it. Think of it as a "there
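The two throttling styles in this exchange can be sketched with a toy fixed-capacity request queue: the blocking path would sleep waiting for a free slot, while a non-blocking submit returns EAGAIN and leaves the retry to the caller (capacity and names are invented for the sketch):

```c
#include <errno.h>

#define RING_CAP 128u             /* made-up request-queue capacity */

struct req_ring {
    unsigned int head;            /* requests submitted */
    unsigned int tail;            /* requests completed */
};

/* Non-blocking submit: -EAGAIN when every request slot is in use.
 * A blocking variant would sleep at this point until tail catches up;
 * either way, SOMETHING throttles once the queue is full. */
static int submit_nonblock(struct req_ring *r)
{
    if (r->head - r->tail >= RING_CAP)
        return -EAGAIN;           /* queue full: caller retries later */
    r->head++;
    return 0;
}

/* Example: with no completions, the (RING_CAP + 1)-th submit fails. */
static int demo_overfill(void)
{
    struct req_ring r = { 0, 0 };
    int rc = 0;
    for (unsigned int i = 0; i <= RING_CAP; i++)
        rc = submit_nonblock(&r);
    return rc;
}
```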
On Tue, Feb 06 2001, Ben LaHaise wrote:
> > > As for io completion, can't we just issue seperate requests for the
> > > critical data and the readahead? That way for SCSI disks, the important
> > > io should be finished while the readahead can continue. Thoughts?
> >
> > Priorities?
>
> Definat
On Tue, Feb 06 2001, Ben LaHaise wrote:
> > > - make asynchronous io possible in the block layer. This is
> > > impossible with the current ll_rw_block scheme and io request
> > > plugging.
> >
> > why is it impossible?
>
> s/impossible/unpleasant/. ll_rw_blk blocks; it should be poss
On Tue, 6 Feb 2001, Ben LaHaise wrote:
>
> s/impossible/unpleasant/. ll_rw_blk blocks; it should be possible to have
> a non blocking variant that does all of the setup in the caller's context.
> Yes, I know that we can do it with a kernel thread, but that isn't as
> clean and it significantly
On Tue, 6 Feb 2001, Ben LaHaise wrote:
> > > - make asynchronous io possible in the block layer. This is
> > > impossible with the current ll_rw_block scheme and io request
> > > plugging.
> >
> > why is it impossible?
>
> s/impossible/unpleasant/. ll_rw_blk blocks; it should be possi
On Tue, 6 Feb 2001, Ben LaHaise wrote:
> On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> > If you are merging based on (device, offset) values, then that's lowlevel
> > - and this is what we have been doing for years.
> >
> > If you are merging based on (inode, offset), then it has flaws like not
>
On Tue, 6 Feb 2001, Ingo Molnar wrote:
>
> On Tue, 6 Feb 2001, Ben LaHaise wrote:
>
> > - reduce the overhead in submitting block ios, especially for
> > large ios. Look at the %CPU usages differences between 512 byte
> > blocks and 4KB blocks, this can be better.
>
> my system is
On Tue, 6 Feb 2001, Ben LaHaise wrote:
> - reduce the overhead in submitting block ios, especially for
> large ios. Look at the %CPU usages differences between 512 byte
> blocks and 4KB blocks, this can be better.
my system is already submitting 4KB bhs. If anyone's raw-IO