Stefano,
--On 5 April 2013 17:22:04 +0100 Stefano Stabellini
wrote:
Thanks for that. Do you plan to get the 4.2 patches in for 4.2.2?
Yes
Great - thanks.
--
Alex Bligh
timers cannot. For instance,
in block drivers where there is no mainloop that calls timers
(qemu-nbd, qemu-img), or where (per stefa...@redhat.com) the
aio code loops internally and thus timers never get called.
Signed-off-by: Alex Bligh
---
async.c | 53
--On 6 July 2013 17:24:57 +0100 Alex Bligh wrote:
Add timed bottom halves. A timed bottom half is a bottom half that
will not execute until a given time has passed (qemu_bh_schedule_at)
or a given interval has passed (qemu_bh_schedule_in). Any qemu
clock can be used, and times are specified in
* aio_ctx_prepare should round up wait time
Signed-off-by: Alex Bligh
---
async.c | 53 +--
include/block/aio.h | 33
tests/test-aio.c| 47 +
3 files changed, 127 inserti
cient position.
*/
static inline void wavelet_level(int *data, int size, int l, int skip_pixel) {
If we're doing spelling/grammar fixes, that should be "is quite the same"
and "that of coefficient position".
--
Alex Bligh
de I've contributed (e.g. to fix a bug), to indicate I agree
with them.
It's also pretty clear (lines 437 and 462) what Reviewed-by: means
--
Alex Bligh
sector_num += n;
ide_set_sector(s, sector_num);
s->nsector -= n;
}
--
Alex Bligh
Paolo,
--On 15 July 2013 16:25:01 +0200 Paolo Bonzini wrote:
Thanks for the review.
On 06/07/2013 18:24, Alex Bligh wrote:
Add timed bottom halves. A timed bottom half is a bottom half that
will not execute until a given time has passed (qemu_bh_schedule_at)
or a given interval has
h the patch I sent) be treated exactly the
same as untimed BH's, i.e. the timeout to poll should be
set to zero.
--
Alex Bligh
ave
a look at that.
--
Alex Bligh
The second (and removing the aio_notify
from qemu_bh_schedule_at) is sufficient provided it checks
for scheduled bh's immediately prior to the poll. This assumes
other threads cannot schedule bh's. This would seem to be less
intrusive than a TimedEventNotifier approach which (as far as I
can see) requires another thread.
--
Alex Bligh
millisecond
timer. By that I mean poll() would be called at least once before
it ran, but with a 0 ms timeout.
--
Alex Bligh
Changes since v3:
* aio_ctx_prepare should cope with wait<0
* aio_ctx_prepare should round up wait time
Changes since v2:
* aio_poll timeout depends on presence of timed bottom halves
* timed bh's do not do aio_notify immediately
Signed-off-by: Alex Bligh
---
aio-posix.c | 20 +-
aio
aving something to comment on, I just sent v3 of this
patch to the list. This is basically a 'minimal change' version that
fixes the issue with aio_poll (I think). It passes make check.
Stefan? Kevin?
--
Alex Bligh
by no means set on) can use any QEMUClock
in the bh, so you get to choose the clock per BH, not per AioContext,
which may or may not have advantages.
--
Alex Bligh
go at that.
--
Alex Bligh
is (a).
2. If we do delete alarm timers, I'll need to delete the -clock option.
WDYT?
--
Alex Bligh
and no SIGALRM, and the timeout will still be infinite (unless I
calculate the timeout as the minimum across all clocks, in which
case I might as well do (b) above).
--
Alex Bligh
x86_64
I can't help but note the last commit is:
commit 24943978cbe79634a9a8b02a20efb25b29b3ab49
Author: Markus Armbruster
Date: Wed Jun 26 15:52:23 2013 +0200
boot-order-test: Add tests for Sun4u
Obviously that should make no difference for x86_64. I've copied
the author just in case.
--
Alex Bligh
the time being) a conversion to
milliseconds.
Make qemu_run_timers return a bool to indicate progress.
Add QEMUClock to AioContext.
Run timers attached to clock in aio_poll
Signed-off-by: Alex Bligh
---
aio-posix.c | 16 +-
aio-win32.c | 20 +-
async.c |
and a long rebuild due to failing
to limit my configure options fixed this. Apologies.
--
Alex Bligh
returning true only
if an AIO dispatch did something, or a BH was executed, or a timer ran.
Specifically if the poll simply times out, it should not be returning
true unless a timer ran. Correct?
--
Alex Bligh
qemu_timeout_ns_to_ms to convert a timeout in nanoseconds back to
milliseconds for when ppoll is not used.
Signed-off-by: Alex Bligh
---
include/qemu/timer.h | 19
qemu-timer.c | 83 ++
2 files changed, 96 insertions(+), 6
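The conversion has one subtlety worth spelling out: the millisecond value must be rounded up, otherwise poll() could wake just before the nearest timer deadline and busy-loop. A sketch (the function name matches the patch description above, but the body here is my illustration, not the patch's code):

```c
#include <stdint.h>

/* Convert a nanosecond timeout to milliseconds for poll()/g_poll().
 * -1 (wait forever) passes through; anything else rounds UP so the
 * poll never returns before the earliest timer deadline. */
static int qemu_timeout_ns_to_ms(int64_t ns)
{
    if (ns < 0) {
        return -1;              /* infinite wait */
    }
    if (ns == 0) {
        return 0;               /* already expired: poll without blocking */
    }
    int64_t ms = (ns + 999999) / 1000000;   /* ceiling division */
    return ms > INT32_MAX ? INT32_MAX : (int)ms;
}
```

With truncation instead of ceiling, a 1 ns timeout would become a 0 ms poll that returns immediately, before the timer is due.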
This check is incorrect, in that it checks that aio_poll
makes progress when in fact it should not make progress. I fixed an issue
where aio_poll was (as far as I can tell) wrongly returning true on
a timeout, and that generated this error.
Alex Bligh (7):
aio / timers: Remove alarm timers
aio / ti
Make qemu_run_timers and qemu_run_all_timers return progress
so that aio_poll etc. can determine whether a timer has been
run.
Signed-off-by: Alex Bligh
---
include/qemu/timer.h |4 ++--
qemu-timer.c | 17 +++--
2 files changed, 13 insertions(+), 8 deletions(-)
diff
Add qemu_g_poll_ns which works like g_poll but takes a nanosecond
timeout.
Signed-off-by: Alex Bligh
---
configure| 19 +++
qemu-timer.c | 24
2 files changed, 43 insertions(+)
diff --git a/configure b/configure
index 9e1cd19..b491c00 100755
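The essential idea of such a helper is just to avoid truncating sub-millisecond timeouts before they reach the kernel. A sketch over ppoll (the real patch, per its description, falls back to g_poll where ppoll is unavailable; poll_ns is my name, not the patch's):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for ppoll() on glibc */
#endif
#include <poll.h>
#include <stdint.h>
#include <time.h>

/* g_poll-like wrapper taking a nanosecond timeout; illustrative only.
 * A negative timeout means block indefinitely, as with poll(). */
static int poll_ns(struct pollfd *fds, nfds_t nfds, int64_t timeout_ns)
{
    if (timeout_ns < 0) {
        return ppoll(fds, nfds, NULL, NULL);   /* wait forever */
    }
    struct timespec ts = {
        .tv_sec  = timeout_ns / 1000000000LL,
        .tv_nsec = timeout_ns % 1000000000LL,
    };
    return ppoll(fds, nfds, &ts, NULL);
}
```

Because ppoll takes a struct timespec rather than milliseconds, no rounding is needed at all on this path.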
Add a test harness for AioContext timers. The g_source equivalent is
unsatisfactory as it suffers from false wakeups.
Signed-off-by: Alex Bligh
---
tests/test-aio.c | 124 +-
1 file changed, 123 insertions(+), 1 deletion(-)
diff --git a
Add a clock to each AioContext and delete it when freed.
Signed-off-by: Alex Bligh
---
async.c |2 ++
include/block/aio.h |5 +
2 files changed, 7 insertions(+)
diff --git a/async.c b/async.c
index 90fe906..0d41431 100644
--- a/async.c
+++ b/async.c
@@ -177,6 +177,7
Remove alarm timers from qemu-timers.c in anticipation of using
timeouts for g_poll / p_poll instead.
Signed-off-by: Alex Bligh
---
include/qemu/timer.h |2 -
main-loop.c |4 -
qemu-timer.c | 501 +-
vl.c
Switch to ppoll (or rather qemu_g_poll_ns which will use ppoll if available).
Set timeouts for aio, g_source, and mainloop from earliest timer deadline.
Run timers for AioContext (only) in aio_poll/aio_dispatch.
Signed-off-by: Alex Bligh
---
aio-posix.c | 20 +---
aio-win32
enable timers to run on AioContext's thread. And
maybe in future, hpet can run with its dedicated thread too.
Also, I see Alex Bligh is on the same effort by another method,(it is a
good idea)"[RFC] aio/async: Add timed bottom-halves".
Stefan & Paolo did not like that method muc
();
+
if (progress && !blocking) {
return true;
}
I am told (by Stefan H) this approach is unsafe as existing timers may
not expect to be run within aio_poll.
Also, I suspect you need to change the value of progress if timers
run so bdrv draining terminates properly.
--
Alex Bligh
ed to do is write() to the correct
notifier FD, which will end the relevant poll. Of course if we delete
alarm_timers this is all irrelevant.
--
Alex Bligh
will then (presumably) repoll with a recalculated
timeout.
--
Alex Bligh
thus recommended using a separate QEMUClock, and only running
that clock's timers.
--
Alex Bligh
n false.
Yup, you made the same fix as me at the end of aio_poll in my PATCHv2
RFC series.
On a related point, the g_source appears very fragile in respect of
false wakeups. I would not be confident that it would not busy-loop.
See the comments in the last of the patches in my series.
--
Alex Bligh
Richard,
--On 23 July 2013 13:09:18 -0800 Richard Henderson wrote:
On 07/20/2013 10:06 AM, Alex Bligh wrote:
+int64_t qemu_clock_deadline_ns(QEMUClock *clock);
+int64_t qemu_clock_deadline_all_ns(void);
+int qemu_timeout_ns_to_ms(int64_t ns);
+gint qemu_g_poll_ns(GPollFD *fds, guint nfds
objects can have pretty
appalling latency anyway (100ms!), and there's no evidence that's
limited by making one of the FDs (or objects) ready. In these
circumstances, I'd question whether we gain anything by worrying
about timer resolution.
--
Alex Bligh
th - in this case - 100ns resolution which is probably
enough.
Again I know nothing about Windows so this may be completely wrong.
--
Alex Bligh
EventNotifier which can be signalled with aio_notify(). The
purpose of this function is to kick an event loop that is blocking in
select()/poll(). This is necessary when another thread modifies
something that the AioContext needs to act upon, such as adding/removing
an fd.
Thanks
--
Alex Bligh
Paolo,
--On 24 July 2013 09:54:57 +0200 Paolo Bonzini wrote:
Alex, can you add it to your series? (Note that you must set a timer
slack of 1, because 0 is interpreted as "default").
Sure, will do. I'm guessing I'll have to look for that inside configure
as well.
--
Alex Bligh
--On 24 July 2013 09:01:22 +0100 Alex Bligh wrote:
Most 'reasonable' POSIX compliant operating systems have ppoll
Really? I could find no manpages for any of Solaris and *BSD.
OK I shall (re)research that then! I suppose select() / pselect() is
an alternative when there a
Where supported, call prctl(PR_SET_TIMERSLACK, 1, ...) to
set one nanosecond timer slack to increase precision of timer
calls.
Signed-off-by: Alex Bligh
---
[ Additional patch on the end of the PATCHv2 series - I'll
resend if I have further comments with this reordered so it's
n
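For reference, the call in question might look like the following (Linux-specific; as Paolo notes above, the slack must be 1 ns rather than 0, because 0 means "restore the default". The wrapper name is mine, not the patch's):

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Tighten the kernel's timer slack to 1 ns so that poll/ppoll timeouts
 * fire as close to the requested deadline as the kernel allows.
 * Linux-specific; harmless to skip on other platforms. */
static long set_minimal_timer_slack(void)
{
    /* A slack of 0 would reset to the default (typically 50 us), so
     * pass 1 to request the minimum. */
    if (prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0) != 0) {
        perror("prctl(PR_SET_TIMERSLACK)");
        return -1;
    }
    /* PR_GET_TIMERSLACK returns the current slack as its result. */
    return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}
```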
Should it be possible to live migrate between different versions of Qemu
(assuming suitable hardware)? I am thinking specifically of Qemu 1.0.x
to Qemu 1.5.0, x86_64 kvm.
--
Alex Bligh
everything is the way to go.
--
Alex Bligh
tc. I am avoiding are the ones that would
be introduced by using kernel rbd devices rather than librbd.
I had planned to introduce this as a sort of layer on top of any
existing block device handler; I believe they are layered at the
moment.
--
Alex Bligh
possibility might be copy the idea behind the PCI unplug logic. Either
if the new PCI device is used, it could unplug the old one, or vice versa.
Drivers magically unplugging themselves may not be ideal, but it beats
having 2 drivers fighting over the same device.
--
Alex Bligh
I'm also keen to hear from the Ceph guys, as if they have a way of
keeping lots of reads and writes in the box and not crossing the
network, I'd be only too keen to use that.
--
Alex Bligh
I'd not even thought of putting it in librbd (which might
be simpler). I suspect it might be easier to get patches into librbd
than into qemu, and that ensuring cache coherency might be simpler.
If I get time to look at this, would you be interested in taking patches
for this?
--
Alex Bligh
front which would acknowledge the write - even if
a flush/fua had come in - on the basis it had been written to
persistent storage and it can recover on a reboot after this
point, then go sort the rest out in the background.
--
Alex Bligh
sh).
I've also backported this to the Ubuntu Precise packaging of qemu-kvm,
(again note the branch is v1.0-rbd-add-async-flush) at
https://github.com/flexiant/qemu-kvm-1.0-noroms/tree/v1.0-rbd-add-async-flush
THESE PATCHES ARE VERY LIGHTLY TESTED. USE AT YOUR OWN RISK.
--
Alex Bligh
s original version
to get it to compile with old includes and still run with a new
library, as well as vice versa.
It would take me only a few minutes to bring this back as a patch for
1.5 if anyone is interested. I assumed they would not be.
--
Alex Bligh
is in C.
Another is that sheepdog has a daemon running on the client which
can keep state etc. as qemu comes and goes. Ceph just has a library.
--
Alex Bligh
new drivers not
making such assumptions, and carrying some versioning information, such
that we need one new PCI device now, but no more in the future?
--
Alex Bligh
I'd add that blkverify and blkdebug seem to be quite useful too.
--
Alex Bligh
Stefano,
--On 27 June 2013 19:16:30 +0100 Stefano Stabellini
wrote:
* Therefore this option gives the backend permission to use
* O_DIRECT, notwithstanding that bug.
Looks useful. Are you planning to do this for both emulated and pv
disks?
--
Alex Bligh
t very important (because IDE is slow anyway).
... perhaps 'who cares'.
--
Alex Bligh
or 1
make[1]: Leaving directory `/home/amb/qemu/git/qemu/tests/tcg'
make: *** [test] Error 2
--
Alex Bligh
This is an RFC for a very lightly tested patch.
Add a delay option to blkdebug, allowing operations to be delayed by
a specifiable number of microseconds. Example configuration:
[inject-error]
event = "read_aio"
delay = "20"
Signed-off-by: Alex Bligh
---
block/blkdebug.c | 83 +
h's are scheduled whenever non-idle ones are. However, I am
newbie as far as the async code and the block code are concerned.
--
Alex Bligh
irtic also produce PV drivers, and get
assigned their own device ID. Will both of them be visible with
a different device ID for the same device? Will windows cope with that?
Or do we need to mediate which device IDs are exposed?
--
Alex Bligh
ry to avoid was Citrix drivers
binding to PCI-ID A and Xirtic drivers binding to PCI-ID B, and them
both trying to work at once.
Perhaps I should stop worrying about this as I'm finding it really hard
to resist typing 'so don't use Windows' :-)
--
Alex Bligh
d. Is the idea here that QEMU is always built with CONFIG_PREFIX
having versioning inside it (in a distro environment)?
Can I suggest that at the very least, it should be possible to specify
an alternate path to the module directory via the CLI?
--
Alex Bligh
I also wonder about the utility of the subdirectories above, as
opposed to filename prefixes.
+1
--
Alex Bligh
symbols change on (literally)
every rebuild. Else every dev will build with modules turned
off!
--
Alex Bligh
What is the advantage of this enum and having
different types of module at all? If they are
all built together, why can't they all live
together in the same directory?
Seems like an overcomplication.
--
Alex Bligh
are at load time? So
qemu-img would never attempt to load (e.g.) the spice module.
--
Alex Bligh
any reliance
on modules at all. So we could scrap the whole modules infrastructure
and just tell people developing third party libraries to use weak
binding. As you can see, for librbd anyway, the changes required
to support weak binding are pretty minimal.
--
Alex Bligh
ld simply dlopen()
a fixed list of modules known at compile time from a single directory
(because we also know at compile which executable needs what, e.g.
that qemu-img doesn't need spice or whatever).
--
Alex Bligh
On 16 Sep 2013, at 12:04, Daniel P. Berrange wrote:
> On Mon, Sep 16, 2013 at 12:00:47PM +0100, Alex Bligh wrote:
>>
>> However, even if you don't use weak symbols, we could simply dlopen()
>> a fixed list of modules known at compile time from a single directory
d) could still help
here by permitting (e.g.) a blkrbd-new.so & blkrbd-old.so (hopefully
with more useful names), for those with an allergy to weak binding.
--
Alex Bligh
having different types
of modules etc.
One reason to avoid qemu-img (for instance) loading everything (if
it's present) is init time. I agree dlopen()'ing something that
never gets called should not eat too much RAM but it seems pointless.
--
Alex Bligh
to a second and log an error?
I presume the reason it's breaking aio_poll based timer threads is
because there is only one fd (the aio_notify_fd), there are
no other fd's, but there is a timeout (from the timers)? If
that's true, I think we /shouldn't/ return. Equally if there
are no timers but something is genuinely attempting to wait
on an aio_notify, I don't think we should return.
--
Alex Bligh
be Stefan's
bdrv_drain* patches in combination with this, as the timer stuff
definitely passed make check. I think I'll defer to Stefan on
how bdrv_drain* is meant to work now as I think it had or was
proposed to have major surgery after the timer patches went in.
--
Alex Bligh
to audit thread safety for use_icount=1.
--
Alex Bligh
Paolo,
On 18 Sep 2013, at 08:57, Paolo Bonzini wrote:
> On 17/09/2013 19:32, Alex Bligh wrote:
>>
>> On 17 Sep 2013, at 18:04, Paolo Bonzini wrote:
>>
>>> Alex, what's missing before block.c and QED can use aio_timer_new on
>>> the main Ai
Paolo,
On 18 Sep 2013, at 09:23, Alex Bligh wrote:
>> Yes, that was my understanding too. Can we do it for 1.7?
Whilst we are changing the calling semantics, do you think
qemu_coroutine_yield() should also run the timers for the
aio_context? IE should timers always be deferred to th
rottled_reqs(). This
> function not only kicks throttled requests but also temporarily disables
> throttling so requests can run.
>
> The outdated FIXME comment can be removed. Also drop the busy = true
> assignment since we overwrite it immediately afterwards.
>
> Signed-off-by:
_poll() wait is not considered
> making progress.
>
> Adjust test-aio /aio/bh/callback-delete/one which assumed aio_poll(ctx,
> true) would immediately return false instead of blocking.
>
> Signed-off-by: Stefan Hajnoczi
Signed-off-by: Alex Bligh
(if I meant 'Reviewed-
through the list of active timers in a timer
list. I believe the latter is what active_timers_lock protects.
The list of timers attached to a clock is only modified when timers
are created and deleted which is (currently) under the BQL.
--
Alex Bligh
k would be very very read heavy, so RCU is probably a
sensible option.
This link may (or may not) help in understanding:
http://blog.alex.org.uk/2013/08/24/changes-to-qemus-timer-system/
--
Alex Bligh
ftruncate() if the block device
is large enough, and at least attempting to proceed further? Something
like the following (not-even compile tested) patch?
--
Alex Bligh
Signed-Off-By: Alex Bligh
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 2ee5d69..be9a371 100644
--- a/block/raw-posix.c
r the reference to cdrom_reopen(),
sometimes by deleting fd_open(). I'm not quite sure why that's needed
there; perhaps doing it in raw_truncate and raw_getlength would
be sufficient.
Again, this is compile tested only.
--
Alex Bligh
Signed-Off-By: Alex Bligh
diff --git a/block/raw-
was against current trunk.
Command line simply:
qemu-img convert -O raw test.qcow /dev/xyz
Fails on ftruncate() as verified with strace.
I must admit I only 'tested' on trunk by reading the source.
--
Alex Bligh
--On 12 December 2011 10:48:43 +0100 Kevin Wolf wrote:
I was testing on:
amb@alex-test:~$ qemu-img --version
qemu-img version 0.12.3, Copyright (c) 2004-2008 Fabrice Bellard
That's the problem. It should work since 0.13.
Thanks
--
Alex Bligh
&s->clock_reset_notifier);
>
> s->suspend_notifier.notify = rtc_notify_suspend;
> --
>
--
Alex Bligh
> not call its reset_notifiers when the time is adjusted to a time-point
> before. The reset_notifiers is useless for this type of QEMUClock.
I think this is a straightforward bug. I suspect the issue is that at that point
in the patch series I hadn't yet converted the reset_notifier stuff to use
for the good feedback, by the way; it will make v2 better.
No problem. Some time ago I rewrote chunks of the nbd test suite and
wrote the bit that tested parallel outstanding commands. At the back
of my mind is whether I should extend the test suite to test this
and how we could persuade a server to 'often fragment' so we can
test reassembly (some form of debug setting on the server like
'max fragment size' or similar I suspect).
--
Alex Bligh
bly large), and should state that a server MAY reply to any
> request with DF set for a block larger than that minimum, with that
> error.
Yeah something like that. Or the server could simply publish this
as part of the option negotiation.
--
Alex Bligh
VERY server reply to be a structured
reply that simply set NBD_CHUNK_IS_END. That gives us a convenient
route to servers which only implement structured replies. With DF,
this would be little harder than implementing the current
protocol.
--
Alex Bligh
On 29 Mar 2016, at 21:00, Eric Blake wrote:
> I'm liking it - then we aren't sending a mandatory 0 error field on read
> chunks.
I'm writing it up as a strawman. I'll comment in a sec in further detail.
--
Alex Bligh
Here's a strawman for the structured reply section. I haven't
covered negotiation.
Signed-off-by: Alex Bligh
---
doc/proto.md | 114 +--
1 file changed, 111 insertions(+), 3 deletions(-)
diff --git a/doc/proto.md b/doc/prot
purposes. I am saying that if you've negotiated
structured replies, you should be able to return either for any
command, as the client can disambiguate using the magic number.
Clearly some future commands might REQUIRE structured replies as there
would be no way to represent them in an unstructured reply.
--
Alex Bligh
er to rely on 'data up to X' as being OK as
only one error is reported. I'd therefore suggest an error offset
of 2^32-1 means 'one or more errors; assume all delivered data is
potentially erroneous'.
--
Alex Bligh
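The suggested convention above can be sketched as follows (the macro and function names are hypothetical, not taken from the NBD spec; only the 2^32-1 sentinel value comes from the message):

```c
#include <stdint.h>

/* Suggested sentinel: an error offset of 2^32-1 means the position of
 * the error within the request is unknown, so the client must treat
 * all data already delivered for that request as potentially bad. */
#define NBD_ERR_OFFSET_UNKNOWN UINT32_MAX

/* Can the client trust data chunks that precede 'err_offset'? */
static int error_position_known(uint32_t err_offset)
{
    return err_offset != NBD_ERR_OFFSET_UNKNOWN;
}
```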
Always send structured replies for
all commands if you negotiate structured replies, else always send
unstructured replies. We're talking an overhead of 8 bytes here
(flags & error offset); somehow I suspect the time to transmit
8 bytes is going to be negligible compared to disk time or the
rest of the network tx/rx time.
--
Alex Bligh
formatting
* Inserted 3 options for when structured vs unstructured
replies are used
* Made use of error location clearer. 0xffffffff indicates
'unknown error position'.
* Minor clarifications
Signed-off-by: Alex Bligh
---
doc/proto.md | 138 +++
ying an error, you (at least in theory) need to disambiguate
the two so you know whether the chunk at offset X was OK. That's
why I'm using 0xffffffff (now) to say "don't know where the error
is".
--
Alex Bligh
OFFSET_DATA` or
> +`NBD_REPLY_TYPE_OFFSET_HOLE`, although it MAY still send more than
> +one reply (for error reporting, or a final `NBD_REPLY_TYPE_NONE`). If
"the flag is set and"
> +the client's length request is larger than 65,536 bytes (or if a
> +later
>> +MAY send `NBD_CMD_FLAG_DF`, which instructs the server
>> +not to fragment the reply. If this flag is set, the server
>> +MUST send either zero or one data chunks and an `NBD_CHUNKTYPE_END`
>> +only. Under such circumstances the server MAY error the command
>> +with `E