On Tue, Sep 14, 2010 at 4:47 PM, Kevin Wolf <kw...@redhat.com> wrote: > Am 14.09.2010 17:20, schrieb Anthony Liguori: >> On 09/14/2010 10:11 AM, Kevin Wolf wrote: >>> Am 14.09.2010 15:43, schrieb Anthony Liguori: >>> >>>> Hi Avi, >>>> >>>> On 09/14/2010 08:07 AM, Avi Kivity wrote: >>>> >>>>> Here's a draft of a plan that should improve qcow2 performance. It's >>>>> written in wiki syntax for eventual upload to wiki.qemu.org; lines >>>>> starting with # are numbered lists, not comments. >>>>> >>>> Thanks for putting this together. I think it's really useful to think >>>> through the problem before anyone jumps in and starts coding. >>>> >>>> >>>>> = Basics = >>>>> >>>>> At the minimum level, no operation should block the main thread. This >>>>> could be done in two ways: extending the state machine so that each >>>>> blocking operation can be performed asynchronously >>>>> (<code>bdrv_aio_*</code>) >>>>> or by threading: each new operation is handed off to a worker thread. >>>>> Since a full state machine is prohibitively complex, this document >>>>> will discuss threading. >>>>> >>>> There's two distinct requirements that must be satisfied by a fast block >>>> device. The device must have fast implementations of aio functions and >>>> it must support concurrent request processing. >>>> >>>> If an aio function blocks in the process of submitting the request, it's >>>> by definition slow. But even if you may the aio functions fast, you >>>> still need to be able to support concurrent request processing in order >>>> to achieve high throughput. >>>> >>>> I'm not going to comment in depth on your threading proposal. When it >>>> comes to adding concurrency, I think any approach will require a rewrite >>>> of the qcow2 code and if the author of that rewrite is more comfortable >>>> implementing concurrency with threads than with a state machine, I'm >>>> happy with a threaded implementation. >>>> >>>> I'd suggest avoiding hyperbole like "a full state machine is >>>> prohibitively complex". QED is a full state machine. qcow2 adds a >>>> number of additional states because of the additional metadata and sync >>>> operations but it's not an exponential increase in complexity. >>>> >>> It will be quite some additional states that qcow2 brings in, but I >>> suspect the really hard thing is getting the dependencies between >>> requests right. >>> >>> I just had a look at how QED is doing this, and it seems to take the >>> easy solution, namely allowing only one allocation at the same time. >> >> One L2 allocation, not cluster allocations. You can allocate multiple >> clusters concurrently and you can read/write L2s concurrently. >> >> Since L2 allocation only happens every 2GB, it's a rare event. > > Then your state machine is too complicated for me to understand. :-) > > Let me try to chase function pointers for a simple cluster allocation: > > bdrv_qed_aio_writev > qed_aio_setup > qed_aio_next_io > qed_find_cluster > qed_read_l2_table > ... > qed_find_cluster_cb > > This function contains the code to check if the cluster is already > allocated, right? > > n = qed_count_contiguous_clusters(s, request->l2_table->table, > index, n, &offset); > ret = offset ? QED_CLUSTER_FOUND : QED_CLUSTER_L2; > > The callback called from there is qed_aio_write_data(..., ret = > QED_CLUSTER_L2, ...) which means > > bool need_alloc = ret != QED_CLUSTER_FOUND; > /* Freeze this request if another allocating write is in progress */ > if (need_alloc) { > ... > > So where did I start to follow the path of a L2 table allocation instead > of a simple cluster allocation?
qed_aio_write_main() writes the main body of data into the cluster. Then it decides whether to update/allocate L2 tables if this is an allocating write. qed_aio_write_l2_update() is the function that gets called to touch the L2 table (it also handles the allocation case). Stefan