On 09/09/2010 01:45 AM, Avi Kivity wrote:
> Loading very large L2 tables on demand will result in very long
> latencies. Increasing cluster size will result in very long first
> write latencies. Adding an extra level results in an extra random
> write every 4TB.
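
To put numbers behind that last figure: assuming 64 KB clusters and
8-byte entries packed into one-cluster tables (illustrative parameters,
not necessarily qed's defaults), a leaf table maps 512 MB, so an added
top level is only touched once per 4 TB allocated:

    #include <stdio.h>
    #include <stdint.h>

    /* Back-of-envelope arithmetic behind "an extra random write every
     * 4TB"; the cluster and entry sizes are assumptions. */
    int main(void)
    {
        uint64_t cluster = 64 * 1024;               /* 64 KB cluster */
        uint64_t entries = cluster / 8;             /* 8192 entries/table */
        uint64_t leaf_span = entries * cluster;     /* 512 MB */
        uint64_t upper_span = entries * leaf_span;  /* 4 TB */

        printf("leaf table maps %llu MB\n",
               (unsigned long long)(leaf_span >> 20));
        printf("a new upper-level entry is written every %llu TB\n",
               (unsigned long long)(upper_span >> 40));
        return 0;
    }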

It would be trivially easy to add another level of tables as a feature
bit, so let's delay the decision.
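
A minimal sketch of what that gate could look like, assuming a
must-understand feature-bit field in the header (the bit name below is
made up; the mechanism is the one the proposal already uses for
extensions):

    #include <errno.h>
    #include <stdint.h>

    /* Hypothetical feature bit -- not in any published spec. */
    #define QED_F_EXTRA_TABLE_LEVEL  (1ULL << 8)

    #define QED_FEATURES_SUPPORTED   (QED_F_EXTRA_TABLE_LEVEL /* | ... */)

    struct qed_header_bits {
        uint64_t features;   /* must-understand feature bits */
        /* ...remaining header fields elided... */
    };

    /* Old binaries see an unknown bit and refuse to open the image
     * instead of corrupting it; new binaries select the two- or
     * three-level lookup path based on the bit. */
    static int qed_check_features(const struct qed_header_bits *h)
    {
        return (h->features & ~QED_FEATURES_SUPPORTED) ? -ENOTSUP : 0;
    }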

>> qed is very careful about ensuring that we don't need to do syncs and
>> we don't get corruption because of data loss. I don't necessarily
>> buy your checksumming argument.
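
(For readers following along: "careful" here means ordering metadata
updates after the data they point to. A rough sketch with made-up names
and offsets -- this is not qed's actual code:)

    #include <stdint.h>
    #include <unistd.h>

    /* Ordering rule: never publish an L2 entry before the cluster it
     * points to has been written.  A crash in between only leaks a
     * cluster; a dirty flag in the header can trigger an fsck-style
     * scan on the next open, so the data path needs no fsync. */
    static int allocating_write(int fd, const void *data, size_t len,
                                uint64_t cluster_off,
                                uint64_t l2_entry_off)
    {
        /* 1. Fill the freshly allocated cluster with guest data. */
        if (pwrite(fd, data, len, (off_t)cluster_off) != (ssize_t)len)
            return -1;

        /* 2. Only then point the L2 table at it. */
        if (pwrite(fd, &cluster_off, sizeof(cluster_off),
                   (off_t)l2_entry_off) != (ssize_t)sizeof(cluster_off))
            return -1;

        return 0;
    }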

> The requirement for checksumming comes from a different place. For
> decades we've enjoyed very low undetected bit error rates. However,
> the amount of data stored is growing to the point that an undetected
> bit error becomes likely simply because a huge number of bits is
> being thrown at storage. Write ordering doesn't address this issue.
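
Rough scale of the problem (1e-15 undetected errors per bit is an
assumed ballpark figure, not a measured one):

    #include <stdio.h>

    /* Expected undetected bit errors when a petabyte passes through
     * storage, at an assumed rate of 1e-15 per bit. */
    int main(void)
    {
        double per_bit = 1e-15;          /* assumed undetected-error rate */
        double bits_per_pb = 1e15 * 8;   /* bits in one petabyte */
        printf("expected errors per PB: %.0f\n", per_bit * bits_per_pb);
        return 0;                        /* prints 8 */
    }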

I don't think we should optimize an image format for cheap disks and an
old file system. We should optimize for the future; that means a btrfs
file system and/or enterprise storage.

The point of an image format is not to recreate btrfs in software. It's
to provide a mechanism that lets users move images around reasonably,
but once an image is present on a reasonable filesystem, we should more
or less get the heck out of the way.

>> By creating two code paths within qcow2.

> You're creating two code paths for users.

No, I'm creating a single path: QED.

There are already two code paths: raw and qcow2. qcow2 has had such a
bad history that for a lot of users it's not even a choice.

Today, users have to choose between performance and reliability or
features. QED offers an opportunity to tell users to just always use
QED as an image format and forget about raw/qcow2/everything else.

You can say, let's just make qcow2 better, but we've been trying that
for years, and QED is an existence proof that we can do it in a
straightforward fashion. A new format doesn't introduce much additional
complexity. We provide an image conversion tool, and we can almost
certainly provide an in-place conversion tool that makes the process
very fast.
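
For the offline case, with a qed driver in place the conversion should
be the stock qemu-img invocation (format name "qed" assumed here):

    qemu-img convert -O qed disk.qcow2 disk.qed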

> It requires users to make a decision. By the time qed is ready for
> mass deployment, 1-2 years will have passed. How many qcow2 images
> will be in the wild then? How much scheduled downtime will be needed?

Zero if we're smart. You can combine QED streaming with live migration
to do a live conversion from raw to QED.

> How much user confusion will be caused?

User confusion is reduced if we can make strong, clear statements: all
users should use QED, even if they care about performance. Today,
there's mass confusion because of the poor state of qcow2.

> Virtualization is about compatibility. In-guest compatibility first,
> but keeping the external environment stable is also important. We
> really need to exhaust the possibilities with qcow2 before giving up
> on it.

IMHO, we're long past exhausting the possibilities with qcow2. We still
haven't decided what we're going to do for 0.13.0. Are we going to ship
qcow2 with awful performance (a 15-minute operation taking hours) or
with compromised data integrity?

It's been this way for every release since qcow2 existed. Let's not let
sunk cost cloud our judgement here.

qcow2 is not a properly designed image format. It was a weekend hacking
session from Fabrice that he dropped into the code base and never
really finished doing what he originally intended. The improvements
that have been made to it are almost at the heroic level, but we're
only hurting our users by not moving on to something better.

Regards,
Anthony Liguori