* Zheng Chuan (zhengch...@huawei.com) wrote:
>
>
> On 2020/8/21 20:39, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berra...@redhat.com) wrote:
> >> On Fri, Aug 21, 2020 at 08:22:06PM +0800, Zheng Chuan wrote:
> >>>
> >>>
> >>> On 2020/8/21 1:55, Dr. David Alan Gilbert wrote:
> >>>> * Daniel P. Berrangé (berra...@redhat.com) wrote:
> >>>>> On Thu, Aug 20, 2020 at 06:30:09PM +0100, Dr. David Alan Gilbert wrote:
> >>>>>> * Chuan Zheng (zhengch...@huawei.com) wrote:
> >>>>>>> Record hash results for each sampled page.
> >>>>>>>
> >>>>>>> Signed-off-by: Chuan Zheng <zhengch...@huawei.com>
> >>>>>>> Signed-off-by: YanYing Zhuang <ann.zhuangyany...@huawei.com>
> >>>>>>> ---
> >>>>>>>  migration/dirtyrate.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>  migration/dirtyrate.h |   7 +++
> >>>>>>>  2 files changed, 151 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
> >>>>>>> index c4304ef..62b6f69 100644
> >>>>>>> --- a/migration/dirtyrate.c
> >>>>>>> +++ b/migration/dirtyrate.c
> >>>>>>> @@ -25,6 +25,7 @@
> >>>>>>>  #include "dirtyrate.h"
> >>>>>>>
> >>>>>>>  CalculatingDirtyRateState CalculatingState = CAL_DIRTY_RATE_INIT;
> >>>>>>> +static unsigned long int qcrypto_hash_len = QCRYPTO_HASH_LEN;
> >>>>>>
> >>>>>> Why do we need this static rather than just using the QCRYPTO_HASH_LEN ?
> >>>>>> It's never going to change is it?
> >>>>>> (and anyway it's just a MD5 len?)
> >>>>>
> >>>>> I wouldn't want to bet on that given that this is use of MD5. We might
> >>>>> claim this isn't security critical, but surprises happen, and we will
> >>>>> certainly be dinged on security audits for introducing new use of MD5
> >>>>> no matter what.
> >>>>>
> >>>>> If a cryptographic hash is required, then sha256 should be the choice
> >>>>> for any new code that doesn't have back compat requirements.
> >>>>>
> >>>>> If a cryptographic hash is not required then how about crc32
> >>>>
> >>>> It doesn't need to be cryptographic; is crc32 the fastest reasonable
> >>>> hash for use in large areas?
> >>>>
> >>>> Dave
> >>>>
> >>>>> IOW, it doesn't make a whole lot of sense to say we need a cryptographic
> >>>>> hash, but then pick the most insecure one.
> >>>>>
> >>>>> sha256 is slower than md5, but it is conceivable that in future we might
> >>>>> gain support for something like Blake2b which is similar security level
> >>>>> to SHA3, while being faster than MD5.
> >>>>>
> >>>>> Overall I'm pretty unethusiastic about use of MD5 being introduced and
> >>>>> worse, being hardcoded as the only option.
> >>>>>
> >>>>> Regards,
> >>>>> Daniel
> >>>>> --
> >>>>> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> >>>>> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> >>>>> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
> >>>
> >>> Hi, Daniel, Dave.
> >>>
> >>> I do compare MD5 and SHA256 with vm memory of 128G under mempress of 100G.
> >>>
> >>> 1. Calculation speed
> >>>    1) MD5 takes about 500ms to sample and hash all pages by
> >>>       record_ramblock_hash_info().
> >>>    2) SHA256 takes about 750ms to sample all pages by
> >>>       record_ramblock_hash_info().
> >>>
> >>> 2. CPU Consumption
> >>>    1) MD5 may have instant rise up to 48% for dirtyrate thread
> >>>    2) SHA256 may have instant rise up to 75% for dirtyrate thread
> >>>
> >>> 3. Memory Consumption
> >>>    SHA256 may need twice memory than MD5 due to its HASH_LEN.
> >>>
> >>> I am trying to consider if crc32 is more faster and takes less memory and
> >>> is more safer than MD5?
> >>
> >> No, crc32 is absolutely *weaker* than MD5. It is NOT a cryptographic
> >> hash so does not try to guarantee collision resistance. It only has
> >> 2^32 possible outputs.
> >>
> >> MD5 does try to guarantee collision resistance, but MD5 is considered
> >> broken these days, so a malicious attacker can cause collisions if they
> >> are motivated enough.
> >>
> >> IOW if you need collision resistance that SHA256 should be used.
> >
> > There's no need to guard against malicious behaviour here - this is just
> > a stat to guide migration.
> > If CRC32 is likely to be faster than md5 I suspect it's enough.
> >
> > Dave
> >
> Hi,Dave, Daniel.
>
> I did test by crc32,it is much faster than MD5 and SHA256:)
>
> As for 128G vm it takes only about 50ms to sample and hash all pages by
> record_ramblock_hash_info().
> And the dirtyrate calculation is still good enough:)
> ++++++++++++++++++++++++++++++++++++++++++
> |                      |    dirtyrate    |
> ++++++++++++++++++++++++++++++++++++++++++
> | no mempress          |      4MB/s      |
> ------------------------------------------
> | mempress 4096 1024   |    1248MB/s     |
> ++++++++++++++++++++++++++++++++++++++++++
> | mempress 4096 4096   |    4060MB/s     |
> ++++++++++++++++++++++++++++++++++++++++++
>
> I will take crc32 in PatchV4, is that OK from the perspective of safety?

Yes, it's fine since it's only a heuristic anyway.

Dave

> In my opinion, it should be safe.
> The crc32 is only for compare and the recorder will be free after calculation
> is over.
> The output is just dirtyrate for user to guide migration.
>
> What's more, i consider increase DIRTYRATE_DEFAULT_SAMPLE_PAGES from 256 to
> 512
> which may takes about 75ms to sample and hash all pages by
> record_ramblock_hash_info().
>
>
> >>
> >> Regards,
> >> Daniel
> >> --
> >> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> >> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> >> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
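
For anyone following the thread, here is a minimal sketch of the sample-and-compare idea being discussed: record a CRC32 per sampled page, then hash the same pages again later and count how many changed. It is only an illustration and not the code from the patch. The names crc32_buf, record_sample_crcs and count_dirty_samples are hypothetical, the CRC is a plain bitwise implementation rather than whatever helper the patch ends up using, and pages are sampled at a fixed stride as a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 using the reflected polynomial 0xEDB88320 (the common
 * zlib/Ethernet variant). Slow but dependency-free; illustration only. */
uint32_t crc32_buf(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Record one CRC for every `stride`-th page (hypothetical sampling policy).
 * Returns the number of CRCs written to `out`, at most `out_cap`. */
size_t record_sample_crcs(const uint8_t *ram, size_t n_pages,
                          size_t page_size, size_t stride,
                          uint32_t *out, size_t out_cap)
{
    size_t n = 0;

    assert(page_size > 0 && stride > 0);
    for (size_t page = 0; page < n_pages && n < out_cap; page += stride) {
        out[n++] = crc32_buf(ram + page * page_size, page_size);
    }
    return n;
}

/* Re-hash the same sampled pages and count those whose CRC changed. */
size_t count_dirty_samples(const uint8_t *ram, size_t n_pages,
                           size_t page_size, size_t stride,
                           const uint32_t *recorded, size_t n_recorded)
{
    size_t dirty = 0;
    size_t n = 0;

    assert(page_size > 0 && stride > 0);
    for (size_t page = 0; page < n_pages && n < n_recorded;
         page += stride, n++) {
        if (crc32_buf(ram + page * page_size, page_size) != recorded[n]) {
            dirty++;
        }
    }
    return dirty;
}
```

A caller would record the CRCs, wait for the measurement interval, count the changed samples, and scale that fraction by the region size and elapsed time. Since the result is only a heuristic, a CRC32 collision just means one dirtied sample is missed.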