On Mon, 7 Dec 2009, Markus Armbruster wrote: > malc <av1...@comtv.ru> writes: > > > On Mon, 30 Nov 2009, Markus Armbruster wrote: > > > >> Commit a7d27b53 made zero-sized allocations a fatal error, deviating > >> from ISO C's malloc() & friends. Revert that, but take care never to > >> return a null pointer, like malloc() & friends may do (it's > >> implementation defined), because that's another source of bugs. > >> > >> Rationale: while zero-sized allocations might occasionally be a sign of > >> something going wrong, they can also be perfectly legitimate. The > >> change broke such legitimate uses. We've found and "fixed" at least one > >> of them already (commit eb0b64f7, also reverted by this patch), and > >> another one just popped up: the change broke qcow2 images with virtual > >> disk size zero, i.e. images that don't hold real data but only VM state > >> of snapshots. > >> > >> If a change breaks two uses, it probably breaks more. As a quick check, > >> I reviewed the first six of more than 200 uses of qemu_mallocz(), > >> qemu_malloc() and qemu_realloc() that don't have an argument of the form > >> sizeof(...) or similar: > > > > Bottom line: your point is that there are benign uses of zero allocation. > > There are, at the expense of memory consumption/fragmentation and useless > > code which, as your investigation shows, involve syscalls and what not. > > I doubt the performance impact is relevant, but since you insist on > discussing it... > > First and foremost, any performance argument not backed by measurements > is speculative. > > Potential zero allocation is common in code. Actual zero allocation is > rare at runtime. The amount of memory consumed is therefore utterly > trivial compared to total dynamic memory use. Likewise, time spent in > allocating is dwarved by time spent in other invocations of malloc() > several times over. > > Adding a special case for avoiding zero allocation is not free. Unless > zero allocations are sufficiently frequent, that cost exceeds the cost > of the rare zero allocation.
I bet you dollars to donuts that testing for zero before calling malloc is cheaper than the eventual test that is done inside it. > > > Outlook from my side of the fence: no one audited the code, no one > > knows that all of the uses are benign, abort gives the best automatic > > way to know for sure one way or the other. > > > > Rationale for keeping it as is: zero-sized allocations - overwhelming > > majority of the times, are not anticipated and not well understood, > > aborting gives us the ability to eliminate them in an automatic > > fashion. > > Keeping it as is releasing with known crash bugs, and a known pattern > for unknown crash bugs. Is that what you want us to do? I doubt it. Yes, it's better than having _silent_ bugs, which, furthermore, have a good potential of mainfesting themselves a lot further from the "bug" site. > > I grant you that there may be allocations we expect never to be empty, > and where things break when they are. Would you concede that there are > allocations that work just fine when empty? I wont dispute that. > > If you do, we end up with three kinds of allocations: known empty bad, > known empty fine, unknown. Aborting on known empty bad is okay with me. > Aborting on unknown in developement builds is okay with me, too. I > don't expect it to be a successful way to find bugs, because empty > allocations are rare. > > Initially, all allocations should be treated as "unknown". I want a way > to mark an allocation as "known empty fine". I figure you want a way to > mark "known empty bad". > > >> * load_elf32(), load_elf64() in hw/elf_ops.h: > >> > >> size = ehdr.e_phnum * sizeof(phdr[0]); > >> lseek(fd, ehdr.e_phoff, SEEK_SET); > >> phdr = qemu_mallocz(size); > >> > >> This aborts when the ELF file has no program header table, because > >> then e_phnum is zero (TIS ELF 1.2 page 1-6). Even if we require the > >> ELF file to have a program header table, aborting is not an acceptable > >> way to reject an unsuitable or malformed ELF file. > > > > If there's a malformed ELF files that must be supported (and by supported > > - user notification is meant) then things should be checked, because having > > gargantuan size will force qemu_mallocz to abort via OOM check (even > > with Linux and overcommit, since malloc will fallback to mmap which > > will fail) and this argument falls apart. > > ELF files with empty PHT are *not* malformed. Empty PHT is explicitly > permitted by ELF TIS. Likewise for empty segments. I already gave you > chapter & verse. Uh, it's you who brought the whole malformed issue. > > >> * load_elf32(), load_elf64() in hw/elf_ops.h: > >> > >> mem_size = ph->p_memsz; > >> /* XXX: avoid allocating */ > >> data = qemu_mallocz(mem_size); > >> > >> This aborts when the segment occupies zero bytes in the image file > >> (TIS ELF 1.2 page 2-2). > >> > >> * bdrv_open2() in block.c: > >> > >> bs->opaque = qemu_mallocz(drv->instance_size); > >> > >> However, vvfat_write_target.instance_size is zero. Not sure whether > >> this actually bites or is "only" a time bomb. > >> > >> * qemu_aio_get() in block.c: > >> > >> acb = qemu_mallocz(pool->aiocb_size); > >> > >> No existing instance of BlockDriverAIOCB has a zero aiocb_size. > >> Probably okay. > >> > >> * defaultallocator_create_displaysurface() in console.c has > >> > >> surface->data = (uint8_t*) qemu_mallocz(surface->linesize * > >> surface->height); > >> > >> With Xen, surface->linesize and surface->height come out of > >> xenfb_configure_fb(), which set xenfb->width and xenfb->height to > >> values obtained from the guest. If a buggy guest sets one to zero, we > >> abort. Not an good way to handle that. > > > > What is? Let's suppose height is zero, without explicit checks, user > > will never know why his screen doesn't update, with abort he will at > > least know that something is very wrong. > > My point isn't that permitting malloc(0) is the best way to handle it. > It's a better way than aborting, though. And this is precisely the gist of our disagreement. > > I'm not arguing against treating case 0 specially ever. > > >> Non-Xen uses aren't obviously correct either, but I didn't dig deeper. > >> > >> * load_device_tree() in device_tree.c: > >> > >> *sizep = 0; > >> dt_size = get_image_size(filename_path); > >> if (dt_size < 0) { > >> printf("Unable to get size of device tree file '%s'\n", > >> filename_path); > >> goto fail; > >> } > >> > >> /* Expand to 2x size to give enough room for manipulation. */ > >> dt_size *= 2; > >> /* First allocate space in qemu for device tree */ > >> fdt = qemu_mallocz(dt_size); > >> > >> We abort if the image is empty. Not an acceptable way to handle that. > >> > >> Based on this sample, I'm confident there are many more uses broken by > >> the change. > > > > Based on this sample i'm confident, we can eventually fix them should those > > become the problem. > > You broke working code. I'm glad you're confident we can fix it > "eventually". What about 0.12? Ship it broken? I'm fixing those as they arrive. > > >> Signed-off-by: Markus Armbruster <arm...@redhat.com> > >> --- > >> block/qcow2-snapshot.c | 5 ----- > >> qemu-malloc.c | 14 ++------------ > >> 2 files changed, 2 insertions(+), 17 deletions(-) > >> > >> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c > >> index 94cb838..e3b208c 100644 > >> --- a/block/qcow2-snapshot.c > >> +++ b/block/qcow2-snapshot.c > >> @@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, > >> QEMUSnapshotInfo **psn_tab) > >> QCowSnapshot *sn; > >> int i; > >> > >> - if (!s->nb_snapshots) { > >> - *psn_tab = NULL; > >> - return s->nb_snapshots; > >> - } > >> - > >> sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo)); > >> for(i = 0; i < s->nb_snapshots; i++) { > >> sn_info = sn_tab + i; > >> diff --git a/qemu-malloc.c b/qemu-malloc.c > >> index 295d185..aeeb78b 100644 > >> --- a/qemu-malloc.c > >> +++ b/qemu-malloc.c > >> @@ -44,22 +44,12 @@ void qemu_free(void *ptr) > >> > >> void *qemu_malloc(size_t size) > >> { > >> - if (!size) { > >> - abort(); > >> - } > >> - return oom_check(malloc(size)); > >> + return oom_check(malloc(size ? size : 1)); > >> } > >> > >> void *qemu_realloc(void *ptr, size_t size) > >> { > >> - if (size) { > >> - return oom_check(realloc(ptr, size)); > >> - } else { > >> - if (ptr) { > >> - return realloc(ptr, size); > >> - } > >> - } > >> - abort(); > >> + return oom_check(realloc(ptr, size ? size : 1)); > >> } > > > > http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b > > --verbose ? Sigh... C90 - realloc(non-null, 0) = free(non-null); return NULL; C99 - realloc(non-null, 0) = free(non-null); return realloc(NULL, 0); GLIBC 2.7 = C90 How anyone can use this interface and it's implementations portably or "sanely" is beyond me. Do read the discussion the linked above. -- mailto:av1...@comtv.ru