juliehockett added a comment. In https://reviews.llvm.org/D41102#1017918, @lebedev.ri wrote:
> Is there some (internal to `BitstreamWriter`) logic that would 'assert()' if > trying to output some recordid > which is, according to the `BLOCKINFO_BLOCK`, should not be there? > E.g. outputting `VERSION` in `BI_COMMENT_BLOCK_ID`? Yes -- it will fail an assertion: `Assertion 'V == Op.getLiteralValue() && "Invalid abbrev for record!"' failed.` ================ Comment at: clang-doc/BitcodeWriter.cpp:191 + Record.clear(); + for (const char C : BlockIdNameMap[ID]) Record.push_back(C); + Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record); ---------------- lebedev.ri wrote: > Why do we have this indirection? > Is there a need to first to (unefficiently?) copy to `Record`, and then emit > from there? > Wouldn't this work just as well? > ``` > Record.clear(); > Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, BlockIdNameMap[ID]); > ``` No, since `BlockIdNameMap[ID]` returns a `StringRef`, which can be manipulated into an `std::string` or a `const char*`, but the `Stream` wants an `unsigned char`. So, the copying is to satisfy that. Unless there's a better way to convert a `StringRef` into an array of `unsigned char`? ================ Comment at: clang-doc/BitcodeWriter.cpp:196 +/// \brief Emits a record name to the BLOCKINFO block. +void ClangDocBitcodeWriter::emitRecordID(RecordId ID) { + assert(RecordIdNameMap[ID] && "Unknown Abbreviation"); ---------------- lebedev.ri wrote: > Hmm, so i've been staring at this and > http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say i'm > not fond of this indirection. > > What i don't understand is, in previous function, we don't store `BlockId`, > why do we want to store `RecordId`? > Aren't they both unstable, and are implementation detail? > Do we want to store it (`RecordId`)? If yes, please explain it as a new > comment in code. > > If no, i guess this would work too? > ``` > assert(RecordIdNameMap[ID] && "Unknown Abbreviation"); > Record.clear(); > Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, > RecordIdNameMap[ID].Name); > ``` > And after that you can lower the default size of `SmallVector<> Record` down > to, hm, `4`? I'm not entirely certain what you mean -- in `emitBlockId()`, we are storing both the block id and the block name in separate records (`BLOCKINFO_CODE_SETBID`, `BLOCKINFO_CODE_BLOCKNAME`, respectively). In `emitRecordId()`, we're doing something slightly different, in that we emit one record with both the record id and the record name (in record `BLOCKINFO_CODE_SETRECORDNAME`). Replacing the copy loop here has the same issue as above, namely that there isn't an easy way to convert between a `StringRef` and an array of `unsigned char`. ================ Comment at: clang-doc/BitcodeWriter.cpp:240 + if (!prepRecordData(ID, Val)) return; + assert(Val < (1U << BitCodeConstants::IntSize)); + Record.push_back(Val); ---------------- lebedev.ri wrote: > Ok, now that i think about it, it can't be that easy. > Maybe > ``` > FIXME: assumes 8 bits per byte > assert(llvm::APInt(8U*sizeof(Val), Val, /*isSigned=*/true).getBitWidth() <= > BitCodeConstants::IntSize)); > ``` > Not sure whether `getBitWidth()` is really the right function to ask though. > (Not sure how this all works for negative numbers) That assertion fails :/ I could do something like `static_cast<int64_t>(Val) == Val` but that would require a) IntSize being a power of 2 b) updating the assert anytime IntSize is updated, and 3) still throws a warning about comparing a signed to an unsigned int... ================ Comment at: clang-doc/BitcodeWriter.h:53 + BI_LAST = BI_COMMENT_BLOCK_ID +}; + ---------------- lebedev.ri wrote: > juliehockett wrote: > > lebedev.ri wrote: > > > So what *exactly* does `BitCodeConstants::SubblockIDSize` mean? > > > ``` > > > static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too > > > many block id's!"); > > > ``` > > > ? > > It's the current abbrev id width for the block (described [[ > > https://llvm.org/docs/BitCodeFormat.html#enter-subblock-encoding | here > > ]]), so it's the max id width for the block's abbrevs. > So in other words that `static_assert()` is doing the right thing? > Add it after the `enum BlockId{}` then please, will both document things, and > ensure that things remain in a sane state. No...it's the (max) number of the abbrevs relevant to the block itself, which is to say some subset of the RecordIds for any given block (e.g. for a `BI_COMMENT_BLOCK`, the number of abbrevs would be 12 and so on the abbrev width would be 4). To assert for it we could put block start/end markers on the RecordIds and then use that to calculate the bitwidth, if you think the assertion should be there. https://reviews.llvm.org/D41102 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits