Thanks for your quick response, Mike. The database has its own raw page management over the OS page management, and most likely it also has its own page-level checksum. That is why I want to avoid a second checksum at the Lucene Directory level.
Certainly checksums are good. I like this pattern (where openChecksumInput can be rewritten to suit the real case):

    inputStream = directory.openChecksumInput(...);
    // at the end, check the checksum as a by-product
    CodecUtil.checkFooter(...)

But I do not like this pattern:

    CodecUtil.checksumEntireFile(..)

Its only purpose is to compute the checksum by reading all of the data; the checksum is not a by-product of reading that is needed anyway.

If the design/API were pluggable, with the current behavior as the default, it would be good enough for various scenarios.

Best regards,
Duke
If not now, when? If not me, who?

On Tue, Dec 6, 2016 at 6:36 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> We have learned over time not to trust the underlying store to
> correctly record the bytes we wrote to it.
>
> This is why checksumming is very strongly built into Lucene at this
> point. If you disable checksumming, when bits do flip, you get exotic
> exceptions at search time that might look like Lucene bugs and can
> cost a lot of time to explain.
>
> It's not just the BlockTreeTermsReader; many other codec components
> check the checksum with CodecUtil.checkFooter at search time.
>
> Can you explain why it's necessary to remove it for your database
> files based Directory?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Dec 6, 2016 at 5:25 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> > Hi all,
> >
> > I'm customizing a Lucene Directory, which extends o.a.l.store.Directory
> > and is based on database files. I do not need another checksum on
> > IndexInput and IndexOutput.
> >
> > But in the BlockTreeTermsReader constructor, the following code opens a
> > hard-coded BufferedChecksumIndexInput to checksum the raw IndexInput. I
> > have to use CRC32 on IndexOutput to get through it. Is there a more
> > graceful way to do the checksum, such as letting the Directory construct
> > a checksum instance instead of the API Directory.openChecksumInput?
> >
> >     String indexName = IndexFileNames.segmentFileName(segment, state.segmentSuffix, TERMS_INDEX_EXTENSION);
> >     indexIn = state.directory.openInput(indexName, state.context);
> >     CodecUtil.checkIndexHeader(indexIn, TERMS_INDEX_CODEC_NAME, version, version, state.segmentInfo.getId(), state.segmentSuffix);
> >     CodecUtil.checksumEntireFile(indexIn);
> >
> > Best regards,
> > Duke
> > If not now, when? If not me, who?
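
The difference between the two patterns discussed in this thread (checksum verified as a by-product of reading versus a dedicated pass over the whole file) can be illustrated with a minimal sketch that uses only java.util.zip. This is not Lucene code: the class ChecksumPatterns and its method names are hypothetical, and the footer is simplified to a bare 8-byte CRC32 value, which is an assumption for illustration and not Lucene's real footer layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

// Hypothetical illustration class; none of these names exist in Lucene.
class ChecksumPatterns {

    // Writes the payload followed by an 8-byte CRC32 footer.
    // Simplified assumption: a real Lucene footer carries more than the checksum.
    static byte[] writeWithFooter(byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        CheckedOutputStream checked = new CheckedOutputStream(bytes, new CRC32());
        DataOutputStream out = new DataOutputStream(checked);
        out.write(payload);
        out.flush();
        // Capture the checksum of the payload before the footer itself is written.
        long crc = checked.getChecksum().getValue();
        out.writeLong(crc);
        out.flush();
        return bytes.toByteArray();
    }

    // Pattern 1: the checksum accumulates while the payload is being consumed,
    // and is compared against the footer once the reader reaches the end.
    static byte[] readVerifyingAsByProduct(byte[] file) throws IOException {
        if (file.length < Long.BYTES) {
            throw new IOException("file too short to contain a footer");
        }
        CheckedInputStream checked =
            new CheckedInputStream(new ByteArrayInputStream(file), new CRC32());
        DataInputStream in = new DataInputStream(checked);
        byte[] payload = new byte[file.length - Long.BYTES];
        in.readFully(payload);                          // the real work: read the data
        long actual = checked.getChecksum().getValue(); // by-product of that read
        long expected = in.readLong();                  // footer value
        if (actual != expected) {
            throw new IOException("checksum mismatch: expected " + expected + ", got " + actual);
        }
        return payload;
    }

    // Pattern 2: a dedicated pass over every payload byte whose only purpose
    // is to compute the checksum; the data itself is not used.
    static void verifyWholeFile(byte[] file) throws IOException {
        if (file.length < Long.BYTES) {
            throw new IOException("file too short to contain a footer");
        }
        int payloadLength = file.length - Long.BYTES;
        CRC32 crc = new CRC32();
        crc.update(file, 0, payloadLength);
        // ByteBuffer is big-endian by default, matching DataOutputStream.writeLong.
        long expected = ByteBuffer.wrap(file, payloadLength, Long.BYTES).getLong();
        if (crc.getValue() != expected) {
            throw new IOException("checksum mismatch: expected " + expected + ", got " + crc.getValue());
        }
    }
}
```

In this sketch both methods detect the same corruption. The practical difference is cost: the first adds no extra I/O beyond the read that was needed anyway, while the second touches every byte purely for verification, which is the overhead a storage layer with its own page-level checksums might want to make optional.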