Thanks for your quick response, Mike.

The database has its own raw page management on top of the OS page
management, and it most likely has its own page-level checksum. That is why
I want to avoid checksumming again at the Lucene Directory level.

Checksumming is certainly good. I like this pattern (rewrite
openChecksumInput to fit the real case):
inputStream = directory.openChecksumInput(...);
// at the end check checksum, as by-product
CodecUtil.checkFooter(...)
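
To make the by-product idea concrete, here is a minimal sketch in plain Java.
It is not Lucene code: it uses java.util.zip.CheckedInputStream as a stand-in
for a checksumming input, and the class name, method names and the 8-byte
footer layout are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical illustration only; not Lucene code.
class ByProductChecksumSketch {

    // Appends an 8-byte CRC32 footer to the payload (assumed layout).
    static byte[] withFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Long.BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // Reads the payload because the caller needs it; the CRC accumulates
    // as a side effect of that read and is compared with the footer at the end.
    static byte[] readAndVerify(byte[] file) {
        if (file.length < Long.BYTES) {
            throw new IllegalStateException("file too short to hold a footer");
        }
        int payloadLength = file.length - Long.BYTES;
        byte[] payload = new byte[payloadLength];
        CRC32 crc = new CRC32();
        try (CheckedInputStream in =
                 new CheckedInputStream(new ByteArrayInputStream(file), crc)) {
            int off = 0;
            while (off < payloadLength) {
                int n = in.read(payload, off, payloadLength - off);
                if (n < 0) {
                    throw new IllegalStateException("unexpected end of file");
                }
                off += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long expected = ByteBuffer.wrap(file, payloadLength, Long.BYTES).getLong();
        if (crc.getValue() != expected) {
            throw new IllegalStateException("checksum mismatch");
        }
        return payload;
    }
}
```

No extra pass over the data happens here: the only read is the one that
returns the payload to the caller.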

But I do not like this pattern:
CodecUtil.checksumEntireFile(..). Its only purpose is to compute the
checksum by reading all the data; the checksum is not a by-product of a read
that was needed anyway.
If the design/API were pluggable, with the current behavior as the default,
it would be good enough for various scenarios.
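
Roughly what I have in mind by "pluggable", as a sketch only. None of these
types exist in Lucene: SketchDirectory, NoOpChecksum and newChecksum() are
hypothetical names, and only java.util.zip.Checksum and CRC32 are real.

```java
import java.util.function.Supplier;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical design sketch; none of these types exist in Lucene.
class PluggableChecksumSketch {

    // A Checksum that ignores all input, for a store that already
    // verifies its own pages.
    static final class NoOpChecksum implements Checksum {
        @Override public void update(int b) {}
        @Override public void update(byte[] b, int off, int len) {}
        @Override public long getValue() { return 0L; }
        @Override public void reset() {}
    }

    // Stand-in for a Directory that decides which Checksum to hand out.
    static class SketchDirectory {
        private final Supplier<Checksum> checksumFactory;

        // Default behavior: CRC32, as today.
        SketchDirectory() {
            this(CRC32::new);
        }

        // A custom directory can plug in its own implementation.
        SketchDirectory(Supplier<Checksum> checksumFactory) {
            this.checksumFactory = checksumFactory;
        }

        Checksum newChecksum() {
            return checksumFactory.get();
        }
    }

    // Code that needs a checksum asks the directory instead of hard-coding CRC32.
    static long checksumOf(SketchDirectory dir, byte[] data) {
        Checksum checksum = dir.newChecksum();
        checksum.update(data, 0, data.length);
        return checksum.getValue();
    }
}
```

With this shape the writer and the reader would both have to take the
checksum from the same directory, otherwise the stored footer and the
computed value would not match.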




Best regards,
Duke
If not now, when? If not me, who?

On Tue, Dec 6, 2016 at 6:36 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> We have learned over time not to trust the underlying store to
> correctly record the bytes we wrote to it.
>
> This is why checksumming is very strongly built into Lucene at this
> point.  If you disable checksumming, when bits do flip, you get exotic
> exceptions at search time that might look like Lucene bugs and can
> cost a lot of time to explain.
>
> It's not just the BlockTreeTermsReader; many other codec components
> check the checksum with CodecUtil.checkFooter at search time.
>
> Can you explain why it's necessary to remove it for your database
> files based Directory?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Dec 6, 2016 at 5:25 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> > Hi all,
> >
> > I'm customizing Lucene Directory, which extends o.a.l.store.Directory based
> > on database files. I do not need checksum again on IndexInput and
> > IndexOutput.
> >
> > But in BlockTreeTermsReader constructor, following code open a
> > hard-coded BufferedChecksumIndexInput to checksum on raw IndexInput. I have
> > to use CRC32 on IndexOutput to make through it. Is there any more graceful
> > way to do checksum, such as let Directory construct a checksum instance
> > instead of API Directory.openChecksumInput ?
> >
> >
> >       String indexName = IndexFileNames.segmentFileName(segment,
> >           state.segmentSuffix, TERMS_INDEX_EXTENSION);
> >       indexIn = state.directory.openInput(indexName, state.context);
> >       CodecUtil.checkIndexHeader(indexIn, TERMS_INDEX_CODEC_NAME, version,
> >           version, state.segmentInfo.getId(), state.segmentSuffix);
> >       CodecUtil.checksumEntireFile(indexIn);
> >
> >
> >
> >
> > Best regards,
> > Duke
> > If not now, when? If not me, who?
>
