Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-25 Thread Duke DAI

RE: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Uwe Schindler
Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Tuesday, December 6, 2016 12:30 PM
> To: Duke DAI
> Cc: Lucene Users
> Subject: Re: Hardcoded checksum mechanism in BlockTreeTerm

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Michael McCandless
I see. Bits can also be flipped by the network as they travel to/from the DB; the end-to-end checksum Lucene does now would catch that. Anyway, the BlockTree index file that is being entirely checksummed is very small. And using the first pattern is not easy for it because it n

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Duke DAI
Thanks for your quick response, Mike. The database has its own raw page management over the OS page management, and most likely its own page-level checksums; that's why I want to avoid checksumming at the Lucene Directory level. Certainly checksums are good, and I like the pattern (rewrite openChecksumIn

Re: Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Michael McCandless
We have learned over time not to trust the underlying store to correctly record the bytes we wrote to it. This is why checksumming is very strongly built into Lucene at this point. If you disable checksumming, when bits do flip, you get exotic exceptions at search time that might look like Lucene
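
To make the point above concrete, here is a minimal JDK-only sketch of the general technique: store a CRC32 footer with the bytes, then recompute it on read so that a flipped bit is caught up front instead of surfacing later as an odd exception. This is not Lucene's code; the class and method names (ChecksumFooterDemo, writeWithFooter, verify) and the 8-byte footer layout are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical demo class; not part of Lucene.
class ChecksumFooterDemo {

    /** Returns payload followed by an 8-byte footer holding its CRC32. */
    static byte[] writeWithFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Long.BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    /** Recomputes the CRC32 over the payload and compares it with the stored footer. */
    static boolean verify(byte[] stored) {
        int payloadLength = stored.length - Long.BYTES;
        if (payloadLength < 0) {
            return false; // too short to contain a footer at all
        }
        CRC32 crc = new CRC32();
        crc.update(stored, 0, payloadLength);
        long expected = ByteBuffer.wrap(stored, payloadLength, Long.BYTES).getLong();
        return crc.getValue() == expected;
    }

    public static void main(String[] args) {
        byte[] stored = writeWithFooter("terms index".getBytes(StandardCharsets.UTF_8));
        System.out.println(verify(stored)); // true: bytes are as written

        stored[3] ^= 0x01; // simulate a single bit flipped by the store or the network
        System.out.println(verify(stored)); // false: the corruption is detected
    }
}
```

Because the check covers the bytes as the reader finally sees them, it does not matter whether the flip happened on disk, inside the backing store, or in transit.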

Hardcoded checksum mechanism in BlockTreeTermsReader

2016-12-06 Thread Duke DAI
Hi all, I'm customizing a Lucene Directory that extends o.a.l.store.Directory and is backed by database files. I do not need a second checksum on IndexInput and IndexOutput. But in the BlockTreeTermsReader constructor, the following code opens a hard-coded BufferedChecksumIndexInput to checksum the raw IndexInput. I
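
As a rough analogy for the wrapper described in this message, the sketch below uses only the JDK: java.util.zip.CheckedInputStream stands in for a checksumming input layered over a raw one. It is not the BlockTreeTermsReader code, and the names ChecksummingReadDemo and checksumWhileReading are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical demo class; a JDK analogy, not Lucene's implementation.
class ChecksummingReadDemo {

    /** Drains the raw stream through a checksumming wrapper and returns the CRC32 of the bytes read. */
    static long checksumWhileReading(InputStream raw) throws IOException {
        try (CheckedInputStream in = new CheckedInputStream(raw, new CRC32())) {
            byte[] buffer = new byte[4096];
            while (in.read(buffer) != -1) {
                // A real reader would parse the bytes here; the wrapper
                // updates the checksum as a side effect of every read.
            }
            return in.getChecksum().getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        long viaWrapper = checksumWhileReading(new ByteArrayInputStream(data));

        CRC32 direct = new CRC32();
        direct.update(data, 0, data.length);
        System.out.println(viaWrapper == direct.getValue()); // true
    }
}
```

The wrapper is built by the caller around whatever raw stream it is handed, so the code that supplies the stream has no way to opt out. That is the sense in which a wrapper constructed directly in a reader is hard-coded from the point of view of a custom store.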