On Mon, May 11, 2009 at 2:06 PM, Babak Farhang <farh...@gmail.com> wrote:
> I am not familiar with the details of CFS, but I didn't interpret
> Michael's comment to mean that there is actually any rewriting going
> on here. The problem here appears to be one of translating the
> encrypted/compressed file position to the uncompressed file position.
> Am I reading this right?

Actually, CompoundFileWriter does seek back and overwrite bytes it had
previously written. TermInfosWriter does the same thing, and so does
ChecksumIndexOutput, which is currently used only when writing the
segments_N file.

If we could fix all these places, e.g. by storing this metadata
separately (say, in the segments file), then we could deprecate and
remove IndexOutput.seek entirely, which would be a nice simplification.

> If in fact so, then a simple solution would be to push down all the
> encoding logic into the RAF implementation itself. The "append-only"
> RAF implementation would maintain a decoded view of the file. This
> decoded view would include the (virtual) decoded file position. In
> that case, CFS could be oblivious to the actual RAF implementation.

Right. As long as the IndexOutput API implements getFilePointer() such
that I can take the returned value, later pass it to IndexInput.seek,
and end up back at the same spot, that's all Lucene would need.

I.e., Lucene should never assume the long returned by getFilePointer is
the actual byte offset in the file. Instead, it's a value private to
the IndexOutput/IndexInput impl.

Mike
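
To make the getFilePointer/seek contract above concrete, here is a minimal sketch in plain Java. It does not use Lucene's classes: AppendOnlyOutput, BlockInput, the 4-byte block size, and the XOR "encoding" are all hypothetical stand-ins for a real encrypting or compressing implementation. The only point is that the long returned by getFilePointer() is an opaque token (block index plus offset within the decoded block), not a physical byte offset, and that the writer never seeks backwards.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class OpaquePointerDemo {

    /** Hypothetical append-only output: encodes data block by block, never seeks back. */
    static final class AppendOnlyOutput {
        private static final int BLOCK_SIZE = 4;               // tiny on purpose, for the demo
        private final List<byte[]> blocks = new ArrayList<>(); // encoded blocks, in write order
        private final byte[] current = new byte[BLOCK_SIZE];   // decoded bytes not yet flushed
        private int used = 0;

        void writeByte(byte b) {
            current[used++] = b;
            if (used == BLOCK_SIZE) flushBlock();
        }

        private void flushBlock() {
            byte[] encoded = new byte[used + 1];
            encoded[0] = (byte) used;                          // one-byte length header
            for (int i = 0; i < used; i++) {
                encoded[i + 1] = (byte) (current[i] ^ 0x5A);   // stand-in for real encoding
            }
            blocks.add(encoded);
            used = 0;
        }

        /** Opaque token: (block index << 16) | offset inside the decoded block. */
        long getFilePointer() {
            return ((long) blocks.size() << 16) | used;
        }

        List<byte[]> close() {
            if (used > 0) flushBlock();
            return blocks;
        }
    }

    /** Hypothetical matching input: only it knows how to interpret the token. */
    static final class BlockInput {
        private final List<byte[]> blocks;
        private int block = 0;
        private int offset = 0;

        BlockInput(List<byte[]> blocks) { this.blocks = blocks; }

        void seek(long pointer) {
            block = (int) (pointer >>> 16);
            offset = (int) (pointer & 0xFFFF);
        }

        byte readByte() {
            byte[] encoded = blocks.get(block);
            byte b = (byte) (encoded[1 + offset] ^ 0x5A);      // skip header, then decode
            offset++;
            if (offset == (encoded[0] & 0xFF)) {               // end of block: advance
                block++;
                offset = 0;
            }
            return b;
        }
    }

    /** Writes first, records a pointer, writes second, then seeks back and re-reads second. */
    static String roundTrip(String first, String second) {
        AppendOnlyOutput out = new AppendOnlyOutput();
        for (byte b : first.getBytes(StandardCharsets.US_ASCII)) out.writeByte(b);
        long pointer = out.getFilePointer();
        for (byte b : second.getBytes(StandardCharsets.US_ASCII)) out.writeByte(b);

        BlockInput in = new BlockInput(out.close());
        in.seek(pointer);
        byte[] buf = new byte[second.length()];
        for (int i = 0; i < buf.length; i++) buf[i] = in.readByte();
        return new String(buf, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", "world"));
    }
}
```

A caller such as a compound-file writer would only store the token and hand it back to seek later. It could not do arithmetic on it, which is exactly the restriction the contract implies.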