[
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038136#comment-13038136
]
Michael McCandless commented on LUCENE-3126:
--------------------------------------------
bq. Patch does not handle all files well (few tests fail). Apparently, the .del
file should not be rolled into the .cfs.
Right, .del files never appear inside a CFS.
bq. SegmentMerger.createCompoundFile does this by default, however it's only
called from code that ensures no deletions exist. Would have been nice if this
method documented it.
Please add comments to this! It's non-obvious ;)
bq. Also, I think *.s<num> should not be rolled into .cfs (those are the
separate norms files). I don't know how to create such files in the first place
(thought they're of old format, but 3.1 indexes have them also), and
TestBackCompat fails.
Right, these too only live outside a CFS. You create them by opening a
writable IndexReader (I know: confusing!) and calling setNorm, then closing it.
They are not only for old indices... 4.0 creates them too.
bq. Is there a way to identify those files? Is it safe to check if the file
extension starts w/ IndexFileNames.SEPARATE_NORMS_EXTENSION? Feels hacky to me.
Hackish though it seems (I agree), I think that's the only way?
SegmentInfo.hasSeparateNorms is equally hacky...
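
For illustration only, the extension check discussed above could look roughly
like the sketch below. CfsFileFilter and its methods are hypothetical names, not
Lucene API. The two constants are assumed to mirror
IndexFileNames.DELETES_EXTENSION and IndexFileNames.SEPARATE_NORMS_EXTENSION,
and the example file names are assumptions about the naming scheme. It uses a
slightly stricter test than a plain starts-with: the extension must be "s"
followed only by digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Lucene): decides which segment files may be rolled into a .cfs. */
final class CfsFileFilter {
    // Assumed to match IndexFileNames.DELETES_EXTENSION and SEPARATE_NORMS_EXTENSION.
    static final String DELETES_EXTENSION = "del";
    static final String SEPARATE_NORMS_EXTENSION = "s";

    // A separate norms extension is "s" followed by a field number, e.g. "s0" (assumed naming).
    private static final Pattern SEPARATE_NORMS =
        Pattern.compile(Pattern.quote(SEPARATE_NORMS_EXTENSION) + "\\d+");

    /** Returns the text after the last '.', or "" if the name has no dot. */
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** True for deletion files and separate norms files, which live outside a CFS. */
    static boolean staysOutsideCompoundFile(String fileName) {
        String ext = extension(fileName);
        return ext.equals(DELETES_EXTENSION) || SEPARATE_NORMS.matcher(ext).matches();
    }

    /** Keeps only the files that are candidates for the compound file. */
    static List<String> filesForCompoundFile(List<String> segmentFiles) {
        List<String> result = new ArrayList<>();
        for (String f : segmentFiles) {
            if (!staysOutsideCompoundFile(f)) {
                result.add(f);
            }
        }
        return result;
    }
}
```

Anchoring the match on digits avoids accidentally excluding some other extension
that merely begins with "s", at the cost of baking in the assumed naming scheme.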
bq. Another thing, I think in order to avoid shared doc stores (and whatever
other old-format stuff), since it's only an optimization, that the code should
copy into CFS only if the segment version is on or after 3.1 (that is
StringHelper.getVersionComparator().compare(info.getVersion(), "3.1") >= 0).
Shared doc stores, yes, but the separate del docs / norms are produced by all
versions.
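
To make the "on or after 3.1" test in the quoted proposal concrete, here is a
minimal stand-alone sketch of a dotted-version comparison. It is not the
comparator returned by StringHelper.getVersionComparator(); the class and method
names are hypothetical. It assumes every component is a plain integer (a version
string such as "2.x" would throw NumberFormatException) and, as a further
assumption, treats a null version as older than any minimum.

```java
import java.util.Comparator;

/** Hypothetical stand-in for a version comparator; compares "3.1", "3.10", "4.0" numerically. */
final class DottedVersionComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        int n = Math.max(as.length, bs.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components count as 0, so "3.1" equals "3.1.0".
            int x = i < as.length ? Integer.parseInt(as[i]) : 0;
            int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Assumption: a segment with no recorded version is treated as too old. */
    static boolean isOnOrAfter(String segmentVersion, String minimum) {
        if (segmentVersion == null) {
            return false;
        }
        return new DottedVersionComparator().compare(segmentVersion, minimum) >= 0;
    }
}
```

Comparing component by component matters because a plain string comparison
would order "3.10" before "3.2".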
More generally: does addIndexes properly refuse to import a too-old index? We
should throw IndexFormatTooOldException in this case? (And maybe also
IndexFormatTooNewException?)
> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't
> already
> ---------------------------------------------------------------------------------
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming
> segments. However, if IndexWriter's MP wants to create CFS (in general),
> there's no reason not to turn the incoming non-CFS segments into CFS. We
> anyway copy them, and if MP is not against CFS, we should create a CFS out of
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes
> MP.noCFSRatio) b/c like I wrote above, we anyway copy it. However, if you
> think otherwise, speak up :).
> I'll take a look at this in the next few days.