[ https://issues.apache.org/jira/browse/LUCENE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469894#comment-16469894 ]
Erick Erickson edited comment on LUCENE-8264 at 5/10/18 4:36 AM:
-----------------------------------------------------------------
OK, we've pretty well disposed of the whole N-2 -> N upgrade issue; it ain't
gonna happen. There are still two other cases where this would be useful:
1> N-1 -> N
2> adding DocValues without re-indexing
Of the two, <2> is probably the more immediately useful. I've seen a lot of
clients in the field get hurt when they realize that they'd have been better
off with docValues but didn't have them turned on.
Since I'm working on TMP (TieredMergePolicy), that's where I'm focusing. How to
implement? A new method on MergePolicy that's a no-op for everything except
TMP? See the discussion at LUCENE-8004, but the gist is:
1> some new methods on MergePolicy that returned information from the concrete
policy like default max merge segments (don't particularly like that). Callers
would have to "do the right thing", which is trappy.
OR
2> a new method on MergePolicy like {{findRewriteAllSegments}} that is
essentially {{findForcedMerges}} but makes some extra decisions. Currently a
pass-through for everything except TMP.
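To make option <2> concrete, here is a minimal sketch of the pass-through idea. Everything in it is hypothetical: {{PolicySketch}}, {{MergeAllSketch}} and {{PerSegmentRewriteSketch}} are simplified stand-ins, not Lucene's MergePolicy API, and "rewrite each segment on its own" is only one assumed example of an "extra decision".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for a merge policy; NOT Lucene's MergePolicy. */
abstract class PolicySketch {
    /** Groups segment names into merges (each inner list is one merge). */
    abstract List<List<String>> findForcedMerges(List<String> segments);

    /** Hypothetical new hook: by default a plain pass-through to findForcedMerges. */
    List<List<String>> findRewriteAllSegments(List<String> segments) {
        return findForcedMerges(segments);
    }
}

/** A policy that does not override the new hook, so it simply passes through. */
class MergeAllSketch extends PolicySketch {
    @Override
    List<List<String>> findForcedMerges(List<String> segments) {
        List<List<String>> merges = new ArrayList<>();
        merges.add(new ArrayList<>(segments)); // one merge containing every segment
        return merges;
    }
}

/** A policy that makes an "extra decision" (assumed here): rewrite each segment singly. */
class PerSegmentRewriteSketch extends MergeAllSketch {
    @Override
    List<List<String>> findRewriteAllSegments(List<String> segments) {
        List<List<String>> merges = new ArrayList<>();
        for (String segment : segments) {
            merges.add(List.of(segment)); // singleton merge rewrites the segment in place
        }
        return merges;
    }
}
```

With this shape, callers always call the one new method and never need to know which concrete policy they have, which avoids the "callers have to do the right thing" trap of option <1>.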
Or is the right thing to do here to create, say, a new MergePolicy
{{AddDocValuesBecaseYouDidntReadTheManualAboutWhyDocValuesWereAGoodThingMergePolicy}}?
Off the top of my head it would take (somehow) a list of fields to add
DocValues to and then "do the right thing". I don't have any details worked out
yet; I want to discuss before diving in.
The requirement is that in a distributed system I can issue one command that'll
fix this everywhere I care about. I don't really have a clue how it'd deal with
being applied twice in a row, merging some segments with DocValues and some
without, etc.
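One way to picture the "applied twice" and mixed-segment cases is to select segments by what they lack instead of rewriting everything unconditionally. This is only a sketch under stated assumptions: {{SegmentSketch}} and {{RewriteSelectionSketch}} are hypothetical types, not Lucene classes, and it assumes a segment can report which fields already have DocValues. It covers selection only, not the rewrite itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a segment; not a Lucene class. */
final class SegmentSketch {
    final String name;
    final Set<String> docValuesFields; // assumed: fields that already have DocValues here

    SegmentSketch(String name, Set<String> docValuesFields) {
        this.name = name;
        this.docValuesFields = docValuesFields;
    }
}

final class RewriteSelectionSketch {
    /** Returns the segments missing DocValues for at least one of the wanted fields. */
    static List<SegmentSketch> segmentsNeedingRewrite(
            List<SegmentSketch> segments, Set<String> wantedFields) {
        List<SegmentSketch> selected = new ArrayList<>();
        for (SegmentSketch segment : segments) {
            if (!segment.docValuesFields.containsAll(wantedFields)) {
                selected.add(segment);
            }
        }
        return selected;
    }
}
```

Under those assumptions, once every selected segment has been rewritten, a second invocation selects nothing, so issuing the command twice in a row would be a no-op, and segments that already have the DocValues are left alone.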
was (Author: erickerickson):
See comment 9-May.
> Allow an option to rewrite all segments
> ---------------------------------------
>
> Key: LUCENE-8264
> URL: https://issues.apache.org/jira/browse/LUCENE-8264
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
>
> For the background, see SOLR-12259.
> There are several use-cases that would be much easier, especially during
> upgrades, if we could specify that all segments get rewritten.
> One example: Upgrading 5x->6x->7x. When segments are merged, they're
> rewritten into the current format. However, there's no guarantee that a
> particular segment _ever_ gets merged so the 6x->7x upgrade won't necessarily
> be successful.
> How many merge policies support this is an open question. I propose to start
> with TMP and raise other JIRAs as necessary for other merge policies.
> So far the usual response has been "re-index from scratch", but that's
> increasingly difficult as systems get larger.