[
https://issues.apache.org/jira/browse/LUCENE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130262#comment-14130262
]
Robert Muir commented on LUCENE-5940:
-------------------------------------
{quote}
are there technical issues here i'm unaware of beyond creating and maintaining
the backwards compat tests?
something outside of the codec mechanism that causes problems?
{quote}
There are plenty. First of all, maintaining back compat codecs has a real cost
to improving Lucene in the future: if, e.g., I want to make a change to
the codec API, I have to deal with tons of medieval index formats. The same
goes for structural changes like making docvalues updatable (Shai had to fight
a lot here). Even simple code refactoring is expensive because it's
just a ton of code.
Also, the old codecs lag behind on features. They might not support various
features like offsets in the postings, payloads in the term vectors, missing
bitsets for docvalues, or whole data structure types
(SORTED_SET/SORTED_NUMERIC), or even whole parts of the index (3.x has no
docvalues at all). They are missing various useful statistics, etc. These are
just the ones I've worked on myself recently; there are more, and there are more
coming (like Mike's range prefix feature). This makes things like testing
difficult.
Backwards compat drags around a lot of stuff for a long time (see the packed
ints API), which makes the code more complex and harder to work with and change.
It prevents and discourages real improvements to Lucene.
There are plenty of bugs in the back compat; the last few indexes have been
riddled with them, some of them bad. It's undertested, overcomplex, and
undermaintained. Again, it's not sexy stuff to work on, and nobody wants to
improve it.
Finally, users want to have more options, but until we can minimize this
backwards compat, I'm personally going to push back very hard on any "options",
because we simply cannot take on more back compat. So the codec API goes mostly
wasted. Maybe we should rename it the "backcompat" API, because that's all it's
currently good for. Backcompat hurts users in this case. If we didn't
have so many ancient formats, we could instead provide (and actually support)
"breadth", such as various options for the way to encode data, so users
really can take advantage of it.
> change index backwards compatibility policy.
> --------------------------------------------
>
> Key: LUCENE-5940
> URL: https://issues.apache.org/jira/browse/LUCENE-5940
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
>
> Currently, our index backwards compatibility is unmanageable. The length of
> time in which we must support old indexes is simply too long.
> The index back compat works like this: everyone wants it, but there are
> frequently bugs, and when push comes to shove, it's not a very sexy thing to
> work on/fix, so it's hard to get any help.
> Currently our back compat "promise" is just a broken promise, because we
> cannot actually guarantee it for these reasons.
> I propose we scale back the length of time for which we must support old
> indexes.