[ https://issues.apache.org/jira/browse/LUCENE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192255#comment-14192255 ]
Tim Allison commented on LUCENE-5317:
-------------------------------------
I added my latest source code and standalone jars, built to work with 4.10.2,
to my lucene-addons [repo|https://github.com/tballison/lucene-addons] in case
anyone wants to try the code as is. There may be surprises.
The next step is to turn back to the lucene5317 branch in my fork and update
the trunk code.
The biggest functional difference between the original patch in October and the
current working code is that I added multivalued field handling.
> [PATCH] Concordance capability
> ------------------------------
>
> Key: LUCENE-5317
> URL: https://issues.apache.org/jira/browse/LUCENE-5317
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/search
> Affects Versions: 4.5
> Reporter: Tim Allison
> Labels: patch
> Fix For: 4.9
>
> Attachments: LUCENE-5317.patch, concordance_v1.patch.gz
>
>
> This patch enables a Lucene-powered concordance search capability.
> Concordances are extremely useful for linguists, lawyers and other analysts
> performing analytic search rather than traditional snippeting/document
> retrieval tasks. By "analytic search," I mean that the user wants to browse
> every occurrence of a term (or at least the top n) in a subset of documents
> and see the words before and after each one.
> Concordance technology is far simpler and less interesting than IR relevance
> models/methods, but it can be extremely useful for some use cases.
> Traditional concordance sort orders are available (sort on words before the
> target, words after, target then words before and target then words after).
> Under the hood, this runs SpanQuery's getSpans() and reanalyzes the text to
> obtain character offsets. There is plenty of room for optimizations and
> refactoring.
> Many thanks to my colleague, Jason Robinson, for input on the design of this
> patch.
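For anyone unfamiliar with concordances, here is a minimal sketch of the windowing and sort orders described above. It is not the patch's implementation: it uses plain whitespace tokenization in place of Lucene analysis and SpanQuery, and the names ConcordanceSketch, Window, concordance, sortByPost and sortByPre are hypothetical, chosen only for illustration. The "words before" sort compares the word nearest the target first, which is my assumption about the usual concordance convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a plain-Java keyword-in-context sketch, not the LUCENE-5317 code.
final class ConcordanceSketch {

    /** One hit: the words before the target, the target itself, and the words after. */
    static final class Window {
        final String pre;
        final String target;
        final String post;

        Window(String pre, String target, String post) {
            this.pre = pre;
            this.target = target;
            this.post = post;
        }

        @Override
        public String toString() {
            return pre + " [" + target + "] " + post;
        }
    }

    /** Collects up to `width` words on each side of every occurrence of `term`. */
    static List<Window> concordance(String text, String term, int width) {
        // Assumption: whitespace tokenization stands in for a real Analyzer.
        String[] tokens = text.trim().split("\\s+");
        List<Window> windows = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equalsIgnoreCase(term)) {
                continue;
            }
            int start = Math.max(0, i - width);
            int end = Math.min(tokens.length, i + width + 1);
            String pre = String.join(" ", Arrays.copyOfRange(tokens, start, i));
            String post = String.join(" ", Arrays.copyOfRange(tokens, i + 1, end));
            windows.add(new Window(pre, tokens[i], post));
        }
        return windows;
    }

    /** Sort on the words after the target, case-insensitively. */
    static void sortByPost(List<Window> windows) {
        windows.sort(Comparator.comparing((Window w) -> w.post.toLowerCase()));
    }

    /** Sort on the words before the target, comparing the nearest word first. */
    static void sortByPre(List<Window> windows) {
        windows.sort(Comparator.comparing((Window w) -> reversedWords(w.pre)));
    }

    // Reverses word order so that the word adjacent to the target leads the sort key.
    private static String reversedWords(String s) {
        List<String> words = new ArrayList<>(Arrays.asList(s.split(" ")));
        Collections.reverse(words);
        return String.join(" ", words).toLowerCase();
    }

    public static void main(String[] args) {
        List<Window> hits =
            concordance("the quick fox saw the lazy dog and the quick cat", "the", 2);
        sortByPost(hits);
        for (Window w : hits) {
            System.out.println(w);
        }
    }
}
```

The "target then words before" and "target then words after" orders would follow the same pattern, with the target as the primary sort key. Character offsets and multivalued fields are left out of this sketch entirely.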
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)