I have the following task that I need to implement in .NET. I get a block of
text and need to assess whether this text is mostly readable or a bunch of
unreadable garbage. This text is generated by processes like OCR. I am not
looking to detect or correct small errors. Instead, I need to "triage
Hi Everyone,
Is there a straightforward way to take a Boolean Query created by the
Lucene Query Parser and convert it to a Span Query?
Ideally I'd like to take any ANDed clauses and require them to occur
within $SPAN of the other ANDs.
I can't quite wrap my head around how to solve the prob
Give us an example of what you are really trying to match.
SpanNearQuery takes a list of clauses, which can be SpanTermQuery to match a
single term or SpanNearQuery to match a nested span. You can specify the
maximum distance between terms/spans - use nesting if you want to change
that distanc
Well I was hoping that someone knew of a recursive solution to
rewriting Boolean queries of arbitrary depth.
I suppose if I can rewrite
"london olympics" AND (football OR soccer) NOT nfl
into
"London Olympics" within_5_words_of (football or soccer)
not_within_5_words_of nfl
Then I should be ab
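For what it's worth, here is one possible shape for a recursive rewrite of that kind. It is only a rough sketch and deliberately does not use Lucene: the types Leaf, Group and Occur, the rewrite method, and the near/or/not text notation are all made up for illustration. It also assumes that optional clauses sitting next to required ones can simply be dropped, which may not be what you want.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free model of a parsed boolean query. None of these
// types are Lucene classes, and the output is plain text, not a SpanQuery.
final class SpanRewriteSketch {

    enum Occur { MUST, SHOULD, MUST_NOT }

    interface Node {}

    /** Leaf: a single term or quoted phrase, kept as raw text. */
    static final class Leaf implements Node {
        final String text;
        Leaf(String text) { this.text = text; }
    }

    /** Inner node: a group of clauses, each tagged with how it must occur. */
    static final class Group implements Node {
        final List<Occur> occurs = new ArrayList<>();
        final List<Node> children = new ArrayList<>();
        Group add(Occur occur, Node child) {
            occurs.add(occur);
            children.add(child);
            return this;
        }
    }

    /** Recursively renders the tree as a nested span-style expression. */
    static String rewrite(Node node, int slop) {
        if (node instanceof Leaf) {
            return ((Leaf) node).text;
        }
        Group group = (Group) node;
        List<String> must = new ArrayList<>();
        List<String> should = new ArrayList<>();
        List<String> mustNot = new ArrayList<>();
        for (int i = 0; i < group.children.size(); i++) {
            // Rewrite the child first, so nesting of any depth is handled.
            String sub = rewrite(group.children.get(i), slop);
            switch (group.occurs.get(i)) {
                case MUST: must.add(sub); break;
                case SHOULD: should.add(sub); break;
                default: mustNot.add(sub); break;
            }
        }
        String include;
        if (!must.isEmpty()) {
            // Assumption: optional clauses next to required ones are dropped.
            include = must.size() == 1
                    ? must.get(0)
                    : "near([" + String.join(", ", must) + "], " + slop + ")";
        } else if (!should.isEmpty()) {
            include = oneOrAny(should);
        } else {
            throw new IllegalArgumentException(
                    "a group with only negative clauses has no span to anchor on");
        }
        if (mustNot.isEmpty()) {
            return include;
        }
        return "not(" + include + ", " + oneOrAny(mustNot) + ", " + slop + ")";
    }

    /** A single alternative stays as-is; several are wrapped in or([...]). */
    private static String oneOrAny(List<String> parts) {
        return parts.size() == 1
                ? parts.get(0)
                : "or([" + String.join(", ", parts) + "])";
    }

    public static void main(String[] args) {
        // "london olympics" AND (football OR soccer) NOT nfl
        Node query = new Group()
                .add(Occur.MUST, new Leaf("\"london olympics\""))
                .add(Occur.MUST, new Group()
                        .add(Occur.SHOULD, new Leaf("football"))
                        .add(Occur.SHOULD, new Leaf("soccer")))
                .add(Occur.MUST_NOT, new Leaf("nfl"));
        // prints: not(near(["london olympics", or([football, soccer])], 5), nfl, 5)
        System.out.println(rewrite(query, 5));
    }
}
```

In a real implementation the three string templates would presumably map onto SpanNearQuery, SpanOrQuery and SpanNotQuery, but that mapping, and how a phrase should become a span, is left open here.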
So I've taken my first shot at solving my problem using the three functions
below.
When I set the slop to 10 it produces the following result:
This BooleanQuery +content:"london olympics" +(+content:football
+content:or +content:soccer) -content:nfl
becomes this SpanQuery: spanNot(spanNear([spanN