h, which will probably
result in another post sooner or later.
Daniel
"the entry isn't in the
cache" from "the entry is in the cache but it's null".
Daniel
Probably fine anyway, as I don't really want to encourage the former
way of formatting it as the latter is more concise. Actually it could
even be...
tag:(a AND (b OR c))
But I don't think my formatting logic is quite smart enough for that yet.
Daniel
"tag:a tag:b" and "tag:(a b)" both parse to the same node
structure (making it impossible to figure out which the user actually
used)?
Daniel
The API would explicitly pass the docBase for the IndexReader; this
would reduce the need to do the maths to determine the docBase
ourselves, and also make it possible to parallelise those
calls later.
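To sketch what I mean (this is roughly the shape the Collector API took in
Lucene 2.9; the class name is mine, not from any release):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;

// The framework hands each sub-reader's docBase to the callback, so
// local doc IDs map to index-wide IDs without us doing the arithmetic.
public class GlobalIdCollector extends Collector {
    private int docBase;

    public void setNextReader(IndexReader reader, int docBase) {
        this.docBase = docBase;  // supplied explicitly by the API
    }

    public void collect(int doc) {
        int globalDoc = docBase + doc;  // local ID + base = global ID
        // ... record globalDoc somewhere ...
    }

    public void setScorer(Scorer scorer) {
    }

    public boolean acceptsDocsOutOfOrder() {
        return true;
    }
}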
Daniel
Filter is now explicitly not
threadsafe. We weren't keeping any state in them anyway, but now we
will have to, so there is potential for a lot of new bugs if a filter
is somehow used by two queries running at the same time.
Daniel
a filter which only matches the last doc for each
term. Then I don't have to pay for the storage of a filter... but I
guess it will cost to build this filter anyway so I don't know if it's
practical yet.
I guess storing the filter on disk would be an easier way to go, with
the caveat
pretty large even if I use a BitSet. :-( Is there
any other way to go about it?
Daniel
adding SyntaxFormatter
to convert from QueryNode back to String.
3. What about going all the way from Query back to String? (My naive
answer to my own question here is that some QueryNodeProcessor may
perform an irreversible operation, making it impossible to do this,
but I thought I would throw it out there anyway.)
On Thu, Nov 19, 2009 at 16:01, Yonik Seeley wrote:
> On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll wrote:
>> But what if I want to find the highest? TermEnum can't step backwards.
>
> I've also wanted to do the same. It's coming with the new flex
of binary search, getting a TermEnum at different probe terms until I
find a term which still has terms above it, but none at or above the
first term for the next day?
Daniel
does not
exist is somewhat simpler.
Daniel
index
state (before adding docs.) When the IndexWriter was opened, another
reader was opened, so even though we thought we were closing both, it
turned out there were two readers and one writer, and we were only
closing one of the readers.
Daniel
time (though I was under the
impression that close() waited for merges and so forth to complete
before returning.)
Daniel
were using while providing no replacement
except for "write it yourself", the same as what happened when Hits
got canned.
Daniel
have at least all used the
same filter.)
Daniel
ly assuming a score of 1.0 for each hit, you would get something like...
1. "cool gaming laptop" => 3 (cool, gaming, "cool gaming")
2. "cool gaming lappy" => 3 (cool, gaming, "cool gaming")
3. "gaming laptop cool" => 2 (cool,
large number
of documents in the index.
It's a shame we don't have an inverted kind of HitCollector where we
can say "give me the next hit", so that we can get the best of both
worlds (like what StAX gives us in the XML world.)
Daniel
, it would have prevented the problem in its entirety as we
would have realised much sooner that it wasn't safe to override in the
beginning.
Daniel
easy to try 2.4.1 and
see if it has been fixed, but was there a bug along these lines in 2.3.2?
Daniel
tion.)
Daniel
Lucene, wouldn't a trivial analyser which breaks on commas be the
way to go?
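Something like this, say (untested sketch; the class name is made up):

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharTokenizer;
import org.apache.lucene.analysis.TokenStream;

// Trivial comma-splitting analyser: everything except a comma is part
// of a token. You may also want to trim whitespace around the tokens.
public class CommaAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        return new CharTokenizer(reader) {
            protected boolean isTokenChar(char c) {
                return c != ',';
            }
        };
    }
}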
Daniel
Michael McCandless wrote:
You could look at the docID of each hit, and compare to the .maxDoc() of
each underlying reader.
There is also MultiSearcher#subSearcher(int), which keeps working as you
add more searchers without having to do the maths yourself.
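For instance (sketch only; the variable names are made up):

// Map a hit's global docID back to the underlying searcher it came from.
int sub = multiSearcher.subSearcher(hitDocId);  // index of the sub-searcher
int localDoc = multiSearcher.subDoc(hitDocId);  // docID within that searcher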
Daniel
ontent" is more efficient.
Daniel
something similar.
So you would end up with a DoubleMetaphoneFilter, which you could then
use with PerFieldAnalyzerWrapper to have it apply only to the fields you
use that for.
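Something like this (sketch; the field name and the phonetic analyser
instance are placeholders):

// Everything uses the default analyser except the phonetic field.
PerFieldAnalyzerWrapper wrapper =
        new PerFieldAnalyzerWrapper(new StandardAnalyzer());
wrapper.addAnalyzer("name_phonetic", doubleMetaphoneAnalyzer);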
Daniel
't particularly
surprising that it isn't stored. ;-)
Daniel
It's sounding like an X-Y problem, so what are you actually trying to
achieve? It sounds like you don't want stemming (talking about "exact"
matches) yet you chose the snowball analyser (whose sole purpose is
stemming, unless I am mistaken...)
Daniel
age again instead of using
[:letter:] which is much more convenient.
Daniel
Basically I'm seeing some tokens come back with mixed digits and Hangul,
and I'm questioning the correctness of that.
Disclaimer: we're not performing any further processing of Korean in
subsequent filters at the current point in time, and I don't know the
language either.
The message deprecating it even says something like "it was originally designed for
GUI but was anyone even using it for that?" Some of us obviously were.
Daniel
in time, there is always the
need to get the "set of every match" for any given search eventually.
Maybe others have different opinions as they are working on webapps,
where the user is already expecting paging before they even see the
results page.
Daniel
If you only need to know the *number* of hits, and don't need the
hits themselves, then you should just use a custom HitCollector which
increments a counter. It will run much faster.
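For example (sketch against the 2.x HitCollector API):

// Count matching documents without materialising any hits.
final int[] count = { 0 };
searcher.search(query, new HitCollector() {
    public void collect(int doc, float score) {
        count[0]++;
    }
});
// count[0] now holds the total number of matches.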
Daniel
to grow?
Daniel
some time now, just not by default.
Daniel
Daniel
Kwon, Ohsang wrote:
Why do you use WildcardQuery? You don't need a wildcard. (maybe..)
Use a term query.
What if you need to match a literal wildcard *and* an actual wildcard? :-)
Daniel
UN_TOKENIZED.
Source code QFT:
} else if (index == Index.NO_NORMS) {
    this.isIndexed = true;
    this.isTokenized = false;
    this.omitNorms = true;
} ...
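So NO_NORMS already gives you the untokenised behaviour; for illustration
(the field name and value are made up):

// Equivalent in effect to UN_TOKENIZED, with norms switched off as well.
doc.add(new Field("id", value, Field.Store.YES, Field.Index.NO_NORMS));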
Daniel
you have to
support older text indexes.)
Daniel
s to search for "the". If it gives no results
then you won't find "or" either, without reindexing with stop words off.
Daniel
same issue with regex queries here and had to apply a workaround of that
sort.
Daniel
y as
long as you have a BufferedReader wrapped around the entire thing.
Daniel
category
Not surprising at all. This is what you actually want:
+(content:blah content:blah content:blah) +categoryId:2
Your original query's only REQUIRED constraint was that it match the
category.
Daniel
s you wrap it in a QueryFilter to cache the
result, but I found it to be "fast enough" even for relatively large
document sets.
Daniel
On Thursday 26 June 2008 15:09:44 java_is_everything wrote:
> Hi all.
>
> Is there a way to obtain the number of documents in the Lucene index
> (2.0.0), having a particular term indexed, much like what we do in a
> database ?
I suspect the normal way is a HitCollector which does nothing but increment
a counter.
On Monday 23 June 2008 18:08:29 Aditi Goyal wrote:
> Oh. For one moment I was elated to hear the news. :(
> Is there any way out?
*:* -"jakarta apache"
Or subclass QueryParser and override the getBooleanQuery() method to do this
behind the scenes using MatchAllDocsQuery.
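A sketch of that override (untested; this is against the 2.3-era API where
the clauses arrive as a Vector, and the class name is made up):

import java.util.Vector;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;

public class NegationFriendlyQueryParser extends QueryParser {
    public NegationFriendlyQueryParser(String field, Analyzer analyzer) {
        super(field, analyzer);
    }

    protected Query getBooleanQuery(Vector clauses, boolean disableCoord)
            throws ParseException {
        // If every clause is prohibited, add a match-all clause so the
        // purely negative query has something to subtract from.
        boolean allProhibited = !clauses.isEmpty();
        for (int i = 0; i < clauses.size(); i++) {
            if (!((BooleanClause) clauses.get(i)).isProhibited()) {
                allProhibited = false;
                break;
            }
        }
        if (allProhibited) {
            clauses.add(new BooleanClause(new MatchAllDocsQuery(),
                    BooleanClause.Occur.MUST));
        }
        return super.getBooleanQuery(clauses, disableCoord);
    }
}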
Daniel
On Monday 23 June 2008 16:21:17 Aditi Goyal wrote:
> I think wildcard (*) cannot be used in the beginning :(
Wrong:
http://lucene.apache.org/java/2_3_0/api/core/org/apache/lucene/queryParser/QueryParser.html#setAllowLeadingWildcard(boolean)
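i.e. something like (the analyzer is assumed):

QueryParser parser = new QueryParser("content", analyzer);
parser.setAllowLeadingWildcard(true);
Query q = parser.parse("*suffix");  // now parses instead of being rejected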
Daniel
On Saturday 21 June 2008 18:57:49 Sebastin wrote:
> Since i am maintaining more than 1.5 years of records in the windows 2003
> server, based on the user input. For example, if the user wants to display
> june 1 - june 15 folders and fetch the records from them. If the user wants
> to display may 1 - may 15
On Tuesday 10 June 2008 07:49:29 Otis Gospodnetic wrote:
> Hi Daniel,
>
> What makes you say that about language detection? Wouldn't that depend on
> the language detection approach or tool one uses and on the type and amount
> of content one trains language detector on? And what is the threshold
On Friday 30 May 2008 08:17:52 Alex wrote:
> Hi,
> other than the in memory terms (.tii), and the few kilobytes of opened file
> buffer, where are some other sources of significant memory consumption when
> searching on a large index ? (> 100GB). The queries are just normal term
> queries.
Norms: one byte per document per indexed field with norms enabled, which
adds up quickly at that scale.
On Monday 26 May 2008 02:25:40 Tom Conlon wrote:
> Hi Mark,
>
> For example:
> you have a content field (default) and you also have an 'attributes'
> field.
>
> I'd like to add multiple attributes for a given document rather than
> just one value and be able to somehow search on the attributes.
>
>
On Saturday 10 May 2008 20:32:42 legrand thomas wrote:
> I think I cannot use the WildcardQuery because the term shouldn't start
> with "*" or "?". Should I use a QueryParser? How can I do it?
WildcardQuery does permit a wildcard at the front, it's just much slower.
Also, QueryParser allows wildcards
On Thursday 01 May 2008 00:01:48 John Wang wrote:
> I am not sure how well lucene would perform with > 2 Billion docs in a
> single index anyway.
Even if they're in multiple indexes, the doc IDs being ints will still prevent
it going past 2Gi unless you wrap your own framework around it.
Daniel
On Thursday 03 April 2008 08:08:09 Dominique Béjean wrote:
> Hum, it looks like it is not true.
> Using a do-while loop makes the first terms.term().field() generate a null
> pointer exception.
Depends which terms method you use.
TermEnum terms = reader.terms();
System.out.println(terms.term());
On Tuesday 01 April 2008 18:51:55 Dominique Béjean wrote:
> IndexReader reader = IndexReader.open(temp_index);
> TermEnum terms = reader.terms();
>
> while (terms.next()) {
> String field = terms.term().field();
Gotcha: after calling terms() it's already pointing at the first term.
On Wednesday 19 March 2008 18:28:15 Itamar Syn-Hershko wrote:
> 1. Build a Radix tree (PATRICIA) and populate it with all search terms.
> Phrase queries will be considered as one big string, regardless their
> spaces.
>
> 2. Iterate through your text ignoring spaces and punctuation marks, and for
>
On Thursday 20 March 2008 07:22:27 Mark Miller wrote:
> You might think, if I only ask for the top 10 docs, don't i only read 10
> field values? But of course you don't know what docs will be returned as
> each search comes in...so you have to cache them all.
If it lazily cached one field at a time...
On Wednesday 19 March 2008 01:44:33 Ramdas M Ramakrishnan wrote:
> I am using a MultiFieldQueryParser to parse and search the index. Once I
> have the Hits and iterate thru it, I need to know the following?
>
> For every hit document I need to know under which indexed field was this
> hit originating?
On Monday 17 March 2008 19:38:46 Michael McCandless wrote:
> Well ... expungeDeletes() first forces a flush, at which point the
> deletions are flushed as a .del file against the just flushed
> segment. Still, if you call expungeDeletes after every flush
> (commit) then it's only 1 segment whose d
On Thursday 13 March 2008 19:46:20 Michael McCandless wrote:
> But, when a normal merge of segments with deletions completes, your
> docIDs will shift. In trunk we now explicitly compute the docID
> shifting that happens after a merge, because we don't always flush
> pending deletes when flushing
On Thursday 13 March 2008 00:42:59 Erick Erickson wrote:
> I certainly found that lazy loading changed my speed dramatically, but
> that was on a particularly field-heavy index.
>
> I wonder if TermEnum/TermDocs would be fast enough on an indexed
> (UN_TOKENIZED???) field for a unique id.
>
> Mostl
On Thursday 13 March 2008 15:21:19 Asgeir Frimannsson wrote:
> >I was hoping to have IndexWriter take an AnalyzerFactory, where the
> > AnalyzerFactory produces Analyzer depending on some criteria of the
> > document, e.g. language.
> With PerFieldAnalyzerWrapper, you can specify which analyzer is used for
> each field.
On Wednesday 12 March 2008 19:36:57 Michael McCandless wrote:
> OK, I think very likely this is the issue: when IndexWriter hits an
> exception while processing a document, the portion of the document
> already indexed is left in the index, and then its docID is marked
> for deletion. You can see
On Wednesday 12 March 2008 10:20:12 Michael McCandless wrote:
> Oh, so you do not see the problem with SerialMergeScheduler but you
> do with ConcurrentMergeScheduler?
[...]
> Oh, there are no deletions? Then this is very strange. Is it
> optimize that messes up the docIDs? Or, is it when you
On Wednesday 12 March 2008 09:53:58 Erick Erickson wrote:
> But to me, it always seems...er...fraught to even *think* about relying
> on doc ids. I know you've been around the block with Lucene, but do you
> have a compelling reason to use the doc ID and not your own unique ID?
From memory it was
On Tuesday 11 March 2008 19:55:39 Michael McCandless wrote:
> Hi Daniel,
>
> 2.3 should be no different from 2.2 in that docIDs only "shift" when
> a merge of segments with deletions completes.
>
> Could it be the ConcurrentMergeScheduler? Merges now run in the
> background by default and commit w
Hi all.
We're using the document ID to associate extra information stored outside
Lucene. Some of this information is being stored at load-time and some
afterwards; later on it turns out the information stored at load-time is
returning the wrong results when converting the database contents back.
On Monday 03 March 2008 05:40:39 Ghinwa Choueiter wrote:
> thank you. You were right. Indexing by "" does not do what I need.
>
> How would one represent a null index? Perhaps another way of asking the
> question is what query would return to me all the documents in the
> database (all-pass filter)
On Thursday 28 February 2008 01:52:27 Erick Erickson wrote:
> And don't iterate through the Hits object for more than 100 or so hits.
> Like Mark said. Really. Really don't ...
Is there a good trick for avoiding this?
Say you have a situation like this...
- User searches
- User sees first N hits
On Wednesday 27 February 2008 03:33:53 Itamar Syn-Hershko wrote:
> I'm still trying to engineer the best possible solution for Lucene with
> Hebrew, right now my path is NOT using a stemmer by default, only by
> explicit request of the user. MoreLikeThis would only return relevant
> results if I wi
On Wednesday 27 February 2008 00:50:04 [EMAIL PROTECTED] wrote:
> Looks that this is really hard-coded behaviour, and not Analyzer-specific.
The whitespace part is coded into QueryParser.jj, yes. So are the quotes
and : and other query-specific things.
> I want to search for directories with to
On Tuesday 26 February 2008 01:05:27 [EMAIL PROTECTED] wrote:
> Hi all,
>
> I have the behaviour that when I search with Luke (version 0.7.1, Lucene
> version 2.2.0) inside an arbitrary field, the QueryParser creates a
> PhraseQuery when I type in
> ~ termA/termB (no "...")
> When
On Tuesday 19 February 2008 21:08:59 [EMAIL PROTECTED] wrote:
> 1. IndexSearcher with a MultiReader will search the indexes
> sequentially?
Not exactly. It will fuse the indexes together such that things like TermEnum
will merge the ones from the real indexes, and will search using those
composites.
On Monday 04 February 2008 21:51:39 Michael McCandless wrote:
> Even pre-2.3, you should have seen gains by adding threads, if indeed
> your hardware has good concurrency.
>
> And definitely with the changes in 2.3, you should see gains by
> adding threads.
With regards to this, I have been wondering...
On Friday 25 January 2008 19:26:44 Paul Elschot wrote:
> There is no way to do exact phrase matching on OCR data, because no
> correction of OCR data will be perfect. Otherwise the OCR would have made
> the correction...
>
The problem I see with a fuzzy query is that if you have the fuzziness set
Hi all...
Just out of interest, why does field:* go via getWildcardQuery instead of
getPrefixQuery? It seems to me that it should be treated as a prefix of "",
but am I missing something important?
Also, I've noticed that although RangeQuery was optimised in a recent version
of Lucene, PrefixQuery
On Tuesday 08 January 2008 00:52:35 Developer Developer wrote:
> here is another approach.
>
> StandardAnalyzer st = new StandardAnalyzer();
> StringReader reader = new StringReader("text to index...");
> TokenStream stream = st.tokenStream("content", reader);
>
> Then use the Field
On Monday 07 January 2008 11:35:59 chris.b wrote:
> is it possible to add a document to an index and, while doing so, get the
> terms in that document? If so, how would one do this? :x
My first thought would be: when adding fields to the document, use the Field
constructors which accept a TokenStream.
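Sketch of what I mean (I believe 2.3 added CachingTokenFilter, which lets
you look at the terms and still feed the same tokens to the field):

// Tokenise once, record the terms, then index the same token stream.
TokenStream stream = analyzer.tokenStream("content",
        new StringReader(text));
CachingTokenFilter cached = new CachingTokenFilter(stream);
// ... iterate 'cached' here to collect the terms, then reset() it ...
Document doc = new Document();
doc.add(new Field("content", cached));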
Hi all.
We discovered that fullwidth letters are not treated as letters and fullwidth
digits are not treated as digits.
This in itself is probably easy to fix (including the filter for normalising
these back to the normal versions) but while sanity checking the blocks in
StandardTokenizer.jj I found some s
On Thursday 13 December 2007 23:07:49 游泳池的鱼 wrote:
> hehe ,you can do a test with PrefixQuery rewrite method,and extract terms .
> like this
> Query query = prefixQuery.rewrite(reader);
> Set<Term> set = new HashSet<Term>();
> query.extractTerms(set);
> for (Term term : set) {
>     System.out.println(term);
> }
>
> It will give you
On Wednesday 12 December 2007 03:34:08 Helmut Jarausch wrote:
> Hi,
>
> I know how to set DEFAULT_OPERATOR_AND for an individual QueryParser
> object (after creation).
>
> Since I always want this to be set, is there a means to set a (global)
> option such that any QueryParser object has this default
Hi all.
Suppose you have a text index with a field used for deduplication, and then
you later add a second field with further information that might also be used
for deduplication. We'll call them A and B for the sake of brevity.
If I have only a current text index, then I can use (a:foo AND b
On Thursday 08 November 2007 02:41:50 Lukasz Rzeszotarski wrote:
> I must write application, where client wants to make very complex query,
> like:
> find word "blabla" in (Content_1 OR Content_2) AND (...) AND (...)...
> and as a result he expects not only documents, but also information in
On Tuesday 02 October 2007 12:25:47 Johnny R. Ruiz III wrote:
> Hi,
>
> I can't seem to find a way to delete duplicates in a Lucene index. I have a
> unique key so it seems to be straightforward. But I can't find a simple
> way to do it except for putting each record in the index into HashMap.
>
On Monday 10 September 2007 23:53:06 AnkitSinghal wrote:
> And if i make the field as UNTOKENIZED i cannot search for queries like
> host:xyz.* .
I'm not sure why that wouldn't work. If the stored token is xyz.example.com,
then xyz.* will certainly match it.
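For example (sketch; the field and term are made up):

// Against an UN_TOKENIZED field storing "xyz.example.com", this matches.
Query query = new WildcardQuery(new Term("host", "xyz.*"));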
Daniel
otherwise need to do to ensure consistency.
Daniel
more than others.
Daniel
This sort of thing only works with
untokenised fields, unless you have somewhere else you can store the
untokenised version which is quicker to iterate over.
Daniel
way to do it for speed.
For the least code you can probably do...
BooleanFilter f = new BooleanFilter();
f.add(new FilterClause(RangeFilter.More("field", ""),
        BooleanClause.Occur.MUST_NOT));
Filter cached = new CachingWrapperFilter(f);
Daniel
really appreciate any help!
Why don't you just have your analyser lowercase the field at indexing time? I
don't see why you would use a FuzzyQuery for something where a normal
PhraseQuery should suffice.
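Sketch of the indexing-time approach (untested; the class name is made up):

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordTokenizer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;

// Index the whole field value as a single lowercased token; a normal
// term or phrase query on lowercased input then matches regardless of
// the original case.
public class LowercaseKeywordAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        return new LowerCaseFilter(new KeywordTokenizer(reader));
    }
}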
Daniel
mapping table from actual
document ID to the sequence ID. (e.g. if documents 1000 through 1999 are
deleted, there would be an entry in the table saying that ID 2000 starts at
document ID 1000.)
I just wanted to put the question out in case someone has solved the exact
same problem already.
Daniel
On Friday 22 June 2007 09:34:44 Tanya Levshina wrote:
> ramWriter.addDocument(doc);
>
> fsWriter.addIndexes(new Directory[] {ramDir,});
As IndexWriter already does this internally, I'm not exactly sure why you're
trying to implement it again on the outside.
On Tuesday 19 June 2007 11:03:25 Erik Hatcher wrote:
> > Good way to discourage potential contributors I suppose.
>
> And (most) spammers, which is really the point of requiring a
> profile.
I believe this is called "throwing the baby out with the bath water."
Dan
up, and click on the "Create Profile" button.
Good way to discourage potential contributors, I suppose.
Daniel
FAQ page claims to be immutable, however.
Daniel
this question on
it?
Daniel
, this will
probably work well.
(It gets harder if you want to do it inside ordinary text content as well.)
Daniel
by removing those
> "new" lines, but I don't want to maintain a custom
> lucene package.
>
> Please help!
Can you not use RegexQuery instead?
Daniel
any code example for this...
See IndexReader#termDocs(), termDocs#seek(Term), termDocs#skipTo(int) and
termDocs#freq().
If you need to do it for multiple documents and terms, you probably want to
process them in sorted order, to reduce redundant creation of TermDocs objects.
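A minimal sketch (the field and term are made up):

// Walk every document containing the term, reading the frequency in each.
TermDocs td = reader.termDocs();
td.seek(new Term("content", "lucene"));
while (td.next()) {
    System.out.println("doc=" + td.doc() + " freq=" + td.freq());
}
td.close();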
Daniel
>
> But I've been wrong before.
Ah, I see. A feature I haven't toyed with just yet.
That's rather nice. :-)
Daniel
and end offset and highlight identically.
What *would* be tricky is phrase queries since inserting a new term breaks the
offsets AFAIK.
Although, I suppose you could always store the concepts in a different field
and not modify the analyser being used for the text itself.
Daniel