> : Do you have any suggestions on how to solve this in a
> "neat" way? And is
>
> Have you looked at the NumberTools class?
>
> As I recall it generates strings that are always printable, but as a
> result (of using fewer characters) are also always longer than the
> corresponding value from Solr'
Up to now I have only needed to search a single index, but now I will have many
index shards to search across. My existing search maintained cached filters for
the index as well as a cache of my own unique ID fields in the index, keyed by
Lucene DocId.
Now I need to search multiple indices, I
On Thu, Aug 28, 2008 at 11:16 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> Yonik Seeley wrote:
>>
>> I wasn't originally going to add a Field.Index at all for omitNorms,
>> but Doug suggested it.
>> The problem with this type-safe way of doing things is the
>> combinatorial explosion.
>
>
: Is there a way? I could make the documents containing field A have
: another field B equals some flag, and then query for that flag, but
: that would be kind of inefficient.
There is space efficiency and there is time efficiency.
If there was only one value for field B (i.e.: "true" or "yes") t
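The reason a constant flag term is cheap at query time can be seen in a toy sketch of an inverted index (plain Java, no Lucene; the names and the "hasA:true" term are hypothetical, purely for illustration):

```java
import java.util.*;

public class FlagFieldDemo {
    // Toy inverted index: term -> list of doc ids (a posting list).
    final Map<String, List<Integer>> index = new HashMap<>();

    void addDoc(int docId, List<String> terms) {
        for (String t : terms) {
            index.computeIfAbsent(t, k -> new ArrayList<>()).add(docId);
        }
    }

    List<Integer> docsWith(String term) {
        return index.getOrDefault(term, Collections.emptyList());
    }

    // Documents that contain field A also get the constant flag term
    // "hasA:true", so retrieving them is one hash lookup plus reading
    // one posting list, instead of scanning every document.
    static List<Integer> demo() {
        FlagFieldDemo idx = new FlagFieldDemo();
        idx.addDoc(1, Arrays.asList("A:foo", "hasA:true"));
        idx.addDoc(2, Arrays.asList("B:bar"));
        idx.addDoc(3, Arrays.asList("A:baz", "hasA:true"));
        return idx.docsWith("hasA:true");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 3]
    }
}
```

The space cost is one extra term plus one posting per matching document, which is usually a good trade for the single-lookup query.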
: Do you have any suggestions on how to solve this in a "neat" way? And is
Have you looked at the NumberTools class?
As I recall it generates strings that are always printable, but as a
result (of using fewer characters) are also always longer than the
corresponding value from Solr's NumberUt
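The core idea NumberTools relies on can be shown in plain Java (a sketch of the concept only; NumberTools itself uses a more compact radix-36 encoding and also handles negative values): pad numbers to a fixed width so lexicographic string order matches numeric order.

```java
public class PadDemo {
    // Zero-pad a non-negative long to a fixed width so that
    // String.compareTo orders the strings the same way as the numbers.
    static String pad(long value) {
        return String.format("%019d", value);
    }

    public static void main(String[] args) {
        // Plain strings sort wrongly: "7" > "42"; padded strings sort right.
        System.out.println("7".compareTo("42") > 0);      // true
        System.out.println(pad(7).compareTo(pad(42)) < 0); // true
    }
}
```

This is exactly why the encoded strings come out longer than the raw values: the fixed width is what makes range queries and sorting work on string terms.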
Hello,
Let's say we have different document types, and one type of document
only contains field A.
How can I make a query so that I get all the documents that only have field A?
There is a get all documents query, but that would get all the
documents whether they contain field A or not.
Is there
I have implemented a MapReduce job to merge a bunch of Lucene 2.3.2
indices together, but the reducers randomly fail with the following
unchecked exception after thousands of successful merges:
org.apache.lucene.index.MergePolicy$MergeException: segment "_0" exists
in external directory yet the Mer
Hi,
In our application, I want users to be able to search for the updates they
make almost immediately. Hence, whenever they update, I spawn a thread
immediately to index. However, when the load on the application is very high
the number of threads spawned increases, and this results in "cannot
Guy, ulimit -n is your friend. As is the compound index format.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Guy Gavriely <[EMAIL PROTECTED]>
> To: "java-user@lucene.apache.org"
> Sent: Thursday, September 11, 2008 10:28:34 AM
> Subjec
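The ulimit advice above can be checked and applied like this (the value 4096 is only an example; the right limit depends on your system, and raising it may require adjusting the hard limit first):

```shell
# Show the current per-process limit on open file descriptors
ulimit -n

# Raise it for the current shell session (example value); silently
# ignore failure if the hard limit is lower
ulimit -n 4096 2>/dev/null || true

ulimit -n
```

The compound file format helps for the same reason: it packs the many per-segment files into one, so far fewer descriptors are open at once.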
Hi Marie,
On 09/11/2008 at 4:03 AM, Marie-Christine Plogmann wrote:
> I am currently using the demo class IndexFiles to index some
> corpus. I have replaced the StandardAnalyzer with a GermanAnalyzer.
> Here, indexing works fine.
> But if I specify a different stopword list that should be
> used, the tokeni
Hi All,
I vaguely remember discussions on the remote-ability of Lucene's
HitCollector-based search(). As far as I remember, it is not possible if I use
HitCollectors.
In lucene 3, we are doing away with a lot of search() variants, including
the ones that return Hits.
I would like to know which one o
Ok, I see the problems.
I will talk to my customer about this requirement. Perhaps he doesn't need it
anymore.
Again, thanks a lot to all, you saved my day!
Regards
Mirko
-Original Message-
From: Matthew Hall [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 11, 2008 16:
How many fields are you winding up with for each document? One for
each term?
And what is the higher-level task you're trying to accomplish? What
distinguishes *why* a certain term in a certain document should
boost a particular document? Perhaps if you explained the higher
level task someone woul
Hi Guy,
I think that isn't a problem related to fields. I experienced this kind of
error caused by a limitation of the underlying file system. The problem was
that I had too many InputStreams open that had never been closed. Please
check that in your code and tell us if it worked.
Markus.
2008
Really thanks Karsten and Ian Lea!!
You gave me very useful solutions.
I'm going to try the last one from Karsten:
Because you easily can use lucene with 1 field and 365 different tokens
(20080101, 20080102, ...20081231).
even though the solution of Ian Lea seems to be a very good one too and I'll
try it as well.
Luther
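Karsten's one-token-per-day scheme can be sketched like this (plain Java, using the modern java.time API for brevity; note that 2008 was a leap year, so a full-year field actually holds 366 tokens):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class DayTokens {
    // One yyyyMMdd token per day of the year, e.g. 20080101 ... 20081231.
    // Each available day of a room would be indexed as one such token,
    // so a date query becomes a handful of simple term queries.
    static List<String> tokensForYear(int year) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyyMMdd");
        List<String> tokens = new ArrayList<>();
        LocalDate d = LocalDate.of(year, 1, 1);
        while (d.getYear() == year) {
            tokens.add(d.format(fmt));
            d = d.plusDays(1);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> t = tokensForYear(2008);
        System.out.println(t.get(0) + " ... " + t.get(t.size() - 1));
        // 20080101 ... 20081231
    }
}
```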
RegexQuery might work, or how about splitting the digit string out into dates
and searching for them, e.g. 11000 could be stored as "avail: jan03 jan04
jan05" and a search for +avail:jan03 +avail:jan04 +avail:jan05 would get a
hit.
--
Ian.
On Thu, Sep 11, 2008 at 12:34 PM, luther bliss
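One way to read Ian's suggestion (an interpretation of the scheme, not his exact code): treat digit i of the availability string as day i+1 of the year, and emit a month-day token for every '1'.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AvailTokens {
    // Digit i of the availability string corresponds to day i+1 of the
    // year; every '1' ("available", an assumed convention) becomes a
    // token like "jan03" that can be indexed and matched with term queries.
    static List<String> toTokens(String digits, int year) {
        DateTimeFormatter fmt =
            DateTimeFormatter.ofPattern("MMMdd", Locale.ENGLISH);
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < digits.length(); i++) {
            if (digits.charAt(i) == '1') {
                LocalDate day = LocalDate.ofYearDay(year, i + 1);
                tokens.add(day.format(fmt).toLowerCase(Locale.ENGLISH));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(toTokens("11000", 2007)); // [jan01, jan02]
    }
}
```

A search for a stay then becomes a conjunction of term queries, one per required day, which Lucene handles efficiently.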
Ah.. that's a darn good point..
Though, that second bit of code you have there could be used at display
time for him to get the functionality that he wants. You could also
modify it somewhat, and apply it against the displayable part of the hit
he's getting back rather than the individual tok
>>That should give you the functionality you are looking for.
If I understand your suggestion correctly, it won't. The Highlighter uses a
tokenized version of the document text.
Simplistically it does the following pseudo code:
for all tokens in documentTokenStream,
if(queryTermsSet.contains
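Fleshed out, that pseudo code looks roughly like this (a simplified sketch, not the actual Highlighter source):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SimpleHighlighter {
    // Simplified version of what the Highlighter does: walk the tokenized
    // document text and wrap every token that appears in the set of
    // (already expanded) query terms. It matches whole tokens only, which
    // is why it cannot highlight just the "ll" inside "hallo".
    static String highlight(String text, Set<String> queryTerms) {
        StringBuilder out = new StringBuilder();
        for (String token : text.split(" ")) {
            if (out.length() > 0) out.append(' ');
            if (queryTerms.contains(token.toLowerCase())) {
                out.append("<b>").append(token).append("</b>");
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("hallo", "alle"));
        System.out.println(highlight("hallo welt alle zusammen", terms));
        // <b>hallo</b> welt <b>alle</b> zusammen
    }
}
```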
Well, you could certainly manipulate your search string, removing the
wildcard punctuations, and then use that for what you pass to the
highlighter.
That should give you the functionality you are looking for.
-Matt
mark harwood wrote:
Is this possible?
Not currently, the highlighter
Hi,
I have to index terms with different boosts, meaning that if the word A appears
in two documents one document will be ranked higher.
I've tried to index them by putting them in different fields and giving the
fields different boosts but I ran into too many files (caused by too many fields
I g
>> Is this possible?
Not currently, the highlighter works with a list of words (or words AND phrases
using the new span support) and highlights those.
To do anything else would require the highlighter to faithfully re-implement
much of the logic in all of the different query types (fuzzy, wildcar
It's a question of math and arithmetic, not a question about Lucene. There are
other good ways to deal with it.
Hi Luther,
your question:
"Is there a way to ask Lucene to search starting from a fixed position?"
the answer: no, not by standard search.
But you don't want to use your field for scoring, so this is a field to
filter results.
You could easily change RangeFilter for this purpose but the new filt
hi folks,
I'm new to Lucene and I'm looking for a way to search a substring that
starts at a fixed position.
It isn't a classical substring search because it's a bit weird.
I indexed a field that represents the availability of a room in a hostel during
1 year.
The field is composed by 365 digits and
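Outside the index, the check itself is just a character lookup at a fixed offset. A toy sketch (assuming the convention that '1' means "available", which is my guess, not stated above) shows why this is cheap per document but awkward to express as a standard Lucene query:

```java
public class FixedPositionCheck {
    // The field is a string of 365 digits, one per day of the year.
    // Availability for a given day is a fixed-position character test.
    static boolean availableOn(String digits, int dayOfYear) {
        return digits.charAt(dayOfYear - 1) == '1';
    }

    // A stay is possible only if every day in the range is available.
    static boolean availableRange(String digits, int fromDay, int toDay) {
        for (int d = fromDay; d <= toDay; d++) {
            if (!availableOn(digits, d)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String avail = "11000";
        System.out.println(availableRange(avail, 1, 2)); // true
        System.out.println(availableRange(avail, 1, 3)); // false
    }
}
```

This per-document test is exactly what a custom Filter would do, which is why the suggestions in this thread point toward filters or per-day tokens rather than substring queries.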
Ok, one final question:
If I query for "*ll*", the query is expanded to ("hallo" or "alle" or ...),
so the Highlighter will highlight the words "hallo" or "alle". But how can I
highlight only the original query, so only the "ll"? Is this possible?
Thanks a lot
Mirko
-Original Message
>
> Just try it, and you will find the answer.
>
Hi All,
In my project I use Hits from Searcher.search() for my query results. If I
am to move to Lucene 3's ways, I will have to use TopDocs I presume.
It'll be great if someone could guide me with some sort of skeleton code.
Also is it possible to cache the results like I do with Hits?
Anoth
You need to call rewrite on the query to expand it then give that version to
the highlighter - see the package javadocs.
http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/highlight/package-summary.html#package_description
Cheers
Mark
- Original Message
From: "Sertic M
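Conceptually, rewrite() replaces a wildcard query with an OR over the concrete terms it matches in the index's term dictionary; only then does the highlighter have real words to look for. This toy sketch (plain Java, no Lucene; names are hypothetical) mimics that expansion step:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RewriteDemo {
    // Mimics what rewriting a wildcard query does: scan the index's term
    // dictionary and collect every term matching the pattern. A highlighter
    // then works on these concrete terms, not on the "*ll*" pattern itself.
    static List<String> expand(String wildcard, List<String> termDictionary) {
        String regex = wildcard.replace("*", ".*").replace("?", ".");
        List<String> expanded = new ArrayList<>();
        for (String term : termDictionary) {
            if (term.matches(regex)) expanded.add(term);
        }
        return expanded;
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList("hallo", "alle", "welt");
        System.out.println(expand("*ll*", dict)); // [hallo, alle]
    }
}
```

This also explains the TooManyClauses exception mentioned earlier in the thread: each expanded term becomes one clause of the resulting BooleanQuery.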
Ok, I gave it a try, but I ran into this TooManyClauses Exception. I see that
wildcard queries are expanded before they are processed, and I see that I can
set the clauses count to Integer.MAX_VALUE, and queries can consume a lot of
memory,
but one final thing is still open: does a wildcard query
Hi,
I am currently using the demo class IndexFiles to index some corpus. I have
replaced the StandardAnalyzer with a GermanAnalyzer. Here, indexing works fine.
But if I specify a different stopword list that should be used, the
tokenization doesn't seem to work properly. Mostly some letters are missing at