customizable Solr really is (rather, the ease with which we can do it). Also,
Solr doesn't support QueryFilter out of the box (Hossman: there's nothing to
stop a Solr request handler from using QueryFilters if they want). How much
extra work is it?
Out of the box, Solr supports query filters.
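For reference, a minimal sketch of using Lucene's QueryFilter directly, which is what Hossman is pointing at; the index path and field names below are made up:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.QueryFilter;
import org.apache.lucene.search.TermQuery;

IndexSearcher searcher = new IndexSearcher("/path/to/index");
// restrict every search to documents whose "type" field is "book"
Filter filter = new QueryFilter(new TermQuery(new Term("type", "book")));
Hits hits = searcher.search(new TermQuery(new Term("body", "lucene")), filter);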
Thanks Dima
The first link is very nice, and I put a comment on it (take a look again),
but it has no decode method.
Anyway, I decided to use the Solr solution.
Thanks again :)
On 7/16/07, Dima May <[EMAIL PROTECTED]> wrote:
Mohammad,
see my 2 cents below,
Good luck.
D
On 7/16/07,
Thank you everyone for your response,
The size of our index is around 10GB. Our queries usually take similar response
times. The only exception is when we are updating our index and after the index
has been switched from the old one to a new one.
Yesterday, Grant and Hossman pointed us to Solr, and we
Are we sure about KeywordAnalyzer here? It is supposed to tokenize
the entire stream as a single token (useful for data like zip codes,
ids, and some product names).
In the scenario we are discussing, U.S. is just a token within the
text, and we would still like to leverage Standard
Use KeywordAnalyzer to leave "U.S." as-is and index it as-is.
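If only a dedicated field needs that treatment, PerFieldAnalyzerWrapper lets the rest of the document keep StandardAnalyzer. A rough sketch, with made-up field names and index path:

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// StandardAnalyzer everywhere except "abbrev", which keeps "U.S." whole
PerFieldAnalyzerWrapper analyzer =
    new PerFieldAnalyzerWrapper(new StandardAnalyzer());
analyzer.addAnalyzer("abbrev", new KeywordAnalyzer());

IndexWriter writer = new IndexWriter("/path/to/index", analyzer, true);
Document doc = new Document();
doc.add(new Field("abbrev", "U.S.", Field.Store.YES, Field.Index.TOKENIZED));
doc.add(new Field("body", "report on U.S. policy",
    Field.Store.YES, Field.Index.TOKENIZED));
writer.addDocument(doc);
writer.close();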
Otis
--
Lucene Consulting -- http://lucene-consulting.com/
- Original Message
From: crspan <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Saturday, July 14, 2007 5:18:59 PM
Subject: index U.K. U.S. U.N. U.V.
Would
Hi Murali (redirecting to the more appropriate java-user list)
Sounds doable. I'd go with FSDirectory (or even its memory mapped cousin)
instead of RAMDirectory - let the OS cache Lucene indices. I'm looking at a
search cluster with 3 times that many machines (but not as high-end as your 8
CP
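In code, that advice amounts to roughly the following; the path is a placeholder:

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// open the index off disk and let the OS page cache do the caching,
// instead of pulling everything into a RAMDirectory
Directory dir = FSDirectory.getDirectory("/path/to/index");
IndexSearcher searcher = new IndexSearcher(dir);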
Hi,
I'm getting an error message when trying to create my Lucene index files.
The error is as follows:
Failed to load Main-Class manifest attribute from
e:\lib\jakarta-regexp-1.3\jakarta-regexp-1.3.jar
When I clear the error message by clicking OK, it seems that the index
finishes running properly
It's in a contrib jar. Download the release:
http://www.apache.org/dyn/closer.cgi/lucene/java/ then look in the
contrib folder for a folder that has 'regex' in it for the correct jar.
- Mark
mhzmark wrote:
Hi, everybody.
I am new to Lucene technology. I've downloaded lucene-demos-2.2.0.jar
Hi, everybody.
I am new to Lucene technology. I've downloaded lucene-demos-2.2.0.jar and
lucene-core-2.2.0.jar. Then I added these files to my CLASSPATH, and now I can
successfully import classes like:
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
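A minimal end-to-end search with those classes might look like this; the index path and field name are placeholders:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

IndexSearcher searcher = new IndexSearcher("/path/to/index");
QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
Query query = parser.parse("lucene");
Hits hits = searcher.search(query);
System.out.println(hits.length() + " hits");
searcher.close();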
Often documents can be divided into "metadata" and "contents" sections. Say
you're indexing Web pages: you could index them with the HEAD data all in one
field and the BODY content in another, while also creating separate fields
for every HEAD field, e.g. TITLE etc.
At search time, you rewrite every qu
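One way to do that rewrite is MultiFieldQueryParser, sketched here with assumed field names:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

// expand the user's query across the metadata and contents fields
String[] fields = { "title", "head", "body" };
MultiFieldQueryParser parser =
    new MultiFieldQueryParser(fields, new StandardAnalyzer());
Query query = parser.parse("lucene");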
Another question would be what queries take the longest. Are your
response times pretty constant on a per-query basis or are there
outliers that could perhaps point to a different solution?
Finally, what is the size of your index? The total number of documents
is certainly useful, but so is the f
Mohammad,
see my 2 cents below,
Good luck.
D
On 7/16/07, Mohammad Norouzi <[EMAIL PROTECTED]> wrote:
Hello
I have a problem with range queries. For example, I have queries like
"field:[1 TO 25]" or "field:[1.1 TO 11.25]".
Currently these queries do not work. field:[20 TO 25] works fine, but when
Some of the data sets that we will be using have about 2 TB of data (90 million
web pages). In the snippets I will be generating, I would like to include the
words that are being queried, so I don't want to simply store the first 2 or
3 lines. I have looked at the HighlighterTest and I do believe that i
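For what it's worth, a sketch along the lines of HighlighterTest; the analyzer, field name, and sample query are assumptions, not your setup:

import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;

Query query = new TermQuery(new Term("body", "lucene"));
String text = "the full page text, stored or re-fetched";

Highlighter highlighter = new Highlighter(new QueryScorer(query));
TokenStream tokens =
    new StandardAnalyzer().tokenStream("body", new StringReader(text));
// up to 3 fragments containing the query words, joined with "..."
String snippet = highlighter.getBestFragments(tokens, text, 3, "...");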
Hello,
The issue is about Lucene 1.9. Can you test it with Lucene 2.2? Perhaps the
issue has already been addressed and solved...
Regards Ard
>
> Thank you for the reply Ard,
>
> The tokens exist in the index and are returned accurately, except for
> the offsets. In this case I am not dealing with
The issue continues to exist with nightly 146 from Jul 10, 2007.
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/146/
Ard Schrijvers wrote:
Hello,
The issue is about Lucene 1.9. Can you test it with Lucene 2.2? Perhaps the
issue has already been addressed and solved...
Regards Ard
Thank you for the reply Ard,
The tokens exist in the index and are returned accurately, except for
the offsets. In this case I am not dealing with the positions, so the
termvector is specified as using 'with_offsets'. I have left the term
position increment at its default. Looking at the exist
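For comparison, here is a sketch of reading the stored offsets back; the doc number and field name are illustrative:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.index.TermPositionVector;
import org.apache.lucene.index.TermVectorOffsetInfo;

IndexReader reader = IndexReader.open("/path/to/index");
// a field indexed with Field.TermVector.WITH_OFFSETS comes back as a
// TermPositionVector whose offsets can be inspected per term
TermFreqVector tfv = reader.getTermFreqVector(0, "contents");
TermPositionVector tpv = (TermPositionVector) tfv;
TermVectorOffsetInfo[] offsets = tpv.getOffsets(0); // offsets of first term
System.out.println(offsets[0].getStartOffset() + "-" + offsets[0].getEndOffset());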
Hello,
> Hi EVeryone,
>
> Thank you all for your replies.
>
> And reply to your questions Grant:
> We have more than 3 million documents in our index.
> We get more than 150,000 searches (queries) per day. We
> expect this not to go up.
Just curious, but suppose those 150,000 searches are don
Hello,
> Ard,
>
> I do have access to the URLs of the documents, but because I
> will be making
> short snippets for many pages (suppose it had about 20 hits
> per page and I
> need to make snippets for each of them) I was worried it would be
> inefficient to open each "hit", tokenize it, and th
Hello,
> Hi,
> I am storing custom values in the Tokens provided by a Tokenizer but
> when retrieving them from the index the values don't match.
What do you mean by retrieving? Do you mean retrieving terms, or do you mean
doing a search with words you know should be in, but you do not fi
Hello
I have a problem with range queries. For example, I have queries like
"field:[1 TO 25]" or "field:[1.1 TO 11.25]".
Currently these queries do not work. field:[20 TO 25] works fine, but when
both limits of the range have a different number of digits, the query won't
work. So the solution is NumberTools
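A minimal sketch with NumberTools, which encodes longs as fixed-width, lexicographically sortable strings. Note it only handles longs; scaling decimals like 1.1 up to a long first is my assumption, not something NumberTools does:

import org.apache.lucene.document.NumberTools;

// RangeQuery compares terms as strings, so "9" sorts after "25" unless
// numbers are padded to a fixed width before indexing
String min = NumberTools.longToString(1L);
String max = NumberTools.longToString(25L);
// index field values with longToString(value), then query [min TO max]
long back = NumberTools.stringToLong(min); // decodes back to 1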