Re: Big problem with big indexes

Ariel Isaac Romero Cartaya Mon, 16 Oct 2006 05:43:57 -0700

First af all, what is your machine architecture ??? Do you have a super pc
???
I'm running this on a dual xeon hyperthreading 2,4 Ghz, 1 Gb RAM,  HD SATA.
I Can not get the times results you get. I think  that the problem may be in
the structure of my index, for example I use a special analyzer for the
english language, filter stopwords, stemming and synonims detection with
wordnet but I need to do this to get the results I get. Could you help me to
fix this problem.
I'm desesperate.
Greetings


On 10/11/06, Erick Erickson <[EMAIL PROTECTED]> wrote:

Something's extremely not right <G>....

First of all, I'm running a 1.4G index on a single machine and getting
very
good results, under 10 seconds even for the most complex queries I'm
firing.
This is with 870,000 documents, and includes sorting by criteria other
than
relevance. And using span queries. And using wildcards that build their
own
filters.

So, something must be very different about how you are using lucene to get
such poor search times.

So, please tell us significantly more about the structure of your index
and
post the shortest example you can of your search code that demonstrates
the
problem, and maybe some of the wiser heads than mine can help out too.

There should be no need to put the index in RAM, the index is just not big
enough.

So, some of the things I think would help analyze your problems....

1> hardware and op systems you're running on. Including how much memory
you're allowing your JVM to have.
2> network topology. If you're running the searchers locally and just
storing the indexes on remote machines, you're possibly having network
latency problems. Personally, I don't think your problem is properly
addressed by splitting your index. 600MB of index is just not big enough
to
need this.
3> This *should* work on a local machine with just a single index. How
much
trouble would it be to create it so? Can you try that and see what
difference that makes?
4> how did you build your index? Is it optimized? Can you give us an idea
of
how many fields you are storing and some indication of the relative sizes
of
each? Mostly, I'm asking whether you have a bunch of small fields and some
other very large ones.
5> Put one of the indexes on your local machine and get a copy of Luke
(google luke lucene) and fire off a few queries via Luke and tell us what
kind of results you get. Actually, this is probably the first thing you
should try. If you get radically different results with Luke than your
code,
you can be pretty sure you're doing something out of the ordinary.
6> Timings of *only* the search code. By that I mean the time it takes for
searcher.search to complete. It's vaguely possible that the search is
fine,
but something you're doing when processing the results is taking forever.
I
have no evidence for this, of course, but it'd be a useful bit of
information.

I don't know if this helps much, but from your description, I think
there's
a fundamental, correctable problem because nobody would use the product if
it gave such poor search times. And lots of people use it.

Best
Erick

On 10/11/06, Ariel Isaac Romero Cartaya <[EMAIL PROTECTED]> wrote:
>
> Hi everybody:
>
>      I have a big problem making prallel searches in big indexes.
>      I have indexed with lucene over 60 000 articles, I have distributed
> the
> indexes in 10 computers nodes so each index not exceed the 60 MB of
size.
> I
> makes parallel searches in those indexes but I get the search results
> after
> 40 MINUTES !!! Then I put the indexes in memory to do the parallel
> searches
> But still I get the search results after 3 minutes !!! that`s to mucho
> time
> waiting !!!
>   How Can I reduce the time of search ???
>   Could you help me please ???
>   I need help !!!!!
>
> Greetings
>
>

Re: Big problem with big indexes

Reply via email to