A stepping stone to the above is that, in DB terms, a Lucene index is
only one table. It has a suite of indexing features that are very
different from database search. The features are oriented to searching
large bodies of text for "ideas" rather than concrete words. It
searches a lot faster than a
It is also possible to sort by function. This allows you to avoid
storing an array of 1 int for all documents. It is slower than the raw
Lucene sort.
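
For a rough sense of what such a per-document array costs, here is a minimal sketch in plain Java. It makes no Lucene calls; the class and method names are made up for illustration. It assumes one primitive value per document and ignores JVM and Lucene overhead.

```java
// Back-of-the-envelope estimate only; real usage is higher because of
// JVM object headers, index structures, and other caches.
final class SortMemoryEstimate {

    /** Bytes needed to hold one primitive sort value per document. */
    static long bytesForSortArray(long docCount, int bytesPerValue) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(docCount, (long) bytesPerValue);
    }

    /** Converts a byte count to GiB (2^30 bytes). */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }
}
```

With 1,000,000,000 documents, an int per document (4 bytes) comes to 4,000,000,000 bytes, about 3.7 GiB, and a long per document (8 bytes) comes to 8,000,000,000 bytes, about 7.45 GiB.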
On Wed, Aug 25, 2010 at 1:46 AM, Toke Eskildsen
wrote:
> On Wed, 2010-08-25 at 07:16 +0200, Shelly_Singh wrote:
>> I have 1 bln documents to sort.
> I'm using Windows and I'll try NIO, good idea, my app is already memory
> hungry in other areas so I guess MMapped is a no go, does it use heap or
> perm memory?
It uses address space for mapping the files into virtual memory (like a swap
file) - this is why it only works well for 64bit VMs. The
Uwe Schindler wrote:
That lock contention is fine there as this is the central point where all IO
is done. This does not mean that only one query is running in parallel, the
queries are still running in parallel. But there is one place where all IO
is waiting for one file descriptor. This is not
Uwe Schindler wrote:
Can you show us exactly where it blocks (e.g. use Ctrl-Break on Windows to
print a thread dump)? IndexSearcher's methods are not synchronized and
concurrent access is easily possible; all concurrent access is managed by the
underlying IndexReader. Maybe you synchronize somewhere
Can you show us exactly where it blocks (e.g. use Ctrl-Break on Windows to
print a thread dump)? IndexSearcher's methods are not synchronized and
concurrent access is easily possible; all concurrent access is managed by the
underlying IndexReader. Maybe you synchronize somewhere in your code?
-
U
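
To illustrate the point about sharing one searcher across threads, here is a minimal sketch in plain Java. The Searcher interface is a hypothetical stand-in, NOT the Lucene API, and all names are made up for illustration. It shows several threads calling one shared instance, plus the kind of synchronized wrapper in application code that would make the same calls run one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SharedSearcherDemo {

    /** Hypothetical stand-in for a thread-safe searcher; not a Lucene type. */
    interface Searcher {
        int search(String query) throws Exception;
    }

    /** Runs one search per query on a thread pool, all against the same shared instance. */
    static List<Integer> searchAll(Searcher shared, List<String> queries, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String q : queries) {
                Callable<Integer> task = () -> shared.search(q);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // results come back in query order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    /** Wraps a searcher in a single lock: every call now waits for the previous one. */
    static Searcher serialized(Searcher delegate) {
        Object lock = new Object();
        return q -> {
            synchronized (lock) {
                return delegate.search(q);
            }
        };
    }
}
```

If a thread dump shows threads waiting on a monitor like the one in serialized(), the blocking comes from the calling code rather than from the shared searcher.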
Hi
My multithreaded code was always creating a new IndexSearcher for every
search, but I changed over to the recommendation of creating just one
index searcher and keeping it between searches. Now I find if I have
multiple threads trying to search they
block on the search method(), only one c
The SOLR wiki has lots of good information, start there:
http://wiki.apache.org/solr/
Otherwise, see below...
On Wed, Aug 25, 2010 at 6:20 AM, Schreiner Wolfgang <
wolfgang.schrei...@itsv.at> wrote:
> Hi all,
>
> We are currently evaluating potential search frameworks (such as Hibernate
> Search
I see you are coming from the database world. To get a better understanding
of Lucene, I would suggest you use the free version of DBSight, which lets
you create a Lucene index with SQL after a few clicks.
Basically Lucene is more like a list of denormalized documents. So if you
change your database
Hi all,
We are currently evaluating potential search frameworks (such as Hibernate
Search) which might be suitable to use in our project (using Spring, JPA with
Hibernate) ...
I am sending this E-Mail in hope you can advise me on a few issues that would
help us in our decision making process.
On Wed, 2010-08-25 at 07:16 +0200, Shelly_Singh wrote:
> I have 1 bln documents to sort. So, that would mean 8 bln bytes (== 8 GB RAM).
> All I have is 8 GB on my machine, so I do not think the approach would work.
This implies that your numeric value can be more than 2 billion. Are you
sure
1 billion i.e. 1,000,000,000?
Either buy more RAM, lots more RAM, or skip Lucene sorting and do your
own sorting for the top n hits. You might also want to look into
sharding/distributing your index.
--
Ian.
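
The "do your own sorting for the top n hits" suggestion above can be sketched with a bounded heap in plain Java. This is only an illustration under assumptions: the Hit class, its fields, and the method name are hypothetical and not part of Lucene, and it assumes each hit's sort value can be obtained while iterating over the hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class TopNHits {

    /** A hit as (docId, sortValue); both field names are illustrative. */
    static final class Hit {
        final int docId;
        final long sortValue;

        Hit(int docId, long sortValue) {
            this.docId = docId;
            this.sortValue = sortValue;
        }
    }

    /** Returns the n hits with the smallest sortValue, in ascending order. */
    static List<Hit> smallestN(Iterable<Hit> hits, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap on sortValue: the head is the worst hit kept so far,
        // so memory stays at n entries no matter how many hits are scanned.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>(n, (a, b) -> Long.compare(b.sortValue, a.sortValue));
        for (Hit h : hits) {
            if (heap.size() < n) {
                heap.add(h);
            } else if (h.sortValue < heap.peek().sortValue) {
                heap.poll(); // drop the current worst
                heap.add(h);
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort((a, b) -> Long.compare(a.sortValue, b.sortValue));
        return result;
    }
}
```

The heap holds n entries rather than one value per document in the index, which is the point of the suggestion when the full sort array does not fit in RAM.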
On Wed, Aug 25, 2010 at 6:16 AM, Shelly_Singh wrote:
> I have 1 bln documents to sor