Re: Lucene vs. Database

Sascha Fahl Wed, 01 Oct 2008 00:56:45 -0700

Hi,

there is a big conceptual difference. Lucene works with an inverted index, which means that you have a list of words (terms), each with a list of all documents that contain that word (term). Databases usually work with normal indexes, which means that you have a document describing the words (terms) it contains. From my perspective you should use a database for the detail queries, for a very simple reason: querying a document by its database ID is quite fast. Depending on your DBMS, the document ID is the primary key, which means you have a fast data structure (often some B-tree-like structure) to access the data behind the ID. In Lucene you would have to query the inverted index, which would need to be organized in a way that lets you access it like the B-tree. On top of that, your database is probably written in C, which should make file access with your db a lot faster.
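
To make the difference concrete, here is a minimal sketch of the two views in plain Python. It is only a toy illustration and not how Lucene or any DBMS stores data internally; the sample documents and the names `docs`, `forward`, `postings` and `search` are made up for this example.

```python
from collections import defaultdict

# Hypothetical toy documents, keyed by a database-style id.
docs = {
    1: "lucene builds an inverted index",
    2: "a database uses a primary key index",
    3: "lucene and a database can work together",
}

# Forward view: document id -> the terms that document contains.
forward = {doc_id: text.split() for doc_id, text in docs.items()}

# Inverted view: term -> ids of all documents containing that term.
inverted = defaultdict(set)
for doc_id, terms in forward.items():
    for term in terms:
        inverted[term].add(doc_id)

# Keep each posting list sorted so results come back in a stable order.
postings = {term: sorted(ids) for term, ids in inverted.items()}


def search(term):
    """Return the ids of all documents containing the term."""
    return postings.get(term, [])


print(search("lucene"))    # documents that mention "lucene"
print(forward[2])          # terms of the document with id 2
```

The inverted view answers "which documents contain this word?" with a single lookup, while the forward view answers "what is in document 2?" directly by ID.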

So, in conclusion: use the db. First, the Lucene index was not designed for queries like the ID query, and second, file access with your db should give you better performance.
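
A rough sketch of that split, assuming an SQLite table as the database and a plain dictionary standing in for the search index. This is not the Lucene API. The table name `documents`, the sample rows and the helpers `search` and `load_details` are all hypothetical. The index keeps only what the result list shows (ID and title), and the details come from the database by primary key.

```python
import sqlite3

# Hypothetical rows: (id, title, body). The body is only shown on the detail page.
rows = [
    (1, "Lucene basics", "Long detail text about Lucene and its inverted index"),
    (2, "Database tuning", "Long detail text about primary keys and B-trees"),
]

# The database holds the full rows; id is the primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.executemany("INSERT INTO documents (id, title, body) VALUES (?, ?, ?)", rows)

# Stand-in for the search index: title and body are searchable,
# but only (id, title) is stored for display in the hit list.
search_index = {}
for doc_id, title, body in rows:
    for term in set((title + " " + body).lower().split()):
        search_index.setdefault(term, []).append((doc_id, title))


def search(term):
    """Return (id, title) pairs for the hit list."""
    return search_index.get(term.lower(), [])


def load_details(doc_id):
    """Fetch the full row from the database by primary key."""
    return conn.execute(
        "SELECT id, title, body FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()


for doc_id, title in search("lucene"):
    print(title, "->", load_details(doc_id)[2])
```

The index stays small because the long body text is never stored in it, and the detail page is one primary-key lookup.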



On 01.10.2008 at 09:43, agatone wrote:

Hi,
I already asked this question on the "lucene-general" list but was advised
to ask here too.

I'm working on a project that has a big database in the background (some
tables have about 1,500,000 rows). We decided to use Lucene for "faster"
search. Our search works like most searches: you enter a search string and
get a list of hits, each with a detail link. But there is a dilemma over
whether we should store more data in the index than is needed.

One side of the development team insists that we should use the Lucene index
as some kind of storage for data: when you get a hit and go to the details,
you again use Lucene to find the document that matches the selected ID and
take the data from the Lucene index. So in the end you end up copying
complete database tables into the Lucene index.
The other side insists on storing in the index only the data that is
displayed directly to the user in the search results list or is needed for
the search criteria. When you go to the details, you have the matching ID,
so you can pick up that row from the database by that ID rather than search
for it inside the Lucene index.
Can someone please describe the drawbacks and advantages of both approaches?
Actually, can someone write down what the actual benefit of Lucene itself
is, and where and when it applies, in a real production environment?
It would be great if anyone could share their experience with indexing and
searching large amounts of data.


Thank you
--
View this message in context: 
http://www.nabble.com/Lucene-vs.-Database-tp19755932p19755932.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Sascha Fahl
Softwareentwicklung

evenity GmbH
Zu den Mühlen 19
D-35390 Gießen

Mail: [EMAIL PROTECTED]








