Hi,
there is a big conceptual difference. Lucene works with an inverted index, which means that you have a list of words (terms), each with a list of all documents that contain that term. Databases usually work with normal indexes, which means you have a document describing the words (terms) it contains.

From my perspective you should use a database for the detail queries, for a very simple reason: querying a document by its database ID is quite fast. Depending on your DBMS, the document ID is the primary key, which means you have a fast data structure (often some kind of B-tree) for accessing the data behind the ID. In Lucene you would have to query the inverted index, which would need to be organized in a way that lets you access it like a B-tree. In addition, your database is probably written in C, which should make file access through your
db a lot faster.
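
To make the difference concrete, here is a minimal Python sketch. It is not Lucene code; the sample documents and the names build_inverted_index, search and fetch_by_id are made up for illustration, and a plain dict stands in for the database's primary key index.

```python
from collections import defaultdict

# Hypothetical sample data: document id -> text.
# This dict also stands in for a table keyed by its primary key.
documents = {
    1: "lucene builds an inverted index",
    2: "a database uses a primary key index",
    3: "lucene and a database can work together",
}


def build_inverted_index(docs):
    """Map each term to the sorted ids of the documents that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


inverted = build_inverted_index(documents)


def search(term):
    """Term -> document ids: the question an inverted index answers well."""
    return inverted.get(term.lower(), [])


def fetch_by_id(doc_id):
    """Id -> document: the question a primary key lookup answers well."""
    return documents[doc_id]


print(search("lucene"))
print(fetch_by_id(2))
```

The inverted index answers "which documents contain this term?", while the keyed store answers "what is in the document with this ID?". The detail page asks the second question.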

So, in conclusion: use the db. First, the Lucene index was not designed for queries like the ID query, and second, file access through your db should give you better performance.
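
A rough sketch of that split, using Python's built-in sqlite3 module as a stand-in for your database and a plain dict as a stand-in for the search index. The table name articles, its columns, the sample rows and the helper names are all assumptions made up for this example.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        (1, "Lucene basics", "Full text of the first article."),
        (2, "Database tuning", "Full text of the second article."),
        (3, "Lucene and databases", "Full text of the third article."),
    ],
)

# Stand-in for the search index: it holds only the id and the fields
# shown in the result list (here the title), not the full row.
search_index = {}
for row_id, title in conn.execute("SELECT id, title FROM articles ORDER BY id"):
    for term in title.lower().split():
        search_index.setdefault(term, []).append((row_id, title))


def search(term):
    """Return (id, title) pairs for the result list."""
    return search_index.get(term.lower(), [])


def load_details(row_id):
    """Fetch the complete row from the database by primary key."""
    cursor = conn.execute(
        "SELECT id, title, body FROM articles WHERE id = ?", (row_id,)
    )
    return cursor.fetchone()


hits = search("lucene")
print(hits)
print(load_details(hits[0][0]))
```

The result list is served from the index alone, and the database is only touched when the user opens a detail page.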


On 01.10.2008 at 09:43, agatone wrote:


Hi,
I already asked this question on the "lucene-general" list, but was advised
to ask here too.

I'm working on a project that has a big database in the background (some
tables have about 1500000 rows). We decided to use Lucene for "faster"
search. Our search works like most searches: you enter a search string and get a list of hits, each with a detail link. But there is a dilemma about whether we should store
more data in the index than is needed.

One side of the development team insists that we should use the Lucene index as
some kind of storage for the data, so when you get a hit and go to the details, you use Lucene again to find the document that matches the selected ID and take the data from the Lucene index. In the end you would be copying complete
database tables into the Lucene index.

The other side insists on storing in the index only the data that is displayed directly
to the user in the search results list or that is needed for the search
criteria. When you go to the details, you have the matching ID, so you can pick up that row from the database by that ID rather than searching for it inside the Lucene
index.

Can someone please describe the drawbacks and advantages of both approaches? Actually, can someone write down what the actual benefit of Lucene itself is, and where and when it shows,
in a real production environment?

It would be great if anyone could share their experience with
indexing and searching large amounts of data.


Thank you
--
View this message in context: 
http://www.nabble.com/Lucene-vs.-Database-tp19755932p19755932.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Sascha Fahl
Softwareentwicklung

evenity GmbH
Zu den Mühlen 19
D-35390 Gießen

Mail: [EMAIL PROTECTED]








