Hi,
there is a big conceptual difference. Lucene works with an inverted
index, which means that for every word (term) it keeps a list of all
documents that contain that term. Databases usually work with
conventional indexes, which means the document (row) is the entry point
and describes the words (terms) it contains.
From my perspective you should use the database for the detail queries,
for a very simple reason: fetching a document by its database ID is
quite fast. Depending on your DBMS, the document ID is the primary key,
which means there is a fast data structure (often some kind of B-tree)
for reaching the data behind the ID. In Lucene you would have to query
the inverted index, which would need to be organized so that you can
access it like that B-tree. In addition, your database is probably
written in C, which should make file access through the DB faster.
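To make the two index directions concrete, here is a minimal sketch in plain Python. It is not Lucene or any DBMS API; the sample documents and the names `forward_index`, `inverted_index`, `search` and `terms_of` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: document ID -> text.
documents = {
    1: "lucene builds an inverted index",
    2: "a database uses a primary key index",
    3: "lucene and a database can work together",
}

# Forward view: each document ID maps to the terms it contains.
forward_index = {doc_id: text.split() for doc_id, text in documents.items()}

# Inverted view: each term maps to the set of document IDs containing it.
inverted_index = defaultdict(set)
for doc_id, terms in forward_index.items():
    for term in terms:
        inverted_index[term].add(doc_id)


def search(term):
    """Term -> documents: the lookup an inverted index is built for."""
    return sorted(inverted_index.get(term, set()))


def terms_of(doc_id):
    """Document -> terms: a direct lookup by ID, as with a primary key."""
    return forward_index[doc_id]


print(search("lucene"))   # IDs of documents containing "lucene"
print(terms_of(2))        # terms of the document with ID 2
```

The term lookup and the ID lookup each hit the structure that was built for them; answering "which terms does document 2 contain?" from `inverted_index` alone would mean scanning every term.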
So, in conclusion: use the DB. First, the Lucene index was not designed
for queries like the ID lookup, and second, file access through your DB
should give you better performance.
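The split recommended here could look roughly like the following sketch. It assumes an in-memory SQLite database as a stand-in for the real DBMS and a plain dictionary as a stand-in for the Lucene index; the `articles` table, its rows, and the functions `search` and `load_details` are hypothetical.

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO articles (id, title, body) VALUES (?, ?, ?)",
    [
        (1, "Inverted indexes", "Full article text about term lookups."),
        (2, "Primary keys", "Full article text about B-tree lookups."),
    ],
)

# Stand-in for the search index: it holds only what the result list
# needs (ID and title), not the full row.
search_index = {}
rows = conn.execute("SELECT id, title FROM articles ORDER BY id").fetchall()
for row_id, title in rows:
    for term in title.lower().split():
        search_index.setdefault(term, []).append((row_id, title))


def search(term):
    """Return (id, title) pairs for the result list."""
    return search_index.get(term.lower(), [])


def load_details(row_id):
    """Fetch the full row by primary key for the details page."""
    return conn.execute(
        "SELECT id, title, body FROM articles WHERE id = ?", (row_id,)
    ).fetchone()


hits = search("keys")
details = load_details(hits[0][0])
print(hits)
print(details)
```

The index stays small because the `body` column never leaves the database, and the details page costs a single primary-key lookup.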
On 01.10.2008 at 09:43, agatone wrote:
Hi,
I already asked this question on the "lucene-general" list, but was
advised to ask here too.
I'm working on a project that has a big database in the background
(some tables have about 1500000 rows). We decided to use Lucene for
"faster" search. Our search works like most searches: you enter a
search string and get a list of hits, each with a link to the details.
But there is a dilemma over whether we should store more data in the
index than is needed.
One side of the development team insists that we should use the Lucene
index as a kind of data store, so that when you get a hit and go to the
details, you again use Lucene to find the document that matches the
selected ID and take the data from the Lucene index. In the end you end
up copying complete database tables into the Lucene index.
The other side insists on storing in the index only the data that is
displayed directly to the user in the search results list or needed for
the search criteria. When you go to the details, you have the matching
ID, so you can pick up that row from the database by that ID rather
than searching for it inside the Lucene index.
Can someone please describe the drawbacks and advantages of both
approaches?
Actually, can someone write down what the actual benefit of Lucene
itself is, and where and when it shows, in a real production
environment?
It would be great if anyone could share their experience with indexing
and searching large amounts of data.
Thank you
--
View this message in context:
http://www.nabble.com/Lucene-vs.-Database-tp19755932p19755932.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Sascha Fahl
Softwareentwicklung
evenity GmbH
Zu den Mühlen 19
D-35390 Gießen
Mail: [EMAIL PROTECTED]