Look at Compass, a wrapper for Lucene...

Regards, 
Aravind R Yarram
Enabling Technologies
Equifax Information Services LLC



"ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S)" <[EMAIL PROTECTED]>
07/29/2008 10:02 AM
Please respond to: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: Using lucene as a database... good idea or bad idea?

Hi Ian,
Yes, I see that we are discussing an "option" here.

But as I said before (the three parts of a search-based solution), I do not
know, but would like to know, how Lucene (Java only, not Nutch, Solr,
etc.) can be used as a datastore.

Basically, I am not able to connect "database" and Lucene Java. :)

Nagesh


On Tue, Jul 29, 2008 at 6:51 PM, Ian Lea <[EMAIL PROTECTED]> wrote:

> I don't think that anyone in this thread has said "should", just
> "could" - it is a valid option (IMHO).  Personally, I use it as a
> store for lucene related data because I know and like and trust it, it
> is already there for this project so no need to introduce another
> software dependency, and because it is blindingly fast.
>
>
> --
> Ian.
>
>
> On Tue, Jul 29, 2008 at 1:43 PM, ನಾಗೇಶ್ ಸುಬ್ರಹ್ಮಣ್ಯ (Nagesh S) <[EMAIL PROTECTED]> wrote:
> > The way I see it, search solutions (on whatever scale) have three
> > components: data aggregation, indexing/searching, and presentation of
> > results. I thought Lucene did only the second part.
> >
> > So I do not quite follow why Lucene should be used as a datastore.
> >
> > Nagesh
> >
> > On Tue, Jul 29, 2008 at 6:01 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> >
> >> I think the answer is it can be done and probably quite well.  I also
> >> think it's informative that Nutch does not use Lucene for this function,
> >> as I understand it, but that shouldn't stop you either.  You might also
> >> have a look at Apache Jackrabbit, which uses Lucene underneath as a
> >> content repository.
> >>
> >> -Grant
> >>
> >>
> >> On Jul 29, 2008, at 5:34 AM, Ganesh - yahoo wrote:
> >>
> >>> Hello all,
> >>>
> >>> I am also interested in this. I want to archive the content of the
> >>> document using Lucene.
> >>>
> >>> Is it a good idea to use Lucene as a storage engine?
> >>>
> >>> Regards
> >>> Ganesh
> >>>
> >>> ----- Original Message ----- From: "Ian Lea" <[EMAIL PROTECTED]>
> >>> To: <java-user@lucene.apache.org>
> >>> Sent: Tuesday, July 29, 2008 2:18 PM
> >>> Subject: Re: Using lucene as a database... good idea or bad idea?
> >>>
> >>>
> >>>> John
> >>>>
> >>>>
> >>>> I think it's a great idea, and do exactly this to store 5 million+
> >>>> documents with info that it takes way too long to get out of our
> >>>> Oracle database (think days).  Not as many docs as you are talking
> >>>> about, and less data for each doc, but I wouldn't have any concerns
> >>>> about scaling.  There are certainly lucene indexes out there bigger
> >>>> than what you propose.  You can compress the stored data to save some
> >>>> space.  Run times for optimization might get interesting but see
> >>>> recent threads for suggestions on that.  And since you are not too
> >>>> concerned about performance you may not need to optimize much, or even
> >>>> at all.
> >>>>
> >>>> Of course you need to remember that this is not a DBMS solution in the
> >>>> sense of transactions, recovery, etc. but I'm sure you are already
> >>>> aware of that.
> >>>>
> >>>>
> >>>> --
> >>>> Ian.
> >>>>
> >>>>
> >>>> On Tue, Jul 29, 2008 at 2:53 AM, John Evans <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Hi All,
> >>>>>
> >>>>> I have successfully used Lucene in the "traditional" way to provide
> >>>>> full-text search for various websites.  Now I am tasked with developing
> >>>>> a data-store to back a web crawler.  The crawler can be configured to
> >>>>> retrieve arbitrary fields from arbitrary pages, so the result is that
> >>>>> each document may have a random assortment of fields.  It seems like
> >>>>> Lucene may be a natural fit for this scenario since you can obviously
> >>>>> add arbitrary fields to each document and you can store the actual data
> >>>>> in the database.  I've done some research to make sure that it would
> >>>>> meet all of our individual requirements (that we can iterate over
> >>>>> documents, update (delete/replace) documents, etc.) and everything looks
> >>>>> good.  I've also seen a couple of references around the net to other
> >>>>> people trying similar things... however, I know it's not meant to be
> >>>>> used this way, so I thought I would post here and ask for guidance.
> >>>>> Has anyone done something similar?  Is there any specific reason to
> >>>>> think this is a bad idea?
> >>>>>
> >>>>> The one thing that I am least certain about is how well it will scale.
> >>>>> We may reach the point where we have tens of millions of documents and
> >>>>> a high percentage of those documents may be relatively large (10k-50k
> >>>>> each).  We actually would NOT be expecting/needing Lucene's normal
> >>>>> extremely fast text search times for this, but we would need reasonable
> >>>>> times for adding new documents to the index, retrieving documents by ID
> >>>>> (for iterating over all documents), optimizing the index after a series
> >>>>> of changes, etc.
> >>>>>
> >>>>> Any advice/input/theories anyone can contribute would be greatly
> >>>>> appreciated.
> >>>>>
> >>>>> Thanks,
> >>>>> -
> >>>>> John
> >>>>>
> >>>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>>> For additional commands, e-mail: [EMAIL PROTECTED]
> >>>>
> >>>
> >> --------------------------
> >> Grant Ingersoll
> >> http://www.lucidimagination.com
> >>
> >> Lucene Helpful Hints:
> >> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> >> http://wiki.apache.org/lucene-java/LuceneFAQ


