Please start a new topic when changing subjects. See: http://people.apache.org/~hossman/#threadhijack

Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to an existing message; instead, start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to, so your question is "hidden" in that thread and gets less attention. It also makes following discussions in the mailing list archives particularly difficult.

See Also: http://en.wikipedia.org/wiki/Thread_hijacking

On Thu, Jan 8, 2009 at 10:07 AM, ahammad <ahmed.ham...@gmail.com> wrote:

> Hello,
>
> I came across some new information regarding the original architecture. We
> have a file on a website that basically contains all the links of all the
> articles that are searchable. This file is meant to be a crawler starting
> point. The articles already have metadata that can be used for indexing. The
> data retrieval from the database is handled by something else which I
> currently do not have access to (so I'm not exactly sure how it's done).
>
> Would a crawler have to be written from scratch or would something like
> Nutch be useful in this case? Basically I want to build an index from the
> metadata of all the articles that are available.
>
> Thanks for all your help/suggestions
>
> Cheers
>
> P.S. Wasn't sure if I need a new topic for a new question, so I just used
> this one
> --
> View this message in context:
> http://www.nabble.com/Help-with-installing-Lucene-tp21332541p21353560.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
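
For anyone curious why changing the subject line is not enough: the "other mail headers" mentioned in the note above are typically In-Reply-To and References, which mail clients and archives use to group messages into threads. The following is only a rough sketch using Python's standard email package; the message IDs, the second subject line, and the same_thread helper are made up for illustration, and real archive software may thread messages differently.

```python
from email.message import Message


def same_thread(original, reply):
    """Return True if reply's threading headers point back at original."""
    parent_id = original["Message-ID"]
    # In-Reply-To and References carry the IDs of earlier messages.
    linked_ids = (reply.get("In-Reply-To", "") + " " +
                  reply.get("References", "")).split()
    return parent_id in linked_ids


# The message that started the existing thread (ID is hypothetical).
original = Message()
original["Subject"] = "Help with installing Lucene"
original["Message-ID"] = "<msg-1@example.org>"

# A "reply" with a new subject: the threading headers are still set.
hijack = Message()
hijack["Subject"] = "Crawler for article metadata"
hijack["Message-ID"] = "<msg-2@example.org>"
hijack["In-Reply-To"] = original["Message-ID"]
hijack["References"] = original["Message-ID"]

# A fresh email: no threading headers at all.
fresh = Message()
fresh["Subject"] = "Crawler for article metadata"
fresh["Message-ID"] = "<msg-3@example.org>"

print(same_thread(original, hijack))
print(same_thread(original, fresh))
```

The reply with the changed subject still links to the old thread, while the fresh email does not, which is why starting a new message is the safer way to ask a new question.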