I'm working on a JSP-based, free-form text storage & retrieval system
based on Lucene. Part of my desired feature set includes the ability
to retrieve, edit, and update the text comprising the document. The user
flow involves:
A search for a document, whose "all" field is then retrieved, then it
can be
> - Fetch and index some pages (containing Word and PDF documents) on a
> daily basis.
> - Extract all pages that contain some provided keywords after fetching
> the pages.
> - Create bulletins from the fetched pages; the bulletins will be in PDF
> format and categorized by keyword.
> - provide
Hi,
this sounds like a job for Nutch (one of the Lucene family of projects).
On Sun, Apr 27, 2008 at 8:26 PM, Legolas wood <[EMAIL PROTECTED]> wrote:
> Hi
> Thank you for reading my post.
> I have to design a system with the following requirements. I think
> Lucene or one of the projects which are based
Greetings,
I am trying to use TrecDocMaker so I can index and evaluate
Lucene on a TREC collection.
It seems like I would just repeatedly call makeDocument() until all the
Documents have been created, but makeDocument() appears to just read forever.
In general TrecDocMaker seems like
Hi
Thank you for reading my post.
I have to design a system with the following requirements. I think
Lucene or one of the projects based on Lucene can help me as a
base to build on.
Here are the requirements:
- Fetch and index some pages (containing Word and PDF documents) on a
daily bas
There are actually several distributed indexing or searching projects in Lucene
(the top-level ASF Lucene project, not Lucene Java), and it's time to start
thinking about the possibility of bringing them together, finding
commonalities, etc.
Here is the summary:
- Lucene - distributed search vi
Thanks a lot :)
2008/4/26 Grant Ingersoll <[EMAIL PROTECTED]>:
>
> On Apr 26, 2008, at 2:33 AM, Samuel Guo wrote:
>
> Hi all,
> >
> > I am a Lucene newbie :)
> >
> > It seems that Lucene doesn't support distributed indexing :(
> > As some IR research papers mention, when the document collection