Hi,

I'd really suggest you buy a copy of Lucene in Action, 2nd Edition. It's currently available as a MEAP release, and it's amazing. I think the price may also be discounted by around 40%, though I'm not really sure about that.

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me.
The distinction is yours to draw............

On Mon, May 18, 2009 at 10:43 AM, Ridzwan Aminuddin <ridzwan.aminud...@gmail.com> wrote:

> Hi all, thanks for the responses thus far.
>
> Another question linked to the first: do you know of any good tutorials
> or starting points to help me understand how to go about designing my own
> customized analyzer?
>
> This would be of great help. Thanks in advance!
>
> Regards,
>
> Ridzwan
>
> 2009/5/14 Asbjørn A. <asbj...@fellinghaug.com>
>
> > Seid Mohammed:
> > > I need exactly this solution.
> > > Can you please tell me how I could do it?
> > > I am badly in need of it.
> > >
> > > On Thu, May 14, 2009 at 5:58 AM, Ridzwan Aminuddin
> > > <r...@world-check.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Is Lucene able to index phrases instead of individual terms? If it
> > > > is, can we also feed it a 'thesaurus or dictionary' of phrases that
> > > > it should look out for when indexing? Thanks in advance,
> > > >
> > > > Ridzwan
> >
> > Hi Seid.
> >
> > I constructed something like this for my master's degree. What I
> > basically did was write a custom analyzer. I identified the top 100
> > most searched phrases in my collection, and then filtered for those. If
> > a document contained an identified phrase, the analyzer would
> > construct a Token for those terms.
> >
> > Another approach is to decide that only two-word phrases should be
> > searched for and, for instance, to find verbs and use the "word right
> > after".
> >
> > Example string: "The boat was very fast".
> > Tokens: "The boat", "was", "very fast".
> >
> > If your analyzer contains a range of phrases to search for, then it
> > should be straightforward to identify those phrases and index them as
> > Tokens. Just remember to set the start and end offsets of each Token to
> > the correct values.
> >
> > You can have a look at the thesis here:
> > http://asbjorn.fellinghaug.com/filer/master/Master_thesis.pdf
> >
> > And the Java source code here:
> > http://daim.idi.ntnu.no/show.php?type=vedlegg&id=3429
> >
> > Hope this helps.
> >
> > --
> > Asbjørn A. Fellinghaug
> > asbj...@fellinghaug.com
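
For what it's worth, the dictionary-based idea in the quoted message (look for known phrases, emit each one as a single token, and keep the start and end offsets right) can be sketched in plain Java with no Lucene dependency. This is only a rough illustration under my own assumptions, not Asbjørn's implementation and not Lucene API: PhraseTokenizer, Token, tokenize and maxWords are hypothetical names, words are split on whitespace only, and phrases are matched greedily, longest first. In a real setup the same grouping would presumably sit inside a custom analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Lucene: groups known phrases into single tokens.
class PhraseTokenizer {

    // A token plus its character offsets into the original text (end is exclusive).
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "\"" + text + "\"[" + start + "," + end + ")";
        }
    }

    // Assumption: phrases are lower-case, words separated by single spaces;
    // maxWords is the word count of the longest phrase in the dictionary.
    static List<Token> tokenize(String text, Set<String> phrases, int maxWords) {
        // First pass: split on whitespace, remembering each word's offsets.
        List<Token> words = new ArrayList<>();
        Matcher m = Pattern.compile("\\S+").matcher(text);
        while (m.find()) {
            words.add(new Token(m.group(), m.start(), m.end()));
        }

        // Second pass: at each position, try the longest candidate phrase first.
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            int matched = 1; // fall back to emitting the single word
            for (int n = Math.min(maxWords, words.size() - i); n >= 2; n--) {
                StringBuilder key = new StringBuilder();
                for (int k = i; k < i + n; k++) {
                    if (k > i) key.append(' ');
                    key.append(words.get(k).text.toLowerCase(Locale.ROOT));
                }
                if (phrases.contains(key.toString())) {
                    matched = n;
                    break;
                }
            }
            Token first = words.get(i);
            Token last = words.get(i + matched - 1);
            // The emitted token spans from the first word's start to the last word's end.
            out.add(new Token(text.substring(first.start, last.end), first.start, last.end));
            i += matched;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> phrases = Set.of("the boat", "very fast");
        for (Token t : tokenize("The boat was very fast", phrases, 2)) {
            System.out.println(t);
        }
    }
}
```

With the example string from the thread and the two phrases "the boat" and "very fast", this yields the tokens "The boat" (offsets 0 to 8), "was" (9 to 12) and "very fast" (13 to 22), which matches the token list given in the quoted message.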