How to create a Lucene in-memory index at webapp deployment time

2012-09-06 Thread Kasun Perera
I have a Java/JSP web application running on an Apache Tomcat server. In this web application I have used Lucene to index and calculate similarity between some PDF documents (the PDF documents are in the database). My live server doesn't allow the web app to access files, so I have created the in-memory lucen

RE: Using a Lucene ShingleFilter to extract frequencies of bigrams in Lucene

2012-09-06 Thread Martin O'Shea
Thanks for that piece of advice. I ended up passing my snowballAnalyzer and standardAnalyzers as parameters to ShingleFilterWrappers and processing the outputs via a TermVectorMapper. It seems to work quite well. -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: 05
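
For readers unfamiliar with the term, the "frequencies of bigrams" in this thread's subject can be illustrated without Lucene. The following is a minimal plain-Java sketch, not the poster's ShingleFilter/TermVectorMapper code: the class name `BigramFrequencies` and the method `count` are hypothetical, and tokenization is assumed to be simple whitespace splitting with lower-casing rather than a Lucene analyzer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: no Lucene classes are used, and all names are hypothetical.
class BigramFrequencies {

    // Lower-cases the text, splits it on whitespace, and counts adjacent word pairs.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return freq;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            String bigram = tokens[i] + " " + tokens[i + 1];
            freq.merge(bigram, 1, Integer::sum); // add 1, or start at 1 if absent
        }
        return freq;
    }

    public static void main(String[] args) {
        // prints {to be=2, be or=1, or not=1, not to=1}
        System.out.println(count("to be or not to be"));
    }
}
```

In Lucene itself, a shingle filter emits such word pairs as tokens, so the counting step would operate on index terms rather than on a raw string.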

Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Mark Parker
On Thu, Sep 6, 2012 at 12:40 PM, Robert Muir wrote: > On Thu, Sep 6, 2012 at 2:12 PM, Chris Hostetter > wrote: >> >> : Converted to U+000 by what, I wonder? Javadoc shouldn't be doing that. If >> : it does, I wonder if we need \\u instead? >> >> apparently it is... >> >> https://mail-archives

Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Robert Muir
On Thu, Sep 6, 2012 at 2:12 PM, Chris Hostetter wrote: > > : Converted to U+000 by what, I wonder? Javadoc shouldn't be doing that. If > : it does, I wonder if we need \\u instead? > > apparently it is... > > https://mail-archives.apache.org/mod_mbox/harmony-dev/200802.mbox/%3c47b2f7ae.2000...

Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Chris Hostetter
: Converted to U+000 by what, I wonder? Javadoc shouldn't be doing that. If : it does, I wonder if we need \\u instead? apparently it is... https://mail-archives.apache.org/mod_mbox/harmony-dev/200802.mbox/%3c47b2f7ae.2000...@gmail.com%3E -Hoss

Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Benson Margulies
On Thu, Sep 6, 2012 at 1:59 PM, Robert Muir wrote: > Thanks for reporting this Mark. > > I think it was not intended to have actual null characters here (or > probably anywhere in javadocs). > > Our javadocs checkers should be failing on stuff like this... > > On Thu, Sep 6, 2012 at 1:52 PM, Mark

Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Robert Muir
Thanks for reporting this Mark. I think it was not intended to have actual null characters here (or probably anywhere in javadocs). Our javadocs checkers should be failing on stuff like this... On Thu, Sep 6, 2012 at 1:52 PM, Mark Parker wrote: > I'm building documentation from the Lucene 4.0.0

Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method

2012-09-06 Thread Mark Parker
I'm building documentation from the Lucene 4.0.0-BETA source (though this was also an issue with the ALPHA source), and the output has null characters in it. I believe that this is because the source looks like this: /** * Add a phrase->phrase synonym mapping. * Phrases are character
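
As background to this thread, the Java language translates Unicode escapes in source files before the text is tokenized, and this applies inside comments as well as string literals. The sketch below shows that rule using string literals; it is an illustration rather than the Lucene source, the class name `UnicodeEscapeDemo` is hypothetical, and U+0041 ('A') stands in for the null character so that the result stays printable.

```java
// Illustration only: the class and field names are hypothetical.
class UnicodeEscapeDemo {

    // Single backslash: the escape is translated to the one character 'A'
    // before the compiler tokenizes the source.
    static final String TRANSLATED = "\u0041";

    // Doubled backslash: this is an ordinary escaped backslash followed by
    // plain text, so the string holds six characters (backslash, u, 0, 0, 4, 1).
    static final String LITERAL = "\\u0041";

    public static void main(String[] args) {
        System.out.println(TRANSLATED + " has length " + TRANSLATED.length());
        System.out.println(LITERAL + " has length " + LITERAL.length());
    }
}
```

This shows only the language-level rule. Whether the javadoc tool is what produces the null characters in the generated output is the question the replies in this thread discuss.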

Re:Re: Re: Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread qibaoyuan
src folder At 2012-09-06 22:50:01,Cheng wrote: >Thanks. > >The instruction says that user can use IKAnalyzercfg.xml to configure the >extension dictionary and stopword dictionary. It also mentions that the xml >file should be put to the class root. > >In an eclipse java project, where is the

Re: Re: Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
Thanks. The instruction says that user can use IKAnalyzercfg.xml to configure the extension dictionary and stopword dictionary. It also mentions that the xml file should be put to the class root. In an eclipse java project, where is the class root? Thanks On Thu, Sep 6, 2012 at 10:27 AM, qi

Re:Re: Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread qibaoyuan
Check out http://code.google.com/p/ik-analyzer/ (it's quite straightforward). At 2012-09-06 22:22:45,Cheng wrote: >I use 3.5 now, and plan to try 3.6. How can I use IKAnalyzer and make the >analyzer to use my own dictionary and work together with Lucene? > >Thanks so much for help. > >On Thu, Se

Re: Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
I use 3.5 now, and plan to try 3.6. How can I use IKAnalyzer and make the analyzer to use my own dictionary and work together with Lucene? Thanks so much for help. On Thu, Sep 6, 2012 at 10:19 AM, 齐保元 wrote: > > > you'd better tell me the version of lucene.the latest version > ikanlyzer2012 sup

Re:Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread 齐保元
You'd better tell me which version of Lucene you are using. The latest version, IKAnalyzer 2012, supports Lucene 3.6. >IKAnalyzer is not supported in Lucene, right? > >On Thu, Sep 6, 2012 at 10:14 AM, wrote: > >> >> 1. FatJAR is a tool for archiving jars/classes together, NOT an analyzer. >> 2.smartcn seems not abl

Re:Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread 齐保元
It's under contrib/analyzers/smartcn in Lucene 3.6. Maybe what you are using is the source code. At 2012-09-06 22:14:27,Cheng wrote: >Also, I checked and couldn't find the smartcn.jar in the originally shipped >Lucene jar. Should I build it myself? and how? >Thanks. > >On Thu, Sep 6, 2012 at 10:10 AM, Che

Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
IKAnalyzer is not supported in Lucene, right? On Thu, Sep 6, 2012 at 10:14 AM, 齐保元 wrote: > > 1. FatJAR is a tool for archiving jars/classes together, NOT an analyzer. > 2. smartcn does not seem able to import your own dictionary; it can only import > a stop word dictionary. You can try IKAnalyzer instead. > > > A

Re:Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread 齐保元
1. FatJAR is a tool for archiving jars/classes together, NOT an analyzer. 2. smartcn does not seem able to import your own dictionary; it can only import a stop word dictionary. You can try IKAnalyzer instead. At 2012-09-06 22:10:15,Cheng wrote: >Thanks. I will try that. > >Another question. How to use my own di

Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
Also, I checked and couldn't find the smartcn.jar in the originally shipped Lucene jar. Should I build it myself? and how? Thanks. On Thu, Sep 6, 2012 at 10:10 AM, Cheng wrote: > Thanks. I will try that. > > Another question. How to use my own dictionary instead of the default one > either in Fa

Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
Thanks. I will try that. Another question: how can I use my own dictionary instead of the default one in either FatJAR or smartcn.jar? On Thu, Sep 6, 2012 at 10:07 AM, 齐保元 wrote: > > > Importing contrib/smartcn.jar is not complicated, or you can try FatJAR. > > > At 2012-09-06 22:04:58,Cheng wrote: >

Re:How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread 齐保元
Importing contrib/smartcn.jar is not complicated, or you can try FatJAR. At 2012-09-06 22:04:58,Cheng wrote: >Hi, > >The default Lucene core jar does not contain the smartcn analyzer. How can I >include it in the core jar? > >Thanks!