NoClassDefFoundError for EarlyTerminatingSortingCollector

2014-10-02 Thread Cheng
Hi all, I am using the following simple code, which leads to a NoClassDefFoundError for EarlyTerminatingSortingCollector. Can anyone help? Thanks. RAMDirectory index_dir = new RAMDirectory(); Analyzer analyzer = new StandardAnalyzer(); AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester

Lucene suggester can't suggest similar phrase

2014-09-29 Thread Cheng
Hi, I am using the Lucene 4.10 suggester, which I thought could return similar phrases, but it does not. My code is as follows: public static void main(String[] args) throws IOException { String path = "c:/data/suggest/dic.txt"; Dictionary dic; dic = new FileDictionary(new FileInpu

Can RAMDirectory work for gigabyte data which needs refreshing of the index all the time?

2014-05-14 Thread Cheng
Hi, I have an index of multiple gigabytes which serves 5-10 threads and needs refreshing very often. I wonder if RAMDirectory is a good candidate for this purpose. If not, what kind of directory would be better? Thanks, Cheng

Why does QueryBuilder.createBooleanQuery create something different from input?

2014-05-12 Thread Cheng
Hi, I build a query using QueryBuilder.createBooleanQuery("title","【微信活动】6500盒“健康瘦身减肥”梅免费送"). When I check the query, the toString() of this query looks like: Query: title:而 title:不用 title:下载 title:2. title:目前 title:来说 title:已经 title:完美越狱 title:的人 title:没有 title:任何 title:必要 title:再用 title:红 titl

Re: How to add a field to hold a Java map object?

2013-02-13 Thread Cheng
; try { writer.addDocument(d); writer.commit(); } catch (Exception e) { } Unfortunately, when I search the index, all what I get is: {号=202, 栋=6}, which doesn't contain double quotes. Therefore I can't rebuild the map object with the return value. Please help. On Wed, Feb 13, 2013 at 10:
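The stored value in the post above, `{号=202, 栋=6}`, is the plain `Map.toString()` form. A minimal sketch of rebuilding a map from that form is shown below. It assumes that neither keys nor values contain `=` or the `, ` separator; the class name `MapFieldCodec` is hypothetical and is not part of Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: rebuilds a Map from the text produced by Map.toString().
// Assumption: keys and values never contain "=" or ", ".
class MapFieldCodec {
    static Map<String, String> parse(String stored) {
        Map<String, String> result = new LinkedHashMap<>();
        String body = stored.trim();
        // Strip the surrounding braces that Map.toString() adds.
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        if (body.isEmpty()) {
            return result;
        }
        // Entries are separated by ", " and each entry is key=value.
        for (String entry : body.split(", ")) {
            int eq = entry.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Malformed entry: " + entry);
            }
            result.put(entry.substring(0, eq), entry.substring(eq + 1));
        }
        return result;
    }
}
```

If the keys or values can contain those separators, a format with explicit escaping would be the safer choice.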

Re: How to add a field to hold a Java map object?

2013-02-13 Thread Cheng
e String representation of a Map, the same way you > do any other String: use StringField or an analyzer that keeps the > characters you want it to. Maybe WhitespaceAnalyzer. > > > -- > Ian. > > > On Wed, Feb 13, 2013 at 1:34 AM, Cheng wrote: > > Hi, > > >

Re: 回复: IndexReader.open and CorruptIndexException

2013-01-25 Thread Cheng
>> SEVERE: Socket accept failed > >>> org.apache.tomcat.jni.Error: 24: Too many open files > >>> at org.apache.tomcat.jni.Socket.accept(Native Method) > >>> at > >>> > org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:

Re: IndexReader.open and CorruptIndexException

2013-01-24 Thread Cheng
ndless > > http://blog.mikemccandless.com > > On Tue, Jan 22, 2013 at 8:20 AM, Cheng wrote: > > Hi, > > > > I run a Lucene application on Tomcat. The app will try to open a Linux > > directory, and sometime returns CorruptIndexException error. > > > > Shortly after I r

Re: Index size doubles every time when I synchronize the RAM-based index with the FD-based index

2012-09-30 Thread Cheng
What version of > lucene? > > -- > Ian. > > > On Fri, Sep 28, 2012 at 1:56 AM, Cheng wrote: > > Hi, > > > > I have a ram based index which occasionally needs to be persistent with a > > disk based index. Every time the size doubles which eats up my di

Re: Re: Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
, qibaoyuan wrote: > check out http://code.google.com/p/ik-analyzer/ it's quite > straightforward. > > > > At 2012-09-06 22:22:45,Cheng wrote: > >I use 3.5 now, and plan to try 3.6. How can I use IKAnalyzer and make the > >analyzer to use my own dicti

Re: Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
cn seems not able to import your own dictionay,it can only import > >> stop word dict;You can try IKAnalyzer instead. > >> > >> > >> At 2012-09-06 22:10:15,Cheng wrote: > >> >Thanks. I will try that. > >> > > >> >Another questi

Re: Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
yzer instead. > > > At 2012-09-06 22:10:15,Cheng wrote: > >Thanks. I will try that. > > > >Another question. How to use my own dictionary instead of the default one > >either in FatJAR or smartcn.jar? > > > >On Thu, Sep 6, 2012 at 10:07 AM wrote: > > >

Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
Also, I checked and couldn't find the smartcn.jar in the originally shipped Lucene jar. Should I build it myself? and how? Thanks. On Thu, Sep 6, 2012 at 10:10 AM, Cheng wrote: > Thanks. I will try that. > > Another question. How to use my own dictionary instead of the default o

Re: How to incorporate the SmartCnAnalyzer in the core lucene jar

2012-09-06 Thread Cheng
Thanks. I will try that. Another question. How to use my own dictionary instead of the default one either in FatJAR or smartcn.jar? On Thu, Sep 6, 2012 at 10:07 AM, 齐保元 wrote: > > > import contrib/smartcn.jar is not complicated.or you can try FatJAR. > > > At 2012-09-06 22:

Re: RAMDirectory unexpectedly slows

2012-06-30 Thread Cheng
the past. > > Can you explain how you are using Lucene? > > You may also want to try the CachingRAMDirectory patch on > https://issues.apache.org/jira/browse/LUCENE-4123 > > Mike McCandless > > http://blog.mikemccandless.com > > On Sat, Jun 16, 2012 at 7:18 AM, Cheng

Re: RAMDirectory unexpectedly slows

2012-06-18 Thread Cheng
ues.apache.org/jira/browse/LUCENE-4123 > > Mike McCandless > > http://blog.mikemccandless.com > > On Sat, Jun 16, 2012 at 7:18 AM, Cheng wrote: > > After a number of test, the performance of MMapDirectory is not even > close > > to that of RAMDirectory, in terms of sp

Re: RAMDirectory unexpectedly slows

2012-06-16 Thread Cheng
hetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Cheng [mailto:zhoucheng2...@gmail.com] > > Sent: Monday, June 04, 2012 6:10 PM > > To: java-user@lucene.apache.org > > Subject: Re: RAMDirectory unexpectedly slows > >

Re: RAMDirectory unexpectedly slows

2012-06-04 Thread Cheng
n e.g. > Wikipedia. > > Uwe > -- > Uwe Schindler > H.-H.-Meier-Allee 63, 28213 Bremen > http://www.thetaphi.de > > > > Cheng schrieb: > > Please shed more insight into the difference between JVM heap size and the > memory size used by Lucene. > > What

Re: RAMDirectory unexpectedly slows

2012-06-04 Thread Cheng
ill be cached in RAM regardless in the OS system IO > cache. > > 1. > https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/apache/lucene/store/bytebuffer/ByteBufferDirectory.java > > On Mon, Jun 4, 2012 at 10:55 AM, Cheng wrote: > > My indexes are 500MB

Re: RAMDirectory unexpectedly slows

2012-06-04 Thread Cheng
the file system cache of the operating system, so copying data > to Java heap space is not useful." > > -- Jack Krupansky > > -Original Message- From: Cheng > Sent: Monday, June 04, 2012 10:08 AM > To: java-user@lucene.apache.org > Subject: RAMDirectory unexpecte

Re: TaxonomySearch & similar words?

2012-02-22 Thread Cheng
Thank you. The alternative sounds reasonable. On Thu, Feb 23, 2012 at 12:54 PM, Shai Erera wrote: > Hi Cheng, > > You will need to use the exact path labels in order to get to the category > 'Mark Twain', unless you index multiple paths from start, e.g.: > /author/Amer

TaxonomySearch & similar words?

2012-02-22 Thread Cheng
Hi, I am using Taxonomy Search to build a facet comprising things such as “/author/American/Mark Twain”. Since the word "author" has a synonym of "writer", can I use "writer" instead of "author" to get the path? Currently I can only use exactly the word "author" to do it. Thanks
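One way to accept "writer" in place of "author" without indexing extra paths is to map synonyms to the canonical label before the category path is built. The sketch below only shows that normalization step in plain Java. The class `CategoryLabelNormalizer` and its methods are hypothetical and do not belong to the Lucene facet API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: rewrites synonym labels to the label used at index time.
class CategoryLabelNormalizer {
    private final Map<String, String> canonical = new HashMap<>();

    // Register that 'synonym' should be treated as 'canonicalLabel'.
    void addSynonym(String synonym, String canonicalLabel) {
        canonical.put(synonym.toLowerCase(Locale.ROOT), canonicalLabel);
    }

    // Split a path such as "/writer/American/Mark Twain" into labels and
    // replace every label that has a registered canonical form.
    String[] normalize(String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        String[] labels = trimmed.split("/");
        for (int i = 0; i < labels.length; i++) {
            String mapped = canonical.get(labels[i].toLowerCase(Locale.ROOT));
            if (mapped != null) {
                labels[i] = mapped;
            }
        }
        return labels;
    }
}
```

The resulting labels can then be used to build the exact category path that was indexed.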

Re: How to separate one index into multiple?

2012-02-20 Thread Cheng
great idea! On Sun, Feb 19, 2012 at 9:43 PM, Li Li wrote: > you can delete by query like -category:category1 > > On Sun, Feb 19, 2012 at 9:41 PM, Li Li wrote: > > > I think you could do as follows. taking splitting it to 3 indexes for > > example. > > you can copy the index 3 times. > > for co

Re: Can I use multiple writers of different applications on a same FSDirectory?

2012-02-14 Thread Cheng
only have one writer against one index at a time. Lucene's > > locking will prevent anything else. > > > > > > -- > > Ian. > > > > > > On Tue, Feb 14, 2012 at 4:49 PM, Cheng wrote: > > > Hi, > > > > > > I need to mana

Re: When to refresh writer?

2012-02-14 Thread Cheng
mit() when you want changes to be durable (survive > OS/JVM crash, power loss, etc.). > > Mike McCandless > > http://blog.mikemccandless.com > > On Mon, Feb 13, 2012 at 1:17 PM, Cheng wrote: > > Hi, > > > > My application will go on for ever. When is good t

Re: slow speed of searching

2012-02-08 Thread Cheng
thanks a lot On Wed, Feb 8, 2012 at 9:48 PM, Ian Lea wrote: > http://wiki.apache.org/lucene-java/ImproveSearchingSpeed > > (the 3rd item is Use a local filesystem!) > > -- > Ian. > > > On Wed, Feb 8, 2012 at 12:44 PM, Cheng wrote: > > Hi, > > > >

Re: NRTManager and AlreadyClosedException

2012-02-08 Thread Cheng
; // Do not use s after this! > s = null; > > -- > Ian. > > > On Wed, Feb 8, 2012 at 12:09 PM, Cheng wrote: > > You are right. There is a method by which I do searching. At the end of > the > > method, I release the index searcher (not the searchermanager). > > >

Re: NRTManager and AlreadyClosedException

2012-02-08 Thread Cheng
Calling release() multiple times? > > From the exception message the first sounds most likely. > > > -- > Ian. > > > On Wed, Feb 8, 2012 at 5:20 AM, Cheng wrote: > > Hi, > > > > I am using NRTManager and NRTManagerReopenThread. Though I don'
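To see why releasing more often than acquiring can surface as an "already closed" error, here is a simplified reference-counting model in plain Java. It is an illustration under stated assumptions, not Lucene's implementation: the class `RefCountedResource` is hypothetical, and the owner is assumed to hold one reference of its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a reference-counted resource such as a shared searcher.
class RefCountedResource {
    // Starts at 1: the reference held by the owner (for example, a manager).
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Take a reference; fails once the count has already dropped to zero.
    boolean tryAcquire() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                return false;
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Give a reference back; when the count reaches zero the resource is closed.
    void release() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
    }

    boolean isClosed() {
        return refCount.get() <= 0;
    }
}
```

In this model, one acquire followed by two releases drops the owner's reference as well, so the next caller finds the resource closed even though the owner never closed it.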

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
rformance w/o it (after removing the > commit calls). NRT is very fast... > > Mike McCandless > > http://blog.mikemccandless.com > > On Mon, Feb 6, 2012 at 11:46 AM, Cheng wrote: > > Good point. I should remove the commits. > > > > Any difference between NRTCas

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
all flushed segments. > > Mike McCandless > > http://blog.mikemccandless.com > > On Mon, Feb 6, 2012 at 10:45 AM, Cheng wrote: > > Uwe, when I meant speed is slow, I didn't refer to instant visibility of > > changes, but that the changes may be synchronized with FSDirecto

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
Agree. On Mon, Feb 6, 2012 at 11:53 PM, Uwe Schindler wrote: > Hi Cheng, > > all pros and cons are explained in those articles written by Mike! As soon > as there are harddisks in the game, there is a slowdown, what do you > expect? > If you need it faster, buy SSDs! :-) >

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
necessary. > > If you are using NRTManager why do you care how long this takes? How > often are you calling it? Why? > > > -- > Ian. > > > On Mon, Feb 6, 2012 at 3:45 PM, Cheng wrote: > > Uwe, when I meant speed is slow, I didn't refer to instant visibi

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
date the index? Time taken for updates to become visible in search > results? Time taken for searches to run on the IndexSearcher returned > from SearcherManager? Something else? > > > -- > Ian. > > > On Mon, Feb 6, 2012 at 3:27 PM, Cheng wrote: > > Ian, > >

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
/goo.gl/mzAHt > http://goo.gl/5RoPx > http://goo.gl/vSJ7x > > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > -Original Message- > > From: Cheng [mailto:zhoucheng2...@gmail.c

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
Ian, my issue is that I need to update the index frequently. NRTManager does not seem very helpful on this front, as it is slower than when RAMDirectory is used. Any advice on improving this? On Mon, Feb 6, 2012 at 10:24 PM, Cheng wrote: > That really helps! I will try it

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
ith > >> nrtm.updateDocument(...), and to search use > >> > >> IndexSearcher searcher = srchm.acquire(); > >> try { > >> search ... > >> } finally { > >> srchm.release(searcher); > >> } > >> > >> All thread s

Re: Configure writer to write to FSDirectory?

2012-02-06 Thread Cheng
e. And I bet it'll be blindingly fast. > > Don't forget to close() things down at the end. > > > -- > Ian. > > > > On Mon, Feb 6, 2012 at 12:15 AM, Cheng wrote: > > I was trying to, but don't know how to even I read some of your blogs. >

Re: Configure writer to write to FSDirectory?

2012-02-05 Thread Cheng
//blog.mikemccandless.com > > On Sun, Feb 5, 2012 at 9:03 AM, Cheng wrote: > > Hi Uwe, > > > > My challenge is that I need to update/modify the indexes frequently while > > providing the search capability. I was trying to use FSDirectory, but > found > >

Re: Configure writer to write to FSDirectory?

2012-02-05 Thread Cheng
know of MMapDirectory, and wonder if it is as fast as RAMDirectory. On Sun, Feb 5, 2012 at 4:14 PM, Uwe Schindler wrote: > Hi Cheng, > > It seems that you use a RAMDirectory for *caching*, otherwise it makes no > sense to write changes back. In recent Lucene versions, this is

Re: How to avoid filtering stop words like "IS" in StandardAnalyzer

2012-01-28 Thread Cheng
> > > > > -Original Message- > > > From: Pedro Lacerda [mailto:pslace...@gmail.com] > > > Sent: Saturday, January 28, 2012 12:49 PM > > > To: java-user@lucene.apache.org > > > Subject: Re: How to avoid filtering stop words like "

How to avoid filtering stop words like "IS" in StandardAnalyzer

2012-01-27 Thread Cheng
Hi, I don't want the StandardAnalyzer to filter certain stop words. Can I do that? Ideally, I would like to have a customized StandardAnalyzer. Thanks.

Re: Cleaning up writer after certain idle time?

2012-01-25 Thread Cheng
simon.willna...@googlemail.com> wrote: > Hey, > > > On Wed, Jan 25, 2012 at 11:01 PM, Cheng wrote: > > Hi, > > > > I am using multiple writer instances in a web service. Some instances are > > busy all the time, while some aren't. I wonder how to configure the >

Cleaning up writer after certain idle time?

2012-01-25 Thread Cheng
Hi, I am using multiple writer instances in a web service. Some instances are busy all the time, while some aren't. I wonder how to configure the writer to dissolve itself after a certain time of idling, say 30 seconds. If the answer is yes, can I do more in the dissolving, such as writing the ch

NRTManager, NRTManagerReopenThread and ExecutorServices example

2012-01-18 Thread Cheng
Hi, can any of you provide a working code example that utilizes the NRTManager, NRTManagerReopenThread and ExecutorService instances? The limited availability of information regarding these classes really drives me nuts. Thanks

Re: Is Lucene a good candidate for a Google-like search engine?

2012-01-16 Thread Cheng
Great, thanks. On Mon, Jan 16, 2012 at 5:56 AM, findbestopensource < findbestopensou...@gmail.com> wrote: > Check out the presentation. > http://java.dzone.com/videos/archive-it-scaling-beyond > > Web archive uses Lucene to index billions of pages. > > Regards > Aditya > www.findbestopensource.com

How NRTManagerReopenThread works with Java Executor framework?

2012-01-15 Thread Cheng
I saw the link, https://builds.apache.org/job/Lucene-3.x/javadoc/contrib-misc/org/apache/lucene/index/NRTManagerReopenThread.html, which talks about how to use the NRTManagerReopenThread. I am currently using the Java ExecutorService framework to utilize a multiple threading scenario. Pls see belo

Re: Is it necessary to create a new searcher?

2012-01-14 Thread Cheng
I just found some interesting stuff here: https://builds.apache.org/job/Lucene-3.x/javadoc/contrib-misc/org/apache/lucene/index/NRTManagerReopenThread.html How can the NRTManager be plugged into my ExecutorService framework? On Sun, Jan 15, 2012 at 1:04 AM, Cheng wrote: > That sounds like wha

Re: Is it necessary to create a new searcher?

2012-01-14 Thread Cheng
ose readers will be reopened. > > So in general, a reopen after a small number of updates may well be > > quicker than a reopen after a large number of updates. How important > > is it that your searches get up to date data? If vital, you'll have to > > reopen. If not

10 million entities and 100 million related information

2012-01-12 Thread Cheng
I have 10MM entities, for each of which I will index 10-20 fields. Also, I will have to index 100MM related information of the entities, and each piece of the information will have to go through some Analyzer. I have a few questions: 1) Can I use just one index folder for all the data? 2) If I h

Re: Build RAMDirectory on FSDirectory, and then synchronizing the two

2012-01-12 Thread Cheng
The reason is I have indexes on hard drive but want to load them into ram for faster searching, adding, deleting, etc. Using RAMDirectory can help achieve this goal. On Thu, Jan 12, 2012 at 6:36 PM, Sanne Grinovero wrote: > Maybe you could explain why you are doing this? Someone could suggest >

Is it necessary to create a new searcher?

2012-01-11 Thread Cheng
I am currently using the following statement at the end of each index write, although I don't know whether the write modifies the indexes or not: is = new IndexSearcher(IndexReader.openIfChanged(ir)); # is -> IndexSearcher, ir-> IndexReader My question is how expensive it is to create a searcher insta

Re: Seem contradictive -- indexwriter in handling multiple threads

2012-01-11 Thread Cheng
; Mike McCandless > > http://blog.mikemccandless.com > > On Wed, Jan 11, 2012 at 3:29 PM, Cheng wrote: > > Will do if I see a perf gain. > > > > The other issue is that in each thread my apps will not only do indexing > > but searching. That means I will have to pass

Re: Seem contradictive -- indexwriter in handling multiple threads

2012-01-11 Thread Cheng
to an FSDir). > > If you see a perf gain then please report back! > > Mike McCandless > > http://blog.mikemccandless.com > > On Wed, Jan 11, 2012 at 3:09 PM, Cheng wrote: > > Can I create a RAMDirectory based writer and have it work cross all > > threads? In the sen

Re: Seem contradictive -- indexwriter in handling multiple threads

2012-01-11 Thread Cheng
Can I create a RAMDirectory-based writer and have it work across all threads? That is, I would like to use RAMDirectory everywhere and have the RAMDirectory written to FSDirectory in the end. I suppose that should work, right? On Wed, Jan 11, 2012 at 2:31 PM, Michael McCandless < luc...@mik

Seem contradictive -- indexwriter in handling multiple threads

2012-01-11 Thread Cheng
I have read a lot about IndexWriter and multi-threading on the Internet. It seems to me that the normal practice is: 1) use the same IndexWriter instance for multiple threads; 2) create an individual RAMDirectory per thread; 3) use the addIndexes(Directory[]) method to add to a local drive folder al

shared instance of IndexWriter doesn't improve performance

2012-01-10 Thread Cheng
Hi, I use the same writer instance for multiple threads. It turns out that the jobs take longer to finish than when a new writer instance is created in each thread. What could be the reasons? Thanks

Re: Build RAMDirectory on FSDirectory, and then synchronizing the two

2012-01-10 Thread Cheng
I tried IndexWriterConfig.OpenMode CREATE, and the size is doubled. The only way that is effective is the writer's deleteAll() methods. On Mon, Jan 9, 2012 at 5:23 AM, Ian Lea wrote: > If you load an existing disk index into a RAMDirectory, make some > changes in RAM and call addIndexes to add

Build RAMDirectory on FSDirectory, and then synchronizing the two

2012-01-08 Thread Cheng
Hi, I create a new RAMDirectory based upon an FSDirectory. After a few modifications, I would like to synchronize the two. Someone on the mailing list provided a solution that uses the addIndexes() function. However, the FSDirectory index is simply combined with the RAMDirectory index, and the size doubles. How can I do a rea

Strategy for large index files

2012-01-07 Thread Cheng
Hi, my servlet application is running a large index of 20G. I don't think it can be loaded to RAM at one time. What are the general strategies to improve the search and write performance? Thanks

Shared IndexWriter does not increase speed

2012-01-06 Thread Cheng
Hi, I am trying to use a shared IndexWriter instance for a multi-threaded application. Surprisingly, this performs worse than creating a writer instance within each thread. My code is as follows. Can someone help explain why? Thanks. Scenario 1: shared IndexWriter instance RAMDirectory ramDir = new RA

queryParser.ParseException & Encountered ""

2012-01-01 Thread Cheng
Hi, I was trying to use QueryParser for some Chinese text, but encountered the following issues: (1) org.apache.lucene.queryParser.ParseException: Cannot parse '大众UP!': Encountered "" at line 1, column 5. The error seems to be caused by the Chinese exclamation mark. (2) org.apache.lucene.queryParser.ParseExce
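If the trailing `!` in '大众UP!' is being read as query syntax (the classic query parser treats `!` as a NOT operator), escaping special characters before parsing is one possible remedy. Lucene's QueryParser offers a static `escape` method for this. The stand-alone sketch below approximates the idea in plain Java so that it runs without Lucene on the classpath; the class name `QueryEscaper` is hypothetical, and the character list is an assumption based on the classic query syntax.

```java
// Hypothetical stand-alone approximation of query-syntax escaping.
class QueryEscaper {
    // Characters assumed to carry special meaning in the classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefix every special character with a backslash; leave all others as-is.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Escaping the user's text this way means it is matched literally, so it should only be applied to input that is not meant to contain operators.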

Re: How to use RAMDirectory more efficiently

2012-01-01 Thread Cheng
xWriter( fs, ... ); > try { >writer.addIndexes( ram ); > } finally { > writer.close(); > } > } > > > http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/index/IndexWriter.html#addIndexes(org.apache.lucene.store.Directory > .. >

How to use RAMDirectory more efficiently

2011-12-31 Thread Cheng
Hi, Suppose that we have a huge amount of indices on hard drives but working in RAMDirectory is a must, how can we decide which part of the indices to be loaded into RAM, how to modify the indices, and when and how to synchronize the indices with those on hard drives? Any thoughts? Thanks!

How to save in-memory index into disk

2011-12-31 Thread Cheng
Hi, I am creating a RAMDirectory based upon a folder on disk. After doing a lot of adding, deleting, or updating, I want to flush the changes to the disk. However, the flush() function is not available for 3.5. How can I save the changes to disk? Thanks!

Can't get a hit

2011-12-29 Thread Cheng
Hi, I need to save a list of records into an index on hard drive. I keep a writer and a reader open till the end of the operation. My issue is that I need to compare each of the new records with each of the records that have been saved into the index. There are plenty of duplicate records in the

Re: Search multiple directories simultaneously

2011-06-23 Thread Cheng
> index2.close(); > index3.close(); > ... > > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > -Original Message- > > From: Cheng [mailto:zhoucheng2...@gmail.com] > > Sent: Thu

Search multiple directories simultaneously

2011-06-23 Thread Cheng
Hi, I have multiple indexed folders (or directories), each holding indexing files for specific purposes. I want to do a search over these folders (or directories) in a same query. Is it possible? Thanks

JobClient.runJob(job) in Fetcher.java

2011-05-25 Thread Cheng
Hi, I notice that there are a few run() methods in Fetcher.java and that the following statement in Crawler.java calls the JobClient.runJob(job) in Fetcher.java. fetcher.fetch(segs[0], threads, org.apache.nutch.fetcher.Fetcher.isParsing(conf)); I would like to know which run() in Fetcher.java has

Re: Is there a limit on the size of the text for a single field?

2011-05-25 Thread Cheng Zhou
Thanks, Ian. On Wed, May 25, 2011 at 11:44 PM, Ian Lea wrote: > Sure. See the javadocs for IndexWriter.setMaxFieldLength or > LimitTokenCountAnalyzer if you are using 3.1.0. > > > -- > Ian. > > > On Wed, May 25, 2011 at 4:24 PM, Cheng Zhou > wrote: > > Hi

Is there a limit on the size of the text for a single field?

2011-05-25 Thread Cheng Zhou
Hi, I wonder if I can associate a text string of over 5MB with a single field. Thanks.

Re: how to search multiple fields

2011-05-25 Thread Cheng Zhou
elds and that of the field boost? Cheng On Wed, May 25, 2011 at 6:20 PM, Ian Lea wrote: > > Quite a few Lucene examples on lines shows how to insert multiple fields > > into a Document and how to query the indexed file with certain fields and > > queried text. I would like to kn

How to create document objects in our case

2011-05-20 Thread Cheng Zhou
Hi, I have a large number of XML files to be indexed by Lucene. All the files share similar structure as below: .. Things to be noted are: The root element of Group has 30 or so attributes, and it usually has over 2000 Subgroup elements, which in turn also have more than 20

RE: Lucene 3.3 in Eclipse

2011-05-16 Thread cheng
the list - didn't notice that my reply went to Cheng directly) There is an Ant target "get-db-jar" that can do the downloading for you - you can see the URL it uses here: <http://svn.apache.org/viewvc/lucene/java/tags/lucene_3_0_3/contrib/db/bdb/build.xml?view=markup#l49>

RE: Lucene 3.3 in Eclipse

2011-05-16 Thread cheng
tream class is not available. Please see this, "public final class FastCharStream implements CharStream" What is it? Do you know where to download it? 3) The QueryParser class can't be resolve. Please see this, SrndQuery lq = QueryParser.parse(queryText); Thanks, Cheng -

RE: Lucene 3.3 in Eclipse

2011-05-15 Thread cheng
package, which are under contrib/db/bdb/src/java folder. Do you know when I can find the proper jar file? Cheng -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Sunday, May 15, 2011 10:08 PM To: java-user@lucene.apache.org Subject: RE: Lucene 3.3 in Eclipse Hi Cheng

Lucene 3.3 in Eclipse

2011-05-15 Thread cheng
Hi, I created a java project for Lucene 3.3 in Eclipse, and found that in the DbHandleExtractor.java file, the package of com.sleepycat.db.internal.Db is not resolved. How can I overcome this? I have tried to download .jar for this, but don't know which and where to download. Thanks

RE: Memory eaten up by String, Term and TermInfo?

2008-10-06 Thread Peter Cheng
> > http://java.sun.com/javase/technologies/hotspot/gc/index.jsp > > -Original Message- > From: Peter Cheng [mailto:[EMAIL PROTECTED] > Sent: Sunday, October 05, 2008 7:55 AM > To: java-user@lucene.apache.org > Subject: RE: Memory eaten up by String, Term and Te

RE: Memory eaten up by String, Term and TermInfo?

2008-10-05 Thread Peter Cheng
//search.dbsight.com > > Lucene Database Search in 3 minutes: > > > http://wiki.dbsight.com/index.php?title=Create_Lucene_Database > _Search_in_3_minutes > > DBSight customer, a shopping comparison site, (anonymous per > > request) got > > 2.6 Million Euro funding! >

RE: Memory eaten up by String, Term and TermInfo?

2008-09-14 Thread Peter Cheng
> Instant Scalable Full-Text Search On Any Database/Application > > site: http://www.dbsight.net > > demo: http://search.dbsight.com > > Lucene Database Search in 3 minutes: > > > http://wiki.dbsight.com/index.php?title=Create_Lucene_Database > _Search_in_3_minutes >

Luke issues "Unknown format version: -6"

2008-08-26 Thread Jiao, Jason (NSN - CN/Cheng Du)
Hi there, I use Luke v0.8.1, which is built on Lucene 2.3.0. First, I ran lucene/demo/IndexFiles, which built the index successfully. Then I used Luke to open the index, but Luke reports "Unknown format version: -6". I checked the Lucene documentation, which says "lucene 2.3.2 does not contain any new

RE: How to search

2008-08-26 Thread Jiao, Jason (NSN - CN/Cheng Du)
The lucene FAQ says: What wildcard search support is available from Lucene? Lucene supports wild card queries which allow you to perform searches such as book*, which will find documents containing terms such as book, bookstore, booklet, etc. Lucene refers to this type of a query as a 'prefix quer
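To make the `book*` example concrete, the sketch below shows what a prefix match selects from a sorted set of terms. It is a plain-Java illustration of the idea only, not Lucene's implementation; the class `PrefixMatch` and the sample terms are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical illustration of prefix matching over sorted terms.
class PrefixMatch {
    // Returns every term that starts with the prefix, in sorted order.
    static List<String> matches(SortedSet<String> terms, String prefix) {
        List<String> out = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous and
        // begin at the first term that is >= the prefix itself.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the block of matching terms
            }
            out.add(term);
        }
        return out;
    }
}
```

With the terms book, booklet, bookstore, boot and cook, the prefix "book" selects the first three and stops at "boot".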

Re: How Lucene Search

2008-06-26 Thread Alex Cheng
the debugger that came with eclipse is pretty good for this purpose. You can create a small project and then attach Lucene source for the purpose of debugging.

IndexDeletionPolicy to delete commits after N minutes

2008-06-25 Thread Alex Cheng
hi, what is the correct way to instruct the indexwriter (or other classes?) to delete old commit points after N minutes ? I tried to write a customized IndexDeletionPolicy that uses the parameters to schedule future jobs to perform file deletion. However, I am only getting the filenames through the
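The selection logic for "delete commits older than N minutes" can be sketched independently of file names. The code below is a plain-Java model, not a Lucene IndexDeletionPolicy: it assumes commit timestamps are supplied oldest first and that the newest commit must always be kept. The class `ExpiringCommitSelector` is hypothetical. In a real policy, the selected commits would be deleted through the commit objects the policy receives rather than by deleting files directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: picks which commit points are old enough to delete.
class ExpiringCommitSelector {
    // commitTimesMillis is ordered oldest first; the last entry is the
    // current commit and is never selected for deletion.
    static List<Integer> expiredIndexes(long[] commitTimesMillis,
                                        long nowMillis,
                                        long maxAgeMillis) {
        List<Integer> expired = new ArrayList<>();
        for (int i = 0; i < commitTimesMillis.length - 1; i++) {
            if (nowMillis - commitTimesMillis[i] > maxAgeMillis) {
                expired.add(i);
            }
        }
        return expired;
    }
}
```

Keeping the newest commit unconditionally avoids ending up with an index that has no commit point at all.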

IndexDeletionPolicy to delete after N minutes

2008-06-25 Thread Alex Cheng
hi, what is the correct way to instruct the indexwriter to delete old commit points after N minutes ? I tried to write a customized IndexDeletionPolicy that uses the parameters to schedule future jobs to do file deletion. However, I am only getting the filenames, and not absolute file names. thank

instruct IndexDeletionPolicy to delete old commits after N minutes

2008-06-25 Thread Alex Cheng
hi, what is the correct way to instruct the indexwriter to delete old commit points after N minutes ? I tried to write a customized IndexDeletionPolicy that uses the parameters to schedule future jobs to do file deletion. However, I am only getting the filenames, and not absolute file names. thank

Is there a Term ID for each distinctive term indexed in Lucene?

2007-08-31 Thread Tao Cheng
Hi all, I found that instead of storing a term ID for a term in the index, Lucene stores the actual term string value. I am wondering if there is such a "term ID" for each distinct term indexed in Lucene, similar to the "doc ID" for each distinct document indexed in Lucene. In other words

Lucene javadoc not up-to-date?

2007-05-28 Thread Tao Cheng
I've encountered a few discrepancies between the javadoc of Lucene and the source code. I use: http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/ as the most up-to-date javadoc reference. For instance, the SegmentTermDocs class implements the TermDocs interface. However, there is