Re: IndexReader.reopen() question

2011-03-04 Thread Lee
Thanks Ian, and Mike -- the code below was the result of badly copying the Javadocs in exasperation and panic: all points taken with gratitude. Cheers Lee On 04/03/2011 16:40, Ian Lea wrote: Looks basically OK to me. I wonder if you need the isCurrent() check as well as if (newReader

IndexReader.reopen() question

2011-03-04 Thread Lee Goddard
Hello list, Does this look correct? I am told it is not functioning, in that new entries to the index are not being picked-up? Thanks Lee try { if (! reader.isCurrent()){ IndexReader newReader = reader.reopen(); if (newReader != reader
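For readers following this thread: the swap pattern under discussion can be sketched without Lucene on the classpath. The sketch below is only an illustration. `ReopenableReader` and `ReaderHolder` are hypothetical stand-ins, not Lucene classes, and the real `IndexReader.reopen()` also declares `IOException`, which is left out here. It assumes, as the reply in this thread hints, that comparing the returned reader with the old one is enough and that a separate `isCurrent()` check may not be needed.

```java
/** Hypothetical stand-in for the two IndexReader methods used here; NOT the Lucene class. */
interface ReopenableReader {
    /** Returns this instance when nothing changed, otherwise a fresh reader. */
    ReopenableReader reopen();

    void close();
}

/** Hypothetical holder that keeps one current reader and swaps it on refresh. */
final class ReaderHolder {
    private ReopenableReader reader;

    ReaderHolder(ReopenableReader initial) {
        this.reader = initial;
    }

    /** Swaps in a fresh reader if reopen() returned one; returns true when a swap happened. */
    synchronized boolean refresh() {
        ReopenableReader newReader = reader.reopen();
        if (newReader == reader) {
            return false;          // index unchanged, keep using the old reader
        }
        reader.close();            // release the stale reader
        reader = newReader;        // later searches see the new entries
        return true;
    }

    synchronized ReopenableReader current() {
        return reader;
    }
}
```

Any searcher built on the old reader would also have to be rebuilt on the new one after a swap, otherwise new entries stay invisible.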

Wiki edit rights

2015-05-28 Thread Lee Hinman
Hi Java-user mailing list, Please add me to the ContributorsGroup wiki page so I can edit the wiki wikiuser: leehinman Thanks! ;; Lee

Lucene Index Structure

2008-08-21 Thread David Lee
Clarification question: If I don't store term vectors, then I: -- won't have information on the position of matching terms -- I don't have the term frequency vector -- but I should still have the frequency of terms per document in the .frq file, right? So what's the difference between the term f

Re: Lucene Index Structure

2008-08-21 Thread David Lee
ley wrote: > > On Thu, Aug 21, 2008 at 7:20 PM, David Lee <[EMAIL PROTECTED]> wrote: >> >>> Clarification question: >>> >>> If I don't store term vectors, then I: >>> -- won't have information on the position of matching terms >>>

Clarification about segments

2008-08-22 Thread David Lee
So from what I understand, is it true that if mergeFactor is 10, then when I index my first 9 documents, I have 9 separate segments, each containing 1 document? And when searching, it will search through every segment? Thanks! David

Re: Clarification about segments

2008-08-25 Thread David Lee
; and > #optimize(getMergeFactor()) > (btw #optimize() is equal to optimize(1) ). > > > Best regards > Karsten > > p.s. and yes, searching goes through every segment. > > > David Lee-26 wrote: > > > > So from what I understand, is it true that if

Lucene AND queries

2008-09-25 Thread David Lee
Hi, I was wondering when lucene queries two or more terms, does that mean the time it takes will be twice as long? For example if I search +lucene +apache, then does lucene get all the documents that match 'lucene' and all the documents that match 'apache', and then combine them together? Or can it
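As a rough illustration of why a two-term AND need not simply cost twice one term: if each term's matching document ids are kept sorted, the two lists can be walked together in a single pass and only the common ids kept. The sketch below is a textbook merge intersection, not code taken from Lucene, and the class name `PostingsIntersection` is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: intersects two ascending lists of document ids. */
final class PostingsIntersection {
    private PostingsIntersection() {}

    /** Returns the ids present in both arrays; both inputs must be sorted ascending. */
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]);     // document contains both terms
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;               // advance the list that is behind
            } else {
                j++;
            }
        }
        return out;
    }
}
```

The loop stops as soon as either list is exhausted, so the work is bounded by the combined length of the two lists.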

Re: Lucene AND queries

2008-09-25 Thread David Lee
ROTECTED]> wrote: > On Thu, Sep 25, 2008 at 1:39 PM, David Lee <[EMAIL PROTECTED]> wrote: > > I was wondering when lucene queries two or more terms, does that mean the > > time it takes will be twice as long? For example if I search +lucene > > +apache, then does lucene ge

Extracting Dates

2008-10-02 Thread David Lee
t of these projects are associated to lucene, someone might know. David Lee

Re: ClassCastException when writing to index writer

2008-10-03 Thread Edwin Lee
Hi Paul, The clone() in SegmentInfos is correct. The best practice of clone is to delegate the clone to the super class (if you look at the source code for Vector, it too delegates to its super class, which is the Object) to create a shallow copy, and then do a cloning of each of its mutable field

Re: ClassCastException when writing to index writer

2008-10-03 Thread Edwin Lee
i think, very likely, you have another copy of java.util.Vector loaded, and this one tries to be too clever with its implementation of clone (instantiate a new Vector instance) instead of delegating to its super class (Object). HTH, Edwin --- Chris Hostetter <[EMAIL PROTECTED]> wrote: > > :

RE: Memory eaten up by String, Term and TermInfo?

2008-10-06 Thread Edwin Lee
Hi, Probably off-topic, but just like to plug a bit on my blog post here: http://tinyurl.com/4vytcc :p (incidentally, Java GC is one of my favourite topics) It's not very detailed, but i would like to think it's a good place to start reading... Just like to point out a couple of things: 1. If yo

Re: ClassCastException when writing to index writer

2008-10-06 Thread Edwin Lee
> >package that I have downloaded (some sort of incompatibility?). I will > try > >to recompile the lucene package in my own environment and see if I can > fix > >the problem. > > > > > > On Sat, Oct 4, 2008 at 2:21 AM, Edwin Lee <[EMAIL PROTE

Re: ClassCastException when writing to index writer

2008-10-06 Thread Edwin Lee
tells me that there is something wrong with Lucene build file that > > causes this problem, but I have no idea what it could be. In Lucene's > > common-build.xml, I changed the 1.4 properties to 1.6 properties but still > > to no avail. > > > >

How can I get document's top n raw score?

2008-02-01 Thread Lisa Lee
I need to know a document's top n raw scores and terms. For example, suppose a document has the terms {apple, banana, coconut} and I need the top 2 scores in that document. A simple way is to search for every term in the document and sort the scores, as below. First, search for the 'apple' term, then write the score

How to Retrieve Found Term?

2008-04-19 Thread Edwin Lee
Hi all, i'm using Lucene 2.3.1. What i'm trying to do seems straightforward enough (to me), but i just can't find the method to do so. Let's say i'm doing a PhraseQuery of the phrase "apples and oranges" with a non-zero slop value, and it returns, e.g., 20 Hits. Because i'm using non-zero slo

RE: How to Retrieve Found Term?

2008-04-20 Thread Edwin Lee
Thanks, Edwin > Date: Sat, 19 Apr 2008 22:01:17 +0200 > From: [EMAIL PROTECTED] > To: java-user@lucene.apache.org > Subject: Re: How to Retrieve Found Term? > > Edwin Lee wrote: >> Hi all, >> >> i'm using Lucene 2.3.1. What i'm trying to

RE: How to Retrieve Found Term?

2008-04-22 Thread Edwin Lee
Hi Karl, Thanks for the suggestions, i would be glad to contribute back to the project. i'm not too familiar with the inner workings of Lucene though; how does such a functionality feature in a Query implementation? My naive interpretation, when i first got hold of Lucene, is that Query is wha

Question: Can lucene do parallel indexing?

2008-06-27 Thread David Lee
If I'm using a computer that has multiple cores, or if I want to use several computers to speed up the indexing process, how should I do that? Is there some kind of support for that in the API? David Lee

Nested Proximity searches

2008-06-30 Thread David Lee
Is it possible to do nested proximity searches with lucene? i.e. can I say I want a to be within 1 word of b and then that group to be within 4 words of c? The syntax ""a b"~1" c"~4 doesn't seem to work (since it treats the first two quotes as a pair and the later 2 as another pair).

Do Lucene Deletes delete the physical file? If yes, is there a way not to?

2008-07-02 Thread David Lee
iling list for simple questions like this? I tried googling, but didn't seem to get the information I wanted. Thanks! David Lee

Would Someone Give Me Pointer On How to Index Database?

2005-10-26 Thread Sam Lee
Hi, I want to use Lucene/Nutch to index my mysql database. I think of using JDBC, is it a good idea? I searched all over the web, but all the examples are non-lucene/Nutch related. Would you guys give me pointers or websites or examples on how to use JDBC on Lucene/Nutch to index mysql databas

Review of Using Compass With Lucene to Index Database?

2005-10-26 Thread Sam Lee
Hi, I just found an open source project called Compass that works with Lucene to index a database like mysql. Has anyone used it? If so, please let us know what you think about Compass. Many thanks.

Re: Review of Using Compass With Lucene to Index Database?

2005-10-26 Thread Sam Lee
p://www.theserverside.com/tss?service=direct/0/NewsThread/threadViewer.markNoisy.link&sp=l35679&sp=l180646 > > Chris > > Lucene Search On Any Database > http://www.dbsight.net > > On 10/26/05, Sam Lee <[EMAIL PROTECTED]> > wr

trying to boost a phrase higher than its individual words

2005-10-27 Thread Andy Lee
I have a situation where I want to search for individual words in a phrase as well as the phrase itself. For example, if the user enters ["classical music"] (with quotes) I want to find documents that contain "classical music" (the phrase) *and* the individual words "classical" and "music"

Re: trying to boost a phrase higher than its individual words

2005-10-28 Thread Andy Lee
On Oct 28, 2005, at 10:38 AM, Erik Hatcher wrote: So in this case a matching document must have both terms? Or could it just have one or the other? If it must have both, you could try a PhraseQuery with a slop of Integer.MAX_VALUE. PhraseQuery scores closer matches higher. Good to know,

Re: trying to boost a phrase higher than its individual words

2005-10-28 Thread Andy Lee
On Oct 28, 2005, at 8:17 PM, Chris Hostetter wrote: One thing to keep in mind is that if you have things you are adding to the query to restrict the results, but you don't want them to contribute to the score, then try using a Filter instead. If you can't find an easy way to replace a query

How to Use Nutch to Crawl Database?

2005-10-29 Thread Victor Lee
Hi, How do I use Nutch to crawl an internal database instead of a web server? Does Nutch even support this option? Many thanks.

Re: Reverse sorting by index order

2005-11-03 Thread Andy Lee
On Nov 3, 2005, at 9:37 AM, Oren Shir wrote: If I understand correctly, when sorting by Sort.INDEXORDER the oldest documents that were added to the index will be returned first. I want the reverse, because I'm more interested in newer documents. Looking at the source, I see that Sort.INDEXOR

Re: Reverse sorting by index order

2005-11-03 Thread Andy Lee
On Nov 3, 2005, at 10:22 AM, Oren Shir wrote: There is no constructor for Sort(SortField, boolean) in Lucene API. Which version are you using? I think 1.9rc1. I have a pretty recent svn checkout -- maybe this constructor is new. --Andy -

I Need System Design Suggestion. Please.

2005-11-04 Thread Victor Lee
Hi, I am going to use mysql db to store some data, use lucene(java) to index these data, and use Hibernate to map them. I was originally thinking of using PHP to input the data the visitors enter into the mysql db. But if I use PHP and use mysql statement directly, it may defeat the part of the pur

Is There Other Ports of Nutch?

2005-11-05 Thread Victor Lee
Hi, I know that there are several ports of Lucene, like cLucene, pLucene, etc. Are there other ports of Nutch besides Java? Many thanks.

Best Way To Index Database Using Lucene?

2005-11-06 Thread Victor Lee
Hi, I use php and mysql. Visitors enter data through the web and the data is stored in the database. I want to make portions of that data searchable using Lucene. I am thinking of giving that data to Lucene for indexing at the same time as inputting that same data into the database

Re: Best Way To Index Database Using Lucene?

2005-11-06 Thread Victor Lee
I forgot to mention that if I use php-java-bridge to use Lucene to index at the same time I input the data into the mysql db, I don't even need to use JDBC. If I index inside the business logic layer which is java, then I will have to use JDBC. --- Victor Lee <[EMAIL PROTECTED]> w

Is It a Good Idea to Save Frequently Search Results in Database to Make It Faster?

2005-11-24 Thread Victor Lee
Hi, I use Lucene to index stuff that changes very often but doesn't need to be real-time for searchers. E.g. the search result can change a couple of times per minute, but I only need to show the change every 5 minutes or so. Is it a good idea to save the search result to a database like m

Re: Is It a Good Idea to Save Frequently Search Results in Database to Make It Faster?

2005-11-24 Thread Victor Lee
Sorry, actually I meant all search results, not just frequent results. And there is only one search term per search; it's the stuff belonging to the search terms that changes often. Victor Lee <[EMAIL PROTECTED]> wrote: Hi, I use Lucene to index stuff that are changed very ofte

Re: Is It a Good Idea to Save Frequently Search Results in Database to Make It Faster?

2005-11-24 Thread Victor Lee
re searcher.search(Query) using a basic query type like TermQuery, I very seriously doubt you'd beat MySQL performance. What kind of Query are you using for your searches? Erik On 24 Nov 2005, at 17:54, Victor Lee wrote: > Sorry, actually I meant all search results, not just frequent

Re: Is It a Good Idea to Save Frequently Search Results in Database to Make It Faster?

2005-11-25 Thread Victor Lee
I'd put my money on Lucene beating MySQL in the TermQuery scenario you described (e=hello) ;) But you'd be wise to step out of design mode and get some real-world tests going. And even if there is a performance difference, we're talking milliseconds most likely. Erik On 24 No

How to Use Memoryindex for Lots of Queries With Sort?

2005-11-27 Thread Victor Lee
Hi, I am using MemoryIndex as described here: http://dsd.lbl.gov/nux/api/org/apache/lucene/index/memory/MemoryIndex.html . I am using it to match lots (10 thousands) of queries with one document. Then I want to rank them based on score and some other variables. I want to know if there i

Strategies for updating indexes.

2005-04-05 Thread Lee Turner
time it will need to be queued in some way to make sure it happens after the re-indexing. I was just wondering if anyone had any pointers for doing this kind of thing. Any help would be gratefully appreciated. Many thanks Lee Lee Turner | Java Developer | Oyster Partners D. +44 (0)20

RE: Strategies for updating indexes.

2005-04-05 Thread Lee Turner
hread pausing the queue to stop it processing while the re-indexing takes place. I will also take a look at quartz. Your input is very much appreciated Many thanks Lee -Original Message- From: Jens Kraemer [mailto:[EMAIL PROTECTED] Sent: 05 April 2005 09:30 To: java-user@lucene.apach

Use of ThreadLocal in TermInfosReader

2005-08-19 Thread Lee Turner
eadLocals hanging around ? Any help would be greatly appreciated. Many thanks Lee Lee Turner | Java Developer | Oyster Partners D. +44 (0)20 74461418 T. +44 (0)20 7446 7500 www.oyster.com _

Do you believe in Clause sanity?

2005-10-13 Thread Andy Lee
The API for BooleanQuery only seems to allow adding clauses. The nearest way I can see to *remove* a clause is by laboriously constructing a new BooleanQuery (assuming you aren't absolutely tied to the original instance) and adding all the clauses from the original query except the one you

Re: Do you believe in Clause sanity?

2005-10-13 Thread Andy Lee
Oops, I'm confusing libraries. I meant I want to remove a Nutch Clause from a Nutch Query. --Andy On Oct 13, 2005, at 4:45 PM, Andy Lee wrote: The API for BooleanQuery only seems to allow adding clauses. The nearest way I can see to *remove* a clause is by laboriously construct

How To Implement Google Adwords-Like Text Ad?

2005-10-20 Thread Sam Lee
Hi, I am implementing a Google Adwords-like Text Ad thing. In Adwords, advertisers enter keywords and phrases in their ads. When a visitor visits a webpage with potential Google text ads, I want to know how they link the webpage to the actual text ads. Linking those text ads to the webpage is easy, th

Re: How To Implement Google Adwords-Like Text Ad?

2005-10-21 Thread Sam Lee
ypically describe > pages of that class > - it then automatically creates the pattern > descriptors it will use > against other pages. > > Hope this helps, > > Simon > > Sam Lee wrote: > > >Hi, > >I am implementing a Google Adwords-like Text Ad >

Recommendation on Reading or Websites or Examples of How to Use Lucene?

2005-10-21 Thread Sam Lee
Hi, Do you guys have good recommendations for websites that explain in detail how to use Lucene? If they have source examples too, that would be great. I already read the book Lucene in Action. Many thanks.

Can I Do Reverse Search?

2005-10-22 Thread Sam Lee
Hi, Normally, lucene or Nutch can match query "nike shoe -blue" with "red nike shoe". But what about matching "red nike shoe" with query "nike shoe -blue"? It is the other way around. Can I do it with a combinations of API? Many thanks.
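To make the "other way around" direction concrete, here is a small sketch that tests one piece of text against one stored query. It is plain Java rather than any Lucene or Nutch API, `ReverseMatcher` is a made-up name, and it assumes a simplified reading of the query: every plain word is required and every word prefixed with `-` is prohibited. Real query parsing (default OR semantics, phrases, analyzers) is deliberately ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: checks whether a text satisfies a stored keyword query. */
final class ReverseMatcher {
    private ReverseMatcher() {}

    /** True if text contains every plain term of query and none of its '-' prefixed terms. */
    static boolean matches(String query, String text) {
        // Lower-case and split on whitespace; a stand-in for a real analyzer.
        Set<String> words = new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
        for (String token : query.toLowerCase().trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (token.startsWith("-")) {
                if (words.contains(token.substring(1))) {
                    return false;  // prohibited term is present
                }
            } else if (!words.contains(token)) {
                return false;      // required term is missing
            }
        }
        return true;
    }
}
```

With many stored queries the same check would run once per query, which is the scaling concern raised in the follow-up messages of this thread.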

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
s it? Please > elaborate more on what you're after. Maybe what > you're looking for > is the contrib/memory and the MemoryIndex within > that Subversion area. > > Erik > > > On 22 Oct 2005, at 18:54, Sam Lee wrote: > > > Hi, > >

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
m your page (ajax), remove stop > words, build a > query from the page words by connect the words with > OR and you will > find the best matching ad. > You may need to limit the words per page or set the > maximum clauses > to a much higher number. > HTH > Stefan >

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
d positive one called > negative > your query has to look something like this: > positive: (keyword1 keywordN) AND NOT > negative:(keyword1 keywordN) > > On 23.10.2005 at 20:50, Sam Lee wrote: > > > Yes, I thought of that. But since the ads have > > negative keywor

Re: Can I Do Reverse Search?

2005-10-23 Thread Sam Lee
d the MemoryIndex within > that Subversion area. > > Erik > > > On 22 Oct 2005, at 18:54, Sam Lee wrote: > > > Hi, > > Normally, lucene or Nutch can match query "nike > shoe > > -blue" with "red nike shoe". > > > > But

How Fast is MemoryIndex? How Much Resource Does It Use?

2005-10-23 Thread Sam Lee
Hi, Someone suggested that I should use MemoryIndex to match content to a large # of queries. e.g. "nike red shoes" --match--> "nike shoes -blue" and --match--> "nike shoes -black"... What if I have 10 of these queries for each content? and there maybe 100 of these contents. But how f

Re: How Fast is MemoryIndex? How Much Resource Does It Use?

2005-10-24 Thread Sam Lee
s ( eg +/- > operators) which may > cause them to fail when run as queries against the > MemoryIndexed subject > doc which is why the first "query the queries" > search is insufficient to > find the matches. > > Cheers, > Mark > > > Sam Lee wro

How to Integrate Lucene/Nutch with Mysql?

2005-10-25 Thread Sam Lee
Hi, My network is designed to have a bunch of advertisers to enter their ads with keywords. I think of using mysql to store those, and then use lucene and part of nutch to index them from mysql db, so that the websites can find and show the ads. But how do I integrate lucene/nutch with mysql?

Re: How to Integrate Lucene/Nutch with Mysql?

2005-10-25 Thread Sam Lee
> <http://issues.apache.org/jira/browse/LUCENE-434> > > For the record, Derby is the Apache open source > database. It's a > full-featured relational database backed by an > active open source > community: http://db.apache.org/derby/. > > Cheers, > -Rick >

Re: How to Integrate Lucene/Nutch with Mysql?

2005-10-25 Thread Sam Lee
available > as open source... > On 25.10.2005 at 09:14, Sam Lee wrote: > > Hi, > > My network is designed to have a bunch of > advertisers > > to enter their ads with keywords. I think of > using > > mysql to store those, and then use lucene and part >

Can Lucene be Used To Substitute Real Database?

2005-10-25 Thread Sam Lee
Hi, I am wondering if I can use Lucene as a substitute for a real database like mysql. I know that many people use lucene only to index a mysql db because of mysql's inferior full-text index. Can Lucene be used in place of mysql so that website visitors can input data that will in turn inserting

Re: Can Lucene be Used To Substitute Real Database?

2005-10-25 Thread Sam Lee
> On Tuesday 25 October 2005 22:37, Sam Lee wrote: > > > Can Lucene to be used in place of mysql so that > > website visitors can input data that will in turn > > inserting row into Lucene just like mysql db? > > That's a bad idea. Lucene lacks a real update (you

Lucene index performance

2007-06-17 Thread Lee Li Bin
Hi, I would like to know what the performance of indexing and searching is like on large index files. Is it also possible to create multiple index files and search across them? If so, may I know how it could be done? Thanks a lot.

Lucene Query

2007-06-18 Thread Lee Li Bin
gth()); for (int c = 0; c < hits.length(); c++) { Document doc = hits.doc(c); System.out.println("Query found in file: " + doc.get("path")); System.out.println("Content: " + doc.get("text")); } Regards, Lee Li Bin

RE: Lucene for chinese search

2007-06-18 Thread Lee Li Bin
omcat for Chinese search using Lucene 2. do we need to use JSP meta / page encoding ? what is the encoding for jsp? Regards, Lee Li Bin -Original Message- From: Chris Lu [mailto:[EMAIL PROTECTED] Sent: Monday, June 18, 2007 2:10 AM To: java-user@lucene.apache.org Subjec

RE: Lucene for chinese search

2007-06-18 Thread Lee Li Bin
hieu Lecarme [mailto:[EMAIL PROTECTED] Sent: Monday, June 18, 2007 8:58 PM To: java-user@lucene.apache.org Subject: Re: Lucene for chinese search Lee Li Bin wrote: > Hi, > > I still met problem for searching of Chinese words. > XML file which is the datasource and analyzer has already

RE: Lucene for chinese search

2007-06-19 Thread Lee Li Bin
stant Scalable Full-Text Search On Any Database/Application > > site: http://www.dbsight.net > > demo: http://search.dbsight.com > > Lucene Database Search in 3 minutes: > > http://wiki.dbsight.com/index.php? > > title=Create_Lucene_Database_Search_in_3_minutes

TermVector

2007-06-24 Thread Lee Li Bin
Hi, may I know how to store a TermVector? When I set the last parameter to true, isn't it setting storeTermVector to true? But I get a null value in TermFreqVector. BTW, I'm using lucene 1.4.3 and do not intend to upgrade to 2.0. docAll.add(Field.Text("contentText", new StringReader(allCo

Pagination

2007-06-29 Thread Lee Li Bin
Hi, does anyone know how to do pagination on a JSP page using the number of hits returned? Or any other solutions? Do provide me with some sample code if possible, or a step-by-step guide. Sorry if I'm asking too much, I'm new to lucene. Thanks
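One way to approach the paging arithmetic, independent of Lucene, is to turn the total hit count, a page size and a page number into the range of hit indices to display. The sketch below is only an illustration; `HitPager` and its method names are invented for this example, and it assumes one-based page numbers and zero-based hit indices.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: works out which hit indices belong on a given results page. */
final class HitPager {
    private HitPager() {}

    /** Number of pages needed to show totalHits results at pageSize per page. */
    static int pageCount(int totalHits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (totalHits + pageSize - 1) / pageSize; // ceiling division
    }

    /** Zero-based hit indices for the one-based page number; empty if the page is out of range. */
    static List<Integer> indicesForPage(int totalHits, int pageSize, int page) {
        List<Integer> indices = new ArrayList<>();
        if (page < 1 || page > pageCount(totalHits, pageSize)) {
            return indices;
        }
        int start = (page - 1) * pageSize;
        int end = Math.min(start + pageSize, totalHits); // last page may be short
        for (int i = start; i < end; i++) {
            indices.add(i);
        }
        return indices;
    }
}
```

In a JSP the page number would typically come from a request parameter, and each returned index would be passed to `hits.doc(i)` in the same way as the loop shown in the "Lucene Query" entry above, with `hits.length()` as the total.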

RE: Pagination

2007-07-02 Thread Lee Li Bin
Hi, I still have no idea of how to get it done. Can you give me some details? The web application is in jsp btw. Thanks a lot. Regards, Lee Li Bin -Original Message- From: Chris Lu [mailto:[EMAIL PROTECTED] Sent: Saturday, June 30, 2007 2:21 AM To: java-user@lucene.apache.org Subject

RE: Pagination

2007-07-02 Thread Lee Li Bin
Hi, Thanks Mark! I do have the same question as Alixandre. How do I get the content of the document instead of the document id? Thanks. Regards, Lee Li Bin -Original Message- From: Alixandre Santana [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 03, 2007 12:55 AM To: java-user

RE: Pagination

2007-07-02 Thread Lee Li Bin
Hi Mark, How do I display results on the second page? I managed to display results on one page using your code. Regards, Lee Li Bin -Original Message- From: Alixandre Santana [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 03, 2007 12:55 AM To: java-user@lucene.apache.org Subject: Re

Chinese words highlighting

2007-07-05 Thread Lee Li Bin
Hi, Does anyone know how to highlight Chinese characters? When I do the highlighting, it tends to highlight the whole sentence instead of the keywords. For Chinese highlighting, do I need to use the TermVector in order to highlight the correct keywords? Thanks

Newbie questions re: scoring

2006-05-04 Thread Lee, Andrew J \(CA - Toronto\)
ought would have been trimmed off with the higher threshold. With a threshold of 0.15 they would score 0.17, and with a threshold of 0.30 they are scoring something like 0.33. Can anybody explain this? My trimming is coming post-index-searching, so this is pretty confusing. Thanks in

Newbie synonyms question

2006-07-24 Thread Lee, Andrew J \(CA - Toronto\)
to the synonym index and let Lucene perform all the work of synonym lookup / replacement? Thanks in advance, Andrew Lee

RE: Newbie synonyms question

2006-07-26 Thread Lee, Andrew J \(CA - Toronto\)
synonyms question Hi Andrew, There is nothing built into Lucene for synonyms, but you can grab the code from Lucene in Action to see how they can be handled (plus: http://www.lucenebook.com/search?query=synonyms for some context) Otis - Original Message From: "Lee, Andrew J (CA - To