Re: search for special condition.

2008-08-12 Thread 장용석
Hi. Thank you for your response. I found a way with your help: there are classes named ConstantScoreRangeQuery and NumberTools. The reference is here: http://markmail.org/message/dcirmifoat6uqf7y#query:org.apache.lucene.document.NumberTools+page:1+mid:tld3uekaylmu2cwt+state:results

Re: Searching Tokenized x Un_tokenized

2008-08-12 Thread Otis Gospodnetic
Perhaps you can lowercase the text prior to passing it to Lucene? Or perhaps you can have a custom Analyzer that treats the whole input as 1 Token (see KeywordAnalyzer -- http://lucene.apache.org/java/2_3_2/api/org/apache/lucene/analysis/KeywordAnalyzer.html ), but also includes LowerCaseFilter
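
The suggestion above (keep the whole value as one token, but lowercase it) can be sketched in plain Java without Lucene. This is only an illustration of the idea, not the KeywordAnalyzer or LowerCaseFilter API; the class and method names (SingleTokenLowercase, normalize, prefixSearch) are made up for the example, and the prefix match stands in for a wildcard query such as us*.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: emulates a single-token, lowercased field.
class SingleTokenLowercase {
    // Index-time normalisation: the value stays one token and is only lowercased.
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Emulates a prefix query (e.g. us*) against the normalised values.
    // The prefix is lowercased too, so both sides are compared in the same form.
    static List<String> prefixSearch(List<String> storedValues, String prefix) {
        String normalizedPrefix = normalize(prefix);
        List<String> hits = new ArrayList<>();
        for (String value : storedValues) {
            if (normalize(value).startsWith(normalizedPrefix)) {
                hits.add(value);
            }
        }
        return hits;
    }
}
```

With both sides lowercased, a search for "us" finds "USA" as well as "usa"; without the index-time lowercasing, only the lowercase value would match a lowercased query.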

Re: Searching Tokenized x Un_tokenized

2008-08-12 Thread Andre Rubin
Thanks Otis, that was exactly what was happening. 1) According to here: http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a wildcard queries are not passed through the Analyzer, but they are always set to lower case. 2) And according to here: http://wiki.apa

Re: possible to read index into memory?

2008-08-12 Thread Otis Gospodnetic
Another very simple alternative to using RAMDirectory is the use of RAM FS: http://search.yahoo.com/search?p=ramfs http://search.yahoo.com/search?p=tmpfs Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Chris Hostetter <[EMAIL PROTECTED]>

Re: search for special condition.

2008-08-12 Thread Otis Gospodnetic
Hi, Lucene doesn't have a greater-than operator. Perhaps you can use range queries to accomplish the same thing. http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Range%20Searches

Re: Searching Tokenized x Un_tokenized

2008-08-12 Thread Otis Gospodnetic
Andre, Check the Lucene FAQ, there is an entry about wildcards and analysis (which doesn't take place for wildcard queries). Could that be it?

Re: top terms

2008-08-12 Thread Otis Gospodnetic
There is a class for doing that in contrib/miscellaneous I think, though it too probably loops through TermEnum.

Re: possible to read index into memory?

2008-08-12 Thread Chris Hostetter
: On one index, I am seeing no speed change when flipping between : RAMDirectory IndexSearcher and file system version. That is probably because even if you just use an FSDirectory, your OS will cache the disk "pages" in RAM for you -- all using a RAMDirectory does for you is guarantee that the

Re: possible to read index into memory?

2008-08-12 Thread Kalani Ruwanpathirana
Did you try this? byte[] buffer = new byte[100]; LuceneUtils.copy(fsDir, ramDir, buffer); Kalani On Wed, Aug 13, 2008 at 6:26 AM, Darren Govoni <[EMAIL PROTECTED]> wrote: > Hello, > The kind sir below recommended the RAMDirectory for loading an on-disk > index into memory (the entire data)

Re: possible to read index into memory?

2008-08-12 Thread Darren Govoni
Hello, The kind sir below recommended the RAMDirectory for loading an on-disk index into memory (the entire data) and using IndexSearcher off that. It seemed to work very well. On one index, I am seeing no speed change when flipping between RAMDirectory IndexSearcher and file system version.

top terms

2008-08-12 Thread Cam Bazz
Hello, how do we get the terms with the highest frequency for a given field? I know one can call TermEnum terms = searcher.getIndexReader().terms(), then iterate over it, filter for the required field, and count them, but is there a way to get, let's say, the top 50 terms for a given field without iterating?
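
For the selection step, a bounded min-heap keeps memory at N entries while the terms are scanned once. The sketch below is plain Java and assumes the per-term frequencies have already been collected into a map (for example, while iterating a TermEnum); the class name TopTerms and the map input are assumptions for illustration, not a Lucene API. It still iterates over every term; it only avoids sorting all of them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: picks the n most frequent terms from a term -> frequency map.
class TopTerms {
    static List<String> top(Map<String, Integer> frequencies, int n) {
        List<String> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Lowest frequency sits at the head of the heap. On equal frequency the
        // alphabetically later term is evicted first, so output order is deterministic.
        Comparator<Map.Entry<String, Integer>> weakestFirst = (a, b) -> {
            int byFrequency = Integer.compare(a.getValue(), b.getValue());
            return byFrequency != 0 ? byFrequency : b.getKey().compareTo(a.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(weakestFirst);
        for (Map.Entry<String, Integer> entry : frequencies.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) {
                heap.poll(); // drop the current weakest candidate
            }
        }
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result); // most frequent term first
        return result;
    }
}
```

The map must not be modified while top() runs, since the heap holds the map's own entries.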

Re: Searching Tokenized x Un_tokenized

2008-08-12 Thread Andre Rubin
Searches on my tokenized String field were working properly. I switched the field to un_tokenized, rebuilt the index, and now my searches only return strings that match the query string in lower case. For example, searching for 'us*': The tokenized field version would find 'USA' and 'usa' The

Case studies for Lucene in Action 2nd edition

2008-08-12 Thread Otis Gospodnetic
Hello, We are working on Lucene in Action 2nd edition. One of the well received chapters from LIA #1 was the Case Studies chapter. Case studies in this chapter are from external LIA contributors who were willing to share information about how they use Lucene. We want to refresh this chapter

Re: Hibernate Search

2008-08-12 Thread Shalin Shekhar Mangar
Perhaps Solr and DataImportHandler may also be of interest to you. http://lucene.apache.org/solr http://wiki.apache.org/solr/DataImportHandler On Tue, Aug 12, 2008 at 1:17 PM, Sascha Fahl <[EMAIL PROTECTED]>wrote: > Hi, > > what do you think about Hibernate Search to handle the indexing of datab

Re: Hibernate Search

2008-08-12 Thread Emmanuel Bernard
You probably should ask your question on the Hibernate forum as well. You will more likely find actual Hibernate Search users there :) http://forum.hibernate.org/viewforum.php?f=9 -- Emmanuel Bernard http://in.relation.to/Bloggers/Emmanuel | http://blog.emmanuelbernard.com | http://twitter.c

Re: when do internal doc IDs change?

2008-08-12 Thread Chris Hostetter
: Subject: when do internal doc IDs change? This is a FAQ... "When is it possible for document IDs to change?" http://wiki.apache.org/lucene-java/LuceneFAQ#head-e1de2630fe33fb6eb6733747a5bf870f600e1b4c : One idea I have is to maintain a set of binary on-disk files (one for : each field we want

when do internal doc IDs change?

2008-08-12 Thread Robert Stewart
We have a problem using FieldCache (or TermEnum/TermDocs directly) to pre-cache several fields. It is a bottleneck because we open a new searchable index snapshot very frequently (every minute). Each time we get a new snapshot of our master index (basically a copy using h

Re: integrating with postgres

2008-08-12 Thread Shalin Shekhar Mangar
On Tue, Aug 12, 2008 at 10:20 PM, Mark Miller <[EMAIL PROTECTED]> wrote: > Doing everything for this yourself for something simple is probably not > that much work - but in the end you're probably going to want _more_. I would > recommend you set yourself up with solr and check out > https://issues

Re: integrating with postgres

2008-08-12 Thread Mark Miller
mark wrote: Hi, I am new to Lucene. My data is stored in a table in Postgres, and I want to be able to do full-text search based on two columns. How do I integrate Postgres and Lucene? Are there any guides? Thanks. - To unsubscribe, e-m

RE: CheckIndex possibly not detecting/fixing all corruptions?

2008-08-12 Thread John O'Brien
Hi Mike, Apologies for the delay in getting back. I have since figured out that the reason Luke gave an error when we searched on the "fixed" index was (possibly) because it was a really old version (0.6 2005/02/16) - I tried again with v 0.8.1 (2008-02-13) and Luke can search on the "fixed

integrating with postgres

2008-08-12 Thread mark
Hi, I am new to Lucene. My data is stored in a table in Postgres, and I want to be able to do full-text search based on two columns. How do I integrate Postgres and Lucene? Are there any guides? Thanks.

RE: Query to ignore certain phrases

2008-08-12 Thread Steven A Rowe
Hi Jeff, I don't know of a query parser that will allow you to achieve this. However, if you can programmatically construct (at least a component of) your queries, then you may want to check out Lucene's SpanQuery functionality. In particular, using your example, if you combine a SpanFirstQuery

Re: Results by unique id's

2008-08-12 Thread Karsten F.
Hi Martin, I think you are searching for DuplicateFilter http://www.nabble.com/how-to-get--all-unique--documents-based-on-keyword-feild-to18807014.html Best regards, Karsten wysiecki wrote: > > Hello, > > thanks for help in advance. > > my example docs: > > two fields company_id and co

Re: Results by unique id's

2008-08-12 Thread Martin vWysiecki
Hello Chris, Sorry, but this is not a solution for me, because I've got more imported fields, for example a url: doc1;1;"car volvo","company1.com/volvo" doc2;1;"car toyota","company1.com/toyota" doc3;2;"car mitsubishi","company2.com/mitsubishi" doc4;2;"car skoda","company2.com/skoda" so,

Re: Results by unique id's

2008-08-12 Thread Chris Lu
Maybe re-organize the index structure as doc1;1;"car volvo","car toyota" doc2;2;"car mitsubishi","car skoda" You can add the content field twice for the same company_id. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsig

Results by unique id's

2008-08-12 Thread Martin vWysiecki
Hello, thanks for your help in advance. My example docs have two fields, company_id and content: doc1;1;"car volvo" doc2;1;"car toyota" doc3;2;"car mitsubishi" doc4;2;"car skoda" My search is "car". Now I would like to get only docs 1 and 3, because doc 2 has the same company_id as doc 1, and likewise doc 4 has the same company_id as doc 3. I
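
The requested behaviour (one hit per company_id, keeping the first in ranked order) can be sketched as a post-processing step in plain Java. This is not how DuplicateFilter is implemented; the class name FirstPerCompany and the {docName, companyId} array layout are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical post-filter: keeps the first hit seen for each company_id.
class FirstPerCompany {
    // Each hit is {docName, companyId}; the list is assumed to be in ranked order.
    static List<String> firstPerCompany(List<String[]> rankedHits) {
        // LinkedHashMap preserves the order in which company ids first appear.
        Map<String, String> firstDocByCompany = new LinkedHashMap<>();
        for (String[] hit : rankedHits) {
            firstDocByCompany.putIfAbsent(hit[1], hit[0]);
        }
        return new ArrayList<>(firstDocByCompany.values());
    }
}
```

Filtering after the search means a page of results can shrink, so a caller would typically fetch more hits than it intends to show.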

Re: Query to ignore certain phrases

2008-08-12 Thread Doron Cohen
> > I think it should look something like this > > "white house" NOT "russian white house"~1 "a b c"~1 just matches more 'easily' than "a b c". It will match, for instance, "a b d c". The NOT, however, excludes all documents which match this, unlike the requested logic. In fact, Q1: "a b" NOT "a
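
The difference between a document-level NOT and excluding individual occurrences can be made concrete with token positions. The sketch below is plain Java over a token list, not Lucene code; the class PhraseExclusion and its methods are hypothetical, and phrases are matched exactly (no slop).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of document-level NOT versus per-occurrence exclusion.
class PhraseExclusion {
    // Start positions at which `phrase` occurs as consecutive tokens.
    static List<Integer> starts(List<String> tokens, List<String> phrase) {
        List<Integer> found = new ArrayList<>();
        for (int i = 0; i + phrase.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + phrase.size()).equals(phrase)) {
                found.add(i);
            }
        }
        return found;
    }

    // Document-level NOT: a single occurrence of `unwanted` rejects the whole document.
    static boolean documentLevelNot(List<String> tokens, List<String> wanted, List<String> unwanted) {
        return !starts(tokens, wanted).isEmpty() && starts(tokens, unwanted).isEmpty();
    }

    // Per-occurrence check: accept if some `wanted` occurrence is not contained
    // in any `unwanted` occurrence.
    static boolean hasFreeOccurrence(List<String> tokens, List<String> wanted, List<String> unwanted) {
        List<Integer> excluded = starts(tokens, unwanted);
        for (int w : starts(tokens, wanted)) {
            boolean covered = false;
            for (int u : excluded) {
                if (u <= w && w + wanted.size() <= u + unwanted.size()) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                return true;
            }
        }
        return false;
    }
}
```

For a document containing both "russian white house" and a separate "white house", documentLevelNot rejects it while hasFreeOccurrence accepts it, which is the gap described above.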

Re: search for special condition.

2008-08-12 Thread Cheolgoo Kang
How about using NumberTools and range query/filters? http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/document/NumberTools.html - Cheolgoo Kang 2008/8/12 장용석 <[EMAIL PROTECTED]>: > hi. > > I am searching for lucene api or function like query "FIELD > 1000" > > For example, a user
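
Range queries compare terms as strings, so numbers have to be encoded in a form whose string order matches numeric order. The sketch below shows that idea with plain zero padding in stdlib Java. It is not the encoding NumberTools actually produces, and the names PaddedNumbers, pad, and greaterThan are made up for the example; it also handles non-negative values only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: plain zero padding, not the real NumberTools encoding.
class PaddedNumbers {
    // Number of decimal digits in Long.MAX_VALUE.
    static final int WIDTH = 19;

    // Encodes a non-negative long so that string order equals numeric order.
    static String pad(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    // Emulates an exclusive range such as PRICE > lower over encoded terms.
    static List<String> greaterThan(List<String> encodedTerms, long lower) {
        String bound = pad(lower);
        List<String> hits = new ArrayList<>();
        for (String term : encodedTerms) {
            if (term.compareTo(bound) > 0) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

Unpadded, "9" sorts after "10" as a string; padded to a fixed width, the two compare in numeric order, which is what makes a range over the indexed terms behave like a numeric comparison.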

Re: delete by doc id

2008-08-12 Thread Cam Bazz
Hello Andy, Thanks for your input. I understand what you are saying, and trying to use Lucene as a relational DB is going a little too far. However, in certain specialized areas, Lucene works better than relational databases. If you can set up the schema so that it is non-normalized, and if you don't need

Re: Clarification on deletion process...

2008-08-12 Thread Michael McCandless
Some more details below... <[EMAIL PROTECTED]> wrote: > The documentation for delete operation seems to be confusing (i am going > thru the book and also posted in the books forums...), so appreciate if > someone can let me know if my below understanding is correct. > > When i delete a document fr

search for special condition.

2008-08-12 Thread 장용석
Hi. I am looking for a Lucene API or function that supports a query like "FIELD > 1000". For example, a user wants to search for products whose price is greater than the user's input. If the user's input is 1, then the results are the products in the index matching "PRICE > 1". Is there any way to search like that? Thanks. J

Re: delete by doc id

2008-08-12 Thread Cam Bazz
I get the IDs to delete from a query in IndexSearcher. I think I am going with trunk; I hope it won't cause a lot of pain. Best, -C.B. On Sat, Aug 9, 2008 at 2:30 AM, Michael McCandless < [EMAIL PROTECTED]> wrote: > > It's risky. > > How would you get the IDs to know which ones to delete? A separate

Re: Query to ignore certain phrases

2008-08-12 Thread Alexander Aristov
I think it should look something like this "white house" NOT "russian white house"~1 http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters Alex On 12/08/2008, Jeff French <[EMAIL PROTECTED]> wrote: > > > We're trying to perform a query where if our intended s

Hibernate Search

2008-08-12 Thread Sascha Fahl
Hi, what do you think about using Hibernate Search to handle the indexing of database content? It is often a problem to keep the database and the index consistent. Does anyone here have experience using Hibernate Search for this? Regards, Sascha

Re: Query to ignore certain phrases

2008-08-12 Thread Doron Cohen
I can't see how to accomplish this without writing some special code, and not just because of query parsing. Phrases are searched by iterating the participating term positions and when a match is found say for "b c" there is no way to know whether another query "a b c d" matches exactly the corres