Re: Am I correctly parsing the strings ? Terms or Phrases ?

2011-03-22 Thread Patrick Diviacco
Your answer is quite clear, but my question is a bit more specific: as you can see from my snippet (I copy it here again), I'm already using BooleanQuery and the QueryParser.parse method. booleanQuery.add(new QueryParser(org.apache.lucene.util.Version.LUCENE_40, "tags", new WhitespaceAnalyzer(org.apache

Re: Results: get per field scores ?

2011-03-22 Thread Patrick Diviacco
I'm not only interested in ordering relevant docs, but also in measuring similarity per field. (More specifically, I'm passing the results to a classifier, but this is off topic.) On 22 March 2011 23:27, Erick Erickson wrote: > yes, explain will slow things down, but I'd assumed that > you only cared

Re: RE: ParallelMultisearcher

2011-03-22 Thread Lance Norskog
You should not use the ParallelMultiSearcher. It will not be in the 4.0 release. On Thu, Mar 17, 2011 at 11:38 AM, Uwe Schindler wrote: > Hi Ganesh, > > this method is also in 2.9.1, it is just inherited from the superclass! You > have to also look at the complete javadocs. Not every method that

Re: TermDoc to TermDocsEnum

2011-03-22 Thread nitinhardeniya
Where can I find MIGRATE.txt? On Wed, Mar 23, 2011 at 3:07 AM, nitin hardeniya wrote: > hey > > no null doesn't work > > I have tried > tds=MultiFields.getTermDocsEnum(reader, null, "content", term); > > this is not showing error but now it shows error at > getSpans() > > Thanks > Nitin > > On W

Re: TermDoc to TermDocsEnum

2011-03-22 Thread nitinhardeniya
Hey, no, null doesn't work. I have tried tds=MultiFields.getTermDocsEnum(reader, null, "content", term); this does not show an error, but now it shows an error at getSpans(). Thanks, Nitin On Wed, Mar 23, 2011 at 1:30 AM, Michael McCandless-2 [via Lucene] < ml-node+2716623-544966764-77...@n3.nabble.com>

Re: Results: get per field scores ?

2011-03-22 Thread Erick Erickson
yes, explain will slow things down, but I'd assumed that you only cared about this for debugging. What is the use-case for having it on all the time? Best Erick On Tue, Mar 22, 2011 at 12:40 PM, Patrick Diviacco wrote: > I've been told search explain should be used for debugging only because it

Re: TermDoc to TermDocsEnum

2011-03-22 Thread Michael McCandless
Try looking at MIGRATE.txt? Passing null for the skipDocs should be fine. Likely you need to use MultiFields.getTermDocsEnum, but that entails a performance hit (vs going segment by segment yourself). Mike http://blog.mikemccandless.com On Tue, Mar 22, 2011 at 1:56 PM, nitinhardeniya wrote: >

TermDoc to TermDocsEnum

2011-03-22 Thread nitinhardeniya
Hi, I have code that works fine with Lucene 3.2, where I used TermDocs to find the corpusTF. Here is the code: public void calculateCorpusTF(IndexReader reader) throws IOException { // TODO Auto-generated method stub Iterator it = word.iterator(); I

Re: Distributing a Lucene application?

2011-03-22 Thread Chris Lu
Each database having its own index should be fine. However, just checking the modified timestamp may not be enough, since items could have been deleted. You can check DBSight for this purpose. It can do remote index replication across a WAN. But, if the NY index is synchronized before NY database doe

Re: Results: get per field scores ?

2011-03-22 Thread Patrick Diviacco
I've been told search explain should be used for debugging only because it slows computations down a lot. Is that true? On 22 March 2011 14:29, Erick Erickson wrote: > Try Searcher.explain. > > Best > Erick > > On Tue, Mar 22, 2011 at 4:34 AM, Patrick Diviacco > wrote: > > Is there a way to disp

Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-22 Thread David Causse
On Tue, Mar 22, 2011 at 04:15:53PM +0100, fr.jur...@voila.fr wrote: > The only thing I need is the middle layer: a Java component extending > Lucene, that'd pull a plausible Analyzer out of its magic hat, for every > ISO 639-1 language tag however unlikely that turns up in the RDF input. > Not ju

Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-22 Thread fr . jurain
Hi Luceners, this is my first experience with ARQ, LARQ & Lucene; everything went smoothly so far, however the slope seems to be getting steeper suddenly. The initial problem was to develop a Java app to build, then to browse through, a repository of RDF data. With TDB/ARQ, this is now running smoot

Re: how to get all documents in the results ?

2011-03-22 Thread mailtojiangmingyuan
Yes, I think the "MatchAllDocsQuery" should meet your need. -- Original -- From: "Anshum"; Date: Tuesday, March 22, 2011, 7:40 PM To: "java-user"; Subject: Re: how to get all documents in the results ? Hi Patrick, You may have a look at this, perhaps this will help you

Re: Building a query of single terms...

2011-03-22 Thread Patrick Diviacco
All right. I still have one last question. If I pass a new QueryParser to the booleanQuery.add method, am I actually passing multiple single terms, or is it the same as just passing the string? See example below, please... booleanQuery.add(new QueryParser(org.apache.lucene.util.Version.LUCENE_40, "

Re: Results: get per field scores ?

2011-03-22 Thread Erick Erickson
Try Searcher.explain. Best Erick On Tue, Mar 22, 2011 at 4:34 AM, Patrick Diviacco wrote: > Is there a way to display Lucene scores per field instead of the global one > ? > Both my query and my docs have 3 fields. > > I would like to see the scores for each field in the results. Can I ? > > Or

Re: Am I correctly parsing the strings ? Terms or Phrases ?

2011-03-22 Thread Erick Erickson
A good habit to develop is to print out the toString() of the assembled queries, that'll get you going pretty quickly understanding what the query assembly is all about without having to wait for people to respond. But the short form is that phrase queries require all the terms to be adjacent, whi

Re: Performance problems with lazily loaded fields

2011-03-22 Thread Erick Erickson
Don't do that. Let's back up a second and ask why in the world you want to do this, what's the use-case you're satisfying? Because spinning through all the results and getting information from the underlying documents is inherently expensive since, as Sanne says, you're doing disk seeks. Most L

Re: Building a query of single terms...

2011-03-22 Thread Erick Erickson
The easiest way to figure out this kind of thing is to print out the toString() on the queries after they're assembled. I believe you'll find that the difference is that the PhraseQuery would find text like "Term1 Term2 Term3" but not text like "Term1 some stuff Term2 more stuff Term3" whereas Bool
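To make the distinction above concrete, here is a minimal plain-Java sketch that does not use Lucene at all. The class and method names (PhraseVsBooleanSketch, matchesPhrase, matchesAnyTerm) are hypothetical, and whitespace splitting merely stands in for an analyzer. Lucene's real PhraseQuery and BooleanQuery work on an inverted index and also handle slop and scoring, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
final class PhraseVsBooleanSketch {

    // Whitespace tokenization stands in for an analyzer (assumption).
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Phrase-style match: all query terms must be adjacent and in order.
    static boolean matchesPhrase(String text, String phrase) {
        return Collections.indexOfSubList(tokens(text), tokens(phrase)) >= 0;
    }

    // Boolean SHOULD-style match: at least one query term occurs anywhere.
    static boolean matchesAnyTerm(String text, String phrase) {
        List<String> doc = tokens(text);
        for (String term : tokens(phrase)) {
            if (doc.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String query = "Term1 Term2 Term3";
        String adjacent = "Term1 Term2 Term3";
        String scattered = "Term1 some stuff Term2 more stuff Term3";
        System.out.println("phrase/adjacent:   " + matchesPhrase(adjacent, query));
        System.out.println("phrase/scattered:  " + matchesPhrase(scattered, query));
        System.out.println("boolean/scattered: " + matchesAnyTerm(scattered, query));
    }
}
```

In the real library, printing toString() on the assembled query, as suggested above, remains the quickest way to see which of the two shapes a parser actually produced.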

Re: How to normalize Lucene scores... (over all queries)

2011-03-22 Thread Erick Erickson
You can't. If by "normalize" you mean compare the scores between two different queries, it's meaningless. The scores from one query to another are not comparable. If by "normalize" you mean make into a value between 0 and 1, anywhere you have access to raw scores I believe you also have access to
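For the second meaning of "normalize" (mapping scores into the range 0 to 1), a minimal plain-Java sketch follows. The message above is cut off before it names the value to divide by, so dividing by the largest score in the result set is an assumption here, and the class and method names are hypothetical. As the message points out, the resulting values are still not comparable across different queries.

```java
// Hypothetical illustration only; this is not Lucene code.
final class ScoreNormalizationSketch {

    // Scales raw scores into [0, 1] by dividing by the largest score
    // (assumption: the maximum is the chosen reference value).
    static float[] normalizeByMax(float[] scores) {
        float max = 0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        float[] normalized = new float[scores.length];
        if (max <= 0f) {
            return normalized; // nothing to scale; leave all zeros
        }
        for (int i = 0; i < scores.length; i++) {
            normalized[i] = scores[i] / max;
        }
        return normalized;
    }

    public static void main(String[] args) {
        float[] raw = {2.0f, 1.0f, 0.5f};
        System.out.println(java.util.Arrays.toString(normalizeByMax(raw)));
    }
}
```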

Re: how to get all documents in the results ?

2011-03-22 Thread Anshum
MatchAllDocs does not consider only a single field but all fields, i.e. it takes a *:* query. 1. Snip: Query query = new MatchAllDocsQuery(); TopDocs td = is.search(query, ir.numDocs()); ScoreDoc[] scoreDocs = td.scoreDocs; for (ScoreDoc scoreDoc : scoreDocs) { ... Your code... } /Sn

Re: how to get all documents in the results ?

2011-03-22 Thread Patrick Diviacco
1. "all" docs 2. because matchalldocs only considers one field at a time. I'm searching over multiple fields instead. 3. could you tell me more about this ? It might be a solution! On 22 March 2011 12:18, Anshum wrote: > so a few things > 1. are you looking to get 'all' documents or only docs m

Re: how to get all documents in the results ?

2011-03-22 Thread Anshum
so a few things 1. are you looking to get 'all' documents or only docs matching your query? 2. if it's about fetching all docs, why not use the matchalldocs query? 3. did you try using a collector instead of topdocs? -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011 at 4:46 PM, Pat

Re: how to get all documents in the results ?

2011-03-22 Thread Patrick Diviacco
I don't think the link you suggested can help, but maybe I'm wrong. Also, the parameter MAX_HITS is not useful: it just limits the results, it doesn't add the non-relevant docs. On 22 March 2011 12:10, Anshum wrote: > Hi Patrick, > You may have a look at this, perhaps this will help you with i

Re: how to get all documents in the results ?

2011-03-22 Thread Anshum
Hi Patrick, You may have a look at this, perhaps this will help you with it. Let me know if you're still stuck. http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011 at 4:10 PM, wrote: > Not sure what yo

Grouping...

2011-03-22 Thread Dawn Zoë Raison
Hi Folks, Before I run off and reinvent the wheel here - has anyone done any form of result grouping with Lucene? My use case looks something like this: Newspaper pages are stored as documents in the Lucene index. I need to list the newspapers that match my criteria in date order, so that I ca

RE: how to get all documents in the results ?

2011-03-22 Thread karl.wright
Not sure what your use case actually is, but it sounds like you may be unclear how Lucene works. Each query clause you have will produce an iterator that walks over the documents that match that clause. All the documents from the entire, root query get scored. The scoring evaluation per docum

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Anshum
Yes, that's how it's generally done. Also, you should just handle data/fields aptly rather than trying to avoid them in the first place. You could safely add these, use these internally and never return these or use these for an end user search. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue,

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread shrinath.m
On Tue, Mar 22, 2011 at 3:57 PM, Anshum-2 [via Lucene] < ml-node+2714275-817267840-376...@n3.nabble.com> wrote: > Also, > Is there a particular reason why you wouldn't want to index that > considering > you'd want to 'update' documents. > I'd not mind indexing it, but I wouldn't want it to pop up

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Anshum
Also, is there a particular reason why you wouldn't want to index that, considering you'd want to 'update' documents? It's good practice to index the unique field, especially if you have one. It has generally helped more often than not. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 201

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Michael Wechner
On 3/22/11 10:09 AM, shrinath.m wrote: On Tue, Mar 22, 2011 at 1:37 PM, Michael Wechner [via Lucene]< ml-node+2714008-984126374-376...@n3.nabble.com> wrote: are you looking for something like http://hrycan.com/2009/11/26/updating-document-fields-in-lucene/ ? Precisely that. I am OK with st

Re: Distributing a Lucene application?

2011-03-22 Thread Johannes Zillmann
You can have a look at http://katta.sourceforge.net/ or http://wiki.apache.org/solr/SolrCollectionDistributionScripts HTH Johannes On Mar 22, 2011, at 9:30 AM, sol myr wrote: > Hi, > What are my options for distributing an application that uses Lucene? > > Our current application works against a d

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread shrinath.m
On Tue, Mar 22, 2011 at 1:37 PM, Michael Wechner [via Lucene] < ml-node+2714008-984126374-376...@n3.nabble.com> wrote: > are you looking for something like > > http://hrycan.com/2009/11/26/updating-document-fields-in-lucene/ > > ? > Precisely that. I am OK with storing the fields, but I wanted to

Results: get per field scores ?

2011-03-22 Thread Patrick Diviacco
Is there a way to display Lucene scores per field instead of the global one ? Both my query and my docs have 3 fields. I would like to see the scores for each field in the results. Can I ? Or should I run the query 3 times for each single field ? thanks

Distributing a Lucene application?

2011-03-22 Thread sol myr
Hi, What are my options for distributing an application that uses Lucene? Our current application works against a database of INVENTORY. We schedule hourly checks for modified items (timestamp-based), and update a single Lucene index. Now we want to distribute our application to a Grid, with fail

how to get all documents in the results ?

2011-03-22 Thread Patrick Diviacco
I'm using the following code because I want to see the entire collection in my query results: //adding wildcards-term to see all results rest = new TermQuery(new Term("*","*")); booleanQuery.add(rest, BooleanClause.Occur.SHOULD); But it doesn't work, I only see the relevant docs and not all the o

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Michael Wechner
On 3/22/11 8:40 AM, shrinath.m wrote: On Tue, Mar 22, 2011 at 12:39 PM, Anshum-2 [via Lucene]< ml-node+2713899-1210341880-376...@n3.nabble.com> wrote: No as of now, there's no way to do so. Thank you Anshum-2, how do you propose I do this ? I have thought of a way like this : - first get the

Re: Am I correctly parsing the strings ? Terms or Phrases ?

2011-03-22 Thread Patrick Diviacco
OK, so I'm currently doing this: booleanQuery.add(new QueryParser(org.apache.lucene.util.Version.LUCENE_40, "tags", new WhitespaceAnalyzer(org.apache.lucene.util.Version.LUCENE_40)).parse(phrase[i]), BooleanClause.Occur.SHOULD); I just want to add single terms to my booleanQuery. If I pass a q

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread shrinath.m
On Tue, Mar 22, 2011 at 12:39 PM, Anshum-2 [via Lucene] < ml-node+2713899-1210341880-376...@n3.nabble.com> wrote: > No as of now, there's no way to do so. Thank you Anshum-2, how do you propose I do this ? I have thought of a way like this : - first get the doc based on a unique id into a HashMa
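The read-modify-replace idea sketched in this message can be illustrated in plain Java, with a HashMap standing in for the index. Everything here is hypothetical (the class PartialUpdateSketch, its methods, and the use of string-valued fields), and it only models the control flow. In Lucene itself the replacement step would typically go through IndexWriter.updateDocument(Term, Document), which deletes the old document and adds the new one, and only stored fields can be read back from the old document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an index keyed by a unique id.
final class PartialUpdateSketch {

    private final Map<String, Map<String, String>> docsById = new HashMap<>();

    // Adds (or overwrites) a whole document.
    void add(String id, Map<String, String> fields) {
        docsById.put(id, new HashMap<>(fields));
    }

    // "Partial update": load the old document, change one field,
    // then replace the whole document under the same id.
    boolean updateField(String id, String field, String value) {
        Map<String, String> existing = docsById.get(id);
        if (existing == null) {
            return false; // no document with this id
        }
        Map<String, String> replacement = new HashMap<>(existing);
        replacement.put(field, value);
        docsById.put(id, replacement);
        return true;
    }

    // Returns the field value, or null if the document or field is missing.
    String get(String id, String field) {
        Map<String, String> doc = docsById.get(id);
        return doc == null ? null : doc.get(field);
    }

    public static void main(String[] args) {
        PartialUpdateSketch index = new PartialUpdateSketch();
        Map<String, String> fields = new HashMap<>();
        fields.put("title", "old title");
        fields.put("body", "unchanged body");
        index.add("42", fields);
        index.updateField("42", "title", "new title");
        System.out.println(index.get("42", "title") + " / " + index.get("42", "body"));
    }
}
```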

Re: Is it possible to update only selected fields in a document ?

2011-03-22 Thread Anshum
Hi, No as of now, there's no way to do so. -- Anshum Gupta http://ai-cafe.blogspot.com On Tue, Mar 22, 2011 at 12:29 PM, shrinath.m wrote: > I am asking for partial update in Lucene, > where I want to update only a selected field of all fields in the document. > Does Lucene provide any way to

Is it possible to update only selected fields in a document ?

2011-03-22 Thread shrinath.m
I am asking for partial update in Lucene, where I want to update only a selected field of all fields in the document. Does Lucene provide any way to do this ? How to approach this ? -- View this message in context: http://lucene.472066.n3.nabble.com/Is-it-possible-to-update-only-selected-fie