Hi Ian,

Thanks for the reply. I am not sure the bq solution will be able to solve the problem. Let me explain with an example:

document 1 - (some text)
IBM - 0.6
Google - 0.1
Apple - 0.4

Now suppose I index the "company name" and the "confidence scores" as separate fields and search using the bq, where the numeric field clause is "anything below 0.5" and the text clause is "IBM". Document 1 will then be matched by mistake, because it has been stored with 0.6, 0.1 and 0.4, and both 0.1 and 0.4 are below 0.5. But it should not match, as the "IBM" score is 0.6.

In short, this problem needs some sort of linking between each company name and its own score.

--d

On Wed, Mar 21, 2012 at 10:41 AM, Ian Lea <ian....@gmail.com> wrote:
> Why do you want to link name and confidence in one field? Store
> confidence as a NumericField and search something like
>
> BooleanQuery bq = new BooleanQuery();
> Query nameq = parser.parse(...) or whatever
> Query confq = NumericRangeQuery.newXxx(...);
> bq.add(nameq, ...);
> bq.add(confq, ...);
>
> and search using bq.
>
>
> --
> Ian.
>
>
> On Wed, Mar 21, 2012 at 2:20 PM, Deb Lucene <deb.luc...@gmail.com> wrote:
> > Hi Group,
> >
> > Sorry for cross posting!
> >
> > We need to index a document corpus (news articles) with some meta data
> > features. The meta data are actually company names with some scoring (a
> > double, between 0 and 1). For example, two documents can be -
> >
> > document 1
> > (some text - say a technical article from NY times). It comes with the
> > metadata like -
> > IBM - 0.5
> > Google - 0.9
> > Apple - 0.3
> >
> > where 0.5, 0.9, 0.3 are some confidence scores for the company names.
> >
> > Similarly, the document 2 is about some IT article and then the meta data
> > are like -
> > IBM - 0.6
> > Google - 0.1
> > Apple - 0.4
> >
> > now we can index the documents based on the contents or the company names
> > easily. But here the problem is we need to create a "field" where the
> > company names and the scores are linked. So that we can search something
> > like -
> >
> > query = where the "company name" (a field) is "IBM" and the scores of IBM
> > is > 0.5.
> > So in that case the document 2 will be retrieved.
> >
> > I am wondering if anyone has ideas about using the company names and scores
> > (linked) together as a field.
> >
> > Thanks in advance,
> >
> > --d
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
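
P.S. To make the false match concrete, here is a minimal sketch in plain Java. It does not use Lucene at all; it only models the two ways of looking at the metadata, under the assumption that indexing names and scores as two independent multi-valued fields behaves like "any name matches AND any score matches". The class and method names (LinkedScoreDemo, matchesFlattened, matchesLinked) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LinkedScoreDemo {

    // Models two independent fields: the name clause and the score clause
    // are each satisfied by ANY value in the document, with no pairing.
    static boolean matchesFlattened(Map<String, Double> meta, String company, double below) {
        boolean nameHit = meta.containsKey(company);
        boolean scoreHit = false;
        for (double score : meta.values()) {
            if (score < below) {
                scoreHit = true;
            }
        }
        return nameHit && scoreHit;
    }

    // Models the linked behaviour I am after: the score that is tested
    // must be the one that belongs to the requested company.
    static boolean matchesLinked(Map<String, Double> meta, String company, double below) {
        Double score = meta.get(company);
        return score != null && score < below;
    }

    public static void main(String[] args) {
        // The "document 1" metadata from the example above.
        Map<String, Double> doc1 = new LinkedHashMap<>();
        doc1.put("IBM", 0.6);
        doc1.put("Google", 0.1);
        doc1.put("Apple", 0.4);

        // Query: company = "IBM" and score below 0.5.
        System.out.println("flattened match: " + matchesFlattened(doc1, "IBM", 0.5)); // true (wrong)
        System.out.println("linked match: " + matchesLinked(doc1, "IBM", 0.5));       // false (wanted)
    }
}
```

The flattened check passes because Google (0.1) and Apple (0.4) satisfy the score clause even though IBM does not, which is exactly the mismatch I am worried about with the two-field bq.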