Yes, I am.
The UID example Michael gave gives us a way to avoid branching from the
Lucene code base.
I am trying to improve on it by storing the uid in the position (since
position info is not used for ids), which would buy us quite a bit in
load time.
-John
On Nov 19, 2007 4:28 PM, Yonik Seeley <
On Nov 19, 2007 6:39 PM, John Wang <[EMAIL PROTECTED]> wrote:
> oh, is there a way of opening that?
Well, you can keep track of position increments yourself and then
choose the correct position increment so that the position you want is
indexed. AFAIK, position increments must be positive, so y
oh, is there a way of opening that?
In the UID example Mike gave, it seems that uid can be stored in the
position part of the data.
It would be very efficient in both load time and index size to be able
to do that.
Thanks
-john
On Nov 19, 2007 1:22 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
Gentlefolk,
Well, the javadocs as patched at LUCENE-584 try to change all
the cases of zero scoring to 'non matching'.
I'm happily bracing for a minor conflict with that patch. In case
someone wants to take another look at the javadocs as
patched there, don't let me stop you...
Regards,
Paul Els
The TestSort sample represents my problem:
From the info below I need to get the list of "contents" values that
contain "x" (A,C,E,G,I) and another list of index values (5,2,3) with
no duplicates.
I get the first list using a query of this type: query = new TermQuery
(new Term ("contents", "x"))
On Nov 19, 2007 5:03 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> (I'm not actually sure how the Hits class treats negative values
All Lucene search methods except ones that take a HitCollector filter
out final scores <= 0
Solr does allow scores <=0 through since it had different collection
m
: My simplistic view has been that all the docs returned via Hits
: or HitCollector have scores > 0, and all the rest have scores of 0,
: and this view is supported by the explanation of
: HitCollector.collect
:
: " Called once for every non-zero scoring document, with the
: document number and i
On Nov 19, 2007 4:14 PM, John Wang <[EMAIL PROTECTED]> wrote:
> What is the right way to set a customized position value on a
> token at indexing time?
You set the positionIncrement, and the lucene indexing code determines
the absolute position. You can't set an absolute position yourself.
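
To make the bookkeeping concrete, here is a minimal plain-Java sketch (it does not call Lucene) that turns the absolute positions you want into the increments you would have to set. The class and method names are made up for illustration, and it assumes the position before the first token is -1, so that an increment of 1 puts the first token at position 0. It also takes the conservative reading that every increment must be at least 1.

```java
import java.util.Arrays;

class PositionIncrements {
    // Converts strictly increasing absolute positions into the position
    // increments that would produce them.
    // Assumption: the position before the first token is -1, so an
    // increment of 1 places the first token at position 0.
    static int[] toIncrements(int[] absolutePositions) {
        int[] increments = new int[absolutePositions.length];
        int previous = -1;
        for (int i = 0; i < absolutePositions.length; i++) {
            int increment = absolutePositions[i] - previous;
            if (increment < 1) {
                // Positions that repeat or go backwards cannot be expressed
                // with increments of at least 1.
                throw new IllegalArgumentException(
                    "positions must be non-negative and strictly increasing");
            }
            increments[i] = increment;
            previous = absolutePositions[i];
        }
        return increments;
    }

    public static void main(String[] args) {
        // Desired positions 0, 3 and 10 become increments 1, 3 and 7.
        System.out.println(Arrays.toString(toIncrements(new int[] {0, 3, 10})));
    }
}
```

Each computed value would then be the positionIncrement set on the corresponding token.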
Hi:
What is the right way to set a customized position value on a
token at indexing time?
Thanks
-John
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
German,
How would that work?
Do you have 2 indexes, one for the main (keyword) search and another for location?
Do you run 2 searches, the first being the main search and the second the
location search, with the filter applied?
What type of Filter do you use?
I have the bitset of the main (keyword) search, but I
Hallo Daniel;
the number returned by delete is 0, but the "uid" shows up in Luke so it is
there.
I close the reader after every delete and then re-open it for the next
delete (see my code snippets below).
Eric
Daniel Naber-10 wrote:
>
> On Sonntag, 18. November 2007, flateric wrote:
>
>> Ind
Thank you guys for your prompt answers. I'm a beginner with Lucene and some
aspects of its scoring function were still unclear to me. Your answers really
cleared things up. I guess a direct comparison with LSI is not possible
after all, only the comparison between LSI and the pure VSM.
Th
I could be mistaken, but I think the earlier answer was right; a document
with no terms matching has a score of 0, so you can assume that all
documents NOT returned by the query have a score of 0. If you look at the
scoring formula on this page, it is hard to see how you can get a negative
scor
Lucene only scores those documents that have at least one matching term;
it doesn't implement a pure vector space model whereby all documents
are scored (it uses a combination of the Boolean Model and VSM).
Thus, I am not sure you can do a pure comparison. I suppose you could
simulate the
Hi Sonia,
I agree with Erick here. Negative scores don't make sense and Lucene
never computes scores for documents that don't match a query.
E. g. if your query is: "term1 OR term2", then every document that
contains term1 or term2 or both will have a score greater than 0. But if
two docs don't c
I am trying to order all the documents in the index according to their
similarity to a given query. I am interested in having a complete list of *all*
the documents in the index with their score. From what I understood by reading
some documentation, Lucene internally assigns scores to all the do
Could you explain a bit more what problem you're trying to solve?
The reason I ask is that your question doesn't make sense to me,
since I have no idea what you expect by the term "negative score".
My simplistic view has been that all the docs returned via Hits
or HitCollector have scores > 0, and
Also, are you re-opening the reader underlying your *searcher* before
you query and still get the deleted docs?
Also, look with Luke to see if the specific uid you *think* you've deleted
is really gone.
Best
Erick
On Nov 19, 2007 6:42 AM, Daniel Naber <[EMAIL PROTECTED]> wrote:
> On Sonntag, 18
A facet is a group condition, could be a single value of the doc or a
set of filters.
On Nov 19, 2007 1:09 PM, Haroldo Nascimento <[EMAIL PROTECTED]> wrote:
> German,
>
> When you said:
> "I collect every facet's bitset ... "
> what is a facet? Is it each filter option on your site?
I have already defined a Lucene Filter for every "id" of "ubicacion".
I just create the bitset for every value, and count it against the result.
One possible optimization is to read the terms of the field you're
trying to "group"; that's the optimization we'll soon be working on in
our app.
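
The counting step described above can be sketched with plain java.util.BitSet, no Lucene required. This is only an illustration under assumptions: the class name, the helper methods and the facet labels are invented, and the per-value bitsets are presumed to have been built already (for example from one Filter per "ubicacion" id).

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class FacetCounts {
    // For each facet value, counts how many result documents also match it.
    static Map<String, Integer> count(BitSet results, Map<String, BitSet> facets) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> facet : facets.entrySet()) {
            // Work on a copy so the result bitset itself is left untouched.
            BitSet both = (BitSet) results.clone();
            both.and(facet.getValue());
            counts.put(facet.getKey(), both.cardinality());
        }
        return counts;
    }

    // Small helper that builds a bitset from document numbers.
    static BitSet bits(int... docIds) {
        BitSet set = new BitSet();
        for (int id : docIds) {
            set.set(id);
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet results = bits(0, 2, 3, 5);          // docs matching the keyword query
        Map<String, BitSet> facets = new LinkedHashMap<>();
        facets.put("ubicacion 1", bits(0, 1, 2));   // hypothetical facet values
        facets.put("ubicacion 2", bits(3, 4));
        System.out.println(count(results, facets));
    }
}
```

Cloning the result bitset per facet keeps the code simple; with many facet values it may be cheaper to reuse one scratch bitset.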
I never
German,
When you said:
"I collect every facet's bitset ... "
what is a facet? Is it each filter option on your site?
How do you get all the facets?
On Nov 19, 2007 1:05 PM, Haroldo Nascimento <[EMAIL PROTECTED]> wrote:
> German,
>
> What I need is similar to your site
> ht
German,
What I need is similar to your site
http://listados.deremate.com.ar/panaderia .
My search returns many results, but I show only some of them (for example,
the first 10 on the first page). To create the location filter options,
however, I need to read all the search results. The problem of
performa
Hi everyone,
I am trying to obtain the score for each document in the index relative to a
given query. For example, if I have the query "search file", I am trying to get
the list of all documents in the index and their scores relative to the given
query. I tried first using Hits, which gave me
I think, based on your previous question, that you just need to use
the search() method that returns TopDocs, not the lower-level
HitCollector method. From the TopDocs, you can then access the
ScoreDoc, which will give you info about the doc and the score. See http://www.lucenebootcamp.com/
Why do you need the doc's info?
If you're grouping you may not need detail on each group condition.
Here is a sample of faceted (grouped) search:
http://listados.deremate.com.ar/mp3
(Sorry, it's in Spanish)
I simply collect every facet's bitset and intersect it against the
result's bitset (keywo
Try "java -verbose" to see more info on class loading.
Also try "java -classpath yourClassPath" from the command line.
Note that separators in the classpath may differ between operating
systems - e.g. ";" in Windows but ":" in Linux...
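
As a small illustration of the separator difference, the following sketch builds a classpath string using java.io.File.pathSeparator, which is ";" on Windows and ":" on Linux. The class name is made up, and the jar file names are assumptions for illustration; use whatever jars are in your installation.

```java
import java.io.File;

class ClasspathJoin {
    // Joins classpath entries with the given separator.
    static String join(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // Jar names below are assumed, not taken from any specific install.
        String classpath = join(File.pathSeparator,
                "lucene-core-2.1.0.jar", "lucene-demos-2.1.0.jar");
        // Prints the option with the separator of the current operating system.
        System.out.println("-classpath " + classpath);
    }
}
```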
Doron
Liaqat Ali <[EMAIL PROTECTED]> wrote on 19/11/2007 15:43:30
Mark,
How can I get the Document's information? I think that belongs in the
implementation of the abstract collect method. How can I get it?
Below is the example from the Lucene javadoc.
Searcher searcher = new IndexSearcher(indexReader);
final BitSet bits = new BitSet(indexReader.maxDoc());
se
Ah, I see. This just means change into the directory via the command
line where you unpacked the installation.
HTH,
Grant
On Nov 19, 2007, at 8:34 AM, Liaqat Ali wrote:
I'm new to Lucene and want to clear up some questions.
When I unpacked Lucene, which I downloaded from the Apache site
Liaqat,
What exactly are you looking for? Are you sure you want to build the
source of Lucene and then use it? Alternatively you could simply use the
Lucene jar file (i.e. already built for you) and start playing around
with it. This jar file is bundled in the archive that you might have
downloaded.
Hi All,
I'm new to Lucene. I'm facing a problem while running the Lucene Demo to
index the Lucene source code. I downloaded the 2.1.0 version of Lucene and
extracted the binary to C:\lucene-2.1.0.
I also set up the CLASSPATH to include the Lucene Core and Lucene Demo jar files.
But when i execute the following co
I'm new to Lucene and want to clear up some questions.
When I unpacked Lucene, which I downloaded from the Apache site,
I went through the BUILD.txt file, which lists five steps to set up Lucene.
Lucene Build Instructions
$Id: BUILD.txt 476955 2006-11-19 22:28:41Z hossman $
Basic steps:
0) Instal
Can you provide more details? Are you actually using Lucene or some
third party product that uses Lucene? What steps did you take to get
this?
-Grant
On Nov 19, 2007, at 5:42 AM, Liaqat Ali wrote:
Hi All,
Can someone explain this line to me? I encountered this line while
setting up Lucene
On Sonntag, 18. November 2007, flateric wrote:
> IndexReader ir = IndexReader.open(fsDir);
> ir.deleteDocuments(new Term("uid", uid));
> ir.close();
>
> Has absolutely no effect.
What number does ir.deleteDocuments return? If it's 0, the uid cannot be
found. If it's > 0: note that you need to re
Hallo Daniel;
thank you for your quick reply.
The "uid" field exists (UN_TOKENIZED and stored). The IndexWriter is also
closed while I'm using the IndexReader to delete.
Thanks,
Eric
Daniel Naber-10 wrote:
>
> On Sonntag, 18. November 2007, flateric wrote:
>
>> Has absolutely no effect. I al
Hallo Kapil;
thanks for your quick answer:
* "An IndexReader can be opened on a directory for which an IndexWriter
is opened already, but it cannot be used to delete documents from the
index then. "
sounded like a match, but I checked that and the IndexWriter is definitely
closed.
Regards,
Eric
You should never use the hits for anything other than retrieving a group of
results (usually a page of 10-20-30 docs).
You could look at Apache Solr's implementation of faceted search.
I've used that code as a guide to group & count different facets (or
conditions, or fields, whatever you want to call them); it's pretty fast
Hi All,
Can someone explain this line to me? I encountered this line while setting up
Lucene...
Connect to the top-level of your Lucene installation
Kindly guide me in this regard.
Liaqat Ali
Hi Fayyaz,
I recommend using SAX or, maybe, a custom parser for large XML files. It
should be faster than using Digester. The main difference between those XML
parsers is that Digester needs to load the entire XML document in memory when
it creates those objects, whereas you can parse the doc