This came up in the list with several solutions - look for:
Asserting that a value must match the entire content of a field
Doron
"Kainth, Sachin" <[EMAIL PROTECTED]> wrote on 13/03/2007
03:18:50:
> Hi all,
>
> Is it possible to search whether a term is equal to the entire contents
> of a field rather than that the field contains a term?
Hi all,
I am trying to analyze a sample .tii file (Lucene 2.0.0). IndexTermCount is 2 in the
file, but I don't know the meaning of these bytes "00 00 FF FF FF FF 0F 00 00
00 14" after the SkipInterval field. It shall be a
according to the file format. Who can help me with this? Thanks a lot.
The att
I'd like to add a field to every document in an index... that I'd
rather not rebuild from scratch (yet). This is behind Solr (so a
ParallelReader won't work without core modifications, right?).
Is there a way I could create an index with the same number of
documents and only the new field
Hmmm... now I wonder whether it is possible to access this lengthNorm value so
that it can be used as before, but without creating any .nrm file -->
setOmitNorms(true)
Any other suggestion on how I could get the same ranking as before by making use
of this lengthNorm but without creating the .nrm file?
You can store the fields in the index itself if you want, without indexing them
(just flag it as stored/unindexed). I believe storing fields should not incur
the "norms" size problem, please correct me if I'm wrong.
Thanks,
Xiaocheng
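Xiaocheng's stored-but-unindexed suggestion might look like this in the 2.x API (the field name and value here are made up for illustration):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// A stored, unindexed field: the value is kept verbatim for retrieval,
// but no postings and no norms are written for it.
Document doc = new Document();
doc.add(new Field("rawScore",        // hypothetical field name
                  "0.87",            // retrievable later via doc.get("rawScore")
                  Field.Store.YES,
                  Field.Index.NO));  // not searchable, so no norms entry
```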
maureen tanuwidjaja <[EMAIL PROTECTED]> wrote: Ya...I think i
Ya...I think i will store it in the database so that later it could be used in
scoring/ranking for retrieval...:)
Another thing i would like to see is whether the precision or recall will be
much affaected by this...
Regards,
Maureen
Xiaocheng Luan <[EMAIL PROTECTED]> wrote:One side
Or, you may index the fields for which you want "exact matches" as non-tokenized.
Thanks,
Xiaocheng
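A sketch of the non-tokenized approach (the field name is assumed for illustration): with Field.Index.UN_TOKENIZED the whole value is indexed as a single term, so an exact whole-field search becomes a single TermQuery.

```java
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;

// Index the field without tokenizing, so the whole value is one term.
Field title = new Field("title", "world cup",
                        Field.Store.YES,
                        Field.Index.UN_TOKENIZED);

// The exact-value query is then a single TermQuery: it matches only
// documents whose field value is exactly "world cup", not just "cup".
TermQuery exact = new TermQuery(new Term("title", "world cup"));
```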
Bhavin Pandya <[EMAIL PROTECTED]> wrote: Hi kainth,
>So for example if I have a field with this text: "world cup" and I do a
>search for "cup" I want it to return false but for another field that
>contains exactly the text "cup" I want the result to be true.
One side-effect of turning off the norms may be that the scoring/ranking will
be different? Do you need to search by each of these many fields? If not, you
probably don't have to index these fields (but store them for retrieval?).
Just a thought.
Xiaocheng
Michael McCandless <[EMAIL PROTECTED]>
John - a bug with code is best. No gods here.
Otis
- Original Message
From: John Wang <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, March 9, 2007 2:22:35 AM
Subject: FieldCache: flush cache explicitly
I think the API should allow for explicitly flushing the FieldCache.
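For context, the 2.x FieldCache has no explicit flush: entries live in a WeakHashMap keyed by the IndexReader, so the only way to evict them is to close and drop the reader. A sketch (the reader variable and field name are assumed):

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

// The first call populates the cache for this reader; later calls reuse it.
int[] prices = FieldCache.DEFAULT.getInts(reader, "price");

// There is no FieldCache.flush() in 2.x: the entry is only reclaimed
// after the reader is closed and becomes garbage-collectable.
reader.close();
reader = null;
```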
Erick Erickson wrote:
The javadocs point out that this line
int nb = mIndexReaderClone.deleteDocuments(urlTerm)
removes *all* documents for a given term. So of course you'll fail
to delete any documents the second time you call
deleteDocuments with the same term.
Isn't the code snippet below
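Erick's point, sketched (the reader and term are assumptions): the return value is the number of documents deleted, so a second call with the same term legitimately returns 0 rather than failing.

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;

Term urlTerm = new Term("url", "http://example.com/page"); // hypothetical term

// The first call removes every document containing the term and
// returns how many were deleted.
int first = reader.deleteDocuments(urlTerm);

// A second call with the same term finds nothing left to delete
// and therefore returns 0 -- expected behavior, not a bug.
int second = reader.deleteDocuments(urlTerm);
```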
I have read that with Lucene it is not possible to do wildcard searches
with * or ? as the first character. Wildcard searches with * as the
Lucene supports it. If you are using QueryParser to parse your queries see
http://lucene.apache.org/java/docs/api/org/apache/lucene/queryParser/QueryPars
It means it returns the term vectors for all the fields on that
document where you have enabled TermVector when creating the Document,
i.e. new Field(..., TermVector.YES) (see
http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.TermVector.html
for the full array of options).
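A sketch of both ends, assuming a hypothetical "body" field and an open IndexReader:

```java
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;

// At indexing time: enable term vectors for the field.
Field body = new Field("body", "some text ...",
                       Field.Store.NO,
                       Field.Index.TOKENIZED,
                       Field.TermVector.YES);

// At search time: read the vector back for a given document number.
TermFreqVector tfv = reader.getTermFreqVector(docId, "body");
String[] terms = tfv.getTerms();         // the distinct terms in the field
int[] freqs = tfv.getTermFrequencies();  // parallel array of frequencies
```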
It's possible to do leading wildcard searches in Lucene as of 2.1. See
http://wiki.apache.org/lucene-java/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695
(http://tinyurl.com/366suf)
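With 2.1's QueryParser that looks like the following (the field name and query string are illustrative; parse() can throw ParseException):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

QueryParser parser = new QueryParser("body", new StandardAnalyzer());

// Off by default, because a leading wildcard forces a scan of the
// entire term dictionary, which can be slow on large indexes.
parser.setAllowLeadingWildcard(true);

// e.g. match German compounds ending in "wagen".
Query q = parser.parse("*wagen");
```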
-Original Message-
From: Oystein Reigem [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 13, 2007 11
Hi,
I have read that with Lucene it is not possible to do wildcard searches
with * or ? as the first character. Wildcard searches with * as the
first character (or both first and last character) are useful for text
in languages that have a lot of compound words, like German and the
Scandinavian languages.
Hi all,
The documentation for the above method mentions something called a
vectorized field. Does anyone know what a vectorized field is?
Thank you Erick,
I'll look more into the docs to check why I get a search result and no
deletion ...
you could have been less rude to me though ...
I feel like a very mean person now :-(
anyway, thank you for your time
__
Matt
-Original Message-
From: Erick Eri
Well, don't label things urgent. Since this forum is free, you have
no right to demand a quick response.
You'd get better responses if there was some evidence that you
actually tried to find answers to your questions before posting
them. We all have other duties, and taking time out to answer
Hi,
I have marked this question as "urgent" because I notice I often don't
get answers.
If I'm asking the wrong way, please tell me...
Before I delete a document, I search for it in the index to be sure there is
a hit (via a Term object).
When I find a hit, I delete the document (with the same Term
Mark Miller wrote:
Depends on the work you want to do. If you want to highlight a simple
XML doc the approach would be to extract all of the text elements and
run them through the highlighter and then correctly update them. That
would be mostly simple DOM manipulation.
OK.
I guess there wil
When performing a query and getting a result set back, if one wants to
know which terms from the query actually matched, is Highlighter still
the best way to go with the latest Lucene, or should I start looking
at query term frequency vectors?
Just trying to find a non-expensive way of doing this
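One inexpensive sketch, assuming term vectors were enabled on the field at indexing time: rewrite the query (so wildcard/prefix queries expand to concrete terms), extract its terms, and intersect them with the document's term vector. The field name, reader, and docId are assumptions.

```java
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.search.Query;

// Collect the terms the (rewritten) query actually searches for.
Set queryTerms = new HashSet();
query.rewrite(reader).extractTerms(queryTerms);

// Intersect with the document's term vector to see which terms matched.
TermFreqVector tfv = reader.getTermFreqVector(docId, "body");
String[] docTerms = tfv.getTerms();
for (int i = 0; i < docTerms.length; i++) {
    if (queryTerms.contains(new Term("body", docTerms[i]))) {
        System.out.println("matched: " + docTerms[i]);
    }
}
```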
OK Mike, I'll try it and see whether it could work :) then I will proceed to
optimize the index.
Well then I guess it's fine to use the default value for maxMergeDocs, which
is Integer.MAX_VALUE?
Thanks a lot
Regards,
Maureen
Michael McCandless <[EMAIL PROTECTED]> wrote:
"maureen tanuwidj
"maureen tanuwidjaja" <[EMAIL PROTECTED]> wrote:
> How to disable lucene norm factor?
Once you've created a Field, and before adding it to your Document,
just call field.setOmitNorms(true).
Note, however, that you must do this for all Field instances by that
same field name because whenever
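That might look like the following (the field name and value are illustrative):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

Document doc = new Document();
Field body = new Field("body", "some text ...",
                       Field.Store.NO, Field.Index.TOKENIZED);

// Must be set on every Field instance with this name, in every
// document, or norms will still be written for the field.
body.setOmitNorms(true);
doc.add(body);
```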
Depends on the work you want to do. If you want to highlight a simple
XML doc the approach would be to extract all of the text elements and
run them through the highlighter and then correctly update them. That
would be mostly simple DOM manipulation. The same approach should work
with any format.
Hi all,
How do I disable the Lucene norm factor?
Thanks,
Maureen
Hi,
I want to implement fulltext search on a collection of documents. I try
to figure out which system is the better choice - eXist, or Lucene, or
some combination of the two. I have some knowledge of eXist, but don't
know too much about Lucene.
I'd like to display the result of a search as
Hi Mike,
How do I disable/turn off the norms? Is it done while indexing?
Thanks,
Maureen
"maureen tanuwidjaja" <[EMAIL PROTECTED]> wrote:
> "The only simple workaround I can think of is to set maxMergeDocs to
> keep all segments "small". But then you may have too many segments
> with time. Either that or find a way to reduce the number of unique
> fields that you actually need to
"Michael McCandless" <[EMAIL PROTECTED]> wrote:
> The only simple workaround I can think of is to set maxMergeDocs to
> keep all segments "small". But then you may have too many segments
> with time. Either that or find a way to reduce the number of unique
> fields that you actually need to store."
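The workaround Michael describes might be set up like this (the index path and the cap of 100,000 documents are arbitrary examples):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

// Open an existing index (create=false).
IndexWriter writer = new IndexWriter("/path/to/index",   // hypothetical path
                                     new StandardAnalyzer(), false);

// Cap segment size so merges never build one huge segment; the
// trade-off is more segments (and slower searches) over time.
writer.setMaxMergeDocs(100000);
```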
Oops, sorry, mistyping...
Searches take 30 seconds to 3 minutes, which is actually
quite unacceptable for the "search engine" I am building... Is there any
recommendation on how searching could be made faster?
maureen tanuwidjaja <[EMAIL PROTECTED]> wrote: Hi mike
"The on
Hi mike
"The only simple workaround I can think of is to set maxMergeDocs to
keep all segments "small". But then you may have too many segments
with time. Either that or find a way to reduce the number of unique
fields that you actually need to store."
It is not possible for me to reduce
Hi,
I need to merge several indexes (I call them incremental indexes) with my
main index.
Each incremental index can contain the same URLs as the main index;
that's why I have a list of URLs to update, which I will delete from the
main index before merging with an incremental index.
I have also
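The delete-then-merge flow described above might be sketched like this (the directories, the "url" field, and the urlsToUpdate array are assumptions drawn from the description):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;

// 1) Delete the stale URLs from the main index via an IndexReader.
IndexReader reader = IndexReader.open(mainDir);
for (int i = 0; i < urlsToUpdate.length; i++) {
    reader.deleteDocuments(new Term("url", urlsToUpdate[i]));
}
reader.close(); // commits the deletes

// 2) Merge the incremental indexes in with an IndexWriter.
IndexWriter writer = new IndexWriter(mainDir, new StandardAnalyzer(), false);
writer.addIndexes(new Directory[] { incDir1, incDir2 });
writer.close();
```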
"maureen tanuwidjaja" <[EMAIL PROTECTED]> wrote:
> "One thing that stands out in your listing is: your norms file
> (_1ke1.nrm) is enormous compared to all other files. Are you indexing
> many tiny docs where each docs has highly variable fields or
> something?"
>
> Ya I also confuse
Hi Mike..
"One thing that stands out in your listing is: your norms file
(_1ke1.nrm) is enormous compared to all other files. Are you indexing
many tiny docs where each docs has highly variable fields or something?"
Ya, I am also confused about why this .nrm file is tremendous in size.
I am ind
Hi kainth,
So for example if I have a field with this text: "world cup" and I do a
search for "cup" I want it to return false but for another field that
contains exactly the text "cup" I want the result to be true.
You fire only a phrase query on the first field, where you want only "world
cup"
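Bhavin's phrase-query suggestion, sketched (the field name is assumed). Note that a phrase query still matches a longer field that merely contains the phrase, so for true whole-field equality an untokenized field is safer:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.PhraseQuery;

// Matches documents whose field contains "world" immediately
// followed by "cup" -- i.e. the exact phrase, though possibly
// embedded in a longer field value.
PhraseQuery pq = new PhraseQuery();
pq.add(new Term("title", "world"));
pq.add(new Term("title", "cup"));
```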
Hi all,
Is it possible to search whether a term is equal to the entire contents
of a field rather than that the field contains a term?
So for example if I have a field with this text: "world cup" and I do a
search for "cup" I want it to return false but for another field that
contains exactly the text "cup" I want the result to be true.
"maureen tanuwidjaja" <[EMAIL PROTECTED]> wrote:
> How much disk space is actually needed to optimize the index? The
> explanation given in the documentation seems very different from the
> practical situation.
>
> I have an index of size 18.6 GB and I am going to optimize it. I
Dear All,
How much disk space is actually needed to optimize the index? The
explanation given in the documentation seems very different from the
practical situation.
I have an index of size 18.6 GB and I am going to optimize it. I keep
this index on a mobile hard disk with a capacit