Hi,
I'm new to Lucene. I downloaded Lucene 2.4.1.
I have one XML file which contains a few special characters like 'å', 'ø', '°',
etc. (these are Danish-language characters).
How can I search for them?
Uday Kumar Reddy Maddigatla
Software Engineer(Progrator|gatetrade)
MACH
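Lucene itself handles these characters as long as the same analyzer is used at
index time and at query time. A common refinement is to fold accented
characters to their ASCII base form at both stages, so 'å' also matches 'a'.
A minimal sketch against the 2.4 analysis API (the class name is hypothetical,
and folding may or may not be appropriate for Danish, where 'å' and 'ø' are
distinct letters):

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.ISOLatin1AccentFilter;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical sketch: folds 'å' -> 'a', 'ø' -> 'o', etc.
// Use the SAME analyzer when indexing and when parsing queries.
public class FoldingAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new LowerCaseFilter(stream);
        stream = new ISOLatin1AccentFilter(stream);
        return stream;
    }
}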
Hi David:
We built bobo-browse specifically for these types of use cases:
http://code.google.com/p/bobo-browse
Let me know if you need any help getting it going.
-John
On Mon, Apr 20, 2009 at 12:59 PM, Karsten F.
wrote:
>
> Hi David,
>
> correct: you should avoid reading the content of a document inside a
> HitCollector.
Thanks for the responses, everyone. Where shall we host? My company
can offer space in our building in Factoria, but it's not exactly a
'cool' or 'fun' place. I can also reserve a room at a local library. I
can bring some beer and light refreshments.
On Mon, Apr 20, 2009 at 7:22 AM, Matthew Hall
I don't think you *can* create a Term that spans two fields. Perhaps
you'd be better off just doing a search, getting the doc ID back then
adding a new version of the document.
You *could* think about reindexing your corpus and indexing an
additional field that was the concatenation of the two fields.
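A sketch of that reindexing approach (field names and the separator are made
up); once there is a single concatenated id field, IndexWriter.updateDocument
can replace the old version by Term in one call:

import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class CompositeIdExample {
    // Inserts, or replaces an existing, document identified by partno + storeLoc.
    static void upsert(IndexWriter writer, String partno, String storeLoc)
            throws IOException {
        String uid = partno + "|" + storeLoc;  // separator choice is arbitrary
        Document doc = new Document();
        doc.add(new Field("partno", partno,
            Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("storeLoc", storeLoc,
            Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("uid", uid,
            Field.Store.NO, Field.Index.UN_TOKENIZED));
        writer.updateDocument(new Term("uid", uid), doc);  // delete-then-add
    }
}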
It is not legal to share purchased e-books in this manner. Please
purchase copies of the books you read, otherwise authors have very
little incentive to dedicate months (14 months in the case of Lucene
in Action, first edition) of their lives to writing this content.
Erik
On Apr 2
Mike,
I made a standalone tool, as you suggested, which prints out the size of
each doc in the index; none of the docs are more than 1MB! The queries
are the same; they repeat throughout the test. We give about 6GB of heap to
the application, and yes, we are on a 64-bit JVM.
I hit upon another
Erick means we need to see *all* of your code (including how you get the
score and the Explanation you are printing) to understand why they don't
match.
All you've shown is the output of your program and the generation of a
Hits object.
-Hoss
--
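For reference, a minimal sketch (assuming a Searcher and a Query are already
in hand) that prints both numbers side by side for each hit:

import java.io.IOException;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;

public class ScoreVsExplain {
    static void dump(Searcher searcher, Query query) throws IOException {
        Hits hits = searcher.search(query);
        for (int i = 0; i < hits.length(); i++) {
            // Explain the exact same document the Hits entry refers to.
            Explanation expl = searcher.explain(query, hits.id(i));
            System.out.println("hits score=" + hits.score(i)
                + "  explain value=" + expl.getValue());
        }
    }
}

One classic source of a mismatch: Hits normalizes scores so the top hit never
exceeds 1.0, while Explanation reports the raw score.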
Hi David,
correct: you should avoid reading the content of a document inside a
HitCollector.
Normally that means caching everything you need in main memory. A facet with
only 255 possible values and exactly one value per document is very simple
and fast: in that case you need only a byte[IndexReader.maxDoc()] array.
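A minimal sketch of that idea against the 2.4 HitCollector API (all names are
hypothetical). The byte[] is filled once when the index is opened; collect()
then touches only arrays and never loads a document:

import org.apache.lucene.search.HitCollector;

public class FacetCountingCollector extends HitCollector {
    private final byte[] facetOrdinals;  // facetOrdinals[doc] = facet value 0..254
    private final int[] counts = new int[256];

    public FacetCountingCollector(byte[] facetOrdinals) {
        this.facetOrdinals = facetOrdinals;
    }

    public void collect(int doc, float score) {
        counts[facetOrdinals[doc] & 0xFF]++;  // no document loading here
    }

    public int[] getCounts() { return counts; }
}

Pass it to Searcher.search(query, collector) and read the counts afterwards.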
Strange... as far as I can tell, I never got this email at all. Was it not
originally sent to the Lucene lists?
Matt
Grant Ingersoll wrote:
Lest you think silence equals acceptance...
This is not appropriate use of these lists.
-Grant
On Apr 19, 2009, at 11:58 PM, wu fuheng wrote:
welcome to download
Lest you think silence equals acceptance...
This is not appropriate use of these lists.
-Grant
On Apr 19, 2009, at 11:58 PM, wu fuheng wrote:
welcome to download
http://www.ultraie.com/admin/flist.php
What if your unique id is a composite of two fields when you create the
document?
I.E.
doc.add(new Field("partno", "123345",
    Field.Store.whatever, Field.Index.UN_TOKENIZED));
doc.add(new Field("storeLoc", "Springfield",
    Field.Store.whatever, Field.Index.UN_TOKENIZED));
How do you create a Term for this?
Robert,
99% of the documents are inserted as soon as we discover them, so the
INDEXORDER is largely correct. However, two factors keep me from using
INDEXORDER. The first is that a small portion of our records (1%) enter the
index late (so they appear out of order with respect to the other 99%
(excuse the cross-post)
I'm presenting a webinar on Solr. Registration is limited, so sign up
soon. Looking forward to "seeing" some of you there!
Thanks,
Erik
"Got data? You can build your own Solr-powered Search Engine!"
Erik Hatcher, Lucene/Solr Committer and author, will show
David,
One suggestion I have for your large index: is it possible to index these
documents ordered by date (and ingest new docs in date order)?
That way index order = date order, and you can do this sort very quickly by
using Sort.INDEXORDER
With huge indexes I try to see if there's a way I can have
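A sketch of what that looks like in 2.4 (class and method names are
hypothetical): Sort.INDEXORDER sorts by document number, i.e. insertion
order, and needs no FieldCache the way a sort on a date field would:

import java.io.IOException;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopFieldDocs;

public class IndexOrderSort {
    static TopFieldDocs byIngestOrder(Searcher searcher, Query query)
            throws IOException {
        // Document numbers ascend in insertion order in an append-only index,
        // so this is effectively date order when ingestion follows date.
        return searcher.search(query, null, 10, Sort.INDEXORDER);
    }
}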
Same here; sadly, there isn't much call for Lucene user groups in Maine.
It would be nice though ^^
Matt
Amin Mohammed-Coleman wrote:
I would love to come but I'm afraid I'm stuck in rainy old England :(
Amin
On 18 Apr 2009, at 01:08, Bradford Stephens
wrote:
OK, we've got 3 people... t
Honestly I'm more focused on intelligent ways to do faster and more complex
GIS features.
As I said, the most time-consuming part is the DistanceFilter, which is
required to sort by distance.
I'm playing with several ideas on how to do those better, and get a win
there.
However if anyone wants to
Hi Karsten,
My index contains about 100M documents, and I'm trying to count results
on around 300 facets. At the moment I'm keeping a set of cached facet
bitsets and then comparing the query result against those bitsets.
Performance is pretty lousy: it takes more than 2s to calculate the
cardinality.
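For scale: at 100M documents each bitset is about 12.5MB, so 300 cached
facets come to roughly 3.7GB before counting even starts. The counting
itself can at least be done with popcounts rather than by iterating
documents; a sketch using Lucene's OpenBitSet (both sets assumed to be sized
to IndexReader.maxDoc()):

import org.apache.lucene.util.OpenBitSet;

public class FacetIntersection {
    static long count(OpenBitSet queryDocs, OpenBitSet facetDocs) {
        // Counts the intersection word by word without materializing it.
        return OpenBitSet.intersectionCount(queryDocs, facetDocs);
    }
}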
Thanks!
I wound up indexing both versions in the same index, and boosting the words
that appeared in the "good word" list!
Thanks again for your advice!
Matthew Hall-7 wrote:
>
> Erm, I likely should have mentioned that this technique requires the use
> of a MultiFieldQueryParser.
>
> Matt
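For anyone following along, a sketch of that setup against the 2.4
MultiFieldQueryParser (field names and boost values are made up):

import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.Query;

public class BoostedFieldsParser {
    static Query parse(String userQuery) throws ParseException {
        Map boosts = new HashMap();
        boosts.put("goodWords", new Float(2.0f));  // curated vocabulary field
        boosts.put("allWords", new Float(1.0f));   // everything else
        MultiFieldQueryParser parser = new MultiFieldQueryParser(
            new String[] {"goodWords", "allWords"},
            new StandardAnalyzer(), boosts);
        return parser.parse(userQuery);
    }
}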
Hi,
I saw a very old thread that suggests an implementation for synonyms that
takes into account different weights for different synonyms and applies a
penalty factor to synonyms, to avoid ranking documents containing the
synonyms above documents with the original words.
http://mail-archives.apache.org/mod
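That thread's implementation isn't reproduced here, but the core idea can be
sketched as query-time expansion with down-boosted synonym clauses (all names
and the penalty value are illustrative):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class WeightedSynonymQuery {
    static Query expand(String field, String original, String[] synonyms,
                        float penalty) {
        BooleanQuery bq = new BooleanQuery();
        bq.add(new TermQuery(new Term(field, original)),
               BooleanClause.Occur.SHOULD);
        for (int i = 0; i < synonyms.length; i++) {
            TermQuery tq = new TermQuery(new Term(field, synonyms[i]));
            tq.setBoost(penalty);  // e.g. 0.3f: synonym matches score lower
            bq.add(tq, BooleanClause.Occur.SHOULD);
        }
        return bq;
    }
}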
Have you thought about subclassing MultiTermQuery and providing a
FilteredTermEnum? When you do this, the query can be rewritten as either a
BooleanQuery or a Filter.
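A skeleton of such a subclass against the 2.4 API; the suffix-matching
semantics are only an illustrative example:

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FilteredTermEnum;
import org.apache.lucene.search.MultiTermQuery;

// Hypothetical example: matches all terms in a field that end with a suffix.
public class SuffixQuery extends MultiTermQuery {
    private final Term suffixTerm;

    public SuffixQuery(Term suffixTerm) {
        super(suffixTerm);
        this.suffixTerm = suffixTerm;
    }

    protected FilteredTermEnum getEnum(IndexReader reader) throws IOException {
        return new SuffixTermEnum(reader, suffixTerm);
    }

    private static class SuffixTermEnum extends FilteredTermEnum {
        private final String field;
        private final String suffix;
        private boolean end = false;

        SuffixTermEnum(IndexReader reader, Term term) throws IOException {
            field = term.field();
            suffix = term.text();
            setEnum(reader.terms(new Term(field, "")));  // start of this field
        }

        protected boolean termCompare(Term t) {
            if (!t.field().equals(field)) {  // ran past this field's terms
                end = true;
                return false;
            }
            return t.text().endsWith(suffix);
        }

        public float difference() { return 1.0f; }  // all matches score equally

        protected boolean endEnum() { return end; }
    }
}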
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: patric