I am using the standard analyzer.
This problem only happens when I set the query to BooleanClause.Occur.SHOULD
instead of BooleanClause.Occur.MUST while creating the query.
John Wang wrote:
>
> What analyzers are you using for both query and indexing? Can you also post
> some code on how you indexed?
What analyzers are you using for both query and indexing? Can you also post
some code on how you indexed?
-John
On Fri, Apr 24, 2009 at 8:02 PM, blazingwolf7 wrote:
>
> Hi,
>
> I created a query that will find a match inside documents. Example of text
> to match: "terror india"
> And documents with this
Hi,
I created a query that will find a match inside documents. Example of text
to match: "terror india"
And documents with this exact match do exist.
The query I generate looks like this: (title:"terror india"^4 content:"terror
india"^3 site:"terror india")
But why does it not return any results?
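For reference, here is a minimal sketch of how such a query could be assembled
programmatically against the 2.4 API (the field names and boosts are taken from
the query string above; the class and helper names are just for illustration):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;

public class PhraseAcrossFields {
    public static BooleanQuery build() {
        // A document matches if any one of the per-field phrases matches.
        BooleanQuery query = new BooleanQuery();
        query.add(makePhrase("title", 4.0f), BooleanClause.Occur.SHOULD);
        query.add(makePhrase("content", 3.0f), BooleanClause.Occur.SHOULD);
        query.add(makePhrase("site", 1.0f), BooleanClause.Occur.SHOULD);
        return query;
    }

    // Builds the phrase "terror india" against one field, with a boost.
    private static PhraseQuery makePhrase(String field, float boost) {
        PhraseQuery phrase = new PhraseQuery();
        phrase.add(new Term(field, "terror"));
        phrase.add(new Term(field, "india"));
        phrase.setBoost(boost);
        return phrase;
    }
}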
Hi Michael:
We are using it internally here at LinkedIn for both our search engine
and our social graph engine, and we have a team actively developing it.
Let us know how we can help you.
-John
On Fri, Apr 24, 2009 at 1:56 PM, Michael Mastroianni <
mmastroia...@glgroup.com> wrote:
OK I opened https://issues.apache.org/jira/browse/LUCENE-1611.
Christiaan, could you try out that patch to see if it fixes the
semi-infinite merging? Thanks. (You'll need to back-port to 2.4.1,
but it's a very small patch so hopefully not a problem).
Mike
On Fri, Apr 24, 2009 at 5:11 PM, Micha
On Fri, Apr 24, 2009 at 4:46 PM, MakMak wrote:
>
> - We had a 2.3.2 index earlier. We have reindexed using 2.4.1 now.
So the hang still happens with 2.4.1?
> - SAN is ruled out. This occurs even with local file system.
OK.
Have you confirmed things are really hung, vs just taking a long time?
On Fri, Apr 24, 2009 at 5:02 PM, Christiaan Fluit
wrote:
> Rollback does not work for me, as my IW is in auto-commit mode. It gives an
> IllegalStateException when I invoke it.
>
> A workaround that does work for me is to close and reopen the IndexWriter
> immediately after an OOME occurs.
Ahh,
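The reply is cut off; for reference, a sketch of the close-and-reopen
workaround Christiaan describes (2.4 API; the directory, analyzer, and
recovery policy are assumptions):

try {
    writer.addDocument(doc);
} catch (OutOfMemoryError oome) {
    try {
        writer.close(); // flush what we can and release the files
    } catch (Throwable t) {
        // ignore: the writer may already be in a broken state
    }
    writer = new IndexWriter(directory, analyzer,
                             IndexWriter.MaxFieldLength.UNLIMITED);
}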
--- On Sat, 4/25/09, andykan1...@yahoo.com wrote:
From: andykan1...@yahoo.com
Subject: Piece of code needed
To: java-user@lucene.apache.org
Date: Saturday, April 25, 2009, 1:37 AM
Hi everybody,
I know it may seem stupid, but I'm in the middle of a research project and I
need a piece of code in luc
Hi everybody,
I know it may seem stupid, but I'm in the middle of a research project and I
need a piece of code in Lucene to give me a weight matrix of a text collection
and a given query:
W_{i,j} = f_{i,j} * idf_i
and for the query:
W_{i,q} = (0.5 + (0.5 * freq_{i,q}) / max_l(freq_{l,q})) * idf_i
where:
f
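The message is cut off here, but for what it's worth, a rough sketch of
computing the document side of that matrix with the 2.4 API (it assumes the
field was indexed with term vectors, and the field name is made up):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermFreqVector;

public class WeightMatrix {
    // Prints W_{i,j} = f_{i,j} * idf_i for every term i in every document j.
    public static void print(IndexReader reader, String field) throws Exception {
        int numDocs = reader.numDocs();
        for (int doc = 0; doc < reader.maxDoc(); doc++) {
            if (reader.isDeleted(doc)) continue;
            TermFreqVector tfv = reader.getTermFreqVector(doc, field);
            if (tfv == null) continue; // field needs term vectors enabled
            String[] terms = tfv.getTerms();
            int[] freqs = tfv.getTermFrequencies();
            for (int i = 0; i < terms.length; i++) {
                int df = reader.docFreq(new Term(field, terms[i]));
                double idf = Math.log((double) numDocs / df);
                System.out.println(doc + "\t" + terms[i] + "\t" + freqs[i] * idf);
            }
        }
    }
}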
Michael McCandless wrote:
- even though the commitMerge returns false, it should probably not get into
an infinite loop. Is this an internal Lucene problem or is there something I
can/should do about it myself?
Yes, something is wrong with Lucene's handling of OOME. It certainly
should not lea
Hi--
Has anyone here used kamikaze much? I'm interested in using it in
situations where I'll have several docidsets of >2M, plus several in the
10s of thousands.
On a prototype basis, I got something running nicely using OpenBitSet, but
I can't use that much memory for my real application.
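For context, a sketch of what such an OpenBitSet prototype typically looks
like (2.4 HitCollector API; the query and searcher setup are assumed). Each
set costs roughly maxDoc/8 bytes, which is what becomes prohibitive with many
multi-million-document sets:

final OpenBitSet bits = new OpenBitSet(reader.maxDoc());
searcher.search(query, new HitCollector() {
    public void collect(int doc, float score) {
        bits.set(doc); // remember every matching doc id
    }
});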
- We had a 2.3.2 index earlier. We have reindexed using 2.4.1 now.
- SAN is ruled out. This occurs even with local file system.
- One more point: this occurs with very high load on the application, about
2-3 requests per second; the search part of each request completes within
milliseconds. The page size
On Tue, Apr 21, 2009 at 6:25 PM, MakMak wrote:
> Ran CheckIndex. This is what it prints out:
>
> cantOpenSegments: false
> numBadSegments: 0
> numSegments: 14
> segmentFormat: FORMAT_HAS_PROX [Lucene 2.4]
> segmentsFileName: segments_2od
> totLoseDocCount: 0
> clean: true
> toolOutOfDate: false
Otis Gospodnetic wrote:
No. But you could look at an existing index, pull out one Document at a time,
pull out any stored Field values from each Document, and write those to a text
file. You'd have to write the code for this yourself.
Actually, the latest version of Luke (http://www.getopt.
I don't think there's an easy way to jump straight from term + freq
per doc to a Lucene index.
Mike
On Tue, Apr 21, 2009 at 7:14 AM, Thomas Pönitz
wrote:
> Hi,
>
> I have the same problem as discussed here:
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/200511.mbox/%3c200511021310.1
Can you describe how you change the index? EG are you committing very
frequently?
It's odd to get 1000+ files in the index in 10 minutes unless you are
committing frequently.
If so, you may need a smarter deletion policy that stays in touch w/
the readers to know precisely which commit point the
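Something like this, perhaps (a sketch against the 2.4 IndexDeletionPolicy
API; keeping the newest N commits, rather than tracking readers explicitly,
is an assumption):

import java.util.List;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexDeletionPolicy;

public class KeepLastNCommits implements IndexDeletionPolicy {
    private final int n;
    public KeepLastNCommits(int n) { this.n = n; }
    public void onInit(List commits) { onCommit(commits); }
    public void onCommit(List commits) {
        // Commits are ordered oldest first; delete all but the newest n.
        for (int i = 0; i < commits.size() - n; i++) {
            ((IndexCommit) commits.get(i)).delete();
        }
    }
}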
Make a RangeFilter that visits only docs in your time period, then run
a search w/ a custom HitCollector that looks at the source of each doc
and tallies up the results? For performance, you'll probably need to
load the source using FieldCache (FieldCache.DEFAULT.getStrings(...)).
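A sketch of that combination (2.4 API; the field names "date" and "source"
and the range bounds are made up):

final String[] sources = FieldCache.DEFAULT.getStrings(reader, "source");
final Map counts = new HashMap();
Filter timeFilter = new RangeFilter("date", "20090101", "20090131", true, true);
searcher.search(query, timeFilter, new HitCollector() {
    public void collect(int doc, float score) {
        // Tally hits per source value.
        String source = sources[doc];
        Integer c = (Integer) counts.get(source);
        counts.put(source, c == null ? new Integer(1)
                                     : new Integer(c.intValue() + 1));
    }
});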
Or, use Solr's
On 4/24/2009 3:16 AM, Doron Cohen wrote:
> On Fri, Apr 24, 2009 at 12:28 AM, Steven Bethard wrote:
>
>> On 4/23/2009 2:08 PM, Marcus Herou wrote:
>>> But perhaps one could use a FieldCache somehow ?
>> Some code snippets that may help. I add the PageRank value as a field of
>> the documents I inde
See IndexReader.setTermInfosIndexDivisor() for a way to help reduce memory
usage without needing to re-index.
If you have indexed fields with omitNorms off (the default), you will be paying
a memory cost of 1 byte per field per document and may need to look at
re-indexing.
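For the first suggestion, the usage is just (a sketch; the divisor of 4 is an
example and trades term-lookup speed for memory, and it must be set before the
reader first loads its term index):

IndexReader reader = IndexReader.open(dir);
reader.setTermInfosIndexDivisor(4); // load only every 4th indexed term
IndexSearcher searcher = new IndexSearcher(reader);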
Cheers
Mark
- Orig
Hi
I would like to upgrade to Lucene 2.9. I can see the daily builds on
Hudson; should I just take the last build that worked, or are there any
particular builds that have been tested and hence are possibly more stable?
thanks Paul
-
Hi!
Is there any way to reduce the memory footprint when doing a search over a very
large index (20G)? I've been getting OOMs with a 512m heap!
cheers
--
Douglas Campos
Theros Consulting
+55 11 9267 4540
+55 11 3020 8168
Nothing that marries WordNet with Lucene other than that syns stuff exists in
Lucene contrib (but it may exist on SourceForge, Google Code, etc.). There
are several WordNet Java libraries you could use to combine WN and Lucene:
http://www.simpy.com/user/otis/search/wordnet
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Hi, I was using a RAMDirectory and this was working fine, but I have now
moved over to a filesystem directory to preserve space. The directory
is just initialized once:
directory = new RAMDirectory();
directory =
FSDirectory.getDirectory(Platform.getPlatformLicenseFolder()+ "/" +
TAG_BROWSER
No. But you could look at an existing index, pull out one Document at a time,
pull out any stored Field values from each Document, and write those to a text
file. You'd have to write the code for this yourself.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Origin
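A sketch of that manual dump against the 2.4 API (the tab-separated output
format and file name are assumptions):

import java.io.FileWriter;
import java.io.PrintWriter;
import java.util.Iterator;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.index.IndexReader;

public class DumpStoredFields {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(args[0]); // path to the index
        PrintWriter out = new PrintWriter(new FileWriter("index-dump.txt"));
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (reader.isDeleted(i)) continue;
            Document doc = reader.document(i); // loads stored fields only
            for (Iterator it = doc.getFields().iterator(); it.hasNext();) {
                Fieldable f = (Fieldable) it.next();
                if (f.isStored()) {
                    out.println(i + "\t" + f.name() + "\t" + f.stringValue());
                }
            }
        }
        out.close();
        reader.close();
    }
}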
Thanks guys for the answers!
Steven, I tried with ".*" instead of "*", but it did not work as
desired. The ".*" does not match any symbol(s) in the query. I tested
with different Analyzers; depending on the Analyzer, it is either omitted
or ".*" is treated just as normal symbols.
Mark, your clas
I'm puzzled why you say
"By the above out put we can say that StandardAnalyzer is
enough to get rid of danish elements."
It does NOT get rid of the accents, according to your own output.
If your goal is to go ahead and index documents in multiple languages
in a single index and then search it, I'd recom
On Thu, Apr 23, 2009 at 11:52 PM, wrote:
> I figured it out. We are using Hibernate Search and in my ORM class I
> am doing the following:
>
> @Field(index=Index.TOKENIZED,store=Store.YES)
> protected String objectId;
>
> So when I persisted a new object to our database I was inadvertently
> cre
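If the culprit was the id field being tokenized, the usual fix (an assumption
about the intent here) is to index it untokenized so it is matched exactly:

@Field(index = Index.UN_TOKENIZED, store = Store.YES)
protected String objectId;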
On Fri, Apr 24, 2009 at 12:28 AM, Steven Bethard wrote:
> On 4/23/2009 2:08 PM, Marcus Herou wrote:
> > But perhaps one could use a FieldCache somehow ?
>
> Some code snippets that may help. I add the PageRank value as a field of
> the documents I index with Lucene like this:
>
>Document docum
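The snippet is cut off above; a sketch of the pattern being described
(2.4 API; the field name and float encoding are guesses):

// At index time, store the PageRank as its own untokenized field.
Document document = new Document();
document.add(new Field("pagerank", Float.toString(pageRank),
                       Field.Store.YES, Field.Index.NOT_ANALYZED));
writer.addDocument(document);

// At search time, pull all values into memory through the FieldCache.
float[] pageRanks = FieldCache.DEFAULT.getFloats(reader, "pagerank");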
Hi, thanks for your reply.
After going through the site you gave, I understood that
StandardAnalyzer is enough to handle these special characters.
I'm attaching one class called AnalysisDemo.java. By executing that class
I'm able to say the above sentence (i.e. StandardAnalyzer is enough