Thank you Steve, now it's implementation time...
I'll be back :)
/M
On Fri, Apr 24, 2009 at 3:13 AM, Steven Bethard wrote:
> On 4/23/2009 2:42 PM, Marcus Herou wrote:
> > So what you basically are saying is that:
> >
> > 1. You have an index which contains data that is more or less static (no
>
On 4/23/2009 2:42 PM, Marcus Herou wrote:
> So what you basically are saying is that:
>
> 1. You have an index which contains data that is more or less static (no
> updates) or you have another update interval than the PR interval.
> 2. A PR index which is rebuilt (from scratch ?) every X days/wee
Never mind how to open the ParallelReader stuff (I am an idiot): RTFM:
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/ParallelReader.html
But the rest is of course interesting :)
/M
On Thu, Apr 23, 2009 at 11:42 PM, Marcus Herou
wrote:
> Thanks! (I started my reply and then
Thanks! (I started my reply and then saw that you added code snippets)
I think we are narrowing down the problem to the updating issue of the
PageRank score.
So what you basically are saying is that:
1. You have an index which contains data that is more or less static (no
updates) or you have an
On 4/23/2009 2:08 PM, Marcus Herou wrote:
> But perhaps one could use a FieldCache somehow ?
Some code snippets that may help. I add the PageRank value as a field of
the documents I index with Lucene like this:
Document document = new Document();
double pageRank = this.pageRanks.getCount(
On 4/23/2009 1:58 PM, Doron Cohen wrote:
>> I think we are doing similar things, at least I am trying to implement
>> document boosting with pagerank. Having issues with how to apply the scoring
>> of specific docs without actually reindexing them. I feel something should be
>> done at query time w
But perhaps one could use a FieldCache somehow ?
/M
On Thu, Apr 23, 2009 at 11:07 PM, Marcus Herou
wrote:
> Yes I have considered it for 30 minutes :)
>
> How does one apply that in the real world?
>
> If the only thing I get access to is the actual docId would it not be
> really expensive to get
Yes I have considered it for 30 minutes :)
How does one apply that in the real world?
If the only thing I get access to is the actual docId would it not be really
expensive to get the Document itself from the index and later use some field
in it as external lookup in some optimized structure for t
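
A minimal sketch of the docId concern raised here, assuming the per-document values are loaded once into an array indexed by docId (roughly what Lucene's FieldCache does for a field), so that scoring never has to fetch the stored Document. The class name, the sample values, and the combination rule `textScore * (1 + pageRank)` are illustrative assumptions, not something stated in the thread.

```java
import java.util.Arrays;

// Hypothetical sketch: PageRank values kept in memory, one slot per docId.
class DocIdBoostSketch {
    // Index = docId, value = precomputed PageRank for that document.
    private final float[] pageRankByDocId;

    DocIdBoostSketch(float[] pageRankByDocId) {
        // Defensive copy so later changes to the caller's array do not leak in.
        this.pageRankByDocId = Arrays.copyOf(pageRankByDocId, pageRankByDocId.length);
    }

    // Assumed combination rule (not from the thread): scale the text score
    // by (1 + pageRank). The lookup is a plain array access, no Document load.
    float boostedScore(int docId, float textScore) {
        return textScore * (1.0f + pageRankByDocId[docId]);
    }

    public static void main(String[] args) {
        DocIdBoostSketch sketch = new DocIdBoostSketch(new float[] {0.0f, 0.5f, 2.0f});
        System.out.println(sketch.boostedScore(1, 2.0f)); // 2.0 * (1 + 0.5)
    }
}
```

Rebuilding the array when the PageRank values change is cheap compared with reindexing, but the array is only valid for the reader it was built against, since docIds can change after merges.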
>
> I think we are doing similar things, at least I am trying to implement
> document boosting with pagerank. Having issues with how to apply the scoring
> of specific docs without actually reindexing them. I feel something should be
> done at query time which looks at external data but do not know h
I figured it out. We are using Hibernate Search and in my ORM class I
am doing the following:
@Field(index=Index.TOKENIZED,store=Store.YES)
protected String objectId;
So when I persisted a new object to our database I was inadvertently
creating a document in the Lucene index with the tokenized a
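
To illustrate why a tokenized id field can trigger the "more terms than documents" error, here is a rough sketch. It assumes a crude stand-in for an analyzer (lowercase, split on non-alphanumerics) and made-up ids; it is not Lucene's or Hibernate Search's actual tokenization.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: compare term counts for tokenized vs. untokenized ids.
class TokenizedIdSketch {
    // Crude stand-in for a tokenizing analyzer (an assumption, not Lucene's).
    static List<String> tokenize(String value) {
        List<String> terms = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Number of distinct terms the field would hold across all documents.
    static int distinctTerms(List<String> ids, boolean tokenized) {
        Set<String> terms = new HashSet<>();
        for (String id : ids) {
            if (tokenized) {
                terms.addAll(tokenize(id));
            } else {
                terms.add(id);
            }
        }
        return terms.size();
    }

    public static void main(String[] args) {
        List<String> ids = List.of("obj-1001", "obj-1002"); // made-up ids
        System.out.println("docs=" + ids.size()
                + " tokenized=" + distinctTerms(ids, true)
                + " untokenized=" + distinctTerms(ids, false));
    }
}
```

With two documents the tokenized field ends up with three distinct terms (obj, 1001, 1002), while the untokenized field has exactly one term per document, which is what sorting on a field expects.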
Hi.
I think we are doing similar things, at least I am trying to implement
document boosting with pagerank. Having issues with how to apply the scoring of
specific docs without actually reindexing them. I feel something should be done
at query time which looks at external data but do not know how to impl
Doron, thanks for the reply.
> Is it possible that, for at least one document, multiple "objectId" fields
> were created?
> This would also create this problem.
I read that online as well. I don't think so. We do have an update
process that updates the index. During the update process we have
Could an ExternalFileField help me ?
http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
On Thu, Apr 23, 2009 at 10:01 PM, Marcus Herou
wrote:
> Hi.
>
> Confusing subject eh ? Trying to become a little clearer in a few
> sentences.
>
> We have a Solr/Lucene index where
Hi.
Confusing subject eh ? Trying to become a little clearer in a few sentences.
We have a Solr/Lucene index where each document is a Blog Entry. We have
just implemented the PageRank algorithm for Blogs and are about to add a
column to the index called score and perhaps adjust the document boost
On Thu, Apr 23, 2009 at 10:39 PM, wrote:
> I'm getting a strange error when I make a Lucene (2.2.0) query:
>
> java.lang.RuntimeException: there are more terms than documents in field
> "objectId", but it's impossible to sort on tokenized fields
>
Is it possible that, for at least one document,
Sorry for that terrible formatting. Let me try again.
==
Hello,
I'm getting a strange error when I make a Lucene (2.2.0) query:
java.lang.RuntimeException: there are more terms than documents in field
"objectId", but it's impossible to sort
Hello,
I'm getting a strange error when I make a Lucene (2.2.0) query w/ the
following call:
java.lang.RuntimeException: there are more terms than documents in field
"objectId", but it's impossible to sort on tokenized fields
at
org.apache.lucene.search.FieldCacheImpl$10.createValue(
On 4/22/2009 2:26 PM, Doron Cohen wrote:
> Steve, I added a patch in https://issues.apache.org/jira/browse/LUCENE-1608,
>
> which allows wrapping any query in a value source, and then creating a value
> source query out of it.
> Let us know how this works for you...
Thanks! I'll try this out in the
Related: https://issues.apache.org/jira/browse/LUCENE-1486
----- Original Message -----
From: Steven A Rowe
To: "java-user@lucene.apache.org"
Sent: Thursday, 23 April, 2009 16:54:08
Subject: RE: SpanQuery wildcards?
Hi Ivan, SpanRegexQuery should work - just use ".*" instead of "*". - Steve
Hi Ivan, SpanRegexQuery should work - just use ".*" instead of "*". - Steve
> -----Original Message-----
> From: Ivan Vasilev [mailto:ivasi...@sirma.bg]
> Sent: Thursday, April 23, 2009 11:42 AM
> To: LUCENE MAIL LIST
> Subject: SpanQuery wildcards?
>
> Hi guys,
>
> Does anybody know if there i
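
A small sketch of why ".*" rather than "*" is needed once the pattern is treated as a regular expression. It uses java.util.regex with whole-term matching as a stand-in; whether SpanRegexQuery applies exactly these matching semantics is an assumption here, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: wildcard-style "exp*" vs. regex-style "exp.*".
class RegexVsWildcardSketch {
    // True when the whole term matches the regular expression.
    static boolean matchesTerm(String regex, String term) {
        return Pattern.matches(regex, term);
    }

    public static void main(String[] args) {
        // "exp.*" = "exp" followed by any characters.
        System.out.println(matchesTerm("exp.*", "expansive"));
        // "exp*" = "ex" followed by zero or more "p" characters.
        System.out.println(matchesTerm("exp*", "expansive"));
        System.out.println(matchesTerm("exp*", "exppp"));
    }
}
```

In regex syntax "*" is a quantifier on the preceding character, not a wildcard, so "exp*" matches terms such as "ex" or "exppp" but not "expansive".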
Hi guys,
Does anybody know if there is a way to use wildcards in SpanQuery?
My idea is, for example, instead of the query content:"expansive
computer"~10, to use the query content:"exp* comp*"~10, with the
results of the first query being a subset of those of the second one.
I tried with parsing the above w
OK, this is a much different problem than you were originally
asking about, effectively "how to index/search mixed language
documents".
This topic has been discussed multiple times on the user list, I
think your first step should be to search the archive. I *was*
going to find the old searchable m
On Tue, Apr 21, 2009 at 6:40 PM, Christiaan Fluit
wrote:
> I may be on to something already.
>
> I just looked at the commitMerge code and was surprised to see that the
> commitMerge message that is almost at the beginning wasn't printed. Then I
> saw the "if (hitOOM) return false;" part that tak
Dear Murat,
I saw your question and wondered how you implemented these changes.
The requirements below are the same ones I am trying to code now.
Did you modify the source code itself, or only use Lucene's jar and just
override code?
I would very much appreciate it if you could give me a short