Thanks. It might be that Nutch sets some values. I am not able to find
anything in the config files though.
We are using Nutch's solrindex.
--
Ole-Martin Mørk
http://twitter.com/olemartin
http://flickr.com/olemartin
On Mon, Oct 5, 2009 at 2:28 PM, Simon Willnauer <simon.willna...@googlemail.com> wrote:
Have a look at your schema definition. I guess that's the place where boosts
are set if they are not defined in the data you send to your Solr instance.
simon
On Mon, Oct 5, 2009 at 2:14 PM, Ole-Martin Mørk wrote:
That might be true. The document boost did not change, but maybe the field
boost changed. Is it possible to retrieve the field boost from solr?
--
Ole-Martin Mørk
On Mon, Oct 5, 2009 at 2:01 PM, Simon Willnauer <
simon.willna...@googlemail.com> wrote:
> I still guess that the document has been i
Could it be that the tokenization schema for the URL field has changed between
the times you added documents? I.e. yielding more tokens when you got
the low fieldNorm value. Number of documents should not impact the
fieldnorm, the value is based on number of tokens in the field, field
and document b
I still guess that the document has been indexed with different boost
factors the first time if you did not change the length of the URL.
Can you make sure this did not happen?
simon
On Mon, Oct 5, 2009 at 12:45 PM, Ole-Martin Mørk wrote:
I did not change the url. The length of the title was increased by 1, from
41 to 42 characters.
--
Ole-Martin Mørk
On Mon, Oct 5, 2009 at 12:39 PM, Karl Wettin wrote:
Sorry, I meant title.
On 5 Oct 2009 at 11:57, Simon Willnauer wrote:
Ole-Martin, did you mention that you did not change the URL value but the
title?
simon
On Mon, Oct 5, 2009 at 11:52 AM, Karl Wettin wrote:
Hi Ole-Martin,
how many characters were in the URL before and after the update?
Ole-Martin, did you mention that you did not change the URL value but the
title?
simon
On Mon, Oct 5, 2009 at 11:52 AM, Karl Wettin wrote:
Hi Ole-Martin,
how many characters were in the URL before and after the update?
karl
On 5 Oct 2009 at 10:21, Ole-Martin Mørk wrote:
Hi. I am trying to understand Lucene's scoring algorithm. We're
getting some strange results. First we search for a given page by its
url. We get this resul
Did another update:
9.707364 = fieldWeight(url:"our super secret url" in 0), product of:
  1.0 = tf(phraseFreq=1.0)
  31.063566 = idf(url: www=7329 host=323 com=7329 article=2458 something=4 something=46 704290075=3)
  0.3125 = fieldNorm(field=url, doc=0)
The fieldNorm value did not change this time.
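
To make the arithmetic in the explain output above easier to check, here is a minimal sketch in plain Java (no Lucene dependency). The class and method names are hypothetical. The first method only multiplies the three factors shown in the output. The second assumes Lucene's default similarity, where the norm is document boost times field boost times 1/sqrt(number of terms in the field), and it ignores the lossy single-byte rounding Lucene applies when storing the norm, so stored values such as 0.3125 will not be reproduced exactly by it.

```java
public class FieldWeightSketch {

    // fieldWeight as listed in the explain output: tf * idf * fieldNorm.
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    // Assumption: default length normalization 1/sqrt(numTerms), multiplied by
    // the field boost and the document boost. This is the value BEFORE the norm
    // is rounded for storage, so it is only an approximation of fieldNorm.
    static float unroundedNorm(int numTerms, float fieldBoost, float docBoost) {
        return (float) (docBoost * fieldBoost / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Factors copied from the explain output in this message.
        float weight = fieldWeight(1.0f, 31.063566f, 0.3125f);
        System.out.println("fieldWeight = " + weight);

        // Illustrative inputs only: more tokens in the field lower the norm,
        // and a larger boost raises it.
        System.out.println("norm, 8 terms, no boost   = " + unroundedNorm(8, 1.0f, 1.0f));
        System.out.println("norm, 16 terms, no boost  = " + unroundedNorm(16, 1.0f, 1.0f));
        System.out.println("norm, 16 terms, boost 2.0 = " + unroundedNorm(16, 2.0f, 1.0f));
    }
}
```

The first line printed should agree with the 9.707364 in the explain output up to float rounding. Under the stated assumption, a changed fieldNorm for an unchanged URL would point to a different token count or a different boost at index time, which is what the replies in this thread are asking about.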
I don't think I changed any boost values, at least not on purpose. I think
the reason for the changed document id is that, to my knowledge, an update
is a delete and an add.
The code for my solrj update:
public void updateDocument(SolrDocument document) {
SolrServer server = new CommonsHtt
Did you by any chance change any boost values for the URL field or the
document while reindexing it? Or are you looking at different documents? One
has internal id 0 and the other has internal id 22, which could be the updated
one. Just curious whether that might be the cause.
simon
On Mon, Oct 5, 2009 at 10