Thanks. It might be that Nutch sets some boost values, although I have not been able to find anything in the config files. We are using Nutch's solrindex.
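As a quick sanity check on the explain output quoted below, here is a minimal Python sketch. It only assumes what the explain output itself states, namely that fieldWeight is the product of tf, idf and fieldNorm. The helper name `field_weight` is made up for illustration and is not part of Lucene or Solr.

```python
# Sanity check of the two explain outputs quoted in this thread.
# Assumption: fieldWeight = tf * idf * fieldNorm, as the explain text reports.

def field_weight(tf: float, idf: float, field_norm: float) -> float:
    """Hypothetical helper: multiply the three factors shown by explain."""
    return tf * idf * field_norm

# Values copied from the explain output before the document was rewritten.
before = field_weight(1.0, 32.31666, 4.5776367e-5)

# Values copied from the explain output after the document was rewritten.
after = field_weight(1.0, 31.598713, 0.3125)

# How much the stored norm for the url field grew between the two runs.
norm_ratio = 0.3125 / 4.5776367e-5

print(f"fieldWeight before: {before:.10f}")
print(f"fieldWeight after:  {after:.6f}")
print(f"fieldNorm ratio:    {norm_ratio:.1f}")
```

Both products match the reported scores, so the whole difference comes from fieldNorm, which is several thousand times larger after the rewrite. Since only the title changed, a shift that large in the url norm would be consistent with a different index-time boost rather than a length change, though that is a guess and not something the output proves.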
--
Ole-Martin Mørk
http://twitter.com/olemartin
http://flickr.com/olemartin

On Mon, Oct 5, 2009 at 2:28 PM, Simon Willnauer <simon.willna...@googlemail.com> wrote:

> Have a look at your schema definition. I guess that's the place where boosts
> are set if they are not defined in the data you send to your Solr instance.
>
> simon
>
> On Mon, Oct 5, 2009 at 2:14 PM, Ole-Martin Mørk <olemar...@gmail.com> wrote:
>
> > That might be true. The document boost did not change, but maybe the
> > field boost changed. Is it possible to retrieve the field boost from Solr?
> > --
> > Ole-Martin Mørk
> >
> > On Mon, Oct 5, 2009 at 2:01 PM, Simon Willnauer <simon.willna...@googlemail.com> wrote:
> >
> > > I still guess that the document was indexed with different boost
> > > factors the first time, if you did not change the length of the URL.
> > > Can you make sure this did not happen?
> > >
> > > simon
> > >
> > > On Mon, Oct 5, 2009 at 12:45 PM, Ole-Martin Mørk <olemar...@gmail.com> wrote:
> > >
> > > > I did not change the URL. The length of the title was increased by 1,
> > > > from 41 to 42 characters.
> > > > --
> > > > Ole-Martin Mørk
> > > >
> > > > On Mon, Oct 5, 2009 at 12:39 PM, Karl Wettin <karl.wet...@gmail.com> wrote:
> > > >
> > > > > Sorry, I meant title.
> > > > >
> > > > > On 5 Oct 2009, at 11:57, Simon Willnauer wrote:
> > > > >
> > > > > > Ole-Martin, did you mention that you did not change the URL value
> > > > > > but the title?
> > > > > >
> > > > > > simon
> > > > > >
> > > > > > On Mon, Oct 5, 2009 at 11:52 AM, Karl Wettin <karl.wet...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Ole-Martin,
> > > > > > >
> > > > > > > how many characters were in the URL before and after the update?
> > > > > > >
> > > > > > > karl
> > > > > > >
> > > > > > > On 5 Oct 2009, at 10:21, Ole-Martin Mørk wrote:
> > > > > > >
> > > > > > > > Hi. I am trying to understand Lucene's scoring algorithm. We're
> > > > > > > > getting some strange results. First we search for a given page by
> > > > > > > > its URL. We get this result:
> > > > > > > >
> > > > > > > > 0.0014793393 = fieldWeight(url:"our super secret url" in 22), product of:
> > > > > > > >   1.0 = tf(phraseFreq=1.0)
> > > > > > > >   32.31666 = idf(url: www=7327 host=321 com=7327 article=2456 something=2 something=44 704290075=1)
> > > > > > > >   4.5776367E-5 = fieldNorm(field=url, doc=22)
> > > > > > > >
> > > > > > > > When this is done, we use SolrJ to read and write the document. The
> > > > > > > > only change is the title of the document (appends the number 2).
> > > > > > > >
> > > > > > > > We search again and the fieldNorm has changed significantly:
> > > > > > > >
> > > > > > > > 9.874598 = fieldWeight(url:"our super secret url" in 0), product of:
> > > > > > > >   1.0 = tf(phraseFreq=1.0)
> > > > > > > >   31.598713 = idf(url: www=7328 host=322 com=7328 article=2457 something=3 somthing=45 704290075=2)
> > > > > > > >   0.3125 = fieldNorm(field=url, doc=0)
> > > > > > > >
> > > > > > > > Why does the value of fieldNorm change so much?
> > > > > > > >
> > > > > > > > Looking forward to your answers.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Ole-Martin Mørk
> > > > > > > > http://twitter.com/olemartin
> > > > > > > > http://flickr.com/olemartin
> > > > > > >
> > > > > > > ---------------------------------------------------------------------
> > > > > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > > > > For additional commands, e-mail: java-user-h...@lucene.apache.org