Sorry, I meant the encodeNormValue and decodeNormValue methods on the
TFIDFSimilarity class -

public byte encodeNormValue(float f)
public float decodeNormValue(byte b)


On Thu, Jun 19, 2014 at 12:08 PM, Robert Muir <rcm...@gmail.com> wrote:

> No they do not. The method is:
>
> public abstract long computeNorm(FieldInvertState state);
>
>
>
> On Thu, Jun 19, 2014 at 1:54 PM, Nalini Kartha <nalinikar...@gmail.com>
> wrote:
> > Thanks for the info!
> >
> > We're more interested in changing the lengthnorm function vs using
> > additional stats for scoring so option 2 seems like the right way.
> >
> > It looks like the encode and decode methods deal with bytes right now -
> > would changing those APIs to deal with longs instead be a good idea? It
> > looks like the byte returned from encode is always being cast to long and
> > the byte passed into decode is always a long to begin with. If we make this
> > change, would it be useful to submit a patch for it?
> >
> > Thanks,
> > Nalini
> >
> >
> > On Thu, Jun 19, 2014 at 10:28 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> >
> >> Hi,
> >>
> >> You may not need to change the length-norm at all: If you want to support
> >> *additional* statistics, add a docvalues field to your index where you can
> >> store that information in addition to the Lucene-Default statistics. Based
> >> on a function query you can then use it for scoring. In fact, you can then
> >> also use a different data type for the statistics value. The norms in
> >> Lucene are already internally handled as docvalues fields, too.
> >>
> >> On the other hand, if you want to modify the lengthNorm and you use a
> >> non-float value, you have to also modify the encodeNorm/decodeNorm methods
> >> of the similarity. The default uses a very lossy float->1byte
> >> transformation.
> >>
> >> Uwe
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: u...@thetaphi.de
> >>
> >>
> >> > -----Original Message-----
> >> > From: Nalini Kartha [mailto:nalinikar...@gmail.com]
> >> > Sent: Thursday, June 19, 2014 7:14 PM
> >> > To: java-user@lucene.apache.org
> >> > Subject: Changing field lengthnorm to store length
> >> >
> >> > Hi,
> >> >
> >> > We're interested in having access to the number of terms in the fields for a
> >> > document vs the pre-calculated lengthnorm at scoring time - we want to
> >> > experiment with different lengthnorm functions so it seems like storing the
> >> > raw length and then doing the norm calculation at query time would work.
> >> >
> >> > Is changing the lengthnorm method on Similarity class to return the raw
> >> > number of terms the right way to go for this? We realize this will result in
> >> > taking up more than a byte to store the value but we're OK with this. Will this
> >> > break anything else under the hood?
> >> >
> >> > Thanks,
> >> > Nalini
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
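
The idea discussed in this thread, storing the raw term count and applying the
length norm only at query time, can be sketched in plain Java with no Lucene
dependency. Everything named below (RawLengthNormSketch, storedNorm,
normAtQueryTime, INVERSE_SQRT, INVERSE_LOG) is hypothetical and not part of
Lucene's API; in Lucene the stored long would be whatever computeNorm returns,
and how it is read back at scoring time depends on the Similarity in use.

```java
import java.util.function.LongToDoubleFunction;

/** Hypothetical sketch, not Lucene code: keep the raw length, normalize at query time. */
final class RawLengthNormSketch {

    // Classic 1/sqrt(length) normalization; empty fields get a norm of 0.
    static final LongToDoubleFunction INVERSE_SQRT =
        length -> length <= 0 ? 0.0 : 1.0 / Math.sqrt((double) length);

    // An alternative function to experiment with: 1/(1 + ln(length)).
    static final LongToDoubleFunction INVERSE_LOG =
        length -> length <= 0 ? 0.0 : 1.0 / (1.0 + Math.log((double) length));

    /** Stand-in for the per-document stored value: the raw number of terms as a long. */
    static long storedNorm(int numTerms) {
        return numTerms;
    }

    /** Applies the chosen norm function to the stored raw length at scoring time. */
    static double normAtQueryTime(long storedLength, LongToDoubleFunction normFunction) {
        return normFunction.applyAsDouble(storedLength);
    }

    public static void main(String[] args) {
        long stored = storedNorm(100);
        // The same stored value can be scored with different norm functions.
        System.out.println(normAtQueryTime(stored, INVERSE_SQRT));
        System.out.println(normAtQueryTime(stored, INVERSE_LOG));
    }
}
```

Because the stored value is the untransformed length, switching norm functions
in this sketch needs no re-indexing; the cost is a full long per document
instead of a single lossy byte.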