> 2011/5/26 Jan Hubicka <hubi...@ucw.cz>:
> >> On Thu, May 26, 2011 at 12:45 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
> >> >>
> >> >> This looks all very hackish with no immediate benefit mostly because
> >> >> of the use of lto_output_string.  I think what you should do instead
> >> >> is split up lto_output_string_with_length into the piece that streams
> >> >> the string itself to the string-stream and returns an index into it
> >> >> and the piece streaming the index to the specified stream.  Then you
> >> >> can simply bitpack that index and the two int line / column fields.
> >> >
> >> > Hmm, I plan to optimize string streaming (since we always stream one
> >> > uleb to set it is non-NULL that can be easily handled by assigning
> >> > NULL string index 0).  How precisely you however suggest to bitpack
> >> > line/column and string offset together?
> >>
> >> Similar to how you suggested, stream bits for a changed flag but
> >> instead of finishing the bitpack simply stream HOST_BITS_PER_INT
> >> bits for line (if changed), column (if changed) and file string index
> >> (if changed and the index is 'int').
> >>
> >> I mostly want to avoid the split between the changed bits and the
> >> data output, esp. breaking the bitpack.
> >
> > Well, that won't get me for < 1 byte overhead when location is unchanged
> > or unknown (that is true for about half of cases in my stats).
>
> Why not? it would be 3 bits.

Hmm, I see, so you suggest moving all the data into the bitpack in order to
couple it with the bitpack in tree streaming, without the need for two
functions.  (I originally understood your message as objecting to the idea of
storing only the changes.)  I see that one can save it into the bitpack,
though I was under the impression that we want to avoid variably sized
bitpacks, since then the accessors no longer expand to simple arithmetic.

> > Additionally
> > HOST_BITS_PER_INT is host sensitive and wasteful compared to ulebs here
> > as the line numbers, file indexes and columns are all usually small
> > numbers, so they ought to fit in 16, 8 and 8 bits most of time.  So we
> > would end up in need of inventing something like uleb in bitpack?
>
> Well, we could do that by default for > 8 bit values we pack.  I think
> the location CSE using only 3 bits for unchanged locations should
> save the most, not so much the use of ulebs for line/column.
> (you could also encode the number of needed bytes for a changed
> line/column and then only that many number of bytes).

Well, since almost half of the time we save at least one of the 3 indices, I
think the stupid implementation would burn at least 3 bytes on average.  This
would be sort of comparable with the existing uleb path, which is 5 bytes.

OK, if we want to go for variably sized bitpacks, I guess I can simply store
the number of bits needed to represent the number, followed by the number
itself, for all three indices.  It will need massaging the string I/O together
with the patch, as currently string indexes go into the ulebs.  I can do that
if you think it is the better option.  Do we have some limits on maximal
bitpack size?

Honza
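
For illustration only, here is a minimal sketch of the encoding discussed in this thread: three "changed" bits, then for each changed index a bit count followed by the value in that many bits. It is standalone C and does not use GCC's bitpack or LTO streamer code. The names `bitbuf`, `bb_put`, `bb_get`, `put_sized`, `put_loc` and `get_loc` are hypothetical, as are the 6-bit width field and the fixed 64-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a bitpack: a fixed buffer plus a bit cursor.
   This is NOT GCC's bitpack; it only illustrates the encoding.  */
struct bitbuf
{
  uint8_t data[64];
  size_t pos;			/* Next bit to write or read.  */
};

static void
bb_init (struct bitbuf *bb)
{
  memset (bb, 0, sizeof *bb);
}

/* Append the low NBITS bits of VAL, least significant bit first.  */
static void
bb_put (struct bitbuf *bb, uint32_t val, unsigned nbits)
{
  assert (nbits <= 32 && bb->pos + nbits <= 8 * sizeof bb->data);
  for (unsigned i = 0; i < nbits; i++, bb->pos++)
    if ((val >> i) & 1)
      bb->data[bb->pos / 8] |= (uint8_t) (1u << (bb->pos % 8));
}

/* Read back NBITS bits in the order bb_put wrote them.  */
static uint32_t
bb_get (struct bitbuf *bb, unsigned nbits)
{
  uint32_t val = 0;
  assert (nbits <= 32 && bb->pos + nbits <= 8 * sizeof bb->data);
  for (unsigned i = 0; i < nbits; i++, bb->pos++)
    if ((bb->data[bb->pos / 8] >> (bb->pos % 8)) & 1)
      val |= (uint32_t) 1 << i;
  return val;
}

/* Number of bits needed to represent VAL (0 for VAL == 0).  */
static unsigned
bits_needed (uint32_t val)
{
  unsigned n = 0;
  for (; val; val >>= 1)
    n++;
  return n;
}

/* Store a 6-bit width (0..32), then the value in that many bits.  */
static void
put_sized (struct bitbuf *bb, uint32_t val)
{
  unsigned n = bits_needed (val);
  bb_put (bb, n, 6);
  bb_put (bb, val, n);
}

static uint32_t
get_sized (struct bitbuf *bb)
{
  unsigned n = bb_get (bb, 6);
  return bb_get (bb, n);
}

struct loc
{
  uint32_t file;		/* String index; 0 is assumed to mean NULL.  */
  uint32_t line;
  uint32_t column;
};

/* Write CUR relative to PREV: three "changed" flags, then only the
   fields that differ.  Returns the number of bits used.  */
static size_t
put_loc (struct bitbuf *bb, const struct loc *prev, const struct loc *cur)
{
  size_t start = bb->pos;
  bb_put (bb, cur->file != prev->file, 1);
  bb_put (bb, cur->line != prev->line, 1);
  bb_put (bb, cur->column != prev->column, 1);
  if (cur->file != prev->file)
    put_sized (bb, cur->file);
  if (cur->line != prev->line)
    put_sized (bb, cur->line);
  if (cur->column != prev->column)
    put_sized (bb, cur->column);
  return bb->pos - start;
}

/* Inverse of put_loc: start from PREV and overwrite the changed fields.  */
static struct loc
get_loc (struct bitbuf *bb, const struct loc *prev)
{
  struct loc cur = *prev;
  unsigned file_changed = bb_get (bb, 1);
  unsigned line_changed = bb_get (bb, 1);
  unsigned column_changed = bb_get (bb, 1);
  if (file_changed)
    cur.file = get_sized (bb);
  if (line_changed)
    cur.line = get_sized (bb);
  if (column_changed)
    cur.column = get_sized (bb);
  return cur;
}
```

With these assumptions an unchanged location costs 3 bits, and a change of the line alone from 120 to 121 costs 3 + 6 + 7 = 16 bits. Every changed field pays the 6-bit width prefix, so the scheme only wins when the values are small, as the thread expects line, column and file index to be.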