Hi Brian,

Yes, it seems I'll have to go the custom function route, given their 
non-standard encoding. Thanks for confirming, and I really appreciate those 
tests. I fixed my code to handle varints longer than 2 bytes and to 
special-case the 9th byte (SQLite's encoding treats all 8 of its bits as 
data). Leaving it here in case anyone else is interested. Have a good weekend!

package main

import "fmt"

const MaxVarintLen64 = 9

func Uvarint(buf []byte) (uint64, int) {
	var x uint64
	const s = 7 // data bits in each of the first eight bytes
	for i, b := range buf {
		if i == MaxVarintLen64-1 {
			// Ninth byte: SQLite treats all 8 bits as data, whatever
			// their value, so there is no continuation bit to test.
			x <<= s + 1
			return x | uint64(b), i + 1
		}
		if b < 0x80 {
			// High bit clear: this is the last byte of the varint.
			x <<= s
			return x | uint64(b), i + 1
		}
		x <<= s
		x |= uint64(b & 0x7f)
	}
	// Input ended before the varint was complete.
	return 0, 0
}

func main() {
	// should return 199, 2
	fmt.Println(Uvarint([]byte{0x81, 0x47}))
	// should return 2097151 (=0x1fffff), 3
	fmt.Println(Uvarint([]byte{0xff, 0xff, 0x7f}))
	// should return 72057594037927935 (=0xffffffffffffff), 8
	fmt.Println(Uvarint([]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f}))
	// should return 18446744073709551615 (=0xffffffffffffffff), 9
	fmt.Println(Uvarint([]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}))
}
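
For anyone who wants to walk a whole record header rather than a single value, here is a minimal sketch of how that might look. The names sqliteUvarint and decodeAll are made up for this example (they are not part of any library), and the input is the sample header from earlier in this thread.

```go
package main

import "fmt"

// sqliteUvarint is a hypothetical helper that decodes one SQLite-style
// varint from the start of buf. It returns the value and the number of
// bytes consumed, or (0, 0) if buf ends before the varint is complete.
func sqliteUvarint(buf []byte) (uint64, int) {
	var x uint64
	for i, b := range buf {
		if i == 8 {
			// Ninth byte: all 8 bits are data.
			return x<<8 | uint64(b), 9
		}
		x = x<<7 | uint64(b&0x7f)
		if b < 0x80 {
			// High bit clear: last byte of this varint.
			return x, i + 1
		}
	}
	return 0, 0
}

// decodeAll is a hypothetical helper that decodes consecutive varints
// until buf is exhausted.
func decodeAll(buf []byte) ([]uint64, error) {
	var out []uint64
	for len(buf) > 0 {
		v, n := sqliteUvarint(buf)
		if n == 0 {
			return out, fmt.Errorf("truncated varint after %d values", len(out))
		}
		out = append(out, v)
		buf = buf[n:]
	}
	return out, nil
}

func main() {
	header := []byte{0x07, 0x17, 0x1b, 0x1b, 0x01, 0x81, 0x47}
	vals, err := decodeAll(header)
	fmt.Println(vals, err) // [7 23 27 27 1 199] <nil>
}
```

Reslicing buf after each value avoids keeping a separate offset and remaining count, and a trailing incomplete varint is reported as an error instead of silently ending the loop.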

On Saturday, October 4, 2025 at 9:45:26 AM UTC+2 Brian Candler wrote:

> So in short, you are saying that the byte sequence 0x81, 0x47 written by 
> SQLite is decoded by binary.Uvarint as 9089, but you wanted it to decode 
> to 199.
>
> What this means is: the encoding that SQLite has chosen to use is *not* 
> the varint as defined by protobuf (and implemented by the Go standard 
> library). And therefore, you do indeed need to write your own custom 
> decoding function.
>
> The SQLite file format is defined here: 
> https://www.sqlite.org/fileformat.html
>
> *A variable-length integer or "varint" is a static Huffman encoding of 
> 64-bit twos-complement integers that uses less space for small positive 
> values. A varint is between 1 and 9 bytes in length. The varint consists of 
> either zero or more bytes which have the high-order bit set followed by a 
> single byte with the high-order bit clear, or nine bytes, whichever is 
> shorter. The lower seven bits of each of the first eight bytes and all 8 
> bits of the ninth byte are used to reconstruct the 64-bit twos-complement 
> integer. Varints are big-endian: bits taken from the earlier byte of the 
> varint are more significant than bits taken from the later bytes.*
>
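For testing a decoder against that description, it can help to have the matching encoder. Below is a rough sketch written from the paragraph quoted above; putSQLiteVarint is a hypothetical name, and this is not taken from SQLite's source or from the Go standard library.

```go
package main

import "fmt"

// putSQLiteVarint is a hypothetical encoder for the format described in
// the SQLite file format document: big-endian 7-bit groups with the high
// bit set on every byte except the last, and a ninth byte that carries a
// full 8 bits when the value needs more than 56 bits.
func putSQLiteVarint(v uint64) []byte {
	if v>>56 != 0 {
		// More than 56 bits: always nine bytes.
		out := make([]byte, 9)
		out[8] = byte(v) // low 8 bits, no continuation flag
		v >>= 8
		for i := 7; i >= 0; i-- {
			out[i] = byte(v&0x7f) | 0x80
			v >>= 7
		}
		return out
	}
	// Up to 56 bits: fill a scratch buffer from the end with 7-bit groups.
	var tmp [8]byte
	n := len(tmp)
	for {
		n--
		tmp[n] = byte(v&0x7f) | 0x80
		v >>= 7
		if v == 0 {
			break
		}
	}
	tmp[7] &= 0x7f // the final byte has its high bit clear
	return append([]byte(nil), tmp[n:]...)
}

func main() {
	fmt.Printf("% x\n", putSQLiteVarint(0))          // 00
	fmt.Printf("% x\n", putSQLiteVarint(199))        // 81 47
	fmt.Printf("% x\n", putSQLiteVarint(2097151))    // ff ff 7f
	fmt.Printf("% x\n", putSQLiteVarint(^uint64(0))) // ff ff ff ff ff ff ff ff ff
}
```

Feeding the encoder's output back into a decoder for a range of values, including ones just below and just above 1<<56, is a quick way to check both directions.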
> And for protobuf, see: 
> https://protobuf.dev/programming-guides/encoding/#varints
>
> On Saturday, 4 October 2025 at 01:31:25 UTC+1 R. Men wrote:
>
>> Sure, I'll share my code and what I'm trying to do. Thank you all for the 
>> help so far. My program reads the SQL table's metadata to determine the 
>> type and length of each column in the table. These values are encoded as 
>> varints of unsigned big-endian integers. I have already validated that the 
>> expected values match the table's actual data type/size.
>>
>> package main
>>
>> import (
>> "encoding/binary"
>> "fmt"
>> )
>>
>> func main() {
>> // SQLite format 3, sample DB file record header
>> // Expected:     7     23    27    27    1     199
>> //              |--|  |--|  |--|  |--|  |--|  |--------|
>> inputs := []byte{0x07, 0x17, 0x1b, 0x1b, 0x01, 0x81, 0x47}
>> offset := 0
>> for remaining := len(inputs); remaining > 0; {
>> d, n := binary.Uvarint(inputs[offset:])
>> if n <= 0 {
>> break
>> }
>>
>> remaining -= n
>> offset += n
>> fmt.Println(d, n)
>>
>> // Actual output
>> // 7 1
>> // 23 1
>> // 27 1
>> // 27 1
>> // 1 1
>> // 9089 2
>> }
>> }
>>
>> I now see why I get the 9089 figure after looking at Uvarint source code (
>> https://cs.opensource.google/go/go/+/refs/tags/go1.25.1:src/encoding/binary/varint.go
>> ):
>>
>> func Uvarint(buf []byte) (uint64, int) {
>> var x uint64
>> var s uint
>> for i, b := range buf {
>> if i == MaxVarintLen64 {
>> // Catch byte reads past MaxVarintLen64.
>> // See issue https://golang.org/issues/41185
>> return 0, -(i + 1) // overflow
>> }
>> if b < 0x80 {
>> if i == MaxVarintLen64-1 && b > 1 {
>> return 0, -(i + 1) // overflow
>> }
>> return x | uint64(b)<<s, i + 1
>> }
>> x |= uint64(b&0x7f) << s  
>> s += 7
>> }
>> return 0, 0
>> }
>>
>> Here I see that each byte's 7 data bits are shifted left by 7 more bits 
>> than the previous byte's before being OR-ed into the result, so the first 
>> byte ends up least significant.
>> My solution so far has been to create a custom Uvarint function that 
>> shifts the accumulated value left before OR-ing in each new byte, so the 
>> first byte stays most significant.
>>
>> func Uvarint(buf []byte) (uint64, int) {
>> var x uint64
>> var s uint
>> for i, b := range buf {
>> if i == MaxVarintLen64 {
>> // Catch byte reads past MaxVarintLen64.
>> // See issue https://golang.org/issues/41185
>> return 0, -(i + 1) // overflow
>> }
>> if b < 0x80 {
>> if i == MaxVarintLen64-1 && b > 1 {
>> return 0, -(i + 1) // overflow
>> }
>> x <<= s 
>> return x | uint64(b), i + 1
>> }
>> x <<= s
>> x |= uint64(b&0x7f)
>> s += 7
>> }
>> return 0, 0
>> }
>>
>> I would prefer to use the go library's functions if at all possible 
>> rather than make my own but so far I haven't found alternatives or even 
>> discussions on this topic. If anything's unclear let me know. Cheers.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/4a16f202-2609-4ff2-b13e-684e06f1d518n%40googlegroups.com.
