On Thursday, 11 August 2016 02:24:49 UTC+3, Alex Flint wrote:
>
> There are around 2M strings, and their total size is ~6 GB, so an average 
> of 3k each.
>

What kind of data? How large is the alphabet? What is the distribution of 
letters? Examples would be good :)
 

>
> I actually looked briefly at Go's compress/flate to see whether something 
> like what you're describing is possible without writing my own compressor 
> but I couldn't see any obvious way to get at the underlying compressor 
> state. Or perhaps I'm looking in the wrong package - any pointers would be 
> appreciated.
>
> On Wed, Aug 10, 2016 at 3:42 PM Ian Lance Taylor <ia...@golang.org 
> <javascript:>> wrote:
>
>> On Wed, Aug 10, 2016 at 3:27 PM, Alex Flint <alex....@gmail.com 
>> <javascript:>> wrote:
>> >
>> > I have long list of short strings that I want to compress, but I want 
>> to be
>> > able to decompress an arbitrary string in the list at any time without
>> > decompressing the entire list.
>> >
>> > I know the list ahead of time and it doesn't matter how much 
>> preprocessing
>> > time is involved. It is also fine if there is some significant O(1) 
>> memory
>> > overhead at runtime.
>> >
>> > Any suggestions?
>>
>> You say the strings are "short": how short?  How many strings are
>> there?  How much total data in the uncompressed strings?
>>
>> What is your target for the total amount of memory used by the
>> compressed strings plus any data required to decompress them?
>>
>> One approach that comes to mind is building an optimized Huffman table
>> for the full set of strings, and compressing each one separately using
>> that table.  Then each string is represented by a bit offset into the
>> resulting bitstream, and each can be decompressed separately.  But you
>> would need storage at run time not only for the bitstream, but also
>> for the Huffman table.
>>
>> Ian
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to