One of the key concepts in astc's text handling is that all strings use the same encoding. So no, my system doesn't mix encoding/decoding with compression/decompression; as far as #zipped and #unzip: are concerned, encoding and decoding are completely out of scope. There is a *transformation format* issue, but not an encoding issue.

For example, having got aByteArray from somewhere,

    aString := SCSUDecoder decode: (ByteArray unzip: aByteArray)

decompresses in one step (which knows nothing about encodings) and decodes in another (which is admittedly the Simple Compression Scheme for Unicode, but knows nothing of FLATE compression). Similarly,

    aByteArray := (EightBitEncoder type: 'ISO-8859-1' encode: aString) zipped

separates encoding from compression.
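For readers without astc to hand, the same separation of concerns can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not astc's implementation: the function names encode_then_zip and unzip_then_decode are hypothetical, and zlib produces a zlib-wrapped DEFLATE stream rather than a .zip archive.

```python
import zlib


def encode_then_zip(text: str, charset: str = "iso-8859-1") -> bytes:
    """Hypothetical helper: encode first, then compress the resulting bytes."""
    raw = text.encode(charset)   # encoding step: knows nothing about compression
    return zlib.compress(raw)    # compression step: sees only bytes, no charset


def unzip_then_decode(blob: bytes, charset: str = "iso-8859-1") -> str:
    """Hypothetical helper: decompress first, then decode the resulting bytes."""
    raw = zlib.decompress(blob)  # decompression step: returns plain bytes
    return raw.decode(charset)   # decoding step: knows nothing about DEFLATE


original = "na\u00efve caf\u00e9"
blob = encode_then_zip(original)
restored = unzip_then_decode(blob)
```

Because each step only sees bytes or only sees text, swapping the charset (say, "utf-8" for a string outside Latin-1) changes nothing in the compression step.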
On Thu, 10 Oct 2019 at 02:17, Sven Van Caekenberghe <s...@stfx.eu> wrote:
>
> Richard,
>
> Your implementation mixes zipping/unzipping and encoding/decoding, dictating
> a single way to do so, if I understand it correctly.
>
> The composition with several messages allows for end users to choose their
> own encoding format, depending on their own needs, which I think is more
> flexible.
>
> Sven
>
> > On 4 Oct 2019, at 06:36, Richard O'Keefe <rao...@gmail.com> wrote:
> >
> > Here's how it would look in my library:
> >   compressed := original zipped.
> >   "There is currently one definition, in
> >    AbstractStringOrByteArray, covering [ReadOnly]ByteArray,
> >    [ReadOnly]String and its many subclasses,
> >    ByteBuffer, StringBuffer, Substring, [ReadOnly]ShortArray,
> >    [ReadOnly]MappedByteArray, and some others.  This relies on
> >    _ asByteArraySize and _ asByteArrayDo: _.  There is no need for
> >    a separate #utf8Encoded, that's what asByteArrayDo: *does*."
> >
> >   copy := original class unzip: compressed.
> >   "This is a little trickier, but not hugely so.  There is no need
> >    for special case code.  [ReadOnly][Mapped]ByteArray and ByteBuffer
> >    are sequences of bytes, Stringy things are Unicode, and
> >    [ReadOnly]ShortArrays are treated as UTF16."
> >
> > As far as I can tell, this just works for the original use case.
> >
> > On Fri, 4 Oct 2019 at 11:42, PBKResearch <pe...@pbkresearch.co.uk> wrote:
> >>
> >> Richard
> >>
> >> I don't think so. The case being considered for my problem is the
> >> compression of a ByteArray produced by applying #utf8Encoded to a
> >> WideString, but it extends to any other form of ByteArray. If you
> >> substitute ByteArray for SomeClass in your examples, I think you will see
> >> why the chosen interface was used.
> >>
> >> Peter Kenny
> >>
> >>
> >> -----Original Message-----
> >> From: Pharo-users <pharo-users-boun...@lists.pharo.org> On Behalf Of
> >> Richard O'Keefe
> >> Sent: 03 October 2019 23:08
> >> To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
> >> Subject: Re: [Pharo-users] How to zip a WideString
> >>
> >> The interface should surely be
> >>    SomeClass
> >>       methods for: 'compression'
> >>          zipped "return a byte array"
> >>
> >>       class methods for: 'decompression'
> >>          unzip: aByteArray "return an instance of SomeClass"
> >>
> >>
> >
> >