Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-14 Thread Rory Campbell-Lange
Thanks for the pointer, Roger. After finally getting the normalising to rawstd base64 encoding to work I was trying to get my head around the fact that base64 content seems to often have several newlines around it. Then I found encoding/base64, which has the func (r *newlineFilteringReader) Re

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-14 Thread roger peppe
Tangentially related to this thread, a while back, I wrote a Go implementation of the base64 command that is agnostic about which encoding it reads (and can write all the possible encodings). It can be installed with: go install github.com/rogpeppe/misc/cmd/base64@latest It's arguably a little too

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-14 Thread Rory Campbell-Lange
Thanks for finding that foolish error, Brian. To wrap the thread up, the implementation below seems to work ok for reading both base64.RawStdEncoding and base64.StdEncoding encoded data using the base64.RawStdEncoding decoder. Example usage: b64 := NewB64Translator(bytes.NewReader(encodedB

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-14 Thread 'Brian Candler' via golang-nuts
I was more or less right. The input string, which you encoded to "Qm9uam91ciwgam95ZXV4IGxpb24K", contains an encoded newline at the end. It's not spurious. Confirmed by the "echo" pipeline I gave above, or in Go itself: https://go.dev/play/p/6kSxiCfCTo4 You can also confirm it by multiplying th
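
The point about the encoded newline can be checked with a few lines of Go. This is only a minimal sketch using the string quoted above; the helper name `decoded` is made up for illustration and is not from the thread.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decoded is a hypothetical helper: it decodes a padded or exactly
// 4-aligned standard base64 string and returns the result as a string.
func decoded(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	return string(b), err
}

func main() {
	s, err := decoded("Qm9uam91ciwgam95ZXV4IGxpb24K")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// %q makes the trailing newline visible.
	fmt.Printf("%q\n", s) // prints "Bonjour, joyeux lion\n"
}
```

The final quantum `b24K` decodes to `on\n`, so the newline is part of the encoded payload rather than something added by the decoder.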

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-14 Thread 'Brian Candler' via golang-nuts
Sorry, ignore that, I hadn't checked your playground link. On Tuesday, 14 January 2025 at 10:07:53 UTC Brian Candler wrote: > > As I wrote earlier, I'm trying to avoid reading the entire email part > into memory to discover if I should use base64.StdEncoding or > base64.RawStdEncoding. > > As I

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-14 Thread 'Brian Candler' via golang-nuts
> As I wrote earlier, I'm trying to avoid reading the entire email part into memory to discover if I should use base64.StdEncoding or base64.RawStdEncoding. As I asked before, why would you ever need to use RawStdEncoding? It just means the MIME part was invalid, most likely corrupted/truncated

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread robert engels
You wouldn’t get an EOF if the data is properly encoded. Not sure what the problem is. You need to be doing something with the Reader - most likely writing to a file, streaming to a database record, etc. I would simplify the code to a single test case that demonstrates the issue you are having

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread Rory Campbell-Lange
I'm just doing the reverse of that, I think, by removing the padding. I can't seem to trigger an EOF with this code below: > >n, err = b.br.Read(h) > >if err != nil { > >return n, err > >} On 13/01/25, robert engels (reng...@ix.netcom.com) wrote: > As has bee

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread robert engels
As has been pointed out, you don’t need to read the whole thing into memory, just wrap the data provider with one that adds the padding if it doesn’t exist - and always read with the padded decoder. To add the padding you only need to keep track of the count of characters read before EOF to deter

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread Rory Campbell-Lange
As I wrote earlier, I'm trying to avoid reading the entire email part into memory to discover if I should use base64.StdEncoding or base64.RawStdEncoding. The following seems to work reasonably well: type B64Translator struct { br *bufio.Reader } func NewB64Translator(r io.R

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread Rory Campbell-Lange
Thanks very much for the playground link and thoughts. The use case is reading base64 email parts, which could be of a very large size. It is unclear when processing these parts if they are base64 padded or not. I'm trying to avoid reading the entire email part into memory. Consequently I thin

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread 'Axel Wagner' via golang-nuts
Just realized: If you twist the idea around, you get something easy to implement and more correct. Instead of stripping padding if it exists, you can ensure that the body *is* padded to a multiple of 4 bytes: https://go.dev/play/p/SsPRXV9ZfoS You can then feed that to base64.StdEncoding. If the wrap
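
A minimal sketch of the pad-to-a-multiple-of-4 idea follows. It is not the code behind the playground link; `padReader` and `decodePadded` are hypothetical names, and the sketch assumes that only CR and LF need to be excluded from the character count.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// padReader (hypothetical) passes data through unchanged and, once the
// wrapped reader hits EOF, appends '=' until the number of non-newline
// bytes seen is a multiple of 4.
type padReader struct {
	r     io.Reader
	count int    // significant (non CR/LF) bytes seen so far
	pad   []byte // padding still to be emitted after EOF
	eof   bool
}

func (p *padReader) Read(b []byte) (int, error) {
	if !p.eof {
		n, err := p.r.Read(b)
		for _, c := range b[:n] {
			if c != '\r' && c != '\n' {
				p.count++
			}
		}
		if err != io.EOF {
			return n, err
		}
		p.eof = true
		p.pad = []byte("===")[:(4-p.count%4)%4]
		if n > 0 {
			return n, nil
		}
	}
	if len(p.pad) == 0 {
		return 0, io.EOF
	}
	n := copy(b, p.pad)
	p.pad = p.pad[n:]
	return n, nil
}

// decodePadded streams s through padReader into the padded decoder.
func decodePadded(s string) (string, error) {
	dec := base64.NewDecoder(base64.StdEncoding, &padReader{r: strings.NewReader(s)})
	out, err := io.ReadAll(dec)
	return string(out), err
}

func main() {
	for _, in := range []string{"aGk=", "aGk", "aGVs\nbG8"} {
		out, err := decodePadded(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

Input that is already padded passes through untouched, so the same decoder can be used for both variants without buffering the whole part.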

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-13 Thread 'Axel Wagner' via golang-nuts
Hi, one way to solve your problem is to wrap the body into an io.Reader that strips off everything after the first `=` it finds. That can then be fed to base64.RawStdEncoding. This approach requires no extra buffering or copying and is easy to implement: https://go.dev/play/p/CwcVz7oietI The down
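
The wrapper described above might look roughly like the following sketch. It is an illustration only, not the code behind the playground link; `stripPadding` and `decodeLenient` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// stripPadding (hypothetical) reports EOF at the first '=' it sees,
// so the unpadded decoder never encounters padding characters.
type stripPadding struct {
	r    io.Reader
	done bool
}

func (s *stripPadding) Read(p []byte) (int, error) {
	if s.done {
		return 0, io.EOF
	}
	n, err := s.r.Read(p)
	if i := bytes.IndexByte(p[:n], '='); i >= 0 {
		// Everything from the first '=' onwards is discarded.
		s.done = true
		return i, io.EOF
	}
	return n, err
}

// decodeLenient decodes padded or unpadded standard base64.
func decodeLenient(s string) (string, error) {
	dec := base64.NewDecoder(base64.RawStdEncoding, &stripPadding{r: strings.NewReader(s)})
	out, err := io.ReadAll(dec)
	return string(out), err
}

func main() {
	for _, in := range []string{"aGk=", "aGk", "aGVsbG8="} {
		out, err := decodeLenient(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

Note that this sketch silently drops any data following the first `=`, which is inherent in the approach as described.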

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread Robert Engels
No worries - happy to help. One last thing: base64 coding is fairly trivial - a cursory look shows that the padded version uses = signs. I suspect you could write a decoder that handled either during the decoding. > On Jan 12, 2025, at 3:29 PM, Rory Campbell-Lange > wrote: > > Thanks very much fo

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread Rory Campbell-Lange
Thanks very much for the links, pointers and possible solution. Trying to read base64 standard (padded) encoded data with base64.RawStdEncoding can produce an error such as "illegal base64 data at input byte". Reading base64 raw (unpadded) encoded data produces the EOF error. I'll go with tr

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread robert engels
Also, see this https://stackoverflow.com/questions/69753478/use-base64-stdencoding-or-base64-rawstdencoding-to-decode-base64-string-in-go - as I expected, the error should be reported earlier than the end of stream if the chosen format is wrong. > On Jan 12, 2025, at 2:57 PM, robert engels wrote:

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread robert engels
Also, this is what Gemini provided which looks basically correct - but I think encapsulating it with a Rewind() method would be easier to understand. While Go doesn't have a built-in PushbackReader like some other languages (e.g., Java), you can implement similar functionality using a custom s

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread Robert Engels
You can see the two pass reader here https://stackoverflow.com/questions/20666594/how-can-i-push-bytes-into-a-reader-in-go But yeah, the basic premise is that you buffer the data so you can rewind if needed. Are you certain it is reading to the end to return EOF? It may be returning EOF once th

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread Rory Campbell-Lange
Thanks for the suggestion of a ReadSeeker to wrap an io.Reader. My google fu must be deserting me. I can find PushbackReader implementations in Java, but the only similar thing for Go I could find was https://gitlab.com/osaki-lab/iowrapper. If you have a specific recommendation for a ReadSeeker

Re: [go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread robert engels
Create a ReadSeeker that wraps the Reader providing the buffering (mark & reset) - normally the buffer only needs to be large enough to detect the format contained in the Reader. You can search Google for PushbackReader in Go and you’ll get a basic implementation. > On Jan 12, 2025, at 12:52 P

[go-nuts] Efficiently switch io.Reader to another decoder on error

2025-01-12 Thread Rory Campbell-Lange
I'm looking to develop an alternative to an existing piece of code that reads email parts into byte slices and then returns these after decoding. As library users may not wish to use these email parts and because there are multiple byte slice copies being used, I'm attempting to rationalise the p