> This works, but the downside is that each {...} of bytes has to be
> pulled into memory. And the function that is called is already designed
> to receive an io.Reader and parse the VERY large inner blob in an
> efficient manner.
Is the inner blob decoder actually using a json.Decoder, as shown in your example func secondDecoder()? In that case, the simplest and most efficient answer is to create a persistent json.Decoder which wraps the underlying io.Reader directly, and just keep calling w2.Decode(&v) on each call. It will happily consume the stream, one object at a time.

If that's not possible for some reason, then it sounds like you want to break the outer stream at outer object boundaries, i.e. { ... }, without fully parsing it. You can do that with json.RawMessage:
https://play.golang.org/p/BitE6l27160

However, you've still read each object as a stream of bytes into memory, and you've still done some of the work of parsing the JSON to find the start and end of each object. You can turn it back into an io.Reader by wrapping it with bytes.NewBuffer (or bytes.NewReader), if that's what the inner parser requires. If each object is large, and you really need to avoid reading it into memory at all, then you'd need some sort of rewindable stream.

Another approach is to stop the source generating pretty-printed JSON, and make it generate JSON-Lines <https://jsonlines.org/> format instead. It sounds like you're unable to change the source, but you might be able to un-pretty-print the JSON using an external tool (perhaps jq can do this). Then I am thinking you could make a custom io.Reader which returns data up to a newline, then reports EOF, and hands you a fresh io.Reader for the next line.

But this is all very complicated, when keeping the inner Decoder around from object to object is a simple solution to the problem that you described. Is there some other constraint which prevents you from doing this?

Rough, untested sketches of the three approaches above follow, just before your quoted message.
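A rough sketch of the first approach (untested; handleOne is just a made-up stand-in for your real inner routine, rewritten to accept the *json.Decoder instead of an io.Reader):

    package main

    import (
        "encoding/json"
        "fmt"
        "log"
        "os"
    )

    // handleOne stands in for the inner routine; it takes the long-lived
    // decoder rather than a fresh io.Reader.
    func handleOne(dec *json.Decoder) error {
        var v interface{}
        if err := dec.Decode(&v); err != nil {
            return err
        }
        fmt.Printf("%+v\n", v)
        return nil
    }

    func main() {
        dec := json.NewDecoder(os.Stdin) // whatever io.Reader the stream comes from
        for dec.More() {
            if err := handleOne(dec); err != nil {
                log.Fatal(err)
            }
        }
    }

Because the same decoder is reused, anything it has buffered past the current value is still there for the next Decode call, so nothing is lost between objects.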
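A rough sketch of the json.RawMessage variant (untested; secondDecoder keeps the signature from your example, and each raw block does still sit in memory before it is handed over):

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "io"
        "log"
        "os"
    )

    // secondDecoder keeps the signature from the example: it receives an io.Reader.
    func secondDecoder(r io.Reader) {
        dec := json.NewDecoder(r)
        var v interface{}
        if err := dec.Decode(&v); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%+v\n", v)
    }

    func main() {
        outer := json.NewDecoder(os.Stdin)
        for outer.More() {
            var raw json.RawMessage // one complete { ... } block, held in memory
            if err := outer.Decode(&raw); err != nil {
                log.Fatal(err)
            }
            secondDecoder(bytes.NewReader(raw))
        }
    }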
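And a rough sketch of the JSON-Lines idea, in case the jq route works out (untested; lineReader is a made-up type, not something in the standard library, and io.Discard needs Go 1.16 or later):

    package main

    import (
        "bufio"
        "encoding/json"
        "fmt"
        "io"
        "log"
        "os"
    )

    // lineReader presents everything up to (and consuming) the next '\n' as its
    // own stream, then reports io.EOF. The shared *bufio.Reader keeps its
    // position, so you build a fresh lineReader for each line.
    type lineReader struct {
        br   *bufio.Reader
        done bool
    }

    func (l *lineReader) Read(p []byte) (int, error) {
        if l.done {
            return 0, io.EOF
        }
        n := 0
        for n < len(p) {
            b, err := l.br.ReadByte()
            if err != nil {
                l.done = true
                if n > 0 {
                    return n, nil
                }
                return 0, err
            }
            if b == '\n' {
                l.done = true
                return n, io.EOF
            }
            p[n] = b
            n++
        }
        return n, nil
    }

    func main() {
        br := bufio.NewReader(os.Stdin) // assumes exactly one JSON value per line
        for {
            if _, err := br.Peek(1); err != nil {
                break // no more input
            }
            lr := &lineReader{br: br}
            dec := json.NewDecoder(lr)
            var v interface{}
            if err := dec.Decode(&v); err != nil {
                log.Fatal(err)
            }
            fmt.Printf("%+v\n", v) // in your case, hand lr to secondDecoder instead
            io.Copy(io.Discard, lr) // drain the rest of this line before the next one
        }
    }

As I said, this is a lot of machinery compared with simply keeping the decoder around.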
On Saturday, 27 March 2021 at 19:42:40 UTC greg.sa...@gmail.com wrote:

> Good afternoon,
>
> For a case where there's a file containing a sequence of hashes (it could
> be arrays too, as the underlying object type seems irrelevant) as per
> RFC-7464, I cannot figure out how to handle this in a memory-efficient way
> that doesn't involve pulling each blob into memory.
>
> I've tried to express this on Go playground here:
> https://play.golang.org/p/Aqx0gnc39rn
> Note that I'm using exponent-io/jsonpath as the JSON decoder, but
> certainly that could be swapped for something else.
>
> In essence here is an example of the input bytes:
>
> {
>     "elements" : [
>         {
>             "Space" : "YCbCr",
>             "Point" : {
>                 "Cb" : 0,
>                 "Y" : 255,
>                 "Cr" : -10
>             }
>         },
>         {
>             "Point" : {
>                 "B" : 255,
>                 "R" : 98,
>                 "G" : 218
>             },
>             "Space" : "RGB"
>         }
>     ]
> }
> {
>     "elements" : [
>         {
>             "Space" : "YCbCr",
>             "Point" : {
>                 "Cb" : 3000,
>                 "Y" : 355,
>                 "Cr" : -310
>             }
>         },
>         {
>             "Space" : "RGB",
>             "Point" : {
>                 "B" : 355,
>                 "G" : 318,
>                 "R" : 108
>             }
>         }
>     ]
> }
> {
>     "elements" : [
>         {
>             "Space" : "YCbCr",
>             "Point" : {
>                 "Cr" : -410,
>                 "Cb" : 400,
>                 "Y" : 455
>             }
>         },
>         {
>             "Space" : "RGB",
>             "Point" : {
>                 "B" : 455,
>                 "R" : 118,
>                 "G" : 418
>             }
>         }
>     ]
> }
>
> I can iterate through that with this code:
>
>     w := json.NewDecoder(bytes.NewReader(j))
>     for w.More() {
>         var v interface{}
>         w.Decode(&v)
>         fmt.Printf("%+v\n", v)
>     }
>
> This works, but the downside is that each {...} of bytes has to be pulled
> into memory. And the function that is called is already designed to
> receive an io.Reader and parse the VERY large inner blob in an efficient
> manner.
>
> So in principle, this is kinda what I want to do, but maybe I'm looking at
> it all wrong:
>
>     w := json.NewDecoder(bytes.NewReader(j))
>     for w.More() {
>         reader2 := ???? // some io.Reader that represents each of the 3 json-seq blocks
>         secondDecoder(reader2)
>     }
>
>     func secondDecoder(reader io.Reader) {
>         w2 := json.NewDecoder(reader)
>         var v interface{}
>         w2.Decode(&v)
>         fmt.Printf("%+v\n", v)
>     }
>
> Any ideas on how to solve this problem?
>
> I should note that it is not possible for the input to change in this case,
> as the system that consumes it is not the same one that has been generating
> it for the past 5 years.
>
> Thanks!
>
> - Greg