On Wed, Jun 4, 2014 at 3:33 PM, Matt Kielo <mki...@oculusinfo.com> wrote:
> I'm trying to run some Spark code on a cluster but I keep running into a
> "java.io.StreamCorruptedException: invalid type code: AC" error. My task
> involves analyzing ~50GB of data (some operations involve sorting) then
> writing them out to a JSON file. I'm running the analysis on each of the
> data's ~10 columns and have never had a successful run. My program seems to
> run for a varying amount of time each time (between ~5 and 30 minutes) but it
> always terminates with this error.
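
I can tell you that this usually means that something, somewhere, wrote objects to the same OutputStream through more than one ObjectOutputStream. AC is the first byte of the serialization stream header (0xACED) that every new ObjectOutputStream writes, so a reader that meets it in the middle of a stream reports it as an invalid type code. This can happen, for example, if an OutputStream is reused across object serializations but a new ObjectOutputStream is opened each time. I don't see offhand where or how that could happen here, but maybe it rings a bell for someone.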
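
To make the failure mode concrete, here is a minimal sketch in plain Java (no Spark involved) that reproduces the same message. The class name, method name, and the "record-1"/"record-2" payloads are made up for illustration. It only shows the general mechanism and is not a diagnosis of where it happens in your job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;

// Hypothetical demo class: not part of Spark or of the original program.
class StreamHeaderDemo {

    // Writes two objects to ONE underlying stream through TWO
    // ObjectOutputStreams, then reads them back with a single
    // ObjectInputStream. Returns the exception message, or null if
    // both reads succeed.
    static String reproduce() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        // The first writer emits the stream header (AC ED 00 05) and an object.
        ObjectOutputStream first = new ObjectOutputStream(sink);
        first.writeObject("record-1");
        first.flush();

        // The second writer on the same sink emits a second header mid-stream.
        ObjectOutputStream second = new ObjectOutputStream(sink);
        second.writeObject("record-2");
        second.flush();

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            in.readObject(); // reads "record-1" normally
            in.readObject(); // meets byte 0xAC where a type code is expected
            return null;
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(reproduce()); // prints "invalid type code: AC"
    }
}
```

If both objects are written through the same ObjectOutputStream, the header is written only once and both reads succeed.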