You need to use sc.wholeTextFiles to read each file in one piece. Otherwise, it can be split across partitions.
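
A minimal sketch of that suggestion in PySpark, assuming `sc` and `paths` are the same objects as in the quoted snippet and that each file is small enough to fit in memory on one executor. The helper name `parse_json_lines` is made up for illustration. The Spark call is left as a comment so the helper can be tried without a cluster.

```python
import json


def parse_json_lines(path_and_content):
    """Turn one (path, content) pair from wholeTextFiles into a list of records."""
    _path, content = path_and_content
    # Split on "\n" only, so unusual separators inside JSON strings are left alone.
    lines = content.split("\n")
    # Skip blank lines (for example the empty string after a trailing newline).
    return [json.loads(line) for line in lines if line.strip()]


# Usage with Spark (assumes an existing SparkContext `sc` and a `paths` string):
# records = sc.wholeTextFiles(paths).flatMap(parse_json_lines)
```

wholeTextFiles yields one (path, content) pair per file, so flatMap is used to get back to one record per JSON line.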

DB Tsai - Sent From My Phone

On Mar 17, 2016 12:45 AM, "Blaž Šnuderl" <snud...@gmail.com> wrote:
> Hi.
>
> We have json data stored in S3 (json record per line). When reading the
> data from s3 using the following code we started noticing json decode
> errors.
>
> sc.textFile(paths).map(json.loads)
>
> After a bit more investigation we noticed an incomplete line, basically
> the line was
>
>> {"key": "value", "key2": <- notice the line abruptly ends with no json
>> close tag etc
>
> It is not an issue with our data and it doesn't happen very often, but it
> makes us very scared since it means spark could be dropping data.
>
> We are using spark 1.5.1. Any ideas why this happens and possible fixes?
>
> Regards,
> Blaž Šnuderl
>