It is dynamically generated and written to an S3 bucket, not historical data, so I guess it doesn't have jsonlines format.
On Thu, Jun 18, 2020 at 9:16 AM Jörn Franke <jornfra...@gmail.com> wrote:

> Depends on the data types you use.
>
> Do you have it in jsonlines format? Then the amount of memory plays much
> less of a role.
>
> Otherwise, if it is one large object or array, I would not recommend it.
>
> > On 18.06.2020 at 15:12, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
> >
> > Hi Spark Users,
> >
> > I have a 50 GB JSON file that I would like to read and persist at HDFS
> > so it can be taken into the next transformation. I am trying to read it
> > as spark.read.json(path), but this gives an out-of-memory error on the
> > driver. Obviously, I can't afford to have 50 GB of driver memory. In
> > general, what is the best practice for reading a large JSON file like
> > 50 GB?
> >
> > Thanks
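[Editor's note: a minimal sketch (Scala) of the jsonlines approach Jörn
describes. With one JSON object per line the input is splittable across
executors, so nothing has to fit in driver memory, and an explicit schema
skips the inference scan. The S3 path, HDFS path, and field names below are
hypothetical placeholders, not from the thread.]

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object ReadLargeJson {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-large-json")
      .getOrCreate()

    // Explicit schema: avoids the schema-inference pass, which would
    // otherwise scan the whole 50 GB once just to guess the types.
    val schema = StructType(Seq(
      StructField("id", LongType),        // hypothetical field
      StructField("payload", StringType)  // hypothetical field
    ))

    // JSON Lines (one object per line) is splittable, so the read is
    // distributed across executors; the driver only plans the job.
    val df = spark.read
      .schema(schema)
      .json("s3a://my-bucket/events/")    // hypothetical S3 path

    // Persist to HDFS in a columnar format for the next transformation.
    df.write
      .mode("overwrite")
      .parquet("hdfs:///staging/events")  // hypothetical HDFS path

    spark.stop()
  }
}

If the file really is one large JSON object or array, as the reply above
suggests, spark.read.option("multiLine", true).json(path) can parse it, but
Spark then treats each file as a single unsplittable record, which brings
the memory problem back; pre-splitting it into JSON Lines with a streaming
parser first is usually the safer route.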