Hi,
I'm in a somewhat similar situation. Here's what I do (it seems to be
working so far):
1. Stream in the JSON as a plain string.
2. Feed this string into a JSON library to validate it (I use Circe).
3. Using the same library, parse the JSON and extract fields X, Y and Z.
4. Create a dataset wi
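
Steps 2 and 3 above might look roughly like the sketch below. Note that this is not the Circe code referred to in the message: it swaps in Python's standard json module so that the example stays self-contained. The function name extract_fields and the rule that all three fields must be present are assumptions made for illustration; X, Y and Z are the placeholder field names from the list.

```python
import json
from typing import Optional

# Placeholder field names taken from the steps above.
REQUIRED_FIELDS = ("X", "Y", "Z")


def extract_fields(raw: str) -> Optional[dict]:
    """Validate `raw` as JSON and return its X, Y and Z fields.

    Returns None when the string is not valid JSON, is not a JSON
    object, or lacks one of the required fields (an assumed policy).
    """
    try:
        parsed = json.loads(raw)  # step 2: validation happens while parsing
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None
    if any(name not in parsed for name in REQUIRED_FIELDS):
        return None
    # step 3: keep only the fields of interest
    return {name: parsed[name] for name in REQUIRED_FIELDS}
```

Returning None instead of raising keeps malformed records from failing the whole stream; the caller can drop them or route them elsewhere.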
Hi Lian,
"What have you tried?" would be a good starting point. Any help on this?
How do you read the JSONs? With readStream.json? You could use readStream.text
instead, followed by a filter that separates the good JSONs from the bad ones.
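
As a rough illustration of the predicate such a filter could use, here is a minimal sketch in plain Python with the standard json module. It runs on an ordinary list of strings rather than on a Spark stream, and the names is_valid_json and split_good_bad are hypothetical. In Spark, a predicate like this would be applied to the single string column that readStream.text produces.

```python
import json


def is_valid_json(line: str) -> bool:
    """Return True if `line` parses as JSON, False otherwise."""
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


def split_good_bad(lines):
    """Partition raw text lines into (good, bad) lists by JSON validity."""
    good, bad = [], []
    for line in lines:
        (good if is_valid_json(line) else bad).append(line)
    return good, bad
```

Keeping the bad lines rather than discarding them makes it possible to inspect or log what failed to parse.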
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Mastering Spark SQL h