Hi,

I have prices coming through Kafka in the following format:
key,{JSON data}

The key is needed as part of the data posted to a NoSQL database such as Aerospike.

The following is a record from the Kafka topic:

ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2,{"rowkey":"ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2","ticker":"SBRY", "timeissued":"2019-06-18T22:10:26", "price":555.75}

The "key":"value" pairs inside {} are valid JSON, as JSONLint (https://jsonlint.com/) shows:

{
  "rowkey": "ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2",
  "ticker": "SBRY",
  "timeissued": "2019-06-18T22:10:26",
  "price": 555.75
}

Now I need to extract the values from this JSON. One way would be to go through the dstream and split each record on ',' and ':':

    dstream.foreachRDD { pricesRDD =>
      if (!pricesRDD.isEmpty) // data exists in RDD
      {
        for (row <- pricesRDD.collect.toArray)
        {
          // row is a (key, value) tuple; row._2 holds the JSON text
          println(row)
          println(row._2.split(',').view(0).toString)
          println(row._2.split(',').view(1).split(':').view(1).toString)
          println(row._2.split(',').view(2).split(':').view(1).toString)
          println(row._2.split(',').view(3).split(':').view(1).toString)
        }
      }
    }

The results are hit and miss. The sample output below shows the incorrect parsing:

    (ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2,{"rowkey":"ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2","ticker":"SBRY", "timeissued":"2019-06-18T22:10:26", "price":555.75})
    {"rowkey":"ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2"
    "SBRY"              // correct
    "2019-06-18T22      // missing half
    555.75}             // incorrect

Is there any way of reading JSON data systematically?

Thanks,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
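
For reference, the split-based extraction in the message fails for two reasons that are visible in the sample output: ':' also occurs inside the timeissued value, and the closing '}' stays attached to the last field. A minimal sketch of a more systematic route is to hand the value to a real JSON parser. The sketch below assumes json4s is on the classpath (Spark itself depends on it, so it normally is). The names Price, PriceJson and parsePrice are hypothetical, introduced only for illustration, and the field types are guessed from the sample record.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parseOpt

// Hypothetical holder for the four fields seen in the sample record.
final case class Price(rowkey: String, ticker: String, timeissued: String, price: Double)

object PriceJson {
  // json4s needs an implicit Formats instance for extractOpt[...].
  implicit val formats: Formats = DefaultFormats

  // Returns None when the text is not valid JSON, or when a field
  // is missing or cannot be read as the expected type.
  def parsePrice(json: String): Option[Price] =
    parseOpt(json).flatMap { doc =>
      for {
        rowkey     <- (doc \ "rowkey").extractOpt[String]
        ticker     <- (doc \ "ticker").extractOpt[String]
        timeissued <- (doc \ "timeissued").extractOpt[String]
        price      <- (doc \ "price").extractOpt[Double]
      } yield Price(rowkey, ticker, timeissued, price)
    }
}

object PriceJsonDemo extends App {
  // The JSON part of the sample record from the message.
  val value =
    """{"rowkey":"ba7e6bdc-2a92-4dc3-8e28-a75e1a7d58f2","ticker":"SBRY", "timeissued":"2019-06-18T22:10:26", "price":555.75}"""

  PriceJson.parsePrice(value).foreach(println)
}
```

Inside foreachRDD, row._2 could be passed to parsePrice in place of the chained split calls, while row._1 remains available as the key for the Aerospike write. Returning an Option means a malformed record can be skipped or logged rather than failing the whole batch.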