Okay, thanks a lot Fabian!

On Wed, Apr 27, 2016 at 12:34 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> You should do the parsing in a Map operator. Map applies the MapFunction to
> each element in the DataSet.
> So you can either implement another MapFunction or extend the one you have
> to call the JSON parser.
>
> 2016-04-27 6:40 GMT+02:00 Punit Naik <naik.puni...@gmail.com>:
>
> > Hi
> >
> > So I managed to do the map part. I stuck with the "import
> > scala.util.parsing.json._" library for parsing.
> >
> > First I read my JSON:
> >
> > val data = env.readTextFile("file:///home/punit/vik-in")
> >
> > Then I transformed it so that it can be parsed to a map:
> >
> > val j = data.map { x => ("\"\"\"").+(x).+("\"\"\"") }
> >
> > I checked it by printing "j"'s first value, and it looks proper.
> >
> > But when I tried to parse "j" like this:
> >
> > JSON.parseFull(j.first(1))
> >
> > it did not parse, because the object "j.first(1)" is still a DataSet
> > object and not a String object.
> >
> > So how can I get the underlying Java object from the DataSet object?
> >
> > On Tue, Apr 26, 2016 at 3:32 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > you need to implement the MapFunction interface [1].
> > > Inside the MapFunction you can use any JSON parser library, such as
> > > Jackson, to parse the String.
> > > The exact logic depends on your use case.
> > >
> > > However, you should be careful not to initialize a new parser in each
> > > map() call, because that would be quite expensive.
> > > I recommend extending the RichMapFunction and instantiating a parser
> > > in the open() method.
> > >
> > > Best, Fabian
> > >
> > > [1] https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/batch/dataset_transformations.html#map
> > > [2] https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/common/index.html#specifying-transformation-functions
> > >
> > > 2016-04-26 10:44 GMT+02:00 Punit Naik <naik.puni...@gmail.com>:
> > >
> > > > Hi Fabian
> > > >
> > > > Thanks for the reply.
> > > > Yes, my JSON is separated by new lines. It would have
> > > > been great if you had explained the function that goes inside the map. I
> > > > tried to use the 'scala.util.parsing.json._' library but had no luck.
> > > >
> > > > On Tue, Apr 26, 2016 at 1:11 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> > > >
> > > > > Hi Punit,
> > > > >
> > > > > JSON can be hard to parse in parallel due to its nested structure. It
> > > > > depends on the schema and (textual) representation of the JSON whether
> > > > > and how it can be done. The problem is that a parallel input format
> > > > > needs to be able to identify record boundaries without context
> > > > > information. This can be very easy if your JSON data is a list of JSON
> > > > > objects which are separated by a newline character. However, this is
> > > > > hard to generalize. That's why Flink does not offer tooling for it (yet).
> > > > >
> > > > > If your JSON objects are separated by newline characters, the easiest
> > > > > way is to read the data as a text file, where each line results in a
> > > > > String, and parse each object using a standard JSON parser. This would
> > > > > look like:
> > > > >
> > > > > ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
> > > > >
> > > > > DataSet<String> text = env.readTextFile("/path/to/jsonfile");
> > > > > DataSet<YourObject> json = text.map(new YourMapFunctionWhichParsesJSON());
> > > > >
> > > > > Best, Fabian
> > > > >
> > > > > 2016-04-26 8:06 GMT+02:00 Punit Naik <naik.puni...@gmail.com>:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > I am new to Flink. I was experimenting with the DataSet API and found
> > > > > > out that there is no explicit method for loading a JSON file as
> > > > > > input. Can anyone please suggest a workaround?
> > > > > >
> > > > > > --
> > > > > > Thank You
> > > > > >
> > > > > > Regards
> > > > > >
> > > > > > Punit Naik

--
Thank You

Regards

Punit Naik
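[Editor's note] For anyone reading this thread later, the workflow Fabian describes (read the file line by line with readTextFile, then parse each line inside a map function) can be sketched in Scala roughly as follows. This is only a sketch against the Flink 1.0 Scala DataSet API, using the scala.util.parsing.json parser mentioned in the thread and the input path from the thread; it is not code from either poster. The key point is that JSON.parseFull runs on each String element inside map, so there is no need to call first() on the DataSet:

```scala
import org.apache.flink.api.scala._
import scala.util.parsing.json.JSON

object ParseJsonLines {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // One JSON object per line, as discussed in the thread.
    val data: DataSet[String] = env.readTextFile("file:///home/punit/vik-in")

    // Parse inside the map function: each element here is a plain String,
    // so JSON.parseFull can be applied to it directly. parseFull returns
    // an Option[Any], which is None for lines that fail to parse.
    val parsed: DataSet[Option[Any]] = data.map(line => JSON.parseFull(line))

    parsed.first(5).print()
  }
}
```

Note that the triple-quote wrapping attempted in the thread is unnecessary: parseFull takes the raw JSON text of each line.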
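[Editor's note] Fabian's advice to instantiate the parser once in open() of a RichMapFunction, rather than once per map() call, could look roughly like this in Scala. It assumes Jackson (which he names as an example parser) is on the classpath; the class name is illustrative:

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}

// Parses one JSON object per input line. The ObjectMapper is created once
// per parallel task instance in open(), not once per map() call, which
// avoids the per-record initialization cost Fabian warns about.
class ParseJsonFunction extends RichMapFunction[String, JsonNode] {
  @transient private var mapper: ObjectMapper = _

  override def open(parameters: Configuration): Unit = {
    mapper = new ObjectMapper()
  }

  override def map(line: String): JsonNode = mapper.readTree(line)
}
```

It would be used as text.map(new ParseJsonFunction()), mirroring Fabian's Java snippet in the thread (with the Scala DataSet API import in scope so type information can be derived).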