Hello!
I am running Spark on Java and bumped into a problem I can't solve or find anything helpful among answered questions, so I would really appreciate your help.
I am running some calculations, creating rows for each result:

    List<Row> results = new LinkedList<>();
    for (/* something */) {
        results.add(RowFactory.create(/* values for this result */));
    }

How can I turn this list of rows into a Dataset?
Maybe you could try something like this:

    SparkSession sparkSession = SparkSession
            .builder()
            .appName("Rows2DataSet")
            .master("local")
            .getOrCreate();
    List<Row> results = new LinkedList<>();
    JavaRDD<Row> jsonRDD =
            new JavaSparkContext(sparkSession.sparkContext()).parallelize(results);
Looks like the parallelization into an RDD was the right move I was omitting:

    JavaRDD<Row> jsonRDD =
            new JavaSparkContext(sparkSession.sparkContext()).parallelize(results);

Then I created a schema:

    List<StructField> fields = new ArrayList<>();
    fields.add(DataTypes.createStructField("column_name1", DataTypes.StringType, true));
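For completeness, here is a minimal end-to-end sketch of how the pieces in this thread fit together. The column name ("column_name1") and the row values are placeholders, and SparkSession.createDataFrame(JavaRDD<Row>, StructType) is the call that joins the RDD to the schema:

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructField;
    import org.apache.spark.sql.types.StructType;

    public class Rows2DataSet {
        public static void main(String[] args) {
            SparkSession sparkSession = SparkSession
                    .builder()
                    .appName("Rows2DataSet")
                    .master("local")
                    .getOrCreate();

            // Build one Row per result (placeholder values).
            List<Row> results = new LinkedList<>();
            results.add(RowFactory.create("first result"));
            results.add(RowFactory.create("second result"));

            // Distribute the local list as an RDD of rows.
            JavaRDD<Row> jsonRDD =
                    new JavaSparkContext(sparkSession.sparkContext()).parallelize(results);

            // Describe the single string column the rows carry.
            List<StructField> fields = new ArrayList<>();
            fields.add(DataTypes.createStructField("column_name1", DataTypes.StringType, true));
            StructType schema = DataTypes.createStructType(fields);

            // Combine RDD and schema into a Dataset<Row> and verify.
            Dataset<Row> dataset = sparkSession.createDataFrame(jsonRDD, schema);
            dataset.show();

            sparkSession.stop();
        }
    }

As a side note, SparkSession.createDataFrame also has an overload that accepts the List<Row> directly (createDataFrame(rows, schema)), so the explicit parallelize step can be skipped when the data already fits in driver memory.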