See https://github.com/apache/spark/pull/22688
+Wenchen, here is where the problem was raised. This might have to be
considered a blocker ...

On Thu, 11 Oct 2018, 2:48 pm assaf.mendelson, <assaf.mendel...@rsa.com> wrote:

> Hi,
>
> I created a data source writer WITHOUT a reader. When I do, I get an
> exception: org.apache.spark.sql.AnalysisException: Data source is not
> readable: DefaultSource
>
> The reason for this is that when save is called, inside the match on the
> source for WriteSupport we have the following code:
>
>     val source = cls.newInstance().asInstanceOf[DataSourceV2]
>     source match {
>       case ws: WriteSupport =>
>         val sessionOptions = DataSourceV2Utils.extractSessionConfigs(
>           source, df.sparkSession.sessionState.conf)
>         val options = sessionOptions ++ extraOptions
>     --> val relation = DataSourceV2Relation.create(source, options)
>
>         if (mode == SaveMode.Append) {
>           runCommand(df.sparkSession, "save") {
>             AppendData.byName(relation, df.logicalPlan)
>           }
>         } else {
>           val writer = ws.createWriter(
>             UUID.randomUUID.toString, df.logicalPlan.output.toStructType,
>             mode, new DataSourceOptions(options.asJava))
>
>           if (writer.isPresent) {
>             runCommand(df.sparkSession, "save") {
>               WriteToDataSourceV2(writer.get, df.logicalPlan)
>             }
>           }
>         }
>
> but DataSourceV2Relation.create actively creates a reader
> (source.createReader) to extract the schema:
>
>     def create(
>         source: DataSourceV2,
>         options: Map[String, String],
>         tableIdent: Option[TableIdentifier] = None,
>         userSpecifiedSchema: Option[StructType] = None): DataSourceV2Relation = {
>       val reader = source.createReader(options, userSpecifiedSchema)
>       val ident = tableIdent.orElse(tableFromOptions(options))
>       DataSourceV2Relation(
>         source, reader.readSchema().toAttributes, options, ident,
>         userSpecifiedSchema)
>     }
>
> This makes me a little confused.
>
> First, the schema is defined by the dataframe itself, not by the data
> source, i.e. it should be extracted from df.schema and not via
> source.createReader.
>
> Second, I see that the relation is actually only used if the mode is
> SaveMode.Append (btw, this means that if it is only needed there, it
> should be defined inside the "if"). I am not sure I understand the
> AppendData portion, but why would reading from the source be included?
>
> Am I missing something here?
>
> Thanks,
> Assaf
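For reference, a minimal write-only source that reproduces the exception
could look like the sketch below. This is only a sketch against the Spark
2.4-era DataSourceV2 API (these interfaces changed between 2.3 and 2.4, so
the exact signatures may need adjusting); the class names and the no-op
behavior are hypothetical, not taken from the thread.

    import java.util.Optional

    import org.apache.spark.sql.SaveMode
    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.sources.v2.{DataSourceOptions, DataSourceV2, WriteSupport}
    import org.apache.spark.sql.sources.v2.writer.{DataSourceWriter, DataWriter, DataWriterFactory, WriterCommitMessage}
    import org.apache.spark.sql.types.StructType

    // A source that only implements WriteSupport; there is no ReadSupport,
    // so DataSourceV2Relation.create cannot obtain a reader from it.
    class DefaultSource extends DataSourceV2 with WriteSupport {
      override def createWriter(
          writeUUID: String,
          schema: StructType,
          mode: SaveMode,
          options: DataSourceOptions): Optional[DataSourceWriter] =
        Optional.of(new NoOpWriter)
    }

    // Writer plumbing that simply discards every row.
    private class NoOpWriter extends DataSourceWriter {
      override def createWriterFactory(): DataWriterFactory[InternalRow] =
        new NoOpWriterFactory
      override def commit(messages: Array[WriterCommitMessage]): Unit = ()
      override def abort(messages: Array[WriterCommitMessage]): Unit = ()
    }

    private class NoOpWriterFactory extends DataWriterFactory[InternalRow] {
      override def createDataWriter(
          partitionId: Int,
          taskId: Long,
          epochId: Long): DataWriter[InternalRow] = new NoOpDataWriter
    }

    private class NoOpDataWriter extends DataWriter[InternalRow] {
      override def write(record: InternalRow): Unit = ()  // drop the row
      override def commit(): WriterCommitMessage = new WriterCommitMessage {}
      override def abort(): Unit = ()
    }

With that on the classpath, a plain df.write.format(...).save() against this
source should fail inside DataSourceV2Relation.create with "Data source is
not readable: DefaultSource" before a writer is ever requested, which matches
the behavior described above.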
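On the mail's first point, one possible direction is to build the relation's
output from the schema of the DataFrame being written instead of
instantiating a reader. A rough sketch of that idea follows; createForWrite
is an invented name, and this only illustrates the suggestion, not
necessarily how Spark ultimately resolved the issue.

    import org.apache.spark.sql.catalyst.TableIdentifier
    import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
    import org.apache.spark.sql.sources.v2.DataSourceV2
    import org.apache.spark.sql.types.StructType

    // Hypothetical write-path variant: the attributes come from the schema
    // of the DataFrame being written (df.schema), so no reader is created
    // and write-only sources do not fail.
    def createForWrite(
        source: DataSourceV2,
        options: Map[String, String],
        writeSchema: StructType,
        tableIdent: Option[TableIdentifier] = None): DataSourceV2Relation = {
      DataSourceV2Relation(
        source, writeSchema.toAttributes, options, tableIdent, Some(writeSchema))
    }

The trade-off is visible in the Append branch quoted above:
AppendData.byName(relation, df.logicalPlan) resolves the incoming columns
against the relation's schema, and for an existing table that schema has to
come from the source itself; deriving it from df.schema would make that
validation a no-op. That may explain why create probes the reader, even
though for a write-only source there is nothing to probe.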