rdblue commented on a change in pull request #6: Support customizing the location where data is written in Spark
URL: https://github.com/apache/incubator-iceberg/pull/6#discussion_r239182889
 
 

 ##########
 File path: 
spark/src/main/java/com/netflix/iceberg/spark/source/IcebergSource.java
 ##########
 @@ -89,7 +92,11 @@ public DataSourceReader createReader(DataSourceOptions options) {
           .toUpperCase(Locale.ENGLISH));
     }
 
-    return Optional.of(new Writer(table, lazyConf(), format));
+    String dataLocation = options.get(TableProperties.WRITE_NEW_DATA_LOCATION)
+        .orElse(table.properties().getOrDefault(
+            TableProperties.WRITE_NEW_DATA_LOCATION,
+            new Path(new Path(table.location()), "data").toString()));
+    return Optional.of(new Writer(table, lazyConf(), format, dataLocation));
 
 Review comment:
   What seems strange to me is passing the write location into the writer 
when we're already passing the table into the writer. Why isn't that logic 
handled entirely in the writer? Normally, the write location comes from table 
config. I'm not even sure that we should allow overriding the write location in 
Spark's write properties. What is the use case there?
   
   In general, I like your reasoning about not passing options as a map, to 
keep testing clear, but doing it here just shifts the concern to a different 
test. The test case is that setting "write.folder-storage.path" in Spark 
options changes the location of output files. A test that passes in the 
location can validate that the location is respected, but what we actually want 
to test is that the write location falls back to the table's default, or is set 
by the table property, or (maybe) is set by Spark options.
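
   To make the precedence discussed above concrete, here is a minimal sketch of 
resolving the location in one place, as the writer could do internally. It is 
not code from the PR: DataLocationResolver and resolve are hypothetical names, 
plain maps stand in for Spark's DataSourceOptions and table.properties(), the 
key is assumed to be the value of TableProperties.WRITE_NEW_DATA_LOCATION, and 
simple string joining stands in for Hadoop's Path.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of the PR: keeps the precedence rules in one place.
final class DataLocationResolver {
  // Key quoted in the review comment; assumed to be the value of
  // TableProperties.WRITE_NEW_DATA_LOCATION.
  static final String WRITE_NEW_DATA_LOCATION = "write.folder-storage.path";

  private DataLocationResolver() {
  }

  // Precedence: Spark write option, then table property, then <tableLocation>/data.
  static String resolve(Map<String, String> writeOptions,
                        Map<String, String> tableProperties,
                        String tableLocation) {
    String fromOptions = writeOptions.get(WRITE_NEW_DATA_LOCATION);
    if (fromOptions != null) {
      return fromOptions;
    }

    String fromTable = tableProperties.get(WRITE_NEW_DATA_LOCATION);
    if (fromTable != null) {
      return fromTable;
    }

    // Simplified stand-in for new Path(new Path(table.location()), "data").
    String base = tableLocation.endsWith("/")
        ? tableLocation.substring(0, tableLocation.length() - 1)
        : tableLocation;
    return base + "/data";
  }

  public static void main(String[] args) {
    Map<String, String> none = Collections.emptyMap();
    Map<String, String> tableProp =
        Collections.singletonMap(WRITE_NEW_DATA_LOCATION, "s3://bucket/custom");
    Map<String, String> sparkOpt =
        Collections.singletonMap(WRITE_NEW_DATA_LOCATION, "s3://bucket/override");

    System.out.println(resolve(none, none, "s3://bucket/tbl"));
    System.out.println(resolve(none, tableProp, "s3://bucket/tbl"));
    System.out.println(resolve(sparkOpt, tableProp, "s3://bucket/tbl"));
  }
}
```

   With the logic in one function, the three cases (default, table property, 
Spark option) can each be tested directly. Dropping the first lookup would 
remove the Spark-option override if that turns out not to have a use case.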

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
