After seeing Hyukjin Kwon's comment in SPARK-17583, I think it's safe to say
that what I am seeing with CSV is not a bug or regression; it was unintended
and/or unreliable behavior in Spark 2.0.x.

On Wed, Nov 30, 2016 at 5:56 PM, Koert Kuipers <ko...@tresata.com> wrote:

> Running our in-house unit tests (which pass with Spark 2.0.2) against Spark
> 2.1.0-rc1, I see the following issues.
>
> Any test that uses Avro (spark-avro 3.1.0) fails with this error:
> java.lang.AbstractMethodError
>     at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.<init>(FileFormatWriter.scala:232)
>     at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:182)
>     at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:129)
>     at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:128)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>     at org.apache.spark.scheduler.Task.run(Task.scala:99)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
>
> So it looks like some API got changed or broken. I don't know if this is an
> issue or if this is OK.
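>
> For reference, a minimal sketch of the kind of write that hits this code
> path through spark-avro 3.1.0 (the input and output paths are made up):
>
>     import org.apache.spark.sql.SparkSession
>
>     val spark = SparkSession.builder().appName("avro-write-repro").getOrCreate()
>     // any DataFrame write through the avro data source goes through FileFormatWriter
>     val df = spark.read.json("/tmp/some-input.json")
>     df.write.format("com.databricks.spark.avro").save("/tmp/avro-out")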
>
> Also, a bunch of unit tests related to reading and writing CSV files fail.
> The issue seems to be newlines inside quoted values. This worked before and
> now it doesn't work anymore. I don't know if this was an accidentally
> supported feature and it's OK for it to be broken? I am not even sure it is
> a good idea to support newlines inside quoted values. Anyhow, such values
> still get written out the same way as before, but now reading them back in
> breaks.
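>
> To make the CSV issue concrete, here is a rough sketch of the round trip
> that now fails on read (paths and column names are made up):
>
>     import org.apache.spark.sql.SparkSession
>
>     val spark = SparkSession.builder().appName("csv-newline-repro").getOrCreate()
>     import spark.implicits._
>
>     // a value containing a newline gets quoted on write, same as in 2.0.2
>     val df = Seq(("a", "line1\nline2"), ("b", "plain")).toDF("key", "value")
>     df.write.option("header", "true").csv("/tmp/csv-out")
>
>     // reading it back no longer reassembles the quoted multi-line value into one row
>     val back = spark.read.option("header", "true").csv("/tmp/csv-out")
>     back.show()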
>
>
> On Mon, Nov 28, 2016 at 8:25 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.1.0. The vote is open until Thursday, December 1, 2016 at 18:00 UTC and
>> passes if a majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 2.1.0
>> [ ] -1 Do not release this package because ...
>>
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.1.0-rc1 (80aabc0bd33dc5661a90133156247e7a8c1bf7f5)
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc1-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1216/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc1-docs/
>>
>>
>> =======================================
>> How can I help test this release?
>> =======================================
>> If you are a Spark user, you can help us test this release by taking an
>> existing Spark workload and running it on this release candidate, then
>> reporting any regressions.
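>>
>> For example, a build.sbt sketch that resolves the release candidate from
>> the staging repository above (adjust the artifacts to your own workload):
>>
>>     // point sbt at the RC staging repository
>>     resolvers += "apache-spark-2.1.0-rc1-staging" at
>>       "https://repository.apache.org/content/repositories/orgapachespark-1216/"
>>
>>     libraryDependencies ++= Seq(
>>       "org.apache.spark" %% "spark-core" % "2.1.0",
>>       "org.apache.spark" %% "spark-sql"  % "2.1.0"
>>     )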
>>
>> ===============================================================
>> What should happen to JIRA tickets still targeting 2.1.0?
>> ===============================================================
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should be
>> worked on immediately. Please retarget everything else to 2.1.1 or 2.2.0.
>>
>>
>>
>
