[jira] [Commented] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361933#comment-16361933 ] Apache Spark commented on SPARK-23404: -- User '10110346' has created a pull request f

[jira] [Assigned] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23404: Assignee: (was: Apache Spark) > When the underlying buffers are already direct, we sho

[jira] [Assigned] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23404: Assignee: Apache Spark > When the underlying buffers are already direct, we should copy th

[jira] [Updated] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread liuxian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxian updated SPARK-23404: Description: If the memory mode is _ON_HEAP_, when the underlying buffers are direct, we should copy them to

[jira] [Updated] (SPARK-23404) When the underlying buffers are already direct, we should copy them to the heap memory

2018-02-12 Thread liuxian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxian updated SPARK-23404: Summary: When the underlying buffers are already direct, we should copy them to the heap memory (was: When

[jira] [Created] (SPARK-23404) When the underlying buffers are already direct, we should copy it to the heap memory

2018-02-12 Thread liuxian (JIRA)
liuxian created SPARK-23404: --- Summary: When the underlying buffers are already direct, we should copy it to the heap memory Key: SPARK-23404 URL: https://issues.apache.org/jira/browse/SPARK-23404 Project: S

[jira] [Updated] (SPARK-23403) java.lang.ArrayIndexOutOfBoundsException: 10

2018-02-12 Thread Naresh Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh Kumar updated SPARK-23403: - Docs Text: val washing_flat=sc.textFile("hdfs://ip-172-31-53-45:8020/user/narine91267897/washing

[jira] [Created] (SPARK-23403) java.lang.ArrayIndexOutOfBoundsException: 10

2018-02-12 Thread Naresh Kumar (JIRA)
Naresh Kumar created SPARK-23403: Summary: java.lang.ArrayIndexOutOfBoundsException: 10 Key: SPARK-23403 URL: https://issues.apache.org/jira/browse/SPARK-23403 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file should not raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th. Apac

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file should not raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th. Apac

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file should not raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Summary: Empty float/double array columns in ORC file should not raise EOFException (was: Empt

[jira] [Updated] (SPARK-23340) Empty float/double array columns in ORC file raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Summary: Empty float/double array columns in ORC file raise EOFException (was: Empty float/dou

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th. Apac

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Priority: Critical (was: Major) > Empty float/double array columns raise EOFException > --

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Summary: Empty float/double array columns raise EOFException (was: Update ORC to 1.4.3) > Em

[jira] [Updated] (SPARK-23340) Empty float/double array columns raise EOFException

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Component/s: SQL > Empty float/double array columns raise EOFException > --

[jira] [Updated] (SPARK-23340) Update ORC to 1.4.3

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23340: -- Description: This issue updates Apache ORC dependencies to 1.4.3 released on February 9th. Apac

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallapothu Jyothi Swaroop updated SPARK-23402: -- Description: I am using spark dataset write to insert data on postgresq

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallapothu Jyothi Swaroop updated SPARK-23402: -- Description: I am using spark dataset write to insert data on postgresq

[jira] [Updated] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallapothu Jyothi Swaroop updated SPARK-23402: -- Attachment: Emsku[1].jpg > Dataset write method not working as expected

[jira] [Created] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-12 Thread Pallapothu Jyothi Swaroop (JIRA)
Pallapothu Jyothi Swaroop created SPARK-23402: - Summary: Dataset write method not working as expected for postgresql database Key: SPARK-23402 URL: https://issues.apache.org/jira/browse/SPARK-23402

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Vivek Patangiwar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361851#comment-16361851 ] Vivek Patangiwar commented on SPARK-23397: -- Thanks for your response Sean. An e

[jira] [Commented] (SPARK-20090) Add StructType.fieldNames to Python API

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361838#comment-16361838 ] Apache Spark commented on SPARK-20090: -- User 'gatorsmile' has created a pull request

[jira] [Updated] (SPARK-20090) Add StructType.fieldNames to Python API

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20090: Target Version/s: 2.3.0 > Add StructType.fieldNames to Python API > ---

[jira] [Resolved] (SPARK-23303) improve the explain result for data source v2 relations

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23303. - Resolution: Fixed Fix Version/s: 2.4.0 > improve the explain result for data source v2 relations >

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361830#comment-16361830 ] Apache Spark commented on SPARK-23377: -- User 'viirya' has created a pull request for

[jira] [Updated] (SPARK-23316) AnalysisException after max iteration reached for IN query

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23316: Target Version/s: 2.3.0 > AnalysisException after max iteration reached for IN query >

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361824#comment-16361824 ] Joseph K. Bradley commented on SPARK-23377: --- Thanks for reconsidering here [~vi

[jira] [Resolved] (SPARK-23379) remove redundant metastore access if the current database name is the same

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23379. - Resolution: Fixed Assignee: Feng Liu Fix Version/s: 2.4.0 > remove redundant metastore ac

[jira] [Assigned] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23400: Assignee: Apache Spark (was: Xiao Li) > Add the extra constructors for ScalaUDF > ---

[jira] [Updated] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23400: Affects Version/s: (was: 2.2.1) (was: 2.1.2) > Add the extra constructors fo

[jira] [Assigned] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23400: Assignee: Xiao Li (was: Apache Spark) > Add the extra constructors for ScalaUDF > ---

[jira] [Updated] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23400: Summary: Add the extra constructors for ScalaUDF (was: Add two extra constructors for ScalaUDF) > Add the

[jira] [Updated] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23400: Description: In the last few releases, we changed the interface of ScalaUDF. Unfortunately, some Spark Packag

[jira] [Commented] (SPARK-23400) Add the extra constructors for ScalaUDF

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361819#comment-16361819 ] Apache Spark commented on SPARK-23400: -- User 'gatorsmile' has created a pull request

[jira] [Resolved] (SPARK-23323) DataSourceV2 should use the output commit coordinator.

2018-02-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23323. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20490 [https://githu

[jira] [Assigned] (SPARK-23323) DataSourceV2 should use the output commit coordinator.

2018-02-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23323: --- Assignee: Ryan Blue > DataSourceV2 should use the output commit coordinator. > -

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361724#comment-16361724 ] Liang-Chi Hsieh commented on SPARK-23377: - For now, I think neither 3rd option or

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361718#comment-16361718 ] Liang-Chi Hsieh commented on SPARK-23377: - I have no objection to [~josephkb]'s p

[jira] [Commented] (SPARK-23230) When hive.default.fileformat is other kinds of file types, create textfile table cause a serde error

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361696#comment-16361696 ] Apache Spark commented on SPARK-23230: -- User 'cxzl25' has created a pull request for

[jira] [Comment Edited] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361154#comment-16361154 ] Joseph K. Bradley edited comment on SPARK-23377 at 2/13/18 1:10 AM: ---

[jira] [Updated] (SPARK-23352) Explicitly specify supported types in Pandas UDFs

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23352: Fix Version/s: 2.3.1 > Explicitly specify supported types in Pandas UDFs >

[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361654#comment-16361654 ] Apache Spark commented on SPARK-23154: -- User 'jkbradley' has created a pull request

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2018-02-12 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361631#comment-16361631 ] Miao Wang commented on SPARK-20307: --- [~felixcheung] I will do it during the Lunar New Y

[jira] [Resolved] (SPARK-23230) When hive.default.fileformat is other kinds of file types, create textfile table cause a serde error

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23230. - Resolution: Fixed Assignee: dzcxzl Fix Version/s: 2.3.0 > When hive.default.fileformat is

[jira] [Updated] (SPARK-22820) Spark 2.3 SQL API audit

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22820: Fix Version/s: 2.3.0 > Spark 2.3 SQL API audit > --- > > Key: SPARK-228

[jira] [Resolved] (SPARK-22820) Spark 2.3 SQL API audit

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-22820. - Resolution: Fixed > Spark 2.3 SQL API audit > --- > > Key: SPARK-2282

[jira] [Resolved] (SPARK-23313) Add a migration guide for ORC

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23313. - Resolution: Fixed Fix Version/s: 2.3.0 > Add a migration guide for ORC > -

[jira] [Assigned] (SPARK-23313) Add a migration guide for ORC

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-23313: --- Assignee: Dongjoon Hyun > Add a migration guide for ORC > - > >

[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361575#comment-16361575 ] Joseph K. Bradley commented on SPARK-23154: --- I'd prefer to put it in the subsec

[jira] [Resolved] (SPARK-23378) move setCurrentDatabase from HiveExternalCatalog to HiveClientImpl

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23378. - Resolution: Fixed Assignee: Feng Liu Fix Version/s: 2.4.0 > move setCurrentDatabase from

[jira] [Created] (SPARK-23401) Improve test cases for all supported types and unsupported types

2018-02-12 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-23401: Summary: Improve test cases for all supported types and unsupported types Key: SPARK-23401 URL: https://issues.apache.org/jira/browse/SPARK-23401 Project: Spark

[jira] [Commented] (SPARK-23390) Flaky Test Suite: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361537#comment-16361537 ] Apache Spark commented on SPARK-23390: -- User 'gatorsmile' has created a pull request

[jira] [Created] (SPARK-23400) Add two extra constructors for ScalaUDF

2018-02-12 Thread Xiao Li (JIRA)
Xiao Li created SPARK-23400: --- Summary: Add two extra constructors for ScalaUDF Key: SPARK-23400 URL: https://issues.apache.org/jira/browse/SPARK-23400 Project: Spark Issue Type: Bug Compo

[jira] [Assigned] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23399: Assignee: (was: Apache Spark) > Register a task completion listner first for OrcColumn

[jira] [Assigned] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23399: Assignee: Apache Spark > Register a task completion listner first for OrcColumnarBatchRead

[jira] [Commented] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361508#comment-16361508 ] Apache Spark commented on SPARK-23399: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Updated] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23399: -- Description: This is related to SPARK-23390. Currently, there was an opened file leak for Orc

[jira] [Created] (SPARK-23399) Register a task completion listner first for OrcColumnarBatchReader

2018-02-12 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-23399: - Summary: Register a task completion listner first for OrcColumnarBatchReader Key: SPARK-23399 URL: https://issues.apache.org/jira/browse/SPARK-23399 Project: Spark

[jira] [Assigned] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23394: Assignee: Apache Spark > Storage info's Cached Partitions doesn't consider the replication

[jira] [Assigned] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23394: Assignee: (was: Apache Spark) > Storage info's Cached Partitions doesn't consider the

[jira] [Commented] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361347#comment-16361347 ] Apache Spark commented on SPARK-23394: -- User 'attilapiros' has created a pull reques

[jira] [Assigned] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-23388: --- Assignee: James Thompson > Support for Parquet Binary DecimalType in VectorizedColumnReader > --

[jira] [Resolved] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23388. - Resolution: Fixed Fix Version/s: 2.3.0 > Support for Parquet Binary DecimalType in VectorizedColum

[jira] [Comment Edited] (SPARK-23310) Perf regression introduced by SPARK-21113

2018-02-12 Thread Nicolas Poggi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361071#comment-16361071 ] Nicolas Poggi edited comment on SPARK-23310 at 2/12/18 6:35 PM: ---

[jira] [Created] (SPARK-23398) DataSourceV2 should provide a way to get the source schema

2018-02-12 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-23398: - Summary: DataSourceV2 should provide a way to get the source schema Key: SPARK-23398 URL: https://issues.apache.org/jira/browse/SPARK-23398 Project: Spark Issue Ty

[jira] [Updated] (SPARK-23398) DataSourceV2 should provide a way to get a source's schema.

2018-02-12 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-23398: -- Summary: DataSourceV2 should provide a way to get a source's schema. (was: DataSourceV2 should provide

[jira] [Commented] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361154#comment-16361154 ] Joseph K. Bradley commented on SPARK-23377: --- [~viirya]'s patch currently change

[jira] [Updated] (SPARK-23377) Bucketizer with multiple columns persistence bug

2018-02-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-23377: -- Priority: Critical (was: Major) > Bucketizer with multiple columns persistence bug > -

[jira] [Commented] (SPARK-23310) Perf regression introduced by SPARK-21113

2018-02-12 Thread Nicolas Poggi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361071#comment-16361071 ] Nicolas Poggi commented on SPARK-23310: --- Q72 of TPC-DS is also affected around 30%

[jira] [Resolved] (SPARK-23390) Flaky Test Suite: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7

2018-02-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23390. - Resolution: Fixed Assignee: Wenchen Fan Fix Version/s: 2.3.0 > Flaky Test Suite: FileBase

[jira] [Commented] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360944#comment-16360944 ] Marcelo Vanzin commented on SPARK-20327: bq. I think the point is, without reflec

[jira] [Commented] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360933#comment-16360933 ] Sean Owen commented on SPARK-20327: --- I think the point is, without reflection, using 3.

[jira] [Comment Edited] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Szilard Nemeth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360904#comment-16360904 ] Szilard Nemeth edited comment on SPARK-20327 at 2/12/18 3:43 PM: --

[jira] [Commented] (SPARK-20327) Add CLI support for YARN custom resources, like GPUs

2018-02-12 Thread Szilard Nemeth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360904#comment-16360904 ] Szilard Nemeth commented on SPARK-20327: Hey [~vanzin]! I see what you said abou

[jira] [Resolved] (SPARK-23391) It may lead to overflow for some integer multiplication

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23391. --- Resolution: Fixed Fix Version/s: 2.3.0 2.2.2 Issue resolved by pull request

[jira] [Assigned] (SPARK-23391) It may lead to overflow for some integer multiplication

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-23391: - Assignee: liuxian > It may lead to overflow for some integer multiplication > -

[jira] [Commented] (SPARK-23308) ignoreCorruptFiles should not ignore retryable IOException

2018-02-12 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360809#comment-16360809 ] Steve Loughran commented on SPARK-23308: BTW bq I should get at least ~82k part

[jira] [Comment Edited] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360802#comment-16360802 ] Shahbaz Hussain edited comment on SPARK-23397 at 2/12/18 2:29 PM: -

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360802#comment-16360802 ] Shahbaz Hussain commented on SPARK-23397: - Can we make job creation a

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360798#comment-16360798 ] Sean Owen commented on SPARK-23397: --- That sounds correct. The next batch executes as so

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360793#comment-16360793 ] Shahbaz Hussain commented on SPARK-23397: - Yes, if current Batch Processing time

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: If the event log is big, the historyServer web will be out of memory. My eventl

[jira] [Updated] (SPARK-23392) Add some test case for images feature

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23392: -- Priority: Trivial (was: Major) > Add some test case for images feature > -

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: If the event log is big, the historyServer web will be out of memory. My eventl

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: eventlog.png > Spark HistoryServer will OMM if the event log is big > -

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: if the event log is big, the historyServer web will be out of memory !eventlog.p

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: if the event log is big, the historyServer web will be out of memory !eventlog.p

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: (was: eventlog.png) > Spark HistoryServer will OMM if the event log is big > --

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360789#comment-16360789 ] Sean Owen commented on SPARK-23397: --- This is how it's supposed to work. Batches don't o

[jira] [Resolved] (SPARK-23343) Increase the exception test for the bind port

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23343. --- Resolution: Won't Fix > Increase the exception test for the bind port > -

[jira] [Created] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

2018-02-12 Thread Shahbaz Hussain (JIRA)
Shahbaz Hussain created SPARK-23397: --- Summary: Scheduling delay causes Spark Streaming to miss batches. Key: SPARK-23397 URL: https://issues.apache.org/jira/browse/SPARK-23397 Project: Spark

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: eventlog.png > Spark HistoryServer will OMM if the event log is big > -

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: (was: historyServer.png) > Spark HistoryServer will OMM if the event log is big

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Description: if the event log is  big, the historyServer web will be out of memory   was:if

[jira] [Commented] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360784#comment-16360784 ] Sean Owen commented on SPARK-23396: --- This is far too vague. It seems to overlap with re

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: historyServer.png > Spark HistoryServer will OMM if the event log is big >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: historyServer.png > Spark HistoryServer will OMM if the event log is big >

[jira] [Updated] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23396: -- Attachment: (was: historyServer.png) > Spark HistoryServer will OMM if the event log is big

[jira] [Created] (SPARK-23396) Spark HistoryServer will OMM if the event log is big

2018-02-12 Thread KaiXinXIaoLei (JIRA)
KaiXinXIaoLei created SPARK-23396: - Summary: Spark HistoryServer will OMM if the event log is big Key: SPARK-23396 URL: https://issues.apache.org/jira/browse/SPARK-23396 Project: Spark Issue
