Spark SQL question

2023-01-27 Thread Kohki Nishio
'm no expert in SQL, but feel like it's a strange behavior... does anybody have a good explanation for it ? Thanks -- Kohki Nishio

Re: GC issue - Ext Root Scanning

2021-11-16 Thread Kohki Nishio
ly Spark related, but maybe some ideas will > surface. > Of course, reducing memory allocation in your app if possible always helps. > > > On Mon, Nov 15, 2021 at 10:18 AM Kohki Nishio wrote: > >> it's a VM, but it has 16 cores and 32 processors. >> >

Re: GC issue - Ext Root Scanning

2021-11-15 Thread Kohki Nishio

GC issue - Ext Root Scanning

2021-11-14 Thread Kohki Nishio
Cards: 1.1 ms] [Humongous Register: 0.7 ms] [Humongous Reclaim: 0.3 ms] [Free CSet: 0.7 ms] [Eden: 8096.0M(8096.0M)->0.0B(8096.0M) Survivors: 96.0M->96.0M Heap: 23.3G(160.0G)->15.4G(160.0G)] [Times: user=23.46 sys=1.03, real=5.72 secs] -- Kohki Nishio

Re: Possibly a memory leak issue in Spark

2021-09-22 Thread Kohki Nishio
ls how much > metadata remains in the driver post task/stage/job completion. > > On Sep 22, 2021, at 12:42 PM, Kohki Nishio wrote: > > I believe I have enough information, raised this > > https://issues.apache.org/jira/browse/SPARK-36827 > > thanks > -Kohki > > >

Re: Lock issue with SQLConf.getConf

2021-09-11 Thread Kohki Nishio
Awesome, thanks! On Sat, Sep 11, 2021 at 6:34 AM Sean Owen wrote: > Looks like this was improved in > https://issues.apache.org/jira/browse/SPARK-35701 for 3.2.0 > > On Fri, Sep 10, 2021 at 10:21 PM Kohki Nishio wrote: > >> Hello, >> I'm running spark in local

Lock issue with SQLConf.getConf

2021-09-10 Thread Kohki Nishio
izedMap.get(Collections.java:2586) - waiting to lock <0x7fc901c7d9f8> (a java.util.Collections$SynchronizedMap) at org.apache.spark.sql.internal.SQLConf.getConf(SQLConf.scala:3750) at org.apache.spark.sql.internal.SQLConf.planChangeLogLevel(SQLConf.scala:3160) at org.apache.spark.sql.catalyst.rules.PlanChangeLogger.(RuleExecutor.scala:49) --- -- Kohki Nishio

Re: JavaSerializerInstance is slow

2021-09-07 Thread Kohki Nishio
rg/jira/browse/SPARK-5300). >> >> I think there would definitely be interest in having a reliable and >> efficient local mode in Spark but it's a pretty different use case than >> what Spark originally focused on. >> >> Antonin >> >> On 03/09/2021 05:

JavaSerializerInstance is slow

2021-09-02 Thread Kohki Nishio
I'm seeing many threads deserializing tasks. I understand that, since lambdas are involved, we can't use Kryo for this. However, I'm running in local mode, so this serialization isn't really necessary, is it? Is there any trick I can apply to get rid of this thread contention? I'm seei

Re: Ordering pushdown for Spark Datasources

2021-04-05 Thread Kohki Nishio

Ordering pushdown for Spark Datasources

2021-04-04 Thread Kohki Nishio
on't see any current activity for ordering pushdown. Thanks -- Kohki Nishio

DataSourceV2 with ordering pushdown

2020-12-22 Thread Kohki Nishio
is ordered ? Is working with a physical plan the only way to achieve this ? Thanks -- Kohki Nishio

Re: ClassLoader problem - java.io.InvalidClassException: scala.Option; local class incompatible

2017-02-20 Thread Kohki Nishio
I created a JIRA. I believe SBT is a valid use case, but it was resolved as Not a Problem: https://issues.apache.org/jira/browse/SPARK-19675 On Mon, Feb 20, 2017 at 10:36 PM, Kohki Nishio wrote: > Hello, I'm writing a Play Framework application which does Spark, however > I

ClassLoader problem - java.io.InvalidClassException: scala.Option; local class incompatible

2017-02-20 Thread Kohki Nishio
2.11, that's why I'm getting this. I believe ExecutorClassLoader needs to override the loadClass method as well. Can anyone comment on this? It's picking up the Option class from the system classloader. Thanks -- Kohki Nishio

Re: Parquet partitioning for unique identifier

2015-09-04 Thread Kohki Nishio
mostly > irrelevant. > > Cheng > > > On 9/4/15 1:24 AM, Kohki Nishio wrote: > > let's say I have data like this > > ID | Some1 | Some2 | Some3 | > A1 | kdsfajfsa | dsafsdafa | fdsfafa | > A2 | dfsfafasd | 23jfdsjkj | 980dfs |

Re: Parquet partitioning for unique identifier

2015-09-03 Thread Kohki Nishio
.com> wrote: > >> Did you specify partitioning column while saving data.. >> On Sep 3, 2015 5:41 AM, "Kohki Nishio" wrote: >> >>> Hello experts, >>> >>> I have a huge json file (> 40G) and trying to use Parquet as a file >>> for

Parquet partitioning for unique identifier

2015-09-02 Thread Kohki Nishio
o with it. It would be ideal if I could provide a partitioner based on the unique identifier value, for example by computing its hash. One option would be to produce a hash value and add it as a separate column, but that doesn't sound right to me. Are there any other ways I can try? Regards, -- Kohki Nishio
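
A minimal sketch of the hash-column option mentioned above, written in plain Python rather than Spark. It derives a stable bucket number from the unique identifier so that rows spread over a fixed number of partitions. The bucket count, the `bucket_for` helper, and the `bucket` column name are assumptions made for illustration; the sample IDs are taken from a later message in this thread.

```python
import hashlib

# Hypothetical bucket count; in practice it would be tuned to the data size.
NUM_BUCKETS = 16


def bucket_for(identifier: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a unique identifier to a stable bucket in [0, num_buckets)."""
    # A cryptographic digest is used only because it is stable across runs
    # and processes, unlike Python's built-in hash() for strings.
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Sample rows shaped like the data described in the thread.
rows = [
    {"ID": "A1", "Some1": "kdsfajfsa"},
    {"ID": "A2", "Some1": "dfsfafasd"},
]

# Add the derived bucket as a separate column on each row.
rows_with_bucket = [{**row, "bucket": bucket_for(row["ID"])} for row in rows]

for row in rows_with_bucket:
    print(row["ID"], row["bucket"])
```

In Spark, the same idea would presumably mean adding the bucket as a column and passing that column to `partitionBy` when writing Parquet, so the number of output directories is bounded by the bucket count rather than by the number of distinct identifiers.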

FAILED_TO_UNCOMPRESS error from Snappy

2015-08-20 Thread Kohki Nishio
l.Try$.apply(Try.scala:161) at scala.util.Success.map(Try.scala:206) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:51) ... 33 more --