Re: Spark driver main thread hanging after SQL insert

2015-01-02 Thread Patrick Wendell
Hi Alessandro,

Can you create a JIRA for this rather than reporting it on the dev
list? That's where we track issues like this. Thanks!

- Patrick

On Wed, Dec 31, 2014 at 8:48 PM, Alessandro Baretta
 wrote:
> Here's what the console shows:
>
> 15/01/01 01:12:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 58.0,
> whose tasks have all completed, from pool
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Stage 58 (runJob at
> ParquetTableOperations.scala:326) finished in 5493.549 s
> 15/01/01 01:12:29 INFO scheduler.DAGScheduler: Job 41 finished: runJob at
> ParquetTableOperations.scala:326, took 5493.747061 s
>
> It is now 01:40:03, so the driver has been hanging for the last 28 minutes.
> The web UI on the other hand shows that all tasks completed successfully,
> and the output directory has been populated--although the _SUCCESS file is
> missing.
>
> It is worth noting that my code started this job as its own thread. The
> actual code looks like the following snippet, modulo some simplifications.
>
>   def save_to_parquet(allowExisting : Boolean = false) = {
>     val threads = tables.map(table => {
>       val thread = new Thread {
>         override def run {
>           table.insertInto(t.table_name)
>         }
>       }
>       thread.start
>       thread
>     })
>     threads.foreach(_.join)
>   }
>
> As far as I can see the insertInto call never returns. Any idea why?
>
> Alex
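
One way to narrow down a hang like the one described above is to join each
thread with a timeout and print the stack of any thread that is still alive.
The following is only a minimal sketch under stated assumptions: it does not
touch Spark at all, JoinWithTimeout and runAll are hypothetical names, and
each job is a stand-in for a call such as the insertInto in the snippet above.

```scala
// Hypothetical helper, not part of Spark: run each job in its own thread,
// wait a bounded time for all of them, and report the ones still running.
object JoinWithTimeout {
  def runAll(jobs: Seq[(String, () => Unit)], timeoutMs: Long): Seq[String] = {
    val threads = jobs.map { case (name, job) =>
      val t = new Thread(new Runnable { def run(): Unit = job() }, name)
      t.setDaemon(true) // a job that never returns will not keep the JVM alive
      t.start()
      t
    }
    // Share one deadline across all joins so the total wait is bounded.
    val deadline = System.currentTimeMillis() + timeoutMs
    threads.foreach { t =>
      val remaining = deadline - System.currentTimeMillis()
      // join(0) would wait forever, so only join while time is left.
      if (remaining > 0) t.join(remaining)
    }
    // Threads still alive are the stuck ones; print where each is blocked.
    val stuck = threads.filter(_.isAlive)
    stuck.foreach { t =>
      println("Thread " + t.getName + " is still running at:")
      t.getStackTrace.foreach(frame => println("    at " + frame))
    }
    stuck.map(_.getName)
  }
}
```

The returned names identify which inserts did not come back, and the printed
frames show where each one is blocked, which is the kind of detail that is
useful to attach to a JIRA report.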




Re: Spark driver main thread hanging after SQL insert

2015-01-02 Thread Alessandro Baretta
Patrick,

Sure. I was interested in knowing whether anyone had experienced a similar
issue and whether there was any known workaround. Anyway, I will report it on
JIRA.

Alex


Re: Highly interested in contributing to spark

2015-01-02 Thread Manoj Kumar
Hello,

Thanks for your quick comments and encouragement.

I tried building Spark from source using build/sbt assembly.

However, it fails while downloading
https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar
with SSL certificate errors. I understand that this is due to the problem
described here (
http://apache-spark-user-list.1001560.n3.nabble.com/sbt-sbt-assembly-fails-with-ssl-certificate-error-td3046.html
),

but I'm not sure why it still uses https when this PR
https://github.com/apache/spark/pull/209 has fixed it. Any help would be
greatly appreciated.






On Fri, Jan 2, 2015 at 11:51 AM, Nick Pentreath 
wrote:

> Oh, actually I confused this with another project; yours was not LSH, sorry!
>
> On Fri, Jan 2, 2015 at 8:19 AM, Nick Pentreath 
> wrote:
>
>> I'm sure Spark will sign up for GSoC again this year - and I'd be
>> surprised if there was not some interest now for projects :)
>>
>> If I have the time at that point in the year I'd be happy to mentor a
>> project in MLlib but will have to see how my schedule is at that point!
>>
>> Manoj, perhaps some of the locality sensitive hashing stuff you did for
>> scikit-learn could find its way to Spark or spark-projects.
>>
>> On Fri, Jan 2, 2015 at 6:28 AM, Reynold Xin  wrote:
>>
>>> Hi Manoj,
>>>
>>> Thanks for the email.
>>>
>>> Yes - you should start with the starter task before attempting larger
>>> ones. Last year I signed up as a mentor for GSoC, but no student signed
>>> up. I don't think I'd have time to be a mentor this year, but others
>>> might.
>>>
>>>
>>> On Thu, Jan 1, 2015 at 4:54 PM, Manoj Kumar <manojkumarsivaraj...@gmail.com> wrote:
>>>
>>> > Hello,
>>> >
>>> > I am Manoj (https://github.com/MechCoder), an undergraduate student
>>> > highly interested in Machine Learning. I have contributed to SymPy and
>>> > scikit-learn as part of Google Summer of Code projects and my
>>> > bachelor's thesis. I have a few quick (non-technical) questions before
>>> > I dive into the issue tracker.
>>> >
>>> > Are the issues marked trivial easy-to-fix ones that I could try before
>>> > attempting slightly more ambitious ones? Also, I would like to know if
>>> > Apache Spark takes part in Google Summer of Code projects under the
>>> > Apache Software Foundation. It would be really great if it does!
>>> >
>>> > Looking forward!
>>> >
>>> > --
>>> > Godspeed,
>>> > Manoj Kumar,
>>> > Mech Undergrad
>>> > http://manojbits.wordpress.com
>>> >
>>>
>>
>>
>


-- 
Godspeed,
Manoj Kumar,
Intern, Telecom ParisTech
Mech Undergrad
http://manojbits.wordpress.com


Re: Highly interested in contributing to spark

2015-01-02 Thread Ganelin, Ilya
I might be seeing a similar error - I'm trying to build behind a proxy. I
was able to build until recently, but now when I run mvn clean package, I
get the following errors:

I would love to know what's going on here.

Exception in thread "pool-1-thread-1" Exception in thread "main"
java.lang.ExceptionInInitializerError
java.lang.ExceptionInInitializerError
at java.lang.J9VMInternals.ensureError(J9VMInternals.java:186)
at java.lang.J9VMInternals.ensureError(J9VMInternals.java:186)
at java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:175)
at java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:175)
at javax.crypto.KeyAgreement.getInstance(Unknown Source)
at com.ibm.jsse2.lb.h(lb.java:129)
at javax.crypto.KeyAgreement.getInstance(Unknown Source)
at com.ibm.jsse2.lb.h(lb.java:129)
at com.ibm.jsse2.lb.a(lb.java:165)
at com.ibm.jsse2.l$c_.a(l$c_.java:18)
at com.ibm.jsse2.lb.a(lb.java:165)
at com.ibm.jsse2.l$c_.a(l$c_.java:18)
at com.ibm.jsse2.l.a(l.java:172)
at com.ibm.jsse2.m.a(m.java:38)
at com.ibm.jsse2.l.a(l.java:172)
at com.ibm.jsse2.m.a(m.java:38)
at com.ibm.jsse2.m.h(m.java:21)
at com.ibm.jsse2.m.h(m.java:21)
at com.ibm.jsse2.qc.a(qc.java:110)
at com.ibm.jsse2.qc.(qc.java:822)
at com.ibm.jsse2.qc.a(qc.java:110)
at com.ibm.jsse2.qc.(qc.java:822)
at com.ibm.jsse2.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:10)
at com.ibm.jsse2.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:10)
at org.apache.maven.wagon.providers.http.httpclient.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:274)
at org.apache.maven.wagon.providers.http.httpclient.impl.conn.HttpClientConnectionOperator.upgrade(HttpClientConnectionOperator.java:167)
at org.apache.maven.wagon.providers.http.httpclient.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:274)
at org.apache.maven.wagon.providers.http.httpclient.impl.conn.HttpClientConnectionOperator.upgrade(HttpClientConnectionOperator.java:167)
at org.apache.maven.wagon.providers.http.httpclient.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:329)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:392)
at org.apache.maven.wagon.providers.http.httpclient.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:329)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:392)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.MainClientExec.execute(MainClientExec.java:218)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.ProtocolExec.execute(ProtocolExec.java:194)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.MainClientExec.execute(MainClientExec.java:218)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.ProtocolExec.execute(ProtocolExec.java:194)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.RetryExec.execute(RetryExec.java:85)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.RetryExec.execute(RetryExec.java:85)
at org.apache.maven.wagon.providers.http.httpclient.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at org.apache.maven.wagon.providers.http.httpclient.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.maven.wagon.providers.http.AbstractHttpClientWagon.execute(AbstractHttpClientWagon.java:756)
at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at org.apache.maven.wagon.providers.http.AbstractHttpClientWagon.fillInputData(AbstractHttpClientWagon.java:854)
at org.apache.maven.wagon.providers.http.httpclient.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
at org.apache.maven.wagon.providers.http.httpclient.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.maven.wagon.providers.http.AbstractHttpClientWagon.execute(AbstractHttpClientWagon.java:756)
at org.apache.maven.wagon.StreamWagon.getInputStream(StreamWagon.java:116)
at org.apache.maven.wagon.StreamWagon.getIfNewer(StreamWagon.java:88)
at org.apache.maven.wagon.StreamWagon.get(StreamWagon.java:61)
at 
org.apache.maven.wagon.providers.http.AbstractHttpClientWagon.fillInputData