On Fri, Jul 17, 2015 at 6:58 PM, Shivaram Venkataraman
wrote:
> I am not sure why the ASF JIRA can only be used to track one set of
> artifacts that are packaged and released together. I agree that marking a
> fix version as 1.5 for a change in another repo doesn't make a lot of sense,
> but we co
Hi Shivaram,
Thank you for the explanation. Is there a direct way to check the length of the
lineage, i.e. whether the computation is repeated?
Best regards, Alexander
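(For reference: RDD.toDebugString prints an RDD's lineage directly, which is a
quick way to see how long the chain of recomputed stages is. A minimal sketch,
with illustrative names:)

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("lineage-check").setMaster("local[*]"))
var rdd = sc.parallelize(1 to 1000)
for (_ <- 1 to 5) rdd = rdd.map(_ + 1)   // each map adds one level to the lineage

// Prints the chain of parent RDDs; a long chain means the whole computation
// is replayed whenever the RDD is not cached or checkpointed.
println(rdd.toDebugString)
println(s"Direct parents: ${rdd.dependencies.size}")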
From: Shivaram Venkataraman [mailto:shiva...@eecs.berkeley.edu]
Sent: Friday, July 17, 2015 10:10 AM
To: Ulanov, Alexander
Cc: shiv
Some replies inline
On Wed, Jul 15, 2015 at 1:08 AM, Sean Owen wrote:
> The code can continue to be a good reference implementation, no matter
> where it lives. In fact, it can be a better more complete one, and
> easier to update.
>
> I agree that ec2/ needs to retain some kind of pointer to th
Hi Alexander,
Feel free to submit an "improvement" JIRA.
Best,
Burak
On Thu, Jul 16, 2015 at 4:20 PM, Ulanov, Alexander
wrote:
> Hi Burak,
>
>
>
> If I change the code as you suggested, then (given that blockSize is 1) it
> fails with:
>
> “org.apache.spark.SparkException: The MatrixBlock
You can also use checkpoint to truncate the lineage; the data can then be
persisted to HDFS. Fundamentally, the state of the RDD needs to be saved to
memory or disk if you don't want to repeat the computation.
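For example, a minimal sketch of checkpointing (the checkpoint directory and
names are illustrative; sc is the existing SparkContext):

sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")   // must be reliable storage such as HDFS

var rdd = sc.parallelize(1 to 1000)
for (_ <- 1 to 50) rdd = rdd.map(_ + 1)   // a long lineage builds up here

rdd.cache()        // keep the data in memory so the checkpoint job does not recompute it
rdd.checkpoint()   // marks the RDD; the actual save happens on the next action
rdd.count()        // triggers the job, writes the checkpoint, and truncates the lineage

println(rdd.toDebugString)   // now rooted at a checkpoint RDD instead of the long map chain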
Thanks
Shivaram
On Thu, Jul 16, 2015 at 4:59 PM, Ulanov, Alexander
wrote:
> Dear Spark
You are right. There are some odd things about this connector. Earlier I got an
OOM exception with this connector just because of a bug in the connector that
transferred only 64 bytes before closing the connection, and now this one
Strangely, I copied the data into another data frame and it worked
I think that might be a connector issue. You say you are using Spark 1.4;
are you also using the 1.4 version of the Spark-Cassandra connector? They do
have some bugs around this, e.g.
https://datastax-oss.atlassian.net/browse/SPARKC-195. Also, I see that you
import org.apache.spark.sql.cassandra.Cassand
The "It is not a test" failed test message means that something went wrong
in a suite-wide setup or teardown method. This could be some sort of race
or flakiness. If this problem persists, we should file a JIRA and label it
with "flaky-test" so that we can find it later.
On Thu, Jul 16, 2015 at
We've run into other problems caused by our old Kryo versions. I agree that
the Chill dependency is one of the main blockers to upgrading Kryo, but I
don't think that it's insurmountable: if necessary, we could just publish
our own forked version of Chill under our own namespace, similar to what we
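For illustration, publishing a fork under a different namespace in sbt could
look roughly like this (all coordinates below are hypothetical, not actual
published artifacts):

// build.sbt of a hypothetical forked Chill, republished under a Spark-owned groupId
organization := "org.spark-project"
name         := "chill"
version      := "0.5.0-spark1"

// Spark's build would then depend on the fork instead of the upstream artifact:
// libraryDependencies += "org.spark-project" %% "chill" % "0.5.0-spark1"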
Spark version 1.4
import com.datastax.spark.connector._
import org.apache.spark._
import org.apache.spark.sql.cassandra.CassandraSQLContext
import org.apache.spark.SparkConf
//import com.microsoft.sqlserver.jdbc.SQLServerDriver
import java.sql.Connection
import java.sql.DriverManager
import java.
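For context, a typical way the CassandraSQLContext from these imports is used
with the 1.4 connector looks roughly like this (host, keyspace, and table names
are illustrative):

val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext("local[*]", "cassandra-test", conf)

// Run SQL directly against Cassandra tables exposed by the connector
val cc = new CassandraSQLContext(sc)
val df = cc.sql("SELECT * FROM test_keyspace.test_table")
df.show()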
https://github.com/apache/spark/pull/7421
On Fri, Jul 17, 2015 at 3:26 AM, Xiaoyu Ma
wrote:
> Hi guys,
> I saw that when a Hive Table object is created, it tries to load all existing
> partitions.
>
>
> @transient val hiveQlPartitions: Seq[Partition] = table.getAllPartitions.map { p =>
>   val tPartitio
Hi guys,
I saw that when a Hive Table object is created, it tries to load all existing partitions.
@transient val hiveQlPartitions: Seq[Partition] = table.getAllPartitions.map { p =>
  val tPartition = new org.apache.hadoop.hive.metastore.api.Partition
  tPartition.setDbName(databaseName)
  tPartition.setT