Hi,
I am getting the following warning when I run the PySpark job:
My code is:
mat = RowMatrix(tf_rdd_vec.cache()) # RDD is cached
svd = mat.computeSVD(num_topics, computeU=False)
I am using an Ubuntu 16.04 EC2 instance, and I have installed the following
libraries on my system:
sudo apt insta
Hi Users,
Is there any way to avoid creation of .crc files when writing an RDD with
saveAsTextFile method?
My use case: I have mounted S3 on the local file system using S3FS and am
saving an RDD to the mount point. Looking at S3, I found one .crc file for
each part file and even for the _SUCCESS file.
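A possible workaround (a sketch, not verified against this setup): the .crc files are written by Hadoop's checksumming local filesystem wrapper; configuring the `file://` scheme to use `RawLocalFileSystem`, which skips checksum files, should avoid them. In core-site.xml (or the equivalent `hadoopConfiguration()` call):

```xml
<!-- Sketch: use the non-checksumming local filesystem for file:// URIs -->
<property>
  <name>fs.file.impl</name>
  <value>org.apache.hadoop.fs.RawLocalFileSystem</value>
</property>
```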
Do you get warning messages such as:
`Failed to load implementation from:
com.github.fommil.netlib.NativeSystemBLAS`
`Failed to load implementation from:
com.github.fommil.netlib.NativeRefBLAS` ?
These two errors are thrown in `com.github.fommil.netlib.BLAS`, but it
catches the original exception and falls back to the pure-Java (F2J)
implementation, so they are warnings rather than failures.
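The warnings above usually mean no native BLAS library is installed. A sketch of a possible fix (assuming Ubuntu, as in the original question; the package names and build profile are assumptions about this setup, not verified):

```shell
# Assumption: Ubuntu 16.04. Install native BLAS/LAPACK implementations
# so com.github.fommil.netlib can find a system library to load.
sudo apt-get install libopenblas-dev liblapack-dev

# Note: Spark must also include the netlib-lgpl classes (e.g. a build
# with -Pnetlib-lgpl) for native loading to even be attempted.
```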
Hi All,
I am wondering if Spark supports Dataset<List<Map<String, Object>>>?
When I do the following, it says no map function is available:
Dataset<List<Map<String, Object>>> resultDs = ds.map(lambda,
Encoders.bean(List.class));
Thanks!
it supports Dataset<List<X>> where X must be a supported type
also. Object is not a supported type.
On Mon, Oct 9, 2017 at 7:36 AM, kant kodali wrote:
> Hi All,
>
> I am wondering if Spark supports Dataset<List<Map<String, Object>>>?
>
> when I do the following it says no map function available?
>
> Dataset<List<Map<String, Object>>> resultDs = ds.ma
### Issue description
We have an issue with data consistency when storing data in Elasticsearch
using Spark and the elasticsearch-spark connector. The job finishes
successfully, but when we compare the original data (stored in S3) with the
data stored in ES, some documents are not present in Elasticsearch.
Have you raised it as an issue on the ES connector GitHub? In my past
experience (with the Hadoop connector and Pig), they respond pretty quickly.
On Tue, Oct 10, 2017 at 12:36 AM, sixers wrote:
> ### Issue description
>
> We have an issue with data consistency when storing data in Elasticsearch
> using
Hi all!
I would love to use Spark with a somewhat more modern logging framework
than Log4j 1.2. I have Logback in mind, mostly because it integrates well
with central logging solutions such as the ELK stack. I've read up a bit on
getting Spark 2.0 (that's what I'm using currently) to work with any
Hi Koert,
Thanks! If I have this Dataset<List<Map<String, Object>>>, what would be the
encoder? Is it Encoders.kryo(Seq.class)?
Also shouldn't List be supported? Should I create a ticket for this?
On Mon, Oct 9, 2017 at 6:10 AM, Koert Kuipers wrote:
> it supports Dataset<List<X>> where X must be a supported type
> also. O
if you are willing to use the kryo encoder you can do your original
Dataset<List<Map<String, Object>>> I think
For example, in Scala I create here an intermediate Dataset[Any]:
scala> Seq(1,2,3).toDS.map(x => if (x % 2 == 0) x else
x.toString)(org.apache.spark.sql.Encoders.kryo[Any]).map{ (x: Any) => x
match { case i:
I tried the following:
dataset.map(new MapFunction<String, List<Map<String, Object>>>() {
    @Override
    public List<Map<String, Object>> call(String input) throws Exception {
        List<Map<String, Object>> temp = new ArrayList<>();
        temp.add(new HashMap<String, Object>());
        return temp;
    }
}, Encoders.kryo(List.class));
This doesn't even compile.
error: no s
https://issues.apache.org/jira/browse/SPARK-8
On Sun, Oct 8, 2017 at 11:58 AM, kant kodali wrote:
> I have the following so far
>
> private StructType getSchema() {
> return new StructType()
> .add("name", StringType)
> .add("address", StringType)
> .a
Any changes in the Java code (to be specific, the generated bytecode) of
the functions you pass to Spark (i.e., map functions, reduce functions, as
well as their closure dependencies) count as an "application code change",
and will break recovery from checkpoints.
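The same failure mode can be sketched in plain Python with pickle (an analogy only; Spark checkpoints use Java serialization, but the underlying principle that a serialized function is a reference to code, not a copy of it, is the same):

```python
import pickle

# A simple "closure-carrying" class, standing in for a map function
# passed to Spark. (Plain-Python analogy, not Spark's Java serialization.)
class MapFn:
    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n

# "Checkpoint" an instance: the pickle stores a reference to MapFn,
# not its bytecode.
blob = pickle.dumps(MapFn(3))

# Simulate an "application code change": the class the checkpoint refers
# to no longer exists under that name.
del MapFn

try:
    pickle.loads(blob)
    recovered = True
except AttributeError:
    recovered = False  # recovery fails once the referenced code is gone
```

Java serialization is stricter still: even a compatible-looking class can be rejected at deserialization time if its serial version no longer matches.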
On Sat, Oct 7, 2017 at 11:53 AM, Joh
Hi,
I am trying to deploy a Spark app in a Kubernetes Cluster. The cluster consists
of 2 machines - 1 master and 1 slave, each of them with the following config:
RHEL 7.2
Docker 17.03.1
K8S 1.7.
I am following the steps provided in
https://apache-spark-on-k8s.github.io/userdocs/running-on-kuber
Hi,
I'm new to Spark and big data. We are doing a POC and building our
warehouse application using Spark. Can anyone share guidance on naming
conventions for HDFS paths, table names, UDFs, and database names? Any
sample architecture diagram would also be helpful.
-Mahens