Let us assume that you want to build an integration test setup where
you run all participating components in Docker.
You create a docker-compose.yml with four Docker images, something like this:
# Start docker-compose.yml
version: '2'
services:
  myapp:
    build: myapp_dir
    links:
      - ka
Hi,
Any hint about getting the location of a particular RDD partition on the
cluster? Or a workaround?
The parallelize method partitions an RDD into splits as specified, or as per
the default parallelism configuration. Does parallelize actually distribute
the partitions across the cluster?
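For reference, a minimal sketch (Scala, spark-shell, assuming Spark 1.6): parallelize only defines the splits; nothing is shipped to the executors until an action runs. One workaround for seeing where partitions actually execute is to record the host name inside each task. The partition count of 8 below is illustrative.

import java.net.InetAddress

val rdd = sc.parallelize(1 to 1000, 8)       // request 8 partitions
println(rdd.partitions.length)               // 8; still only a driver-side description

// Record (partitionIndex, host, elementCount) from inside the tasks.
val placement = rdd.mapPartitionsWithIndex { (idx, it) =>
  Iterator((idx, InetAddress.getLocalHost.getHostName, it.size))
}.collect()
placement.foreach(println)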
Hi All, IS NOT NULL is not working in programmatic SQL. Check below for the
input, output, and code.
Input
10,IN
11,PK
12,US
13,UK
14,US
15,IN
16,
17,AS
18,AS
19,IR
20,As
val cntdat = sc.textFile("/user/poc_hortonworks/radha/gsd/sample.txt")
case class CNT(id: Int, code: String)
val cntdf = cntd
It doesn't look like you have a NULL field; you have a string-valued
field containing an empty string.
On Sun, Jul 10, 2016 at 3:19 PM, Radha krishna wrote:
> Hi All, IS NOT NULL is not working in programmatic SQL. Check below for the
> input, output, and code.
>
> Input
>
> 10,IN
> 11,PK
> 12,US
> 13,UK
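For reference, a hedged sketch of the suggested fix (Scala, spark-shell, Spark 1.6), reusing the thread's names cntdat and CNT; the parsing with split(",", -1) and the table name "cnt" are assumptions. Since the missing codes are empty strings rather than SQL NULLs, filter on the empty string, or turn empty strings into real NULLs first:

import sqlContext.implicits._

val cntdf = cntdat
  .map(_.split(",", -1))                        // -1 keeps the trailing empty field of lines like "16,"
  .map(p => CNT(p(0).trim.toInt, p(1).trim))
  .toDF()
cntdf.registerTempTable("cnt")

// Filter the empty string directly ...
sqlContext.sql("SELECT id, code FROM cnt WHERE code <> ''").show()
// ... or map empty strings to NULL so IS NOT NULL behaves as expected.
sqlContext.sql("SELECT id, code FROM cnt " +
  "WHERE (CASE WHEN code = '' THEN NULL ELSE code END) IS NOT NULL").show()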
OK, thank you. How can I achieve the requirement?
On Sun, Jul 10, 2016 at 8:44 PM, Sean Owen wrote:
> It doesn't look like you have a NULL field; you have a string-valued
> field containing an empty string.
>
> On Sun, Jul 10, 2016 at 3:19 PM, Radha krishna wrote:
> > Hi All, IS NOT NULL is not working in
Hi
With sqlContext we can register a UDF like
this: sqlContext.udf.register("sample_fn", sample_fn _)
But this UDF is limited to that particular sqlContext only. I wish to make
the registration persistent, so that I can access the same UDF in any
subsequent SQLContext.
Or is there any other way to
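For reference, a hedged sketch of one common workaround (Spark 1.6): plain Scala UDFs registered with sqlContext.udf.register are scoped to that context, so keep the registrations in a helper and call it on every SQLContext you create. The UDF body below is a placeholder, not the thread's actual sample_fn.

import org.apache.spark.sql.SQLContext

object Udfs {
  def sample_fn(s: String): Int = s.length      // placeholder implementation

  def registerAll(sqlContext: SQLContext): Unit = {
    sqlContext.udf.register("sample_fn", sample_fn _)
    // add further UDF registrations here so every context gets the same set
  }
}

// Usage whenever a new context is created:
// val sqlContext = new SQLContext(sc)
// Udfs.registerAll(sqlContext)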
I want to apply a null comparison to a column in sqlContext.sql; is there any
way to achieve this?
On Jul 10, 2016 8:55 PM, "Radha krishna" wrote:
> Ok thank you, how to achieve the requirement.
>
> On Sun, Jul 10, 2016 at 8:44 PM, Sean Owen wrote:
>
>> It doesn't look like you have a NULL field,
I can't seem to find a link to the Spark KEYS file. I am trying to
validate the sigs on the 1.6.2 release artifacts and I need to
import 0x7C6C105FFC8ED089. Is there a KEYS file available for
download somewhere? Apologies if I am just missing an obvious link.
Phil
---
Hi,
So far I have been using Spark "embedded" in my app. Now, I'd like to run it on
a dedicated server.
This is how far I am:
- fresh Ubuntu 16, server name is mocha / IP 10.0.100.120, installed Scala
2.10, installed Spark 1.6.2, recompiled
- Pi test works
- UI on port 8080 works
Log says:
Spark Comm
Not sure where you see " 0x7C6C105FFC8ED089". I think the release is signed
with the key https://people.apache.org/keys/committer/pwendell.asc .
I think this tutorial can be helpful:
http://www.apache.org/info/verification.html
On Mon, Jul 11, 2016 at 12:57 AM, Phil Steitz wrote:
> I can't seem
I tested that:
I set:
_JAVA_OPTIONS=-Djava.net.preferIPv4Stack=true
SPARK_LOCAL_IP=10.0.100.120
I still have the warning in the log:
16/07/10 14:10:13 WARN Utils: Your hostname, micha resolves to a loopback
address: 127.0.1.1; using 10.0.100.120 instead (on interface eno1)
16/07/10 14:10:13 WAR
Hi,
I know I am asking again, but I tried running the same thing on a Mac as well,
since some answers on the internet suggested it could be an issue with the
Windows environment, but still nothing works.
Can anyone at least suggest whether it's a bug in Spark or something
else?
Would be really g
On 7/10/16 10:57 AM, Shuai Lin wrote:
> Not sure where you see " 0x7C6C105FFC8ED089". I
That's the key ID for the key below.
> think the release is signed with the
> key https://people.apache.org/keys/committer/pwendell.asc .
Thanks! That key matches. The project should publish a KEYS file
[1]
Hi,
One of the solutions is to use `spark-csv` (see:
https://github.com/databricks/spark-csv#features).
To load NULLs, you can use the `nullValue` option there.
// maropu
On Mon, Jul 11, 2016 at 1:14 AM, Radha krishna wrote:
> I want to apply null comparison to a column in sqlcontext.sql, is there
> any way to
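For reference, a hedged sketch of that suggestion (Spark 1.6 with the Databricks spark-csv package; exact option support depends on the spark-csv version). The nullValue option turns the chosen string into a real SQL NULL, after which IS NOT NULL works. The schema and path reuse the thread's example.

import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = true),
  StructField("code", StringType, nullable = true)))

val cntdf = sqlContext.read
  .format("com.databricks.spark.csv")
  .schema(schema)
  .option("nullValue", "")                      // treat empty strings as NULL
  .load("/user/poc_hortonworks/radha/gsd/sample.txt")

cntdf.registerTempTable("cnt")
sqlContext.sql("SELECT id, code FROM cnt WHERE code IS NOT NULL").show()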
It appears I had issues in my /etc/hosts... it seems OK now.
> On Jul 10, 2016, at 2:13 PM, Jean Georges Perrin wrote:
>
> I tested that:
>
> I set:
>
> _JAVA_OPTIONS=-Djava.net.preferIPv4Stack=true
> SPARK_LOCAL_IP=10.0.100.120
> I still have the warning in the log:
>
> 16/07/10 14:10:13
I have my dev environment on my Mac. I have a dev Spark server on a freshly
installed physical Ubuntu box.
I had some connection issues, but it is now all fine.
In my code, running on the Mac, I have:
SparkConf conf = new
SparkConf().setAppName("myapp").setMaster("spark://10.0
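For reference, an equivalent Scala sketch of that (truncated) Java snippet; the IP is the one mentioned earlier in the thread, and port 7077 (the standalone master's default) is an assumption here.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("myapp")
  .setMaster("spark://10.0.100.120:7077")       // assumed default standalone port
val sc = new SparkContext(conf)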
Hi everybody,
I installed Spark 1.6.1. I have two parquet files, but when I need to show
records using unionAll, Spark crashes and I don't understand what happens.
When I use show() on only one parquet file, it works correctly.
Code with the fault:
path = '/data/train_parquet/'
train_df = sqlContext.r
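For reference, a hedged Scala sketch of the pattern being described (the thread's own snippet is Python and truncated); the first path is from the thread, the second is a hypothetical placeholder. unionAll in Spark 1.6 requires both DataFrames to have the same schema: same column names, order, and types.

val trainDf = sqlContext.read.parquet("/data/train_parquet/")
val otherDf = sqlContext.read.parquet("/data/other_parquet/")   // hypothetical second file
val allDf = trainDf.unionAll(otherDf)
allDf.show()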
Hi,
What's the schema of the parquet files?
Also, could you show us the stack trace when the error happens?
// maropu
On Mon, Jul 11, 2016 at 11:42 AM, Javier Rey wrote:
> Hi everybody,
>
> I installed Spark 1.6.1, I have two parquet files, but when I need show
> registers using unionAll, Spark cra
Hi everybody,
We are using Spark to query big data, and currently we're using Zeppelin to
provide a UI for technical users.
Now we also need to provide a UI for business users, so we use Oracle BI tools
and set up a Spark Thrift Server (STS) for it.
When I run both, Zeppelin and STS throw an error:
I
Is this terminating the execution, or does the Spark application still run
after this error?
One thing is for sure: it is looking for a local file on the driver (i.e. your Mac) at
location: file:/Users/jgp/Documents/Data/restaurants-data.json
On Mon, Jul 11, 2016 at 12:33 PM, Jean Georges Perrin wrote:
>
> I have m
Good for the file :)
No, it goes on... as if it were waiting for something.
jg
> On Jul 10, 2016, at 22:55, ayan guha wrote:
>
> Is this terminating the execution, or does the Spark application still run after
> this error?
>
> One thing is for sure: it is looking for a local file on the driver (i.e. your Mac)
Hi
Can you try using the JDBC interpreter with STS? We have been using Zeppelin+STS
on YARN for a few months now without much issue.
On Mon, Jul 11, 2016 at 12:48 PM, Chanh Le wrote:
> Hi everybody,
> We are using Spark to query big data and currently we’re using Zeppelin to
> provide a UI for technical us
Yes, it is expected to move on. If it looks like it is waiting for something,
my first instinct would be to check network connectivity, e.g. your
cluster must have access back to your Mac to read the file (it is probably
waiting to time out).
On Mon, Jul 11, 2016 at 12:59 PM, Jean Georges Perrin wr
Hi Ayan,
That is a brilliant idea. Thank you very much. I will try it this way.
Regards,
Chanh
> On Jul 11, 2016, at 10:01 AM, ayan guha wrote:
>
> Hi
>
> Can you try using the JDBC interpreter with STS? We have been using Zeppelin+STS
> on YARN for a few months now without much issue.
>
> On Mon, Jul 11,
Hi,
ISTM multiple SparkContexts are not recommended in Spark.
See: https://issues.apache.org/jira/browse/SPARK-2243
// maropu
On Mon, Jul 11, 2016 at 12:01 PM, ayan guha wrote:
> Hi
>
> Can you try using the JDBC interpreter with STS? We have been using Zeppelin+STS
> on YARN for a few months now without
The log explicitly says "java.lang.OutOfMemoryError: Java heap space", so
perhaps you need to allocate more JVM memory for Spark?
// maropu
On Mon, Jul 11, 2016 at 11:59 AM, Javier Rey wrote:
> Also the problem appears when I used clause: unionAll
>
> 2016-07-10 21:58 GMT-05:00 Javier Rey :
>
>> This i
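For reference, a hedged sketch of "allocate more JVM memory": the executor heap can be raised on the SparkConf, while the driver heap must be set before the driver JVM starts (e.g. via spark-submit --driver-memory), so the driver setting below only takes effect in that case. The 4g values are illustrative, not recommendations.

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("parquet-union")                  // hypothetical app name
  .set("spark.executor.memory", "4g")
  .set("spark.driver.memory", "4g")             // only effective if set before the driver JVM launches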
Hi,
Since Dataset will be the major API in Spark 2.0, why will MLlib be
DataFrame-based, with 'future development will focus on the DataFrame-based API'?
Is there any plan to change MLlib from DataFrame-based to Dataset-based?
Thanks,
lujinhong
--
Hi Team,
I have a Spark application up & running on a 10-node standalone cluster.
When I launch the application in cluster mode I am able to create a separate
log file for the driver & executors (common to all executors).
But my requirement is to create a separate log file for each executor. Is it
fe
>
> at least links to the keys used to sign releases on the
> download page
+1 for that.
On Mon, Jul 11, 2016 at 3:35 AM, Phil Steitz wrote:
> On 7/10/16 10:57 AM, Shuai Lin wrote:
> > Not sure where you see " 0x7C6C105FFC8ED089". I
>
> That's the key ID for the key below.
> > think the releas
I would suggest you run the Scala version of the example first, so you can
tell whether it's a problem with the data you provided or a problem with the
Java code.
On Mon, Jul 11, 2016 at 2:37 AM, Biplob Biswas
wrote:
> Hi,
>
> I know i am asking again, but I tried running the same thing on mac as
>
Hi,
I can use a JDBC connection to connect from the SQuirreL client to the Spark
Thrift Server, and this works fine.
I have Zeppelin 0.6.0, which works OK with the default Spark interpreter.
I configured the JDBC interpreter to connect to the Spark Thrift Server as follows:
[image: Inline images 1]
I can use beeli
Hi Ayan,
I tested it and it works fine, but one more confusion: what if my (technical) users want
to write some code in Zeppelin to apply things to a Hive table?
Zeppelin and STS can't share a SparkContext; does that mean we need separate processes?
Is there any way to use the same SparkContext as STS?
Regards,
Chan
DataFrame is a special case of Dataset, so they mean the same thing.
Actually, the ML pipeline API will accept Dataset[_] instead of DataFrame in
Spark 2.0.
More accurately, we can say that MLlib will focus on the Dataset-based API for
further development.
Thanks
Yanbo
2016-07-10 20:35 GMT-0
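For reference, a small sketch of why the two names coincide in Spark 2.0: DataFrame is a type alias for Dataset[Row], so the DataFrame-based API and a Dataset-based API over Row are the same thing.

import org.apache.spark.sql.{DataFrame, Dataset, Row}

def describe(df: DataFrame): Unit = df.printSchema()
def describeAsDataset(ds: Dataset[Row]): Unit = describe(ds)    // compiles: DataFrame is Dataset[Row]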
Hi Swaroop,
Would you mind sharing your code so that others can help you figure out
what caused this error?
I can run the isotonic regression examples fine.
Thanks
Yanbo
2016-07-08 13:38 GMT-07:00 dsp :
> Hi I am trying to perform Isotonic Regression on a data set with 9 features
> and a label
Hi
When you say "Zeppelin and STS", I am assuming you mean "Spark Interpreter"
and "JDBC interpreter" respectively.
Through Zeppelin, you can either run your own spark application (by using
Zeppelin's own spark context) using spark interpreter OR you can access
STS, which is a spark application