Reading from cassandra store in rdd

2016-05-04 Thread Yasemin Kaya
Hi, I asked this question on the DataStax group, but I also want to ask the spark-user group, since someone may have faced this problem. I have data in Cassandra and want to load it into a Spark RDD. I got an error; I searched for it, but nothing changed. Is there anyone who can help me fix it? I can connect to Cassandra and cqlsh

Re: Saving model S3

2016-03-21 Thread Yasemin Kaya
Hi Ted, I don't understand what you want to know; could you be more clear, please? 2016-03-21 15:24 GMT+02:00 Ted Yu : > Was speculative execution enabled ? > > Thanks > > On Mar 21, 2016, at 6:19 AM, Yasemin Kaya wrote: > > Hi, > > I am using S3

Saving model S3

2016-03-21 Thread Yasemin Kaya
Hi, I am using S3 to read data, and I also want to save my model to S3. The reading part gives no error, but when I save the model I get this error. I tried changing the scheme from s3n to s3a, but nothing changed; different errors come up. *reading pat

Re: reading file from S3

2016-03-16 Thread Yasemin Kaya
ts that we do not use. Please use roles and you >>> will not have to worry about security. >>> >>> Regards, >>> Gourav Sengupta >>> >>> On Tue, Mar 15, 2016 at 2:38 PM, Sabarish Sasidharan < >>> sabarish@gmail.com> wrote: >

Re: reading file from S3

2016-03-15 Thread Yasemin Kaya
03-15 12:33 GMT+02:00 Yasemin Kaya : > >> Hi, >> >> I am using Spark 1.6.0 standalone and I want to read a txt file from S3 >> bucket named yasemindeneme and my file name is deneme.txt. But I am getting >> this error. Here is the simple code >> <https://

reading file from S3

2016-03-15 Thread Yasemin Kaya
Hi, I am using Spark 1.6.0 standalone and I want to read a txt file from an S3 bucket named yasemindeneme; my file name is deneme.txt. But I am getting this error. Here is the simple code Exception in thread "main" java.lang.IllegalArgumentE

concurrent.RejectedExecutionException

2016-01-23 Thread Yasemin Kaya
Hi all, I'm using Spark 1.5 and getting this error; could you help? I can't understand it. 16/01/23 10:11:59 ERROR TaskSchedulerImpl: Exception in statusUpdate java.util.concurrent.RejectedExecutionException: Task org.apache.spark.scheduler.TaskResultGetter$$anon$2@62c72719 rejected from java.util.con

Re: write new data to mysql

2016-01-08 Thread Yasemin Kaya
When I changed the version to 1.6.0, it worked. Thanks. 2016-01-08 21:27 GMT+02:00 Yasemin Kaya : > Hi, > There is no write function that Todd mentioned or i cant find it. > The code and error are in gist > <https://gist.github.com/yaseminn/f5a2b78b126df71dfd0b>. Could you che

Re: write new data to mysql

2016-01-08 Thread Yasemin Kaya
track_on_alarm", connectionProps) > > HTH. > > -Todd > > On Fri, Jan 8, 2016 at 10:53 AM, Ted Yu wrote: > >> Which Spark release are you using ? >> >> For case #2, was there any error / clue in the logs ? >> >> Cheers >> >> On Fri

write new data to mysql

2016-01-08 Thread Yasemin Kaya
Hi, I want to write a DataFrame to an existing MySQL table, but when I use *peopleDataFrame.insertIntoJDBC(MYSQL_CONNECTION_URL_WRITE, "track_on_alarm", false)* it says "Table track_on_alarm already exists." And when I use *peopleDataFrame.insertIntoJDBC(MYSQL_CONNECTION_URL_WRITE, "track_on_alarm", true)

Re: Struggling time by data

2015-12-25 Thread Yasemin Kaya
upByKey().filter{case (_, (a, b)) => abs(a._1, a._1) < 30min} > > does it work for you ? > > 2015-12-25 16:53 GMT+08:00 Yasemin Kaya : > >> hi, >> >> I have struggled this data couple of days, i cant find solution. Could >> you help me? >> >>

Struggling time by data

2015-12-25 Thread Yasemin Kaya
Hi, I have struggled with this data for a couple of days and I can't find a solution. Could you help me? *DATA:* *(userid1_time, url)* *(userid1_time2, url2)* I want to get the urls that occur within 30 min of each other. *RESULT:* *If time2-time1 < 30 min* *(user1, [url1, url2])* Best, yasemin -- hiç ender hiç
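The grouping described above can be sketched in plain Java (not Spark; the class and method names are my own). In Spark one would key each record by the user id and compare timestamps after a groupByKey; the minimal O(n²) sketch below does the same pairwise comparison, assuming the part of the key after the underscore encodes minutes:

```java
import java.util.*;

// Hypothetical sketch: keys look like "userid_minutes"; two URLs of the
// same user are paired when their timestamps differ by less than 30 minutes.
class SessionPairs {

    static Map<String, List<String>> pairWithin30Min(List<String[]> records) {
        Map<String, List<String>> result = new HashMap<>();
        for (int i = 0; i < records.size(); i++) {
            for (int j = i + 1; j < records.size(); j++) {
                String[] a = records.get(i), b = records.get(j);
                String userA = a[0].split("_")[0], userB = b[0].split("_")[0];
                if (!userA.equals(userB)) continue;   // only pair the same user
                long tA = Long.parseLong(a[0].split("_")[1]);
                long tB = Long.parseLong(b[0].split("_")[1]);
                if (Math.abs(tA - tB) < 30) {         // within 30 minutes
                    result.computeIfAbsent(userA, k -> new ArrayList<>())
                          .addAll(Arrays.asList(a[1], b[1]));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> data = Arrays.asList(
            new String[]{"user1_100", "url1"},
            new String[]{"user1_120", "url2"},   // 20 min apart -> paired
            new String[]{"user1_200", "url3"});  // too far apart -> dropped
        System.out.println(pairWithin30Min(data)); // {user1=[url1, url2]}
    }
}
```

The same predicate would become the filter after a groupByKey in the Spark version.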

Re: rdd split into new rdd

2015-12-23 Thread Yasemin Kaya
riteria a bit more ? The above seems to be a Set, >> not a Map. >> >> Cheers >> >> On Wed, Dec 23, 2015 at 7:11 AM, Yasemin Kaya wrote: >> >>> Hi, >>> >>> I have data >>> *JavaPairRDD> *format. In example: >>>

rdd split into new rdd

2015-12-23 Thread Yasemin Kaya
Hi, I have data in *JavaPairRDD> * format. For example: *(1610, {a=1, b=1, c=2, d=2})*. I want to get *JavaPairRDD>*, for example: *(1610, {a, b})* *(1610, {c, d})*. Is there a way to solve this problem? Best, yasemin
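A minimal plain-Java sketch of the split (names are mine; in Spark this logic would be the body of a flatMapToPair): group the map's keys by their value and emit one (id, keys) pair per distinct value:

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch: (1610, {a=1, b=1, c=2, d=2}) -> [(1610, {a, b}), (1610, {c, d})]
class SplitByValue {

    static List<Map.Entry<Integer, Set<String>>> split(int id, Map<String, Integer> m) {
        // group keys by their value; TreeMap/TreeSet just make the output deterministic
        Map<Integer, Set<String>> grouped = new TreeMap<>();
        m.forEach((k, v) -> grouped.computeIfAbsent(v, x -> new TreeSet<>()).add(k));
        // one output pair per distinct value
        return grouped.values().stream()
                .map(keys -> Map.entry(id, keys))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(split(1610, Map.of("a", 1, "b", 1, "c", 2, "d", 2)));
        // [1610=[a, b], 1610=[c, d]]
    }
}
```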

groupByKey()

2015-12-08 Thread Yasemin Kaya
Hi, sorry for the long input, but this is my situation: I have two lists and I want to groupByKey them, but some values of the lists disappear and I can't understand why. (8867989628612931721,[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

Re: rdd conversion

2015-10-26 Thread Yasemin Kaya
Oct 26, 2015 at 9:40 AM, Yasemin Kaya wrote: > >> Hi, >> >> I have *JavaRDD>>>* and I want to >> convert every map to pairrdd, i mean >> * JavaPairRDD>. * >> >> There is a loop in list to get the indexed map, when I write code below, >>

rdd conversion

2015-10-26 Thread Yasemin Kaya
Hi, I have *JavaRDD>>>* and I want to convert every map to a pair RDD, I mean *JavaPairRDD>*. There is a loop over the list to get the indexed map; when I write the code below, it returns me only one RDD. JavaPairRDD> mapToRDD = IdMapValues.mapToPair(new PairFunction>>, Integer, ArrayList>() { @Override

Model exports PMML (Random Forest)

2015-10-07 Thread Yasemin Kaya
Hi, I want to export my model to PMML, but there is no support for random forest yet; it is planned for version 1.6. Is it possible to produce my (random forest) model in PMML XML format manually? Thanks. Best, yasemin

ML Pipeline

2015-09-28 Thread Yasemin Kaya
Hi, I am using Spark 1.5 and ML Pipeline. I create the model, then give the model unlabeled data to find the probabilities and predictions. When I want to see the results, it returns an error. //creating model final PipelineModel model = pipeline.fit(trainingData); JavaRDD rowRDD1 = unlabeledTest .

Re: spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Yasemin Kaya
; "rdd.toDf()" on your RDD to convert it into a dataframe. > > On Fri, Sep 18, 2015 at 7:32 AM, Yasemin Kaya wrote: > >> Hi, >> >> I am using *spark 1.5, ML Pipeline Decision Tree >> <http://spark.apache.org/docs/latest/ml-decision-tree.html#output-colu

spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Yasemin Kaya
Hi, I am using the *Spark 1.5 ML Pipeline Decision Tree* to get the tree's probability, but I have to convert my data to the DataFrame type. While creating the model there is no problem, but when I use the model on my data there is a prob

Re: Random Forest MLlib

2015-09-15 Thread Yasemin Kaya
Hi Maximo, is there a way to get precision and recall from a Pipeline? In the MLlib version I get precision and recall metrics from MulticlassMetrics, but the ML Pipeline gives only testErr. Thanks yasemin 2015-09-10 17:47 GMT+03:00 Yasemin Kaya : > Hi Maximo, > Thanks alot.. > Hi Yasemin, &g

Multilabel classification support

2015-09-11 Thread Yasemin Kaya
Hi, I want to use MLlib for multilabel classification, but I found http://spark.apache.org/docs/latest/mllib-classification-regression.html and it is not what I mean. Is there a way to do multilabel classification? Thanks a lot. Best, yasemin

Re: Random Forest MLlib

2015-09-10 Thread Yasemin Kaya
Hi Maximo, Thanks alot.. Hi Yasemin, We had the same question and found this: https://issues.apache.org/jira/browse/SPARK-6884 Thanks, Maximo On Sep 10, 2015, at 9:09 AM, Yasemin Kaya wrote: Hi , I am using Random Forest Alg. for recommendation system. I get users and users' res

Random Forest MLlib

2015-09-10 Thread Yasemin Kaya
Hi, I am using the Random Forest algorithm for a recommendation system. I get users and the users' responses, yes or no (1/0), but I want to learn the probabilities from the trees: the program says user x is "yes", but with what probability? I want to get these probabilities. Best, yasemin

Re: EC2 cluster doesn't work saveAsTextFile

2015-08-10 Thread Yasemin Kaya
m> > @deanwampler <http://twitter.com/deanwampler> > http://polyglotprogramming.com > > On Mon, Aug 10, 2015 at 7:08 AM, Yasemin Kaya wrote: > >> Hi, >> >> I have EC2 cluster, and am using spark 1.3, yarn and HDFS . When i submit >> at local there i

EC2 cluster doesn't work saveAsTextFile

2015-08-10 Thread Yasemin Kaya
Hi, I have an EC2 cluster and am using Spark 1.3, YARN and HDFS. When I submit locally there is no problem, but when I run on the cluster, saveAsTextFile doesn't work. It says: "*User class threw exception: Output directory hdfs://172.31.42.10:54310/./weblogReadResult

Re: java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Thanks Ted, I solved it :) 2015-08-08 14:07 GMT+03:00 Ted Yu : > Have you tried including package name in the class name ? > > Thanks > > > > On Aug 8, 2015, at 12:00 AM, Yasemin Kaya wrote: > > Hi, > > I have a little spark program and i am getting an error why

java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Hi, I have a little Spark program and I am getting an error that I don't understand. My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3. I am using Spark 1.3. Submitting: bin/spark-submit --class MonthlyAverage --master local[4] weather.jar error: ~/spark-1.3.1-bin-hadoop2.4$ bin/sp

Re: Amazon DynamoDB & Spark

2015-08-07 Thread Yasemin Kaya
Thanx Jay. 2015-08-07 19:25 GMT+03:00 Jay Vyas : > In general the simplest way is that you can use the Dynamo Java API as is > and call it inside a map(), and use the asynchronous put() Dynamo api call > . > > > > On Aug 7, 2015, at 9:08 AM, Yasemin Kaya wrote: > >

Amazon DynamoDB & Spark

2015-08-07 Thread Yasemin Kaya
Hi, is there a way to use DynamoDB in a Spark application? I have to persist my results to DynamoDB. Thanks, yasemin

Broadcast value

2015-06-12 Thread Yasemin Kaya
Hi, I am reading a broadcast value from a file. I want to use it when creating a Rating object (ALS), but I am getting null. Here is my code: lines 17 & 18 are OK, but line 19 returns null, so line 21 gives me an error, and I don't understand why. Do you have any idea

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
. > > As to the port issue -- what about this: > > $bin/cassandra-cli -h localhost -p 9160 > Connected to: "Test Cluster" on localhost/9160 > Welcome to Cassandra CLI version 2.1.5 > > > On Tue, Jun 9, 2015 at 1:29 PM, Yasemin Kaya wrote: > >> My jar fi

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
n Tue, Jun 9, 2015 at 10:18 AM, Yasemin Kaya wrote: > >> Sorry my answer I hit terminal lsof -i:9160: result is >> >> lsof -i:9160 >> COMMAND PIDUSER FD TYPE DEVICE SIZE/OFF NODE NAME >> java7597 inosens 101u IPv4 85754 0t0 TCP local

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
Sorry, to answer: I ran lsof -i:9160 in the terminal; the result is lsof -i:9160 COMMAND PIDUSER FD TYPE DEVICE SIZE/OFF NODE NAME java7597 inosens 101u IPv4 85754 0t0 TCP localhost:9160 (LISTEN) so is port 9160 available or not? 2015-06-09 17:16 GMT+03:00 Yasemin Kaya : > Yes

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
TEN) > > ​ > I am running an out-of-the box cassandra conf where > > rpc_address: localhost > # port for Thrift to listen for clients on > rpc_port: 9160 > > > > On Tue, Jun 9, 2015 at 7:36 AM, Yasemin Kaya wrote: > >> I couldn't find any solutio

Re: Cassandra Submit

2015-06-09 Thread Yasemin Kaya
I couldn't find any solution. I can write but I can't read from Cassandra. 2015-06-09 8:52 GMT+03:00 Yasemin Kaya : > Thanks alot Mohammed, Gerard and Yana. > I can write to table, but exception returns me. It says "*Exception in > thread "main" java.io.

Re: Cassandra Submit

2015-06-08 Thread Yasemin Kaya
you have connectivity before you try to make a a connection via spark. > > On Mon, Jun 8, 2015 at 4:43 AM, Yasemin Kaya wrote: > >> Hi, >> I run my project on local. How can find ip address of my cassandra host >> ? From cassandra.yaml or ?? >> >> yasemin &g

Re: Cassandra Submit

2015-06-08 Thread Yasemin Kaya
Hi, I run my project locally. How can I find the IP address of my Cassandra host? From cassandra.yaml, or ...? yasemin 2015-06-08 11:27 GMT+03:00 Gerard Maas : > ? = > > On Mon, Jun 8, 2015 at 10:12 AM, Yasemin Kaya wrote: > >> Hi , >> >> How can I find spark.ca

Re: Cassandra Submit

2015-06-08 Thread Yasemin Kaya
.host setting. It should be > pointing to one of your Cassandra nodes. > > > > Mohammed > > > > *From:* Yasemin Kaya [mailto:godo...@gmail.com] > *Sent:* Friday, June 5, 2015 7:31 AM > *To:* user@spark.apache.org > *Subject:* Cassandra Submit > > > > Hi, &

Cassandra Submit

2015-06-05 Thread Yasemin Kaya
Hi, I am using Cassandra in my project. I got this error: *Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {127.0.1.1}:9042*. I think I have to modify the submit command. What should I add or remove when I submit my project? Best, yasemin

Re: ALS Rating Object

2015-06-03 Thread Yasemin Kaya
1.4: spark.ml.recommendation.ALS (in the Pipeline API) exposes > ALS.train as a DeveloperApi to allow users to use Long instead of Int. > We're also thinking about better ways to permit Long IDs. > > Joseph > > On Wed, Jun 3, 2015 at 5:04 AM, Yasemin Kaya wrote: > >&g

ALS Rating Object

2015-06-03 Thread Yasemin Kaya
Hi, I want to use Spark's ALS in my project. I have user ids like 30011397223227125563254, and the Rating object that ALS uses wants an Integer as the user id, so the id field does not fit into a 32-bit Integer. How can I solve that? Thanks. Best, yasemin
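One common workaround, sketched here in plain Java (class and method names are mine; in Spark the same effect is usually achieved by zipWithIndex over the distinct ids), is to build a dense Integer index for the oversized ids before constructing Rating objects:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: map arbitrarily large string ids to small dense ints.
// The reverse mapping (int -> original id) would be kept to translate
// recommendations back; it is omitted here for brevity.
class IdIndex {
    private final Map<String, Integer> index = new HashMap<>();

    // Returns a stable small int for each distinct raw id.
    int toInt(String rawId) {
        // size() is read before the new entry is added, giving ids 0, 1, 2, ...
        return index.computeIfAbsent(rawId, k -> index.size());
    }

    public static void main(String[] args) {
        IdIndex idx = new IdIndex();
        System.out.println(idx.toInt("30011397223227125563254")); // 0
        System.out.println(idx.toInt("30011397223227125563255")); // 1
        System.out.println(idx.toInt("30011397223227125563254")); // 0 again
    }
}
```

The small ints can then be used as the ALS user ids, as long as the distinct user count stays below Integer.MAX_VALUE.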

Cassandra example

2015-06-01 Thread Yasemin Kaya
Hi, I want to write my RDD to a Cassandra database, and I took an example from this site . I added it to my project, but I have errors. Here is my project in a gist . errors :

Collabrative Filtering

2015-05-26 Thread Yasemin Kaya
Hi, in CF:

String path = "data/mllib/als/test.data";
JavaRDD data = sc.textFile(path);
JavaRDD ratings = data.map(new Function() {
  public Rating call(String s) {
    String[] sarray = s.split(",");
    return new Rating(Integer.parseInt(sarray[0]), Integer.parseInt(sarray[1]), Double.parseDouble(sarray[

map reduce ?

2015-05-21 Thread Yasemin Kaya
Hi, I have JavaPairRDD> and, as an example, this is what I want to get:

user_id  cat1  cat2  cat3  cat4
522      0     1     2     0
62       1     0     3     0
661      1     2     0     1

query: the users who have a non-zero number in both the cat2 and cat3 columns. answer: cat2 -> 522,661 & cat3 -> 522,62 = user 522. How can I get this solution?
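A plain-Java sketch of the query (class and method names are mine): keep the users whose cat2 and cat3 counts are both non-zero, which matches the worked answer in the message (= user 522):

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch: each user maps to a count vector {cat1, cat2, cat3, cat4};
// keep user ids with non-zero entries in both the cat2 and cat3 columns.
class CategoryQuery {

    static List<Integer> usersWithCat2AndCat3(Map<Integer, int[]> rows) {
        return rows.entrySet().stream()
                .filter(e -> e.getValue()[1] != 0 && e.getValue()[2] != 0) // cat2 & cat3
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Integer, int[]> rows = new HashMap<>();
        rows.put(522, new int[]{0, 1, 2, 0});
        rows.put(62,  new int[]{1, 0, 3, 0});
        rows.put(661, new int[]{1, 2, 0, 1});
        System.out.println(usersWithCat2AndCat3(rows)); // [522]
    }
}
```

In the Spark version the same predicate would go inside a filter on the JavaPairRDD.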

Re: swap tuple

2015-05-14 Thread Yasemin Kaya
*Sent:* Thursday, May 14, 2015 1:24 PM > *To:* 'Holden Karau'; 'Yasemin Kaya' > *Cc:* user@spark.apache.org > *Subject:* RE: swap tuple > > Where is the “Tuple” supposed to be in - you can > refer to a “Tuple” if it was e.g. > > > > > *From:* holden

reduceByKey

2015-05-14 Thread Yasemin Kaya
Hi, I have a JavaPairRDD and I want to apply reduceByKey. My pair RDD:

2553: 0,0,0,1,0,0,0,0
46551: 0,1,0,0,0,0,0,0
266: 0,1,0,0,0,0,0,0
2553: 0,0,0,0,0,1,0,0
225546: 0,0,0,0,0,1,0,0
225546: 0,0,0,0,0,1,0,0

I want to get:

2553: 0,0,0,1,0,1,0,0
46551: 0,1,0,0,0,0,0,0
266: 0,1
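The merge that reduceByKey would apply to two values sharing a key (e.g. the two 2553 rows) is an element-wise sum of the comma-separated vectors. A plain-Java sketch of that reduce function (the class and method names are mine):

```java
import java.util.stream.*;

// Hypothetical sketch of the reduce function: sum two equal-length
// comma-separated count vectors position by position.
class VectorMerge {

    static String merge(String a, String b) {
        String[] x = a.split(","), y = b.split(",");
        return IntStream.range(0, x.length)
                .mapToObj(i -> String.valueOf(Integer.parseInt(x[i]) + Integer.parseInt(y[i])))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(merge("0,0,0,1,0,0,0,0", "0,0,0,0,0,1,0,0"));
        // 0,0,0,1,0,1,0,0
    }
}
```

In Spark this would be passed as the Function2 argument of reduceByKey; keys with a single value pass through unchanged.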

swap tuple

2015-05-14 Thread Yasemin Kaya
Hi, I have a *JavaPairRDD* and I want to *swap tuple._1() with tuple._2()*. I use *tuple.swap()*, but the JavaPairRDD is not actually changed: when I print the JavaPairRDD, the values are the same. Can anyone help me with that? Thank you. Have a nice day. yasemin
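The likely pitfall: swap() returns a new tuple, and mapToPair returns a new RDD; neither modifies the original, so the result must be assigned. A plain-Java sketch of the same immutability behaviour with a simple pair type (names are mine):

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical sketch: swapping an immutable pair yields a NEW pair,
// so discarding the return value leaves the original unchanged.
class SwapDemo {

    static <A, B> Map.Entry<B, A> swap(Map.Entry<A, B> e) {
        return new SimpleImmutableEntry<>(e.getValue(), e.getKey());
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> p = new SimpleImmutableEntry<>("url", 7);
        swap(p);                                 // result discarded -> p unchanged
        System.out.println(p);                   // url=7
        Map.Entry<Integer, String> q = swap(p);  // keep the result
        System.out.println(q);                   // 7=url
    }
}
```

Analogously, in Spark one has to write swapped = pairs.mapToPair(t -> t.swap()) and then use swapped, not the original RDD.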

Re: JavaPairRDD

2015-05-13 Thread Yasemin Kaya
aribooksonline.com/library/view/learning-spark/9781449359034/ch04.html > > Tristan > > > > > > On 13 May 2015 at 23:12, Yasemin Kaya wrote: > >> Hi, >> >> I want to get *JavaPairRDD *from the tuple part of >> *JavaPairRDD> Tuple2>

JavaPairRDD

2015-05-13 Thread Yasemin Kaya
Hi, I want to get a *JavaPairRDD* from the tuple part of a *JavaPairRDD>*. As an example: (http://www.koctas.com.tr/reyon/el-aletleri/7,(0,1,0,0,0,0,0,0,46551)) is in my *JavaPairRDD>* and I want to get *( (46551), (0,1,0,0,0,0,0,0) )*. I try to split tuple._2() and create a new JavaPairRDD, but I can'
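The re-keying step can be sketched in plain Java (class and method names are mine; in Spark it would be the body of a mapToPair over the old pairs): split the old value on ',', use the last token as the new key, and keep the rest as the new value:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical sketch: "0,1,0,0,0,0,0,0,46551" -> key "46551", value "0,1,0,0,0,0,0,0"
class Rekey {

    static Map.Entry<String, String> rekey(String value) {
        int cut = value.lastIndexOf(',');                 // last comma separates the id
        return new SimpleImmutableEntry<>(value.substring(cut + 1),
                                          value.substring(0, cut));
    }

    public static void main(String[] args) {
        System.out.println(rekey("0,1,0,0,0,0,0,0,46551")); // 46551=0,1,0,0,0,0,0,0
    }
}
```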

Content based filtering

2015-05-12 Thread Yasemin Kaya
Hi, is content-based filtering available in Spark MLlib? If it isn't, what can I use as an alternative? Thank you. Have a nice day, yasemin

Spark Mongodb connection

2015-05-04 Thread Yasemin Kaya
Hi! I am new to Spark and I want to begin with a simple wordCount example in Java, but I want to take my input from a MongoDB database. I want to learn how I can connect a MongoDB database to my project. Can anyone help with this issue? Have a nice day, yasemin