In Spark 1.2 I used to be able to do this:

scala> org.apache.spark.sql.hive.HiveMetastoreTypes.toDataType("struct<int:bigint>")
res30: org.apache.spark.sql.catalyst.types.DataType =
StructType(List(StructField(int,LongType,true)))
That is, the name of a column can be a keyword like "int". This is no
longer the case.
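For reference, here is a minimal sketch of building that same schema programmatically with a column literally named "int". It uses the public org.apache.spark.sql.types API (in 1.2 these classes lived under org.apache.spark.sql.catalyst.types, as the res30 output shows), so treat it as an illustration rather than the exact 1.2 code path:

    import org.apache.spark.sql.types.{LongType, StructField, StructType}

    // "int" is a keyword in HiveQL, but the programmatic API accepts it as a field name.
    val schema = StructType(Seq(StructField("int", LongType, nullable = true)))

    schema.printTreeString()
    // root
    //  |-- int: long (nullable = true)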
>> ...this in your logs, which indicates that it is a read that
>> starts from an offset and reads one split size (64MB) worth of data:
>>
>> 14/11/20 15:39:45 [Executor task launch worker-1] INFO HadoopRDD: Input
>> split: s3n://mybucket/myfile:335544320+67108864
>> On Nov 22, 2
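(A rough sketch of the arithmetic behind that log line, assuming the 64MB split size it mentions and the ~1.2GB file size from the original message further down; the trailing partitions check is the standard RDD API with the placeholder path from this thread:)

    // 335544320 = 5 * 67108864, so the log line above is the sixth 64MB split of the file.
    val splitSize = 64L * 1024 * 1024                 // 67108864 bytes
    val fileSize  = 1200L * 1024 * 1024               // ~1.2GB, per the original message
    val numSplits = math.ceil(fileSize.toDouble / splitSize).toInt
    println(s"Expected input splits: $numSplits")     // about 19 splits of up to 64MB each

    // In the Spark shell, the actual partition count can be confirmed with:
    // sc.textFile("s3n://mybucket/myfile").partitions.size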
Err I meant #1 :)
- Nitay
Founder & CTO
On Sat, Nov 22, 2014 at 10:20 AM, Nitay Joffe wrote:
> Anyone have any thoughts on this? Trying to understand especially #2 if
> it's a legit bug or something I'm doing wrong.
>
> - Nitay
> Founder & CTO
>
>
>
Anyone have any thoughts on this? Trying to understand especially #2 if
it's a legit bug or something I'm doing wrong.
- Nitay
Founder & CTO
On Thu, Nov 20, 2014 at 11:54 AM, Nitay Joffe wrote:
> I have a simple S3 job to read a text file and do a line count.
> Sp
I have a simple S3 job to read a text file and do a line count.
Specifically I'm doing *sc.textFile("s3n://mybucket/myfile").count*. The
file is about 1.2GB. My setup is a standalone Spark cluster with 4 workers,
each with 2 cores / 16GB RAM. I'm using branch-1.2 code built against
Hadoop 2.4 (though I
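(For completeness, a minimal self-contained sketch of the job described above as a standalone app; the credential property names are the standard Hadoop ones for the s3n:// filesystem, and the app/object name is made up:)

    import org.apache.spark.{SparkConf, SparkContext}

    object S3LineCount {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("S3LineCount"))

        // s3n:// credentials are typically supplied through the Hadoop configuration.
        sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
        sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))

        val count = sc.textFile("s3n://mybucket/myfile").count()
        println(s"Line count: $count")

        sc.stop()
      }
    }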