Is there a way to set a description to display in the UI SQL tab, like we can
with sc.setJobDescription for jobs and stages?
Thanks,
Giri
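For reference, a minimal sketch (my understanding, not verified on every
version): the SQL tab's description falls back to the job description set on
the SparkContext, so setting it before the action should carry over. The
query and label below are hypothetical.

spark.sparkContext.setJobDescription("daily conv_date rollup")  // hypothetical label
spark.sql("SELECT count(*) FROM logs").show()  // assumes a table named logs
spark.sparkContext.setJobDescription(null)  // clear it for later jobs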
--
When I run spark.read.orc("hdfs://test").filter("conv_date = 20181025").count
with "spark.sql.orc.filterPushdown=true", I see the following in the executor
logs, so predicate pushdown is happening:
18/11/01 17:31:17 INFO OrcInputFormat: ORC pushdown predicate: leaf-0 =
(IS_NULL conv_date)
leaf-1 = (EQUALS conv_date 20181025)
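As a cross-check (a sketch, not from the original thread), the physical plan
also shows what Spark hands to the ORC reader:

spark.conf.set("spark.sql.orc.filterPushdown", "true")
spark.read.orc("hdfs://test")
  .filter("conv_date = 20181025")
  .explain(true)  // look for PushedFilters: [..., EqualTo(conv_date,20181025)]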
Spark version: 2.2.0
Hive version: 1.1.0
There are a lot of small files.
Spark code:
"spark.sql.orc.enabled": "true",
"spark.sql.orc.filterPushdown": "true"
val logs = spark.read.schema(schema)
  .orc("hdfs://test/date=201810")
  .filter("date > 20181003")
Hive:
"spark.sql.orc.enabled": "true",
"spark.s
When reading a large number of ORC files from HDFS under a directory, Spark
doesn't launch any tasks for some amount of time, and I don't see any tasks
running during that period. I'm using the spark.read.orc command and the
spark.sql configs shown above.
What is Spark doing under the hood when spark.read.orc is issued?
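My understanding (an assumption, not confirmed in the thread) is that the
quiet period is the driver listing files and reading ORC footers to infer a
schema; supplying the schema up front avoids the inference pass. Column names
below are hypothetical:

import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("conv_date", IntegerType),
  StructField("payload", StringType)))

val df = spark.read.schema(schema).orc("hdfs://test")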
I was able to fix it by adding the servlet 3.0 API to the classpath.
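For reference, a sketch of that dependency, assuming an sbt build (the
artifact javax.servlet:javax.servlet-api:3.0.1 is the Servlet 3.0 API on
Maven Central):

// build.sbt: put the Servlet 3.0 API on the classpath
libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1"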
--
Hi,
I'm running spark-shell in YARN client mode; the SparkContext starts and I'm
able to run commands.
But the UI is not coming up, and I see the errors below in the spark shell:
20:51:20 WARN servlet.ServletHandler:
javax.servlet.ServletException: Could not determine the proxy server for
redirection
at ...
Hi All,
I'm using the --jars option in spark-submit to ship third-party jars, but I
don't see them actually passed to the Mesos slaves; I'm getting
class-not-found exceptions.
This is how I'm using the --jars option:
--jars hdfs://namenode:8082/user/path/to/jar
Am I missing something here, or what's the correct way to pass them?
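One thing to double-check: --jars takes a comma-separated list (no spaces),
and the URIs must be reachable from the slaves. A sketch with hypothetical
class and jar names:

spark-submit \
  --master mesos://master:5050 \
  --class com.example.Main \
  --jars hdfs://namenode:8082/user/path/to/dep1.jar,hdfs://namenode:8082/user/path/to/dep2.jar \
  app.jar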
Hi,
I have a use case where I need to pass the SparkContext into a map function:
reRDD.map(row => method1(row, sc)).saveAsTextFile(outputDir)
method1 needs the Spark context to query Cassandra, but I see the error below:
java.io.NotSerializableException: org.apache.spark.SparkContext
Is there a way we can fix this?
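A common workaround is to ship something serializable instead of the
SparkContext. A sketch assuming the DataStax spark-cassandra-connector is on
the classpath and that method1 can be adapted to take a Cassandra session
(both assumptions):

import com.datastax.spark.connector.cql.CassandraConnector

// CassandraConnector is serializable, so the closure can capture it.
val connector = CassandraConnector(sc.getConf)

reRDD.mapPartitions { rows =>
  connector.withSessionDo { session =>
    // materialize before the session is closed at the end of withSessionDo
    rows.map(row => method1(row, session)).toList.iterator
  }
}.saveAsTextFile(outputDir)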
Hi,
I'm trying to query a Hive view from Spark, and it gives different row counts
compared to Hive.
Here is the view definition in Hive:
create view test_hive_view as
select col1, col2 from tab1
left join tab2 on tab1.col1 = tab2.col1
left join tab3 on tab1.col1 = tab3.col1
where col1 i...
Hi,
I'm trying to query a Hive table backed by Avro from Spark SQL, and I'm
seeing the errors below:
15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
avro.schema.literal nor avro.schema.url specified, can't determine table
schema
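If this is the usual missing-schema case, pointing the table at its Avro
schema should fix it; a sketch in HiveQL, with a hypothetical table name and
schema path:

ALTER TABLE my_avro_table SET TBLPROPERTIES (
  'avro.schema.url' = 'hdfs:///schemas/my_avro_table.avsc');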
Is there a way to filter out empty partitions before I write to HDFS, other
than using repartition and coalesce?
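One approach I've seen uses PartitionPruningRDD, a developer API, so treat
this as a sketch: find the empty partitions with a cheap extra job, then
prune them without a shuffle. rdd is a stand-in for your RDD:

import org.apache.spark.rdd.PartitionPruningRDD

// Extra job: record which partitions actually contain data
// (hasNext only pulls the first element of each partition).
val nonEmpty = rdd
  .mapPartitionsWithIndex((i, it) => Iterator((i, it.hasNext)))
  .collect()
  .toMap

// Keep only the non-empty partitions; no shuffle involved.
val pruned = PartitionPruningRDD.create(rdd, i => nonEmpty.getOrElse(i, false))
pruned.saveAsTextFile("hdfs:///output")  // hypothetical path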
--
Hi All,
I have Spark on YARN and there are multiple Spark jobs on the cluster.
Sometimes some jobs don't get enough resources even when there are enough
free resources available on the cluster, and even when I use the settings
below:
--num-workers 75 \
--worker-cores 16
Jobs stick with the resources w...
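An aside, in case it helps: on recent Spark-on-YARN builds those flags are
--num-executors / --executor-cores, and dynamic allocation can hand idle
resources back to the cluster. A sketch with placeholder values:

spark-submit \
  --master yarn \
  --num-executors 75 \
  --executor-cores 16 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  app.jar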
Hi All,
I'm trying to save an RDD as a sequence file, and I'm not able to set the
compression type (BLOCK or RECORD).
Can anyone let me know how we can set the compression type?
Here is the code I'm using:
RDD.saveAsSequenceFile(target, Some(classOf[org.apache.hadoop.io.compress.GzipCodec]))
Thanks
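A sketch of one way to do it, on the assumption that saveAsSequenceFile
delegates to SequenceFileOutputFormat, which reads the compression type from
the Hadoop conf:

// Ask SequenceFileOutputFormat for BLOCK compression via the Hadoop conf.
sc.hadoopConfiguration.set(
  "mapreduce.output.fileoutputformat.compress.type", "BLOCK")

RDD.saveAsSequenceFile(target,
  Some(classOf[org.apache.hadoop.io.compress.GzipCodec]))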
--
Any pointers on this issue?
Thanks
--
No, it doesn't implement Serializable. It's a third-party class.
--
Yes, I did enable that:
conf.set("spark.serializer",
"org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "com.bigdata.MyRegistrator")
--
I'm new to Spark programming, and here I'm trying to use a third-party class
in map with the Kryo serializer:
val deviceApi = new DeviceApi()
deviceApi.loadDataFromStream(this.getClass.getClassLoader.getResourceAsStream("20140730.json"))
val properties = uaRDD1.map(line => deviceApi.getProperties(line))
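If the class can't be made serializable, the usual fix is to construct it on
the executors instead of shipping a driver-side instance; a sketch, assuming
DeviceApi can be built once per partition:

val properties = uaRDD1.mapPartitions { lines =>
  // Built here, on the executor, so nothing non-serializable is captured.
  val deviceApi = new DeviceApi()
  deviceApi.loadDataFromStream(
    getClass.getClassLoader.getResourceAsStream("20140730.json"))
  lines.map(line => deviceApi.getProperties(line))
}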