Using Apache Kylin as data source for Spark

2018-05-17 Thread ShaoFeng Shi
Hello, Kylin and Spark users, A new doc has been added to the Apache Kylin website on how to use Kylin as a data source in Spark. This can help users who want to use Spark to analyze the aggregated Cube data. https://kylin.apache.org/docs23/tutorial/spark.html Thanks for your attention. -- Best
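
For reference, a minimal sketch (not taken from the linked doc) of what querying Kylin from Spark over Kylin's JDBC driver can look like; the host, project, and table names below are placeholders:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kylin-jdbc-demo").getOrCreate()

// Kylin exposes cube data through a JDBC endpoint; host/project/table are hypothetical.
val cubeDf = spark.read
  .format("jdbc")
  .option("url", "jdbc:kylin://kylin-host:7070/my_project")
  .option("driver", "org.apache.kylin.jdbc.Driver")
  .option("dbtable", "kylin_sales")
  .option("user", "ADMIN")
  .option("password", "KYLIN")
  .load()

cubeDf.show()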

Re: Continuous Processing mode behaves differently from Batch mode

2018-05-17 Thread Yuta Morisawa
Thank you for the reply. I checked the Web UI and found that the total number of tasks is 10. So I changed the number of cores from 1 to 10, and now it works well. But I haven't figured out what is happening. My assumption is that each job consists of 10 tasks by default and each task occupies 1 core. S
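
That assumption matches how continuous processing works: each input partition gets a long-running task that pins a core for the lifetime of the query, so the query needs at least as many cores as input partitions. A minimal sketch (broker and topic names are hypothetical, 10 partitions assumed):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Reserve at least as many cores as Kafka partitions (10 here, as an assumption).
val spark = SparkSession.builder().master("local[10]").appName("cp-demo").getOrCreate()

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "topic-with-10-partitions")
  .load()

// Continuous trigger: long-running tasks, one per input partition, each holding a core.
df.writeStream
  .format("console")
  .trigger(Trigger.Continuous("1 second"))
  .start()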

Getting Data From Hbase using Spark is Extremely Slow

2018-05-17 Thread SparkUser6
I have written a simple four-line Spark program to process data in a Phoenix table: queryString = getQueryFullString(); // Get data from Phoenix table: select col from table JavaPairRDD phRDD = jsc.newAPIHadoopRDD( configuration, Ph
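
One thing worth trying (an assumption, since the full program is cut off above) is the phoenix-spark connector instead of newAPIHadoopRDD, since it can push column pruning and filters down to Phoenix rather than scanning through the generic input format. A sketch with hypothetical table and ZooKeeper names:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("phoenix-demo").getOrCreate()

// Load a Phoenix table as a DataFrame; "MY_TABLE" and the zkUrl are placeholders.
val df = spark.read
  .format("org.apache.phoenix.spark")
  .option("table", "MY_TABLE")
  .option("zkUrl", "zk-host:2181")
  .load()

// Only the selected column is fetched; the projection is pushed down to Phoenix.
df.select("COL").show()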

[structured-streaming] foreachPartition alternative in structured streaming.

2018-05-17 Thread karthikjay
I am reading data from Kafka using structured streaming and I need to save the data to InfluxDB. In the regular DStreams-based approach I did this as follows: val messages: DStream[(String, String)] = kafkaStream.map(record => (record.topic, record.value)) messages.foreachRDD { rdd =>
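
In structured streaming (as of 2.3) the closest analogue is a ForeachWriter passed to writeStream.foreach: open/process/close run per partition per epoch, so the connection handling from foreachPartition maps onto open and close. A minimal sketch with the InfluxDB calls stubbed out:

import org.apache.spark.sql.ForeachWriter

class InfluxSinkWriter extends ForeachWriter[(String, String)] {
  override def open(partitionId: Long, epochId: Long): Boolean = {
    // open one InfluxDB connection per partition per epoch here
    true
  }
  override def process(record: (String, String)): Unit = {
    // write the (topic, value) record to InfluxDB; println is a stand-in
    println(record)
  }
  override def close(errorOrNull: Throwable): Unit = {
    // flush and close the connection here
  }
}

// messages is a streaming Dataset[(String, String)] built from the Kafka source:
// messages.writeStream.foreach(new InfluxSinkWriter).start()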

Re: Snappy file compatible problem with spark

2018-05-17 Thread JF Chen
Yes. The JSON files compressed by Flume or Spark work well with Spark, but the JSON files I compressed myself cannot be read by Spark due to a codec problem. It seems Spark can only read files compressed with hadoop-snappy ( https://code.google.com/archive/p/hadoop-snappy/ ). Regards, Junfeng Chen O
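
For completeness, a one-line sketch of writing snappy output from Spark itself, so the framing matches Hadoop's codec (the path and RDD are placeholders):

// rdd is any RDD[String]; Hadoop's SnappyCodec writes the block format Spark expects.
rdd.saveAsTextFile("/data/out-snappy", classOf[org.apache.hadoop.io.compress.SnappyCodec])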

Snappy file compatible problem with spark

2018-05-17 Thread JF Chen
I made some snappy-compressed JSON files with the normal snappy codec ( https://github.com/xerial/snappy-java ), which apparently cannot be read by Spark correctly. So how can I make the existing snappy files recognized by Spark? Are there any tools to convert them? Thanks! Regards, Junfeng Chen
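
A possible workaround, sketched under the assumption the files were written with snappy-java's SnappyOutputStream (whose stream format differs from Hadoop's block framing): decompress them with snappy-java and let Spark read the plain JSON. Filenames are placeholders:

import java.io.{FileInputStream, FileOutputStream}
import org.xerial.snappy.SnappyInputStream

// Re-encode one file from snappy-java's stream format to plain text.
val in = new SnappyInputStream(new FileInputStream("data.json.snappy"))
val out = new FileOutputStream("data.json")
val buf = new Array[Byte](8192)
Iterator.continually(in.read(buf)).takeWhile(_ > 0).foreach(n => out.write(buf, 0, n))
in.close()
out.close()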

Spark Jobs ends when assignment not found for Kafka Partition

2018-05-17 Thread Biplob Biswas
Hi, I am having a peculiar problem with our Spark jobs in our cluster, where the Spark job ends with the message: No current assignment for partition iomkafkaconnector-deliverydata-dev-2 We have a setup with 4 Kafka partitions and 4 Spark executors, so each partition should be directl
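
For context, a sketch of the kind of direct-stream setup involved (broker and group.id are hypothetical); the "No current assignment for partition" error typically appears when the driver's consumer tries to seek on a partition it is no longer assigned, e.g. after another consumer joins the same group.id and triggers a rebalance:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val ssc = new StreamingContext(new SparkConf().setAppName("deliverydata"), Seconds(10))

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  // group.id must not be shared with any other consumer, or a rebalance can
  // strip partitions from the driver's consumer mid-job
  "group.id" -> "deliverydata-spark",
  "auto.offset.reset" -> "latest"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent,
  Subscribe[String, String](Set("iomkafkaconnector-deliverydata-dev"), kafkaParams))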