Logstash to collect Spark logs

2016-05-20 Thread Ashish Kumar Singh
We are trying to collect Spark logs using Logstash so that we can parse the application logs and extract useful information. We can read the NodeManager logs but are unable to read the Spark application logs using Logstash. Current setup for Spark logs and Logstash: 1- Spark runs on YARN. 2- Using log4j SocketAppenders to wr

Spark log collection via Logstash

2016-05-19 Thread Ashish Kumar Singh
We are trying to collect Spark logs using Logstash so that we can parse the application logs and extract useful information. We can read the NodeManager logs but are unable to read the Spark application logs using Logstash. Current setup for Spark logs and Logstash: 1- Spark runs on YARN. 2- Using log4j SocketAppenders to wr

Reading Nested Fields in DataFrames

2015-05-11 Thread Ashish Kumar Singh
Hi, I am trying to read nested Avro data in Spark 1.3 using DataFrames. I need help retrieving the inner element data in the structure below. This is the schema printed by df.printSchema:
 |-- THROTTLING_PERCENTAGE: double (nullable = false)
 |-- IMPRESSION_TYPE: string (nullable = false)

ImportError: No module named pyspark, when running pi.py

2015-02-09 Thread Ashish Kumar
*Command:* sudo python ./examples/src/main/python/pi.py
*Error:*
Traceback (most recent call last):
  File "./examples/src/main/python/pi.py", line 22, in <module>
    from pyspark import SparkContext
ImportError: No module named pyspark
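The error above means the interpreter cannot find the pyspark package on its module search path. A minimal sketch of one possible workaround follows, assuming a SPARK_HOME environment variable points at the unpacked Spark distribution and that py4j ships as a source zip under python/lib, as it does in Spark binary releases. The helper names pyspark_paths and add_pyspark_to_path are hypothetical and are not part of Spark.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the entries that need to be importable for `import pyspark`.

    Assumes the layout of a Spark binary release: <spark_home>/python holds
    the pyspark package, and <spark_home>/python/lib holds a py4j source zip
    whose version number varies between releases.
    """
    python_dir = os.path.join(spark_home, "python")
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_pyspark_to_path(spark_home=None):
    """Prepend the pyspark entries to sys.path (hypothetical helper)."""
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    for entry in pyspark_paths(spark_home):
        if entry not in sys.path:
            sys.path.insert(0, entry)
    return spark_home
```

Calling add_pyspark_to_path() before `from pyspark import SparkContext` should make the import resolvable under these assumptions. Launching the script with bin/spark-submit from the Spark directory is the more usual route, since it sets up the Python path itself. Note also that sudo commonly resets environment variables, so a PYTHONPATH exported in the user's shell may not be visible to `sudo python`.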

Error when running example (pi.py)

2015-02-08 Thread Ashish Kumar
Traceback (most recent call last):
  File "pi.py", line 29, in <module>
    sc = SparkContext(appName="PythonPi")
  File "/home/ashish/Downloads/spark-1.1.0-bin-hadoop2.4/python/pyspark/context.py", line 104, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/home/ashish/Downl