LZO support in Spark 1.0.0 - nothing seems to work

2014-09-17 Thread rogthefrog
I have an HDFS cluster managed with CDH Manager. The version is CDH 5.1 with the matching GPLEXTRAS parcel. LZO works with Hive and Pig, but I can't make it work with Spark 1.0.0. I've tried:
* Setting this: HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS -Djava.library.path=/opt/cloudera

Re: LZO support in Spark 1.0.0 - nothing seems to work

2014-09-17 Thread rogthefrog
That does appear to be the case. Thanks! For posterity, I ran my pyspark like this:

    $ sudo su yarn
    $ pyspark --driver-library-path /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native/
    >>> p = sc.textFile("/some/file")
    >>> p.count()

Everything appears to be working now.
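
For anyone scripting the same launch, here is a minimal sketch of building that command line in Python. The helper names (build_pyspark_command, render) are hypothetical and not part of Spark. The only Spark-specific piece is the --driver-library-path flag used in the message above, and the path is the one quoted there, so adjust it for your own parcel layout.

```python
import shlex

# Path quoted in the message above; an assumption for any other cluster.
NATIVE_LIB_DIR = "/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native/"


def build_pyspark_command(native_lib_dir, executable="pyspark"):
    """Return the argv list that starts pyspark with the given native
    library directory passed via --driver-library-path."""
    return [executable, "--driver-library-path", native_lib_dir]


def render(argv):
    """Join argv into a single shell-safe string for logging or copy/paste."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Prints the command only; it does not launch Spark.
    print(render(build_pyspark_command(NATIVE_LIB_DIR)))
```

The argv list can be passed to subprocess.call as-is, which avoids shell quoting problems if the native library path ever contains spaces.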