Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming/Spark SQL

2016-03-03 Thread Zhun Shen
From: Zhun Shen Sent: Monday, February 29, 2016 11:17 PM To: romain sagean Cc: user Subject: Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming Hi, I checked the dependencies and fixed the bug. It works well on Spark but not on Spark Streaming. So I think I still need to find another way to d

Get Offset when using Spark Streaming + Kafka

2016-03-06 Thread Zhun Shen
Hi, I use KafkaUtils.createDirectStream to consume data from Kafka, but I found that ZooKeeper-based Kafka monitoring tools cannot show the progress of the streaming application because createDirectStream saves the offsets in checkpoints (http://spark.apache.org/docs/latest/streaming-kafka-integra

Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-23 Thread Zhun Shen
Hi all, Currently, I send nginx logs to Kafka, and I want to use Spark Streaming to parse the logs and enrich the IP info with the GeoIP libs from MaxMind. I found this one https://github.com/Sanoma-CDA/maxmind-geoip2-scala.git , but spark stre

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-25 Thread Zhun Shen
rElse(None).toString
>> val longitude =
>> (lookupResult._1).map(_.longitude).getOrElse(None).toString
>> return List(countryName, city, latitude, longitude)
>> }
>> sc.addFile("/home/your_user/GeoLiteCity.dat")
>>
>> //l

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-29 Thread Zhun Shen
Hi, I checked the dependencies and fixed the bug. It works well on Spark but not on Spark Streaming. So I think I still need to find another way to do it.

> On Feb 26, 2016, at 2:47 PM, Zhun Shen wrote:
>
> Hi,
>
> thanks for your advice. I tried your method, I use Gradle to manage m

Got error "java.lang.IllegalAccessError" when using HiveContext in Spark shell on AWS

2014-08-07 Thread Zhun Shen
Hi, when I tried to use HiveContext in the Spark shell on AWS, I got the error "java.lang.IllegalAccessError: tried to access method com.google.common.collect.MapMaker.makeComputingMap(Lcom/google/common/base/Function;)Ljava/util/concurrent/ConcurrentMap". I followed the steps below to compile and instal

Re: Got error "java.lang.IllegalAccessError" when using HiveContext in Spark shell on AWS

2014-08-07 Thread Zhun Shen
.
--
Zhun Shen
Data Mining at LightnInTheBox.com
Email: shenzhunal...@gmail.com | shenz...@yahoo.com
Phone: 186 0627 7769
GitHub: https://github.com/shenzhun
LinkedIn: http://www.linkedin.com/in/shenzhun

On August 7, 2014 at 6:57:06 PM, Cheng Lian (lian.cs@gmail.com) wrote:
Hey Zhun, Thanks

Move Spark configuration from SPARK_CLASSPATH to spark-defaults.conf, HiveContext went wrong with "Class com.hadoop.compression.lzo.LzoCodec not found"

2014-09-17 Thread Zhun Shen
66 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
        ... 68 more