Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming/Spark SQL

2016-03-03 Thread Zhun Shen
om: Zhun Shen Sent: Monday, February 29, 2016 11:17 PM To: romain sagean Cc: user Subject: Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming Hi, I checked the dependencies and fixed the bug. It works well on Spark but not on Spark Streaming. So I think I still need to find another way to d

RE: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-29 Thread Silvio Fiorito
PM To: romain sagean <romain.sag...@hupi.fr> Cc: user <user@spark.apache.org> Subject: Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming Hi, I checked the dependencies and fixed the bug. It works well on Spark but not on Spark Streaming. So I think I still need to find

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-29 Thread Zhun Shen
Hi, I checked the dependencies and fixed the bug. It works well on Spark but not on Spark Streaming. So I think I still need to find another way to do it.

> On Feb 26, 2016, at 2:47 PM, Zhun Shen wrote:
>
> Hi,
>
> thanks for your advice. I tried your method. I use Gradle to manage my Scala code.

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-26 Thread Romain Sagean
It seems like some libraries are missing. I'm not good at compiling and I don't know how to use Gradle. But for sbt I use the sbt-assembly plugin (https://github.com/sbt/sbt-assembly) to include all dependencies and make a fat jar. For Gradle I have found this: https://github.com/musketyr/gradle-fatjar-pl
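For readers following the sbt route, the fat-jar setup described above might look roughly like the sketch below. The Spark, Scala, and scala-maxmind-iplookups versions and the resolver are the ones quoted elsewhere in this thread. The project name, the sbt-assembly version, and marking the Spark artifacts as "provided" (so the cluster's own Spark jars are used instead of bundled copies) are assumptions, not something the posters confirmed.

```scala
// project/plugins.sbt
// Plugin version is an assumption; use whichever release matches your sbt.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// build.sbt
name := "geoip-enrich" // hypothetical project name

scalaVersion := "2.10.4"

resolvers += "SnowPlow Repo" at "http://maven.snplow.com/releases/"

libraryDependencies ++= Seq(
  // "provided": supplied by the Spark installation at runtime, kept out of the fat jar
  "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "1.6.0" % "provided",
  // bundled into the fat jar so executors can load the lookup classes
  "com.snowplowanalytics" %% "scala-maxmind-iplookups" % "0.2.0"
)
```

Running `sbt assembly` should then produce a single jar to pass to spark-submit. A NoClassDefFoundError on the executors is often a sign that a dependency was left out of that jar.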

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-25 Thread Zhun Shen
Hi, thanks for your advice. I tried your method. I use Gradle to manage my Scala code, and 'com.snowplowanalytics:scala-maxmind-iplookups:0.2.0' was imported in Gradle. Spark version: 1.6.0, Scala: 2.10.4, scala-maxmind-iplookups: 0.2.0. I ran my test and got the error below: java.lang.NoClassDefFound

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-23 Thread romain sagean
I realize I forgot the sbt part:

resolvers += "SnowPlow Repo" at "http://maven.snplow.com/releases/"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.3.0",
  "com.snowplowanalytics" %% "scala-maxmind-iplookups" % "0.2.0"
)

Otherwise, to process streaming logs I use logsta

Re: Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-23 Thread Romain Sagean
Hi, I use MaxMind GeoIP with Spark (no streaming). To make it work you should use mapPartitions. I don't know if something similar exists for Spark Streaming. My code for reference: def parseIP(ip:String, ipLookups: IpLookups): List[String] = { val lookupResult = ipLookups.performLookups(ip)
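As a rough illustration of the per-partition approach described above, here is a minimal sketch in plain Scala. `GeoLookup`, its toy table, and `enrichPartition` are hypothetical stand-ins, not the IpLookups API from the thread. The point is only the shape: the lookup object is built once per partition, on the executor, and reused for every record.

```scala
// Hypothetical stand-in for a geo-IP reader such as IpLookups. A real reader is
// costly to build and often not serializable, so it is created inside the
// per-partition function rather than on the driver.
final class GeoLookup(dbPath: String) {
  // Toy table instead of a real MaxMind database (illustration only).
  private val table = Map("192.0.2.1" -> "NL", "198.51.100.7" -> "US")
  def country(ip: String): Option[String] = table.get(ip)
}

object GeoEnrich {
  // Builds one GeoLookup per call, then reuses it for every IP in the iterator.
  // The second parameter list has the Iterator[T] => Iterator[U] shape that
  // mapPartitions expects.
  def enrichPartition(dbPath: String)(ips: Iterator[String]): Iterator[(String, String)] = {
    val lookup = new GeoLookup(dbPath)
    ips.map(ip => (ip, lookup.country(ip).getOrElse("unknown")))
  }
}
```

With an RDD[String] of IPs this would be called as `rdd.mapPartitions(it => GeoEnrich.enrichPartition(path)(it))`. DStream also has a `mapPartitions` method with the same function shape, so the same call can be tried on a DStream[String] in Spark Streaming. The database file at `path` has to be readable on every executor.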

Use maxmind geoip lib to process ip on Spark/Spark Streaming

2016-02-23 Thread Zhun Shen
Hi all, Currently, I send nginx logs to Kafka, and then I want to use Spark Streaming to parse the logs and enrich the IP info with GeoIP libs from MaxMind. I found this one https://github.com/Sanoma-CDA/maxmind-geoip2-scala.git , but spark stre