Hi Ted,

  Thanks for your prompt reply.

  I am afraid clearing the DNS cache did not help. I ran

  sudo /etc/init.d/dnsmasq restart

on both nodes I am using (I do not have nscd installed), but I am still
getting the same error. I am launching the master from 172.26.49.156, whose
old name was IMPETUS-1466, launching one worker from each of 172.26.49.156
and 172.26.49.55, and launching the app through ./bin/pyspark from
172.26.49.55. The detailed stack trace is below.
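As a sanity check before digging further (a sketch; substitute your own node names for the placeholder), resolution can be verified with getent, which consults /etc/hosts and DNS the same way the system resolver does:

```shell
# Sketch: check whether a given name resolves on this node.
# Replace "localhost" with the name in question (e.g. IMPETUS-1466
# or the new hostname) and run on each node in the cluster.
name=localhost
if getent hosts "$name" > /dev/null; then
    echo "$name resolves"
else
    echo "$name does NOT resolve"
fi
hostname -f    # the fully-qualified name this node reports for itself
```

If the old name still resolves on the driver but not on a worker (or vice versa), that points at which node's /etc/hosts or DNS view is stale.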

Exception in user code:
Traceback (most recent call last):
  File "/home/impadmin/bibudh/healthcare/code/cloudera_challenge/analyze_anomaly_with_spark.py", line 121, in anom_with_lr
    pat_proc = pycsv.csvToDataFrame(sqlContext, plaintext_rdd, sep = ",")
  File "/tmp/spark-0fe22b7c-da8a-4971-8fcf-20b43829504b/userFiles-d9a3c3ae-20d4-4476-8026-a225dd746dc4/pyspark_csv.py", line 53, in csvToDataFrame
    column_types = evaluateType(rdd_sql, parseDate)
  File "/tmp/spark-0fe22b7c-da8a-4971-8fcf-20b43829504b/userFiles-d9a3c3ae-20d4-4476-8026-a225dd746dc4/pyspark_csv.py", line 179, in evaluateType
    return rdd_sql.map(getRowType).reduce(reduceTypes)
  File "/home/impadmin/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 797, in reduce
    vals = self.mapPartitions(func).collect()
  File "/home/impadmin/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
    port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "/home/impadmin/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/impadmin/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/home/impadmin/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 11, IMPETUS-1466): java.lang.IllegalArgumentException: java.net.UnknownHostException: IMPETUS-1466
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:653)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:389)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:212)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: IMPETUS-1466
... 38 more
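Since the failure goes through NameNodeProxies.createNonHAProxy, it is the HDFS client on the executor that is still trying to reach a NameNode by the old name, so the stale reference is likely in a Hadoop or Spark config file rather than in DNS. A sketch of where I could search (the directories below are common defaults and my guesses — adjust them to the actual installation):

```shell
# Sketch: search the files the driver and executors read for the old name.
# fs.defaultFS in core-site.xml is a prime suspect; spark-env.sh and the
# slaves file are also worth checking. Paths are assumptions.
grep -rn "IMPETUS-1466" \
    /etc/hosts \
    "${HADOOP_CONF_DIR:-/etc/hadoop/conf}" \
    "${SPARK_HOME:-/home/impadmin/spark-1.6.0-bin-hadoop2.6}/conf" \
    2>/dev/null
echo "search finished"
```

Note that the JVM also caches DNS lookups internally (see the networkaddress.cache.ttl security property), so restarting the Spark master, workers, and any long-running JVMs after fixing the configs may be necessary for the change to take effect.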

On Tue, Apr 12, 2016 at 5:53 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> FYI
>
>
> https://documentation.cpanel.net/display/CKB/How+To+Clear+Your+DNS+Cache#HowToClearYourDNSCache-MacOS®10.10
> https://www.whatsmydns.net/flush-dns.html#linux
>
> On Tue, Apr 12, 2016 at 2:44 PM, Bibudh Lahiri <bibudhlah...@gmail.com>
> wrote:
>
>> Hi,
>>
>>     I am trying to run a piece of code with logistic regression on
>> PySpark. I’ve run it successfully on my laptop, and I had previously run
>> it in standalone cluster mode, but the name of the server on
>> which I am running it was changed in between (the old name was
>> "IMPETUS-1466") by the admin. Now, when I am trying to run, it is
>> throwing the following error:
>>
>> File "/home/impadmin/Nikunj/spark-1.6.0/python/lib/pyspark.zip/pyspark/sql/utils.py", line 53, in deco
>>
>>     raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
>>
>> pyspark.sql.utils.IllegalArgumentException: u'java.net.UnknownHostException: IMPETUS-1466.
>>
>>    I have changed a few configuration files and /etc/hosts, regenerated
>> the SSH keys, and updated .ssh/known_hosts and .ssh/authorized_keys, but
>> this is still not resolved. Can someone please point out where this name
>> is being picked up from?
>>
>> --
>> Bibudh Lahiri
>> Data Scientist, Impetus Technologies
>> 5300 Stevens Creek Blvd
>> San Jose, CA 95129
>> http://knowthynumbers.blogspot.com/
>>
>>
>
>


-- 
Bibudh Lahiri
Data Scientist, Impetus Technologies
5300 Stevens Creek Blvd
San Jose, CA 95129
http://knowthynumbers.blogspot.com/
