alanwake created ZEPPELIN-4946:
----------------------------------

             Summary: zeppelin server failed to connect spark interpreter on k8s
                 Key: ZEPPELIN-4946
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-4946
             Project: Zeppelin
          Issue Type: Bug
          Components: zeppelin-server
    Affects Versions: 0.9.0
         Environment: zeppelin:0.9.0

k8s: 1.16.2

spark: spark-py:3.0-2.7

Here is my development environment:

1.  A standalone Spark cluster on k8s; the master service is exposed at 
spark://master-0.spark-master.spark.svc.cluster.local:7077

2.  Zeppelin is deployed in the zeppelin namespace.
{code:java}
# deployment.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: zeppelin-server-conf-map
  namespace: zeppelin
data:
  # 'serviceDomain' is a Domain name to use for accessing Zeppelin UI.
  # Should point IP address of 'zeppelin-server' service.
  #
  # Wildcard subdomain need to be point the same IP address to access service
  # inside of Pod (such as SparkUI).
  # i.e. if service domain is 'local.zeppelin-project.org', DNS configuration
  # should make 'local.zeppelin-project.org' and '*.local.zeppelin-project.org'
  # point the same address.
  #
  # Default value is 'local.zeppelin-project.org' while it points 127.0.0.1 and
  # `kubectl port-forward zeppelin-server` will give localhost to connects.
  # If you have your ingress controller configured to connect to
  # `zeppelin-server` service and have a domain name for it (with wildcard
  # subdomain point the same address), you can replace serviceDomain field with
  # your own domain.
  #SERVICE_DOMAIN: zeppelin-server.zeppelin.svc.cluster.local:8080
  SERVICE_DOMAIN: local.zeppelin-project.org:8080
  ZEPPELIN_K8S_SPARK_CONTAINER_IMAGE: spark-py:3.0-2.7
  ZEPPELIN_K8S_CONTAINER_IMAGE: apache/zeppelin:0.9.0
  ZEPPELIN_HOME: /zeppelin
  ZEPPELIN_SERVER_RPC_PORTRANGE: 12320:12320
  # default value of 'master' property for spark interpreter.
  #SPARK_MASTER: k8s://https://kubernetes.default.svc
  SPARK_MASTER: spark://master-0.spark-master.spark.svc.cluster.local:7077
  # default value of 'SPARK_HOME' property for spark interpreter.
  SPARK_HOME: /spark
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zeppelin
  namespace: zeppelin
  labels:
    app: zeppelin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: zeppelin
  template:
    metadata:
      labels:
        app: zeppelin
    spec:
      nodeSelector:
        role: worker
      containers:
      - name: zeppelin
        image: apache/zeppelin:0.9.0
        securityContext:
          runAsUser: 0
        envFrom:
        - configMapRef:
            name: zeppelin-server-conf-map
        ports:
        - containerPort: 8080
          name: web
        - containerPort: 12320
          name: rpc
        resources:
          requests:
            cpu: 0.2
            memory: 200m
        volumeMounts:
          - name: podyaml
            mountPath: /zeppelin/k8s/interpreter
      volumes:
      - name: podyaml
        hostPath:
          path: /datadisk/nfs/zeppelin/k8s/interpreter/

{code}
{code:java}

// 100-interpreter-spec.yaml
This may be a bug:
-c {{zeppelin.k8s.server.rpc.service}} does not work; the value is empty.
So I replaced it with the hard-coded -c zeppelin-server.zeppelin.svc.cluster.local

{code}
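
The empty `-c` argument can be illustrated with a minimal sketch. It assumes the spec is rendered by a template step that replaces an unset property with an empty string; that behaviour is an assumption here and is not confirmed by the logs. The `render` and `missing` helpers and the `port` placeholder are hypothetical names used only for illustration; they are not Zeppelin APIs.

```python
import re

# Matches placeholders such as {{zeppelin.k8s.server.rpc.service}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, props: dict) -> str:
    """Substitute each placeholder; unset properties become an empty string."""
    return PLACEHOLDER.sub(lambda m: str(props.get(m.group(1), "")), template)


def missing(template: str, props: dict) -> list:
    """Return the placeholder names that have no (or an empty) value in props."""
    return sorted({name for name in PLACEHOLDER.findall(template)
                   if not props.get(name)})


if __name__ == "__main__":
    # Hypothetical fragment of an interpreter spec; only the -c placeholder
    # comes from the report, the port placeholder is made up.
    line = "-c {{zeppelin.k8s.server.rpc.service}} -p {{port}}"
    props = {"port": 12320}
    print(repr(render(line, props)))
    print(missing(line, props))
```

Running a check like `missing` over the spec before the interpreter pod is created would point at the unset property directly, instead of the interpreter starting with no callback host.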
 
{code:java}
kind: Service
apiVersion: v1
metadata:
  name: zeppelin-server
  namespace: zeppelin
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30080
      name: web
    - port: 12320
      name: rpc            # port name is referenced in the code. So it shouldn't be changed.
  selector:
    app: zeppelin

{code}
            Reporter: alanwake
         Attachments: 1.txt, 2.txt

Hello, and please help!

I am new here and unfamiliar with Java projects. The logs show nothing about 
the remote address.

 

 
{code:java}
[root@master zeppelin]# kubectl get pods -n=zeppelin -o=wide
NAME                       READY   STATUS    RESTARTS   AGE     IP            NODE                   NOMINATED NODE   READINESS GATES
spark-hpvbft               1/1     Running   0          13s     10.244.1.23   node01.51vrk8s.local   <none>           <none>
zeppelin-df54795fb-wddqs   1/1     Running   0          5m50s   10.244.1.22   node01.51vrk8s.local   <none>           <none>
{code}
 
{code:java}
[root@master ~]# kubectl logs spark-hpvbft -n=zeppelin
 INFO [2020-07-10 04:19:37,576] ({FIFOScheduler-interpreter_1257482730-Worker-1} Logging.scala[logInfo]:57) - Initialized BlockManager: BlockManagerId(driver, spark-hpvbft, 36344, None)
 INFO [2020-07-10 04:19:37,681] ({FIFOScheduler-interpreter_1257482730-Worker-1} ContextHandler.java[doStart]:855) - Started o.s.j.s.ServletContextHandler@69b63f8c{/metrics/json,null,AVAILABLE,@Spark}
 INFO [2020-07-10 04:19:37,754] ({FIFOScheduler-interpreter_1257482730-Worker-1} BaseSparkScalaInterpreter.scala[spark2CreateContext]:293) - Created Spark session (without Hive support)
 INFO [2020-07-10 04:19:41,316] ({FIFOScheduler-interpreter_1257482730-Worker-1} SparkShims.java[loadShims]:61) - Initializing shims for Spark 3.x
 INFO [2020-07-10 04:19:42,727] ({FIFOScheduler-interpreter_1257482730-Worker-1} AbstractScheduler.java[runJob]:152) - Job 20150210-015259_1403135953 finished by scheduler interpreter_1257482730

{code}
See attachment 1.txt for details.

 
{code:java}

[root@master ~]# kubectl logs zeppelin-df54795fb-wddqs -n=zeppelin
INFO [2020-07-10 04:19:34,427] ({SchedulerFactory2} RemoteInterpreter.java[call]:141) - Open RemoteInterpreter org.apache.zeppelin.spark.SparkInterpreter
 INFO [2020-07-10 04:19:34,427] ({SchedulerFactory2} RemoteInterpreter.java[pushAngularObjectRegistryToRemote]:431) - Push local angular object registry from ZeppelinServer to remote interpreter group spark-shared_process
 WARN [2020-07-10 04:19:42,736] ({SchedulerFactory2} NotebookServer.java[onStatusChange]:1901) - Job 20150210-015259_1403135953 is finished, status: ERROR, exception: null, result: %text warning: there was one deprecation warning (since 2.0.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
java.net.ConnectException: Connection refused (Connection refused)
  at java.net.PlainSocketImpl.socketConnect(Native Method)
{code}
See attachment 2.txt for details.
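
Since the logs do not show which remote address was refused, one way to narrow it down is to probe the candidate endpoints by hand. The following is a minimal sketch, not part of Zeppelin; the `probe` helper is a hypothetical name, and it assumes Python is available inside the pod it is run from. It only classifies the result of a plain TCP connect.

```python
import socket
import sys


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connect and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all, e.g. a dropped packet or a network policy.
        return "timeout"
    except socket.gaierror:
        # The host name could not be resolved.
        return "dns-failure"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python probe.py HOST PORT
    host, port = sys.argv[1], int(sys.argv[2])
    print(f"{host}:{port} -> {probe(host, port)}")
```

For example, running it from the spark-hpvbft pod against zeppelin-server.zeppelin.svc.cluster.local and port 12320 (the values from the config above) would show whether the interpreter can reach the server's RPC port. The port that the interpreter itself listens on does not appear in the logs above, so it has to be looked up before probing in the other direction.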



--
This message was sent by Atlassian Jira
(v8.3.4#803005)