Hi,

I am getting the following error:


"java.lang.UnsupportedOperationException: sun.misc.Unsafe or
java.nio.DirectByteBuffer.<init>(long, int) not available"


This happens when reading from a Google BigQuery (GBQ) table on a Kubernetes
cluster whose Docker image is built on Debian buster.


The Debian release in the Docker image is:

root@ccf3ac45d0ed:/opt/spark/work-dir# cat /etc/*-release

PRETTY_NAME="Debian GNU/Linux 10 (buster)"


And the Java version is OpenJDK 11:


echo $JAVA_HOME

/usr/local/openjdk-11


According to the Spark 3.1.2 documentation <https://spark.apache.org/docs/latest/>:


"*For Java 11*, -Dio.netty.tryReflectionSetAccessible=true is required
additionally for Apache Arrow library. This prevents
*java.lang.UnsupportedOperationException:
sun.misc.Unsafe or java.nio.DirectByteBuffer.(long, int) not available* when
Apache Arrow uses Netty internally."


So I have used it as follows:

        spark-submit --verbose \
           --properties-file ${property_file} \
           --master k8s://https://$KUBERNETES_MASTER_IP:443 \
           --deploy-mode cluster \
           --name pytest \
           --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./pyspark_venv/bin/python \
           --py-files $CODE_DIRECTORY/DSBQ.zip \
           --conf spark.kubernetes.namespace=$NAMESPACE \
           --conf spark.executor.memory=5000m \
           --conf spark.network.timeout=300 \
           --conf spark.executor.instances=2 \
           --conf spark.kubernetes.driver.limit.cores=1 \
           --conf spark.driver.cores=1 \
           --conf spark.executor.cores=1 \
           --conf spark.executor.memory=2000m \
           --conf spark.kubernetes.driver.docker.image=${IMAGEGCP} \
           --conf spark.kubernetes.executor.docker.image=${IMAGEGCP} \
           --conf spark.kubernetes.container.image=${IMAGEGCP} \
           --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-bq \
           --conf spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
           --conf spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true" \
           $CODE_DIRECTORY/${APPLICATION}
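
For reference, here is a minimal Python sketch of how the same Netty flag
can be assembled for both the driver and the executor JVMs as spark-submit
arguments. The helper name netty_conf_args is hypothetical and not part of
Spark; only the two property names and the flag come from the command in
this message.

```python
import shlex
from typing import List

# The JVM flag quoted from the Spark documentation for Java 11.
NETTY_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"


def netty_conf_args(flag: str = NETTY_FLAG) -> List[str]:
    """Return --conf arguments that pass `flag` to driver and executor JVMs.

    Hypothetical helper for illustration; it only builds strings.
    """
    args: List[str] = []
    for role in ("driver", "executor"):
        args += ["--conf", f"spark.{role}.extraJavaOptions={flag}"]
    return args


if __name__ == "__main__":
    # Print the fragment in a shell-safe form so it can be compared
    # with the command actually being submitted.
    print(shlex.join(netty_conf_args()))
```

Building the arguments in one place makes it easier to see that the driver
and the executor receive exactly the same option string.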

However, some comments on Stack Overflow
<https://stackoverflow.com/questions/62109276/errorjava-lang-unsupportedoperationexception-for-pyspark-pandas-udf-documenta>
mention that these options need to be set before spark-submit runs, so I
also added them to $SPARK_HOME/conf/spark-defaults.conf:


185@b272bbf663e6:/opt/spark/conf$ cat spark-defaults.conf

spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"

spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
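
As a sanity check on what these two lines contain, here is a small Python
sketch that parses the file content as plain key/value pairs and reports
values that keep their surrounding double quotes. It assumes that the
quotes are not stripped when the file is read; I have not verified how
Spark treats them in this image, so this is a diagnostic aid and not a
confirmed cause. The function names parse_properties and quoted_values are
hypothetical.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key=value' or 'key value' lines.

    Blank lines and '#' comments are skipped. Assumption: quotes are kept
    as part of the value, as in plain properties-style parsing.
    """
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' or whitespace, whichever comes first.
        for i, ch in enumerate(line):
            if ch == "=" or ch.isspace():
                props[line[:i]] = line[i + 1:].strip()
                break
    return props


def quoted_values(props: Dict[str, str]) -> Dict[str, str]:
    """Return the entries whose value starts and ends with a double quote."""
    return {
        k: v for k, v in props.items()
        if len(v) >= 2 and v[0] == v[-1] == '"'
    }


# The two lines from spark-defaults.conf shown in this message.
SAMPLE = '''
spark.driver.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
spark.executor.extraJavaOptions="-Dio.netty.tryReflectionSetAccessible=true"
'''

if __name__ == "__main__":
    for key, value in quoted_values(parse_properties(SAMPLE)).items():
        print(f"{key}: value keeps literal quotes -> {value}")
```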

But I'm still getting the same error!


Any ideas will be appreciated.


Mich



