Hi, agree with Holden, have faced quite a few issues with FUSE.
Also, I am trying to understand "spark-submit from local". Are you submitting your Spark jobs from a local laptop, or in local mode from a GCP Dataproc cluster or another system? If you are submitting the job from your local laptop, I would expect performance bottlenecks, depending on your internet bandwidth and the volume of data.

Regards,
Gourav

On Sat, Feb 12, 2022 at 7:12 PM Holden Karau <hol...@pigscanfly.ca> wrote:

> You can also put the GS access jar with your Spark jars; that's what the
> class not found exception is pointing you towards.
>
> On Fri, Feb 11, 2022 at 11:58 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> BTW I also answered you on Stack Overflow:
>>
>> https://stackoverflow.com/questions/71088934/unable-to-access-google-buckets-using-spark-submit
>>
>> HTH
>>
>> view my LinkedIn profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>> On Sat, 12 Feb 2022 at 08:24, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> You are trying to access a Google Cloud Storage bucket (gs://) from
>>> your local host.
>>>
>>> It does not see it because spark-submit assumes that it is a local file
>>> system on the host, which it is not.
>>>
>>> You need to mount the gs:// bucket as a local file system.
>>>
>>> You can use the tool called gcsfuse
>>> https://cloud.google.com/storage/docs/gcs-fuse . Cloud Storage FUSE is
>>> an open source FUSE <http://fuse.sourceforge.net/> adapter that allows
>>> you to mount Cloud Storage buckets as file systems on Linux or macOS
>>> systems.
>>> You can download gcsfuse from here
>>> <https://github.com/GoogleCloudPlatform/gcsfuse>
>>>
>>> Pretty simple.
>>>
>>> It will be installed as /usr/bin/gcsfuse. To mount a bucket, create a
>>> local mount point such as /mnt/gs as root and give other users
>>> permission to use it.
>>>
>>> Then, as a normal user (not root) who needs to access the gs:// bucket,
>>> use gcsfuse to mount it. For example, I am mounting a GCS bucket called
>>> spark-jars-karan here. Just use the bucket name itself:
>>>
>>>     gcsfuse spark-jars-karan /mnt/gs
>>>
>>> Then you can refer to it as /mnt/gs in spark-submit from the
>>> on-premises host:
>>>
>>>     spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.0 \
>>>       --jars /mnt/gs/spark-bigquery-with-dependencies_2.12-0.23.2.jar
>>>
>>> HTH
>>>
>>> On Sat, 12 Feb 2022 at 04:31, karan alang <karan.al...@gmail.com> wrote:
>>>
>>>> Hello All,
>>>>
>>>> I'm trying to access GCP buckets while running spark-submit from local,
>>>> and I'm running into issues.
>>>>
>>>> I'm getting this error:
>>>> ```
>>>> 22/02/11 20:06:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> Exception in thread "main" org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "gs"
>>>> ```
>>>> I tried adding
>>>>
>>>>     --conf spark.hadoop.fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
>>>>
>>>> to the spark-submit command, but then I get a ClassNotFoundException.
>>>>
>>>> Details are on Stack Overflow:
>>>>
>>>> https://stackoverflow.com/questions/71088934/unable-to-access-google-buckets-using-spark-submit
>>>>
>>>> Any ideas on how to fix this?
>>>> tia!
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
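
For reference, Holden's suggestion (ship the GCS connector jar with the job instead of mounting the bucket) could be sketched roughly as below. This is only an illustration in Python that builds the spark-submit argument list. The helper name build_spark_submit, the jar path and the application file are hypothetical. Only the --jars option and the fs.gs.impl class name come from the thread. Credentials are not covered here and would have to be passed separately, for example through extra_conf.

```python
import shlex
from typing import Dict, List, Optional

# File system class name as quoted in the original question.
GCS_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"


def build_spark_submit(
    app: str,
    connector_jar: str,
    extra_conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build a spark-submit argv that ships the GCS connector jar with the job.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    argv = [
        "spark-submit",
        # Put the connector jar on the driver and executor classpaths so the
        # class named in fs.gs.impl can actually be loaded.
        "--jars", connector_jar,
        # Register the implementation for the gs:// scheme.
        "--conf", f"spark.hadoop.fs.gs.impl={GCS_FS_IMPL}",
    ]
    # Any additional settings (for example authentication) are passed through as-is.
    for key, value in (extra_conf or {}).items():
        argv += ["--conf", f"{key}={value}"]
    argv.append(app)
    return argv


if __name__ == "__main__":
    # Placeholder paths; substitute your own jar location and application.
    cmd = build_spark_submit("my_job.py", "/opt/jars/gcs-connector.jar")
    print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) on a machine where spark-submit is on the PATH. Printing it with shlex.join gives a command line you can paste into a shell.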