Re: Drop-In Virtual Office Half-Hour

2021-09-13 Thread Mich Talebzadeh
Would be interested in a reference to the K8 work. Can you please drop a brief sentence on what that will entail (assuming Volcano etc.)? Thanks

Datasource v2 can not prune file source partitions when readDataSchema is empty

2021-09-13 Thread Heng Su
Hi, community: We use Spark 3.1.2. In the PruneFileSourcePartitions rule, FileScan::withFilters is called to push the partition pruning filter (and this is the only place this function can be called), but it has the constraint “scan.readDataSchema.nonEmpty” (https://github.com/apache/spark/blob/de3

Re: Drop-In Virtual Office Half-Hour

2021-09-13 Thread Holden Karau
Hmmm ok, the Google calendar share link is still being difficult, sorry y’all. It’s going to be Monday at 2:30pm Pacific time: Holden - OSS Virtual Office Half Hour, Drop Ins Welcome :) Monday, Sep 20 · 2:30–3 PM. Google Meet video call link: https://meet.google.com/ccd-mkbd-gfv

Re: Drop-In Virtual Office Half-Hour

2021-09-13 Thread Holden Karau
Ah thanks for pointing that out. I changed the visibility on it to public so it should work now.

Re: Drop-In Virtual Office Half-Hour

2021-09-13 Thread Gourav Sengupta
Hi Holden, This is such a wonderful opportunity. Sadly when I click on the link it says event not found. Regards, Gourav

Drop-In Virtual Office Half-Hour

2021-09-13 Thread Holden Karau
Hi Folks, I'm going to experiment with a drop-in virtual half-hour office hour type thing next Monday, if you've got any burning Spark or general OSS questions you haven't had the time to ask anyone else I hope you'll swing by and join me. If no one comes with questions I'll tour some of the Spark

Spark-3.1.2 r-images can not be built

2021-09-13 Thread Yehor Kryvokon
Hi all, I can't build the Spark-3.1.2 R image: bin/docker-image-tool.sh -r myrepo -R resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile -n build I get an error when the script tries to install R: Reading package lists... Building dependency tree... Reading state informat

[Spark Core] saveAsTextFile is unable to rename a directory using hadoop-azure NativeAzureFileSystem

2021-09-13 Thread Abhishek Jindal
Hello, I am trying to use the Spark rdd.saveAsTextFile function which calls the FileSystem.rename() under the hood. This errors out with “com.microsoft.azure.storage.StorageException: One of the request inputs is not valid” when using hadoop-azure NativeAzureFileSystem. I have written a small test

[Announcement] Zingg fuzzy matching for entity resolution, deduplication and data mastering

2021-09-13 Thread Sonal Goyal
Hi All, Super stoked to announce open sourcing Zingg, a Spark based tool to build unified customer and supplier profiles and remove duplicates. More details at https://github.com/zinggAI/zingg I do hope some of you will find it useful. Cheers, Sonal https://github.com/zinggAI/zingg