Re: Write DataFrame with Partition and choose Filename in PySpark

2023-05-05 Thread Marco Costantini

Re: Write DataFrame with Partition and choose Filename in PySpark

2023-05-04 Thread Marco Costantini
*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

Write DataFrame with Partition and choose Filename in PySpark

2023-05-04 Thread Marco Costantini
Hello, I am testing writing my DataFrame to S3 using the DataFrame `write` method. It mostly does a great job. However, it fails one of my requirements. Here are my requirements:
- Write to S3
- Use `partitionBy` to automatically make folders based on my chosen partition columns
- Control the resulting filename...
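Spark's `DataFrameWriter` does not expose a way to choose the part-file name itself. A common workaround (a sketch under assumptions, not a Spark API) is to write with `partitionBy` as usual and then rename the generated part files afterwards. The helper below is a hypothetical plain-Python sketch of that rename step for a local path; for S3 the same idea would use copy-and-delete (e.g. via boto3) instead of `os.rename`. It assumes one part file per partition folder (e.g. after `coalesce(1)`).

```python
import os


def rename_part_files(output_dir: str, new_name: str) -> list:
    """Rename each partition folder's single part file to `new_name`.

    Walks the partitionBy-style layout (e.g. output_dir/user_id=1/part-0000-...)
    and returns the sorted list of new paths. Assumes one part file per folder;
    with several, later renames would overwrite earlier ones.
    """
    renamed = []
    for root, _dirs, files in os.walk(output_dir):
        for f in files:
            if f.startswith("part-"):
                target = os.path.join(root, new_name)
                os.rename(os.path.join(root, f), target)
                renamed.append(target)
    return sorted(renamed)
```

One would call this right after something like `df.coalesce(1).write.partitionBy("user_id").json(output_dir)`; the function and its name are illustrative, not part of any library.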

Re: Write custom JSON from DataFrame in PySpark

2023-05-04 Thread Marco Costantini
|  3| {a3}|
+---+-----+

df2.write.json("data.json")
{"id":1,"stuff":{"datA":"a1"}}
{"id":2,"stuff":{"datA":"a2"}}
{"id":3,"stuff":{"datA":"a3"}}

Write custom JSON from DataFrame in PySpark

2023-05-03 Thread Marco Costantini
Hello, Let's say I have a very simple DataFrame, as below.

+---+----+
| id|datA|
+---+----+
|  1|  a1|
|  2|  a2|
|  3|  a3|
+---+----+

Let's say I have a requirement to write this to a bizarre JSON structure. For example:

{ "id": 1, "stuff": { "datA": "a1" } }

How can I achieve this...
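One way this is commonly done in PySpark is to wrap the nested columns with `pyspark.sql.functions.struct` before writing, roughly `df.select("id", F.struct("datA").alias("stuff")).write.json(path)`. To keep the sketch self-contained and runnable without a Spark session, here is the same reshaping in plain Python; the row data mirrors the example above, and the `nest` helper is a hypothetical name.

```python
import json

# Flat rows, matching the example DataFrame above.
rows = [{"id": 1, "datA": "a1"}, {"id": 2, "datA": "a2"}, {"id": 3, "datA": "a3"}]


def nest(row: dict) -> str:
    """Wrap the datA column under a "stuff" key, matching the target JSON shape."""
    return json.dumps({"id": row["id"], "stuff": {"datA": row["datA"]}})


lines = [nest(r) for r in rows]
# lines[0] -> {"id": 1, "stuff": {"datA": "a1"}}
```

In Spark the struct approach does this per-partition with no driver-side loop, which is why it is usually preferred over collecting and serializing by hand.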

Re: What is the best way to organize a join within a foreach?

2023-04-26 Thread Marco Costantini

Re: What is the best way to organize a join within a foreach?

2023-04-25 Thread Marco Costantini
|Mich| 50007| Mich's 7th order|107.11| 107.11|
|Mich| 50008| Mich's 8th order|108.11| 108.11|
|Mich| 50009| Mich's 9th order|109.11| 109.11|
|Mich| 50010|Mich's 10th order|210.11| 210.11|
+----+------+-----------------+------+-------+

Re: What is the best way to organize a join within a foreach?

2023-04-25 Thread Marco Costantini

Re: What is the best way to organize a join within a foreach?

2023-04-25 Thread Marco Costantini

What is the best way to organize a join within a foreach?

2023-04-24 Thread Marco Costantini
I have two tables: {users, orders}. In this example, let's say that for each 1 User in the users table, there are 10 Orders in the orders table. I have to use pyspark to generate a statement of Orders for each User. So, a single user will need his/her own list of Orders. Additionally, I need t
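A pattern often suggested instead of joining inside a `foreach` is to do the join and grouping once up front (in PySpark, roughly `users.join(orders, on=..., how=...)` followed by `groupBy(...).agg(collect_list(...))` — column names here are hypothetical) and then iterate over the already-grouped rows. A plain-Python sketch of the resulting shape, with made-up sample data:

```python
# Sample data standing in for the users and orders tables.
users = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [
    {"user_id": 1, "amount": 10.0},
    {"user_id": 1, "amount": 20.0},
    {"user_id": 2, "amount": 5.0},
]


def orders_per_user(users: list, orders: list) -> dict:
    """Group orders by user id: the shape a join + groupBy/collect_list yields.

    Returns {user_name: [order, ...]}, so each user's statement can be built
    from a single pre-grouped record instead of a per-user lookup.
    """
    by_user = {}
    for o in orders:
        by_user.setdefault(o["user_id"], []).append(o)
    return {u["name"]: by_user.get(u["id"], []) for u in users}
```

Doing the join once and iterating over grouped rows avoids launching a lookup (or worse, a distributed join) per element inside `foreach`, which is the usual performance trap in this situation.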

What is the best way to organize a join within a foreach?

2023-04-24 Thread Marco Costantini
I have two tables: {users, orders}. In this example, let's say that for each 1 User in the users table, there are 10 Orders in the orders table. I have to use pyspark to generate a statement of Orders for each User. So, a single user will

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Marco Costantini
...Shivaram On Wed, Apr 9, 2014 at 9:12 AM, Marco Costantini <silvio.costant...@granatads.com> wrote: Ah, tried that. I believe this is an HVM AMI? We are exploring paravirtual AMIs. On Wed, Apr 9, 2014 at 11:17 AM, Nichola...

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Marco Costantini
...will default to it. On Wed, Apr 9, 2014 at 11:08 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote: Marco, If you call spark-ec2 launch without specifying an AMI, it will default to the Spark-provided AMI...

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Marco Costantini
...as far as I can tell. Shivaram On Tue, Apr 8, 2014 at 8:50 AM, Marco Costantini <silvio.costant...@granatads.com> wrote: I was able to keep the "workaround" ...around... by overwriting the generated '/root/.ssh/authorized_keys'...

Re: AWS Spark-ec2 script with different user

2014-04-08 Thread Marco Costantini
I was able to keep the "workaround" ...around... by overwriting the generated '/root/.ssh/authorized_keys' file with a known good one, in the '/etc/rc.local' file. On Tue, Apr 8, 2014 at 10:12 AM, Marco Costantini <silvio.costant...@granatads.com> wrote: ...
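The workaround described here can be sketched as an `/etc/rc.local` fragment that restores the key file on every boot. The backup path `authorized_keys.good` is a hypothetical name for the "known good" copy; adjust to wherever that copy is actually kept.

```shell
# /etc/rc.local -- restore a known-good authorized_keys on every boot
# (hypothetical backup path; the real location is whatever you saved)
cp /root/.ssh/authorized_keys.good /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
```

Since rc.local runs after the cloud-init/AMI scripts that regenerate the file, the known-good keys end up winning.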

Re: AWS Spark-ec2 script with different user

2014-04-08 Thread Marco Costantini
3. I believe HVMs still work. But it would be valuable to the community to know whether the root-user workaround does or doesn't work anymore for paravirtual instances. Thanks, Marco. On Tue, Apr 8, 2014 at 9:51 AM, Marco Costantini <silvio.costant...@granatads.com> wrote: As re...

Re: AWS Spark-ec2 script with different user

2014-04-08 Thread Marco Costantini
...the spark-ec2 wrapper script using the guidelines at http://spark.apache.org/docs/latest/ec2-scripts.html Shivaram On Mon, Apr 7, 2014 at 1:53 PM, Marco Costantini <silvio.costant...@granatads.com> wrote: Hi Shivaram, OK so le...

Re: AWS Spark-ec2 script with different user

2014-04-07 Thread Marco Costantini
...'s home directory hard coded as /root. However all the Spark AMIs we build should have root ssh access -- Do you find this not to be the case? You can also enable root ssh access in a vanilla AMI by editing /etc/ssh/sshd_config and setting "PermitRootLogin"...
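The sshd change mentioned here amounts to a one-line config edit; a minimal sketch:

```shell
# /etc/ssh/sshd_config -- allow root SSH login on a vanilla AMI
PermitRootLogin yes
```

After editing, sshd must be reloaded for the setting to take effect; the exact command varies by distro (e.g. `service sshd reload` on older Amazon Linux).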

AWS Spark-ec2 script with different user

2014-04-07 Thread Marco Costantini
Hi all, On the old Amazon Linux EC2 images, the user 'root' was enabled for ssh. Also, it is the default user for the Spark-EC2 script. Currently, the Amazon Linux images have an 'ec2-user' set up for ssh instead of 'root'. I can see that the Spark-EC2 script allows you to specify which user to log...
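If the installed spark-ec2 version exposes a user option (check `./spark-ec2 --help`; the flag name below is an assumption, as are the key-pair names), launching a cluster as 'ec2-user' instead of 'root' would look roughly like:

```shell
# Hypothetical invocation -- verify the exact flag with ./spark-ec2 --help
./spark-ec2 -k my-keypair -i my-keypair.pem --user ec2-user launch my-cluster
```

The thread's caveat still applies: the Spark-provided AMIs assume root's home directory for setup, so a non-root user may need the workarounds discussed above.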