Re: Help wanted on securing spark with Apache Knox / JWT

2024-07-12 Thread Adam Binford
> I tried to play with the filters, especially
> org.apache.hadoop.security.authentication.server.AuthenticationFilter, but
> didn't manage to get anything working, so I don't even know if this is
> the right way to do it.
>
> Thanks for your answer

-- Adam Binford
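For context, the Spark UI can load arbitrary servlet filters through spark.ui.filters, with filter init parameters passed as spark.<filter class>.param.<name>. Below is a minimal sketch of that wiring for Hadoop's AuthenticationFilter; the JWT handler class, URL, and PEM path are assumptions for a Knox SSO setup, not a configuration confirmed to work in this thread.

    from pyspark.sql import SparkSession

    # Sketch only: the filter/handler classes must be on the driver classpath,
    # and the handler parameters below are illustrative for a Knox SSO setup.
    filter_cls = "org.apache.hadoop.security.authentication.server.AuthenticationFilter"

    spark = (
        SparkSession.builder
        .config("spark.ui.filters", filter_cls)
        # Filter init parameters follow the spark.<filter class>.param.<name> convention.
        .config(f"spark.{filter_cls}.param.type",
                "org.apache.hadoop.security.authentication.server.JWTRedirectAuthenticationHandler")
        .config(f"spark.{filter_cls}.param.authentication.provider.url",
                "https://knox-host:8443/gateway/knoxsso/api/v1/websso")  # hypothetical URL
        .config(f"spark.{filter_cls}.param.public.key.pem",
                "/etc/security/knoxsso.pem")  # hypothetical path
        .getOrCreate()
    )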

Re: Re-create SparkContext of SparkSession inside long-lived Spark app

2024-02-17 Thread Adam Binford
> sercache/hadoop/appcache/application_1706835946137_0110/blockmgr-eda47882-56d6-4248-8e30-a959ddb912c5
>
> [2] https://stackoverflow.com/a/38791921

-- Adam Binford
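The usual shape of the "re-create the context" approach is simply stopping the active session and building a new one. Whether that actually releases accumulated blockmgr-* directories depends on the deployment (on YARN the appcache is tied to the application), so treat this as a sketch rather than a confirmed fix for this thread.

    from pyspark.sql import SparkSession

    def rebuild_session(spark: SparkSession) -> SparkSession:
        # Stop the current SparkContext so executors shut down, then start a
        # fresh session. Local shuffle/block-manager files are normally
        # removed when the executors that own them exit.
        spark.stop()
        return SparkSession.builder.getOrCreate()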

Re: [SPARK STRUCTURED STREAMING] : Rocks DB uses off-heap usage

2022-11-30 Thread Adam Binford
> GB Res Memory.
>
> Thanks,
> Akshit
>
> Thanks and regards
> - Akshit Marwah

-- Adam Binford
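For reference, the RocksDB state store keeps its memtables and block cache in native memory, so executor resident memory can grow well beyond the JVM heap. A minimal sketch of the relevant configuration follows; the overhead value is illustrative, not a recommendation from this thread.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        # Use the RocksDB-backed state store (available since Spark 3.2).
        .config("spark.sql.streaming.stateStore.providerClass",
                "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
        # Leave headroom for RocksDB's native (off-heap) allocations when
        # sizing executors; must be set before the application starts.
        .config("spark.executor.memoryOverhead", "4g")  # illustrative value
        .getOrCreate()
    )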

Re: Unable to force small partitions in streaming job without repartitioning

2022-02-11 Thread Adam Binford
>>> ons/19188315/behavior-of-the-parameter-mapred-min-split-size-in-hdfs>);
>>> however, I don't quite understand the link between the splitting settings,
>>> row group configuration, and resulting number of records when reading from
>>> a delta table.
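For reference, the partition size of file-based reads is governed by spark.sql.files.maxPartitionBytes, but a Parquet row group is the smallest splittable unit, so partitions cannot be made smaller than the row groups written into the files. A minimal sketch (the table path is hypothetical):

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        # Cap how many bytes of input files go into one read partition
        # (16 MiB here is an illustrative value).
        .config("spark.sql.files.maxPartitionBytes", str(16 * 1024 * 1024))
        .getOrCreate()
    )

    # Hypothetical Delta table path; row group boundaries still set the lower
    # bound on how finely the files can be split.
    df = spark.read.format("delta").load("/path/to/delta/table")
    print(df.rdd.getNumPartitions())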

Re: Unable to force small partitions in streaming job without repartitioning

2022-02-11 Thread Adam Binford
> large number of empty partitions and a small
> number containing the rest of the data (see median vs max number of input
> records).
>
> [image: image.png]
>
> Any help would be much appreciated
>
> Chris

-- Adam Binford

Re: How to modify a field in a nested struct using pyspark

2021-01-29 Thread Adam Binford
> know. Thank you so much Adam. Do you know when
> the 3.1 release is scheduled?
>
> Regards,
> Felix K Jose
>
> On Fri, Jan 29, 2021 at 12:35 PM Adam Binford wrote:
>
>> As of 3.0, the only way to do it is something that will recreate the
>> whole struct
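The 3.1 improvement being referred to is presumably Column.withField, which updates a single field of a struct without rebuilding it by hand. A minimal sketch, assuming the timingPeriod struct from the original question:

    from pyspark.sql import DataFrame, functions as F

    def convert_start_to_timestamp(df: DataFrame) -> DataFrame:
        # Spark 3.1+: Column.withField replaces one struct field and keeps the
        # rest intact. Assumes a struct column "timingPeriod" with a numeric
        # "start" field, as in the original question.
        return df.withColumn(
            "timingPeriod",
            F.col("timingPeriod").withField(
                "start", F.col("timingPeriod.start").cast("timestamp")
            ),
        )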

Re: How to modify a field in a nested struct using pyspark

2021-01-29 Thread Adam Binford
n( > "timingPeriod.start", > transform_date_col("timingPeriod.start").cast("timestamp")).withColumn( > "timingPeriod.end", > transform_date_col("timingPeriod.end").cast("timestamp")) > > the timingPeriod fields are not a struct anymore rather they become two > different fields with names "timingPeriod.start", "timingPeriod.end". > > How can I get them as a struct as before? > Is there a generic way I can modify a single/multiple properties of nested > structs? > > I have hundreds of entities where the long needs to convert to timestamp, > so a generic implementation will help my data ingestion pipeline a lot. > > Regards, > Felix K Jose > > -- Adam Binford