Hi All,
Any idea about this?
Thanks,
Rishi
On Tue, May 21, 2019 at 11:29 PM Rishi Shah wrote:
> Hi All,
>
> What is the best way to determine partitions of a dataframe dynamically
> before writing to disk?
>
> 1) determine statically based on the data, and use coalesce or repartition
> while writing
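One rough way to pick a partition count dynamically is to estimate the output size and divide by a target file size. A minimal PySpark sketch, assuming an average serialized row size you have measured yourself (the row size, target size, and output path here are all hypothetical):

    # Estimate a partition count from data volume, then repartition before writing.
    num_rows = df.count()
    avg_row_bytes = 200                   # assumption: measured average serialized row size
    target_bytes = 128 * 1024 * 1024      # aim for roughly 128 MB per output file
    num_partitions = max(1, (num_rows * avg_row_bytes) // target_bytes)
    df.repartition(int(num_partitions)).write.parquet("s3://bucket/output")  # hypothetical path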
Any suggestions?
On Wed, May 22, 2019 at 6:32 AM Rishi Shah wrote:
> Hi All,
>
> If a dataframe is repartitioned in memory by the (date, id) columns, and I
> then use multiple window functions whose partition-by clause uses the same
> (date, id) columns --> we can avoid the shuffle/sort again, I believe. Can someone confirm this?
I have a query regarding narrow transformations: is it OK to write ad hoc
logic in a single transformation, or should we split it into multiple
transformations? For example, if I would like to explode one of the lists,
then use the values of that list to explode again, and then use those values again to attach
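For illustration, chaining the explodes as separate transformations might look like the sketch below (PySpark; the column names "lists", "inner", and "value" are hypothetical). As far as I understand, Catalyst collapses adjacent narrow projections anyway, so splitting versus combining is mostly a readability choice:

    from pyspark.sql import functions as F

    # Hypothetical schema: "lists" is an array of arrays.
    step1 = df.withColumn("inner", F.explode("lists"))      # explode the outer list
    step2 = step1.withColumn("value", F.explode("inner"))   # explode each inner list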
Hello
In NiFi, in order to run the ExecuteSparkInteractive processor, it needs a
LivySessionController. When we start a Livy controller service, we do not
see where to specify the YARN queue. Is there a way to do this?
Thanks
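For what it's worth, the Livy REST API itself accepts a "queue" field (and a "conf" map, where "spark.yarn.queue" can be set) when creating a session. A hedged sketch in Python, with the host and queue name as placeholders; whether NiFi's LivySessionController exposes this field is a separate question:

    import requests

    payload = {"kind": "spark", "queue": "my_yarn_queue"}  # or: {"conf": {"spark.yarn.queue": "my_yarn_queue"}}
    resp = requests.post("http://livy-host:8998/sessions", json=payload)
    print(resp.json())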
Hi,
I am getting "# java.lang.OutOfMemoryError: Java heap space". I have
increased my driver memory and executor memory, but I am still facing this
issue. I am using r4 instances for the driver and core nodes (16). How can
we see which step fails, or whether it is related to GC? Can we pinpoint it to a single place in the code?
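One way to see whether GC is involved is to enable GC logging through the standard extraJavaOptions properties, and to check which stage fails in the Spark UI's Stages tab. A sketch (the application file name is a placeholder):

    spark-submit \
      --conf "spark.driver.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
      --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
      your_app.py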
I was able to resolve the error. Initially I was giving a custom name for
"subscribe" but the actual topic name in the "topics" option. Giving the
same name in both places worked.
I am now confused about the difference between giving a value for "topic"
and for "subscribe" here. Do you have any suggestions?
reader.option
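For reference, in the Structured Streaming Kafka source the read side takes exactly one of "subscribe", "subscribePattern", or "assign", while "topic" is an option of the Kafka sink (the write side). A minimal read sketch, with broker and topic names as placeholders:

    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "host1:9092")  # placeholder broker
          .option("subscribe", "my_topic")                  # topic(s) to read from
          .load())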
Just wondering, what is the advantage of doing this?
Regards
Gourav Sengupta
On Wed, May 22, 2019 at 3:01 AM Huizhe Wang wrote:
> Hi Hari,
> Thanks :) I tried to do it as you said. It works ;)
>
>
> Hariharan wrote on Mon, May 20, 2019 at 3:54 PM:
>
>> Hi Huizhe,
>>
>> You can set the "fs.defaultFS" field in core-site.xml
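Besides editing core-site.xml, the same Hadoop property can be passed through Spark's spark.hadoop.* prefix, which forwards it into the Hadoop configuration. A sketch in PySpark, with a placeholder NameNode address:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .config("spark.hadoop.fs.defaultFS", "hdfs://namenode:8020")  # placeholder address
             .getOrCreate())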
Hi,
We have started using Spark on Kubernetes, and in most cases our jobs
use AWS S3 to read/write data. We are setting the AWS key and secret
using these properties:
spark.kubernetes.driver.secretKeyRef.[EnvName]
spark.kubernetes.executor.secretKeyRef.[EnvName]
But, we already have kube
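For reference, the secretKeyRef properties take a value of the form secret-name:key, mapping one key of a Kubernetes Secret into an environment variable of the driver or executor pods. A sketch with hypothetical secret and key names:

    --conf spark.kubernetes.driver.secretKeyRef.AWS_ACCESS_KEY_ID=aws-secrets:access-key
    --conf spark.kubernetes.driver.secretKeyRef.AWS_SECRET_ACCESS_KEY=aws-secrets:secret-key
    --conf spark.kubernetes.executor.secretKeyRef.AWS_ACCESS_KEY_ID=aws-secrets:access-key
    --conf spark.kubernetes.executor.secretKeyRef.AWS_SECRET_ACCESS_KEY=aws-secrets:secret-key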
Have you tried what the exception suggests?
If startingOffsets contains specific offsets, you must specify all
TopicPartitions.
BR,
G
On Tue, May 21, 2019 at 9:16 PM KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:
> Hi,
>
> I am getting the below error when running a sample streaming app. d
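For reference, when startingOffsets is given as a JSON string it must list an offset for every partition of each topic; -2 means earliest and -1 means latest. A sketch assuming a hypothetical three-partition topic:

    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "host1:9092")                    # placeholder broker
          .option("subscribe", "my_topic")                                    # placeholder topic
          .option("startingOffsets", '{"my_topic":{"0":23,"1":-2,"2":-2}}')   # all partitions listed
          .load())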
Hi All,
If a dataframe is repartitioned in memory by the (date, id) columns, and I
then use multiple window functions whose partition-by clause uses the same
(date, id) columns --> we can avoid the shuffle/sort again, I believe. Can
someone confirm this?
However, what happens when the dataframe repartition was done
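For illustration, reusing the same partition keys for the repartition and for every window spec might look like the sketch below (the "amount" column is hypothetical). As far as I can tell, .explain() on the result should then show a single exchange feeding both window operators:

    from pyspark.sql import Window, functions as F

    w = Window.partitionBy("date", "id")
    out = (df.repartition("date", "id")
             .withColumn("amt_sum", F.sum("amount").over(w))
             .withColumn("amt_max", F.max("amount").over(w)))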
Hi all,
I just need to know how Spark decides how many partitions should be
created while reading a table from Hive.
Thanks
--
Shivam Sharma
Indian Institute Of Information Technology, Design and Manufacturing
Jabalpur
Email: 28shivamsha...@gmail.com
LinkedIn: https://www.linkedin.com/in/28shi
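For what it's worth, for tables read through Spark's native file-based readers the partition count is driven mainly by the input file sizes and spark.sql.files.maxPartitionBytes, while Hive SerDe tables follow the Hadoop input splits. A sketch with a hypothetical table name:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .config("spark.sql.files.maxPartitionBytes", str(128 * 1024 * 1024))  # ~128 MB splits
             .enableHiveSupport()
             .getOrCreate())
    df = spark.table("mydb.my_table")   # hypothetical table
    print(df.rdd.getNumPartitions())    # inspect the resulting partition count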