Hi,
Why are you doing the following two lines?
.select("id",lit(referenceFiltered))
.selectExpr(
"id"
)
What are you trying to achieve? What's lit and what's referenceFiltered?
What's the difference between select and selectexpr? Please start at
http://spark.apache.org/docs/latest/sql-programmin
Thank you both for your input!
To calculate a moving average of active users, could you comment on whether
to go for an RDD-based implementation or a DataFrame? If a DataFrame, will a
window function work here?
In general, how would Spark behave when working with a DataFrame with date,
week, month, quarter, y
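To clarify what I mean by moving average, here is a plain-Python sketch of the trailing average I want (the dates and counts are made up); my question is whether a window function (e.g. avg over a date-ordered Window with rangeBetween) is the right way to express this on a DataFrame at scale:

```python
from datetime import date, timedelta

def moving_average(daily_counts, window_days):
    """daily_counts: {date: active-user count}.
    Returns {date: average count over the trailing window_days days},
    treating missing days as 0."""
    return {
        d: sum(daily_counts.get(d - timedelta(days=k), 0)
               for k in range(window_days)) / window_days
        for d in daily_counts
    }

# made-up daily active-user counts
counts = {date(2019, 6, 1): 10, date(2019, 6, 2): 20, date(2019, 6, 3): 30}
avg2 = moving_average(counts, 2)  # 2-day trailing average
```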
Hi Dave,
As part of driver pod bringup, a ConfigMap is created from all the Spark
configuration parameters (with the name spark.properties) and mounted at
/opt/spark/conf, so all the other files present in /opt/spark/conf are
overwritten.
The same is happening with log4j.properties in this case.
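One possible workaround (an untested sketch; the path /opt/spark/log4j/ below is just an example) is to bake your log4j.properties into the image at a location outside /opt/spark/conf and point the driver and executor JVMs at it explicitly:

```
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/opt/spark/log4j/log4j.properties
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/opt/spark/log4j/log4j.properties
```

That way the ConfigMap mount over /opt/spark/conf doesn't matter, since log4j never reads the file from there.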
Hi Tianhua,
I read a similar question to yours on the HBase mailing list, so I'd like to
let you know about the efforts to support AArch64 in Apache Bigtop[1].
The CI and distribution of Bigtop may not be exactly what you are looking
for, but folks from Linaro and Arm are contributing to
Hi all,
The CI testing for Apache Spark is supported by AMPLab Jenkins, and I find
there are some machines (most of them Linux amd64) for the CI, but it seems
there is no AArch64 machine for Spark CI testing.
Recently, I built and ran tests for Spark (master and branch-2.4) on my
I am using Spark on Kubernetes with Spark 2.4.3. I have created a
log4j.properties file in my local spark/conf directory and modified it so that
the console (or, in the case of Kubernetes, the log) only shows warnings and
higher (log4j.rootCategory=WARN, console). I then added the command
COPY c
Hi all
it took me some time to get the issues extracted into a piece of standalone
code. I created the following gist
https://gist.github.com/jammann/b58bfbe0f4374b89ecea63c1e32c8f17
It has messages for 4 topics A/B/C/D and a simple Python program which shows 6
use cases, with my expectations a
Hello Deng, Thank you for your email.
Issue was with Spark - Hadoop / HDFS configuration settings.
Thanks
On Mon, Jun 10, 2019 at 5:28 AM Deng Ching-Mallete
wrote:
> Hi Chetan,
>
> Best to check if the user account that you're using to run the job has
> permission to write to the path in HDFS.
Spark can use the Hive metastore as a catalog, but it doesn't use the Hive
parser or optimization engine. Instead it uses Catalyst, see
https://databricks.com/blog/2015/04/13/deep-dive-into-spark-sqls-catalyst-optimizer.html
On Mon, Jun 10, 2019 at 2:07 PM naresh Goud
wrote:
> Hi Team,
>
> Is Spa
Hi Team,
Does Spark SQL use the Hive engine to run queries?
My understanding is that Spark SQL uses the Hive metastore to get the
metadata needed to run queries.
Thank you,
Naresh
--
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/
Hi guys,
I was not able to find the planned release date for Spark 3.
Would anyone have any information on this, please?
Many thanks,
Alex
We have a Spark Kafka streaming job running on a standalone Spark cluster. We
have the following Kafka architecture:
1. Two clusters running in two data centers.
2. There is an LTM on top of each data center (load balancing).
3. There is a GSLB on top of the LTMs.
I observed that whenever any of the nodes in the Kafka cluster is down
https://stackoverflow.com/questions/56428367/any-clue-how-to-join-this-spark-structured-stream-joins
Thanks All.
I managed to get this working.
Marking this thread as closed.
On Mon, Jun 10, 2019 at 4:14 PM Deepak Sharma wrote:
> This is the project requirement , where paths are being streamed in kafka
> topic.
> Seems it's not possible using spark structured streaming.
>
>
> On Mon, Jun 10, 20
Hi,
Any suggestions regarding below issue?
https://stackoverflow.com/questions/56524921/how-spark-structured-streaming-consumers-initiated-and-invoked-while-reading-mul
Thanks,
Shyam
This is the project requirement, where paths are being streamed in a Kafka
topic.
It seems it's not possible using Spark Structured Streaming.
On Mon, Jun 10, 2019 at 3:59 PM Shyam P wrote:
> Hi Deepak,
> Why are you getting paths from kafka topic? any specific reason to do so ?
>
> Regards,
> Shy
Hi Deepak,
Why are you getting paths from a Kafka topic? Any specific reason to do so?
Regards,
Shyam
On Mon, Jun 10, 2019 at 10:44 AM Deepak Sharma
wrote:
> The context is different here.
> The file path are coming as messages in kafka topic.
> Spark streaming (structured) consumes form this t
https://stackoverflow.com/questions/56524539/how-to-handle-small-file-problem-in-spark-structured-streaming
Regards,
Shyam
Hi Chetan,
Best to check if the user account that you're using to run the job has
permission to write to the path in HDFS. I would suggest writing the
parquet files to a different path, perhaps to a project space or user home,
rather than at the root directory.
HTH,
Deng
On Sat, Jun 8, 2019 at