Hi All,
I have recently been exploring MIDAS: an algorithm for Streaming Anomaly
Detection. A production-level, parallel and distributed implementation of
MIDAS would be quite useful to the industry. I feel that Spark is very
well suited for this. If anyone is interested to
contribute/collabor
Hi Shivin,
I'm interested in collaborating with you on this project.
I have been using PySpark for a while now and am quite familiar with it.
Do you have a plan for how to proceed?
Thanks,
Aditya
On Sat, 27 Jun, 2020, 2:58 pm Shivin Srivastava wrote:
> Hi All,
>
> I have recently been explori
Hello spark-dev,
Looking at ColumnarBatch [1] it seems to indicate a single object is meant
to be used for the entire loading process.
Does this imply that Spark assumes the ColumnarBatch and any direct
references to ColumnarBatch (e.g. UTF8Strings) returned by
InputPartitionReader/PartitionReade
There have been some comments and a few additions in the doc, but it seems like
the folks taking a look generally agree on the design. If there are no
other issues I will bring this to a vote late next week.
On Thu, Jun 25, 2020 at 7:43 PM Holden Karau wrote:
> Thanks for looping in more folks :)
>
>
Hi Spark Developers,
I am encountering this NullPointerException while reading a Parquet file on a
multi-node cluster. However, when running the Spark job locally on a single node
(development environment), I do not encounter this error. I would appreciate your input.
Thanks in advance,
NKH
Stack trace with whole-stage code generation (WSCG) disabled:
scala> df29.groupBy("LastName").count().show()
20/06/28 06:20:55 WARN TaskSetManager: Lost task 1.0 in stage 2.0 (TID 8,
wn5-nkhwes.zhqzi2stszlevpekfsrlmpqjah.dx.internal.cloudapp.net, executor 4):
java.lang.NullPointerException
at
org.apache.spark.sql.cat