Hello,
I have a use case where I need to read (non-correlated) events from a source
Kafka topic, then correlate them and push them forward to another target topic.
I use Spark Structured Streaming with FlatMapGroupsWithStateFunction along
with GroupStateTimeout.ProcessingTimeTimeout(). After each timeout …
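For anyone following the same pattern, here is roughly the shape such a job can take; a minimal, untested sketch using the Scala flatMapGroupsWithState API (the Java FlatMapGroupsWithStateFunction is analogous). The broker address, topic names, checkpoint path, 30-second timeout, and the Event/Correlated case classes are all placeholders:

import java.sql.Timestamp
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{struct, to_json}
import org.apache.spark.sql.streaming.{GroupStateTimeout, OutputMode}

case class Event(key: String, payload: String, ts: Timestamp)   // placeholder schema
case class Correlated(key: String, payloads: Seq[String])       // placeholder result

val spark = SparkSession.builder.appName("correlate").getOrCreate()
import spark.implicits._

val events = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")   // placeholder
  .option("subscribe", "source-topic")                // placeholder
  .load()
  .selectExpr("CAST(key AS STRING) AS key",
              "CAST(value AS STRING) AS payload",
              "timestamp AS ts")
  .where("key IS NOT NULL")
  .as[Event]

val correlated = events
  .groupByKey(_.key)
  .flatMapGroupsWithState[Seq[String], Correlated](
      OutputMode.Append, GroupStateTimeout.ProcessingTimeTimeout()) {
    (key, rows, state) =>
      if (state.hasTimedOut) {
        // Timeout fired: emit everything buffered for this key, then drop the state.
        val out = Correlated(key, state.getOption.getOrElse(Nil))
        state.remove()
        Iterator.single(out)
      } else {
        // New events arrived: buffer them and (re)arm the processing-time timeout.
        state.update(state.getOption.getOrElse(Nil) ++ rows.map(_.payload))
        state.setTimeoutDuration("30 seconds")
        Iterator.empty
      }
  }

correlated
  .select($"key", to_json(struct($"payloads")).as("value"))
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")   // placeholder
  .option("topic", "target-topic")                    // placeholder
  .option("checkpointLocation", "/tmp/checkpoint")    // placeholder
  .start()
  .awaitTermination()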
Hello Attila,
Thank you for verifying this for me. I was looking at
Step 1/18 : ARG java_image_tag=11-jre-slim
and presumed that the Docker image was built using JRE 11.
I can confirm that:
(1) $ docker image history 3ef86250a35b
IMAGE          CREATED        CREATED BY          SIZE      COMMENT
…
Maybe it is my environment that causes it.
On Thu, Mar 11, 2021 at 11:14 AM jiahong li wrote:
> It is not the cause; when I set -Phadoop-2.7 instead of
> -Dhadoop.version=2.6.0-cdh5.13.1, the same errors come out.
>
> On Wed, Mar 10, 2021 at 8:56 PM Attila Zsolt Piros wrote:
>
>> I see, this must be because of the Hadoop version you are selecting …
It is not the cause; when I set -Phadoop-2.7 instead of
-Dhadoop.version=2.6.0-cdh5.13.1, the same errors come out.
On Wed, Mar 10, 2021 at 8:56 PM Attila Zsolt Piros wrote:
> I see, this must be because of the Hadoop version you are selecting by using
> "-Dhadoop.version=2.6.0-cdh5.13.1".
> Spark 3.1.1 only supports hadoop-2.7 and hadoop-3.2 …
Hi Muthu!
I tried it and on my side it is working just fine:
$ ./bin/docker-image-tool.sh -r docker.io/sample-spark -b java_image_tag=8-jre-slim -t 3.1.1 build
Sending build context to Docker daemon  228.3MB
Step 1/18 : ARG java_image_tag=11-jre-slim
Step 2/18 : FROM openjdk:${java_image_tag}
*8-jre-slim* …
Hi,
Thank you, the suggestion is very good: there is no need to use
"repartitionByRange". However, there is one small doubt: if the output files
are required to be globally ordered, "repartition" will disrupt the order of
the data, whereas the result of using "coalesce" is correct.
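To illustrate the doubt, an untested sketch (the dataset and output paths are made up):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("ordered-output").getOrCreate()

val sorted = spark.range(0, 1000000).toDF("id").orderBy("id")

// repartition() shuffles rows into the new partitions (round-robin/hash),
// so the global order produced by orderBy is lost.
sorted.repartition(4).write.mode("overwrite").csv("/tmp/out-repartition")

// coalesce() only merges the existing range-partitioned, sorted partitions
// without a shuffle, so the written files keep the global order.
sorted.coalesce(1).write.mode("overwrite").csv("/tmp/out-coalesce")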
Best Regards,
m li
Hi Pankaj,
Have you tried spark.sql.parquet.respectSummaryFiles=true?
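For example (just a sketch; the flag can equally be passed as --conf spark.sql.parquet.respectSummaryFiles=true on spark-submit):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .config("spark.sql.parquet.respectSummaryFiles", "true")
  .getOrCreate()

// or flip it on an already-running session:
spark.conf.set("spark.sql.parquet.respectSummaryFiles", "true")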
Bests,
Kent Yao @ Data Science Center, Hangzhou Research Institute, NetEase Corp
Hi Pankaj,
Can you show your detailed code and job/stage info? Which stage is slow?
On Wed, Mar 10, 2021 at 12:32 PM Pankaj Bhootra wrote:
> Hi,
>
> Could someone please respond to this?
>
>
> Thanks
> Pankaj Bhootra
>
>
> On Sun, 7 Mar 2021 at 01:22, Pankaj Bhootra wrote:
>
>> Hello Team
>>
>> I am new to Spark …
I see, this must be because of the Hadoop version you are selecting by using
"-Dhadoop.version=2.6.0-cdh5.13.1".
Spark 3.1.1 only supports hadoop-2.7 and hadoop-3.2; at least these two can
be given via profiles: -Phadoop-2.7 and -Phadoop-3.2 (the default).
On Wed, Mar 10, 2021 at 12:26 PM jiahong li wrote: …
I use ./build/mvn to compile, and after executing the command
./build/zinc-0.3.15/bin/zinc -shutdown
and then running
./dev/make-distribution.sh --name custom-spark --pip --tgz -Phive -Phive-thriftserver -Pyarn -Dhadoop.version=2.6.0-cdh5.13.1 -DskipTests
the same error appears.
and execute com…
Hi!
Are you compiling Spark itself?
Do you use "./build/mvn" from the project root?
If you compiled another version of Spark before, and the Scala version there
was different, then zinc/nailgun could have cached the old classes, which can
cause similar trouble.
In that case this could help:
./build/zinc-0.3.15/bin/zinc -shutdown
Hi everybody, when I compile Spark 3.1.1 from tag v3.1.1, I encounter an error
like this:
[INFO] --- scala-maven-plugin:4.3.0:compile (scala-compile-first) @
spark-core_2.12 ---
[INFO] Using incremental compilation using Mixed compile order
[INFO] Compiler bridge file: .sbt/1.0/zinc/org.scala-sbt/org.s…