https://issues.apache.org/jira/browse/SPARK-20894
On Thu, May 25, 2017 at 4:31 PM, Shixiong(Ryan) Zhu wrote:
I don't know what happened in your case, so I cannot provide a workaround.
It would be great if you could provide the log output from
HDFSBackedStateStoreProvider.
On Thu, May 25, 2017 at 4:05 PM, kant kodali wrote:
On Thu, May 25, 2017 at 3:41 PM, Shixiong(Ryan) Zhu wrote:
> bin/hadoop fs -ls /usr/local/hadoop/checkpoint/state/0/*
>
Hi,
bin/hadoop fs -ls /usr/local/hadoop/checkpoint/state/0/* lists no files,
but all the directories up to /usr/local/hadoop/checkpoint/state/0 do
exist (which are c
Feel free to create a new ticket. Could you also provide the files in
"/usr/local/hadoop/checkpoint/state/0" (Just run "bin/hadoop fs -ls
/usr/local/hadoop/checkpoint/state/0/*") in the ticket and the Spark logs?
On Thu, May 25, 2017 at 2:53 PM, kant kodali wrote:
Should I file a ticket or should I try another version like Spark 2.2 since
I am currently using 2.1.1?
On Thu, May 25, 2017 at 2:38 PM, kant kodali wrote:
Hi Ryan,
You are right, I was setting checkpointLocation on readStream. I have now
set it on writeStream as well, like below:
StreamingQuery query = df2.writeStream()
    .foreach(new KafkaSink())
    .option("checkpointLocation", "/usr/local/hadoop/checkpoint")
    .outputMode("update")
    .start();
query.awaitTe
I read your code again and found one issue: you set "checkpointLocation" in
`readStream`. It should be set in `writeStream`. However, I still have no
idea why using a temp checkpoint location would fail.
On Thu, May 25, 2017 at 2:23 PM, kant kodali wrote:
I did the following
bin/hadoop fs -mkdir -p /usr/local/hadoop/checkpoint and did bin/hadoop
fs -ls /
and I can actually see /tmp and /usr and inside of /usr there is
indeed local/hadoop/checkpoint.
So until here it looks fine.
I also cleared everything /tmp/* as @Michael sugge
Executing bin/hadoop fs -ls /usr/local/hadoop/checkpoint says
ls: `/usr/local/hadoop/checkpoint': No such file or directory
This is what I expected as well, since I don't see any checkpoint directory
under /usr/local/hadoop. Am I missing any configuration variable like
HADOOP_CONF_DIR? I am
Hi Ryan,
I did add that print statement and here is what I got.
class org.apache.hadoop.hdfs.DistributedFileSystem
Thanks!
On Wed, May 24, 2017 at 11:39 PM, Shixiong(Ryan) Zhu <
shixi...@databricks.com> wrote:
I meant using an HDFS command to check the directory, such as "bin/hadoop fs
-ls /usr/local/hadoop/checkpoint". My hunch is that the default file system in
the driver is probably the local file system. Could you add the following line
to your code to print the default file system?
println(org.apache.hadoop.
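To illustrate the hunch above: a path with no scheme, such as /usr/local/hadoop/checkpoint, is resolved against whatever the default file system happens to be. The sketch below mimics that resolution with plain java.net.URI only. It is not Hadoop's own code, the class and method names are made up for illustration, and the namenode host and port are hypothetical placeholders.

```java
import java.net.URI;

public class DefaultFsSketch {
    // Resolve a scheme-less path against a default file system URI,
    // roughly the way a client qualifies paths against its default FS.
    // (Illustrative helper, not a Hadoop API.)
    static URI qualify(String defaultFs, String path) {
        return URI.create(defaultFs).resolve(path);
    }

    public static void main(String[] args) {
        String checkpoint = "/usr/local/hadoop/checkpoint";
        // Hypothetical namenode address, for illustration only.
        System.out.println(qualify("hdfs://namenode:8020", checkpoint));
        // Local file system as the default.
        System.out.println(qualify("file:///", checkpoint));
    }
}
```

The same string ends up pointing at two different places depending on the default. If the driver does not pick up the cluster's Hadoop configuration, the checkpoint may be written to the local disk rather than to HDFS, which is why listing the directory with the HDFS command is a useful check.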
Hi All,
I specified hdfsCheckPointDir = /usr/local/hadoop/checkpoint, as you can see
below. However, I don't see a checkpoint directory under my hadoop_home=
/usr/local/hadoop on either the datanodes or the namenodes, although on the
datanode machine there seems to be some data under
/usr/local/hadoop/hdfs/namenode/
What's the value of "hdfsCheckPointDir"? Could you list this directory on
HDFS and report the files there?
On Wed, May 24, 2017 at 3:50 PM, Michael Armbrust
wrote:
-dev
Have you tried clearing out the checkpoint directory? Can you also give
the full stack trace?
On Wed, May 24, 2017 at 3:45 PM, kant kodali wrote:
Even if I do a simple count aggregation like the one below, I get the same
error as https://issues.apache.org/jira/browse/SPARK-19268
Dataset df2 = df1.groupBy(functions.window(df1.col("Timestamp5"),
"24 hours", "24 hours"), df1.col("AppName")).count();
On Wed, May 24, 2017 at 3:35 PM, kant kodali wrote: