Hi Dan,

did you try using the JobClient you can get from the TableResult to wait for job completion? You can get a CompletableFuture for the JobResult which should help you.
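
As a rough illustration of that suggestion, here is a minimal sketch of blocking on a result future with a timeout. It deliberately uses only the JDK: the class name JobAwait and the method awaitResult are hypothetical, and the future passed in is assumed to be the job-result future obtained from the JobClient (the exact method for getting it differs between Flink versions, so it is not shown here).

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical test helper; not part of Flink. */
public final class JobAwait {
    private JobAwait() {}

    /**
     * Blocks until the future completes or the timeout elapses.
     * In a Flink test, the future is assumed to come from the JobClient.
     */
    public static <T> T awaitResult(CompletableFuture<T> resultFuture, Duration timeout) {
        try {
            return resultFuture.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new AssertionError("Job did not finish within " + timeout, e);
        } catch (ExecutionException e) {
            // Surface the job's own failure rather than the wrapper exception.
            throw new AssertionError("Job failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("Interrupted while waiting for job", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real job: completes asynchronously after a short delay.
        CompletableFuture<String> fakeJob = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "FINISHED";
        });
        System.out.println(awaitResult(fakeJob, Duration.ofSeconds(10)));
    }
}
```

Failing with an AssertionError on timeout keeps a hung job from stalling the whole test run.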

Best,
Aljoscha

On 08.10.20 23:55, Dan Hill wrote:
I figured out the issue.  The join caused part of the job's execution to be
delayed.  I added my own hacky wait condition into the test to make sure
the join job finishes first and it's fine.

What common test utilities exist for Flink?  I found
flink/flink-test-utils-parent.  I implemented a simple sleep loop to wait
for jobs to finish.  I'm guessing this can be done with one of the other
utilities.
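
For reference, a sleep loop of the kind described above might look roughly like the sketch below. This is not the code from the actual test: the class WaitUntil, its method, and the condition used in main are all hypothetical, and it relies only on the JDK.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not a Flink utility. */
public final class WaitUntil {
    private WaitUntil() {}

    /**
     * Polls the condition until it returns true.
     * Returns false if the timeout passes first or the thread is interrupted.
     */
    public static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Hypothetical condition; a real test might check that the sink directory has output.
        boolean finished = waitUntil(
                () -> System.currentTimeMillis() - start >= 200,
                Duration.ofSeconds(5),
                Duration.ofMillis(50));
        System.out.println("finished = " + finished);
    }
}
```

The explicit timeout is the main difference from a bare sleep loop: the test fails instead of hanging if the job never finishes.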

Are there any open source test examples?

How are watermarks usually sent with Table API in tests?

After I collect some answers, I'm fine updating the Flink testing page.
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#testing-flink-jobs

On Thu, Oct 8, 2020 at 8:52 AM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Can't comment on the SQL issues, but here's our exact setup for Bazel and
Junit5 w/ the resource files approach:
https://github.com/fintechstudios/vp-operator-demo-ff-virtual-2020/tree/master/tools/junit

Best,
Austin

On Thu, Oct 8, 2020 at 2:41 AM Dan Hill <quietgol...@gmail.com> wrote:

I was able to get finer grained logs showing.  I switched from
-Dlog4j.configuration to -Dlog4j.configurationFile and it worked.  With my
larger test case, I was hitting a silent log4j error.  When I created a
small test case to just test logging, I received a log4j error.

Here is a tar
<https://drive.google.com/file/d/1b6vJR_hfaRZwA28jKNlUBxDso7YiTIbk/view?usp=sharing>
with the info logs for:
- (test-nojoin.log) this one works as expected
- (test-join.log) this does not work as expected

I don't see an obvious issue just by scanning the logs.  I'll take a
deeper look in 9 hours.




On Wed, Oct 7, 2020 at 8:28 PM Dan Hill <quietgol...@gmail.com> wrote:

Switching to junit4 did not help.

If I make a request to the url returned from
MiniClusterWithClientResource.flinkCluster.getClusterClient().getWebInterfaceURL(),
I get
{"errors":["Not found."]}.  I'm not sure if this is intentional.




On Tue, Oct 6, 2020 at 4:16 PM Dan Hill <quietgol...@gmail.com> wrote:

@Aljoscha - Thanks!  That setup lets me fix the hacky absolute path
reference.  However, the actual log calls are not printing to the console.
Only errors appear in my terminal window and the test logs.  Maybe the
console logger does not work for this junit setup.  I'll see if the file
version works.

On Tue, Oct 6, 2020 at 4:08 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

What Aljoscha suggested is what works for us!

On Tue, Oct 6, 2020 at 6:58 PM Aljoscha Krettek <aljos...@apache.org>
wrote:

Hi Dan,

to make the log properties file work, this should do it, assuming the
log4j.properties is in //src/main/resources: you will need a BUILD.bazel
in that directory that has only the line
exports_files(["log4j.properties"]). Then you can reference it in your
test via resources = ["//src/main/resources:log4j.properties"]. Of
course, you also need to have the right log4j deps (or slf4j if you're
using that).
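
Spelled out as a config fragment, the setup described above might look like the sketch below. The target attributes other than resources are copied from Dan's rule quoted further down the thread, and it is an assumption that the java_junit5_test macro forwards a resources attribute to the underlying java_test.

```starlark
# src/main/resources/BUILD.bazel
# Makes the file visible to other packages.
exports_files(["log4j.properties"])

# BUILD.bazel next to the tests
java_junit5_test(
    name = "tests",
    srcs = glob(["*.java"]),
    test_package = "ai.promoted.logprocessor.batch",
    # Puts log4j.properties on the test classpath instead of using an absolute path.
    resources = ["//src/main/resources:log4j.properties"],
    deps = [...],  # elided, as in the original rule
)
```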

Hope that helps!

Aljoscha

On 07.10.20 00:41, Dan Hill wrote:
I'm trying to use Table API for my job.  I'll soon try to get a test
working for my stream job.
- I'll parameterize so I can have different sources and sinks for tests.
  How should I mock out a Kafka source?  For my test, I was planning on
  changing the input to be from a temp file (instead of Kafka).
- What's a good way of forcing a watermark using the Table API?


On Tue, Oct 6, 2020 at 3:35 PM Dan Hill <quietgol...@gmail.com>
wrote:

Thanks!

Great to know.  I copied this junit5-jupiter-starter-bazel
<https://github.com/junit-team/junit5-samples/tree/main/junit5-jupiter-starter-bazel>
rule into my repository (I don't think junit5 is supported directly with
java_test yet).  I tried a few ways of bundling `log4j.properties` into the
jar and didn't get them to work.  My current iteration hacks the
log4j.properties file as an absolute path.  My failed attempts would spit
out an error saying the log4j.properties file was not found.  This route
finds it, but the log properties are not used for the java logger.

Is there a better set of rules to use for junit5?

# build rule
java_junit5_test(
    name = "tests",
    srcs = glob(["*.java"]),
    test_package = "ai.promoted.logprocessor.batch",
    deps = [...],
    jvm_flags = [
        "-Dlog4j.configuration=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties",
    ],
)

# log4j.properties
status = error
name = Log4j2PropertiesConfig
appenders = console
appender.console.type = Console
appender.console.name = LogToConsole
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d [%t] %-5p %c - %m%n
rootLogger.level = info
rootLogger.appenderRefs = stdout
rootLogger.appenderRef.stdout.ref = LogToConsole

On Tue, Oct 6, 2020 at 3:34 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Oops, this is actually the JOIN issue thread [1]. Guess I should revise
my previous "haven't had issues" statement hah. Sorry for the spam!

[1]: apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Streaming-SQL-Job-Switches-to-FINISHED-before-all-records-processed-td38382.html

On Tue, Oct 6, 2020 at 6:32 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Unless it's related to this issue [1], which was w/ my JOIN and time
characteristics, though not sure that applies for batch.

Best,
Austin

[1]: apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-Streaming-Join-Creates-Duplicates-td37764.html


On Tue, Oct 6, 2020 at 6:20 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Hey Dan,

We use Junit5 and Bazel to run Flink SQL tests on a mini cluster and
haven’t had issues, though we’re only testing on streaming jobs.

Happy to help set up logging with that if you’d like.

Best,
Austin

On Tue, Oct 6, 2020 at 6:02 PM Dan Hill <quietgol...@gmail.com>
wrote:

I don't think any of the gotchas (at the bottom of this link) apply to me.

https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#junit-rule-miniclusterwithclientresource

I'm assuming that for a batch job I don't have to do anything for:
"You can implement a custom parallel source function for emitting
watermarks if your job uses event time timers."

On Tue, Oct 6, 2020 at 2:42 PM Dan Hill <quietgol...@gmail.com>
wrote:

I've tried to enable additional logging for a few hours today.  I
think something with junit5 is swallowing the logs.  I'm using Bazel and
junit5.  I set up MiniClusterResourceConfiguration using a custom
extension.  Are there any known issues with Flink and junit5?  I can try
switching to junit4.

When I've binary searched this issue, this failure happens if my
query in step 3 has a join in it.  If I remove the join, I can remove
step 4 and the code still works.  I've renamed a bunch of my tables too
and the problem still exists.





On Tue, Oct 6, 2020, 00:42 Aljoscha Krettek <
aljos...@apache.org>
wrote:

Hi Dan,

there were some bugs and quirks in the MiniCluster that we recently
fixed:

    - https://issues.apache.org/jira/browse/FLINK-19123
    - https://issues.apache.org/jira/browse/FLINK-19264

But I think they are probably unrelated to your case. Could you enable
logging and see from the logs whether the 2) and 3) jobs execute
correctly on the MiniCluster?

Best,
Aljoscha

On 06.10.20 08:08, Dan Hill wrote:
I'm writing a test for a batch job using
MiniClusterResourceConfiguration.

Here's a simple description of my working test case:
1) I use TableEnvironment.executeSql(...) to create a source and sink
   table using a tmp filesystem directory.
2) I use executeSql to insert some test data into the source table.
3) I use executeSql to select from source and insert into sink.
4) I use executeSql to select from the same source and insert into a
   different sink.

When I do these steps, it works.  If I remove step 4, no data gets
written to the sink.  My actual code is more complex than this (has
create view, join and more tables).  This is a simplified description but
highlights the weird error.

Has anyone hit issues like this?  I'm assuming I have a small code bug in
my queries that's causing issues.  These queries appear to work in
production so I'm confused.  Are there ways of viewing failed jobs or
queries with MiniClusterResourceConfiguration?

Thanks!
- Dan







