Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

Dan Hill Tue, 06 Oct 2020 15:42:26 -0700

I'm trying to use the Table API for my job.  I'll soon try to get a test
working for my stream job.
- I'll parameterize it so I can have different sources and sinks for tests.
How should I mock out a Kafka source?  For my test, I was planning on
changing the input to come from a temp file (instead of Kafka).
- What's a good way of forcing a watermark using the Table API?
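
A minimal sketch of the parameterization idea, under several assumptions: the
table name, column names, the 5-second watermark delay, and the helper names
(`SourceDdl`, `createSource`, `fileOptions`, `kafkaOptions`) are all
hypothetical, and the connector option lists are illustrative rather than
complete. The idea is to keep the schema and the `WATERMARK FOR` clause
identical and swap only the `WITH (...)` options, so the test reads from a
temp directory while production reads from Kafka. The resulting string would
be passed to `TableEnvironment.executeSql(...)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SourceDdl {

    /** Builds a CREATE TABLE statement; only the WITH options vary per environment. */
    public static String createSource(String tableName, Map<String, String> connectorOptions) {
        // Render each option as a quoted key/value pair, one per line.
        String options = connectorOptions.entrySet().stream()
            .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
            .collect(Collectors.joining(",\n"));
        // Hypothetical schema; the watermark is declared in the DDL so that
        // test and production tables share the same event-time definition.
        return "CREATE TABLE " + tableName + " (\n"
            + "  user_id STRING,\n"
            + "  event_time TIMESTAMP(3),\n"
            + "  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND\n"
            + ") WITH (\n"
            + options + "\n"
            + ")";
    }

    /** Options for a test source backed by a temp directory (illustrative subset). */
    public static Map<String, String> fileOptions(String path) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "filesystem");
        options.put("path", path);
        options.put("format", "csv");
        return options;
    }

    /** Options for the production Kafka source (illustrative subset, not complete). */
    public static Map<String, String> kafkaOptions(String topic, String bootstrapServers) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "kafka");
        options.put("topic", topic);
        options.put("properties.bootstrap.servers", bootstrapServers);
        options.put("format", "json");
        return options;
    }

    public static void main(String[] args) {
        // In a test, swap in the file-backed options; the rest of the job is unchanged.
        System.out.println(createSource("events", fileOptions("file:///tmp/events")));
    }
}
```

Whether the declared watermark advances far enough to flush results when the
input is a bounded file is something I'd still want to verify in the test.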



On Tue, Oct 6, 2020 at 3:35 PM Dan Hill <quietgol...@gmail.com> wrote:

> Thanks!
>
> Great to know.  I copied the junit5-jupiter-starter-bazel rule
> <https://github.com/junit-team/junit5-samples/tree/main/junit5-jupiter-starter-bazel>
> into my repository (I don't think junit5 is supported directly with
> java_test yet).  I tried a few ways of bundling `log4j.properties` into the
> jar and didn't get them to work.  My current iteration hard-codes the
> log4j.properties file as an absolute path.  My failed attempts would print
> an error saying the log4j.properties file was not found.  This route finds
> the file, but the log properties are not used for the Java logger.
>
> Is there a better set of rules to use for junit5?
>
> # build rule
> java_junit5_test(
>     name = "tests",
>     srcs = glob(["*.java"]),
>     test_package = "ai.promoted.logprocessor.batch",
>     deps = [...],
>     jvm_flags =
> ["-Dlog4j.configuration=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties"],
> )
>
> # log4j.properties
> status = error
> name = Log4j2PropertiesConfig
> appenders = console
> appender.console.type = Console
> appender.console.name = LogToConsole
> appender.console.layout.type = PatternLayout
> appender.console.layout.pattern = %d [%t] %-5p %c - %m%n
> rootLogger.level = info
> rootLogger.appenderRefs = stdout
> rootLogger.appenderRef.stdout.ref = LogToConsole
>
> On Tue, Oct 6, 2020 at 3:34 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
>> Oops, this is actually the JOIN issue thread [1]. Guess I should revise
>> my previous "haven't had issues" statement hah. Sorry for the spam!
>>
>> [1]:
>> apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Streaming-SQL-Job-Switches-to-FINISHED-before-all-records-processed-td38382.html
>>
>> On Tue, Oct 6, 2020 at 6:32 PM Austin Cawley-Edwards <
>> austin.caw...@gmail.com> wrote:
>>
>>> Unless it's related to this issue[1], which was w/ my JOIN and time
>>> characteristics, though not sure that applies for batch.
>>>
>>> Best,
>>> Austin
>>>
>>> [1]:
>>> apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-Streaming-Join-Creates-Duplicates-td37764.html
>>>
>>>
>>> On Tue, Oct 6, 2020 at 6:20 PM Austin Cawley-Edwards <
>>> austin.caw...@gmail.com> wrote:
>>>
>>>> Hey Dan,
>>>>
>>>> We use Junit5 and Bazel to run Flink SQL tests on a mini cluster and
>>>> haven’t had issues, though we’re only testing on streaming jobs.
>>>>
>>>> Happy to help setting up logging with that if you’d like.
>>>>
>>>> Best,
>>>> Austin
>>>>
>>>> On Tue, Oct 6, 2020 at 6:02 PM Dan Hill <quietgol...@gmail.com> wrote:
>>>>
>>>>> I don't think any of the gotchas apply to me (at the bottom of this
>>>>> link).
>>>>>
>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#junit-rule-miniclusterwithclientresource
>>>>>
>>>>> I'm assuming for a batch job that I don't have to do anything for:
>>>>> "You can implement a custom parallel source function for emitting
>>>>> watermarks if your job uses event time timers."
>>>>>
>>>>> On Tue, Oct 6, 2020 at 2:42 PM Dan Hill <quietgol...@gmail.com> wrote:
>>>>>
>>>>>> I've tried to enable additional logging for a few hours today.  I
>>>>>> think something with junit5 is swallowing the logs.  I'm using Bazel and
>>>>>> junit5.  I set up MiniClusterResourceConfiguration using a custom
>>>>>> extension.  Are there any known issues with Flink and junit5?  I can try
>>>>>> switching to junit4.
>>>>>>
>>>>>> When I binary searched this issue, the failure happens if my
>>>>>> query in step 3 has a join in it.  If I remove the join, I can remove
>>>>>> step 4 and the code still works.  I've renamed a bunch of my tables too
>>>>>> and the problem still exists.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 6, 2020, 00:42 Aljoscha Krettek <aljos...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> there were some bugs and quirks in the MiniCluster that we recently
>>>>>>> fixed:
>>>>>>>
>>>>>>>   - https://issues.apache.org/jira/browse/FLINK-19123
>>>>>>>   - https://issues.apache.org/jira/browse/FLINK-19264
>>>>>>>
>>>>>>> But I think they are probably unrelated to your case. Could you
>>>>>>> enable logging and see from the logs whether the 2) and 3) jobs
>>>>>>> execute correctly on the MiniCluster?
>>>>>>>
>>>>>>> Best,
>>>>>>> Aljoscha
>>>>>>>
>>>>>>> On 06.10.20 08:08, Dan Hill wrote:
>>>>>>> > I'm writing a test for a batch job using
>>>>>>> > MiniClusterResourceConfiguration.
>>>>>>> >
>>>>>>> > Here's a simple description of my working test case:
>>>>>>> > 1) I use TableEnvironment.executeSql(...) to create a source and
>>>>>>> > sink table using a tmp filesystem directory.
>>>>>>> > 2) I use executeSql to insert some test data into the source table.
>>>>>>> > 3) I use executeSql to select from source and insert into sink.
>>>>>>> > 4) I use executeSql from the same source to a different sink.
>>>>>>> >
>>>>>>> > When I do these steps, it works.  If I remove step 4, no data gets
>>>>>>> > written to the sink.  My actual code is more complex than this (has
>>>>>>> > create view, join and more tables).  This is a simplified
>>>>>>> > description but highlights the weird error.
>>>>>>> >
>>>>>>> > Has anyone hit issues like this?  I'm assuming I have a small code
>>>>>>> > bug in my queries that's causing issues.  These queries appear to
>>>>>>> > work in production so I'm confused.  Are there ways of viewing
>>>>>>> > failed jobs or queries with MiniClusterResourceConfiguration?
>>>>>>> >
>>>>>>> > Thanks!
>>>>>>> > - Dan
>>>>>>> >
>>>>>>>
>>>>>>>
