I'm trying to use the Table API for my job. I'll soon try to get a test working for my stream job.

- I'll parameterize the job so I can use different sources and sinks for tests. How should I mock out a Kafka source? For my test, I was planning on changing the input to come from a temp file (instead of Kafka).
- What's a good way of forcing a watermark using the Table API?
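To make the parameterization idea concrete, here is a rough sketch of what I have in mind, not something I have running yet. It only builds the CREATE TABLE strings, so the test can hand the file-backed variant to `TableEnvironment.executeSql(...)` while production uses the Kafka one. The class name `SourceDdl`, the columns `user_id` and `ts`, the 5 second watermark delay, and all topic/broker/group/path values are made up for illustration. The connector option names are the ones I remember from the Flink 1.11+ SQL connector docs, so they should be checked against the Flink version in use.

```java
/** Builds CREATE TABLE statements so a test can swap Kafka for a temp file. */
final class SourceDdl {

    // Hypothetical schema: the real job's columns are not shown in this thread.
    // The WATERMARK clause declares event time on `ts` with an assumed 5s delay.
    private static final String COLUMNS =
        "  user_id STRING,\n"
      + "  ts TIMESTAMP(3),\n"
      + "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND\n";

    /** Production variant: reads JSON records from a Kafka topic. */
    static String forKafka(String table, String topic, String brokers, String groupId) {
        return "CREATE TABLE " + table + " (\n" + COLUMNS + ") WITH (\n"
            + "  'connector' = 'kafka',\n"
            + "  'topic' = '" + topic + "',\n"
            + "  'properties.bootstrap.servers' = '" + brokers + "',\n"
            + "  'properties.group.id' = '" + groupId + "',\n"
            + "  'scan.startup.mode' = 'earliest-offset',\n"
            + "  'format' = 'json'\n"
            + ")";
    }

    /** Test variant: same schema and watermark, but reads JSON files from a path. */
    static String forFile(String table, String path) {
        return "CREATE TABLE " + table + " (\n" + COLUMNS + ") WITH (\n"
            + "  'connector' = 'filesystem',\n"
            + "  'path' = '" + path + "',\n"
            + "  'format' = 'json'\n"
            + ")";
    }

    public static void main(String[] args) {
        // Print the test DDL; in a real test this string would go to executeSql.
        System.out.println(forFile("events", "file:///tmp/events"));
    }
}
```

Because both variants share the same column list, the watermark is declared identically in test and production, and only the WITH clause changes. Whether the file-backed source actually advances the watermark far enough for my event-time logic is the part I still need to verify.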

On Tue, Oct 6, 2020 at 3:35 PM Dan Hill <quietgol...@gmail.com> wrote:

> Thanks!
>
> Great to know. I copied this junit5-jupiter-starter-bazel
> <https://github.com/junit-team/junit5-samples/tree/main/junit5-jupiter-starter-bazel>
> rule into my repository (I don't think junit5 is supported directly with
> java_test yet). I tried a few ways of bundling `log4j.properties` into the
> jar and didn't get them to work. My current iteration hacks the
> log4j.properties file as an absolute path. My failed attempts would spit
> an error saying log4j.properties file was not found. This route finds it
> but the log properties are not used for the java logger.
>
> Is there a better set of rules to use for junit5?
>
> # build rule
> java_junit5_test(
>     name = "tests",
>     srcs = glob(["*.java"]),
>     test_package = "ai.promoted.logprocessor.batch",
>     deps = [...],
>     jvm_flags = ["-Dlog4j.configuration=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties"],
> )
>
> # log4j.properties
> status = error
> name = Log4j2PropertiesConfig
> appenders = console
> appender.console.type = Console
> appender.console.name = LogToConsole
> appender.console.layout.type = PatternLayout
> appender.console.layout.pattern = %d [%t] %-5p %c - %m%n
> rootLogger.level = info
> rootLogger.appenderRefs = stdout
> rootLogger.appenderRef.stdout.ref = LogToConsole
>
> On Tue, Oct 6, 2020 at 3:34 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>
>> Oops, this is actually the JOIN issue thread [1]. Guess I should revise
>> my previous "haven't had issues" statement hah. Sorry for the spam!
>>
>> [1]:
>> apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Streaming-SQL-Job-Switches-to-FINISHED-before-all-records-processed-td38382.html
>>
>> On Tue, Oct 6, 2020 at 6:32 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>
>>> Unless it's related to this issue [1], which was w/ my JOIN and time
>>> characteristics, though not sure that applies for batch.
>>>
>>> Best,
>>> Austin
>>>
>>> [1]:
>>> apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-Streaming-Join-Creates-Duplicates-td37764.html
>>>
>>> On Tue, Oct 6, 2020 at 6:20 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>
>>>> Hey Dan,
>>>>
>>>> We use Junit5 and Bazel to run Flink SQL tests on a mini cluster and
>>>> haven’t had issues, though we’re only testing on streaming jobs.
>>>>
>>>> Happy to help setting up logging with that if you’d like.
>>>>
>>>> Best,
>>>> Austin
>>>>
>>>> On Tue, Oct 6, 2020 at 6:02 PM Dan Hill <quietgol...@gmail.com> wrote:
>>>>
>>>>> I don't think any of the gotchas apply to me (at the bottom of this
>>>>> link).
>>>>>
>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#junit-rule-miniclusterwithclientresource
>>>>>
>>>>> I'm assuming for a batch job that I don't have to do anything for:
>>>>> "You can implement a custom parallel source function for emitting
>>>>> watermarks if your job uses event time timers."
>>>>>
>>>>> On Tue, Oct 6, 2020 at 2:42 PM Dan Hill <quietgol...@gmail.com> wrote:
>>>>>
>>>>>> I've tried to enable additional logging for a few hours today. I
>>>>>> think something with junit5 is swallowing the logs. I'm using Bazel and
>>>>>> junit5. I set up MiniClusterResourceConfiguration using a custom
>>>>>> extension. Are there any known issues with Flink and junit5? I can try
>>>>>> switching to junit4.
>>>>>>
>>>>>> When I've binary searched this issue, this failure happens if my
>>>>>> query in step 3 has a join in it. If I remove the join, I can remove
>>>>>> step 4 and the code still works. I've renamed a bunch of my tables too
>>>>>> and the problem still exists.
>>>>>>
>>>>>> On Tue, Oct 6, 2020, 00:42 Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> there were some bugs and quirks in the MiniCluster that we recently
>>>>>>> fixed:
>>>>>>>
>>>>>>> - https://issues.apache.org/jira/browse/FLINK-19123
>>>>>>> - https://issues.apache.org/jira/browse/FLINK-19264
>>>>>>>
>>>>>>> But I think they are probably unrelated to your case. Could you enable
>>>>>>> logging and see from the logs whether the 2) and 3) jobs execute
>>>>>>> correctly on the MiniCluster?
>>>>>>>
>>>>>>> Best,
>>>>>>> Aljoscha
>>>>>>>
>>>>>>> On 06.10.20 08:08, Dan Hill wrote:
>>>>>>> > I'm writing a test for a batch job using
>>>>>>> > MiniClusterResourceConfiguration.
>>>>>>> >
>>>>>>> > Here's a simple description of my working test case:
>>>>>>> > 1) I use TableEnvironment.executeSql(...) to create a source and
>>>>>>> > sink table using tmp filesystem directory.
>>>>>>> > 2) I use executeSql to insert some test data into the source table.
>>>>>>> > 3) I use executeSql to select from source and insert into sink.
>>>>>>> > 4) I use executeSql from the same source to a different sink.
>>>>>>> >
>>>>>>> > When I do these steps, it works. If I remove step 4, no data gets
>>>>>>> > written to the sink. My actual code is more complex than this (has
>>>>>>> > create view, join and more tables). This is a simplified description
>>>>>>> > but highlights the weird error.
>>>>>>> >
>>>>>>> > Has anyone hit issues like this? I'm assuming I have a small code
>>>>>>> > bug in my queries that's causing issues. These queries appear to
>>>>>>> > work in production so I'm confused. Are there ways of viewing failed
>>>>>>> > jobs or queries with MiniClusterResourceConfiguration?
>>>>>>> >
>>>>>>> > Thanks!
>>>>>>> > - Dan