GitHub user uce opened a pull request: https://github.com/apache/flink/pull/5102
[FLINK-7762, FLINK-8167] Clean up and harden WikipediaEditsSource

## What is the purpose of the change

This pull request addresses two related issues with the WikipediaEditsSource. It makes the WikipediaEditsSourceTest a proper test instead of unnecessarily starting a FlinkMiniCluster and addresses a potential test instability.

In general, the WikipediaEditsSource is not in good shape and could benefit from further refactoring. One potential area for improvement is integration with the asynchronous channel listener that reports events like errors or being kicked out of a channel, etc. I did not do this due to time constraints and the fact that this is not a production source.

In general, it is questionable whether we should keep the test as is or remove it since it depends on connectivity to an IRC channel.

## Brief change log

- Harden WikipediaEditsSource with eager sanity checks
- Make WikipediaEditsSourceTest proper test

## Verifying this change

This change is a rework/code cleanup without any new test coverage.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): yes, but only to `flink-test-utils-junit`
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 7762-8167-wikiedits

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5102

----
commit b2ab66f05ce545214a8132dc2d46b3143939b015
Author: Ufuk Celebi <u...@apache.org>
Date:   2017-11-29T15:28:18Z

    [FLINK-8167] [connector-wikiedits] Harden WikipediaEditsSource

    - Minor eager sanity checks
    - Use UUID suffix for nickname. As reported in FLINK-8167, the
      current nickname suffix can result in nickname clashes which lead
      to test failures.

commit 06ec1542963bbe2afaf1ad1fd55a54d13f855304
Author: Ufuk Celebi <u...@apache.org>
Date:   2017-11-29T15:36:29Z

    [FLINK-7762] [connector-wikiedits] Make WikipediaEditsSourceTest proper test

    The WikipediaEditsSourceTest unnecessarily implements an integration
    test that starts a FlinkMiniCluster and executes a small Flink
    program.

    This simply creates a source and executes run in a separate thread
    until a single WikipediaEditEvent is received.

----

---
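
For readers who want a feel for the two commits without opening the diff, here is a minimal sketch of the ideas they describe: eager argument checks, a UUID-suffixed nickname, and a test helper that runs a source in a separate thread until one event arrives. It is not the code from this pull request. `WikiEditsSketch`, `Source`, `FakeSource`, `uniqueNickname`, and `firstEvent` are hypothetical names invented for illustration, the toy source emits strings instead of connecting to IRC, and nothing here uses the Flink API.

```java
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class WikiEditsSketch {

    /** Hypothetical stand-in for a streaming source; NOT Flink's SourceFunction. */
    interface Source<T> {
        void run(Consumer<T> out) throws Exception;
        void cancel();
    }

    /** Appends a random UUID so that concurrent runs are very unlikely to share a nickname. */
    static String uniqueNickname(String prefix) {
        if (prefix == null || prefix.isEmpty()) {
            throw new IllegalArgumentException("prefix must not be null or empty");
        }
        return prefix + "-" + UUID.randomUUID();
    }

    /** Toy source (assumption): emits a string every 10 ms until cancelled. */
    static final class FakeSource implements Source<String> {
        private final String nickname;
        private volatile boolean running = true;

        FakeSource(String host, int port, String nickname) {
            // Eager sanity checks: fail in the constructor, not later inside run().
            if (host == null || host.isEmpty()) {
                throw new IllegalArgumentException("host must not be null or empty");
            }
            if (port < 1 || port > 65535) {
                throw new IllegalArgumentException("port out of range: " + port);
            }
            if (nickname == null || nickname.isEmpty()) {
                throw new IllegalArgumentException("nickname must not be null or empty");
            }
            this.nickname = nickname;
        }

        @Override
        public void run(Consumer<String> out) throws InterruptedException {
            while (running) {
                out.accept("edit by " + nickname);
                Thread.sleep(10);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

    /** Runs the source in its own thread; returns the first event, or null on timeout. */
    static <T> T firstEvent(Source<T> source, long timeoutMillis) {
        BlockingQueue<T> events = new ArrayBlockingQueue<>(1);
        Thread runner = new Thread(() -> {
            try {
                source.run(events::offer); // events beyond the first are dropped
            } catch (Exception e) {
                // Sketch only: a real test would record the failure and rethrow it.
            }
        }, "source-runner");
        runner.setDaemon(true);
        runner.start();
        try {
            return events.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an event", e);
        } finally {
            // Stop the source whether or not an event arrived.
            source.cancel();
            runner.interrupt();
        }
    }
}
```

A test built on this would call `firstEvent(new FakeSource("localhost", 6667, uniqueNickname("flink")), 5000)` and assert that the result is non-null. The timeout bounds how long the test can hang if the source never produces anything, which matters for a source that depends on external connectivity.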