szaszm commented on a change in pull request #1138: URL: https://github.com/apache/nifi-minifi-cpp/pull/1138#discussion_r679838081
########## File path: Extensions.md ########## @@ -16,108 +16,64 @@ To enable all extensions for your platform, you may use -DENABLE_ALL=TRUE OR select the option to "Enable all Extensions" in the bootstrap script. [ReadMe](https://github.com/apache/nifi-minifi-cpp/#bootstrapping) -# Extensions by example - -Extensions consist of modules that are conditionally built into your client. Reasons why you may wish to do this with your modules/processors - - - Do not with to make dependencies required or the lack thereof is a known/expected runtime condition. - - You wish to allow users to exclude dependencies for a variety of reasons. - -# Extensions by example -We've used HTTP-CURL as the first example. We've taken all libcURL runtime classes and placed them into an extensions folder - - /extensions/http-curl - -This folder contains a CMakeLists file so that we can conditionally build it. In the case with libcURL, if the user does not have curl installed OR they specify -DDISABLE_CURL=true in the cmake build, the extensions will not be built. In this case, when the extension is not built, C2 REST protocols, InvokeHTTP, and an HTTP Client implementation will not be included. - -Your CMAKE file should build a static library, that will be included into the run time. This must be added with your conditional to the libminifi CMAKE, along with a platform specific whole archive inclusion. Note that this will ensure that despite no direct linkage being found by the compiler, we will include the code so that we can dynamically find your code. - -# Including your extension in the build -There is a new function that can be used in the root cmake to build and included your extension. An example is based on the LibArchive extension. The createExtension function has 8 possible arguments. The first five arguments are required. -The first argument specifies the variable controlling the exclusion of this extension, followed by the variable that -is used when including it into conditional statements. The third argument is the pretty name followed by the description of the extension and the extension directory. The first optional argument is the test directory, which must also contain a CMakeLists.txt file. The seventh argument can be a conditional variable that tells us whether or not to add a third party subdirectory specified by the final extension. - -In the lib archive example, we provide all arguments, but your extension may only need the first five and the the test folder. The seventh and eighth arguments must be included in tandem. - -```cmake -if ( NOT LibArchive_FOUND OR BUILD_LIBARCHIVE ) - set(BUILD_TP "TRUE") -endif() -createExtension(DISABLE_LIBARCHIVE - ARCHIVE-EXTENSIONS - "ARCHIVE EXTENSIONS" - "This Enables libarchive functionality including MergeContent, CompressContent, and (Un)FocusArchiveEntry" - "extensions/libarchive" - "${TEST_DIR}/archive-tests" - BUILD_TP - "thirdparty/libarchive-3.3.2") +# Extension internals +Extensions are dynamic libraries loaded at runtime by the agent. An extension makes its +capabilities (classes) available to the system through registrars. + +``` C++ +// register user-facing classes as +REGISTER_RESOURCE(InvokeHTTP, "An HTTP client processor which can interact with a configurable HTTP Endpoint. " + "The destination URL and HTTP Method are configurable. FlowFile attributes are converted to HTTP headers and the " + "FlowFile contents are included as the body of the request (if the HTTP Method is PUT, POST or PATCH)."); + +// register internal resources as +REGISTER_INTERNAL_RESOURCE(HTTPClient); ``` -It is advised that you also add your extension to bootstrap.sh as that is the suggested method of configuring MiNiFi C++ - -# C bindings -To find your classes, you must adhere to a dlsym call back that adheres to the core::ObjectFactory class, like the one below. This object factory will return a list of classes, that we can instantiate through the class loader mechanism. Note that since we are including your code directly into our runtime, we will take care of dlopen and dlsym calls. A map from the class name to the object factory is kept in memory. +If you decide to put them in a header file, you better make sure that there are no dependencies between extensions, +as the inclusion of such header from extension "A", will force it to be defined in the including extension "B". This could result +in shadowing each other's resources. Best to put them in source files. Review comment: I would simplify this by only saying that them must go in source files. You could add it to the first paragraph as well. ########## File path: conf/minifi.properties ########## @@ -20,6 +20,9 @@ nifi.administrative.yield.duration=30 sec # If a component has no work to do (is "bored"), how long should we wait before checking again for work? nifi.bored.yield.duration=100 millis +# Comma separated path for the extension libraries. Relative path is relative to the minifi executable. +nifi.extension.path=../extensions/* Review comment: Can we make this relative to MINIFI_HOME instead? I think it would be easier to use. ########## File path: libminifi/test/script-tests/ExecutePythonProcessorTests.cpp ########## @@ -53,7 +53,6 @@ class ExecutePythonProcessorTestBase { testController_.reset(new TestController()); plan_ = testController_->createPlan(); logTestController_.setDebug<TestPlan>(); - logTestController_.setDebug<minifi::python::processors::ExecutePythonProcessor>(); Review comment: Is there a reason for this removal? ########## File path: Extensions.md ########## @@ -16,108 +16,64 @@ To enable all extensions for your platform, you may use -DENABLE_ALL=TRUE OR select the option to "Enable all Extensions" in the bootstrap script. [ReadMe](https://github.com/apache/nifi-minifi-cpp/#bootstrapping) -# Extensions by example - -Extensions consist of modules that are conditionally built into your client. Reasons why you may wish to do this with your modules/processors - - - Do not with to make dependencies required or the lack thereof is a known/expected runtime condition. - - You wish to allow users to exclude dependencies for a variety of reasons. - -# Extensions by example -We've used HTTP-CURL as the first example. We've taken all libcURL runtime classes and placed them into an extensions folder - - /extensions/http-curl - -This folder contains a CMakeLists file so that we can conditionally build it. In the case with libcURL, if the user does not have curl installed OR they specify -DDISABLE_CURL=true in the cmake build, the extensions will not be built. In this case, when the extension is not built, C2 REST protocols, InvokeHTTP, and an HTTP Client implementation will not be included. - -Your CMAKE file should build a static library, that will be included into the run time. This must be added with your conditional to the libminifi CMAKE, along with a platform specific whole archive inclusion. Note that this will ensure that despite no direct linkage being found by the compiler, we will include the code so that we can dynamically find your code. - -# Including your extension in the build -There is a new function that can be used in the root cmake to build and included your extension. An example is based on the LibArchive extension. The createExtension function has 8 possible arguments. The first five arguments are required. -The first argument specifies the variable controlling the exclusion of this extension, followed by the variable that -is used when including it into conditional statements. The third argument is the pretty name followed by the description of the extension and the extension directory. The first optional argument is the test directory, which must also contain a CMakeLists.txt file. The seventh argument can be a conditional variable that tells us whether or not to add a third party subdirectory specified by the final extension. - -In the lib archive example, we provide all arguments, but your extension may only need the first five and the the test folder. The seventh and eighth arguments must be included in tandem. - -```cmake -if ( NOT LibArchive_FOUND OR BUILD_LIBARCHIVE ) - set(BUILD_TP "TRUE") -endif() -createExtension(DISABLE_LIBARCHIVE - ARCHIVE-EXTENSIONS - "ARCHIVE EXTENSIONS" - "This Enables libarchive functionality including MergeContent, CompressContent, and (Un)FocusArchiveEntry" - "extensions/libarchive" - "${TEST_DIR}/archive-tests" - BUILD_TP - "thirdparty/libarchive-3.3.2") +# Extension internals +Extensions are dynamic libraries loaded at runtime by the agent. An extension makes its +capabilities (classes) available to the system through registrars. + +``` C++ +// register user-facing classes as +REGISTER_RESOURCE(InvokeHTTP, "An HTTP client processor which can interact with a configurable HTTP Endpoint. " + "The destination URL and HTTP Method are configurable. FlowFile attributes are converted to HTTP headers and the " + "FlowFile contents are included as the body of the request (if the HTTP Method is PUT, POST or PATCH)."); + +// register internal resources as +REGISTER_INTERNAL_RESOURCE(HTTPClient); ``` -It is advised that you also add your extension to bootstrap.sh as that is the suggested method of configuring MiNiFi C++ - -# C bindings -To find your classes, you must adhere to a dlsym call back that adheres to the core::ObjectFactory class, like the one below. This object factory will return a list of classes, that we can instantiate through the class loader mechanism. Note that since we are including your code directly into our runtime, we will take care of dlopen and dlsym calls. A map from the class name to the object factory is kept in memory. +If you decide to put them in a header file, you better make sure that there are no dependencies between extensions, +as the inclusion of such header from extension "A", will force it to be defined in the including extension "B". This could result +in shadowing each other's resources. Best to put them in source files. + +Some extensions (e.g. `http-curl`) require initialization before use. You need to subclass `Extension` and let the system know by using `REGISTER_EXTENSION`. ```C++ -class __attribute__((visibility("default"))) HttpCurlObjectFactory : public core::ObjectFactory { +class HttpCurlExtension : core::extension::Extension { public: Review comment: Could we do that without subclassing? For example by registering extension init and deinit functions as needed instead of a class. Or registering a class whose ctor and dtor are init and deinit, but without subclassing. Or using a specially named pair of functions (e.g. init_extension() and deinit_extension()) that are called after dlopen by minifi, if they are defined. I like to avoid inheritance unless it's by far the simplest solution, or I'm modeling is-a relationships. The style guide seems to be leaning in this direction as well: https://google.github.io/styleguide/cppguide.html#Inheritance There's also this funny talk title: [Inheritance Is The Base Class of Evil](https://youtu.be/bIhUE5uUFOA) (Sean Parent) ########## File path: libminifi/include/core/state/nodes/DeviceInformation.h ########## @@ -39,6 +37,10 @@ #else #pragma comment(lib, "iphlpapi.lib") +#ifndef WIN32_LEAN_AND_MEAN +#define WIN32_LEAN_AND_MEAN 1 +#endif Review comment: Maybe it would make sense to have this defined in cmake for our targets. Does this help with macros btw? ########## File path: libminifi/include/utils/Export.h ########## @@ -15,14 +15,21 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -#include "TemplateLoader.h" -#include "core/FlowConfiguration.h" -bool TemplateFactory::added = core::FlowConfiguration::add_static_func("createTemplateFactory"); +#pragma once -extern "C" { - -void *createTemplateFactory(void) { - return new TemplateFactory(); -} -} +#ifdef WIN32 + #ifdef LIBMINIFI + #define MINIFIAPI __declspec(dllexport) + #else + #define MINIFIAPI __declspec(dllimport) + #endif + #ifdef MODULE_NAME + #define EXTENSIONAPI __declspec(dllexport) + #else + #define EXTENSIONAPI __declspec(dllimport) + #endif +#else + #define MINIFIAPI + #define EXTENSIONAPI +#endif Review comment: Instead of this approach, I think we should have an exports header for each of our targets, and use the target-specific macros. Extensions shouldn't share export macros IMO. See this for a better explanation: https://youtu.be/m0DwB4OvDXk?t=610 ########## File path: libminifi/src/utils/file/FileMatcher.cpp ########## @@ -0,0 +1,294 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include "utils/file/FileMatcher.h" +#include "utils/file/FileUtils.h" +#include "utils/StringUtils.h" + +namespace org { +namespace apache { +namespace nifi { +namespace minifi { +namespace utils { +namespace file { + +std::shared_ptr<core::logging::Logger> FileMatcher::FilePattern::logger_ = logging::LoggerFactory<FileMatcher::FilePattern>::getLogger(); +std::shared_ptr<core::logging::Logger> FileMatcher::logger_ = logging::LoggerFactory<FileMatcher>::getLogger(); + +static bool isGlobPattern(const std::string& pattern) { + return pattern.find_first_of("?*") != std::string::npos; +} + +static std::vector<std::string> split(const std::string& str, const std::vector<std::string>& delimiters) { + std::vector<std::string> result; + + size_t prev_delim_end = 0; + size_t next_delim_begin = std::string::npos; + do { + for (const auto& delim : delimiters) { + next_delim_begin = str.find(delim, prev_delim_end); + if (next_delim_begin != std::string::npos) { + result.push_back(str.substr(prev_delim_end, next_delim_begin - prev_delim_end)); + prev_delim_end = next_delim_begin + delim.length(); + break; + } + } + } while (next_delim_begin != std::string::npos); + result.push_back(str.substr(prev_delim_end)); + return result; +} + +#ifdef WIN32 +static const std::vector<std::string> path_separators{"/", "\\"}; +#else +static const std::vector<std::string> path_separators{"/"}; +#endif + +optional<FileMatcher::FilePattern> FileMatcher::FilePattern::fromPattern(std::string pattern) { + pattern = utils::StringUtils::trim(pattern); + bool excluding = false; + if (!pattern.empty() && pattern[0] == '!') { + excluding = true; + pattern = utils::StringUtils::trim(pattern.substr(1)); + } + if (pattern.empty()) { + logger_->log_error("Empty pattern"); + return nullopt; + } + std::string exe_dir = get_executable_dir(); + if (exe_dir.empty() && !isAbsolutePath(pattern.c_str())) { + logger_->log_error("Couldn't determine executable dir, relative pattern '%s' not supported", pattern); + return nullopt; + } + pattern = resolve(exe_dir, pattern); + auto segments = split(pattern, path_separators); + gsl_Expects(!segments.empty()); + auto file_pattern = segments.back(); + if (file_pattern == "**") { + file_pattern = "*"; + } else { + segments.pop_back(); + } + if (file_pattern == "." || file_pattern == "..") { + logger_->log_error("Invalid file pattern '%s'", file_pattern); + return nullopt; + } + bool after_wildcard = false; + for (const auto& segment : segments) { + if (after_wildcard && segment == "..") { + logger_->log_error("Parent accessor is not supported after wildcards"); + return nullopt; + } + if (isGlobPattern(segment)) { + after_wildcard = true; + } + } + return FilePattern(segments, file_pattern, excluding); +} + +std::string FileMatcher::FilePattern::getBaseDirectory() const { + std::string base_dir; + for (const auto& segment : directory_segments_) { + // ignore segments at or after wildcards + if (isGlobPattern(segment)) { + break; + } + base_dir += segment + get_separator(); + } + return base_dir; +} + +FileMatcher::FileMatcher(const std::string &patterns) { + for (auto&& pattern : split(patterns, {","})) { + if (auto&& p = FilePattern::fromPattern(pattern)) { + patterns_.push_back(std::move(p.value())); + } + } +} + +template<typename It> +static bool advance_if_not_equal(It& it, const It& end) { + if (it == end) { + return false; + } + ++it; + return true; +} + +static bool is_this_dir(const std::string& dir) { + return dir.empty() || dir == "."; +} + +template<typename It, typename Fn> +static void skip_if(It& it, const It& end, const Fn& fn) { + while (it != end && fn(*it)) { + ++it; + } +} + +static bool matchGlob(std::string::const_iterator pattern_it, std::string::const_iterator pattern_end, std::string::const_iterator value_it, std::string::const_iterator value_end) { Review comment: You might want to use `std::string_view` ########## File path: libminifi/test/azure-tests/CMakeLists.txt ########## @@ -30,9 +30,9 @@ FOREACH(testfile ${AZURE_INTEGRATION_TESTS}) target_compile_features(${testfilename} PRIVATE cxx_std_14) createTests("${testfilename}") target_link_libraries(${testfilename} ${CATCH_MAIN_LIB}) - target_wholearchive_library(${testfilename} minifi-azure) - target_wholearchive_library(${testfilename} minifi-standard-processors) - target_wholearchive_library(${testfilename} minifi-expression-language-extensions) + target_link_libraries(${testfilename} minifi-azure) Review comment: Incorrect indentation here. (not new) ########## File path: libminifi/test/unit/MemoryUsageTest.cpp ########## @@ -16,6 +16,9 @@ * limitations under the License. */ +// including the extensions messes up the memory usage +#define EXTENSION_LIST "" Review comment: What does this do? What was the problem? ########## File path: libminifi/test/aws-tests/CMakeLists.txt ########## @@ -30,9 +30,9 @@ FOREACH(testfile ${AWS_INTEGRATION_TESTS}) target_include_directories(${testfilename} PRIVATE BEFORE "${CMAKE_SOURCE_DIR}/extensions/expression-language") createTests("${testfilename}") target_link_libraries(${testfilename} ${CATCH_MAIN_LIB}) - target_wholearchive_library(${testfilename} minifi-aws) - target_wholearchive_library(${testfilename} minifi-standard-processors) - target_wholearchive_library(${testfilename} minifi-expression-language-extensions) + target_link_libraries(${testfilename} minifi-aws) Review comment: Incorrect indentation (not new) ########## File path: main/CMakeLists.txt ########## @@ -39,16 +39,8 @@ if(WIN32) endif() add_executable(minifiexe ${MINIFIEXE_SOURCES}) -if (NOT USE_SHARED_LIBS) - if (LIBC_STATIC) - set_target_properties(minifiexe PROPERTIES LINK_SEARCH_START_STATIC 1) - endif(LIBC_STATIC) -endif(NOT USE_SHARED_LIBS) Review comment: Why are we throwing out static standard library support? ########## File path: libminifi/test/LogUtils.h ########## @@ -0,0 +1,51 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#pragma once + +#include <sstream> +#include <memory> +#include <string> +#include <utility> + +#include "spdlog/sinks/sink.h" +#include "spdlog/sinks/ostream_sink.h" + +class StringStreamSink : public spdlog::sinks::sink { + public: + explicit StringStreamSink(std::shared_ptr<std::ostringstream> stream, bool force_flush = false) + : stream_(std::move(stream)), sink_(*stream_, force_flush) {} + + ~StringStreamSink() override = default; + + void log(const spdlog::details::log_msg &msg) override { + sink_.log(msg); + } + void flush() override { + sink_.flush(); + } + void set_pattern(const std::string &pattern) override { + sink_.set_pattern(pattern); + } + void set_formatter(std::unique_ptr<spdlog::formatter> sink_formatter) override { + sink_.set_formatter(std::move(sink_formatter)); + } + + private: + std::shared_ptr<std::ostringstream> stream_; Review comment: I'd prefer this to be an observer pointer, i.e. without ownership. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
