[GitHub] [flink] fapaul commented on a change in pull request #17363: [FLINK-24324][connectors/elasticsearch] Add Elasticsearch 7 sink based on FLIP-143

GitBox Fri, 01 Oct 2021 06:28:52 -0700


fapaul commented on a change in pull request #17363:
URL: https://github.com/apache/flink/pull/17363#discussion_r720245959




##########
File path: flink-connectors/flink-connector-elasticsearch7/src/main/java/org/apache/flink/connector/elasticsearch/sink/ElasticsearchProcessor.java
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.elasticsearch.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.functions.Function;
+import org.apache.flink.api.connector.sink.SinkWriter;
+
+import org.elasticsearch.action.ActionRequest;
+
+import java.io.Serializable;
+
+/**
+ * Creates multiple {@link ActionRequest ActionRequests} from the incoming elements.
+ *
+ * <p>This is used by sinks to prepare elements for sending them to Elasticsearch.
+ *
+ * <p>Example:
+ *
+ * <pre>{@code
+ * private static class TestElasticsearchProcessor implements ElasticsearchProcessor<Tuple2<Integer, String>> {
+ *
+ *     public IndexRequest createIndexRequest(Tuple2<Integer, String> element) {
+ *         Map<String, Object> json = new HashMap<>();
+ *         json.put("data", element.f1);
+ *
+ *         return Requests.indexRequest()
+ *                 .index("my-index")
+ *                 .type("my-type")
+ *                 .id(element.f0.toString())
+ *                 .source(json);
+ *     }
+ *
+ *     public void process(Tuple2<Integer, String> element, SinkWriter.Context context, RequestIndexer indexer) {
+ *         indexer.add(createIndexRequest(element));
+ *     }
+ * }
+ *
+ * }</pre>
+ *
+ * @param <T> The type of the element handled by this {@link ElasticsearchProcessor}
+ */
+@PublicEvolving
+public interface ElasticsearchProcessor<T> extends Function, Serializable {
+
+    /**
+     * Initialization method for the function. It is called once before the actual working process
+     * methods.
+     */
+    default void open() throws Exception {}
+
+    /** Tear-down method for the function. It is called when the sink closes. */
+    default void close() throws Exception {}
+
+    /**
+     * Process the incoming element to produce multiple {@link ActionRequest ActionRequests}. The
+     * produced requests should be added to the provided {@link RequestIndexer}.
+     *
+     * @param element incoming element to process
+     * @param context to access additional information about the record
+     * @param indexer request indexer that {@code ActionRequest} should be added to
+     */
+    void process(T element, SinkWriter.Context context, RequestIndexer indexer);

Review comment:
   If we return a list of requests, we lose the type information and would
need to do the respective casts later. Isn't this performance critical?
   I understand what you are aiming for: making it more applicable to the
other serialization schemas. For that, it would definitely help to have a return type.
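
   To make the trade-off concrete, here is a minimal, self-contained sketch of the two API shapes. It is not code from this PR: `IndexRequest`, `DeleteRequest`, `Bulk` and `RequestIndexer` below are simplified, hypothetical stand-ins for the Elasticsearch and connector classes, and the real signatures may differ. Variant A passes an indexer with typed overloads, so the compiler selects the target method. Variant B returns a `List<ActionRequest>`, so the writer has to recover the concrete type with `instanceof` checks and casts for every request.

```java
import java.util.ArrayList;
import java.util.List;

final class RequestDispatchSketch {

    // Hypothetical stand-ins for Elasticsearch's request types.
    interface ActionRequest {}

    static final class IndexRequest implements ActionRequest {
        final String id;
        IndexRequest(String id) { this.id = id; }
    }

    static final class DeleteRequest implements ActionRequest {
        final String id;
        DeleteRequest(String id) { this.id = id; }
    }

    // Stand-in for a bulk builder that only offers typed overloads.
    static final class Bulk {
        final List<String> log = new ArrayList<>();
        void add(IndexRequest r) { log.add("index:" + r.id); }
        void add(DeleteRequest r) { log.add("delete:" + r.id); }
    }

    // Variant A: the indexer exposes typed overloads and forwards them.
    static final class RequestIndexer {
        private final Bulk bulk;
        RequestIndexer(Bulk bulk) { this.bulk = bulk; }
        void add(IndexRequest r) { bulk.add(r); }
        void add(DeleteRequest r) { bulk.add(r); }
    }

    // Overload resolution happens at compile time; no casts are needed.
    static void processWithIndexer(String element, RequestIndexer indexer) {
        indexer.add(new IndexRequest(element));
        indexer.add(new DeleteRequest(element + "-old"));
    }

    // Variant B: the static type of each request is widened to ActionRequest.
    static List<ActionRequest> processReturningList(String element) {
        List<ActionRequest> out = new ArrayList<>();
        out.add(new IndexRequest(element));
        out.add(new DeleteRequest(element + "-old"));
        return out;
    }

    // The caller must recover the concrete type at runtime for every request.
    static void dispatch(List<ActionRequest> requests, Bulk bulk) {
        for (ActionRequest r : requests) {
            if (r instanceof IndexRequest) {
                bulk.add((IndexRequest) r);
            } else if (r instanceof DeleteRequest) {
                bulk.add((DeleteRequest) r);
            } else {
                throw new IllegalArgumentException("Unsupported request: " + r.getClass());
            }
        }
    }

    public static void main(String[] args) {
        Bulk viaIndexer = new Bulk();
        processWithIndexer("42", new RequestIndexer(viaIndexer));

        Bulk viaList = new Bulk();
        dispatch(processReturningList("42"), viaList);

        System.out.println(viaIndexer.log);
        System.out.println(viaList.log);
    }
}
```

   Both variants enqueue the same requests. The difference is the per-record list allocation and the runtime type checks in variant B, whose cost in the real sink would have to be measured rather than assumed.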




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.