[ https://issues.apache.org/jira/browse/METRON-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490968#comment-16490968 ]
ASF GitHub Bot commented on METRON-1569:
----------------------------------------
Github user mmiklavc commented on a diff in the pull request:
https://github.com/apache/metron/pull/1022#discussion_r190946427
--- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java ---
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.elasticsearch.writer;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+import org.apache.commons.lang.ClassUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.commons.lang3.exception.ExceptionUtils;
+import org.apache.metron.common.configuration.writer.WriterConfiguration;
+import org.apache.metron.common.field.DeDotFieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverter;
+import org.apache.metron.common.field.FieldNameConverters;
+import org.apache.metron.common.field.NoopFieldNameConverter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.invoke.MethodHandles;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * A {@link FieldNameConverterFactory} that is backed by a cache.
+ *
+ * <p>Each sensor type can use a different {@link FieldNameConverter} implementation.
+ *
+ * <p>The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter}
+ * that should be used for a given sensor type.
+ *
+ * <p>The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time
+ * after they are created. Once they expire, the {@link WriterConfiguration} is used to
+ * reload the {@link FieldNameConverter}.
+ *
+ * <p>The user can change the {@link FieldNameConverter} in use at runtime. A change
--- End diff --
I actually liked what you had with your enum pattern. I was imagining a very
minor change to your code, like below. The other change would be to have the
config class provide the default converter value, or to have it return an
Optional<String> if you wanted the ES writer to own setting the default:
```
public interface FieldNameConverter {
  String convert(String originalField);
}

public enum FieldNameConverters implements FieldNameConverter {
  NOOP(new NoopFieldNameConverter()),
  DEDOT(new DeDotFieldNameConverter());

  private FieldNameConverter converter;

  FieldNameConverters(FieldNameConverter converter) {
    this.converter = converter;
  }

  @Override
  public String convert(String originalField) {
    return converter.convert(originalField);
  }
}

public class ElasticsearchWriter implements BulkMessageWriter<JSONObject>, Serializable {

  write( ... ) {
    String converterType = config.getFieldNameConverter(sensorType);
    // or this -> String converterType = config.getFieldNameConverter(sensorType).orElse("DEDOT");

    // fetch the field name converter for this sensor type
    FieldNameConverter fieldNameConverter = FieldNameConverters.valueOf(converterType);

    final String indexPostfix = dateFormat.format(new Date());
    BulkRequestBuilder bulkRequest = client.prepareBulk();
    for(JSONObject message: messages) {
      JSONObject esDoc = new JSONObject();
      for(Object k : message.keySet()){
        copyField(k.toString(), message, esDoc, fieldNameConverter);
      }
      ...
    }
    ...
  }

  private void copyField(... FieldNameConverter fieldNameConverter) {
    String destinationFieldName = fieldNameConverter.convert(sourceFieldName);
    ...
  }
  ...
}
```
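For anyone who wants to try the enum lookup in isolation, here is a minimal, self-contained sketch of the same idea. It is not the Metron code: the two converters are inlined as lambdas rather than the real `NoopFieldNameConverter` and `DeDotFieldNameConverter` classes, and `fromConfig` is a hypothetical helper standing in for the `Optional<String>` variant, with `DEDOT` assumed as the default.

```java
import java.util.Optional;

// Same shape as the interface in the review comment.
interface FieldNameConverter {
    String convert(String originalField);
}

// Each enum constant delegates to a converter, so the enum itself
// can be used anywhere a FieldNameConverter is expected.
enum FieldNameConverters implements FieldNameConverter {
    // Assumption: inlined stand-ins for the real Noop/DeDot converter classes.
    NOOP(field -> field),
    DEDOT(field -> field.replace(".", ":"));

    private final FieldNameConverter converter;

    FieldNameConverters(FieldNameConverter converter) {
        this.converter = converter;
    }

    @Override
    public String convert(String originalField) {
        return converter.convert(originalField);
    }

    // Hypothetical helper: the writer owns the default ("DEDOT") when the
    // configuration does not name a converter for the sensor type.
    static FieldNameConverter fromConfig(Optional<String> configured) {
        return valueOf(configured.orElse("DEDOT"));
    }
}

class ConverterSketch {
    public static void main(String[] args) {
        FieldNameConverter byDefault = FieldNameConverters.fromConfig(Optional.empty());
        System.out.println(byDefault.convert("ip.src.addr")); // ip:src:addr

        FieldNameConverter noop = FieldNameConverters.fromConfig(Optional.of("NOOP"));
        System.out.println(noop.convert("ip.src.addr")); // ip.src.addr
    }
}
```

One thing to keep in mind with this approach: `valueOf` throws an `IllegalArgumentException` for a name that matches no constant, so a typo in the configured converter name would surface in `write()` unless the value is validated earlier.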
> Allow user to change field name conversion when indexing to Elasticsearch
> -------------------------------------------------------------------------
>
> Key: METRON-1569
> URL: https://issues.apache.org/jira/browse/METRON-1569
> Project: Metron
> Issue Type: Improvement
> Reporter: Nick Allen
> Assignee: Nick Allen
> Priority: Major
>
> The `ElasticsearchWriter` has a mechanism to transform the field names of a
> message before it is written to Elasticsearch. Right now this mechanism is
> hard-coded to replace all '.' dots with ':' colons.
> This mechanism was needed for Elasticsearch 2.x which did not allow dots in
> field names. Now that Metron supports Elasticsearch 5.x this is no longer a
> problem.
> A user should be able to configure the field name transformation when writing
> to Elasticsearch, as needed.
> While it might have been simpler to just remove the de-dotting mechanism,
> this would break backwards compatibility. Providing users with a means to
> configure this mechanism provides them with an upgrade path.
>
>
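To make the field name transformation described in the issue concrete, here is a rough sketch of what de-dotting does to a message before indexing. It is an illustration only, not the `ElasticsearchWriter` implementation: the class name, the `renameFields` helper, the plain `Map` standing in for the message, and the sample field names are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

class FieldRenameSketch {

    // De-dotting as the issue describes it: every '.' becomes ':'.
    static final UnaryOperator<String> DEDOT = name -> name.replace('.', ':');

    // The alternative a user might configure: leave field names untouched.
    static final UnaryOperator<String> NOOP = UnaryOperator.identity();

    // Hypothetical helper: copy each field into a new document under its
    // converted name, keeping the values and the field order as they are.
    static Map<String, Object> renameFields(Map<String, Object> message,
                                            UnaryOperator<String> converter) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : message.entrySet()) {
            doc.put(converter.apply(field.getKey()), field.getValue());
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> message = new LinkedHashMap<>();
        message.put("ip.src.addr", "10.0.0.1");
        message.put("timestamp", 1527000000000L);

        System.out.println(renameFields(message, DEDOT).keySet()); // [ip:src:addr, timestamp]
        System.out.println(renameFields(message, NOOP).keySet());  // [ip.src.addr, timestamp]
    }
}
```

Keeping the de-dotting converter as the default would preserve existing field names for current users, while the no-op option is what gives them the upgrade path mentioned in the issue.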