somandal commented on code in PR #16094:
URL: https://github.com/apache/pinot/pull/16094#discussion_r2192993019
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/inv/text/LuceneFSTIndexCreator.java:
##########
@@ -50,36 +56,56 @@ public class LuceneFSTIndexCreator implements FSTIndexCreator {
*
* @param indexDir Index directory
* @param columnName Column name for which index is being created
+ * @param tableNameWithType table name with type
* @param sortedEntries Sorted entries of the unique values of the column.
* @throws IOException
*/
- public LuceneFSTIndexCreator(File indexDir, String columnName, String[] sortedEntries)
+ public LuceneFSTIndexCreator(File indexDir, String columnName, String tableNameWithType, String[] sortedEntries)
throws IOException {
+ this(indexDir, columnName, tableNameWithType, sortedEntries, new FSTBuilder());
+ }
+
+ @VisibleForTesting
+ public LuceneFSTIndexCreator(File indexDir, String columnName, String tableNameWithType, String[] sortedEntries,
+ FSTBuilder fstBuilder)
+ throws IOException {
+ _tableNameWithType = tableNameWithType;
_fstIndexFile = new File(indexDir, columnName + V1Constants.Indexes.LUCENE_V912_FST_INDEX_FILE_EXTENSION);
- _fstBuilder = new FSTBuilder();
+ _fstBuilder = fstBuilder;
_dictId = 0;
if (sortedEntries != null) {
for (_dictId = 0; _dictId < sortedEntries.length; _dictId++) {
- _fstBuilder.addEntry(sortedEntries[_dictId], _dictId);
+ try {
+ _fstBuilder.addEntry(sortedEntries[_dictId], _dictId);
+ } catch (IOException ex) {
+ // Caught exception while trying to add, update metric and skip the document
+ String metricKeyName =
+ _tableNameWithType + "-" + FstIndexType.INDEX_DISPLAY_NAME.toUpperCase(Locale.US) + "-indexingError";
+ ServerMetrics.get().addMeteredTableValue(metricKeyName, ServerMeter.INDEXING_FAILURES, 1);
Review Comment:
We haven't done that for the H3 changes either, btw. @Jackie-Jiang @KKcorps,
any thoughts on whether we should expose this option to users for non-JSON
indexes as well?
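
To make the question concrete, here is a rough sketch of what a user-facing switch could look like. It assumes the option under discussion is a boolean that chooses between skipping a failed entry and failing the build. `SkipOnErrorSketch`, `EntryAdder`, `addAll`, and `continueOnError` are hypothetical names for illustration only, not existing Pinot APIs, and the failure counter stands in for the metric update in the diff.

```java
import java.io.IOException;

public class SkipOnErrorSketch {
  /** Hypothetical stand-in for a builder whose addEntry may throw. */
  public interface EntryAdder {
    void addEntry(String key, int dictId) throws IOException;
  }

  /**
   * Adds every entry in order and returns the number of entries skipped.
   * With continueOnError == false, the first failure is rethrown instead.
   */
  public static int addAll(String[] sortedEntries, EntryAdder adder, boolean continueOnError)
      throws IOException {
    int failures = 0;
    for (int dictId = 0; dictId < sortedEntries.length; dictId++) {
      try {
        adder.addEntry(sortedEntries[dictId], dictId);
      } catch (IOException ex) {
        if (!continueOnError) {
          // Option disabled: surface the error and fail the index build.
          throw ex;
        }
        // Option enabled: count the failure (a real implementation would
        // update a metric here) and move on to the next entry.
        failures++;
      }
    }
    return failures;
  }
}
```

With the flag off, the behavior matches the code before this PR (the exception propagates). With it on, it matches the new skip-and-count path.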