Fokko commented on code in PR #3502:
URL: https://github.com/apache/parquet-java/pull/3502#discussion_r3256327637
##########
parquet-column/src/main/java/org/apache/parquet/column/values/dictionary/DictionaryValuesWriter.java:
##########
@@ -121,8 +125,20 @@ protected DictionaryPage dictPage(ValuesWriter dictPageWriter) {
@Override
public boolean shouldFallBack() {
- // if the dictionary reaches the max byte size or the values can not be encoded on 4 bytes anymore.
- return dictionaryByteSize > maxDictionaryByteSize || getDictionarySize() > MAX_DICTIONARY_ENTRIES;
+ return dictionarySizeExceeded;
+ }
+
+ /**
+ * Called by subclass write methods after adding a new dictionary entry to check if the dictionary
+ * has exceeded its size limits. This avoids the per-value virtual dispatch overhead of calling
+ * getDictionarySize() on every write -- the check only runs when a new entry is actually added.
+ *
+ * @param newDictionarySize the current dictionary size after adding the new entry
+ */
+ protected void checkDictionarySizeLimit(int newDictionarySize) {
Review Comment:
Since `dictionarySizeExceeded` is private, shouldn't this method be private
as well?
```suggestion
private void checkDictionarySizeLimit(int newDictionarySize) {
```
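For context, the pattern the Javadoc describes (re-evaluating the limits only when a new entry is added, then letting `shouldFallBack()` return a cached flag) could be sketched roughly as below. This is a minimal standalone illustration, not the PR's code: `SketchDictionaryWriter`, its `write(String)` method, the `HashMap`-backed dictionary, the constructor parameters, and the character-count size estimate are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the writer in the diff. Only the names
// shouldFallBack, checkDictionarySizeLimit, and dictionarySizeExceeded
// come from the PR; everything else is assumed for illustration.
class SketchDictionaryWriter {
  private final int maxDictionaryByteSize;
  private final int maxDictionaryEntries;
  private final Map<String, Integer> dictionary = new HashMap<>();
  private long dictionaryByteSize = 0;
  private boolean dictionarySizeExceeded = false;

  SketchDictionaryWriter(int maxDictionaryByteSize, int maxDictionaryEntries) {
    this.maxDictionaryByteSize = maxDictionaryByteSize;
    this.maxDictionaryEntries = maxDictionaryEntries;
  }

  // Cheap on the hot path: returns the cached flag, no size computation.
  public boolean shouldFallBack() {
    return dictionarySizeExceeded;
  }

  // Private as in the review suggestion; only code inside this class calls it.
  private void checkDictionarySizeLimit(int newDictionarySize) {
    if (dictionaryByteSize > maxDictionaryByteSize
        || newDictionarySize > maxDictionaryEntries) {
      dictionarySizeExceeded = true;
    }
  }

  // Returns the dictionary id for the value, adding an entry if it is new.
  public int write(String value) {
    Integer id = dictionary.get(value);
    if (id == null) {
      id = dictionary.size();
      dictionary.put(value, id);
      // Simplification: counts characters rather than encoded bytes.
      dictionaryByteSize += value.length();
      // The limit check runs only when the dictionary actually grew.
      checkDictionarySizeLimit(dictionary.size());
    }
    return id;
  }
}
```

Note that a `private` helper is only accessible from code inside the same top-level class as its declaration; if any caller is a subclass declared in a separate file, the method would need to stay `protected`.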
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]