nsivabalan commented on issue #4541:
URL: https://github.com/apache/hudi/issues/4541#issuecomment-1008230765


   Let's remove some of the advanced configs and check whether a simple job 
succeeds. We can then add configs back to narrow down the issue.
   
   - I see you have added a lot of custom configs for the index. Can we remove 
them for now?
   ```
   'hoodie.bloom.index.bucketized.checking': True,
   'hoodie.bloom.index.keys.per.bucket': 50000000,
   'hoodie.index.bloom.num_entries': 1000000,
   'hoodie.bloom.index.use.caching': True,
   'hoodie.bloom.index.use.treebased.filter': True,
   'hoodie.bloom.index.filter.type': 'DYNAMIC_V0',
   'hoodie.bloom.index.filter.dynamic.max.entries': 1000000,
   'hoodie.bloom.index.prune.by.ranges': True,
   ```
   - 'write.parquet.block.size': 256 seems very low. Can we remove this for 
now?
   - I see the exception arises from the clustering code. Let's remove the 
clustering configs for now.
   ```
   'hoodie.clustering.inline': True,
   'hoodie.clustering.inline.max.commits': '1',
   'hoodie.clustering.plan.strategy.small.file.limit': '1073741824',
   'hoodie.clustering.plan.strategy.target.file.max.bytes': '2147483648',
   'hoodie.clustering.execution.strategy.class':
       'org.apache.hudi.client.clustering.run.strategy'
       '.SparkSortAndSizeExecutionStrategy',
   'hoodie.clustering.plan.strategy.sort.columns': sort_cols,
   ```
   
   Let's see if the job succeeds after making the above modifications, and we 
can go from there.
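
If it helps, here is a rough sketch of one way to do the stripping and the
step-by-step add-back in Python. The helper names (`SUSPECT_GROUPS`,
`split_options`, `staged_options`) are made up for illustration and are not
part of Hudi, and the key prefixes are simply taken from the configs listed
above. It only manipulates the options dict; it does not run any write.

```python
# Hypothetical helpers (not a Hudi API): split a writer-options dict into a
# minimal baseline plus the option groups suggested for removal above.

SUSPECT_GROUPS = {
    "bloom_index": ("hoodie.bloom.index.", "hoodie.index.bloom."),
    "parquet_block_size": ("write.parquet.block.size",),
    "clustering": ("hoodie.clustering.",),
}


def split_options(options):
    """Return (baseline, removed); removed maps group name -> dropped options."""
    baseline = {}
    removed = {name: {} for name in SUSPECT_GROUPS}
    for key, value in options.items():
        for name, prefixes in SUSPECT_GROUPS.items():
            if key.startswith(prefixes):  # str.startswith accepts a tuple
                removed[name][key] = value
                break
        else:
            # No suspect prefix matched, so the option stays in the baseline.
            baseline[key] = value
    return baseline, removed


def staged_options(options):
    """Yield (label, options): the baseline first, then groups added back cumulatively."""
    baseline, removed = split_options(options)
    current = dict(baseline)
    yield "baseline", dict(current)
    for name, group in removed.items():
        if group:
            current.update(group)
            yield "+" + name, dict(current)


# Example input; the table name is a placeholder value.
hudi_options = {
    "hoodie.table.name": "example_table",
    "hoodie.bloom.index.use.caching": True,
    "hoodie.index.bloom.num_entries": 1000000,
    "write.parquet.block.size": 256,
    "hoodie.clustering.inline": True,
}

for label, opts in staged_options(hudi_options):
    print(label, sorted(opts))
```

Each stage can be passed to the writer in turn. The groups are added back
cumulatively, so the first stage that fails points at the group that was just
re-added.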
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


