Arnaud-Nauwynck closed pull request #1244: PARQUET-2416: Use
'mapreduce.outputcommitter.factory.class' in ParquetOutputFormat
URL: https://github.com/apache/parquet-java/pull/1244
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
Arnaud-Nauwynck opened a new issue, #3074:
URL: https://github.com/apache/parquet-java/issues/3074
### Describe the enhancement requested
This is a minor performance improvement, but worthwhile when reading many files:
read the footer using 1 call to readFully(byte[8]) instead of 5 calls (4 x
Arnaud-Nauwynck opened a new pull request, #3075:
URL: https://github.com/apache/parquet-java/pull/3075
GH-3074: read footer using 1 call readFully(byte[8]) instead of 5 calls
### Rationale for this change
performance
### What changes are included in this PR?
only method F
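The optimization in GH-3074 can be sketched as follows (a minimal, hypothetical illustration; the class and method names are mine, not parquet-java's). The last 8 bytes of a Parquet file are a 4-byte little-endian footer length followed by the 4-byte magic "PAR1", so both can be fetched with a single readFully call instead of several small reads, saving round-trips on remote filesystems:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class FooterTail {
    static final byte[] MAGIC = {'P', 'A', 'R', '1'};

    // Parse the footer length from an 8-byte tail that was read in one call.
    static int parseFooterLength(byte[] tail) {
        if (tail.length != 8 || !Arrays.equals(Arrays.copyOfRange(tail, 4, 8), MAGIC)) {
            throw new IllegalArgumentException("not a Parquet file tail");
        }
        // First 4 bytes: little-endian footer length.
        return ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) throws IOException {
        // Simulate the tail of a file whose footer is 123 bytes long.
        byte[] tail = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(123).put(MAGIC).array();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(tail));
        byte[] buf = new byte[8];
        in.readFully(buf);                 // one read instead of several
        System.out.println(parseFooterLength(buf));  // prints 123
    }
}
```

In a real reader the 8-byte buffer would be filled with a positioned readFully at fileLength - 8; the simulation above only shows the parsing side.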
Arnaud-Nauwynck commented on issue #3076:
URL: https://github.com/apache/parquet-java/issues/3076#issuecomment-2495541296
see also related issue:
[https://github.com/apache/parquet-java/issues/3077](https://github.com/apache/parquet-java/issues/3077)
: AzureBlobFileSystem.open() should ret
emkornfield commented on code in PR #464:
URL: https://github.com/apache/parquet-format/pull/464#discussion_r1855229728
##
VariantEncoding.md:
##
@@ -386,11 +386,15 @@ The Decimal type contains a scale, but no precision. The
implied precision of a
| Exact Numeric| deci
Arnaud-Nauwynck opened a new issue, #3076:
URL: https://github.com/apache/parquet-java/issues/3076
### Describe the enhancement requested
When reading some column chunks but not all, parquet builds a list of
"ConsecutivePartList", then tries to call the Hadoop API for vectorized
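The "ConsecutivePartList" idea referred to above can be sketched like this (an illustrative stdlib-only sketch; the names are mine, not Parquet's API): before issuing reads for a subset of column chunks, adjacent byte ranges are merged so the filesystem sees a few large reads instead of many small ones:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeCoalescer {
    record Range(long offset, long length) {}

    // Merge ranges whose gap is at most maxGap bytes; input must be sorted by offset.
    static List<Range> coalesce(List<Range> sorted, long maxGap) {
        List<Range> out = new ArrayList<>();
        for (Range r : sorted) {
            if (!out.isEmpty()) {
                Range last = out.get(out.size() - 1);
                long end = last.offset() + last.length();
                if (r.offset() - end <= maxGap) {
                    // Extend the previous range to cover this one.
                    out.set(out.size() - 1,
                            new Range(last.offset(), r.offset() + r.length() - last.offset()));
                    continue;
                }
            }
            out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two adjacent column chunks plus one far-away chunk.
        List<Range> parts = List.of(new Range(0, 100), new Range(100, 50), new Range(10_000, 20));
        System.out.println(coalesce(parts, 0));  // two merged ranges
    }
}
```

A nonzero maxGap trades a little wasted transfer for even fewer requests, which is usually a win on high-latency object stores.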
Arnaud-Nauwynck closed issue #3077: AzureBlobFileSystem.open() should return a
sub-class of FSDataInputStream that overrides readVectored() much more
efficiently for small reads
URL: https://github.com/apache/parquet-java/issues/3077
Arnaud-Nauwynck commented on issue #3077:
URL: https://github.com/apache/parquet-java/issues/3077#issuecomment-2495543888
Sorry, I misclicked and filed it in the GitHub project parquet instead of hadoop.
I have recreated
[HADOOP-19345](https://issues.apache.org/jira/browse/HADOOP-19345), as I did
not see
emkornfield commented on code in PR #466:
URL: https://github.com/apache/parquet-format/pull/466#discussion_r1855230005
##
LogicalTypes.md:
##
@@ -670,6 +681,13 @@ optional group array_of_arrays (LIST) {
Backward-compatibility rules
+Modern writers should always produc
Arnaud-Nauwynck opened a new issue, #3077:
URL: https://github.com/apache/parquet-java/issues/3077
### Describe the enhancement requested
In hadoop-azure, there are huge performance problems when reading a file in a
too-fragmented way: reading many small file fragments, even with the
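The cost of fragmented small reads can be demonstrated with a stdlib-only sketch (this is not hadoop-azure code; the counting wrapper is hypothetical): each tiny read hits the underlying stream once unless a buffering layer coalesces them into fewer, larger reads, which is the same effect an efficient readVectored() implementation would have on a remote store:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SmallReadDemo {
    // Wraps a byte source and counts invocations of the underlying read().
    static class CountingStream extends ByteArrayInputStream {
        int calls = 0;
        CountingStream(byte[] data) { super(data); }
        @Override public synchronized int read(byte[] b, int off, int len) {
            calls++;
            return super.read(b, off, len);
        }
    }

    static int countReads(boolean buffered) throws IOException {
        CountingStream raw = new CountingStream(new byte[4096]);
        InputStream in = buffered ? new BufferedInputStream(raw, 4096) : raw;
        byte[] tiny = new byte[8];
        for (int i = 0; i < 100; i++) in.read(tiny);   // 100 tiny fragment reads
        return raw.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countReads(false)); // 100: one underlying call per tiny read
        System.out.println(countReads(true));  // 1: buffering coalesces into one large read
    }
}
```

Over a local byte array the difference is invisible; against a remote filesystem, where each underlying call is a network round-trip, the 100-to-1 reduction dominates read latency.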
alamb commented on PR #63:
URL: https://github.com/apache/parquet-testing/pull/63#issuecomment-2495459421
> Probably...I've never touched that behavior because I don't know if it is
intentional or not.
I vaguely remember the original rationale being it was impossible to decode
ranges