This is an automated email from the ASF dual-hosted git repository.

vhs pushed a change to branch variant-intro-shredded-avro-support
in repository https://gitbox.apache.org/repos/asf/hudi.git


    omit 46779f9daff6 Remove looksLikeVariantStruct:
    omit af00daf8dcca feat(schema): Add read + write support for shredded for 
AVRO
    omit d1e8c0d46287 feat(schema): Config path implemented for spark record 
type
    omit d3bb0d3750f1 feat(schema): Add support to write shredded variants
    omit cf1d9a1e223b Support reading and writing of Variant Types - Add 
adapter pattern for Spark3 and 4 - Cleanup invariant issue in SparkSqlWriter - 
Add cross engine test - Add backward compatibility test for Spark3.x - Add 
cross engine read for Flink - Make VariantLogicalType compare against singleton
     add b7f72d33b560 feat: support predicate push down in Hudi flink source v2 
(#18212)
     add f7b805961e05 feat(flink): Off-heap lookup join cache backed by RocksDB 
(#18231)
     add fbeb93d353e4 fix: Remove trailing colon from incomplete error message 
in HoodieTableMetadataUtil (#18233)
     add 8c5f83237c11 fix: Fix typos across codebase (#18232)
     add f5d08ec1bbae fix: Fix SHOW PARTITIONS commands functionality for 
slash-separated date partitioning (#18195)
     add 18ee6cd66b4f fix: Fix string handling on bloom index (#18240)
     add 894b817f79af chore(ci): cleanup for print statements, showing 
tables/schemas (#17771)
     add fd0656c11304 fix: Use correct lastCompletedTransactionMetadata while 
acquiring lock for clustering (#18198)
     add 6156b74414d5 feat(spark): Add HoodieSparkSqlUtils APIs for tooling 
(#18202)
     add df5d7c8fd262 feat(spark-datasource): support spark.hoodie.* read 
config overrides (#18205)
     add 7f4dfdab3222 test: Add Scala test for record index rebootstrap on 
non-Hoodie partitions (#18208)
     add 5fb1b34a6c38 fix: Fail metadata bootstrap early in presence of 0 byte 
file (#18209)
     add fb7b1a5d9111 feat(metadata-table): Add count validation for record 
index bootstrap (#18029)
     add 967408cc9b76 refactor: move source assign package under split (#18253)
     add 181af01a1a25 perf: Adding support for LatestBaseFilesPathFilter to 
Spark File Index (#18136)
     add 59a5d889bbe9 fix: add all fields in HoodieSourceSplitSerializer 
(#18243)
     add 363f41acbbda fix: [HUDI-CLUSTERING] Optimize binary copy performance 
with lazy loading, bulk reads, and double buffering (#18241)
     add d4ff54b571c7 fix(flink): Use timestamp based partitioning in 
AutoRowDataKeyGen (#18090)
     add e1ae9c6ac2fe feat(flink): collect event time in 
HoodieRowDataCreateHandle for min/max event time metrics (#18250)
     add 73e710de1332 feat(table-services): Emit archival metrics for 
monitoring and debugging (#18133)
     add 7e9abc71ed03 feat(table-services): Add config to filter partitions 
during full clean (#17550)
     add e063493337e5 feat(metrics): emit metric for rollback failures (#18148)
     add 65c1b1217ece feat: Notebooks to support multiple hudi versions (#18255)
     add 69e24ea4690e perf: eliminate unnecessary timeline loading for Flink 
append only write path (#18264)
     add 1203b215b03c feat: Use PartitionValueExtractor interface in Spark 
reader path (#17850)
     add 1b2cee800994 feat(vector): add VECTOR type to HoodieSchema (#18146)
     add a31f15d22a21 fix: infer record merge mode for pre-v9 tables in 
generateRequiredSchema (#18106)
     add 43d8ed83f361 test(common): report jvm memory stats for unit tests 
(#18207)
     add abd8c22d7a80 fix(table-services): When applying rollback metadata to 
metadata table (v6) do not rollback a metadata table deltacommit if it has been 
already rolled back by post-commit rollback (#18160)
     add 365398affac8 refactor: Hudi Flink source v2 with better context 
management (#18269)
     add 3139a1935ee6 feat(table-services): Allow  users to not parallelize 
each partition with engine context during clustering planning (#18191)
     add 3dcf4c65b261 feat(client): Add pre-write validator framework (#18239)
     add 31b8706dbb2d feat(vector): Add further research for supporting VECTOR 
type to RFC-99 (#18184)
     add 9ba276029311 feat(table-services): Support clustering file groups with 
earlier instants times first (#18174)
     add 4499b0b2bd43 feat(spark): ZooKeeper node should hold spark app id (for 
helping debug when lock is held for long time) (#18123)
     add 6e0d786b52b8 fix(flink): Don't perform table service during mdt 
initialization if streaming write is enabled (#18283)
     add bf4425b9f3d9 fix: Remove noisy logging when table partition is empty 
(#18290)
     add 729b30c128d8 fix: Improve config docs of enabling column stats in 
metadata table (#18289)
     add 2c1cb392df14 feat(vector): add converters from spark to hoodieSchema 
for vectors (#18190)
     add 8296df0de3a3 fix(flink): enable integration test for Hudi Flink Source 
V2 (#18287)
     add 22aa1fad6a0b fix: Databricks Spark 3.4 Runtime compatibility for 
reading Hudi tables (#18292)
     add 93b8e9fc9804 feat(flink): Add Kafka offset tracking to Flink Hudi 
commits (#18127)
     add d13310c6c119 perf(table-services): Incremental clean planning (for 
COW) should ignore partitions from instants with only new file groups (#18016)
     add b5daa30aed8c feat(flink): Add helper functions to parse Kafka offset 
differences b… (#18125)
     add f867059b3c0a fix(spark): SparkSQL write queries should correctly infer 
HUDI configs from spark.hoodie.* configs in spark conf (#18297)
     add cc3a5293bf5c fix(table-services): When single clustering group config 
is disabled, clustering should not create clustering groups with same number of 
input/output files (#18172)
     add abb5fd2ad65d feat: add support for touch partitions in HiveSyncTool 
(#18064)
     add da244e12fc07 feat(flink): Support create table DDL without primary key 
(#18086)
     add b01ae22236d6 fix: sort partitions after filtering for clustering 
planning (#18092)
     add b31d5f7a4409 refactor: rewrite executors tests to avoid code 
duplication  (#18005)
     add b77c7e5eb4d9 fix(common): Handle zero byte properties file and ensure 
atomic writes during modification (#18058)
     add ddfcc92b131d [HUDI-7503] Compaction execution should fail if another 
active writer is already executing the same plan (#18012)
     add 39cb726bebe9 feat(common): Add Policy for cleanup/rollback before each 
write (#18197)
     add e6723a8b2af5 fix(metadata): Allow metadata table bootstrap when 
pending commits are being rolled back (#18033)
     add 39f1f395b172 fix(common): Filter stray files when loading partitions 
in AbstractTableFileSystemView (#18047)
     add f64c93ee899c fix(clustering): When inferring wether an instant is 
clustering, do not fail if replacecommit was rolled back already (by a 
concurrent writer) (#18288)
     add 74649c83045d docs: RFC-102 - Spark Vector Search in Apache Hudi 
(#14218)
     add f763da2bc197 feat(conflict-resolution): Allow 
PreferWriterConflictResolutionStrategy to abort clustering if there is an 
ongoing write that is in requested state. (#18280)
     add a16d43171da9 feat(hudi-sync): Publish HUDI version to Hive metastore 
(#18307)
     add a17955528595 chore(ci): Add test jobs and Codecov integration in 
GitHub Actions (#18225)
     add b7b0b83e0ebf chore(ci): Simplify test combinations on Spark in Github 
actions (#18336)
     add 14a549f45c2c chore(ci): Add codecov coverage from tests running on 
Spark 4.0 (#18335)
     add 3a1ea4bf8602 feat(metasync): Support HMS 4.x in JDBC sync mode via 
automatic Thrift fallback (#18227)
     add bbda2428bfd1 feat(flink): Support write buffer based on flink managed 
memory (#18319)
     add cad08b1f2cfd feat(lance): Support bloom filter in Lance writer and 
reader (#18304)
     add 19c4cc9c3166 fix: Use explicit Throwable type in AvroConversionUtils 
catch clause (#18342)
     add 4e21ff1a3d1a docs: Update the build instructions by mentioning 
profiles in README (#18310)
     add b0e40f62e8b3 feat(utilities): add DELETE operation support for 
HudiStreamer (#18088)
     add b634262f060a feat(metadata-table): add config to disable automatic 
deletion of MDT partitions (#18181)
     add 26b324f267e5 fix(concurrency): detect rollback conflicts with ongoing 
commit operations (#18089)
     add 967e456cc456 feat(common): add core pre-commit validation framework - 
Phase 1  (#18068)
     add 3aef2cacdb1c fix: Fix flaky test 
TestProtoConversionUtil#allFieldsSet_wellKnownTypesAndTimestampsAsRecords 
(#18352)
     add f74bf3a3e040 fix(flink): enable batch read it for flink source v2 
(#18325)
     add 941ae6200078 fix: modify the incorrect Hive configuration in hoodie 
hive catalog (#18365)
     add 81a8c26ee739 feat: support read commits limit in Hudi Flink Source V2 
(#18369)
     add 331b018d0cfc feat(hive-sync): add Spark-catalog based metastore client 
implementation to avoid Hive-on-Spark classloader issues (#18203)
     add 817b3ad7de92 fix(common): fix typos commited -> committed, commiting 
-> committing (#18363)
     add b60855defe0c feat: support read splits limit in Hudi Flink Source V2 
(#18370)
     add c2b401ed70ff feat(flink): Support bootstrap from RLI to local RocksDB 
for flink bucket assigner (#18254)
     add 41337396a9d6 perf: Skip unnecessary clean planning for MOR metadata 
table file-version cleaning (#17943)
     add 2f073643dfe7 feat: add graceful handling for post-commit failures with 
metrics (#18196)
     add 9859f9aa29df feat(flink): Support more efficient customized serializer 
for HoodieRecordGlobalLocation (#18326)
     add 69fa35b1015f feat(metadata): Defer RLI initialization for fresh tables 
to optimize file group allocation (#18353)
     add 56bc28398a47 feat(flink): add pre-commit validation framework for 
Flink - Phase 2 (#18362)
     add f15e1d060f96 feat: add Flink source reader function for cdc splits 
(#18361)
     add 3fc1deb68b0f feat(vector): Support writing VECTOR to parquet and avro 
formats using Spark (#18328)
     add bb5abb6b0483 fix: Optimizing internal schema lookup in 
TableSchemaResolver (#18387)
     add 1eb97b31826e [HUDI-7030] Commit-based Clustering Plan Strategy (#18251)
     add 02e5efb41c7b fix: Fixed the issue of incorrect opName values in Flink 
bulk insert writing (#18313)
     add d241b0901b27 fix(flink): Improve splits distribution strategy for mor 
table w/ bucket index (#18103)
     add e930b834e95f feat: Add Unshredded Variant read & write support (#17833)
     add e4bc9851bf58 Explicitly state the spark stage name (#18416)
     add 54276a957b82 refactor: modularize long test methods in 
TestHoodieClientOnCopyOnWriteStorage (#18377)
     add 78109aa88b4a feat(schema): Add support to write shredded variants
     add b702b7d75d83 feat(schema): Config path implemented for spark record 
type
     new afc6b4b5b06e feat(schema): Add read + write support for shredded for 
AVRO

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (46779f9daff6)
            \
             N -- N -- N   refs/heads/variant-intro-shredded-avro-support (afc6b4b5b06e)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
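
As a rough illustration of the labels used above, the following Python
sketch classifies revisions after a force push. It is a simplified model,
not the notification tool's actual logic: the function name classify_push
and the toy commit ids (B, O1, O2, N1, N2) are hypothetical, and each
argument is assumed to be the set of commit ids reachable from the old
branch tip, the new branch tip, and all other references respectively.

```python
def classify_push(old_branch, new_branch, other_refs):
    """Label revisions as "omit", "discard", "add", or "new".

    Hypothetical helper: each argument is a set of commit ids reachable
    from the old branch tip, the new branch tip, and every other
    reference in the repository.
    """
    labels = {}
    # Revisions dropped from the branch by the force push.
    for rev in old_branch - new_branch:
        # Still reachable elsewhere -> "omit"; otherwise gone -> "discard".
        labels[rev] = "omit" if rev in other_refs else "discard"
    # Revisions that the branch gained.
    for rev in new_branch - old_branch:
        # Already in the repository via another ref -> "add"; else "new".
        labels[rev] = "add" if rev in other_refs else "new"
    return labels


# Toy history: B is the common base, O* were on the old branch,
# N* are on the new branch (all ids are made up for illustration).
old_branch = {"B", "O1", "O2"}
new_branch = {"B", "N1", "N2"}
other_refs = {"B", "O1", "N1"}  # e.g. reachable from another branch

labels = classify_push(old_branch, new_branch, other_refs)
for rev in sorted(labels):
    print(rev, labels[rev])
```

In this model the common base B receives no label because it is reachable
from both the old and the new branch tip.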


Summary of changes:
 .github/workflows/bot.yml                          |  589 ++++++-----
 README.md                                          |    9 +-
 azure-pipelines-20230430.yml                       |    2 +-
 .../hudi/cli/commands/TestLockAuditingCommand.java |   15 +-
 .../testutils/HoodieCLIIntegrationTestBase.java    |    1 -
 .../testutils/HoodieCLIIntegrationTestHarness.java |    6 -
 .../org/apache/hudi/client/BaseHoodieClient.java   |    4 +
 .../hudi/client/BaseHoodieTableServiceClient.java  |  129 ++-
 .../apache/hudi/client/BaseHoodieWriteClient.java  |  107 +-
 .../client/timeline/HoodieTimelineArchiver.java    |   10 +
 .../timeline/versioning/v1/TimelineArchiverV1.java |   26 +-
 .../timeline/versioning/v2/TimelineArchiverV2.java |   27 +-
 .../client/transaction/ConcurrentOperation.java    |   40 +-
 .../transaction/ConflictResolutionStrategy.java    |   13 +
 .../PreferWriterConflictResolutionStrategy.java    |  129 ++-
 .../SchemaConflictResolutionStrategy.java          |    2 +-
 ...urrentFileWritesConflictResolutionStrategy.java |   40 +-
 .../SimpleSchemaConflictResolutionStrategy.java    |    4 +-
 .../lock/BaseZookeeperBasedLockProvider.java       |    4 +-
 .../transaction/lock/HoodieInterProcessMutex.java  |   65 ++
 .../hudi/client/transaction/lock/LockManager.java  |    5 +-
 .../apache/hudi/client/utils/ArchivalMetrics.java  |   75 ++
 .../hudi/client/utils/PreWriteValidatorUtils.java  |  137 +++
 .../apache/hudi/client/utils/TransactionUtils.java |    2 +-
 .../hudi/client/validator/PreWriteValidator.java   |   75 ++
 .../client/validator/StreamingOffsetValidator.java |  213 ++++
 .../org/apache/hudi/config/HoodieCleanConfig.java  |   38 +
 .../apache/hudi/config/HoodieClusteringConfig.java |   46 +-
 .../config/HoodiePreCommitValidatorConfig.java     |   34 +
 .../hudi/config/HoodiePreWriteValidatorConfig.java |   83 ++
 .../org/apache/hudi/config/HoodieWriteConfig.java  |   83 ++
 .../java/org/apache/hudi/keygen/KeyGenUtils.java   |    6 +-
 .../metadata/HoodieBackedTableMetadataWriter.java  |  104 +-
 ...ieBackedTableMetadataWriterTableVersionSix.java |   38 +-
 .../hudi/metadata/HoodieMetadataWriteUtils.java    |    8 +-
 .../org/apache/hudi/metrics/HoodieMetrics.java     |   85 ++
 .../java/org/apache/hudi/table/HoodieTable.java    |   10 +-
 .../hudi/table/action/clean/CleanPlanner.java      |   72 +-
 .../cluster/ClusteringFileSliceComparator.java     |   69 ++
 .../cluster/ClusteringFileSliceSortByField.java    |   20 +-
 .../CommitBasedClusteringPlanStrategy.java         |  341 +++++++
 .../PartitionAwareClusteringPlanStrategy.java      |   53 +-
 .../TestConflictResolutionStrategyUtil.java        |   19 +
 ...TestPreferWriterConflictResolutionStrategy.java |  415 ++++++++
 .../lock/TestHoodieInterProcessMutex.java          |   91 ++
 .../client/utils/TestDeletePartitionUtils.java     |    2 -
 .../client/utils/TestPreWriteValidatorUtils.java   |  434 ++++++++
 .../validator/TestStreamingOffsetValidator.java    |  552 ++++++++++
 .../config/TestHoodiePreWriteValidatorConfig.java  |  202 ++++
 .../apache/hudi/config/TestHoodieWriteConfig.java  |    2 +-
 .../org/apache/hudi/io/TestHoodieCreateHandle.java |    6 +-
 ...ieBackedTableMetadataWriterTableVersionSix.java |  260 +++++
 .../org/apache/hudi/metrics/TestHoodieMetrics.java |   71 ++
 .../org/apache/hudi/table/TestHoodieTable.java     |   52 +
 .../TestCommitBasedClusteringPlanStrategy.java     |  670 ++++++++++++
 .../TestPartitionAwareClusteringPlanStrategy.java  |   43 +
 .../hudi/client/HoodieFlinkTableServiceClient.java |   62 +-
 .../apache/hudi/client/HoodieFlinkWriteClient.java |   30 +-
 .../io/storage/row/HoodieRowDataCreateHandle.java  |   61 +-
 .../org/apache/hudi/table/HoodieFlinkTable.java    |    2 +-
 .../commit/FlinkAutoCommitActionExecutor.java      |    2 +-
 .../apache/hudi/util/HoodieSchemaConverter.java    |    7 +-
 .../client/TestHoodieFlinkTableServiceClient.java  |  117 +++
 .../storage/row/TestHoodieRowDataCreateHandle.java |  439 ++++++++
 .../testutils/HoodieFlinkClientTestHarness.java    |    1 -
 .../hudi/util/TestHoodieSchemaConverter.java       |   40 +
 .../apache/hudi/client/HoodieJavaWriteClient.java  |   14 +-
 .../run/strategy/JavaExecutionStrategy.java        |    2 +-
 .../hudi/client/TestJavaHoodieBackedMetadata.java  |   56 +
 .../apache/hudi/client/SparkRDDWriteClient.java    |   18 +-
 .../MultipleSparkJobExecutionStrategy.java         |    2 +-
 ...SparkJobConsistentHashingExecutionStrategy.java |    2 +-
 .../client/common/HoodieSparkEngineContext.java    |    5 +
 .../io/storage/HoodieSparkFileWriterFactory.java   |    4 +-
 .../hudi/io/storage/HoodieSparkLanceReader.java    |   68 +-
 .../hudi/io/storage/HoodieSparkLanceWriter.java    |   31 +-
 .../hudi/io/storage/HoodieSparkParquetReader.java  |   26 +-
 .../hudi/io/storage/VectorConversionUtils.java     |  238 +++++
 .../row/HoodieBloomFilterRowWriteSupport.java      |   58 ++
 .../storage/row/HoodieRowParquetWriteSupport.java  |  221 ++--
 .../org/apache/hudi/table/HoodieSparkTable.java    |   25 +-
 .../action/commit/SparkAutoCommitExecutor.java     |    2 +-
 .../org/apache/hudi/AvroConversionUtils.scala      |    2 +-
 .../apache/hudi/HoodieSchemaConversionUtils.scala  |    1 +
 .../scala/org/apache/hudi/HoodieSparkUtils.scala   |   53 +-
 .../SparkFileFormatInternalRowReaderContext.scala  |   89 +-
 .../apache/spark/sql/HoodieInternalRowUtils.scala  |   24 -
 .../sql/avro/HoodieSparkSchemaConverters.scala     |   76 +-
 .../datasources/SparkSchemaTransformUtils.scala    |   61 +-
 .../parquet/HoodieParquetFileFormatHelper.scala    |   36 +-
 .../parquet/HoodieParquetReadSupport.scala         |   48 +-
 .../org/apache/spark/sql/hudi/SparkAdapter.scala   |   28 +-
 .../callback/TestHoodieClientInitCallback.java     |    1 +
 .../hudi/client/TestSparkRDDWriteClient.java       |  131 +++
 .../TestSparkSizeBasedClusteringPlanStrategy.java  |  229 ++++-
 .../utils/TestSparkPreWriteValidatorUtils.java     |  194 ++++
 .../hudi/execution/BaseExecutorTestHarness.java    |  324 ++++++
 .../TestBoundedInMemoryExecutorInSpark.java        |  136 +--
 .../hudi/execution/TestBoundedInMemoryQueue.java   |    2 +-
 .../execution/TestDisruptorExecutionInSpark.java   |  137 +--
 .../hudi/execution/TestSimpleExecutionInSpark.java |  196 +---
 .../apache/hudi/io/TestHoodieTimelineArchiver.java |  141 +++
 .../java/org/apache/hudi/table/TestCleaner.java    |  369 +++++++
 .../TestSparkClusteringPlanPartitionFilter.java    |   18 +-
 .../table/functional/TestCleanPlanExecutor.java    |   82 ++
 .../TestSparkSchemaTransformUtils.scala            |   66 +-
 .../parquet/TestHoodieParquetReadSupport.scala     |   34 +
 .../org/apache/hudi/BaseHoodieTableFileIndex.java  |  108 +-
 .../client/validator/BasePreCommitValidator.java   |   80 ++
 .../hudi/client/validator/ValidationContext.java   |  183 ++++
 .../hudi/common/config/HoodieMetadataConfig.java   |   60 +-
 .../hudi/common/config/LockConfiguration.java      |    2 +
 .../hudi/common/data/HoodieListPairData.java       |    8 +
 .../apache/hudi/common/data/HoodiePairData.java    |   11 +
 .../hudi/common/engine/HoodieEngineContext.java    |    8 +
 .../java/org/apache/hudi/common/fs/FSUtils.java    |   10 +-
 .../hudi/common/model/HoodieCommitMetadata.java    |   15 +
 .../common/model/HoodiePreWriteCleanerPolicy.java  |   74 ++
 .../apache/hudi/common/schema/HoodieSchema.java    |  624 +++++++++++-
 .../HoodieSchemaComparatorForSchemaEvolution.java  |   10 +
 .../schema/HoodieSchemaCompatibilityChecker.java   |   22 +
 .../hudi/common/schema/HoodieSchemaType.java       |    5 +
 .../hudi/common/table/HoodieTableConfig.java       |   29 +-
 .../hudi/common/table/HoodieTableMetaClient.java   |   12 +
 .../hudi/common/table/TableSchemaResolver.java     |    9 +-
 .../apache/hudi/common/table/log/InstantRange.java |   86 +-
 .../table/read/FileGroupReaderSchemaHandler.java   |   28 +-
 .../table/view/HoodieTableFileSystemView.java      |    1 -
 .../hudi/common/table/view/NoOpTableMetadata.java  |    6 +
 .../apache/hudi/common/util/CheckpointUtils.java   |  327 ++++++
 .../apache/hudi/common/util/ClusteringUtils.java   |   29 +-
 .../hudi/common/util/HoodieTableConfigUtils.java   |   40 +-
 .../hudi/common/util/collection/RocksDBDAO.java    |   28 +-
 .../exception/HoodieWriteConflictException.java    |   31 +
 .../java/org/apache/hudi/internal/schema/Type.java |    7 +-
 .../org/apache/hudi/internal/schema/Types.java     |   70 ++
 .../schema/convert/InternalSchemaConverter.java    |   64 +-
 .../apache/hudi/metadata/BaseTableMetadata.java    |    4 +-
 .../metadata/FileSystemBackedTableMetadata.java    |   26 +-
 .../hudi/metadata/HoodieBackedTableMetadata.java   |   67 ++
 .../hudi/metadata/HoodieMetadataPayload.java       |   10 +-
 .../apache/hudi/metadata/HoodieTableMetadata.java  |   23 +-
 .../hudi/metadata/HoodieTableMetadataUtil.java     |   15 +-
 .../hudi/metadata/MetadataPartitionType.java       |    4 +-
 .../sync/common/model/PartitionValueExtractor.java |    0
 .../hudi/util}/LazyConcatenatingIterator.java      |    4 +-
 .../apache/hudi/TestReportJvmConfiguration.java    |   67 ++
 .../common/data/TestHoodieListDataPairData.java    |   18 +
 .../common/model/TestHoodieCommitMetadata.java     |  104 ++
 .../hudi/common/schema/TestHoodieSchema.java       |  950 ++++++++++++++++-
 ...stHoodieSchemaComparatorForSchemaEvolution.java |   18 +
 .../schema/TestHoodieSchemaCompatibility.java      |  136 +++
 .../hudi/common/schema/TestHoodieSchemaType.java   |   96 ++
 .../read/TestFileGroupReaderSchemaHandler.java     |   78 +-
 .../hudi/common/util/TestCheckpointUtils.java      |  245 +++++
 .../TestClosableSortedDedupingIterator.java        |   16 +-
 .../common/util/collection/TestRocksDBDAO.java     |   39 +
 .../convert/TestInternalSchemaConverter.java       |  155 +++
 .../TestHoodieBackedTableMetadataDataCleanup.java  |    2 +-
 .../hudi/metadata/TestHoodieTableMetadataUtil.java |    9 +
 .../hudi/util}/TestLazyConcatenatingIterator.java  |    3 +-
 hudi-flink-datasource/hudi-flink/pom.xml           |   11 +
 .../apache/hudi/configuration/FlinkOptions.java    |  144 +++
 .../apache/hudi/configuration/OptionsResolver.java |   31 +
 .../apache/hudi/sink/FlinkCheckpointClient.java    |  323 ++++++
 .../org/apache/hudi/sink/StreamWriteFunction.java  |   15 +-
 .../hudi/sink/StreamWriteOperatorCoordinator.java  |   20 +
 .../AppendWriteFunctionWithBIMBufferSort.java      |   17 +-
 ...AppendWriteFunctionWithDisruptorBufferSort.java |    8 +-
 .../sink/bootstrap/AbstractBootstrapOperator.java  |   83 ++
 .../hudi/sink/bootstrap/BootstrapOperator.java     |   50 +-
 .../hudi/sink/bootstrap/RLIBootstrapOperator.java  |  232 +++++
 .../apache/hudi/sink/buffer/BufferMemoryType.java  |   42 +-
 .../hudi/sink/buffer/MemorySegmentPoolFactory.java |   65 +-
 .../apache/hudi/sink/bulk/AutoRowDataKeyGen.java   |   22 +-
 .../hudi/sink/bulk/BulkInsertWriteFunction.java    |    8 +
 .../sink/common/AbstractStreamWriteFunction.java   |    6 +
 .../hudi/sink/common/AbstractWriteFunction.java    |   12 +
 .../hudi/sink/common/AbstractWriteOperator.java    |   18 +
 .../org/apache/hudi/sink/event/Correspondent.java  |   38 +
 .../hudi/sink/muttley/AthenaIngestionGateway.java  |  346 +++++++
 .../hudi/sink/muttley/FlinkHudiMuttleyClient.java  |  246 +++++
 .../muttley/FlinkHudiMuttleyClientException.java   |   19 +-
 .../sink/muttley/FlinkHudiMuttleyException.java    |   38 +-
 .../muttley/FlinkHudiMuttleyServerException.java   |   19 +-
 .../sink/partitioner/BucketAssignFunction.java     |    1 +
 .../partitioner/index/IndexBackendFactory.java     |   30 +-
 .../sink/partitioner/index/IndexWriteFunction.java |   11 +-
 .../index/RecordGlobalLocationSerializer.java      |  116 +++
 ...eIndexBackend.java => RocksDBIndexBackend.java} |   31 +-
 .../org/apache/hudi/sink/utils/CommitGuard.java    |   25 +
 .../org/apache/hudi/sink/utils/EventBuffers.java   |   22 +-
 .../java/org/apache/hudi/sink/utils/Pipelines.java |   56 +-
 .../sink/validator/FlinkKafkaOffsetValidator.java  |   58 ++
 .../sink/validator/FlinkValidationContext.java     |  117 +++
 .../hudi/sink/validator/FlinkValidatorUtils.java   |  150 +++
 .../java/org/apache/hudi/source/HoodieSource.java  |    6 +-
 .../apache/hudi/source/IncrementalInputSplits.java |    9 +-
 .../enumerator/AbstractHoodieSplitEnumerator.java  |    3 +-
 .../HoodieContinuousSplitEnumerator.java           |   11 +-
 .../apache/hudi/source/reader/BatchRecords.java    |   27 +-
 .../source/reader/HoodieSourceSplitReader.java     |   41 +-
 .../function/HoodieCdcSplitReaderFunction.java     | 1065 ++++++++++++++++++++
 .../reader/function/HoodieSplitReaderFunction.java |   83 +-
 .../StreamReadBucketIndexPartitioner.java          |   27 +-
 .../selector/StreamReadBucketIndexKeySelector.java |    7 +-
 .../source/split/DefaultHoodieSplitDiscover.java   |   31 +-
 .../source/split/DefaultHoodieSplitProvider.java   |    2 +-
 .../split/HoodieCdcSourceSplit.java}               |   45 +-
 .../source/split/HoodieContinuousSplitBatch.java   |   63 +-
 .../hudi/source/split/HoodieSourceSplit.java       |   12 +-
 .../source/split/HoodieSourceSplitSerializer.java  |  134 ++-
 .../assign/DefaultHoodieSplitAssigner.java         |    2 +-
 .../{ => split}/assign/HoodieSplitAssigner.java    |    2 +-
 .../{ => split}/assign/HoodieSplitAssigners.java   |    2 +-
 .../assign/HoodieSplitBucketAssigner.java          |    2 +-
 .../assign/HoodieSplitNumberAssigner.java          |    2 +-
 .../org/apache/hudi/table/HoodieTableFactory.java  |    9 +
 .../org/apache/hudi/table/HoodieTableSource.java   |   45 +-
 .../hudi/table/catalog/HoodieHiveCatalog.java      |   13 +-
 .../org/apache/hudi/table/format/FormatUtils.java  |    1 +
 .../hudi/table/format/cdc/CdcInputFormat.java      |    9 +-
 .../hudi/table/format/cdc/CdcInputSplit.java       |    3 +-
 .../table/format/mor/MergeOnReadInputFormat.java   |    2 +-
 .../table/format/mor/MergeOnReadInputSplit.java    |    6 +-
 .../table/format/mor/MergeOnReadTableState.java    |    6 +-
 .../HeapLookupCache.java}                          |   40 +-
 .../hudi/table/lookup/HoodieLookupFunction.java    |   56 +-
 .../org/apache/hudi/table/lookup/LookupCache.java  |   60 ++
 .../hudi/table/lookup/RocksDBLookupCache.java      |  181 ++++
 .../java/org/apache/hudi/util/FileIndexReader.java |    9 +-
 .../org/apache/hudi/util/FlinkWriteClients.java    |    4 +
 .../java/org/apache/hudi/util/HoodiePipeline.java  |   20 +-
 .../apache/hudi/util/KafkaOffsetParseUtils.java    |  201 ++++
 .../java/org/apache/hudi/util/StreamerUtil.java    |  277 +++++
 .../apache/hudi/sink/ITTestDataStreamWrite.java    |   31 +-
 .../hudi/sink/TestFlinkCheckpointClient.java       |  451 +++++++++
 .../hudi/sink/TestFlinkCheckpointClientMock.java   |  318 ++++++
 .../sink/TestStreamWriteOperatorCoordinator.java   |  465 +++++++++
 .../sink/buffer/TestMemorySegmentPoolFactory.java  |  108 ++
 .../apache/hudi/sink/bulk/TestRowDataKeyGens.java  |   20 +
 .../index/TestRecordGlobalLocationSerializer.java  |  191 ++++
 .../partitioner/index/TestRocksDBIndexBackend.java |   54 +
 .../utils/BucketStreamWriteFunctionWrapper.java    |    4 +-
 .../sink/utils/StreamWriteFunctionWrapper.java     |    5 +-
 .../apache/hudi/sink/utils/TestCommitGuard.java    |   78 ++
 .../apache/hudi/sink/utils/TestEventBuffers.java   |  121 +++
 .../validator/TestFlinkKafkaCheckpointParsing.java |  171 ++++
 .../validator/TestFlinkKafkaOffsetValidator.java   |  343 +++++++
 .../sink/validator/TestFlinkValidationContext.java |  200 ++++
 .../sink/validator/TestFlinkValidatorUtils.java    |  290 ++++++
 .../org/apache/hudi/source/TestHoodieSource.java   |   14 +-
 .../apache/hudi/source/TestStreamReadOperator.java |    2 +-
 .../TestHoodieContinuousSplitEnumerator.java       |  202 +++-
 .../TestHoodieEnumeratorStateSerializer.java       |    5 +-
 .../TestHoodieStaticSplitEnumerator.java           |    5 +-
 .../hudi/source/reader/TestBatchRecords.java       |   31 -
 .../source/reader/TestHoodieRecordEmitter.java     |    3 +-
 .../source/reader/TestHoodieSourceSplitReader.java |   65 +-
 .../function/TestHoodieCdcSplitReaderFunction.java |  196 ++++
 .../function/TestHoodieSplitReaderFunction.java    |  226 ++++-
 .../split/TestDefaultHoodieSplitDiscover.java      |   77 +-
 .../split/TestDefaultHoodieSplitProvider.java      |   21 +-
 .../source/split/TestHoodieCdcSourceSplit.java     |  183 ++++
 .../split/TestHoodieContinuousSplitBatch.java      |  368 +++++++
 .../hudi/source/split/TestHoodieSourceSplit.java   |  182 +++-
 .../split/TestHoodieSourceSplitComparator.java     |    3 +-
 .../split/TestHoodieSourceSplitSerializer.java     |  903 ++++++++++++++++-
 .../assign/TestDefaultHoodieSplitAssigner.java     |    5 +-
 .../assign/TestHoodieSplitAssigners.java           |    2 +-
 .../assign/TestHoodieSplitBucketAssigner.java      |    5 +-
 .../assign/TestHoodieSplitNumberAssigner.java      |    5 +-
 .../apache/hudi/table/ITTestHoodieDataSource.java  |  231 ++++-
 .../ITTestVariantCrossEngineCompatibility.java     |   10 +-
 .../hudi/util/TestKafkaOffsetParseUtils.java       |  273 +++++
 .../apache/hudi/utils/TestFlinkWriteClients.java   |   17 +
 .../main/java/org/apache/hudi/adapter/Utils.java   |   25 +
 .../main/java/org/apache/hudi/adapter/Utils.java   |   25 +
 .../main/java/org/apache/hudi/adapter/Utils.java   |   26 +
 .../main/java/org/apache/hudi/adapter/Utils.java   |   26 +
 .../main/java/org/apache/hudi/adapter/Utils.java   |   26 +
 .../main/java/org/apache/hudi/adapter/Utils.java   |   26 +
 .../apache/hudi/avro/HoodieAvroWriteSupport.java   |   51 +-
 .../hudi/io/lance/HoodieBaseLanceWriter.java       |   20 +-
 .../parquet/io/ByteArraySeekableInputStream.java   |  125 +++
 .../parquet/io/HoodieParquetBinaryCopyBase.java    |  388 +++----
 .../parquet/io/HoodieParquetFileBinaryCopier.java  |  316 +++++-
 .../avro/AvroSchemaConverterWithTimestampNTZ.java  |   12 +
 ...TestHoodieAvroWriteSupportVariantShredding.java |  508 ----------
 .../index/TestBaseHoodieTableFileIndex.java        |    2 +-
 .../hudi/common/table/TestHoodieTableConfig.java   |    4 +-
 .../hudi/common/table/TestTableSchemaResolver.java |  125 +++
 .../hudi/common/util/TestClusteringUtils.java      |  143 +++
 .../TestHoodieNativeAvroHFileReaderCaching.java    |   39 +-
 ...oodieAvroFileWriterFactoryVariantShredding.java |  275 -----
 .../TestFileSystemBackedTableMetadata.java         |   48 +
 .../io/TestByteArraySeekableInputStream.java       |  195 ++++
 ...HoodieParquetBinaryCopyBaseSchemaEvolution.java |  132 ++-
 .../io/TestHoodieParquetFileBinaryCopier.java      |  101 +-
 .../TestHoodieParquetFileBinaryCopierPrefetch.java |  142 +++
 .../io/TestOutputStreamBackedOutputFile.java       |   83 ++
 .../parquet/avro/TestAvroSchemaConverter.java      |   23 +-
 .../hudi/hadoop/HiveHoodieTableFileIndex.java      |    1 +
 .../hadoop/HoodieLatestBaseFilesPathFilter.java    |   29 +-
 .../hudi/hadoop/HoodieROTablePathFilter.java       |   15 +-
 .../hudi/hadoop/TestHoodieROTablePathFilter.java   |   16 +-
 .../hudi/hadoop/utils/TestHiveAvroSerializer.java  |    1 -
 hudi-notebooks/Dockerfile.spark                    |   49 +-
 hudi-notebooks/build.sh                            |   22 +-
 hudi-notebooks/conf/spark/spark-defaults.conf      |    5 -
 hudi-notebooks/docker-compose.yml                  |    4 +-
 hudi-notebooks/notebooks/01-crud-operations.ipynb  |    2 +-
 hudi-notebooks/notebooks/02-query-types.ipynb      |    2 +-
 .../notebooks/03-scd-type2_and_type4.ipynb         |    2 +-
 hudi-notebooks/notebooks/04-schema-evolution.ipynb |    2 +-
 .../notebooks/05-mastering-sql-procedures.ipynb    |    2 +-
 .../notebooks/06_hudi_trino_example.ipynb          |  325 ++++++
 .../notebooks/07_hudi_presto_example.ipynb         |  325 ++++++
 hudi-notebooks/notebooks/utils.py                  |  191 ++--
 hudi-notebooks/requirements.txt                    |    6 +
 hudi-notebooks/run_spark_hudi.sh                   |    2 +
 .../scala/org/apache/hudi/DataSourceOptions.scala  |   52 +-
 .../org/apache/hudi/DatabricksRuntimeHelper.scala  |   75 ++
 .../main/scala/org/apache/hudi/DefaultSource.scala |   15 +-
 .../scala/org/apache/hudi/HoodieBaseRelation.scala |   10 +-
 .../scala/org/apache/hudi/HoodieFileIndex.scala    |   24 +
 .../hudi/HoodieHadoopFsRelationFactory.scala       |    3 +-
 .../org/apache/hudi/HoodieSparkSqlWriter.scala     |    8 +-
 .../scala/org/apache/hudi/HoodieWriterUtils.scala  |   10 +
 .../apache/hudi/SparkHoodieTableFileIndex.scala    |   31 +-
 .../sql/catalyst/catalog/HoodieCatalogTable.scala  |   22 +-
 .../HoodieFileGroupReaderBasedFileFormat.scala     |  134 ++-
 .../sql/hive/SparkCatalogMetaStoreClient.scala     |  380 +++++++
 .../spark/sql/hudi/HoodieSqlCommonUtils.scala      |    4 +
 .../spark/sql/hudi/ProvidesHoodieConfig.scala      |   34 +-
 .../hudi/command/CreateHoodieTableCommand.scala    |    1 +
 .../command/ShowHoodieTablePartitionsCommand.scala |   29 +-
 .../org/apache/hudi/TestDataSourceOptions.scala    |   54 +-
 .../java/org/apache/hudi/HoodieSparkSQLUtils.java  |  118 +++
 .../apache/hudi/TestDecimalTypeDataWorkflow.scala  |    6 +-
 .../hudi/client/TestHoodieClientMultiWriter.java   |  381 +++++++
 ...DataValidationCheckForLogCompactionActions.java |   16 -
 .../TestHoodieClientOnCopyOnWriteStorage.java      |  538 +++++-----
 .../TestRemoteFileSystemViewWithMetadataTable.java |    1 -
 .../hudi/functional/TestHoodieBackedMetadata.java  |  279 +++++
 ...SparkBinaryCopyClusteringAndValidationMeta.java |    2 +-
 .../io/storage/TestHoodieSparkLanceReader.java     |   36 +-
 .../io/storage/TestHoodieSparkLanceWriter.java     |   22 +-
 .../TestCopyOnWriteRollbackActionExecutor.java     |    3 +-
 ...dieSparkMergeOnReadTableInsertUpdateDelete.java |    2 +
 .../TestIncrementalQueryWithArchivedInstants.scala |    2 +-
 .../hudi/TestAvroSchemaResolutionSupport.scala     |  204 ++--
 .../hudi/TestHoodieSchemaConversionUtils.scala     |  284 ++++++
 .../org/apache/hudi/TestHoodieSparkSqlWriter.scala |    6 +-
 .../org/apache/hudi/TestInsertDedupPolicy.scala    |    2 -
 .../functional/PartitionStatsIndexTestBase.scala   |    1 -
 .../TestAutoGenerationOfRecordKeys.scala           |    8 -
 .../hudi/functional/TestBasicSchemaEvolution.scala |    8 -
 .../apache/hudi/functional/TestCOWDataSource.scala |    5 +-
 .../apache/hudi/functional/TestMORDataSource.scala |   12 +-
 .../functional/TestPartialUpdateAvroPayload.scala  |    8 -
 .../hudi/functional/TestRecordLevelIndex.scala     |  201 +++-
 .../TestSparkDataSourceDAGExecution.scala          |    9 -
 .../hudi/functional/TestVectorDataSource.scala     | 1000 ++++++++++++++++++
 .../functional/cdc/TestCDCDataFrameSuite.scala     |    6 +-
 .../functional/cdc/TestCDCStreamingSuite.scala     |    4 +-
 .../hudi/utils/TestHoodieSparkSQLUtils.scala       |  109 ++
 .../TestBaseSpark3AdapterVariantMethods.scala      |   77 ++
 .../TestBaseSpark4AdapterVariantMethods.scala      |  262 +++++
 .../org/apache/spark/sql/avro/TestAvroSerDe.scala  |  136 ++-
 .../spark/sql/avro/TestSchemaConverters.scala      |   72 +-
 .../sql/hive/TestSparkCatalogMetaStoreClient.scala |  246 +++++
 .../sql/hudi/common/MockSlashKeyGenerator.scala    |  134 +++
 .../common/MockSlashPartitionValueExtractor.scala  |   40 +
 .../common/TestCustomParitionValueExtractor.scala  |  386 +++++++
 .../sql/hudi/common/TestROPathFilterOnRead.scala   |  352 +++++++
 .../apache/spark/sql/hudi/common/TestSqlConf.scala |   35 +-
 .../spark/sql/hudi/ddl/TestShowPartitions.scala    |   47 +
 .../apache/spark/sql/hudi/ddl/TestSpark3DDL.scala  |   13 +-
 .../spark/sql/hudi/ddl/TestSparkCatalogSync.scala  |  162 +++
 .../sql/hudi/dml/schema/TestVariantDataType.scala  |  318 +-----
 .../sql/hudi/feature/TestCDCForSparkSQL.scala      |   28 +-
 .../spark/sql/adapter/BaseSpark3Adapter.scala      |   10 +-
 .../apache/spark/sql/avro/AvroDeserializer.scala   |   35 +
 .../org/apache/spark/sql/avro/AvroSerializer.scala |   38 +
 .../apache/spark/sql/avro/AvroDeserializer.scala   |   35 +
 .../org/apache/spark/sql/avro/AvroSerializer.scala |   38 +
 .../HoodieSpark34PartitionedFileUtils.scala        |    3 +-
 .../apache/spark/sql/avro/AvroDeserializer.scala   |   35 +
 .../org/apache/spark/sql/avro/AvroSerializer.scala |   38 +
 .../spark/sql/adapter/BaseSpark4Adapter.scala      |   45 +-
 .../TestSpark4VariantShreddingProvider.java        |  279 -----
 .../apache/spark/sql/avro/AvroDeserializer.scala   |   65 +-
 .../org/apache/spark/sql/avro/AvroSerializer.scala |   40 +-
 .../TestHoodieRowParquetWriteSupportVariant.java   |  444 ++++++++
 .../java/org/apache/hudi/hive/HiveSyncConfig.java  |    4 +
 .../org/apache/hudi/hive/HiveSyncConfigHolder.java |    6 +
 .../java/org/apache/hudi/hive/HiveSyncTool.java    |   11 +-
 .../org/apache/hudi/hive/HoodieHiveSyncClient.java |  321 +++++-
 .../java/org/apache/hudi/hive/ddl/DDLExecutor.java |   10 +
 .../org/apache/hudi/hive/ddl/HMSDDLExecutor.java   |   46 +-
 .../hudi/hive/ddl/JDBCBasedMetadataOperator.java   |  300 ++++++
 .../org/apache/hudi/hive/ddl/JDBCExecutor.java     |    9 +
 .../hudi/hive/ddl/QueryBasedDDLExecutor.java       |   64 +-
 .../org/apache/hudi/hive/TestHiveSyncTool.java     |   95 +-
 .../hive/ddl/TestJDBCBasedMetadataOperator.java    |  190 ++++
 .../hudi/sync/common/HoodieMetaSyncOperations.java |   19 +
 .../apache/hudi/sync/common/HoodieSyncClient.java  |    4 +
 .../apache/hudi/sync/common/HoodieSyncConfig.java  |   45 +-
 .../hudi/sync/common/model/PartitionEvent.java     |    6 +-
 .../hudi/sync/common/TestHoodieSyncConfig.java     |   16 +
 .../resources/log4j2-surefire-quiet.properties     |    2 +-
 .../src/main/resources/log4j2-surefire.properties  |    2 +-
 .../sources/helpers/ProtoConversionUtil.java       |    2 +-
 .../utilities/sources/helpers/QueryRunner.java     |    2 +-
 .../utilities/streamer/BaseErrorTableWriter.java   |    6 +-
 .../apache/hudi/utilities/streamer/StreamSync.java |   12 +-
 .../deltastreamer/TestHoodieDeltaStreamer.java     |   18 +-
 ...TestHoodieDeltaStreamerSchemaEvolutionBase.java |    6 +-
 .../utilities/sources/TestJsonKafkaSource.java     |    6 +-
 .../sources/helpers/TestProtoConversionUtil.java   |    2 +-
 rfc/rfc-102/cat_emebdding.png                      |  Bin 0 -> 6075633 bytes
 rfc/rfc-102/comparison_embedding.png               |  Bin 0 -> 6920384 bytes
 rfc/rfc-102/embedding_table.png                    |  Bin 0 -> 6452822 bytes
 rfc/rfc-102/rfc-102.md                             |  227 +++++
 rfc/rfc-99/appendix.md                             |  246 +++++
 rfc/rfc-99/rfc-99.md                               |   30 +-
 style/checkstyle-suppressions.xml                  |    1 +
 428 files changed, 33163 insertions(+), 4561 deletions(-)
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/HoodieInterProcessMutex.java
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/ArchivalMetrics.java
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils/PreWriteValidatorUtils.java
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/validator/PreWriteValidator.java
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/validator/StreamingOffsetValidator.java
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodiePreWriteValidatorConfig.java
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/ClusteringFileSliceComparator.java
 copy 
hudi-common/src/main/java/org/apache/hudi/common/bloom/BloomFilterTypeCode.java 
=> 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/ClusteringFileSliceSortByField.java
 (56%)
 create mode 100644 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/strategy/CommitBasedClusteringPlanStrategy.java
 create mode 100644 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/lock/TestHoodieInterProcessMutex.java
 create mode 100644 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/utils/TestPreWriteValidatorUtils.java
 create mode 100644 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/validator/TestStreamingOffsetValidator.java
 create mode 100644 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodiePreWriteValidatorConfig.java
 create mode 100644 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/metadata/TestHoodieBackedTableMetadataWriterTableVersionSix.java
 create mode 100644 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/cluster/strategy/TestCommitBasedClusteringPlanStrategy.java
 create mode 100644 
hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/client/TestHoodieFlinkTableServiceClient.java
 create mode 100644 
hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/io/storage/row/TestHoodieRowDataCreateHandle.java
 create mode 100644 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/VectorConversionUtils.java
 create mode 100644 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/row/HoodieBloomFilterRowWriteSupport.java
 create mode 100644 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/utils/TestSparkPreWriteValidatorUtils.java
 create mode 100644 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/BaseExecutorTestHarness.java
 create mode 100644 
hudi-common/src/main/java/org/apache/hudi/client/validator/BasePreCommitValidator.java
 create mode 100644 
hudi-common/src/main/java/org/apache/hudi/client/validator/ValidationContext.java
 create mode 100644 
hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePreWriteCleanerPolicy.java
 create mode 100644 
hudi-common/src/main/java/org/apache/hudi/common/util/CheckpointUtils.java
 rename {hudi-sync/hudi-sync-common => 
hudi-common}/src/main/java/org/apache/hudi/sync/common/model/PartitionValueExtractor.java
 (100%)
 rename 
{hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/utils => 
hudi-common/src/main/java/org/apache/hudi/util}/LazyConcatenatingIterator.java 
(96%)
 create mode 100644 
hudi-common/src/test/java/org/apache/hudi/TestReportJvmConfiguration.java
 create mode 100644 
hudi-common/src/test/java/org/apache/hudi/common/util/TestCheckpointUtils.java
 rename {hudi-client/hudi-client-common/src/test/java/org/apache/hudi/utils => 
hudi-common/src/test/java/org/apache/hudi/util}/TestLazyConcatenatingIterator.java
 (97%)
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/FlinkCheckpointClient.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/AbstractBootstrapOperator.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/RLIBootstrapOperator.java
 copy 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/writer/DeltaInputWriter.java
 => 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/buffer/BufferMemoryType.java
 (54%)
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/muttley/AthenaIngestionGateway.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/muttley/FlinkHudiMuttleyClient.java
 copy 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/bootstrap/translator/IdentityBootstrapPartitionPathTranslator.java
 => 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/muttley/FlinkHudiMuttleyClientException.java
 (65%)
 copy hudi-io/src/main/java/org/apache/hudi/exception/HoodieException.java => 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/muttley/FlinkHudiMuttleyException.java
 (57%)
 copy 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/bootstrap/translator/IdentityBootstrapPartitionPathTranslator.java
 => 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/muttley/FlinkHudiMuttleyServerException.java
 (65%)
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RecordGlobalLocationSerializer.java
 copy 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/{FlinkStateIndexBackend.java
 => RocksDBIndexBackend.java} (50%)
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/validator/FlinkKafkaOffsetValidator.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/validator/FlinkValidationContext.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/validator/FlinkValidatorUtils.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/function/HoodieCdcSplitReaderFunction.java
 copy 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/{table/format/cdc/CdcInputSplit.java
 => source/split/HoodieCdcSourceSplit.java} (51%)
 rename hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/{ 
=> split}/assign/DefaultHoodieSplitAssigner.java (97%)
 rename hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/{ 
=> split}/assign/HoodieSplitAssigner.java (96%)
 rename hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/{ 
=> split}/assign/HoodieSplitAssigners.java (96%)
 rename hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/{ 
=> split}/assign/HoodieSplitBucketAssigner.java (97%)
 rename hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/{ 
=> split}/assign/HoodieSplitNumberAssigner.java (97%)
 copy 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/{format/SchemaEvolvedRecordIterator.java
 => lookup/HeapLookupCache.java} (54%)
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/LookupCache.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/RocksDBLookupCache.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/KafkaOffsetParseUtils.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestFlinkCheckpointClient.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestFlinkCheckpointClientMock.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/buffer/TestMemorySegmentPoolFactory.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRecordGlobalLocationSerializer.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestCommitGuard.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestEventBuffers.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/validator/TestFlinkKafkaCheckpointParsing.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/validator/TestFlinkKafkaOffsetValidator.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/validator/TestFlinkValidationContext.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/validator/TestFlinkValidatorUtils.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/function/TestHoodieCdcSplitReaderFunction.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/split/TestHoodieCdcSourceSplit.java
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/split/TestHoodieContinuousSplitBatch.java
 rename hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/{ 
=> split}/assign/TestDefaultHoodieSplitAssigner.java (99%)
 rename hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/{ 
=> split}/assign/TestHoodieSplitAssigners.java (99%)
 rename hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/{ 
=> split}/assign/TestHoodieSplitBucketAssigner.java (99%)
 rename hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/{ 
=> split}/assign/TestHoodieSplitNumberAssigner.java (98%)
 create mode 100644 
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/util/TestKafkaOffsetParseUtils.java
 create mode 100644 
hudi-hadoop-common/src/main/java/org/apache/hudi/parquet/io/ByteArraySeekableInputStream.java
 delete mode 100644 
hudi-hadoop-common/src/test/java/org/apache/hudi/avro/TestHoodieAvroWriteSupportVariantShredding.java
 delete mode 100644 
hudi-hadoop-common/src/test/java/org/apache/hudi/io/storage/hadoop/TestHoodieAvroFileWriterFactoryVariantShredding.java
 create mode 100644 
hudi-hadoop-common/src/test/java/org/apache/hudi/parquet/io/TestByteArraySeekableInputStream.java
 create mode 100644 
hudi-hadoop-common/src/test/java/org/apache/hudi/parquet/io/TestHoodieParquetFileBinaryCopierPrefetch.java
 create mode 100644 
hudi-hadoop-common/src/test/java/org/apache/hudi/parquet/io/TestOutputStreamBackedOutputFile.java
 copy 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/versioning/v2/TimelinePathProviderV2.java
 => 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieLatestBaseFilesPathFilter.java
 (52%)
 create mode 100644 hudi-notebooks/notebooks/06_hudi_trino_example.ipynb
 create mode 100644 hudi-notebooks/notebooks/07_hudi_presto_example.ipynb
 create mode 100644 hudi-notebooks/requirements.txt
 create mode 100644 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DatabricksRuntimeHelper.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hive/SparkCatalogMetaStoreClient.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/main/java/org/apache/hudi/HoodieSparkSQLUtils.java
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestVectorDataSource.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/utils/TestHoodieSparkSQLUtils.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/adapter/TestBaseSpark3AdapterVariantMethods.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/adapter/TestBaseSpark4AdapterVariantMethods.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hive/TestSparkCatalogMetaStoreClient.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/MockSlashKeyGenerator.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/MockSlashPartitionValueExtractor.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestCustomParitionValueExtractor.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestROPathFilterOnRead.scala
 create mode 100644 
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/ddl/TestSparkCatalogSync.scala
 delete mode 100644 
hudi-spark-datasource/hudi-spark4-common/src/test/java/org/apache/hudi/variant/TestSpark4VariantShreddingProvider.java
 create mode 100644 
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/ddl/JDBCBasedMetadataOperator.java
 create mode 100644 
hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/ddl/TestJDBCBasedMetadataOperator.java
 create mode 100644 rfc/rfc-102/cat_emebdding.png
 create mode 100644 rfc/rfc-102/comparison_embedding.png
 create mode 100644 rfc/rfc-102/embedding_table.png
 create mode 100644 rfc/rfc-102/rfc-102.md
 create mode 100644 rfc/rfc-99/appendix.md
