[DISCUSS] SPIP: Write schema narrowing for column-level UPDATE in DSv2

Anurag Mantripragada Thu, 23 Apr 2026 11:39:59 -0700

Hi everyone,

I would like to start a discussion on an enhancement to the DSv2 API. The
proposal lets connectors declare which columns they need to receive during
an UPDATE, so that the write schema can be narrowed to those columns. This
significantly improves performance and reduces write amplification, and is
particularly beneficial for connectors like Iceberg on wide tables, which
are increasingly common in AI/ML use cases.

I have included a PR with this SPIP that demonstrates the changes. It has
been tested on the Iceberg connector and is working well end-to-end.

Huaxian Gao has agreed to serve as the shepherd for this SPIP.

SPARK-56599 <https://issues.apache.org/jira/browse/SPARK-56599>
SPIP Doc
<https://docs.google.com/document/d/1-Wiw9U54ESpbLakb9Cn_mO4AviM4nrk4TF7rNhI3JZg/edit?tab=t.0#heading=h.yoitjxhaitk8>
PR <https://github.com/apache/spark/pull/55518>

Please take a look and provide feedback!

Thanks,
Anurag Mantripragada
