[ https://issues.apache.org/jira/browse/ARROW-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17662361#comment-17662361 ]
Rok Mihevc commented on ARROW-5340:
-----------------------------------

This issue has been migrated to [issue #21799|https://github.com/apache/arrow/issues/21799] on GitHub. Please see the [migration documentation|https://github.com/apache/arrow/issues/14542] for further details.

> [C++] See if possible to deduplicate dictionaries in IPC streams in some way
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-5340
>                 URL: https://issues.apache.org/jira/browse/ARROW-5340
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> As follow-on work to ARROW-3144, there are cases where a dictionary may be
> shared by multiple fields in a RecordBatch.
> The presumption of {{arrow::ipc::DictionaryMemo}} is that there is a 1-to-1
> mapping between fields and dictionaries, and dictionary id assignment occurs
> prior to observing the dictionaries (to know whether or not they are used
> multiple times), so it may not be feasible, or at least not easy.
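
For illustration only, here is a minimal sketch of what the description is asking for: ids keyed on the dictionary object rather than on the field, so that two fields sharing one dictionary resolve to one id. Nothing here is Arrow API. {{Dictionary}}, {{SharedDictionaryMemo}} and {{GetOrAssignId}} are hypothetical names, and the sketch assumes that "shared" means the same object in memory (pointer identity), not merely equal contents. It also assumes the dictionary is visible at the moment an id is assigned, which is the very thing the description says does not hold in the current design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a dictionary array; not an Arrow type.
using Dictionary = std::vector<std::string>;

// Hypothetical memo that keys ids on the dictionary object, not on the field.
class SharedDictionaryMemo {
 public:
  // Returns the id for `dict`, assigning a fresh one the first time it is seen.
  // Two fields that pass the same shared_ptr receive the same id.
  std::int64_t GetOrAssignId(const std::shared_ptr<Dictionary>& dict) {
    auto it = ids_.find(dict.get());
    if (it != ids_.end()) return it->second;
    const std::int64_t id = static_cast<std::int64_t>(dicts_.size());
    ids_.emplace(dict.get(), id);
    dicts_.push_back(dict);  // keep the dictionary alive so the pointer key stays valid
    return id;
  }

  // Number of distinct dictionaries that would need to be written to the stream.
  std::size_t num_dictionaries() const { return dicts_.size(); }

 private:
  std::unordered_map<const Dictionary*, std::int64_t> ids_;
  std::vector<std::shared_ptr<Dictionary>> dicts_;
};
```

With this shape, a writer would emit one dictionary per distinct id instead of one per field. Dictionaries that are equal in content but stored in separate objects would still get separate ids; deduplicating those would need a content comparison or hash, which this sketch does not attempt.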