zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3041327170
Thank you @alamb. On a minor related topic, I may pick up this:
http://github.com/apache/datafusion/pull/13933
To use this user-defined index or parquet SortColumn metad
alamb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3041322628
> User-Defined Index.
I think this is a really good term -- I will update the blog post in
https://github.com/apache/datafusion-site/pull/79 to use that
--
This is an aut
alamb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3041320800
> Thank you [@alamb](https://github.com/alamb)
[@JigaoLuo](https://github.com/JigaoLuo)
[@adriangb](https://github.com/adriangb) , i agree current example is the
start, we can fu
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3040702910
Thank you @alamb @JigaoLuo @adriangb. I agree the current example is a start;
we can add more advanced examples later!
JigaoLuo commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039853245
> > Hi [@zhuqi-lucas](https://github.com/zhuqi-lucas),
> > While proofreading the blog, I had one major general question: **What
are the limitations of such an embedded index
adriangb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039806254
Index suggestion: a tablesample index.
And a general thought: exploring these sorts of indexes could do very cool
stuff for DataFusion in general in terms of pushing us t
alamb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039796047
> Hi [@zhuqi-lucas](https://github.com/zhuqi-lucas),
>
> While proofreading the blog, I had one major general question: **What are
the limitations of such an embedded index?
JigaoLuo commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-3039204719
Hi @zhuqi-lucas,
While proofreading the blog, I had one major general question: **What are
the limitations of such an embedded index?**
- Is it limited to just one emb
alamb closed issue #16374: Add an example of embedding indexes *inside* a
parquet file
URL: https://github.com/apache/datafusion/issues/16374
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the spec
JigaoLuo commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2993567391
@alamb @zhuqi-lucas Thank you for this issue and the PR. This could
significantly aid query processing on Parquet.
I was previously **never** aware of `key_value_metadat
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2969754548
I am also preparing an advanced_embedding_indexes example for later, after the
simple one is merged.
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2969605248
Thank you @alamb @adriangb. I submitted a simple example PR for review; I can
add more examples as a follow-up:
https://github.com/apache/datafusion/pull/16395
adriangb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2967987004
Very excited about this!
alamb commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2966448392
Nice @zhuqi-lucas -- BTW I am not sure how easy it will be to use the
parquet APIs to do this (specifically write arbitrary bytes to the inner
writer) so it may take some fiddlin
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2966416266
take
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2966419013
I am interested in this, and I want to become familiar with embedding indexes.
zhuqi-lucas commented on issue #16374:
URL: https://github.com/apache/datafusion/issues/16374#issuecomment-2966460212
Thank you @alamb, I will investigate and explore the APIs and see what’s
possible.
alamb opened a new issue, #16374:
URL: https://github.com/apache/datafusion/issues/16374
### Is your feature request related to a problem or challenge?
One of the common criticisms of parquet based query systems is that they
don't have some particular type of index (e.g. HyperLogLog a