Announcing Hyperspace v0.4.0 - an indexing subsystem for Apache Spark™

Terry Kim Mon, 08 Feb 2021 12:50:56 -0800

Hi,

We are happy to announce that Hyperspace v0.4.0 - an indexing subsystem for
Apache Spark™ - has been released
<https://github.com/microsoft/hyperspace/releases/tag/v0.4.0>!


Here are some of the highlights:

   - Delta Lake support: Hyperspace v0.4.0 supports creating indexes on
   Delta Lake tables. Please refer to the user guide
   <https://microsoft.github.io/hyperspace/docs/ug-supported-data-formats/#delta-lake>
   for more info.
   - Support for Databricks: A known issue that occurred when running
   Hyperspace on Databricks has been addressed. Hyperspace v0.4.0 can now
   run on Databricks Runtime 5.5 LTS and 6.4!
   - Globbing patterns for indexes: Globbing patterns can be used to
   specify the subset of source data on which to create and maintain an
   index. Please refer to the user guide
   <https://microsoft.github.io/hyperspace/docs/ug-quick-start-guide/#supporting-globbing-patterns-on-hyperspace-since-040>
   for usage.
   - Hybrid Scan improvements: Hyperspace v0.4.0 brings several
   improvements to Hybrid Scan, such as a better mechanism
   <https://microsoft.github.io/hyperspace/docs/ug-mutable-dataset/#how-to-enable>
   to enable/disable the feature, rank algorithm improvements, quick index
   refresh, etc.
   - Pluggable source provider: This release introduces an (evolving)
   pluggable source provider API set so that different source formats can be
   plugged in. This enabled the Delta Lake source to be plugged in, and there
   is an ongoing PR to support Iceberg tables.

We would like to thank the community for the great feedback and all those
who contributed to this release.

Thanks,
Terry Kim on behalf of the Hyperspace team
