Hi,

We are happy to announce that Hyperspace v0.4.0 - an indexing subsystem for Apache Spark™ - has been released <https://github.com/microsoft/hyperspace/releases/tag/v0.4.0>!

Here are some of the highlights:

- Delta Lake support: Hyperspace v0.4.0 supports creating indexes on Delta Lake tables. Please refer to the user guide <https://microsoft.github.io/hyperspace/docs/ug-supported-data-formats/#delta-lake> for more info.
- Support for Databricks: A known issue that occurred when Hyperspace was run on Databricks has been addressed. Hyperspace v0.4.0 can now run on Databricks Runtime 5.5 LTS & 6.4!
- Globbing patterns for indexes: Globbing patterns can be used to specify the subset of source data on which to create and maintain an index. Please refer to the user guide <https://microsoft.github.io/hyperspace/docs/ug-quick-start-guide/#supporting-globbing-patterns-on-hyperspace-since-040> for usage.
- Hybrid Scan improvements: Hyperspace v0.4.0 brings several improvements to Hybrid Scan, such as a better mechanism <https://microsoft.github.io/hyperspace/docs/ug-mutable-dataset/#how-to-enable> to enable/disable the feature, rank algorithm improvements, quick index refresh, etc.
- Pluggable source provider: This release introduces an (evolving) pluggable source provider API set so that different source formats can be plugged in. This enabled the Delta Lake source to be plugged in, and there is an ongoing PR to support Iceberg tables.

We would like to thank the community for the great feedback and all those who contributed to this release.

Thanks,
Terry Kim on behalf of the Hyperspace team