On 12/12/2017 04:33 PM, Oleg Bartunov wrote:
On Mon, Dec 11, 2017 at 11:11 PM, Nikolay Samokhvalov
<samokhva...@gmail.com> wrote:
Very interesting read: https://arxiv.org/abs/1712.01208

HN discussion: https://news.ycombinator.com/item?id=15894896

Some of the comments (from Twitter
https://twitter.com/schrockn/status/940037656494317568): "Jeff Dean and co
at GOOG just released a paper showing how machine-learned indexes can
replace B-Trees, Hash Indexes, and Bloom Filters. Execute 3x faster than
B-Trees, 10-100x less space. Executes on GPU, which are getting faster
unlike CPU. Amazing."

Can those ideas be applied to Postgres in its current state? Or it's not
really down-to-earth?
Oleg did some analysis of the paper.
If the keys are comparable, the data distribution is simple enough (which seems to happen rather often), and the data does not change, we can learn the distribution and thus perform the search faster while also saving memory. That is what the authors write, and in my opinion that is what should work.
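To make the idea concrete, here is a minimal sketch of such a learned index in Python (my own illustration under simple assumptions, not the authors' multi-stage architecture; the class and method names are hypothetical): a linear model approximates the CDF of a sorted key array, predicts a key's position, and a binary search bounded by the model's worst-case error corrects the prediction.

import bisect

class LearnedIndexSketch:
    """Linear-model approximation of the key CDF plus a bounded local search."""

    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        n = len(sorted_keys)
        # Fit position ~ slope * key + intercept with simple least squares.
        mean_k = sum(sorted_keys) / n
        mean_p = (n - 1) / 2
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(sorted_keys))
        var = sum((k - mean_k) ** 2 for k in sorted_keys) or 1.0
        self.slope = cov / var
        self.intercept = mean_p - self.slope * mean_k
        # The worst-case prediction error bounds the local search window.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(sorted_keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        pos = self._predict(key)
        lo = max(0, pos - self.err)
        hi = min(len(self.keys), pos + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else None

# Example: a nearly uniform key distribution is learned almost exactly,
# so the correction window collapses to a couple of slots.
idx = LearnedIndexSketch(list(range(0, 1_000_000, 3)))
print(idx.lookup(999))    # -> 333
print(idx.lookup(1000))   # -> None (key absent)

The memory saving comes from keeping only the model parameters and the error bound instead of the inner nodes of a B-tree; the lookup stays fast as long as the model's error stays small.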

The limitations of the approach, which the authors mention in their work:

+ The data does not change. The proposed method can be extended more or less straightforwardly to nearly append-only workloads and to workloads in which the data distribution does not change or changes very slowly (the data itself or its amount may still change, nevertheless). Otherwise it is challenging, because the key positions can no longer be treated as independent values, but it still looks possible. The other option proposed by the authors is to store new objects in a buffer and retrain the model in the background (a rough sketch of that idea follows this list), but that does not look like a nice solution.

+ GPUs are fast, but the latency of invoking operations on the GPU almost certainly cancels out all the speed benefits for such small models. The only case in which it is reasonable to use a GPU is training the models (i.e. building such indexes).
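For what it is worth, the buffer-and-retrain option could look roughly like the following sketch, which reuses the LearnedIndexSketch class from above; the wrapper class, its names, and the retraining threshold are all hypothetical, and the paper only mentions the idea rather than prescribing this design.

class BufferedLearnedIndex:
    """Hypothetical wrapper: a learned index plus a side buffer for new keys."""

    def __init__(self, sorted_keys, retrain_threshold=1024):
        self.main = LearnedIndexSketch(sorted_keys)  # class from the sketch above
        self.buffer = set()                          # keys inserted since the last retrain
        self.retrain_threshold = retrain_threshold

    def insert(self, key):
        self.buffer.add(key)
        if len(self.buffer) >= self.retrain_threshold:
            self._retrain()          # the paper suggests doing this asynchronously

    def contains(self, key):
        # Lookups must consult the buffer first, then the learned structure.
        return key in self.buffer or self.main.lookup(key) is not None

    def _retrain(self):
        merged = sorted(set(self.main.keys) | self.buffer)
        self.main = LearnedIndexSketch(merged)
        self.buffer.clear()

In a real system the merge and retraining would have to run in the background, and every lookup pays for the extra buffer probe, which is part of why this does not look like a complete answer to changing data.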

The following remained unclear to me from the paper:

+ How effectively the approach can be implemented, taking into account the limitations above.

+ For hash indexes the authors propose to replace the hash function with the learned CDF, which can decrease the number of collisions and therefore the memory consumption (a minimal sketch of this idea follows the list). That is reasonable, but such a hash function implicitly uses the information about the ordering of the keys. I suspect that, by using the ordering of the data together with access to a histogram of the data, one could achieve a similar decrease in memory consumption.

+ The proposed hierarchical learning looks natural for the problem, but it is not clear that it is the best option. For example, one might instead use an end-to-end learning procedure with weight-sparsity regularizers. I wonder whether the latter approach can obtain results competitive with the former, and if not, why. Anyway, I have a feeling that the proposed method has room for improvement.
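As an illustration of the learned-hash point (again my own sketch with hypothetical names; an empirical CDF stands in for the learned model from the paper), mapping a key to a bucket through a scaled CDF spreads even a skewed key set almost uniformly over the buckets, which is where the reduction in collisions, and hence in memory, comes from.

import bisect
from collections import Counter

class CDFHash:
    """Hypothetical learned hash: bucket = scaled empirical CDF of the key."""

    def __init__(self, sample_keys, num_buckets):
        self.sample = sorted(sample_keys)   # sample used to approximate the CDF
        self.num_buckets = num_buckets

    def bucket(self, key):
        # Fraction of sampled keys <= key, scaled into the bucket range.
        cdf = bisect.bisect_right(self.sample, key) / len(self.sample)
        return min(int(cdf * self.num_buckets), self.num_buckets - 1)

# Even a heavily skewed key set maps almost evenly onto the buckets.
keys = [x * x for x in range(10_000)]
h = CDFHash(keys, num_buckets=256)
load = Counter(h.bucket(k) for k in keys)
print(min(load.values()), max(load.values()))   # both close to 10_000 / 256

Note that this only works because the bucket function sees (a sample of) the key order, which is exactly the implicit use of ordering information mentioned above.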

In general, the approach still needs some additional research (mostly about changing data), and there are some points that require thought on how to implement them properly (paging, for example), but it looks promising enough nevertheless.
