Michael, just checking in to see what might be a good time to chat. We're excited to connect!

Aditya

On Fri, Dec 13, 2019 at 2:22 PM Aditya Parameswaran <adity...@berkeley.edu> wrote:

> Michael,
>
> We'd love to meet and discuss! Unfortunately, a lot of us are off for
> break starting next week so it might be best to sync up early next year.
> Would week of the 6th work for you? 8am PT/10am CT/4pm GMT any day should
> work!
>
>> > We started by having the relational database be a simple persistent
>> > storage layer, when coupled with an index to retrieve data by position,
>> > can allow us to scroll through large datasets of billions of rows at
>> > ease. We developed a new positional index to handle insertions and
>> > deletions in O(log(n)) -- https://arxiv.org/pdf/1708.06712.pdf. I agree
>> > that pushing the computation to the relational database does have
>> > overheads; but at the same time, it allows for scaling to arbitrarily
>> > large datasets.
>>
>> Ooh - nice paper. Your crawled data-set looks quite interesting too, we
>> run wide-scale crash-testing on the LibreOffice code-base across ~100k
>> files and enlarging our corpus there: or better, getting some
>> statistical view of which OOXML attributes (and thus features) are most
>> used out there would be extremely useful to us as we develop the core.
>>
>> I like the data on spreadsheet and formula shape - that is very useful.
>> Do you have data on the geometry of formulae - as in rows vs. columns ?
>> [ we switched to columnar storage based mostly on experience rather than
>> hard data ;-].
>>
>> It is also interesting to have access to very large (1.3m row)
>> data-sets that can have useful analysis done on them - would love to see
>> the source data there.
>
> Again, this is something that we'd be happy to share; this might just take
> a bit more work since it's an older codebase.
> I believe we did use the geometry of the formulae to determine the best
> storage representation, so it's there somewhere :-)
>
>> Sounds good, cf. above - if we can't make that - early in the new year
>> would be great.
>>
>> I look forward to talking,
>
> Likewise!
>
> Aditya
_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libreoffice