Hello, thanks for the valuable feedback and time you've taken. I will try to go from top to bottom.
> Don't use lambdas in unit tests at all unless it is strongly necessary.

Understood.

> I changed [...] to re-open connection and tests started to fail.

Correct me if I'm wrong, but for in-memory databases failures are expected, because they are dropped when the last connection is closed (unless prevented via flags in the connection string). That's one of the things my lambdas tried to address. However, I've replaced the lambda-based tests with simpler, more straightforward ones which allow faster reasoning about the test behavior.

> This code throws OOME unless restore point creation is commented out

Thanks for pointing that out, good catch. For the time being I will not address this, but I will once you (as a group) are convinced the feature goes in.

> This code throws General Error

Same as above, but an even better catch. In my opinion, database compaction should not erase restore points. That would entirely defeat the purpose of restore points (the way I intended them).

> Because SCRIPT command can't export these restore points, it must be documented.

Good point.

> You also need to document this functionality as experimental only.

Of course. If it goes in at all, though.

> I think content of INFORMATION_SCHEMA.RESTORE_POINTS shouldn't be visible for regular users.

I disagree. The information schema contains all available information about a database, and restore points are part of that information, much like tables, columns, and constraints. If nothing else, it enables users to realize that there are still restore points, i.e. that more is kept on disk than might be necessary. Or maybe I misunderstood: I'm thinking of users who have access to the information schema, and you'd like to restrict access to the table `restore_points` to admins?

> Don't use MVCC in its documentation, this term only confuses people.

I disagree again. MVCC is a widely used concept, and restore points simply expose an implementation detail of H2: that it uses MVCC.
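As a quick aside on the in-memory point above: the connection-string flag I have in mind is H2's `DB_CLOSE_DELAY` setting. A minimal JDBC sketch (assuming the H2 driver is on the classpath; the database name `demo` is just an example) of why a test that re-opens its connection survives only with that flag:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class InMemoryLifetimeDemo {

    // Populates the named in-memory database, closes the last connection,
    // reconnects, and counts the rows that are still there.
    static int countAfterReconnect(String url) throws SQLException {
        try (Connection c = DriverManager.getConnection(url);
             Statement st = c.createStatement()) {
            st.execute("CREATE TABLE T(ID INT)");
            st.execute("INSERT INTO T VALUES (1)");
        } // last connection closed here

        try (Connection c = DriverManager.getConnection(url);
             Statement st = c.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM T")) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        // DB_CLOSE_DELAY=-1 keeps the in-memory database alive until the
        // JVM exits, so the content survives closing the last connection.
        System.out.println(countAfterReconnect("jdbc:h2:mem:demo;DB_CLOSE_DELAY=-1")); // prints 1
    }
}
```

Without `;DB_CLOSE_DELAY=-1` in the URL, the second `getConnection` opens a fresh, empty database and the `SELECT` fails because table `T` no longer exists — which is exactly the failure mode described above.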
I think this is comparable to the Oracle documentation, which is pretty thorough in this regard, even more so than PostgreSQL's (which is mind-blowing to me). But I also get your point that it would be great to hide that information from users in case H2 ever switches to a storage engine that is not MVCC-based. We could always adjust the docs if that happens, though.

> What is a purpose of OLDEST_DATABASE_VERSION_TO_KEEP and why this field isn't changed in newer restore points if older restore points were already dropped?

The field tracks the `database_version` of the very first restore point across all restore points. The reason (for better or worse): if one creates two restore points A and B, the latter includes the creation of A, because restore points share the same storage as the rest of the database. Ergo, if one drops restore point A, then returns to B (at which point the drop of A hasn't happened yet), and then returns to A, we expect the database to be in the state of A. That's why we need to track the version of A (the oldest restore point in a chain) in every subsequent restore point. I'd be lying if I said I liked this detail.

> [...] a [...] question, but why this feature is so badly needed?

It is 100% *not* badly needed. It would merely be a nice-to-have and, to be completely honest, I don't think it would ever go beyond that. However, if somebody wants to restore an H2 database to an earlier state without the hassle of requiring persistent storage and copying files around, that's where it would shine.

> What's wrong with populating database to a desired state, shutting down, copying the file, and starting H2 with a file copy.

Nothing. It's just a wee bit more work.

> Why do you need this functionality for in-memory (non-persistent) database? If performance is the reason, you can use RAM-based file system instead.

Performance is not the reason, at least not when it comes to H2 directly.
Database population tools like Liquibase, on the other hand, can be a bottleneck. And I actually tried a RAM-based filesystem: Jimfs. But I didn't get H2 to work on it, because H2 doesn't use `Paths.get(URI)` to look up files, but `Paths.get(String, String...)`. As for "real" (OS-level) RAM-based filesystems: that's simply too much work, in my opinion.

But to answer your question why I would like this functionality for in-memory databases: convenience. There's simply no other or better reason. This brings me back to a point I mentioned already: I think H2 shouldn't need to know whether it's running in-memory, and it should work with Java-based in-memory filesystems. Maybe it even does and I just wasn't able to figure out how, but I think it doesn't.

> I am struggling to understand what is going to happen with all other concurrent connections and their transactions at the moment when RESTORE TO POINT is executed.

Gone, simple as that: concurrent connections are dropped. The connection that executes `RESTORE TO POINT` (which requires admin permissions) goes into exclusive mode and drops all other connections. It then rolls the storage back to the desired version and re-initializes the database (reloads META and rolls back transactions that were still open at the time the restore point being restored to was created). It is as if nothing ever happened after the restore point being returned to. That's the idea, at least. Not gonna lie, it is complicated.

> I also agree with Evgenij, that implementation based on MVStore will prevent a lot of intermediate versions from being dropped and heedlessly bloat database file.

Which shouldn't be a problem once all restore points have been dropped, because then the MVStore GC can work its magic; or once the database is restored to a point, because then at least some GC can happen.
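To illustrate the `Paths.get` point I made above: a path string handed to `Paths.get(String, String...)` is always resolved against the default OS filesystem, so files living on a non-default NIO `FileSystem` are unreachable that way. The sketch below uses the JDK's built-in zipfs provider as a stand-in for Jimfs (so it has no third-party dependency); the file name `/test.db` is just an example:

```java
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

public class DefaultFsOnly {

    // Returns true if a path obtained via Paths.get(String) lives on the
    // same FileSystem as a file created inside a non-default (zip-backed) one.
    static boolean reachableViaPathsGet() throws Exception {
        Path zip = Files.createTempFile("demo", ".zip");
        Files.delete(zip); // the zipfs provider re-creates it via create=true
        URI uri = URI.create("jar:" + zip.toUri());
        try (FileSystem fs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            Path inside = fs.getPath("/test.db"); // file on the zip filesystem
            Files.writeString(inside, "data");
            // Paths.get(String, String...) never sees non-default filesystems:
            Path viaString = Paths.get("/test.db");
            return viaString.getFileSystem() == inside.getFileSystem();
        } finally {
            Files.deleteIfExists(zip);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(reachableViaPathsGet()); // prints false
    }
}
```

By contrast, `Paths.get(inside.toUri())` would resolve through the zipfs provider while that filesystem is open; Jimfs registers its provider the same way, which is why `Paths.get(URI)`-style lookups would make such in-memory filesystems usable.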
> So if I was interested in this feature, I would implement SAVEPOINT as "export to script" and RESTORE as "drop all schemas and execute script". Only questions I have: how to lock the DB while this happens and queue/stall any pending sessions -- in case of multi-user access.

I like that idea *a lot*. It is much, much simpler than my approach. Although it's more than "drop all schemas", because it would require a wiped database; nothing bad, though. Let's maybe go even further and simply create a script export and then create a new database with `INIT=RUNSCRIPT FROM '~/create.sql'`. That should do the trick, no? Why didn't I come up with it? :D Yes, all connections to the existing database would have to be dropped, but it would be wiped clean anyway, so that's not an issue.

Guys, thanks a lot for the exchange of thoughts. I'm still willing to spend time on "my approach" if you'd accept it into H2, but I'd also be fine with dropping it. As I said, what I've written so far was a good exercise, and I've learned more about H2 in the past few days than over the last 5 or more years.

Regards,
Enno

On Saturday, November 9, 2024 at 3:53:30 AM UTC+1 Andreas Reichel wrote:

> Dear All.
>
> On Fri, 2024-11-08 at 18:47 -0800, Andrei Tokar wrote:
>
> > What's wrong with populating database to a desired state, shutting down, copying the file, and starting H2 with a file copy.
>
> Am I right to assume that shutdown is not even needed: you can run export to script/backup on an open instance without interruptions, right? (Tiny detail, but it could matter.)
>
> So if I was interested in this feature, I would implement SAVEPOINT as "export to script" and RESTORE as "drop all schemas and execute script". Only questions I have: how to lock the DB while this happens and queue/stall any pending sessions -- in case of multi-user access.
>
> Cheers
> Andreas

-- 
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database+unsubscr...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/h2-database/fd7ce920-1596-4767-aab4-c1495287b1bfn%40googlegroups.com.