Hello,

On 9/16/19 12:01 PM, kvaps wrote:
> Hi guys!
>
> Thanks for your beautiful work to implement etcd support for Linstor
> server.
> I'm really glad that Linstor keeps up with the other
> cloud-native projects and provides an opportunity to use common
> interfaces like etcd for storing configuration.
>
> I have a few questions about etcd and the future of Linstor with it:
>
> 1. Does etcd have any limitations in comparison with standard sql
> backends? (in case of using --max-txn-ops 1024)

From a technical perspective, it is virtually nothing but one huge pile
of limitations compared to a modern SQL database:

- As you mentioned, it can only do a fixed number of operations per
transaction

- It also cannot touch the same item twice in one transaction.

- It is not type safe, everything is a string, so everything that is
written must be serialized to strings, and everything that is read must
be parsed from strings, which is more complex and also much slower than
simple data serialization. It also requires more exception handling code
and creates a greater potential for software bugs.

- Etcd is only a key/value store, so it does not have a structure
(tables, rows, columns). Therefore, multiple columns belonging to one
key must be either serialized into multiple key/value pairs (increasing
the number of operations per transaction) or the columns must be
serialized into one string (increasing complexity due to the additional
parsing)

- It does not support constraints, such as foreign key constraints or
checks of the values that go into it. You could, for example, store a
TCP/IP port number of -70,000 just fine, or a duplicate one, while a
DBMS would have prevented such incorrect entries even in the presence of
a bug in LINSTOR code. That makes the data less robust than it would be
when stored in a DBMS.

- It cannot combine entries or their fields, e.g. like an SQL JOIN can,
so that must be coded into our software if required

- We cannot automatically transform it as we can with a DBMS. It cannot
be instructed to "put all values from this table into this other table"
or "change all the values where this condition matches". Transformations
typically require parsing and loading all the affected entries, then
writing our own logic to make any changes, and then serializing and
storing all the entries again.

- It is far less maintainable. Finding, changing or deleting one or
multiple entries, or just fields of some entries, is quite simple if you
can just type in some SQL manually. There is nothing like that in etcd.
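
A few of the points above can be sketched in a short Python example. It
is only an illustration under stated assumptions: a plain dict stands in
for a key/value store (it is not the etcd client API), SQLite stands in
for "a DBMS", and the key layout "/demo/nodes/...", the helper names and
the port numbers are all hypothetical and not taken from LINSTOR.

```python
import sqlite3

# Hypothetical stand-in for a key/value store: a dict of str -> str.
kv = {}

def kv_put_node(store, name, port, node_type):
    """Flatten one logical row into one key/value pair per column."""
    prefix = f"/demo/nodes/{name}/"
    ops = {prefix + "port": str(port), prefix + "type": node_type}
    store.update(ops)
    return len(ops)  # each column costs one put operation

def kv_get_port(store, name):
    """Read a column back; parsing and validation are the caller's job."""
    port = int(store[f"/demo/nodes/{name}/port"])  # ValueError if malformed
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

# The store accepts an impossible port; the error only shows up on read,
# and only because the reading code checks for it.
ops_used = kv_put_node(kv, "alpha", -70000, "satellite")
try:
    kv_get_port(kv, "alpha")
    kv_rejected_on_read = False
except ValueError:
    kv_rejected_on_read = True

# The same row in SQL: the constraint rejects the bad value on write.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE nodes ("
    " name TEXT PRIMARY KEY,"
    " port INTEGER NOT NULL UNIQUE CHECK (port BETWEEN 1 AND 65535),"
    " type TEXT NOT NULL)"
)
try:
    db.execute("INSERT INTO nodes VALUES ('alpha', -70000, 'satellite')")
    sql_rejected_on_write = False
except sqlite3.IntegrityError:
    sql_rejected_on_write = True

# Bulk transformation in SQL: a single statement.
db.execute("INSERT INTO nodes VALUES ('alpha', 3366, 'satellite')")
db.execute("INSERT INTO nodes VALUES ('beta', 3367, 'satellite')")
db.execute("UPDATE nodes SET port = port + 1000 WHERE type = 'satellite'")
sql_ports = [r[0] for r in db.execute("SELECT port FROM nodes ORDER BY name")]

# The same transformation on the key/value data: load, parse, apply our
# own logic, serialize and store again.
kv.clear()
kv_put_node(kv, "alpha", 3366, "satellite")
kv_put_node(kv, "beta", 3367, "satellite")
for key in [k for k in kv if k.endswith("/port")]:
    node_prefix = key[: -len("port")]
    if kv[node_prefix + "type"] == "satellite":
        kv[key] = str(int(kv[key]) + 1000)
kv_ports = [kv_get_port(kv, n) for n in ("alpha", "beta")]
```

Both variants end up with the same ports, but in the key/value variant
the validation, the selection and the update logic all live in
application code, and every column of every row counts against the
per-transaction operation limit.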


It's just marginally better than writing to files (at least it offers
some kind of transactions). But apart from that, put bluntly, it's a bit
like going back to the 1960s.


But to be fair, even supporting multiple SQL databases is not as
carefree as it might seem. I like to jokingly call any database a
NoSQL-database, because none of the SQL databases actually implements
SQL, they all implement a chaotic mix of subsets, supersets and
variations of SQL. I'm tempted to say that SQL doesn't even exist in the
real world, except as an idea in a book that no one ever read after it
was written.
That being said, due to some kind of a miracle, we're still able to run
four different databases with the same database driver in LINSTOR, with
only a few conditional changes here and there.

>
> 2. What about future of sql backends? Are you going to focus on etcd
> as main backend, or continue using sql, and leave etcd as an option?
>

It was meant to be an option, not a replacement. However, nothing is
cast in stone in the real world; even changes to technically worse
solutions are very common in the IT world (unfortunately), for
various reasons.
From today's perspective, I would expect that we will continue using SQL
databases as the main backend and leave etcd as an option.

> 3. Following on from the previous questions, what's preferred for large
> deployments? etcd or sql?

The most powerful, robust and maintainable option would be a centralized
database cluster. For most customers, I would recommend the PostgreSQL
database for such installations.


I'll add some background to shed some light on the development effort
behind LINSTOR:

I have to admit that LINSTOR is a bit of an alien in its environment.
And that was actually done on purpose. When I created the initial design
for LINSTOR in 2016, our background at LINBIT was the reliability,
robustness, scalability and maintainability nightmare that we had gone
through with drbdmanage, which was LINSTOR's predecessor (when LINSTOR
development started, the project was actually still called "drbdmanage
next generation" internally). Drbdmanage was built around its typical
environment, some Linux server with DRBD installed, with D-Bus as an IPC
protocol, a filesystem with config files, a Python interpreter, simple
JSON documents for persistence. It turned out to be way too limited to
continue developing it.

Most of the ideas behind LINSTOR were the result of ignoring all the
conventions, traditions and half-baked solutions that existed already in
those typical environments, and instead asking the question: What would
be the theoretically ideal solution in a perfect world, and how close
can one get to it within real-world limits (limited developer time,
money, hardware and software environments, etc.)?

That is why it was built around a full-blown SQL database, why it
originally used its own communication protocol for IPC, why all the
object names are different from drbdmanage's, why it doesn't have
drbdmanage's "--force" flags, why it writes its own error report files
in addition to using syslog, and that is also why it is so different
from its environment. I did not originally design LINSTOR to work or to
look like a typical Unix/Linux application, or to use whatever the most
widespread or most convenient protocol or data format is, or to e.g.
have a single simple numeric return code as most usual applications do.
Instead, my intention was to make the design more robust, more
consistent, more maintainable and also more scalable by avoiding many of
the weaknesses found in more conventional technology.

The introduction of etcd as an option, the replacement of the binary API
with a REST webserver, the use of DRBD quorum instead of fencing in
cloud environments, the presence of configuration files instead of
configuration utilities, all those things were adjustments made to fit
certain limitations, not because those solutions are technically better.
They aren't, they are just what either the rest of the technology around
LINSTOR or the users can deal with more easily.

In the real world, it's always a compromise.

As you can see, I could probably write an entire book about it all, but
I'll stop here for now.
Anyhow, I hope I could provide some insight into what the challenges and
ideas behind the development of LINSTOR are.

br,
Robert

_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
[email protected]
https://lists.linbit.com/mailman/listinfo/drbd-user
