Hi,

On Wed, Mar 5, 2025, 8:44 PM me nefcanto <sn.1...@gmail.com> wrote:
> I once worked with a monolithic SQL Server database with more than 10 billion records and about 8 Terabytes of data. A single backup took us more than 21 days. It was a nightmare. Almost everybody knows that scaling up has a ceiling, but scaling out has no boundaries.

But then you did the backup incrementally, correct? That should not take the same amount of time...

> Therefore I will never choose a monolithic database design unless it's a small project. But my examples are just examples. We predict 100 million records per year. So we have to design accordingly. And it's not just sales records. Many applications have requirements that are cheap data but vast in multitude. Consider a language-learning app that wants to store the known words of any learner. 10 thousand learners each knowing 2 thousand words means 20 million records. Convert that to 100 thousand learners each knowing 7 thousand words and now you almost have a billion records. Cheap, but necessary. Let's not dive into telemetry or time-series data.

Can you try and see if 1 server with 3 different databases will do? Having 1 table per database per server is too ugly.

Also, please understand: every database is different, and so it works and operates differently. What works well in one may not work well in another...

Thank you.

> We initially chose to break the database into smaller databases, because it seemed natural for our modularized monolith architecture. And it worked great for SQL Server. If you're small, we host them all on one server. If you get bigger, we can put heavy databases on separate machines.
>
> However, I don't have experience working with other types of database scaling. I have used table partitioning, but I have never used sharding.
>
> Anyway, that's why I asked you guys. However, encouraging me to go back to a monolith without giving solutions on how to scale is not helping. To be honest, I'm somewhat disappointed that the most advanced open source database does not support cross-database querying the way SQL Server does. But if it doesn't, it doesn't. Our team should either drop it as a choice or find a way (by asking the experts who built it or use it) to design based on its features. That's why I'm asking.
>
> One thing that comes to my mind is to use custom types. Instead of storing data in ItemCategories and ItemAttributes, store them as arrays in the relevant tables in the same database. But then it seems to me that in this case, Mongo would become a better choice because I lose the relational nature and normalization somewhat. What drawbacks have you experienced in that sense?
>
> Regards
> Saeed
>
> On Wed, Mar 5, 2025 at 7:38 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>
>> On 3/5/25 04:15, me nefcanto wrote:
>> > Dear Laurenz, the point is that I think if we put all databases into one database, then we have blocked our growth in the future.
>>
>> How?
>>
>> > A monolith database can be scaled only vertically. We have had huge headaches in the past with SQL Server on Windows and a single database. But when you divide bounded contexts into different databases, then you have the chance to deploy each database on a separate physical machine. That means a lot in terms of performance. Please correct me if I am wrong.
>>
>> And you add the complexity of talking across machines, as well as maintaining separate machines.
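Just to make that cross-machine complexity concrete: if the databases stay separate, whether on one server or on several machines, every "cross-database" join in Postgres has to go through a foreign data wrapper such as postgres_fdw. A minimal sketch of the wiring is below; every name in it (taxonomy_srv, taxonomy_db, app_user, the "Products" / "ItemCategories" tables and columns) is made up for illustration, so adjust it to your own schema.

  -- run once in the products database
  CREATE EXTENSION postgres_fdw;
  CREATE SCHEMA taxonomy;

  -- point at the database that holds the category data
  CREATE SERVER taxonomy_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'taxonomy-host', port '5432', dbname 'taxonomy_db');

  -- remote credentials here are placeholders
  CREATE USER MAPPING FOR CURRENT_USER
    SERVER taxonomy_srv
    OPTIONS (user 'app_user', password 'secret');

  -- expose the remote table as a local foreign table
  IMPORT FOREIGN SCHEMA public
    LIMIT TO ("ItemCategories")
    FROM SERVER taxonomy_srv INTO taxonomy;

  -- after that, a "cross-database" join is written like any other join
  SELECT p."Id", p."Name"
  FROM "Products" p
  JOIN taxonomy."ItemCategories" ic ON ic."ItemId" = p."Id"
  WHERE ic."CategoryId" = 42;

Whether that join ships the matching ItemId values back and joins them locally, or pushes the condition down to the remote server, depends on the plan; EXPLAIN (VERBOSE) shows the "Remote SQL" that postgres_fdw actually sends, which is the honest answer to the "does it join in memory" question.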
>>
>> > Let's put this physical restriction on ourselves that we have different databases. What options do we have? One option that comes to my mind is to store the ID of the categories in the Products table. This means that I don't need FDW anymore. And databases can be on separate machines. I first query the categories database, get the category IDs, and then add a where clause to limit the product search. That could be an option. Array data type in Postgres is something that I think other RDBMSs do not have. Will that work? And how about attributes? Because attributes are more than a single ID. I should store the attribute key alongside its value. It's a key-value pair. What can I do for that?
>>
>> You seem to be going out of the way to make your life more complicated.
>>
>> The only way you are going to find an answer is to set up test cases and experiment. My bet is a single server with a single database and multiple schemas is where you end up; after all, that is where you are starting from.
>>
>> > Thank you for sharing your time. I really appreciate it.
>> > Saeed
>> >
>> > On Wed, Mar 5, 2025 at 3:18 PM Laurenz Albe <laurenz.a...@cybertec.at> wrote:
>> >
>> >     On Wed, 2025-03-05 at 14:18 +0330, me nefcanto wrote:
>> >     > That means a solid monolith database. We lose many goodies with that. As a real-world example, right now we can import a single database from the production to the development to test and troubleshoot data.
>> >
>> >     Well, can't you import a single schema then?
>> >
>> >     > What if we host all databases on the same server and use FDW. What happens in that case? Does it return 100 thousand records and join in the memory?
>> >
>> >     It will do just the same thing. The performance could be better because of the reduced latency.
>> >
>> >     > Because in SQL Server, when you perform a cross-database query (not cross-server) the performance is extremely good, proving that it does not return 100 thousand ItemId from Taxonomy.ItemCategories to join with ProductId.
>> >     >
>> >     > Is that the same case with Postgres too, if databases are located on one server?
>> >
>> >     No, you cannot perform cross-database queries without a foreign data wrapper. I don't see a reason why the statement shouldn't perform as well as in SQL Server if you use schemas instead of databases.
>> >
>> >     Yours,
>> >     Laurenz Albe
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
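Coming back to what Adrian and Laurenz suggest above: if you land on a single database with one schema per bounded context, the cross-"database" query becomes an ordinary cross-schema join, and your array / key-value idea maps onto bigint[] and jsonb columns. Here is a rough sketch of that shape; all table and column names are only illustrative, not your actual model.

  CREATE SCHEMA catalog;
  CREATE SCHEMA taxonomy;

  CREATE TABLE taxonomy."Categories" (
    "Id"   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    "Name" text NOT NULL
  );

  CREATE TABLE catalog."Products" (
    "Id"          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    "Name"        text NOT NULL,
    -- category IDs denormalized into an array, as you proposed
    "CategoryIds" bigint[] NOT NULL DEFAULT '{}',
    -- key/value attributes as jsonb instead of a separate ItemAttributes table
    "Attributes"  jsonb NOT NULL DEFAULT '{}'
  );

  -- GIN indexes make containment queries on both columns usable
  CREATE INDEX ON catalog."Products" USING gin ("CategoryIds");
  CREATE INDEX ON catalog."Products" USING gin ("Attributes");

  -- products in category 42 that have a given attribute
  SELECT "Id", "Name"
  FROM catalog."Products"
  WHERE "CategoryIds" @> ARRAY[42::bigint]
    AND "Attributes" @> '{"color": "red"}';

  -- and a normal cross-schema join still works when you want the relational path
  SELECT p."Id", c."Name"
  FROM catalog."Products" p
  JOIN taxonomy."Categories" c ON c."Id" = ANY (p."CategoryIds");

The tradeoff is the one you already noticed: array and jsonb columns cannot carry foreign keys, so referential integrity for those values has to live in the application or in triggers, while the plain cross-schema join keeps the relational guarantees without needing Mongo.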