Thanks Stephen.

> In the largest portion of the system, when it comes to tables, indexes,
> and such, if there's a file 'X.1', you can be pretty darn sure that 'X'
> is 1GB in size.  If there's a file 'X.245' then you can know that
> there's 245 files (X, X.1, X.2, X.3 ... X.244) that are 1GB in size.

Good idea. We could just add your logic to db_dir_size(), to teach both
pg_database_size_name() and pg_database_size_oid().

Note that 'X' may be larger than 1GB in a cluster built with a custom
segment size, so the size of the table could still be calculated with just
two stat() calls: size(X.245) + 245 * size(X).
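To make that concrete, something along these lines could work (just a rough
sketch, not against the actual dbsize.c code; the function name, how the
highest segment number is obtained, and the error handling are all made up
for illustration):

#include "postgres.h"

#include <sys/stat.h>

/*
 * Rough sketch: estimate a relation's on-disk size from its first and last
 * segment only, instead of stat()ing every segment.  "path" is the base
 * file name (e.g. "base/16384/16385") and "last_segno" is the number of
 * the highest segment that exists (245 for X.245).
 */
static int64
estimate_relation_size(const char *path, int last_segno)
{
	struct stat fst;
	char		segpath[MAXPGPATH];
	int64		total;

	/*
	 * First stat: the base segment tells us the configured segment size
	 * (normally 1GB, but it can differ with --with-segsize).
	 */
	if (stat(path, &fst) < 0)
		return 0;				/* error handling omitted in this sketch */

	if (last_segno == 0)
		return (int64) fst.st_size;		/* single-segment relation */

	/* Segments X, X.1 ... X.(last_segno - 1) are assumed to be full. */
	total = (int64) last_segno * (int64) fst.st_size;

	/* Second stat: only the last segment can be partially filled. */
	snprintf(segpath, sizeof(segpath), "%s.%d", path, last_segno);
	if (stat(segpath, &fst) == 0)
		total += (int64) fst.st_size;

	return total;
}

That keeps it at two stat() calls per relation no matter how many segments
it has, and the result is exact as long as all non-final segments are full,
which is the normal case.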
On Sun, Jan 6, 2019 at 3:52 AM Stephen Frost <sfr...@snowman.net> wrote:

> Greetings,
>
> * Hubert Zhang (hzh...@pivotal.io) wrote:
> > For very large databases, the dbsize function `pg_database_size_name()`
> > etc. could be quite slow, since it needs to call `stat()` system call on
> > every file in the target database.
>
> I agree, it'd be nice to improve this.
>
> > We proposed a new solution to accelerate these dbsize functions which
> > check the disk usage of database/schema/table. The new functions avoid
> > to call `stat()` system call, and instead store the disk usage of
> > database objects in user tables (called diskquota.xxx_size, xxx could be
> > db/schema/table). Checking the size of database 'postgres' could be
> > converted to the SQL query `select size from diskquota.db_size where
> > name = 'postgres'`.
>
> This seems like an awful lot of work though.
>
> I'd ask a different question- why do we need to stat() every file?
>
> In the largest portion of the system, when it comes to tables, indexes,
> and such, if there's a file 'X.1', you can be pretty darn sure that 'X'
> is 1GB in size.  If there's a file 'X.245' then you can know that
> there's 245 files (X, X.1, X.2, X.3 ... X.244) that are 1GB in size.
>
> Maybe we should teach pg_database_size_name() about that?
>
> Thanks!
>
> Stephen

--
Thanks

Hubert Zhang