Thank you, Adrian. I'm not sure I can provide as much as you'd need for a
definitive answer, but I'll give you what I have.

The original scheduled downtime for one installation was 24 hours. At the
21-hour mark the schema-only pg_dump still had not completed, so the server was
returned to operation.
The amount of data per table varies widely. Some daily tables are 100-200 GB,
while the thousands of report tables holding stats are much smaller; I'm not
connected to check right now, but I'd guess 1 GB at most. We chose the --link
option partly because some servers do not have the disk space for a full copy,
and the time needed to copy 1-2 TB was also going to be an issue.
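For reference, the upgrade invocation is roughly the following (the bindir and
datadir paths and version numbers here are placeholders, not our actual
layout). As I understand it, --jobs only parallelizes across databases, so with
everything in a single database the schema dump step still runs serially:

    # Rough sketch of the pg_upgrade call; paths and versions are placeholders.
    # --link avoids copying the 1-2 TB of data; --jobs parallelizes per
    # database, so a single-database cluster gains little from it here.
    /usr/pgsql-11/bin/pg_upgrade \
      --old-bindir=/usr/pgsql-9.6/bin \
      --new-bindir=/usr/pgsql-11/bin \
      --old-datadir=/var/lib/pgsql/9.6/data \
      --new-datadir=/var/lib/pgsql/11/data \
      --link --jobs=8
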
The vast majority of activity is current-day inserts and stats reports on that
data. All previous days and existing reports are read-only.
As is all too common, the DB usage grew with no redesign, so it is a single
database on a single machine with a single schema.
I get the impression it may be possible to take the schema dump while still in
service, but possibly not in this scenario. Plan B is to drop a lot of tables
and deal with the imports later.
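
If it turns out to be workable, something like the following would at least let
me time the schema-only dump ahead of the next window while the server stays in
service; as far as I know pg_dump only takes ACCESS SHARE locks, so the
current-day inserts keep running (the database name below is just a
placeholder):

    # Time the schema-only dump against the live server; "mydb" is a placeholder.
    # With this many tables the lock count is large, so max_locks_per_transaction
    # may need to be raised for the dump to succeed.
    time pg_dump --schema-only --file=schema_only.sql mydb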

I appreciate the help. 

________________________________________
From: Adrian Klaver <adrian.kla...@aklaver.com>
Sent: Sunday, April 7, 2019 8:19 AM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

On 4/6/19 5:47 PM, senor wrote:
> Thanks Tom for the explanation. I assumed it was my ignorance of how the 
> schema was handled that was making this look like a problem that had already 
> been solved and I was missing something.
>
> I fully expected the "You're Doing It Wrong" part. That is out of my control 
> but not beyond my influence.
>
> I suspect I know the answer to this but have to ask. Using a simplified 
> example where there are 100K sets of 4 tables, each representing the output 
> of a single job, are there any shortcuts to upgrading that would circumvent 
> exporting the entire schema? I'm sure a different DB design would be better 
> but that's not what I'm working with.

An answer is going to depend on more information:

1) What is the time frame for moving from one version to another?
Both the setup and the actual downtime.

2) There are 500,000+ tables, but what is the amount of data involved?

3) Are all the tables active?

4) How are the tables distributed across databases in the cluster and
schemas in each database?


>
> Thanks
>
> ________________________________________
> From: Ron <ronljohnso...@gmail.com>
> Sent: Saturday, April 6, 2019 4:57 PM
> To: pgsql-general@lists.postgresql.org
> Subject: Re: pg_upgrade --jobs
>
> On 4/6/19 6:50 PM, Tom Lane wrote:
>
> senor <frio_cerv...@hotmail.com> writes:
>
>
> [snip]
>
> The --link option to pg_upgrade would be so much more useful if it
> weren't still bound to serially dumping the schemas of half a million
> tables.
>
>
>
> To be perfectly blunt, if you've got a database with half a million
> tables, You're Doing It Wrong.
>
> Heavy (really heavy) partitioning?
>
> --
> Angular momentum makes the world go 'round.
>
>
>


--
Adrian Klaver
adrian.kla...@aklaver.com

