[GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread marcin kowalski
I am experiencing an odd issue, i've noticed it on 9.3 , but i can
reproduce it on 9.6.

Basically, i have a database with a lot of schemas, but not that much data.
Each schema is maybe 2-4 GB in size, and often much less than that.

The database has ~300-500 schemas, each with ~100-300 tables. Generally a
few hundred thousand tables total. Entire cluster has 2 or 3 such databases.

As the amount of tables grows, the time it takes to vacuum an _empty_ table
grows as well. The table is in public schema, and it is the only table
there.

I made a simple testing script to make sure that these things are related.
I set up a blank database, create a table with one column in public and
restore one schema. Then i vacuum that table three times, measure the
execution times and repeat the process, adding another schema to db.

At ~200 tables it takes ~100ms for psql to issue a vacuum verbose and exit.
At 83K tables the time is already at ~1.5second. The progress appars to be
directly proportional to table amount, and grows linearly, eventually
crossing past 3seconds - for blank table with no data.

I think this may severely impact the entire vacuumdb run, but i have not
verified that yet.

This is irrelevant of amount of data restored, i am seeing the same
behavior with just schema restore, as well as with schema+data restores.

If anyone is interested i may upload the schema data + my benchmarking
script with collected whisper data from my test run (i've been plotting it
in grafana via carbon)

Is this a known issue? Can i do anything to improve performance here?


Re: [GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread marcin kowalski
Thanks, i'll redo the benchmarks and report back how things look now.

2017-01-04 20:33 GMT+01:00 Pavel Stehule :

>
>>> >
>>> > This is irrelevant of amount of data restored, i am seeing the same
>>> behavior with just schema restore, as well as with schema+data restores.
>>> >
>>> > If anyone is interested i may upload the schema data + my benchmarking
>>> script with collected whisper data from my test run (i've been plotting it
>>> in grafana via carbon)
>>> >
>>> > Is this a known issue? Can i do anything to improve performance here?
>>>
>>
>> we had 10K and more tables in one database - and we had lot of issues.
>>
>> I know so Tomas fixed some issues, but we need the stat files in tmpfs
>>
>> please, read this article  https://blog.pgaddict.com/pos
>> ts/the-two-kinds-of-stats-in-postgresql
>>
>
> http://hacksoclock.blogspot.cz/2014/04/putting-
> statstempdirectory-on-ramdisk.html
>
>
>>
>> Regards
>>
>> Pavel
>>
>> >
>>>
>>> --
>>> Jerry Sievers
>>> Postgres DBA/Development Consulting
>>> e: postgres.consult...@comcast.net
>>> p: 312.241.7800
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>


Re: [GENERAL] vacuum of empty table slows down as database table count grows

2017-01-05 Thread marcin kowalski
Well, unfortunately i am not seeing much difference. I shaved off maybe a
second of worst case run.

I guess i should just split the db into smaller ones, since tmpstats are
now per-db. Are there any other things i could try?

2017-01-05 8:18 GMT+01:00 marcin kowalski :

> Thanks, i'll redo the benchmarks and report back how things look now.
>
> 2017-01-04 20:33 GMT+01:00 Pavel Stehule :
>
>>
>>>> >
>>>> > This is irrelevant of amount of data restored, i am seeing the same
>>>> behavior with just schema restore, as well as with schema+data restores.
>>>> >
>>>> > If anyone is interested i may upload the schema data + my
>>>> benchmarking script with collected whisper data from my test run (i've been
>>>> plotting it in grafana via carbon)
>>>> >
>>>> > Is this a known issue? Can i do anything to improve performance here?
>>>>
>>>
>>> we had 10K and more tables in one database - and we had lot of issues.
>>>
>>> I know so Tomas fixed some issues, but we need the stat files in tmpfs
>>>
>>> please, read this article  https://blog.pgaddict.com/pos
>>> ts/the-two-kinds-of-stats-in-postgresql
>>>
>>
>> http://hacksoclock.blogspot.cz/2014/04/putting-statstempdire
>> ctory-on-ramdisk.html
>>
>>
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>> >
>>>>
>>>> --
>>>> Jerry Sievers
>>>> Postgres DBA/Development Consulting
>>>> e: postgres.consult...@comcast.net
>>>> p: 312.241.7800
>>>>
>>>>
>>>> --
>>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-general
>>>>
>>>
>>>
>>
>


[GENERAL] [general] rsync'd database requires reindex - why ?

2011-11-21 Thread marcin kowalski
Hi list.

I'm migrating a bunch of old suse 9.3 systems with postgresql 8.2 databases
to opensuse 11.4 systems with 8.2 databases (the exact same version -
8.2.14). From there, the databases will be migrated to postgresql 9.x with
custom process.

Let's assume that 9.3 machine is machine A, and new one is machine B.

For now, i'm simply stopping postgresql on both machines, and rsync-ing the
entire data directory from A to B. And then starting postgresql on the new
one.

I do it with rsync -azvr --delete parameters.

New machine has only bare minimum databases running prior to migration -
template0, template1, postgres. Which means that any extra user database
has its directory copied over from scratch. Postgresql starts up without
any issues after transfer is finished.

However, when running the new instance, it seems indexes are quite botched,
e.g.

"select * from table where primary_key_column='somevalue';" returns
nothing.

when doing plain "select * from table;" i can see the row that should have
been returned by the previous query.

Reindexing all databases seems to resolve the problem.

This is very reproductible, as i have lots of machines to migrate and i've
hit this problem at least three times now. For now, i'm just reindexing
databases after i've done rsyncing them, but is that behavior expected ?

Both machines have postgresql-8.2.14 built by hand, with the same
configuration and with contrib installed.