so 22. 9. 2018 v 8:48 odesílatel Hubert Zhang <hzh...@pivotal.io> napsal:
> But it looks like redundant to current GUC configuration and limits > > what do you mean by current GUC configuration? Is that the general block > number limit in your patch? If yes, the difference between GUC and > pg_diskquota catalog is that pg_diskquota will store different quota limit > for the different role, schema or table instead of a single GUC value. > storage is not relevant in this moment. I don't see to consistent to sets some limits via SET command, or ALTER X SET, and some other with CREATE QUOTA ON. The quotas or object limits, resource limits are pretty useful and necessary, but I don't see these like new type of objects, it is much more some property of current objects. Because we have one syntax for this purpose I prefer it. Because is not good to have two syntaxes for similar purpose. So instead CREATE DISC QUATA ON SCHEMA xxx some value I prefer ALTER SCHEMA xxx SET disc_quota = xxx; The functionality is +/- same. But ALTER XX SET was introduce first, and I don't feel comfortable to have any new syntax for similar purpose Regards Pavel > > On Sat, Sep 22, 2018 at 11:17 AM Pavel Stehule <pavel.steh...@gmail.com> > wrote: > >> >> >> pá 21. 9. 2018 v 16:21 odesílatel Hubert Zhang <hzh...@pivotal.io> >> napsal: >> >>> just fast reaction - why QUOTA object? >>>> Isn't ALTER SET enough? >>>> Some like >>>> ALTER TABLE a1 SET quote = 1MB; >>>> ALTER USER ... >>>> ALTER SCHEMA .. >>>> New DDL commans looks like too hard hammer . >>> >>> >>> It's an option. Prefer to consider quota setting store together: >>> CREATE DISK QUOTA way is more nature to store quota setting in a >>> separate pg_diskquota catalog >>> While ALTER SET way is more close to store quota setting in pg_class, >>> pg_role, pg_namespace. etc in an integrated way. >>> (Note that here I mean nature/close is not must, ALTER SET could also >>> store in pg_diskquota and vice versa.) >>> >> >> I have not a problem with new special table for storing this information. >> But it looks like redundant to current GUC configuration and limits. Can be >> messy do some work with ALTER ROLE, and some work via CREATE QUOTE. >> >> Regards >> >> Pavel >> >> >>> Here are some differences I can think of: >>> 1 pg_role is a global catalog, not per database level. It's harder to >>> tracker the user's disk usage in the whole clusters(considering 1000+ >>> databases). So the semantic of CREATE DISK QUOTA ON USER is limited: it >>> only tracks the user's disk usage inside the current database. >>> 2 using separate pg_diskquota could add more field except for quota >>> limit without adding too many fields in pg_class, e.g. red zone to give the >>> user a warning or the current disk usage of the db objects. >>> >>> On Fri, Sep 21, 2018 at 8:01 PM Pavel Stehule <pavel.steh...@gmail.com> >>> wrote: >>> >>>> >>>> >>>> pá 21. 9. 2018 v 13:32 odesílatel Hubert Zhang <hzh...@pivotal.io> >>>> napsal: >>>> >>>>> >>>>> >>>>> >>>>> >>>>> *Hi all,We redesign disk quota feature based on the comments from >>>>> Pavel Stehule and Chapman Flack. Here are the new >>>>> design.OverviewBasically, >>>>> disk quota feature is used to support multi-tenancy environment, >>>>> different >>>>> level of database objects could be set a quota limit to avoid over use of >>>>> disk space. A common case could be as follows: DBA could enable disk quota >>>>> on a specified database list. DBA could set disk quota limit for >>>>> tables/schemas/roles in these databases. Separate disk quota worker >>>>> process >>>>> will monitor the disk usage for these objects and detect the objects which >>>>> exceed their quota limit. Queries loading data into these “out of disk >>>>> quota” tables/schemas/roles will be cancelled.We are currently working at >>>>> init implementation stage. We would like to propose our idea firstly and >>>>> get feedbacks from community to do quick iteration.SQL Syntax (How to use >>>>> disk quota)1 Specify the databases with disk quota enabled in GUC >>>>> “diskquota_databases” in postgresql.conf and restart the database.2 DBA >>>>> could set disk quota limit for table/schema/role.CREATE DISK QUOTA tablea1 >>>>> ON TABLE a1 with (quota = ‘1MB’);CREATE DISK QUOTA roleu1 ON USER u1 with >>>>> (quota = ‘1GB’);CREATE DISK QUOTA schemas1 ON SCHEMA s1 with (quota = >>>>> ‘3MB’);* >>>>> >>>> >>>> just fast reaction - why QUOTA object? >>>> >>>> Isn't ALTER SET enough? >>>> >>>> Some like >>>> >>>> ALTER TABLE a1 SET quote = 1MB; >>>> ALTER USER ... >>>> ALTER SCHEMA .. >>>> >>>> New DDL commans looks like too hard hammer . >>>> >>>> >>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> *3 Simulate a schema out of quota limit case: suppose table a1 and >>>>> table a2 are both under schema s1.INSERT INTO a1 SELECT >>>>> generate_series(1,1000);INSERT INTO a2 SELECT >>>>> generate_series(1,3000000);SELECT pg_sleep(5)INSERT INTO a1 SELECT >>>>> generate_series(1,1000);ERROR: schema's disk space quota exceededDROP >>>>> TABLE a2;SELECT pg_sleep(5)INSERT INTO a1 SELECT >>>>> generate_series(1,1000);INSERT 0 1000ArchitectureDisk quota has the >>>>> following components.1. Quota Setting Store is where the disk quota >>>>> setting >>>>> to be stored and accessed. We plan to use catalog table pg_diskquota to >>>>> store these information. pg_diskquota is >>>>> like:CATALOG(pg_diskquota,6122,DiskQuotaRelationId){ NameData quotaname; >>>>> /* >>>>> diskquota name */ int16 quotatype; /* diskquota type name */ Oid >>>>> quotatargetoid; /* diskquota target db object oid*/ int32 quotalimit; /* >>>>> diskquota size limit in MB*/ int32 quotaredzone; /* diskquota redzone in >>>>> MB*/} FormData_pg_diskquota;2. Quota Change Detector is the monitor of >>>>> size >>>>> change of database objects. We plan to use stat collector to detect the >>>>> ‘active’ table list at initial stage. But stat collector has some >>>>> limitation on finding the active table which is in a running transaction. >>>>> Details see TODO section.3. Quota Size Checker is where to calculate the >>>>> size and compare with quota limit for database objects. According to >>>>> Pavel’s comment, autovacuum launcher and worker process could be a good >>>>> reference to disk quota. So we plan to use a disk quota launcher daemon >>>>> process and several disk quota worker process to finish this work. >>>>> Launcher >>>>> process is responsible for starting worker process based on a user defined >>>>> database list from GUC. Worker process will connect to its target database >>>>> and monitor the disk usage for objects in this database. In init stage of >>>>> worker process, it will call calculate_total_relation_size() to calculate >>>>> the size for each user table. After init stage, worker process will >>>>> refresh >>>>> the disk model every N seconds. Refreshing will only recalculate the size >>>>> of tables in ‘active’ table list, which is generated by Quata Change >>>>> Detector to minimize the cost.4. Quota Enforcement Operator is where to >>>>> check for the quota limitation at postgres backend side. We will firstly >>>>> implement it in ExecCheckRTPerms() as pre-running enforcement. It will >>>>> check the disk quota of tables being inserted or updated, and report error >>>>> if table’s or table’s schema’s or table’s owner’s quota limit is exceeded. >>>>> As a native feature, we plan to add more checkpoint to do running query >>>>> enforcement. For example, if a disk quota lefts 10MB quota, a query could >>>>> insert 1GB data. This query could be allowed in pre-running enforcement >>>>> check, but will be cancelled in running query enforcement check. >>>>> Therefore, >>>>> it can improve the accurate of disk quota usage. To achieve this, we plan >>>>> to add a checkpoint in lower API such as smgr_extened. Hence, the Quota >>>>> Enforcement Operator will check the disk quota usage when smgr_extened is >>>>> called. If the quota is over limited, current query will be cancelled. >>>>> Highlight1. Native feature.Support native Create/Drop Disk Quota SQL >>>>> statement.New catalog table pg_diskquota to store disk quota setting.2. >>>>> Auto DML/DDL detection. Table >>>>> create/update/insert/delete/vacuum/truncate/drop/schema_change/owner_change, >>>>> Schema create/drop and Role create/drop will be detected by disk quota >>>>> automatically. 3. Low cost disk quota checker.Worker process of disk quota >>>>> need to refresh the disk usage model every N seconds. Since recalculate >>>>> the >>>>> file size using stat() system call is expensive for a large number of >>>>> files, we use an ‘active’ table list to reduce the real work at each >>>>> iteration. A basic experiment on our init stage implementation on database >>>>> with 20K tables shows that the refresh cost is 1% cpu usage and will be >>>>> finished within 50ms. Todo/LimitationBefore we propose our patch, we plan >>>>> to enhance it with the following ideas:1. Setting database list with disk >>>>> quota enabled dynamically without restart database. Since we have the disk >>>>> quota launcher process, it could detect the new ‘diskquota_databases’ list >>>>> and start/stop the corresponding disk quota worker process.2. Enforcement >>>>> when query is running. Considering the case when there is 10MB quota left, >>>>> but next query will insert 10GB data. Current enforcement design will >>>>> allow >>>>> this query to be executed. This is limited by the ‘active’ table detection >>>>> is generated by stat collector. Postgres backend will only send table stat >>>>> information to collector only when the transaction ends. We need a new way >>>>> to detect the ‘active’ table even when this table is being modified inside >>>>> a running transaction.3. Monitor unlimited number of databases. Current we >>>>> set the max number of disk quota worker process to be 10 to reduce the >>>>> affection normal workload. But how about if we want to monitor the disk >>>>> quota of more than 10 databases? Our solution is to let disk quota >>>>> launcher >>>>> to manage a queue of database need to be monitored. And disk quota worker >>>>> process consuming the queue and refresh the disk usage/quota for this >>>>> database. After some periods, worker will return the database to the >>>>> queue, >>>>> and fetch the top database from queue to process. The period determine the >>>>> delay of detecting disk quota change. To implement this feature, we need >>>>> to >>>>> support a subprocess of postmaster to rebind to another database instead >>>>> of >>>>> the database binded in InitPostgres().4. Support active table detection on >>>>> vacuum full and vacuum analyze. Currently vacuum full and vacuum analyze >>>>> are not tracked by stat collector.Thanks to Heikki, Pavel Stehule,Chapman >>>>> Flack for the former comments on disk quota feature. Any comments on how >>>>> to >>>>> improve disk quota feature are appreciated.* >>>>> >>>>> >>>>> On Mon, Sep 3, 2018 at 12:05 PM, Pavel Stehule < >>>>> pavel.steh...@gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> 2018-09-03 3:49 GMT+02:00 Hubert Zhang <hzh...@pivotal.io>: >>>>>> >>>>>>> Thanks Pavel. >>>>>>> Your patch did enforcement on storage level(md.c or we could also >>>>>>> use smgr_extend). It's straight forward. >>>>>>> But I prefer to implement disk_quota as a feature with following >>>>>>> objectives: >>>>>>> 1 set/alter disk quota setting on different database objects, e.g. >>>>>>> user, database, schema etc. not only a general GUC, but we could set >>>>>>> separate quota limit for a specific objects. >>>>>>> 2 enforcement operator should work at two positions: before query is >>>>>>> running and when query is running. The latter one's implementation maybe >>>>>>> similar to your patch. >>>>>>> >>>>>> >>>>>> The patch was just example. The resource quotes should be more >>>>>> complex - per partition, table, schema, database, user - so GUC are >>>>>> possible, but not very user friendly. >>>>>> >>>>>> Our case is specific, but not too much. The servers are used for >>>>>> multidimensional analyses - and some tables can grow too fast (COPY, >>>>>> INSERT >>>>>> SELECT). We need to solve limits immediately. The implementation is >>>>>> simple, >>>>>> so I did it. Same implementation on database level, or schema level needs >>>>>> some more locks, so it will not be too effective. The resource management >>>>>> can be complex very complex, and I expect so it will be hard work. >>>>>> >>>>>> Regards >>>>>> >>>>>> Pavel >>>>>> >>>>>> >>>>>>> On Sun, Sep 2, 2018 at 8:44 PM, Pavel Stehule < >>>>>>> pavel.steh...@gmail.com> wrote: >>>>>>> >>>>>>>> Hi >>>>>>>> >>>>>>>> 2018-09-02 14:18 GMT+02:00 Hubert Zhang <hzh...@pivotal.io>: >>>>>>>> >>>>>>>>> Thanks Chapman. >>>>>>>>> @Pavel, could you please explain more about your second >>>>>>>>> suggestion "implement some quotas on storage level?" >>>>>>>>> >>>>>>>> >>>>>>>> See attached patch - it is very simple - and good enough for our >>>>>>>> purposes. >>>>>>>> >>>>>>>> Regards >>>>>>>> >>>>>>>> Pavel >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> We will not keep the long-lived processes attach to all >>>>>>>>> databases(just like you mentioned servers with thousands of >>>>>>>>> databases) >>>>>>>>> And you are right, we could share ideas with autovacuum process, >>>>>>>>> fork worker processes in need. >>>>>>>>> "autovacuum checks for tables that have had a large number of >>>>>>>>> inserted, updated or deleted tuples. These checks use the statistics >>>>>>>>> collection facility" >>>>>>>>> diskquota process is similar to autovacuum at caring about insert, >>>>>>>>> but the difference is that it also care about vucuum full, truncate >>>>>>>>> and >>>>>>>>> drop. While update and delete may not be interested since no file >>>>>>>>> change >>>>>>>>> happens. So a separate diskquota process is preferred. >>>>>>>>> >>>>>>>>> So if we implemented disk quota as a full native feature, and in >>>>>>>>> the first initial version I prefer to implement the following >>>>>>>>> features: >>>>>>>>> 1 Fork diskquota launcher process under Postmaster serverloop, >>>>>>>>> which is long-lived. >>>>>>>>> 2 Diskquota launcher process is responsible for >>>>>>>>> creating diskquota worker process for every database. >>>>>>>>> 3 DIskquota setting is stored in a separate catalog table for each >>>>>>>>> database. >>>>>>>>> 4 Initialization stage, Diskquota launcher process creates >>>>>>>>> diskquota worker process for all the databases(traverse like >>>>>>>>> autovacuum). Worker process calculates disk usage of db objects and >>>>>>>>> their diskquota setting. If any db object exceeds its quota limit, >>>>>>>>> put them >>>>>>>>> into the blacklist in the shared memory, which will later be used by >>>>>>>>> enforcement operator. Worker process exits when works are done. >>>>>>>>> 5 Running stage, Diskquota launcher process creates diskquota worker >>>>>>>>> process for the database with a large number of insert, copy, >>>>>>>>> truncate, >>>>>>>>> drop etc. or create disk quota statement. Worker process updates the >>>>>>>>> file >>>>>>>>> size for db objects containing the result relation, and compare with >>>>>>>>> the >>>>>>>>> diskquota setting. Again, if exceeds quota limit, put them into >>>>>>>>> blacklist, >>>>>>>>> remove from blacklist vice versa. Worker process exits when works >>>>>>>>> are done and a GUC could control the frequency of worker process >>>>>>>>> restart to >>>>>>>>> a specific database. As you know, this GUC also controls the delay >>>>>>>>> when we >>>>>>>>> do enforcement. >>>>>>>>> 6 Enforcement. When postgres backend executes queries, check the >>>>>>>>> blacklist in shared memory to determine whether the query is >>>>>>>>> allowed(before >>>>>>>>> execute) or need rollback(is executing)? >>>>>>>>> >>>>>>>>> If we implemented disk quota as an extension, we could just use >>>>>>>>> background worker to start diskquota launcher process and use >>>>>>>>> RegisterDynamicBackgroundWorker() to fork child diskquota worker >>>>>>>>> processes by the launcher process as suggested by @Chapman. >>>>>>>>> Diskquota setting could be stored in user table in a separate schema >>>>>>>>> for >>>>>>>>> each database(Schema and table created by create extension statement) >>>>>>>>> just >>>>>>>>> like what Heikki has done in pg_quota project. But in this case, we >>>>>>>>> need to >>>>>>>>> create extension for each database before diskquota worker process >>>>>>>>> can be >>>>>>>>> set up for that database. >>>>>>>>> >>>>>>>>> Any comments on the above design and which is preferred, native >>>>>>>>> feature or extension as the POC? >>>>>>>>> >>>>>>>>> >>>>>>>>> -- Hubert >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Aug 31, 2018 at 3:32 AM, Pavel Stehule < >>>>>>>>> pavel.steh...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> 2018-08-30 16:22 GMT+02:00 Chapman Flack <c...@anastigmatix.net>: >>>>>>>>>> >>>>>>>>>>> On 08/30/2018 09:57 AM, Hubert Zhang wrote: >>>>>>>>>>> >>>>>>>>>>> > 2 Keep one worker process for each database. But using a >>>>>>>>>>> parent/global >>>>>>>>>>> > quota worker process to manage the lifecycle of database level >>>>>>>>>>> worker >>>>>>>>>>> > processes. It could handle the newly created database(avoid >>>>>>>>>>> restart >>>>>>>>>>> > database) and save resource when a database is not used. But >>>>>>>>>>> this needs to >>>>>>>>>>> > change worker process to be hierarchical. Postmaster becomes >>>>>>>>>>> the grandfather >>>>>>>>>>> > of database level worker processes in this case. >>>>>>>>>>> >>>>>>>>>>> I am using background workers this way in 9.5 at $work. >>>>>>>>>>> >>>>>>>>>>> In my case, one worker lives forever, wakes up on a set period, >>>>>>>>>>> and >>>>>>>>>>> starts a short-lived worker for every database, waiting for each >>>>>>>>>>> one before starting the next. >>>>>>>>>>> >>>>>>>>>>> It was straightforward to implement. Looking back over the code, >>>>>>>>>>> I see the global worker assigns its own PID to >>>>>>>>>>> worker.bgw_notify_pid >>>>>>>>>>> of each of its children, and also obtains a handle for each child >>>>>>>>>>> from RegisterDynamicBackgroundWorker(). >>>>>>>>>>> >>>>>>>>>>> I imagine the global quota worker would prefer to start workers >>>>>>>>>>> for every database and then just wait for notifications from any >>>>>>>>>>> of them, but that seems equally straightforward at first glance. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> There are servers with thousands databases. Worker per database >>>>>>>>>> is not good idea. >>>>>>>>>> >>>>>>>>>> It should to share ideas, code with autovacuum process. >>>>>>>>>> >>>>>>>>>> Not sure, how to effective implementation based on bg workers can >>>>>>>>>> be. On servers with large set of databases, large set of tables it >>>>>>>>>> can >>>>>>>>>> identify too big table too late. >>>>>>>>>> >>>>>>>>>> Isn't better to implement some quotas on storage level? >>>>>>>>>> >>>>>>>>>> Regards >>>>>>>>>> >>>>>>>>>> Pavel >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -Chap >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Hubert Zhang >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Thanks >>>>>>> >>>>>>> Hubert Zhang >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Thanks >>>>> >>>>> Hubert Zhang >>>>> >>>> >>> >>> -- >>> Thanks >>> >>> Hubert Zhang >>> >> > > -- > Thanks > > Hubert Zhang >