Re: Hot to model data in DB (PostgreSQL) for SNMP-like multiple configurations

Alban Hertroys Fri, 08 Mar 2019 07:16:33 -0800

Is there a reason not to use a relational model instead of json(b) here? I
think that is in fact considered best practice.


On Fri, 8 Mar 2019 at 15:40, Alexandru Lazarev <alexandru.laza...@gmail.com>
wrote:

> I am working on product managing and monitoring Network (NMS-like
> products).
>
> Product manages configuration of network devices, for now each device has
> stored its configuration in simple table - this was the original design.
>
> CREATE TABLE public.configuration(
>   id integer NOT NULL,
>   config json NOT NULL,
>   CONSTRAINT configuration_pkey PRIMARY KEY (id),)
>
> A config looks like:
>
> {
>     "_id": 20818132,
>     "type": "Modem",
>     "data": [{
>             "oid": "1.3.6.1.4.1.9999.3.10.2.2.25.4.1.40",
>             "instance": "24",
>             "value": "null"
>         },
>         {
>             "oid": "1.3.6.1.4.1.9999.3.5.10.1.86",
>             "instance": "0",
>             "value": "562"
>         },
>         {
>             "oid": "1.3.6.1.4.1.9999.3.5.10.3.92.4.1",
>             "instance": "0",
>             "value": "0"
>         },
>         {
>             "oid": "1.3.6.1.4.1.9999.3.10.2.2.25.4.1.43",
>             "instance": "24",
>             "value": "vlan24"
>         },
>         {
>             "oid": "1.3.6.1.4.1.9999.3.10.2.2.25.4.1.43",
>             "instance": "25",
>             "value": "vlan25"
>         }
>     ]}
>
> And there are many plv8 (java script procedural language extension for
> PostgreSQL) stored procedures working on bulks of such config, reading some
> OIDs, changing them conditionally, removing some of them and adding others,
> especially in use cases like: There are some upper-level META-configuration
> of different level, which during change have to update all their updated
> parameters to all affected leaves configs. An simple test-example (but
> without touching 'data' node)
>
> CREATE OR REPLACE FUNCTION public.process_jsonb_plv8()
>   RETURNS void AS$BODY$
> var CFG_TABLE_NAME = "configurations";
> var selPlan = plv8.prepare( "select c.config from " + CFG_TABLE_NAME + " c 
> where c.id = $1", ['int'] );
> var updPlan = plv8.prepare( 'update ' + CFG_TABLE_NAME + ' set config = $1 
> where id = $2', ['jsonb','int'] );
>
> try {
>
>     var ids = plv8.execute('select id from devices');
>
>     for (var i = 0; i < ids.length; i++) {
>         var db_cfg = selPlan.execute(ids[i].id); //Get current json config 
> from DB
>         var cfg = db_cfg[0].config;
>         cfg["key0"] = 'plv8_json'; //-add some dummy key
>         updPlan.execute(cfg, ids[i].id); //put uopdated JSON config in DB
>         plv8.elog(NOTICE, "UPDATED = " + ids[i].id);
>
>
>     }} finally {
>     selPlan.free();
>     updPlan.free();}
> return;$BODY$
>   LANGUAGE plv8 VOLATILE
>   COST 100;
>
> For real use-cases plv8 SPs are more complicated, doing FOR-LOOP through
> ALL OIDs object of 'data' array, checking if it is looking for and update
> value an/or remove it and/or add newer if necessary.
>
> Since number of devices in DB increased from several hundreds to 40K or
> even 70K, and number of OID+Instance combinations also increased from
> several hundred to ~1K and sometimes up to 10K within a config, we start
> facing slowness in bulk (especially global -> update to ALL Devices)
> updates/searches.
>
> In order to get rid off FOR LOOP step for each configuration I've
> converted data-node from array to object (key-value model), something like
> :
>
> {
>     "_id": 20818132,
>     "type": "Modem",
>     "data": {
>         "1.3.6.1.4.1.9999.3.10.2.2.25.4.1.40": {
>             "24": "null"
>         },
>         "1.3.6.1.4.1.9999.3.5.10.1.86": {
>             "0": "562"
>         },
>         "1.3.6.1.4.1.9999.3.5.10.3.92.4.1": {
>             "0": "0"
>         },
>         "1.3.6.1.4.1.9999.3.10.2.2.25.4.1.43": {
>             "24": "vlan24",
>             "25": "vlan25"
>         }
>     }}
>
> Now in order to get a concrete OID (e.g.
> "1.3.6.1.4.1.9999.3.10.2.2.25.4.1.43") and/or its instance I do 1-2 *O(1)*
> operations instead *O(n)*. And it become a bit faster. After I've changed
> column type from json to jsonb - I've got a lot of memory issues with
> plv8 stored procedures, so now ideas is:
>
> *What are the best practices to store such data and use cases in DB?*
> taking in considerations following: - Bulk and global updates are often
> enough (user-done operation) - several times per week and it takes long
> time - several minutes, annoying user experience. - Consulting some OIDs
> only from concrete config is medium frequency use case - Consulting ALL
> devices have some specific OID (SNMP Parameter) settled to a specific value
> - medium frequency cases. - Consult (read) a configuration for a specific
> device as a whole document - often use case (it is send to device as json
> or as converted CSV, it is send in modified json format to other utilities,
> etc)
>
> One of suggestion from other oppinions is to move ALL configurations to
> simple plain relational table
>
> CREATE TABLE public.configuration_plain(
>   device_id integer,
>   oid text,
>   instance text,
>   value text)
>
> Looking like
>
> *id*
>
> *oid*
>
> *instance*
>
> *value*
>
> 20818132
>
> 1.3.6.1.4.1.9999.2.13
>
> 0
>
> VSAT
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.15
>
> 0
>
> 0
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.17
>
> 0
>
> 0
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.18
>
> 0
>
> 1
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.19
>
> 0
>
> 2
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.8.1.1
>
> 24
>
> 24
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.8.1.1
>
> 25
>
> 25
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.8.1.2
>
> 24
>
> vlan24
>
> 20818132
>
> 1.3.6.1.4.1.9999.3.10.2.2.10.8.1.2
>
> 25
>
> VLAN_25
>
> And now I end with a table of ~33 M rows for 40K devices * (700-900
> OID+Instance combinations). Some simple selects and updates (especially if
> I add simple indexes on id, oid columns) works faster than JSON (less than
> 1 sec updating one OID for ALL devices), but on some stored procedures
> where I need to do some checks and business logic before manipulating
> concrete parameters in configuration - performance decrease again from 10
> to 25 seconds in below example with each nee added operation:
>
> CREATE OR REPLACE FUNCTION public.test_update_bulk_configuration_plain_plpg(
>     sql_condition text, -- something like 'select id from devices'
>     new_elements text, --collection of OIDs to be Added or Update, could be 
> JSON Array or comma separated list, containing 1 or more (100) OIDs
>     oids_to_delete text --collection of OIDs to Delete
>     )
>   RETURNS void AS$BODY$DECLARE
>     r integer;
>     cnt integer;
>     ids int[];
>     lid int;BEGIN
>     RAISE NOTICE 'start';
>     EXECUTE 'SELECT ARRAY(' || sql_condition || ')' into ids;
>     FOREACH lid IN ARRAY ids
>     LOOP
>         -- DELETE
>         -- Some business logic
>         -- FOR .. IF .. BEGIN
>             delete from configuration_plain c where c.oid = 
> '1.3.6.1.4.1.9999.3.5.10.3.201.1.1' and instance = '10' and c.device_id = lid;
>             delete from configuration_plain c where c.oid = 'Other OID' and 
> instance = 'Index' and c.device_id = lid;
>             -- other eventual deletes
>         --END
>
>         -- UPDATE
>         -- Some business logic
>         -- FOR .. IF .. BEGIN
>             update configuration_plain c set value = '2' where c.oid = 
> '1.3.6.1.4.1.9999.3.5.10.3.87' and c.device_id = lid;
>             update configuration_plain c set value = '2' where c.oid = 
> '1.3.6.1.4.1.9999.3.5.10.3.201.1.1' and instance = '1' and c.device_id = lid;
>             -- other eventual updates
>         -- END
>
>         --INSERT
>         insert into configuration_plain (id, oid, instance, value) values 
> (lid,'1.3.6.1.4.1.9999.3.5.10.3.201.1.1', '11', '11');
>         -- OTHER eventually....
>         insert into configuration_plain (id, oid, instance, value) values 
> (lid,'OTHER_OID', 'Idx', 'Value of OID');
>     END LOOP;
>     RAISE NOTICE 'end';
>     RETURN;END$BODY$
>   LANGUAGE plpgsql VOLATILE
>   COST 100;
>
> So any best practices and advice on such data and use cases modeling in DB?
>
> Regards,
>
> AlexL
>


-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: Hot to model data in DB (PostgreSQL) for SNMP-like multiple configurations

Reply via email to