Re: Ideas about presenting data coming from sensors

2025-02-14 Thread Allan Kamau
On Thu, Feb 13, 2025 at 9:44 PM Thiemo Kellner wrote:

> 13.02.2025 10:54:05 Achilleas Mantzios - cloud <
> a.mantz...@cloud.gatewaynet.com>:
>
> > If we followed a strict normalized approach then we would additionally
> > create 11 tables, one for each tag of type c). And we are not guaranteed
> > that the same tags would have the same structure over the whole
> > fleet/manufacturers. So we are thinking of putting all semi-structured
> > data of tags of type c) into one table with a single col of type jsonb.
> > From what I read, timescaledb plays nice with jsonb (or at least not
> > badly).
> >
> > Do you ppl see any gotcha with this approach?
> >
> > For starters we will not convert to timescaledb yet, but will store and
> > handle the data like normal tables, at least until we grasp the ins and
> > outs of this.
>
> I have not come to see the real advantage of putting data into something
> like JSON or XML columns. Sure, you make life easier initially, saving the
> time to put the values into columns one by one, but on the other hand you
> have the hassle of dissecting the JSON, XML, you name it, every time you
> retrieve/select the data. While the query might stay stable, the
> computational effort of dissection has to be spent on every query execution.
>
> For c) you could go to 6th normal form, or whatever number that is, with a
> table like TAG(id, tag_name, tag_value). You would have to convert the
> values to strings to enter them. It is a pain in the butt, imho, retrieving
> them again. You can alleviate that by having a specific column/table per
> data type, e.g. TAG(id, tag_name, tag_value_varchar, tag_value_date, ...).
>
> What is the downside of having a table for each of the special tags? More
> effort in setting up.
> The upside is less effort in retrieval, and a much more understandable
> model.
>
> If your tag structure is volatile, you might have generic column names on
> a table, mapping them in a view to speaking names. Taking this further
> leads to anchor modelling: https://www.anchormodeling.com/
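Thiemo's typed-columns suggestion might be sketched as follows (table and column names are illustrative, not from the thread; `num_nonnulls` requires PostgreSQL 9.6 or later):

```sql
-- One generic tag table with one typed value column per data type.
-- Hypothetical names; adapt to the actual tag metadata.
CREATE TABLE tag (
    id            bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    vessel_id     int          NOT NULL,
    recorded_at   timestamptz  NOT NULL,
    tag_name      text         NOT NULL,
    value_text    text,
    value_numeric double precision,
    value_date    timestamptz,
    -- exactly one of the typed columns should be populated per row
    CHECK (num_nonnulls(value_text, value_numeric, value_date) = 1)
);
```

Retrieval then has to pick the right typed column per tag, which is the "pain in the butt" Thiemo mentions.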
>
>
>
Though you haven't asked for comments on database design, I have a
suggestion: have you considered using table partitioning based on a
well-defined date interval, maybe monthly?
Here your DB data-population application would determine the name of the
table based on the current date, check for the existence of that table,
create the child table if it does not exist, and then proceed to populate
it accordingly.
This may prove useful in the future.
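Allan's per-month scheme can be sketched with declarative partitioning (PostgreSQL 10+); the application-side "create if missing" step becomes a CREATE TABLE IF NOT EXISTS. All names here are hypothetical:

```sql
-- Parent table, range-partitioned on the timestamp column.
CREATE TABLE sensor_data (
    vessel_id   int          NOT NULL,
    tag_name    text         NOT NULL,
    recorded_at timestamptz  NOT NULL,
    payload     jsonb
) PARTITION BY RANGE (recorded_at);

-- The loader derives the partition name from the current date and
-- creates the partition on demand before inserting:
CREATE TABLE IF NOT EXISTS sensor_data_2025_02
    PARTITION OF sensor_data
    FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
```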

-Allan.


Re: Best Approach for Swapping a Table with its Copy

2025-02-14 Thread Greg Sabino Mullane
On Fri, Feb 14, 2025 at 1:02 AM Michał Kłeczek  wrote:

> Create index concurrently and then fiddle with the catalog tables to
> define the constraint using this index?
>

You mean an ALTER TABLE ... ADD CONSTRAINT ... EXCLUDE without actually
doing an ALTER TABLE. Nope, that's far worse than the pg_repack
shenanigans, as we would be creating new catalog objects.

Ideally, someone would add support for USING INDEX for exclusion
constraints.
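For context, USING INDEX currently works for unique and primary-key constraints but not for exclusion constraints, which is the gap Greg is pointing at. A sketch with hypothetical names (the GiST example assumes a range column `during` and the btree_gist extension for `=` on an integer):

```sql
-- Works today: build a unique index without blocking writes, then
-- promote it to a constraint.
CREATE UNIQUE INDEX CONCURRENTLY bookings_room_start_uq
    ON bookings (room_id, starts_at);
ALTER TABLE bookings
    ADD CONSTRAINT bookings_room_start_uq UNIQUE USING INDEX bookings_room_start_uq;

-- No equivalent for exclusion constraints: this builds its GiST index
-- non-concurrently while holding a strong lock on the table.
ALTER TABLE bookings
    ADD CONSTRAINT bookings_no_overlap
    EXCLUDE USING gist (room_id WITH =, during WITH &&);
```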

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


Re: Ideas about presenting data coming from sensors

2025-02-14 Thread Achilleas Mantzios - cloud


On 2/14/25 11:34, Allan Kamau wrote:



On Thu, Feb 13, 2025 at 9:44 PM Thiemo Kellner wrote:


13.02.2025 10:54:05 Achilleas Mantzios - cloud
:

> If we followed a strict normalized approach then we would additionally
> create 11 tables, one for each tag of type c). And we are not guaranteed
> that the same tags would have the same structure over the whole
> fleet/manufacturers. So we are thinking of putting all semi-structured
> data of tags of type c) into one table with a single col of type jsonb.
> From what I read, timescaledb plays nice with jsonb (or at least not
> badly).
>
> Do you ppl see any gotcha with this approach?
>
> For starters we will not convert to timescaledb yet, but will store and
> handle the data like normal tables, at least until we grasp the ins and
> outs of this.

I have not come to see the real advantage of putting data into
something like JSON or XML columns. Sure, you make life easier
initially, saving the time to put the values into columns one by one,
but on the other hand you have the hassle of dissecting the JSON, XML,
you name it, every time you retrieve/select the data. While the query
might stay stable, the computational effort of dissection has to be
spent on every query execution.

For c) you could go to 6th normal form, or whatever number that is,
with a table like TAG(id, tag_name, tag_value). You would have to
convert the values to strings to enter them. It is a pain in the butt,
imho, retrieving them again. You can alleviate that by having a
specific column/table per data type, e.g. TAG(id, tag_name,
tag_value_varchar, tag_value_date, ...).

What is the downside of having a table for each of the special tags?
More effort in setting up.
The upside is less effort in retrieval, and a much more
understandable model.

If your tag structure is volatile, you might have generic column
names on a table, mapping them in a view to speaking names. Taking
this further leads to anchor modelling: https://www.anchormodeling.com/



Though you haven't asked for comments on database design, I have a
suggestion: have you considered using table partitioning based on a
well-defined date interval, maybe monthly?
Here your DB data-population application would determine the name of
the table based on the current date, check for the existence of that
table, create the child table if it does not exist, and then proceed
to populate it accordingly.


Absolutely, this will be a necessity sooner or later. As I see the future
unfolding, there will be a new timescaledb instance functioning at the start
as a logical replica of our maindb, and timescaledb will handle all the
partitioning (I hope). At some point we will switch the app (both the data
loading from the vessels and the queries) to the timescaledb instance and
free the space in the maindb, or something along those lines.



This may prove useful in future.

-Allan.


Re: Ideas about presenting data coming from sensors

2025-02-14 Thread Achilleas Mantzios - cloud



On 2/13/25 20:44, Thiemo Kellner wrote:

13.02.2025 10:54:05 Achilleas Mantzios - cloud 
:


If we followed a strict normalized approach then we would additionally create
11 tables, one for each tag of type c). And we are not guaranteed that the
same tags would have the same structure over the whole fleet/manufacturers.
So we are thinking of putting all semi-structured data of tags of type c)
into one table with a single col of type jsonb. From what I read, timescaledb
plays nice with jsonb (or at least not badly).

Do you ppl see any gotcha with this approach?

For starters we will not convert to timescaledb yet, but will store and
handle the data like normal tables, at least until we grasp the ins and outs
of this.

I have not come to see the real advantage of putting data into something like
JSON or XML columns. Sure, you make life easier initially, saving the time to
put the values into columns one by one, but on the other hand you have the
hassle of dissecting the JSON, XML, you name it, every time you
retrieve/select the data. While the query might stay stable, the
computational effort of dissection has to be spent on every query execution.
Yes, jsonb is for cases where we simply do not know, do not care to know, or
do not have the time to break down something that will change from ship to
ship; the burden of fighting with the electricians to correctly define the
tags of a) and b) is already too much. On top of that, we hear varying
"opinions" on those of type c) (one of them being ... nah ... nobody needs
those), so we are saving our mental health by going with JSON. Of course this
is under review; it could change without much trouble at this stage.
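A minimal sketch of the single-table jsonb approach under discussion (table, column, and key names are made up, not from the thread); a GIN index keeps containment queries on the payload usable:

```sql
CREATE TABLE tag_c (
    vessel_id   int         NOT NULL,
    tag_name    text        NOT NULL,
    recorded_at timestamptz NOT NULL,
    payload     jsonb       NOT NULL
);

-- The jsonb_path_ops operator class gives a smaller, faster index that
-- supports only the @> containment operator.
CREATE INDEX tag_c_payload_idx ON tag_c USING gin (payload jsonb_path_ops);

-- Example retrieval, assuming a hypothetical "alarm_active" key:
SELECT vessel_id, recorded_at
FROM tag_c
WHERE payload @> '{"alarm_active": true}';
```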


For c) you could go to 6th normal form, or whatever number that is, with a
table like TAG(id, tag_name, tag_value). You would have to convert the values
to strings to enter them. It is a pain in the butt, imho, retrieving them
again. You can alleviate that by having a specific column/table per data
type, e.g. TAG(id, tag_name, tag_value_varchar, tag_value_date, ...).

What is the downside of having a table for each of the special tags? More
effort in setting up.
The upside is less effort in retrieval, and a much more understandable model.
The downside is that for every vessel I might end up with a new table for
every new setup, which sucks. Adrian's idea is fine, but we don't have users
to support the variable dynamic definitions case by case; we would have to do
that ourselves, so I would rather offload it all to the devs for lack of a
better option. Plus, this is the beginning: we are starting lean, not
over-engineering without a reason, and will augment as we go.


If your tag structure is volatile, you might have generic column names on a
table, mapping them in a view to speaking names. Taking this further leads to
anchor modelling: https://www.anchormodeling.com/







Re: Ideas about presenting data coming from sensors

2025-02-14 Thread Achilleas Mantzios - cloud



On 2/13/25 19:12, Adrian Klaver wrote:

On 2/13/25 01:53, Achilleas Mantzios - cloud wrote:

Now my problem is on the design . We have :

a) tags that have primitive values, float4 lets say - this is the 
majority, e.g. 60% of all tags


b) tags that contain alarm data, also with a defined structure, which
have additional data such as the time the alarm was initially set,
acknowledgement of the alarm, and validity of the alarm. Those
represent something like 35% of all tags


c) tags that are basically polymorphic (about 11 of them all in all);
each one has a different structure, and their fields/cols range from a
few (1) up to many (25)


We have a table for a) and a table for b).

If we followed a strict normalized approach then we would additionally
create 11 tables, one for each tag of type c). And we are not
guaranteed that the same tags would have the same structure over the
whole fleet/manufacturers. So we are thinking of putting all
semi-structured data of tags of type c) into one table with a single
col of type jsonb.
From what I read, timescaledb plays nice with jsonb (or at least not
badly).


Do you ppl see any gotcha with this approach?


The only thing I can see at this time is: 'And we are not guaranteed 
that the same tags would have the same structure over the whole 
fleet/manufacturers.'


That would seem to me to point to a need for a table that maps a 
structure template to a fleet or manufacturer and a corresponding 
field in table c) that holds the fleet/manufacturer information.
Nice idea. The alternative is to just stuff everything (self-contained)
inside JSON and let the software do the relevant logic, a somewhat more
liberal approach.
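Adrian's mapping-table suggestion could be sketched roughly like this (all names are hypothetical): a template table describing each fleet/manufacturer structure, referenced from the type-c data table.

```sql
-- One row per known payload structure for a fleet/manufacturer.
CREATE TABLE tag_template (
    template_id  int PRIMARY KEY,
    manufacturer text  NOT NULL,
    tag_name     text  NOT NULL,
    -- a description of the expected payload fields, e.g. JSON-Schema-like
    structure    jsonb NOT NULL
);

-- Type-c readings carry the template they conform to.
CREATE TABLE tag_c_data (
    vessel_id   int         NOT NULL,
    template_id int         NOT NULL REFERENCES tag_template,
    recorded_at timestamptz NOT NULL,
    payload     jsonb       NOT NULL
);
```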




For starters we will not convert yet to timescaledb, but store them 
and handle them like normal tables. At least until we grasp the ins 
and outs of this.






Thank you!













Re: Best Approach for Swapping a Table with its Copy

2025-02-14 Thread Greg Sabino Mullane
On Thu, Feb 13, 2025 at 6:06 PM Marcelo Fernandes wrote:

> > It's technically possible to do something similar for your use case, but
> > it's not trivial. All the cab to trailer wires must be precisely changed.
> > Everything directly related to the data must be swapped: heap, indexes,
> > toast.
>
> I'd really appreciate knowing more about how I can do this, as I think this
> is the crux of what I am trying to solve.
>

The pg_repack link posted earlier has the details on how it is done. But
messing with system catalogs like this is highly discouraged, for good
reasons. Still, if you need to go that route, test heavily and post the
solutions here for feedback.
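For illustration only, and exactly the kind of thing the warning above is about, do NOT run this on a real cluster: conceptually, the swap pg_repack performs amounts to exchanging the relfilenode values of the two heaps inside one transaction (pg_repack does this in C, with proper locking and cache invalidation, and also swaps toast tables and index pairs). Table names are hypothetical, and the lookup ignores schemas:

```sql
-- CONCEPTUAL sketch only; requires superuser and is unsupported.
BEGIN;
LOCK TABLE orig, copy IN ACCESS EXCLUSIVE MODE;
WITH ids AS (
    SELECT o.oid AS o_oid, o.relfilenode AS o_node,
           c.oid AS c_oid, c.relfilenode AS c_node
    FROM pg_class o, pg_class c
    WHERE o.relname = 'orig' AND c.relname = 'copy'
)
UPDATE pg_class p
SET relfilenode = CASE p.oid WHEN ids.o_oid THEN ids.c_node
                             ELSE ids.o_node END
FROM ids
WHERE p.oid IN (ids.o_oid, ids.c_oid);
-- The same swap would be needed for the TOAST tables and every index pair.
COMMIT;
```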

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


Re: Best Approach for Swapping a Table with its Copy

2025-02-14 Thread Greg Sabino Mullane
On Fri, Feb 14, 2025 at 12:41 AM Laurenz Albe wrote:

> Moreover, you have to make sure to send out invalidation messages so that
> every session that caches statistics or
> execution plans for the tables discards them.


Hmm...is that really necessary? Because if so, there is no direct SQL-level
way to do that I am aware of, so we are back to a C extension. Or just
restarting the cluster :)

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support