Re: [GENERAL] dynamic schema modeling and performance

Vincent Elschot Fri, 14 Apr 2017 11:24:07 -0700


Op 14/04/2017 om 19:03 schreef Rj Ewing:

We do know where we want to end up. We've had the application runningfor a while using a triple store db. We're looking to move away fromthe triple store due to performance issues. Our core concept is thateach project can define a set of entities and their relations. Eachentity has a set of attributes. We need to be able to efficiently runad-hoc queries across multiple entities in a project, filtering viathe attribute values

What kind of queries are you talking about?

I think our business needs probably eliminate the possibility of dataintegrity at the db level. We currently do application level datavalidation.

Application level checks can only be done if you exclusively lock thedatabase from before you start the check until the moment you nolongerneed the certainty.That's usually a no-go because it makes your data single-user for theduration of the check.

Regarding EAV, is there a good way to do this? Most everything I readonline says that EAV is a terrible idea and performance is lackluster. However there are 6NF advocators who say that done right, itis highly performant. Are there any articles on the correct way toimplement EAV?

Performance always depends entirely on what you are doing with it, andon what scale. This is something that you should experiment with.


would jsonb or eav likely provide better query performance?

Can't tell without more information about what you actually do with thedata.

But dynamic schemas can be just fine, as long as your application is100% in control over what can and cannot be done,and as long as you use separate databases per customer/project/whatever.You will probably want to scale up at some pointand move customers to different servers, so you might aswell take thatinto account before you start.

On Wed, Apr 12, 2017 at 7:43 AM, Merlin Moncure <mmonc...@gmail.com<mailto:mmonc...@gmail.com>> wrote:


    On Tue, Apr 11, 2017 at 12:46 PM, Rj Ewing <ewing...@gmail.com
    <mailto:ewing...@gmail.com>> wrote:
    > I'm looking for thoughts on the best way to handle dynamic schemas.
    >
    > The application I am developing revolves around user defined
    entities. Each
    > entity is a tabular dataset with user defined columns and data
    types.
    > Entities can also be related to each other through Parent-Child
    > relationships. Some entities will be 100% user driven, while
    others (such as
    > an entity representing a photo) will be partially user driven
    (all photo
    > entities will have common fields + custom user additions).
    >
    > I was hoping to get opinions on whether postgresql would be a
    suitable
    > backend. A couple of options I have thought of are:
    >
    > 1. Each entity is represented as a table in psql. The schema
    would be
    > dynamically updated (with limits) when an entity mapping is
    updated. I
    > believe that this would provide the best data constraints and
    allow the best
    > data normalization. A concern I have is that there could be an
    enormous
    > amount of tables generated and the performance impacts this
    might have in
    > the future. I could then run elasticsearch as a denormalized
    cache for
    > efficient querying and full-text-search.
    >
    > 2. Use a nosql database. This provides the "dynamic" schema
    aspect. A
    > concern here is the lack of relation support, thus leading to a more
    > denormalized data structure and the potential for the data to become
    > corrupted.
    >
    > Any opinions on the use of psql for this case, or other options
    would be
    > greatly appreciated!

    Postgres can function as a nosql database -- you can use jsonb for
    example to archive data in such a way that the data model can be
    changed without making schema adjustments.   Another way to do it is
    EAV pattern as noted.  These might be good strategies if you're not
    sure where you want to end up.

    It really comes down to this: how formal do you want your data model
    to be?   Adding formality leads to performance optimizations, exposes
    your data to the fantastic SQL language, and allows rigorous
    assumptions to made made from external dependencies and trusted.
    Formality also brings a degree of inflexibility since your data has to
    be forced into predetermined structures.

    merlin

Re: [GENERAL] dynamic schema modeling and performance

Reply via email to