Re: [Tagging] Is tagging of fuel: assumed to be exhaustive?

Matija Nalis Wed, 19 Apr 2023 18:35:37 -0700

On Thu, 20 Apr 2023 00:47:21 +0200, Marc_marc <marc_m...@mailo.com> wrote:
> Le 19.04.23 à 14:19, Matija Nalis a écrit :
>> I think that my point remains that:
>> - one method is clear and unambiguous ("fuel:lpg=no")
>> - one method is not clear / is ambiguous ("fuel=octane_98;diesel").
>> 
>> So the first one should be preferred. Does that make sense?
>
> - one is a nightmare for datause
> - one isn't a nightmare for datause
> So the 2nd one should be preferred. Does that make sense?


Hmm, no, not at all (assuming your sentences are in the same order as mine
in the quote at the top).

I'll assume that by "datause" you mean something like a computer storing
data in some kind of database for the purpose of retrieving / searching /
updating those values and operating on them (i.e. using them)?
(Or did you have some fundamentally different definition in mind?)

Databases which I am familiar with (mostly relational SQL-based ones and
other databases which should be most relevant here for OSM uses, but also
others like non-relational key-value datastores) are MUCH happier with the
first ("fuel:lpg=no") than with the second ("fuel=octane_98;diesel") case.


- in the "fuel:lpg=no" case it is VERY efficient and fast (using a const
  lookup on a composite index [key,value]) to look for e.g.

  SELECT * FROM t WHERE key = 'fuel:lpg' AND value = 'yes';

- however, to do that lookup for the second example
  ("fuel=octane_98;diesel"), you would have to use:

  SELECT * FROM t WHERE key = 'fuel' AND value LIKE '%lpg%';
  
  which is much more inefficient (it would need to scan and pattern-match
  every candidate value instead of doing an index lookup on it, because of
  the leading wildcard in that '%lpg%').
  
  Also, it would need either a post-processing filter outside of the database
  server (or, alternatively, an even more expensive multi-condition or even
  regex match!) to get rid of false positive matches like (hypothetical)
  "fuel=diesel;motolpgane", e.g. you'd actually need several extra
  conditions (including one for a value that is exactly "lpg"):

  SELECT * FROM t WHERE key = 'fuel' AND (value = 'lpg' OR value LIKE 'lpg;%'
    OR value LIKE '%;lpg;%' OR value LIKE '%;lpg');
  
  Not to mention that if you wanted to look for more than one fuel type,
  you'd need even more kludges (like postprocessing outside of the
  database server), instead of a simple `WHERE key LIKE 'fuel:%'` in the
  first case (which would still be able to use your composite index!)

  Relational database design theory explains why storing multiple values in
  one column (as in the "fuel=octane_98;diesel" case) is a very bad idea
  (e.g. that way of tagging breaks 1NF, see for example
  https://en.wikipedia.org/wiki/Database_normalization#Satisfying_1NF).
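
To make the comparison above concrete, here is a minimal sketch using
Python's built-in sqlite3 module. The table "t" with columns key and value
follows the queries above; the "id" column, the index name and the sample
rows are my own assumptions for illustration, not real OSM data or any real
OSM schema. It only shows which rows each query matches, not how fast each
one runs.

```python
import sqlite3

# In-memory table shaped like the "t" used in the queries above.
# The id column and the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, key TEXT, value TEXT)")
conn.execute("CREATE INDEX t_key_value ON t (key, value)")

rows = [
    (1, "fuel:lpg", "yes"),            # one tag per fuel type
    (1, "fuel:diesel", "yes"),
    (2, "fuel:lpg", "no"),             # explicit "not sold here"
    (2, "fuel:diesel", "yes"),
    (3, "fuel", "octane_98;lpg"),      # semicolon list that includes lpg
    (4, "fuel", "diesel;motolpgane"),  # hypothetical false positive
    (5, "fuel", "lpg"),                # semicolon scheme, single value
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)


def ids(sql):
    """Run a query and return the matching ids, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# First scheme: plain equality on both columns.
per_fuel = ids("SELECT id FROM t WHERE key = 'fuel:lpg' AND value = 'yes'")

# Second scheme, naive version: also matches "motolpgane".
naive = ids("SELECT id FROM t WHERE key = 'fuel' AND value LIKE '%lpg%'")

# Second scheme, with the extra conditions needed to match whole items only.
strict = ids(
    "SELECT id FROM t WHERE key = 'fuel' AND ("
    "value = 'lpg' OR value LIKE 'lpg;%' "
    "OR value LIKE '%;lpg;%' OR value LIKE '%;lpg')"
)

# First scheme again: every fuel offered by station 1, via a prefix match.
offered = sorted(
    row[0]
    for row in conn.execute(
        "SELECT key FROM t WHERE id = 1 AND key LIKE 'fuel:%' AND value = 'yes'"
    )
)

print(per_fuel)  # [1]
print(naive)     # [3, 4, 5]
print(strict)    # [3, 5]
print(offered)   # ['fuel:diesel', 'fuel:lpg']
```

Station 4 only drops out of the results once all four conditions are in
place, while the per-fuel query needs nothing beyond equality.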

So, I'm totally confused why you would think that the "fuel:lpg=no" +
"fuel:diesel=yes" method would be a "nightmare" for datause, or that
"fuel=octane_98;diesel" would be a good idea from a datause perspective?


-- 
Opinions above are GNU-copylefted.


_______________________________________________
Tagging mailing list
Tagging@openstreetmap.org
https://lists.openstreetmap.org/listinfo/tagging
