Hi,

A short summary of my thoughts and a reply.

source:XXX=* describes an object attribute, not a change attribute, and should hence be an object tag.
To benefit from overpass queries based on source:XXX=*, it must be an object tag.
Forcing mappers to split changes so that each part has appropriate source tags and to repeat the same source tags for different versions is a real hassle and is no improvement at all over object tags.

A bulk import completed in a single operation may put a confidential source in the changeset, but...

In the example case of a set of objects being checked and copied by mappers from a BusCo-yyyy-mm.osm file to update OSM, it is necessary that the copied objects contain a copied source tag containing the date yyyy-mm of the new version. Let's call this witness the update marker. By querying the update marker with that date, overpass can build the list of objects that are already mapper-processed and that can be expunged from the BusCo-yyyy-mm.osm remaining-to-do-file.
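
A minimal sketch of that bookkeeping in Python, under stated assumptions: the update marker is written as source=BusCo yyyy-mm, stops are nodes tagged highway=bus_stop, and objects in the to-do file are matched by their ref tag. The function names and the sample refs are hypothetical, and the query is only built as a string here, not sent to any Overpass server.

```python
import xml.etree.ElementTree as ET


def build_marker_query(marker: str) -> str:
    """Return an Overpass QL query selecting bus stops that carry the update marker."""
    return (
        "[out:json];\n"
        f'node["highway"="bus_stop"]["source"="{marker}"];\n'
        "out tags;"
    )


def expunge_done(todo_osm: str, done_refs: set) -> str:
    """Drop from the to-do .osm XML every node whose ref tag is in done_refs."""
    root = ET.fromstring(todo_osm)
    for node in list(root.findall("node")):
        ref = next(
            (t.get("v") for t in node.findall("tag") if t.get("k") == "ref"),
            None,
        )
        if ref in done_refs:
            root.remove(node)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-stop to-do file (refs L001/L002 are made up for illustration).
TODO = """<osm version="0.6">
  <node id="-1" lat="50.6" lon="5.5"><tag k="ref" v="L001"/><tag k="source" v="BusCo 2014-04"/></node>
  <node id="-2" lat="50.7" lon="5.6"><tag k="ref" v="L002"/><tag k="source" v="BusCo 2014-04"/></node>
</osm>"""

query = build_marker_query("BusCo 2014-04")
# Pretend the Overpass result reported that ref L001 already carries the marker in OSM.
remaining = expunge_done(TODO, {"L001"})
```

In real use the set of done refs would come from the Overpass result for the query; only L002 stays in the remaining-to-do file here.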

Replies inline...

On 2014-06-30 18:40, Jo wrote :
Just like everybody else I started several years ago by adding source tags on the objects themselves.

The whole reason why the imports list says source tags belong on the changeset has something to do with an import of millions of buildings in France, each and every one with:

source=cadastre-dgi-fr source : Direction Générale des Impôts - Cadastre. Mise à jour : 2013

To avoid this kind of clutter in the future, I read all the time on that list source tags should go on the changesets and I agree, even if it complicates things a bit.
The number of bus stops in Wallonia is much, much smaller than the number of houses and other features of France.
They're even a tiny fraction of the other source tags that the mappers must use in the same region.
Once again, if the French Cadastre doesn't mind a confidential source mention, the answer is in my previous message: a tagset source is all right for a bulk, robotized, one-shot import (thing done, OSM updated), but not for the piecewise, mapper-made BusCo update, which needs a marker.

Just out of curiosity, the compression ratio of 1000 × that source=... line:  zip: 262, bz2: 1600, xz: 2718 !!!
I've seen DNA like Y1 Y2 Y3 Y2 Y3 Y1 on the French border with ways more recent than their nodes ;-)
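
For anyone wanting to redo the compression measurement quoted above, here is a rough Python sketch. It is an assumption that the test was 1000 newline-terminated copies of the line in UTF-8; the exact figures depend on the tool, its container overhead (a zip archive is not a bare zlib stream) and its settings, so the output need not match the numbers quoted.

```python
import bz2
import lzma
import zlib

# The source line as quoted in this thread, with an assumed trailing newline.
LINE = (
    "source=cadastre-dgi-fr source : Direction Générale des Impôts"
    " - Cadastre. Mise à jour : 2013\n"
)
data = (LINE * 1000).encode("utf-8")

# Ratio = uncompressed size / compressed size, per compressor.
ratios = {
    "zlib": len(data) / len(zlib.compress(data, 9)),
    "bz2": len(data) / len(bz2.compress(data, 9)),
    "xz": len(data) / len(lzma.compress(data)),
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f}")
```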
Having the source on the objects doesn't work either, anyway.
It does work and it's the right way to achieve what's wanted.
If you insist on putting them on the objects, take the prepared osm file. Select all objects, and add the source tag you like. For a detailed description on how to do so, see my previous message. Apparently things seem too complicated or unwieldy when described in too much detail.
No need to explain to me how to do that, I made a complete osm file myself.
Don't expect other people to do so as well.
That is indeed the worst way.  Or the best way to have missing or unrecognizable source tags.
That is indeed why that tag must be copied from the *.osm source file, as said in my previous message.
OFF topic, sorry...

Since not all the stops will have source tags, another system will be needed to know where there is still work to be done.
???
That is indeed why that tag must be copied from the *.osm source file, as said in my previous message.
Should some mapper somehow fail to copy the update marker, the bus stop will remain in to-do state in the osm file and the problem will be spotted that way.
This is not very complicated. Every bus/tram stop has a ref. Compare the refs, compare the other tags. If not present, the object is still new. If the other tags differ, the object needs updating, either upstream or downstream.
Once again, what I said is not very complicated indeed.
If some OSM tag and the corresponding *.osm tag differ, we cannot determine if it is because the tag was not copied or because the *.osm data is incorrect and the mapper made a correction of it.
Hence, the update marker is necessary in the object tags to witness that the update was processed.
That doesn't prevent checking mandatory equality of, for example, the "ref", but data like the name of the stop is subject to user correction.
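
To make the distinction concrete, a small Python sketch of how one stop could be classified once upstream and OSM objects have been paired by ref. The function name, the state labels and the sample tags are hypothetical; the only rule taken from the discussion is that a marker on the OSM object witnesses the update even when a mapper has corrected other tags.

```python
from typing import Optional


def classify_stop(upstream: dict, osm: Optional[dict], marker: str) -> str:
    """Classify one stop from its upstream (BusCo) tags and the OSM tags with the same ref."""
    if osm is None:
        return "new"  # no OSM object with this ref yet
    if osm.get("source") == marker:
        return "processed"  # marker present: a mapper handled it, corrections included
    differing = {
        key for key in upstream
        if key != "source" and osm.get(key) != upstream[key]
    }
    if differing:
        # Without the marker we cannot tell a correction from a missed copy.
        return "to-do (tags differ)"
    return "to-do (marker missing)"


MARKER = "BusCo 2014-04"
upstream = {"ref": "L001", "name": "Gare", "source": MARKER}  # made-up sample stop

state_new = classify_stop(upstream, None, MARKER)
state_corrected = classify_stop(
    upstream, {"ref": "L001", "name": "Gare (quai 2)", "source": MARKER}, MARKER
)
state_unmarked = classify_stop(upstream, {"ref": "L001", "name": "Gare"}, MARKER)
state_differs = classify_stop(upstream, {"ref": "L001", "name": "Station"}, MARKER)
```

The second case is the one a pure tag comparison cannot settle: the name differs from upstream, but the marker shows that the difference is a deliberate correction.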

If all the new incoming tags match the OSM tags, and moreover if they match the previous update, one can sensibly say that the incoming update may be expunged and only the marker automatically updated.
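
That rule can be sketched as follows in Python, assuming tags are held as plain dicts and that the marker lives in the source key; the function names and sample data are hypothetical.

```python
def strip_source(tags: dict) -> dict:
    """Return the tags without the source key, so markers do not affect comparison."""
    return {k: v for k, v in tags.items() if k != "source"}


def refresh_marker_if_unchanged(incoming, osm, previous, new_marker):
    """Return OSM tags with the marker refreshed if the update changes nothing, else None."""
    data = strip_source(incoming)
    matches_osm = all(osm.get(k) == v for k, v in data.items())
    matches_previous = data == strip_source(previous)
    if matches_osm and matches_previous:
        return {**osm, "source": new_marker}
    return None  # a mapper has to look at this one


previous = {"ref": "L001", "name": "Gare", "source": "BusCo 2014-04"}
incoming = {"ref": "L001", "name": "Gare", "source": "BusCo 2014-07"}
osm = {"ref": "L001", "name": "Gare", "shelter": "yes", "source": "BusCo 2014-04"}

auto = refresh_marker_if_unchanged(incoming, osm, previous, "BusCo 2014-07")
manual = refresh_marker_if_unchanged(
    {**incoming, "name": "Gare Sud"}, osm, previous, "BusCo 2014-07"
)
```

Tags that exist only in OSM (shelter=yes above) are left untouched; only the marker changes.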
I do have a system in place which does this. The whole import process is still in a setup phase, which is why you didn't find information about it on that French page yet. Anyway, once the upstream data and the data from an Overpass query are in a PostGIS DB, it's not very hard to analyse them to one's heart's content and create reports for the wiki with clickable links so they can be opened with JOSM or Merkaartor by remote control.
If I continue to enjoy making quality contributions, I'll be eager to see that. I congratulate you in advance.
But remember the source=BusCo yyyy-mm tag, else your system will hit you in the back.

Best regards,

André.


Jo


2014-06-30 15:44 GMT+02:00 André Pirard <a.pirard.pa...@gmail.com>:
Hi,

Let us notice that, unlike my message, your replies do not comment on the reasons for the location of source tags.
I say:
  • the wiki instructions say to put sources in the objects
  • it needs fumbling to see changeset sources (is it fair to the author?)
  • only source in objects can be used in overpass queries
  • it's a user hassle having to split his changes so that the objects correspond to source
  • seen otherwise, it's attaching the source to the object or to a mixed bag container
Starting from the wiki sentence I commented, the reason for changeset sources is that other persons do it.
This is, of course, not a valid reason unless one says why they do it and whether it is appropriate.
I seem to read that those persons live in the imports list.

It is conceivable that a bulk import uses changeset sources if their visibility and queryability are unimportant.
Bulk imports are one shot operations that do not involve general mappers and are a thing done afterwards.

Isolated changes and operations like BusCo do involve the general mappers.
The BusCo operation is not a one shot import but is providing a BusCo.osm file that contains the data for general mappers to copy, correct and update OSM with manually. Work already done is, now and in future updates, expunged from BusCo.osm and the obvious way to do that is to put a source=BusCo yyyy-mm in its objects. overpass will select objects having found their way to OSM and they will be deleted by ID from BusCo.osm.
Regarding a DB size argument, the overhead of multiple split changes makes the changeset sources more space consuming.

Other comments inline...

On 2014-06-26 18:17, Jo wrote :
I've been reading import proposals on the imports list for a while now and the recommendation I keep seeing there is to add source tags on the changesets, which is what I started doing several months ago. So now that I'm preparing the osm file for BusCo, I'd prefer to simply add the instruction to the importers to add source on the changeset upon uploading, instead of adding it to each and every one of 30000 objects, and that's only for half of a small country. On the northern side there are another 400000.

Of course, if the person performing the import wants to add source to all the objects they add they can simply do  Ctrl-a to select all objects, add source=whatever and save the file after downloading it.
That amounts to suggesting that every user do it a different way, which makes sure that an overpass query will not be able to use that data.  Inserting source=BusCo 2014-4 in the BusCo.osm is exactly the opposite.

What I'd suggest to do to make it clear to which object a source tag from the changeset belongs is, transfer the  stops from the calculated layer to the working layer, give them all a nudge to where they belong and change the surrounding objects according to the aerial imagery. Then when done, do:

Ctrl-f
modified highway=bus_stop

then Upload selection, with source=TEC, Bing2011

Then perform a general upload for all the rest with source=Bing2011
Those who have understood, please raise a hand.
Alternatively:
- put source=BusCo 2014-4 in the object the users copy from the BusCo.osm file (or its tags if the object already exists)
- overpass query "source=BusCo 2014-4 ...", get their ref ID and use it to expunge done work from BusCo.osm
My preference is to not add a date to TEC, it will always be the latest version that was available when the upload was performed anyway.
Your preferences are not the only ones.

There are other ways to check whether a stop needs to be updated (comparison with current data downloaded with Overpass API) This procedure is already in place, with output going to a wiki page, with links that can be clicked in a convenient way to edit with JOSM remote control.
If the BusCo data needs corrections, the OSM data could be different and right and the update could be undetected.
By definition, the source=BusCo 2014-4 method is reliable.
This cooperator is perfectly astounded to read the last phrase here for the first time and to find no mention of it here.




Jo


2014-06-26 14:21 GMT+02:00 Dan S <danstowell+...@gmail.com>:
2014-06-26 12:44 GMT+01:00 André Pirard <a.pirard.pa...@gmail.com>:

Hi,  I wonder if this phrase without an explanation link contains appropriate instructions (or just press news):
Since the introduction of changesets these tags are often added as changeset tags rather than in the features themselves.
It sounds like ("rather than") source tags in objects must now be replaced by source tags in changesets.

Hi Andre,

The sentence says changeset tags are "often" used in preference, and in your restatement you have converted "often" to "must now be replaced by". That is a massive difference, and I feel you've misread. I think the sentence in the wiki strikes the correct balance.
I have no understanding problem.  But that phrase has a meaning problem.
Normally, these pages contain instructions on how to tag and not a chronicle of the taggers' doings.
So, according to the above, it might say that, if of very limited importance, source tags can be put on the changeset if it's a bulk, one-shot import. It should not lead one to believe that this is the general case.  A link to "more information" is always welcome as you can see.
Please correct it.
BTW, I don't find JOSM>File>Upload appropriate: for changes that are not bulk, it asks without any explanation for an apparently single source when there may be a dozen objects with possibly several sources each.

While doing so may be appropriate for huge bulk imports, I don't think it's always, or even generally, the case.

I agree.
Try to convince Jo.


 
Suppose an osm file built from version 2014_04 of BusCo bus stops data.
The OSM contributors are invited to copy each object to OSM and to check the data, esp. coordinates.
Should:
  • this file's objects contain source=BusCo 2014-04 (ISO date)
  • or the contributor be requested to add that tag to the changesets for each and every update

In the first case, the tagging will be done without mistakes and the source will be very apparent on the main OSM Web map not only for the reader to see but also for overpass to filter which data belongs to BusCo and even which is not yet at the latest update.

In the mistake-prone second case, the mapper will be asked to split his work into separate updates for BusCo and for the other necessary updates that he will inevitably meet in the process, and the net result of that hassle will be a misplaced source tag with regard to visibility and overpass.

Which is the best method? Or is there another one?

I personally would say that your changeset source tags should only list the sources that have been used to make the changes you have made. In other words, your option 2 shouldn't be recommended. In the case you give, I would recommend leaving object source tags as they are, and adding changeset tags listing any extra sources that the contributor used for their changes. I know this feels odd because the "total" source of the OSM data ends up split between object and changeset, but I think it's an acceptable way to progress, and it definitely remains possible for a machine to calculate the "total sources list".
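
As an illustration of that last point, a Python sketch of computing a "total sources list" for one object. It assumes the object's tags and the tags of every changeset in its history have already been fetched into dicts, and that multiple sources in one value are separated by ";" or "," (both appear in practice, e.g. source=TEC, Bing2011). The function names are hypothetical, and source:XXX=* keys are ignored for brevity.

```python
import re


def split_sources(value: str) -> set:
    """Split a source=* value on ';' or ',' and trim whitespace."""
    return {part.strip() for part in re.split(r"[;,]", value) if part.strip()}


def total_sources(object_tags: dict, changeset_tags_per_version: list) -> set:
    """Union of the object's own source tag and the source tags of its changesets."""
    sources = split_sources(object_tags.get("source", ""))
    for changeset_tags in changeset_tags_per_version:
        sources |= split_sources(changeset_tags.get("source", ""))
    return sources


stop_tags = {"highway": "bus_stop", "source": "BusCo 2014-04"}
history = [{"source": "TEC, Bing2011"}, {"comment": "nudge"}, {"source": "Bing2011"}]

all_sources = total_sources(stop_tags, history)
```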

I think that changeset source tagging is only appropriate to mechanical imports and that the above phrase should say so or link to some reading that does.

I disagree. When I do edits using a single source, it makes a lot of sense to put the source tag on the changeset. When I do edits using multiple sources, it makes a lot of sense to put the source tags on the objects.

 
It seems strange to have to split updates one per object so that the correct source tags are present on each when they could equivalently and more appropriately be on the object itself.
Typical, compared to the variety of object source tags format, is this scarce instruction in changeset:
  • source=* – specify the source for a group of edits
Typically, "source for" does not say "source of" what.  Of the objects or of the edits as a whole import?

Good spot. So the text needs improving. I've edited the sentence to try and improve it. Obviously I've edited it using my own understanding of the consensus idea of the tag, so if I'm wrong let's just keep improving it :)

Dan


André.


_______________________________________________
Tagging mailing list
Tagging@openstreetmap.org
https://lists.openstreetmap.org/listinfo/tagging
