# from A. Pagaltzis
# on Monday 29 October 2007 15:11:

>Clearly this info should live somewhere and search.cpan should
>use it, but META.yml is the wrong place. It belongs somewhere
>unversioned.

+1, and see-also: "some kind of common API between all of the meta-ish 
foo.perl.org sites".

  http://www.nntp.perl.org/group/perl.module.build/2007/07/msg778.html

I think META.yml can play a part in that, particularly in fostering 
distributed pioneering.  The trouble with ad-hoc is just that it tends 
to *never* get formalized (i.e. it never gets centrally documented, 
never becomes discoverable, never appears in books, etc.)

Of course, the trouble with centralization is that it can resist, 
discourage, or stifle change.  Plus, it is typically subject to 
the "wisdom" and latency of committees.

>The concern is “distance of metadata” I guess – it shouldn’t be
>too onerous for automatic tools working against the FTP, such as
>CPAN.pm, to get at this data, even though it lives outside the
>distribution.

It seems like something more along the lines of "web services plus sync" 
would be better suited to distributed implementation.  For example, 
meta.perl.org could be queried anonymously and edited by the author, 
but auto-filled (or even maybe over-written) by META.yml.
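To make the read/write split concrete, here is a purely hypothetical 
sketch of what such a service's URLs might look like.  The host, paths, 
and author-auth scheme are all invented for illustration; nothing like 
this API actually exists:

```python
# Hypothetical "web services plus sync" sketch: anonymous reads,
# author-only writes, META.yml as an auto-fill source.  The base URL
# and path layout below are made up for illustration.

BASE = "http://meta.perl.org/api"

def query_url(dist):
    """Anyone can read: build the anonymous query URL for a dist."""
    return "%s/dist/%s" % (BASE, dist)

def update_url(dist, author):
    """Only the author can write; real auth would ride along with
    the request rather than live in the URL like this."""
    return "%s/dist/%s?author=%s" % (BASE, dist, author)
```

The point is just that reads stay cheap and anonymous (so CPAN.pm-ish 
tools can hit them), while writes are gated on authorship.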

The nice thing about "just stick META.yml in your distro" is that it can 
be supported by shipped tools (e.g. Module::Build, Module::Starter, 
etc.)  This gives a nice low barrier to entry, and doesn't require as 
much opt-in or active engagement as e.g. editing something in a web 
form.  Also, it comes with the tarball and is therefore not subject to 
network failure, it mirrors well, etc. -- all of those nifty qualities 
have to be traded-off to get external updateable-ness, especially if 
your solution is not built-in to the centralized mirroring scheme (i.e. 
PAUSE.)

Unfortunately, supporting multiple info sources (META.yml, plus a 
web-editable database somewhere and/or additional inputs such as 
cpanforum, etc) probably means attaching a version to the data and 
deciding which overrides which.  Typically, the data source which 
doesn't require the author to know about external interfaces is the 
easiest one to get rolling -- i.e. META.yml.
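The "which overrides which" question reduces to a merge with a 
precedence order.  A minimal sketch (field names and the 
online-beats-META.yml ordering are assumptions, not anything any spec 
says):

```python
# Hypothetical precedence merge across metadata sources.  Later
# sources win on conflicting keys; whether "online" *should* win is
# exactly the open question in the surrounding discussion.

def merge_metadata(*sources):
    """Merge metadata dicts; later sources override earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged

meta_yml = {"name": "Foo-Bar", "version": "1.23",
            "abstract": "does foo"}
online = {"abstract": "does foo (and bar)",
          "bugtracker": "http://example.org/bugs"}

# Here the online edit overrides META.yml's abstract, while fields
# the online record never touched (name, version) survive untouched.
resolved = merge_metadata(meta_yml, online)
```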

What if the tarball is newer than the last modification to the online 
data?  Do fields from META.yml still get overridden by the online data?  
Should the "meta.perl.org" service try to extract/update data from 
META.yml?  (Maybe just upon sign-up/request from the author?)

Perhaps META.yml explicitly delegates a URL as a definitive metadata 
source?  (Meaning (probably) that values for any META.yml fields are 
superseded if they appear in the online query result.)  Provided a 
machine-discoverable web API, multiple implementations could co-exist.
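Such a delegation might look something like the fragment below -- note 
that "x_meta_url" is a made-up field name, not part of any META.yml 
spec, and the URL is an invented example:

```yaml
---
name: Foo-Bar
version: 1.23
abstract: does foo
# Hypothetical field: declares the definitive online metadata source.
# Any field appearing in the query result there would supersede the
# corresponding value in this file.
x_meta_url: http://meta.perl.org/api/dist/Foo-Bar
```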

And there's also the consideration that some data should/could be 
per-author rather than per-dist:

  http://www.nntp.perl.org/group/perl.qa/2007/03/msg8050.html

--Eric
-- 
Don't worry about what anybody else is going to do. The best way to
predict the future is to invent it.
--Alan Kay
---------------------------------------------------
    http://scratchcomputing.com
---------------------------------------------------
