On 03/23/2014 17:51, Michał Górny wrote:
> On 2014-03-23, at 17:40:20,
> Joshua Kinard <ku...@gentoo.org> wrote:
> 
>> On 03/23/2014 17:05, Michał Górny wrote:
>>> On 2014-03-23, at 16:27:43,
>>> Joshua Kinard <ku...@gentoo.org> wrote:
>>>
>>>> On 03/23/2014 15:44, Michał Górny wrote:
>>>>> Tags, on the other hand, are more 'live'. They place the package
>>>>> somewhere in the 'global' tag hierarchy that can change over time.
>>>>> I expect that people other than maintainers will be adding tags to
>>>>> packages (and changing them), and that people will invent new tags
>>>>> and apply them to more packages.
>>>>>
>>>>> So, first of all, your solution would mean that every commit adding
>>>>> a new tag or changing one of the tags would modify the package
>>>>> metadata.xml. This means a Manifest update and a ChangeLog entry (please
>>>>> don't get into more rules for ChangeLogs now), and this means it will be
>>>>> harder to find actually useful entries there.
>>>>>
>>>>> So we make tag updates harder, and increase time and size of rsync.
>>>>
>>>> Instead of individual <tag> lines in metadata.xml for each tag, why not a
>>>> single <tags> line that contains a comma-delimited list of up to five tags,
>>>> whitespace optional?  That should help reduce the "fluff" of the tree by
>>>> adding this feature.
>>>>
>>>> E.g.,
>>>>
>>>> <tags>one,two,three,four,five</tags>
>>>
>>> Either use XML, or don't use XML. Don't make this some kind of ugly
>>> mixture of XML with non-XML.
>>>
>>> So:
>>>
>>>   <tags>
>>>     <tag>one</tag>
>>>     <tag>two</tag>
>>>   </tags>
>>>
>>> if we're really going for this. But I guess our DTD doesn't allow easy
>>> definition of single <tags/> with no forced position.
>>
>> TBH, I don't like the use of XML at all.  Never have and never will.  I am a
>> big fan of INI-style definitions (i.e., like Samba's config).  XML just
>> leads to a lot of unneeded fluff in what should be a really small file,
>> which is why I was proposing a single <tags> element instead of multiple
>> <tag> elements.
> 
> metadata.xml is XML at the moment, so you are supposed to obey its
> rules, whether you like them or not. If you want to replace it with
> something else, feel free to try. But don't make a shitsoup mixin out
> of it.

I'm not proposing to change it now...bit too late for that.  But if I ever
come across a TARDIS on eBay, well...

That said, is XML so strict that every single value has to be wrapped in its
own element?  Is a comma-separated list of values inside a single element
prohibited by the spec?  I don't use XML often (if at all), so I am not
familiar with its intricacies.
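
To make the question concrete, here is a rough Python sketch (standard
library only) that reads both layouts.  As far as I can tell, both are
well-formed XML; the difference is who splits the list, the XML parser or
our own code.  Whether metadata.dtd would accept either is a separate
question.  The <tags>/<tag> element names are just the ones floated in this
thread, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Both snippets use the element names proposed in this thread; neither is
# part of the current metadata.xml format.
COMMA_FORM = "<tags>one, two,three</tags>"
NESTED_FORM = "<tags><tag>one</tag><tag>two</tag><tag>three</tag></tags>"


def tags_from_comma_form(xml_text):
    """Split the text of a single <tags> element on commas."""
    root = ET.fromstring(xml_text)
    text = root.text or ""
    # The XML parser only hands back one string; the splitting is our job.
    return [part.strip() for part in text.split(",") if part.strip()]


def tags_from_nested_form(xml_text):
    """Collect the text of each <tag> child; the parser does the splitting."""
    root = ET.fromstring(xml_text)
    return [(tag.text or "").strip() for tag in root.findall("tag")]


print(tags_from_comma_form(COMMA_FORM))
print(tags_from_nested_form(NESTED_FORM))
```

Both calls yield the same three tags.  The comma form is shorter on disk,
but every consumer has to agree on the delimiter and whitespace rules
outside of XML itself.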


>>>>> Secondly, since tags for every package will be held in different files,
>>>>> people will need dedicated tools to collect tags from all those files
>>>>> and add matching tags to their own packages. Long story short, we're
>>>>> going to have many 'duplicate' tags that will require even more commits
>>>>> with ChangeLog entries and Manifest updates.
>>>>
>>>> If we automate the generation of a master tag index file, like
>>>> use.local.desc, this can be avoided.  emerge can simply go rummage through
>>>> the master index for matching tag entries instead of going through the
>>>> entire tree.  Because if we wanted to sift through the entire tree, grep
>>>> would be a far better method (compiled C and probably better text-matching
>>>> algorithms than emerge).
>>>
>>> And this goes pretty much backwards to what we were aiming at. We
>>> should finally kill use.local.desc, not get inspired by the redundancy.
>>
>> And what replaces it?  What differentiates a global USE flag that has
>> purpose across multiple packages (like 'ipv6') against a flag that only
>> exists for a single package?
> 
> Applications are supposed to read metadata.xml for local flags. That's
> all about it. Having an extra index file doesn't really make sense
> there.

But they don't currently, do they?  As far as I know, most everything parses
the use.local.desc file.  Wouldn't having portage apps read/parse every
package's metadata.xml file introduce a lot of disk I/O to seek out those
files across the entire tree?  That would seem like a bigger step backwards
if so.
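
A minimal sketch of the index idea, in Python, to show where the disk I/O
would land: the tree is walked once when the index is generated, and
consumers then read a single file.  Everything here is an assumption for
illustration: the <tags>/<tag> layout is the one proposed in this thread,
the 'category/package: tag tag ...' index format and the tags.index name
are invented, and the two sample packages exist only to exercise the code.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def collect_tags(tree_root):
    """Walk <tree_root>/<category>/<package>/metadata.xml and gather tags.

    Assumes the hypothetical <tags><tag>...</tag></tags> layout from this
    thread; packages without any <tag> elements are skipped.
    """
    index = {}
    for category in sorted(os.listdir(tree_root)):
        cat_dir = os.path.join(tree_root, category)
        if not os.path.isdir(cat_dir):
            continue
        for package in sorted(os.listdir(cat_dir)):
            meta = os.path.join(cat_dir, package, "metadata.xml")
            if not os.path.isfile(meta):
                continue
            root = ET.parse(meta).getroot()
            tags = [t.text.strip() for t in root.iter("tag")
                    if t.text and t.text.strip()]
            if tags:
                index[f"{category}/{package}"] = tags
    return index


def write_index(index, path):
    """Write one 'category/package: tag tag ...' line per package."""
    with open(path, "w", encoding="utf-8") as handle:
        for atom, tags in sorted(index.items()):
            handle.write(f"{atom}: {' '.join(tags)}\n")


def demo():
    """Build a two-package throwaway tree and return the index text."""
    with tempfile.TemporaryDirectory() as tree:
        samples = {
            "kde-base/kmail": ["mail", "client", "kde"],
            "mail-client/evolution": ["mail", "client", "gnome"],
        }
        for atom, tags in samples.items():
            pkg_dir = os.path.join(tree, atom)
            os.makedirs(pkg_dir)
            body = "".join(f"<tag>{t}</tag>" for t in tags)
            meta = os.path.join(pkg_dir, "metadata.xml")
            with open(meta, "w", encoding="utf-8") as handle:
                handle.write(f"<pkgmetadata><tags>{body}</tags></pkgmetadata>")
        index_path = os.path.join(tree, "tags.index")
        write_index(collect_tags(tree), index_path)
        with open(index_path, encoding="utf-8") as handle:
            return handle.read()


print(demo(), end="")
```

The trade-off is the one already raised above: the index duplicates what is
in metadata.xml, so it has to be regenerated whenever tags change.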


>>>>> Worse than that, your GLEP doesn't even have any basic rules for naming
>>>>> tags -- like what language form to use and, say, which character to use
>>>>> instead of space. This sounds like the sort of thing that's going to
>>>>> make it even harder to get some consistency, especially if some
>>>>> developers are going to follow someone else committing earlier and some
>>>>> will follow their own rules.
>>>>
>>>> Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no
>>>> spaces.  A lot of problems are avoided if we keep tags to one-word
>>>> descriptors only.  E.g., for mail clients, they would carry both 'mail' and
>>>> 'client' as two of their five tags.  For kmail, a third tag would be 'kde'
>>>> and Evolution would have 'gnome' instead.
>>>
>>> I'm pretty sure you will finally hit something that goes with two
>>> words. Protocol name or something.
>>
>> Perhaps, but we can fight that battle when we get there.  Starting off with
>> one-word tags keeps things simple for now and that'll make it easier to
>> determine whether this experiment actually pans out or not.
> 
> If you introduce arbitrary limitations, people will either find a way
> around them (which means getting even worse mess) or omit some tags.
> Either way, tags become less helpful.

Everything trends towards greater entropy, whether we like it or not.
Portage started with the basic idea of Ports, but it's grown way beyond that
over the years.  USE flags were supposed to be simple switches for
controlling compile-time functionality, emerge used to be the only package
manager, and Gentoo used to only support the Linux kernel and sysvinit scripts.

Whatever implementation of tags is adopted, if any, will eventually grow
beyond its original design parameters.  If tags are not adopted, something
else will probably get proposed and adopted down the road that will outgrow
its design parameters.  The question is, are tags the best we can do *now*,
or do we wait for some better idea to appear down the road and then go with
that instead?


>>>> I'd also suggest that 'all' be considered a default, global tag for all
>>>> packages, it be a reserved tag internal to emerge and other package
>>>> managers, and not count against the number of allowed tags (meaning that
>>>> technically, a package is allowed five tags + 'all').
>>>>
>>>> As for default tags when a package does not define any, the package
>>>> category gets split at the hyphen and becomes two independent tags.  This
>>>> is overridden when at least one tag is defined in metadata.xml.
>>>
>>> Will this have a real benefit? Sounds like unnecessary confusion for
>>> a minor gain to me.
>>
>> Which?  The internal 'all' tag or the use of existing category names as a
>> default set of tags for packages that don't have any tags defined?
> 
> The 'all' tag sounds like something that would have no value.

Okay, let's ignore that then.  I'm just brainstorming -- not every idea has
worth or merit.


> The automagic tags sound like a way to confuse people -- yesterday it
> had this tag, now I wanted to add a new one and the old tag
> disappeared! Not to mention sometimes the categories don't give really
> useful tags. Tags are not replacing categories, so no point in trying
> to bind the two together.

I am not suggesting that tags replace categories.  Categories were the
original way to group packages (again, deriving from how Ports does it), so
when no tags are defined for a package, they offer a somewhat-suitable
fill-in.  That's not binding the two in any direct way; it's just offering a
default/fallback set of tags until a package maintainer updates metadata.xml
to add actual tag definitions.

Sample Python pseudocode:

# No tags defined in metadata.xml: fall back to the category, split at '-'.
if not package.tags:
    package.tags = package.category.split('-')
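
The same fallback as a self-contained sketch that actually runs.  The
Package class and effective_tags() are hypothetical stand-ins, not anything
Portage provides; the only logic taken from the pseudocode above is the
split on the hyphen.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical stand-in for whatever object the package manager uses."""
    category: str
    name: str
    tags: list = field(default_factory=list)


def effective_tags(package):
    """Return the explicit tags if any, else the category split at hyphens."""
    if package.tags:
        return list(package.tags)
    return package.category.split("-")


# No tags defined: the category supplies the fallback.
print(effective_tags(Package("mail-client", "evolution")))
# Tags defined: the fallback is ignored entirely.
print(effective_tags(Package("mail-client", "evolution",
                             ["mail", "client", "gnome"])))
# A category without a hyphen yields a single fallback tag.
print(effective_tags(Package("virtual", "jdk")))
```

Computing the fallback on lookup, rather than storing it, means nothing
extra is written to metadata.xml.  It also means the category-derived tags
vanish as soon as the first real tag is added, which is the behaviour
objected to above.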

If you have a better idea, I am definitely all ears.

-- 
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.  And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic
