Pardon the delay, been putting this one off since it's going to be a fun one to address, and will be a bit long :)
On Thu, Aug 25, 2005 at 12:34:00PM +0200, Paul de Vrieze wrote:
> What I mean is compatibility with current portage versions. Current
> versions do not understand EAPI. There would be a good chance that they
> could choke on packages with all kinds of new features, even in the sync
> phase. A different extension would ensure that those portage versions
> would still work (crippled) on a new tree. Of course such an extension
> change should only be done once. Once the API versions are available this
> is not an issue.

General portage stance towards EAPI is: unset EAPI == 0 (current stable
ebuild format); if EAPI > portage's internal EAPI, the ebuild can't be
merged, which should be detectable during buildplan.

Current portage doesn't know about EAPI; boned in that respect I'll admit,
but that's the case for all new features rolled out. Options for dealing
with this:
1) Usual method: deploy support, N months later use support.
2) Tweak stable so it's aware and can complain. Still requires people to
   upgrade, just makes it so that they're not forced into upgrading to 3.x;
   this is mainly a benefit for those who may not care to try the first few
   releases of 3.x when it hits (akin to people dodging the first release
   or two of a gcc release).

Worth noting that one rather visible aspect of EAPI=1 is that (assuming the
council votes on it, and yay's it) glep33 *will* result in current eclasses
being effectively abandoned w/in the N months after an EAPI capable portage
is released. Sounds kind of bad, but people will have to upgrade for the
capabilities.

If EAPI was pegged into portage/ebuilds already it wouldn't be an issue;
problems could be detected prior. Unfortunately it's not, and introduction
of it (and use of it) is going to involve a few road bumps. Plus side, once
it's in, portage *will* know if the ebuild is incompatible with the
pythonic/bash ebuild code, and portage/the UI can act accordingly.
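To make the stance above a bit more concrete, here's a rough sketch of the
check in python. Purely illustrative: PORTAGE_EAPI, effective_eapi,
can_merge and unmergeable are names made up for this mail, not anything
that exists in portage, and it assumes EAPI values are plain integers.

```python
# Hypothetical sketch; none of these names exist in portage.
PORTAGE_EAPI = 0  # assumed: highest EAPI this portage version understands


def effective_eapi(metadata):
    """Unset (or empty) EAPI is treated as 0, the current stable format."""
    value = metadata.get("EAPI")
    if value is None or str(value).strip() == "":
        return 0
    return int(value)


def can_merge(metadata, supported=PORTAGE_EAPI):
    """An ebuild is mergeable only if its EAPI is <= what portage supports."""
    return effective_eapi(metadata) <= supported


def unmergeable(buildplan, supported=PORTAGE_EAPI):
    """Return the cpvs in a buildplan (cpv -> metadata) needing a newer portage."""
    return [cpv for cpv, md in buildplan.items() if not can_merge(md, supported)]
```

The point being that the comparison only needs the metadata, so the
complaint can be raised while the buildplan is put together rather than
halfway through a merge.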
Meanwhile, the changes being pushed into EAPI are the addition of a
configure phase (broken out from compile), elib addition, and eclass2
support (same beast, different rules due to env save/restoration).

The potential for horkage on sync'ing isn't there, since that's purely
python side; ebuild*sh doesn't play into it.

Re: regen, the issue isn't really there either; if you try to merge an
eapi=0 ebuild on a non eapi aware portage, it works, same as it did before.
If you try to merge an eapi=1 ebuild you hit either an issue with inherit,
or a bail immediately in src_compile, due to the fact eapi=1 ebuilds will
separate configure out from compile (eapi=0 portage won't know to call it;
no configure == failed compile).

That said, there also quite likely is a change coming down the pipe to the
tree's cache; the change will shift the rsync'd metadata cache over to a
key/val based cache.

Why oh why, yet another cache change? Simple. The change moves away from a
list based format to key:value pairs; in short it's a change that, once
made, means keys can be added to the cache from that point on without
causing cache complaints on sync'ing. Last cache breakage, I swear :P

EAPI addition being the next key tagged in; stable (not surprising) needs
to be released with a version capable of reading both old and new format;
once that's done, time for the usual "yo, upgrade people, something's
coming down the line".

The same version that supports old and new cache format can also include
rudimentary eapi awareness. At least that's what I'm thinking. It's roughly
in line with the previous forced cache breakages, just in this case slipping
in some extra support in the process. Notices obviously would go out prior
to moving on this also, along with a good chunk of waiting.

> > > ps. I would also suggest requiring that EAPI can be retrieved by a
> > > simple line by line parsing without using bash.
> > > (This allows for changing the parsing system)
> >
> > No, that yanks EAPI setting away from eclasses.
>
> If the eclasses follow similar rules that would be easilly parseable.
> (taking inherit ...) into account is easy as long as the inherit line is
> on one line of it's own. (unconditionally) These rules that would
> allready be followed out of style reasons would make various kinds of
> parsers able to parse them.

While it's insane, people *can* use indirection (eg inherit $var) for
inherits as long as it's deterministic: always the same inherit call for
that ebuild's data. Don't see a good reason to ixnay that, which means we'd
have to parse the whole enchilada, eclasses and the ebuild.

Effectively, raiding a single var out wouldn't fly; eclasses could override
an ebuild's eapi setting for example, just like any other metadata key
(imo).

A *true* format change, moving away from bash for example or moving to an
executing design of ebuilds, would require an extension change; such a
change must imo anyways, since it's not a change of the ebuild env's
template/hooks; it's a fundamentally different model for ebuilds, either
via no longer being bash based, or moving away from our declarative design
of ebuilds.

> > Only time this would be required is if we move away from bash; if that
> > occurs, then I'd think a new extension would be required.

<inserting a comment>
contradicting myself via above, above is correct
</comment>

> It would allow to for example restrict the ebuild format such that initial
> parsing is not done by bash (but the files are still parseable by bash).
> If we perform changes I think it should be done right in the first place.

Elaborate please

> > As is, shifting the 'template' loaded for an ebuild can be done in
> > ebd's init_environ easy enough, so no reason to add the extra
> > restrictions/changes.
>
> One of the issues of ebuilds is the cache/metadata stuff. Parsing an
> ebuild for basic information takes a lot of time.
> This can be done lots faster with a less featured parser (I've written
> one some day) that accepts 98% of all current ebuilds, just doesn't like
> dynamic features in the toplevel. Such a parser could be a python plugin
> and as such easy to use from python. However to ensure compatibility with
> a faster parser the EAPI variable should be there in a way that is a
> little more strict than the other variables. And such a restriction is in
> practice not a restriction.

Any parser that doesn't support full bash syntax isn't acceptable from
where I sit; re: slow down, 2.1 is around 33% faster sourcing the whole
tree (some cases 60% faster, some 5%, etc). The speedups are also what
allow templates to be swapped, the eapi concept.

I'd note limiting the bash capabilities is a restriction that transcends
anything EAPI should supply; changes to what's possible in the language (a
subset of bash syntax as you're suggesting) are a separate format from
where I draw the line in the sand.

Mainly, limiting the syntax has the undesired effect of deviating from what
users/devs know already; mistakes *will* occur. QA tools can be written,
but people are fallible, both in writing a QA tool and in abiding by the
syntax subset allowed.

> The restriction I propose would be:
> - If EAPI is defined in the ebuild it should be unconditional, on it's own
>   line in the toplevel of the ebuild before any functions are defined.
>   (preferably the first element after the comments and whitespace)
>
> - If EAPI is not defined in the ebuild, but in an eclass, the inherit
>   chain should be unconditional and direct. Further more in the eclass the
>   above rules should be followed.
>
> Please note that many of the conditions are allready true for current
> ebuilds, just portage can "handle" more.

The inherit chain must be unconditional anyways. Re: eapi placement, I
would view that as somewhat arbitrary; the question is what gain it would
give.
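For reference, my reading of the proposed restriction is something along
these lines. This is a guess at the idea, not Paul's parser; find_eapi and
both regexes are invented for illustration, it only looks at the ebuild
itself (no chasing of eclasses), and it treats "no leading whitespace" as a
stand-in for "unconditional, in the toplevel".

```python
import re

# Hypothetical: a plain top-level assignment such as EAPI=1 or EAPI="1",
# optionally followed by a comment. Indented lines deliberately don't match.
_EAPI_RE = re.compile(r'^EAPI=(["\']?)([A-Za-z0-9._+-]*)\1\s*(?:#.*)?$')
# Hypothetical: start of a bash function definition, e.g. "src_compile() {".
# (The "function name {" form without parens is not handled here.)
_FUNC_RE = re.compile(r'^(?:function\s+)?[A-Za-z_][A-Za-z0-9_]*\s*\(\)')


def find_eapi(lines):
    """Line by line scan, no bash involved.

    Returns the EAPI string from the first plain top-level assignment found
    before any function definition, or None if there isn't one.
    """
    for line in lines:
        if _FUNC_RE.match(line):
            break  # the proposal requires EAPI before any functions
        match = _EAPI_RE.match(line)
        if match:
            return match.group(2)
    return None
```

Anything dynamic (EAPI=$foo, a conditional assignment, or an eclass setting
it) isn't seen by a scan like this, which is the crux of the objection
above.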
I'd wonder about the parsing speed of your parser; the difference between
parsing ebuilds and running from cache metadata is several orders of
magnitude, and with the current cache backend (flat_list) and portage
design properly corrected the gap ought to widen too.

General cache lookup is slow due to:
A) bad call patterns, allowed by the api; N calls to get different bits of
   metadata from a cpv, resulting in potentially N sets of disk ops.
B) default cache requires opening/closing a file per cpv lookup; syscalls
   are killer here.
C) every metadata lookup incurs 2 stats, ebuild and cache file.

Getting to the point: cache is 100x to 400x faster than sourcing for
<=2.0.51. Haven't tested it under 2.1; should be different due to cache
and regen fixups/rewrites.

Back to the point: essentially, EAPI matters in two places.
1) metadata transfer from the ebuild env into python side during depends
   phase; it has to know what to transfer key wise.
2) actual ebuild build phase executions; if it isn't the depends phase,
   eapi is required so that the parser can drop in the appropriate ebuild
   env template.

The restrictions suggested for EAPI would only make sense if eyeing #1, an
alternative parser; no reason to drop the cache unless the parser is
capable of hitting the same runtime performance the cache can hit (frankly,
it's not possible from where I'm sitting, although the gap can be
narrowed).

So... the EAPI limitations: not much for them, due to the conclusion above.
Interested in the parser however, since ebd is effectively a pipe hack so
that pythonic portage can control ebuild.sh. I (and others) have been after
a bashlib for a while, just no one has crunched down and done it (easier
said than done I suspect).

My 2 cents at least.
~harring