[RFC] CPAN6 requirements analysis

2009-05-28 Thread Alex Elsayed
While lurking in IRC, I've seen several discussions of what CPAN 6 should 
look like. Honestly, wayland76++'s idea for packaging seems the best to me. 
Most of the suggestions so far, especially those based on alien, apt, yum, 
or other existing package managers have a few major problems:

* Alien only converts between a few package formats
* All of these suggestions are _heavily_ biased towards binary distributions
* These suggestions make automatic packaging for new distros extremely 
difficult, because they require major changes to multiple projects

For example, let's take Gentoo or Exherbo. Gentoo has two package managers: 
Portage, which is used by most Gentoo users, and Paludis, which is more 
featureful and advances faster (for various reasons, including more 
developers). Portage supports binary packages through .tbz2 files with 
trailing metadata. Paludis does not support these, as its lead developer has 
deemed them ill-conceived and poorly implemented; Paludis is working on its 
own binary package system instead. Exherbo uses Paludis exclusively. 
Both distros are source-based.

Paludis has a tool which can import arbitrary trees to merge into the 
filesystem, but it's generally preferred that it be used only by users, not 
system packagers.

Portage has no such tool.

One existing problem for Gentoo is supporting installation of Perl 5 
modules. Because all metadata is spread out (and more importantly, /inside/ 
the tarballs), it is impossible to, say, add a new kind of repository for 
Perl modules - you'd have to have the full source for every module on your 
machine. Similarly, the automatic ebuild creator for P5 modules, g-cpan, is 
forced to recursively download packages, unpack them, and read their 
metadata before fetching their dependencies. This poses a significant 
problem for distros which mandate up-front configuration without an 
interactivity requirement (i.e., Exherbo and to a lesser degree Gentoo).

I think that wayland76++'s idea of having a common metadata system with 
filters to convert that metadata into a distribution package is the best 
currently-proposed solution, but I also feel that it needs some additional 
extension.

I personally believe that there are a few requirements for a package format 
that is sufficient for Perl 6:

* It must enable packaging for both binary- and source-based distros
* It must enable automatic generation of packages for supported systems 
(although it may well not be capable of it out of the box)
* It must permit (or preferably help with) attempts to support new systems
* It must be simple to submit packages in the correct format
* It must enable the design and building of an automatic testing system

My extensions to wayland76++'s proposal are as follows:

* Include in the metadata enough information to:
  * Build a binary package (deb, rpm, etc)
  * Create a source build script (ebuild, exheres, PKGBUILD, etc)
  * Fetch the source package
  * Contact maintainer/author/etc with issues
  * Query stability information
  * Prod the testing framework (not the actual tests, though)
* Separate the metadata from the package
  * If the metadata is in the source distribution, have CPAN 6 extract it, 
    and put it in a separate tree of just metadata
  * This enables simple fetching of the entire /metadata/ tree without the 
    entire /source/ tree
  * Also opens the door to package managers natively supporting this 
    format (Paludis has done this with CRAN and is working on Ruby Gems support)
* Have well-documented example filters, and a skeleton-filter-builder script
  * Enables packaging developers to rapidly roll out support for new 
    systems
* Encourage use of pure-P6 filters, so as to enable cross-distro 
  maintainership
  * Like building Fedora packages on the developer's Debian workstation
* Multiple submission avenues
  * Simple login-controlled FTP
  * Web submission
  * PAUSE-like client
  * Hosted SCM source w/tagging
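
As a rough sketch of the "common metadata plus filters" idea (Python used purely for illustration): every field name, the "libperl6-" and "dev-perl6/" prefixes, and the simplified output layouts below are assumptions of mine, not part of wayland76's proposal or of any existing spec. The point is only that one metadata record, kept outside the tarball, can feed both a binary-distro filter and a source-distro filter.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleMeta:
    """Hypothetical metadata record stored separately from the source tarball."""
    name: str          # module name, e.g. "Foo::Bar"
    version: str
    source_url: str    # where to fetch the source package
    maintainer: str    # who to contact about issues
    depends: list = field(default_factory=list)  # other modules in the archive


def distro_name(module: str, prefix: str) -> str:
    """Turn Foo::Bar into a distro-style name such as <prefix>foo-bar."""
    return prefix + module.replace("::", "-").lower()


def to_deb_control(meta: ModuleMeta) -> str:
    """Filter for a binary distro: emit a simplified control-style stanza."""
    lines = [
        f"Package: {distro_name(meta.name, 'libperl6-')}",
        f"Version: {meta.version}",
        f"Maintainer: {meta.maintainer}",
    ]
    if meta.depends:
        deps = ", ".join(distro_name(d, "libperl6-") for d in meta.depends)
        lines.append(f"Depends: {deps}")
    return "\n".join(lines) + "\n"


def to_ebuild(meta: ModuleMeta) -> str:
    """Filter for a source distro: emit a simplified ebuild-style fragment."""
    deps = " ".join(distro_name(d, "dev-perl6/") for d in meta.depends)
    return f'SRC_URI="{meta.source_url}"\nRDEPEND="{deps}"\n'
```

Neither filter needs the module's source to be present; both work from the metadata record alone, which is what would let a distro generate packages for the whole archive in one pass.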


Comments?



Re: [RFC] CPAN6 requirements analysis

2009-05-28 Thread Alex Elsayed
On Thursday 28 May 2009 4:04:28 pm Daniel Carrera wrote:
> * We were mainly looking at Alien as a source of Perl code we could borrow.
Ah, I was lumping it in with the previous proposals to actually use .deb as 
the official P6 package format. My mistake.

> * The point of wayland76's proposal was to use the local package
> manager. Whether the local package manager is geared toward binary
> distributions is a separate issue.
Again, my point on the issue of supporting binary/source distros was that 
wayland76's proposal /would/ be able to handle it properly; the '.deb as 
official' method would not.

> At first I liked wayland76's proposal, but now I have a new concern:
> Most package managers are not designed to hold multiple versions of the
> same package. As indicated in S11, it is important that a computer can
> hold multiple versions of the same package. I fear that using the native
> package manager will make this difficult.
This could probably be resolved, but it would likely require distro-specific 
code. On Gentoo, the way to do it would be with 'slots', which are 
specifically designed for that use; on Debian, it would seem that the package 
names would have to carry a version identifier so that multiple versions can 
be installed side by side.
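
A minimal sketch of what that distro-specific code might boil down to (Python for illustration only; the "dev-perl6/" category and "libperl6-" prefix are made-up names, and using the full version as the slot is a simplification):

```python
def gentoo_atom(module: str, version: str) -> str:
    """Gentoo: one package name, with the version carried in the slot after ':'."""
    return f"dev-perl6/{module}:{version}"


def debian_package(module: str, version: str) -> str:
    """Debian: the version becomes part of the package name itself."""
    return f"libperl6-{module}-{version}"


# Two versions of the same module map to two distinct, co-installable names.
for version in ("1.0", "2.0"):
    print(gentoo_atom("foo-bar", version), debian_package("foo-bar", version))
```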



Re: [RFC] CPAN6 requirements analysis

2009-05-28 Thread Alex Elsayed
On Thursday 28 May 2009 4:22:00 pm Larry Wall wrote:
> I support the notion of distributing binaries because nobody's gonna
> want to chew up their phone's battery doing unnecessary compiles.  The
> ecology of computing devices is different from ten years ago.
I agree. My ideal situation would be that CPAN6 offers multiple 'heads' - one 
might be a raw metadata store; another might be a Debian/Ubuntu style PPA 
provider, possibly building (.deb|.rpm)s when uploaded, or in a distributed 
CPANTS-like fashion.

> Most of these package managers have ways of running an installation
> script at the end, so we could perhaps think of this as downloading
> an installer rather than the actual software, and the new version
> of the installer contains or has access to all the versions it knows
> should be installed, and interacts with the official Perl library
> installer to install them.
This has the potential for greatly angering users and distro packagers: a 
major reason people use package managers is that they maintain a database of 
what files were installed, when, and by what package. If you use post-install 
hooks, many package managers won't have that information, making 
uninstallation a nightmare.

> By the same token, it's smart to keep the metadata close to the thing
> it's describing, so if it's easy to extract up front reliably, that's
> probably sufficient.
Again, agreed. That would fall under the "easy to submit well-formed packages" 
requirement in my mail. In fact, it could even still be a META.yml file in the 
source distribution - it could be up to CPAN6 to extract it for external 
access when the distribution is uploaded.


Re: [RFC] CPAN6 requirements analysis

2009-05-28 Thread Alex Elsayed
On Thursday 28 May 2009 4:54:50 pm Daniel Carrera wrote:
> On the other hand, distributing Parrot bytecode (or PIR, or PASM) seems
> fine. But I don't know what to suggest for modules that require a C
> compiler.
The problem with that is that Rakudo isn't the "Official" implementation, and 
never will be. Distributing modules as Parrot bytecode would lock out other 
implementations, something that is very strongly discouraged. Also, I think 
Larry may have meant the compiled form of whatever the P6 equivalent of XS is, 
since Perl is largely designed as a scripting (or at least scripting-ish) 
language and therefore might lose more than it gains by precompiling the 
/Perl/ code rather than just the C/C++/etc.

> I suggested something similar to wayland76 a couple of days ago, to
> solve a different problem. If I remember correctly, he was concerned
> about the local package manager not knowing which files were installed.
>
> Perhaps we should revisit this idea. This is what I proposed: We have
> our own package manager (e.g. /usr/bin/cpan6 ) that takes a .tgz file
> with an appropriate format:
>
> /usr/bin/cpan6 install Foo-Bar.tgz
>
> The RPM Foo-Bar.rpm would contain Foo-Bar.tgz and the rpm install script
> would simply run "cpan6 install Foo-Bar.tgz". When the Fedora user
> uninstalls the module, rpm runs a script which just calls "cpan6
> uninstall Foo::Bar".
The issue with that goes somewhat farther than just "the package manager 
doesn't know what's there." Package managers provide several important 
features that would be rendered near-useless by that model:

* Collision detection - It becomes impossible to prevent another package from 
overwriting a file installed this way
* Linkage checking - If a library interface breaks compatibility, then 
binaries installed this way will not be detected when sweeping installed 
packages to fix linkage
* Programmatic generation of binary packages for hybrid distributions [1]
* Recording the attributes (permissions, etc) of installed packages, so they 
can be fixed if changed inappropriately
* Source-based distributions which give the user the option of 
enabling/disabling tests selectively lose that ability (Gentoo, Exherbo)

[1]: This is a footnote because it'll take a paragraph.
In Gentoo (and when their binary package format is stabilized, Exherbo) binary 
packages are generated by invoking the package manager with specific options. 
With Portage (Gentoo only), the --buildpkg[[only]?] flag determines this, 
while with Paludis (Gentoo and Exherbo) it is accomplished by 'installing' to 
a binary repository. The system you describe makes this model impossible.
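
To make the collision-detection point concrete, here is a minimal sketch of the kind of file-ownership database a package manager keeps (Python for illustration; the class and method names are hypothetical and do not correspond to any real package manager's internals):

```python
class CollisionError(Exception):
    """Raised when a package tries to install a file owned by another package."""


class InstalledFilesDB:
    """Toy stand-in for a package manager's record of who owns which file."""

    def __init__(self):
        self._owner = {}  # path -> name of the owning package

    def install(self, package, paths):
        # Refuse the whole install if any path already belongs to someone else.
        clashes = {p: self._owner[p] for p in paths
                   if self._owner.get(p, package) != package}
        if clashes:
            raise CollisionError(f"{package} collides with {clashes}")
        for p in paths:
            self._owner[p] = package

    def uninstall(self, package):
        # Only files recorded here can be found and removed again.
        removed = sorted(p for p, o in self._owner.items() if o == package)
        for p in removed:
            del self._owner[p]
        return removed
```

Files written by a post-install script never pass through install(), so the database has no entry for them: a later package can overwrite them without any collision being reported, and uninstall() has nothing to remove.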


Re: [RFC] CPAN6 requirements analysis

2009-05-29 Thread Alex Elsayed
On Friday 29 May 2009 1:51:40 am Mark Overmeer wrote:
> I would really like to see a split in terminology being used for the
> various separate problems.  The traditional confusion about what CPAN is:
> an archive or an install tool.  Package manager discussions are in the
> process AFTER the install tool: to distribute OS changes to be made.
> In the messages on the list, I see people merge requirements of these
> very independent tasks.
Okay, I will readily admit that there is a real need for a clearer separation 
between these two things which are both called CPAN. 

> I think that package managers are not a "CPAN" related problem at all.
> The Perl install tool decides which files it wants to have within some
> file-system tree and versioned environment, and then package managers
> distribute those files and meta-data.
Here I'll have to disagree, for several reasons.
* The structure of the archival system will heavily impact the structure of 
the install tool
* If the archive does not provide sufficient information in a useful format, 
then it may be difficult or impossible to delegate installation to the local 
package manager. With the current CPAN architecture for Perl 5, it is 
impossible for an external package manager to calculate dependencies without 
first downloading the entire /source/ of all modules in the dependency list 
*due to a limitation of the archive architecture*
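
To illustrate the difference, here is a minimal sketch of dependency calculation against a metadata-only index (Python for illustration). The index layout, a mapping from module name to its direct dependencies, and the function name are assumptions, not an existing CPAN interface:

```python
def resolve(index, root):
    """Return root and all its transitive dependencies, dependencies first.

    `index` maps a module name to the list of modules it depends on directly.
    Only this metadata is consulted; no source tarball is downloaded or unpacked.
    """
    order, seen = [], set()

    def visit(name):
        if name in seen:          # already handled (also guards against cycles)
            return
        seen.add(name)
        for dep in index[name]:
            visit(dep)
        order.append(name)        # emitted only after all of its dependencies

    visit(root)
    return order


index = {"A": ["B", "C"], "B": ["C"], "C": []}
print(resolve(index, "A"))  # ['C', 'B', 'A']
```

With metadata locked inside the tarballs, each call to visit() would instead have to fetch and unpack a distribution before its dependencies could even be listed.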

> Also, there are various different package managers around for Linux
> distributions, and they tend to be replaced every few years.  If you
> want people to use Perl modules on their Linux systems in a convenient
> way, you have to distribute each perl module in all of the existing
> formats. Of course, a tool like "alien" can be used to simplify the
> task of creating all these flavors.
This feels like a straw man to me.
* Package managers are actually relatively stable. The .deb, .rpm, and .ebuild 
formats (just to name three _big_ ones) have been around for at least 10 years 
each.
* We don't need to provide everything in all formats - If the archive 
structure makes automated conversion easy, then _each_ _distro_ can just run a 
cron job to
  * Sync with the CPAN archive
  * Automagically build packages for their own distro
  * Publish those packages (PPA for Debian and Ubuntu, overlay for Gentoo)
* Alien is really not a very good solution, as it only works for binary 
distros, not source or hybrid ones (Gentoo, Exherbo, Arch) and even then only 
a few binary distros are supported
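
The cron job described above could be as small as the following sketch (Python for illustration). The function name and the shape of its arguments are hypothetical; build and publish stand in for whatever distro-specific tooling would actually be used:

```python
def run_sync(upstream, local_state, build, publish):
    """One cron run: package everything that is new or changed upstream.

    upstream:    {module name: latest version} as read from the metadata tree
    local_state: {module name: version already packaged}; updated in place
    build:       callable(name, version) -> package, distro-specific
    publish:     callable(package), e.g. push to a PPA or an overlay
    """
    built = []
    for name, version in sorted(upstream.items()):
        if local_state.get(name) == version:
            continue                      # already packaged, nothing to do
        package = build(name, version)
        publish(package)
        local_state[name] = version
        built.append(package)
    return built
```

A second run with unchanged upstream metadata does nothing, so the job is safe to schedule as often as the distro likes.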

> IMO, that discussion should go in the direction of additional services:
> the CPAN archive distributes what authors publish.  The install tools
> (CPAN.pm/CPANPLUS/successors) make that code fit in specific operating
> systems. As a service, other people can publish the results of their
> specific module installation via package-managers to the world, such
> that those people can use their platform native software management
> tools.  Just like search.cpan.org is an independent additional service
> on the CPAN archive.
And once again we run into an issue: This workflow would force each end user, 
rather than using /only/ CPAN(PLUS)? or /only/ their package manager to 
install, to use one to generate packages and then the other to install them! 
By providing a very open metadata spec, we can make it so that neither the 
Perl community nor the end-users need to take on that burden. 
Instead, it would go to the distributions, who are already well-prepared to 
handle packaging. We'd just be providing the tools and material they need to 
do so.

> > I personally believe that there are a few requirements for a package
> > format that is sufficient for Perl 6:
> > * It must enable packaging for both binary- and source-based distros
> > * It must enable automatic generation of packages for supported systems
> > (although it may well not be capable of it out of the box)
> > * It must permit (or preferably help with) attempts to support new
> > systems * It must be simple to submit packages in the correct format
> > * It must enable the design and building of an automatic testing system
>
> The worst flaws in software design are based on the idea that you can
> organize the outside world.  The Perl community will never be able to
> push its packaging mechanism into Linux distributions.  We may be able
> to select the ideal packaging mechanism, and then they will wrap that
> in their own packaging mechanism.
As I said above, it doesn't need to. What my mail was suggesting was mainly a 
very information-rich metadata spec, which would contain enough information 
that the distros could package the modules without needing much more than 
that.
* For my first point, it merely says that the metadata must contain enough 
information to run an installation process, which is usually the information 
needed to build a distro package
* The second bullet adds the constraint that this should be doable in an

Re: [RFC] CPAN6 requirements analysis

2009-05-29 Thread Alex Elsayed
> I believe he is arguing that whatever we end up doing needs to make it
> easy for an external package-manager to find out what files CPAN6.pm
> is going to install, and where, and what the dependencies were (both
> Perl and system libraries).  So that the various distributions can
> make native packages from perl packages automatically (or at least as
> close to automatically as possible, it gets a little uglier when
> external libraries are involved).
I am completely in agreement here.

> So if there is program "foo" that is written in Perl that Fedora wants to
> distribute, they want to be able to list the dependencies when they
> make the package for "foo".  If foo requires a few packages, then they
> just want to say that and then when foo is installed, pull in those
> packages and any that those depend on,
Agreed.

> including any non-perl ones
> that are needed by the perl packages.
Here is where it gets sticky. Let's take a perl module which requires the 
Berkeley DB - in fact, requires a specific API version. In Gentoo, you'd 
specify that with a slot dependency, while on Fedora you'd specify the 
release. The two distros may well use different names for the package and/or 
the slot/release specifier.

This problem strikes me as intractable - I think the only thing we can do is 
provide a dependency specifier, clearly tagged as being external to the CPAN 6 
archive, with a sensible name that allows a human to intervene and find the 
correct package for their distro.
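
A minimal sketch of such a tagged external dependency (Python for illustration). The record layout is an assumption, and the entries in the two mapping tables are illustrative examples of what a distro might maintain, not verified package names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExternalDep:
    """Hypothetical dependency on something outside the CPAN 6 archive."""
    name: str              # human-readable, e.g. "Berkeley DB"
    api_version: str       # the API version the module requires
    external: bool = True  # clearly tagged as not resolvable within the archive


# Each distro would maintain its own table; these entries are examples only.
GENTOO_MAP = {("Berkeley DB", "4.7"): "sys-libs/db:4.7"}
FEDORA_MAP = {("Berkeley DB", "4.7"): "db4 >= 4.7"}


def translate(dep: ExternalDep, distro_map: dict) -> Optional[str]:
    """Return the distro's native specifier, or None if nobody has mapped it yet."""
    return distro_map.get((dep.name, dep.api_version))


def describe(dep: ExternalDep, distro_map: dict) -> str:
    """Native specifier if known; otherwise a message a human packager can act on."""
    native = translate(dep, distro_map)
    if native is None:
        return (f"UNRESOLVED external dependency: {dep.name} "
                f"(API {dep.api_version}); please map by hand")
    return native
```

The archive only ever stores the ExternalDep; the translation tables live with the distros, and an unmapped dependency degrades to a readable request for human intervention rather than a silent failure.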



Re: [RFC] CPAN6 requirements analysis

2009-05-29 Thread Alex Elsayed
> I know that Rakudo is not the official implementation. The problem is
> that you misunderstood my post. I did not say to distribute PIR to the
> exclusion of Perl source. You know that I was replying to Larry's
> comment that he supported the notion of distributing binaries. Surely
> you didn't think that Larry meant "distribute binaries to the exclusion
> of Perl source", did you? Therefore, my comment is a reply to the binary
> aspect and the central part of my comment is the problem with modules
> that require a C compiler.
Sorry about that then.

> > * Collision detection - It becomes impossible to prevent another package
> > from overwriting a file installed this way
>
> Nothing is impossible. The first method that crossed your mind may not
> do it, but that doesn't mean that it can't be done. In any case, the
> drawbacks are no worse than what the current CPAN shell does today. And
> the current CPAN shell obviously works fairly well. The current system
> has room for improvement, but any argument that says "it can't work" is
> flawed because it is working right this minute and it has been working
> for years.

It may not be impossible, but the only way to achieve that without making 
ebuilds and letting the package manager handle the package directly (on Gentoo 
at least) seems to be "Support the (non-standardized) VDB database for 
installed packages internally". Which is a pretty shitty road to go down. 
Also, the Gentoo developers have flat-out declared that they will not 
officially support modules installed with the CPAN shell because of these 
issues, so the CPAN shell actually _doesn't_ work "fairly well" on Gentoo. It 
gives you an environment where your install is considered to be out of the 
control of the package manager and, therefore, a spaghetti mess where you have 
multiple package managers which overlap and interfere with each other, yet 
don't have any mechanism for cooperation, synchronization, or communication.


Fwd: Ideas for a "Object-Belongs-to-Thread" threading model

2010-05-12 Thread Alex Elsayed
Forgot to send this to the list.

-- Forwarded message --
From: Alex Elsayed 
Date: Wed, May 12, 2010 at 8:55 PM
Subject: Re: Ideas for a "Object-Belongs-to-Thread" threading model
To: Daniel Ruoso 


You may find interesting a paper that was (at one point) listed
in the /topic of #perl6. The paper is:

"Combining Events And Threads For Scalable Network Services"
http://www.cis.upenn.edu/~stevez/papers/LZ07.ps

Steve Zdancewic and Peng Li, who wrote it, implemented their
proof of concept in Haskell, and I think it would mesh rather
well with the 'hybrid threads' GSoC project that Parrot is
undertaking. What's more, the proof-of-concept demonstrated
that it performed very well, well enough that the
threading/event abstractions were never a bottleneck even
up to 10M threads (for memory usage, this came out to 48 bytes
per thread of overhead), and with >100 threads it outperformed
NPTL(pthreads)+AIO on IO.

It's also CPS based, which fits pretty well.


Re: Re: Parallelism and Concurrency was Re: Ideas for a "Object-Belongs-to-Thread" threading model

2010-05-18 Thread Alex Elsayed
On Tue, May 18, 2010 at 3:19 AM,   wrote:
> The guts of the discussion has been kernel threading (and mutable shared
> state) is necessary. The perception being that by using user-threading (on
> a single core at a time), you avoid the need for and complexities of
> locking and synchronisation. And one of the (I believe spurious) arguments
> for the use of user-space (MIMD) threading, is that they are lightweight
> which allows you to runs thousands of concurrent threads.
>
> And it does. I've done it with Erlang right here on my dirt-cheap Intel
> Core2 Quad Q6600 processor. But, no matter how hard you try, you can never
> push the CPU utilisation above 25%, because those 100,000 user-threads all
> run in a single kernel-thread. And that means I waste 75% of my processing
> power. And next year (or maybe the spring after), when the lowest spec
> Magny-Cours 12-core processor systems have fallen to my 'dirt-cheap' price
> point, I'd be wasting 92% of my processing power.
>
> And for those geneticists and engineers trying to use Perl 6 on their
> relatively cheap 48-core boxes to chug through their inherently MIMD
> algorithms, they'd be wasting 98% of their CPU power if Perl 6 does not
> provide for a threading model that scales across multiple cores.
>
> I hope that gives some context to a) my misunderstanding of your post; b)
> my continued advocacy that kernel threading has to underpin Perl 6's
> threading model.
>
> Java has had user-space threading (green threads) for years; and it was an
> ongoing nightmare until they adopted kernel threads in Java 1.5.
> Erlang has had user-space threading (coroutines) for years; but they've
> had to add kernel threading in the last couple of versions in order to
> scale.
> IO has had coroutines; but has now added kernel threading in order to
> scale.

You are imposing a false dichotomy here. 'Green' threads and kernel threads
do not preclude each other. In fact, it can be convincingly argued that they
work _best_ when combined. Please look at the GSoC proposal for hybrid
threading on the Parrot list.
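
As a rough illustration of the combined shape (Python, and not how Parrot's hybrid threads are implemented): a small pool of kernel threads, each running its own event loop that multiplexes many lightweight tasks. All function names here are made up for the sketch. Note that CPython's global interpreter lock means this particular sketch shows only the structure and does not gain CPU parallelism.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def task(i):
    """A lightweight task; the await is a cooperative yield, like a green thread."""
    await asyncio.sleep(0)
    return i * i


async def run_batch(batch):
    # Many lightweight tasks share one event loop (one kernel thread).
    return await asyncio.gather(*(task(i) for i in batch))


def worker(batch):
    # Each kernel thread runs its own independent event loop.
    return asyncio.run(run_batch(batch))


def hybrid_map(n_tasks, n_kernel_threads):
    """Spread n_tasks lightweight tasks over n_kernel_threads kernel threads."""
    batches = [list(range(k, n_tasks, n_kernel_threads))
               for k in range(n_kernel_threads)]
    with ThreadPoolExecutor(max_workers=n_kernel_threads) as pool:
        results = pool.map(worker, batches)
    return sorted(x for batch in results for x in batch)
```

The number of lightweight tasks and the number of kernel threads are chosen independently, which is the point: thousands of cheap tasks, a handful of kernel threads to keep every core busy.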


Re: Help mechanism in REPL?

2016-09-09 Thread Alex Elsayed
On Wednesday, 7 September 2016 17:57:32 PDT Parrot Raiser wrote:
> This isn't a request for a feature, merely a thought experiment. We're
> still in the phase where it's more important to ensure that existing
> features work properly than add new ones.
> 
> How difficult would it be to include a mechanism within the REPL to
> select either documentation or an example, (possibly from the test
> suite), for a particular command? Selection might be by some control
> key combination,  cursor positioning, or an alternative to "enter" at
> the end of the line. The purpose would be to speed development, by
> enabling an inexperienced developer to look up details while testing.
> 
> Syntax errors generate messages which attempt to provide help; could
> this provide the basis for a "help" mechanism? Would this be useful?
> 
> Opinions?

Well, this sounds like a job for the meta-object protocol (specifically, 
`.WHY`):

https://docs.perl6.org/language/mop#WHY

The simplest option for handling this in the REPL is probably to have some 
sort of automatic handling of Pod sent to sink context, rendering it and 
sending it to a pager. Then, the user could simply do

>> Hash.WHY
(LET THERE BE DOCS!)

And there would be docs.