Hi Kyle,
On 3/31/23 11:15, Kyle wrote:
I would expect most software versions to not be in Guix. Simon had mentioned
that this is mostly what the guix-past repository is for. However, some
packages might be buried on some branch or some commit in some Guix related git
repository. It may be helpful to facilitate their discovery and extraction for
conda import.
Git has a newish binary file format for caching searches across commits. Maybe
it would be helpful to figure out how to parse this format (it's documented) and
index the data further using Xapian or a graph data structure (or tree-sitter?)
with the relevant metadata needed to find and efficiently extract scheme code
and its dependencies?
If the format is documented then this is possible, although I'm not
super familiar with these kinds of data structures.
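As a starting point, here is a minimal sketch of reading the fixed-size header of such a file. It assumes the format meant above is Git's commit-graph file (usually found at .git/objects/info/commit-graph), which the thread does not actually confirm. The function name `read_commit_graph_header` is made up for illustration, and the sketch stops at the header: it does not parse the chunk table or any commit data.

```python
import struct

# Assumption: the "newish binary file format" is Git's commit-graph file.
# Its header is 8 bytes: a 4-byte signature, then one byte each for the
# format version, the hash version, the number of chunks, and the number
# of base commit-graph files.
HASH_NAMES = {1: "SHA-1", 2: "SHA-256"}


def read_commit_graph_header(data: bytes) -> dict:
    """Decode the 8-byte header of a commit-graph file (hypothetical helper)."""
    if len(data) < 8:
        raise ValueError("too short to contain a commit-graph header")
    signature, version, hash_version, chunks, base_graphs = struct.unpack(
        ">4sBBBB", data[:8]
    )
    if signature != b"CGPH":
        raise ValueError("not a commit-graph file")
    return {
        "version": version,
        "hash": HASH_NAMES.get(hash_version, "unknown"),
        "chunks": chunks,
        "base_graphs": base_graphs,
    }


# Usage with a synthetic header (version 1, SHA-1, 4 chunks, no base graphs):
example = b"CGPH" + bytes([1, 1, 4, 0])
print(read_commit_graph_header(example))
```

Anything beyond the header (the chunk lookup table, the commit data, the optional changed-path filters) would have to follow the format documentation before it could feed an index such as Xapian.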
You make an interesting point about compilation errors. It may be more productive to help
researchers test for working satisfiable configurations as a more relaxed approach to
having to specify the exact software version. Maybe some "nearby" or newer
version is packaged and that is enough to successfully run a test suite? I'm imagining
something between git bisect and Guix's own package solver.
Yes, we could have a variant of the solver that's more relaxed. It could
output multiple solutions so the user can inspect them and pick the best
one.
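To make the "multiple solutions" idea concrete, here is a rough sketch of how candidates for a single package could be proposed. This is not Guix's resolver; `parse` and `nearby_versions` are hypothetical names, the list of available versions is invented, and the sketch assumes purely numeric dotted versions. It offers the closest newer versions first, then the closest older ones, and leaves the final choice to the user.

```python
def parse(version: str) -> tuple:
    """Turn '1.23.2' into (1, 23, 2). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def nearby_versions(requested: str, available: list, limit: int = 3) -> list:
    """Return up to `limit` candidates: nearest newer (or equal) versions
    first, then nearest older versions."""
    req = parse(requested)
    newer = sorted((v for v in available if parse(v) >= req), key=parse)
    older = sorted(
        (v for v in available if parse(v) < req), key=parse, reverse=True
    )
    return (newer + older)[:limit]


# Usage: the environment asks for 1.21, which is not packaged.
packaged = ["1.17.3", "1.23.2", "1.24.1", "1.20.0"]  # made-up data
print(nearby_versions("1.21", packaged))
# → ['1.23.2', '1.24.1', '1.20.0']
```

Each candidate could then be tried against the project's test suite, which is roughly the "between git bisect and a solver" workflow described above.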
It might also be productive to add infrastructure to help scientists more
conveniently track and study their recent packaging experiments. Guix will only
become more useful as more packages become available. Work that
makes packaging more approachable to more people benefits everyone. Perhaps you
can think of other ideas in this direction?
I'm not sure how "packaging experiments" are different from packaging
software the usual way. I think making the importers easier to use and
debug would help, although that sounds outside the scope of the projects.
Finally, would these projects be considered large or medium for the
purposes of GSOC?
Thanks,
Skylar
On March 30, 2023 7:22:14 PM EDT, Spencer Skylar Chan
<scha...@terpmail.umd.edu> wrote:
Hi Kyle,
On 3/24/23 14:59, Kyle wrote:
I am a bit worried that your proposed project is too focused on replacing
python with guile. I think the project would benefit more from making python
users more comfortable productively using Guix tools in concert with the tools
they are already comfortable with.
Yes, I agree with you. Replacing Python with Guile is a much more ambitious
task and is not the highest priority here.
I'm wondering if you might consider modifying your project goals toward
exploring how GWL might be enhanced so that it could better complement more
expressive language specific workflow tools like snakemake. I am also
personally interested in exploring such facilities from the targets workflow
system in R as well. Alternatively, perhaps you could focus on extending the
GWL with more features?
I would also be interested in extending GWL with more features; I will follow
up with this on the GWL mailing list.
I agree that establishing an achievable scope within a short timeline is
crucial. The conda env importer idea would be quite an ambitious undertaking by
itself and would lead you towards thinking about some pretty interesting and
impactful problems.
While it's a challenging project, it could be broken into smaller steps:
1. import packages by exact matching names only, without versioning.
2. extend `guix import` to have `guix import conda` to help with package names
that do not match exactly, and to accelerate adoption of Conda packages not in
Guix
3. match software version numbers when translating Conda packages to Guix
What's currently undefined is the error handling:
- if a Conda package does not exist in Guix
- if the dependency graph is not solvable
- if compiling the environment fails (due to mismatching dependency versions)
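Step 1 and the first error case above can be sketched as follows. This is only an illustration under stated assumptions: `conda_name`, `match_exact`, `to_manifest` and `ImportResult` are hypothetical names, the set of Guix package names is passed in by the caller (it might be gathered from `guix package --list-available`), and versions are deliberately ignored. The only Guix-side name used is the existing `specifications->manifest` procedure.

```python
from dataclasses import dataclass, field


@dataclass
class ImportResult:
    matched: list = field(default_factory=list)  # same name exists in Guix
    missing: list = field(default_factory=list)  # no exact-name match


def conda_name(spec: str) -> str:
    """Drop the version part of a Conda spec such as 'numpy=1.24'."""
    for sep in ("=", "<", ">", "!", " "):
        spec = spec.split(sep, 1)[0]
    return spec.strip().lower()


def match_exact(conda_specs, guix_names) -> ImportResult:
    """Step 1: match by exact name only and report what is missing."""
    result = ImportResult()
    for spec in conda_specs:
        if not isinstance(spec, str):  # e.g. a nested {"pip": [...]} entry
            continue
        name = conda_name(spec)
        if name in guix_names:
            result.matched.append(name)
        else:
            result.missing.append(name)
    return result


def to_manifest(names) -> str:
    """Render the matched names as a Guix manifest expression."""
    quoted = " ".join(f'"{name}"' for name in names)
    return f"(specifications->manifest '({quoted}))"


# Usage with made-up data: 'numpy' has no exact match in this toy set.
deps = ["samtools=1.16", "bwa", "python>=3.10", "numpy=1.24"]
result = match_exact(deps, {"samtools", "bwa", "python"})
print(to_manifest(result.matched))
print("not found in Guix:", result.missing)
```

Reporting the unmatched names instead of failing outright gives a natural hand-off to step 2, where something like `guix import conda` would deal with names that do not match exactly.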
I believe there are many satisfactory stopping points for successful completion
within the timeline of the summer, which I hope to present with my proposal
soon.
Thanks,
Skylar
On March 22, 2023 5:44:52 PM EDT, Spencer Skylar Chan
<scha...@terpmail.umd.edu> wrote:
Hi Ricardo,
On 3/22/23 14:19, Ricardo Wurmus wrote:
- Translating Snakemake to Guix Workflow Language (GWL)
Ricardo, maybe you would have some suggestions. :-)
Oh, this looks interesting. Could you please elaborate on the idea?
My idea is to take as input a Snakemake workflow file and eventually
output an equivalent GWL workflow file.
Currently, Snakemake workflows can be exported to CWL (Common Workflow
Language):
https://snakemake.readthedocs.io/en/stable/executing/interoperability.html
One approach could be to add CWL import/export capabilities to GWL. Then
Snakemake/GWL conversion would be a 2 step process, using CWL as an
intermediate step:
1. Snakemake -> CWL
2. CWL -> GWL
However, CWL is not as expressive as Snakemake. There may be some details
that are lost from Snakemake workflows.
So a one-step Snakemake-to-GWL transpiler could be interesting, as both Snakemake and GWL embed
a domain-specific language in a general-purpose language (Python and Guile, respectively).
There may be a possibility to achieve more "accurate" translations between
workflows.
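A direct translation could go through a small language-neutral description of each rule. The sketch below is only meant to show the shape of that idea: the `Rule` class and `render_gwl` are hypothetical, no real Snakemake file is parsed, and the emitted text is modelled loosely on GWL's Wisp-style process syntax without having been validated against GWL. In particular, the field names and the `{{inputs}}`/`{{outputs}}` placeholders are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Language-neutral view of one workflow step (hypothetical)."""
    name: str
    inputs: list
    outputs: list
    packages: list
    shell: str  # Snakemake-style command using {input} and {output}


def render_gwl(rule: Rule) -> str:
    """Emit GWL-like text for one rule (assumed syntax, not validated)."""
    command = (
        rule.shell
        .replace("{input}", "{{inputs}}")
        .replace("{output}", "{{outputs}}")
    )

    def quote(items):
        return " ".join(f'"{item}"' for item in items)

    return "\n".join([
        f"process {rule.name}",
        f"  packages {quote(rule.packages)}",
        f"  inputs {quote(rule.inputs)}",
        f"  outputs {quote(rule.outputs)}",
        f"  # {{ {command} }}",
    ])


# Usage with a made-up rule:
rule = Rule(
    name="sort_bam",
    inputs=["a.bam"],
    outputs=["a.sorted.bam"],
    packages=["samtools"],
    shell="samtools sort {input} > {output}",
)
print(render_gwl(rule))
```

Details that CWL cannot express (wildcards, Python code inside rules) would need extra fields in this intermediate form, which is where a direct transpiler might keep more information than the two-step route.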
Is this topic something that could fit into a summer project?