Re: “Building a Secure Software Supply Chain with GNU Guix”
This is why things like SELinux exist; combine them with separate binaries for the functionality that impacts things outside of the store to quickly minimize possible damage. If the binary can only create links, the possible damage is quite limited.

But the much more dangerous modification is much more subtle and can go months to years without being noticed, and against that there is no defense, as there is no way to know when a person you trust will go crazy, turn evil or flip the switch they planned many years ago.

Heck, the possible exploits that could be in the bootstrap seeds could be so subtle you wouldn't notice, or could even be hidden in the kernel itself:
https://gitlab.com/bauen1/stage0-backdoor.git

Making reviews by third parties cheap, making forking cheap and never assuming that anyone should be completely trusted is usually a secure place to start. That is why the bootstrap seeds README starts with:

NEVER TRUST ANYTHING IN HERE

I could be evil after all
-Jeremiah
Re: [bootstrappable] wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
Amazing work as always, janneke. We will just have to do some kaem work to make it all work on POSIX systems.
-Jeremiah
Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
>> I think that's what mes-m2 rewrite [1] (not to be confused with mes wip-m2
>> branch) is trying to achieve.
> Oh I see. It’s still kinda confusing to have two Mes. Wouldn’t it be
> nice to have just one?

Yes it would, which is why a third scheme is being written in the Haskell subset we are bootstrapping. Ironically, getting from M2-Planet to Haskell was just a couple of weeks of work, but M2-Planet to scheme is a bit of a pain point, as janneke and I seem to have very different styles for scheme in C. But that might simply be because I spent the last 4 years dealing with hex and assembly and my scheme code is now crap.

> I understand the goals are not exactly the same,
> but is there some way to converge? (This is a naive question, you’ve
> probably already thought about it, but anyways. :-))

There are 3 easy ways to converge the two:
1) Transplant janneke's wip-m2 eval into mes-m2 to solve the mes-m2 macro problem enough to run MesCC (but this might make guile compatibility harder).
2) Transplant my mes-m2's garbage collector into wip-m2 to solve the wip-m2 pointer arithmetic problem (but this is even farther from guile compatibility).
3) Not actually converge the code and simply throw one or both of them away; say, write the whole thing in a better language than C (Haskell perhaps, but that ultimately requires abandoning previous work).

I wish I had better answers, but we still have guile's psyntax.pp bootstrapping problem, and figuring out how to do syntax-case in C is a bit of a problem, and that is before even dealing with the restrictions of doing this with a minimal C compiler.

If we pulled the scheme macro requirement out, then the number of minimal schemes which could run MesCC would explode. But it seems unlikely such a change would occur, as macro-less scheme is no more productive than standard C coding.
- Jeremiah
Cleaning up make clean's behavior
Running make clean currently breaks the bootstrap script. I propose we leverage git's shallow clones (git clone --depth 1 $URL) and include the .git directory with the repo, so that make clean can check for git and, if it exists, run git clean -xdf; only if git is not available would it fall back to the existing broken form, which needs to be corrected.

This comes out of a discussion with g_bor[m] about the default automake clean rules currently being used; per their suggestion I am bringing this question to this distribution list for further discussion.

Additionally, wouldn't one want to pack the .git in the tarball to enable a simplified update method?

jlicht pointed out that this would not be a problem yet for guix, but it does seem unconventional. It would not make sense for some bigger-repo projects (e.g. emacs) for sure, though.

Given that discussion background, does anyone have any problems, concerns or issues with the proposed change?
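The check-then-fall-back behavior proposed for make clean could look roughly like the shell sketch below. The function names pick_clean_method and clean_tree are made up for illustration, and the fallback branch is only a placeholder for the existing clean rules; none of this is taken from the project's actual Makefile.

```shell
#!/bin/sh
# Hypothetical helper: decide how "make clean" should clean a source tree.
# Prints "git" when git is installed and DIR contains a .git directory,
# otherwise prints "fallback".
pick_clean_method() {
    dir=$1
    if command -v git >/dev/null 2>&1 && [ -d "$dir/.git" ]; then
        echo git
    else
        echo fallback
    fi
}

# Hypothetical wrapper that a clean rule could call.
clean_tree() {
    dir=$1
    case $(pick_clean_method "$dir") in
        git)
            # Remove everything git does not track, including ignored files.
            git -C "$dir" clean -xdf
            ;;
        *)
            # Placeholder for the existing (currently broken) clean rules.
            echo "git unavailable for $dir; falling back to the old clean rules" >&2
            return 1
            ;;
    esac
}
```

Keeping the decision in its own function means the choice can be checked without deleting anything; clean_tree is the only part that touches the tree.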
M2-Planet latest release
Today I am proud to announce M2-Planet version 0.2.0
https://github.com/oriansj/M2-Planet

The world's simplest C compiler with support for:
structs with sizeof support
anonymous unions (inside of structs)
arrays
Inline assembly
Gotos
for, while and do loops with optional breaks
bitshifting
bitwise operations
escaped strings
Passable function pointers

Written and self-hosting in a lovely C99 subset
optional dwarf footers (thanks to mescc-tools blood-elf) allowing for objdump and gdb to play nicely
and 100% deterministic output

Able to be bootstrapped from a trivial Macro-assembler and hex2-linker (which when hand made are under 3KB total), which can be found here:
https://github.com/oriansj/mescc-tools
or via any C compiler that supports only 60% of the features of M2-Planet
A major milestone in bootstrapping
Today I am proud to announce a combo of releases with major milestones.

First, stage0 reached Release version 0.2.0, which includes the following:
A 250 byte hex0 bootstrap binary that is self-hosting and builds hex1, which builds hex2, which bootstraps M0 macro assembly, which has been used to make:

The world's FIRST C compiler written in M0 macro assembly, supporting:
structs
unions
inline assembly
function pointers
http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage2/cc_x86.s

which is capable of compiling reproducibly:
The newly released M2-Planet V1.0
https://github.com/oriansj/M2-Planet
which is a self-hosting C compiler that is 100% deterministic by design, with support for all the features needed to build the pieces of:
mescc-tools
https://savannah.nongnu.org/projects/mescc-tools
which is capable of building M2-Planet from its M1 macro seed.

Soon we hope to finish the MesCC bootstrap from M2-Planet, and then we will have a complete bootstrap path from a 250 byte hex0 all the way up to gcc ^_^
-Jeremiah Orians
Re: Stop it. Formerly - Re: Promoting the GNU Kind Communication Guidelines?
Perhaps it is human nature to think in terms of conflict; right and wrong. Absolutes are naturally attractive, especially to those of us who program. It just feels so natural because what we work with the most is in many ways exactly like that. But one need not get stuck on such a perspective.

The Code of Conduct is an entirely rational and correct solution for a population of only cognitively normative individuals. But that is not the argument being made by both sides. Rather, we as a community have, in our set of productive members, those who fall outside the bounds of what is considered Cognitively Normal, and for them the Code of Conduct is a point of contention. It is entirely counterproductive for that population, and it isn't what has historically been effective at growing productive software development communities.

But we need not think in such limited terms as have or not have in regards to the Code of Conduct. Rather: can we carve out a zone of exclusion where those who are productive members of the community can act and interact without fear of the Code of Conduct or other normative pressure placed upon them?

I propose we institute a Tony Stark <-> Pepper Potts mechanism. We create channels where people who can't or will not conform to the Code of Conduct are free to collaborate and contribute to the project through a few designated individuals who have thick skin and are willing to put up with Flaming assholes in private for the good of the project.

There are multiple details we will need to hammer out over time, but the general idea is that we stop trying to keep people who are different from contributing in a positive manner.
-Jeremiah
Re: Packaging ufw
> I like this firewall, has anybody started packaging it?

No, possibly because it doesn't add much when one has iptables and a guix configuration script for it.

> We have no other firewall packages judging from my emacs-guix regex
> search.

We have iptables and ebtables, and I suggest you consider the following iptables/ip6tables rules:

-P INPUT DROP
-P FORWARD DROP
-P OUTPUT ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

You could easily lock it down further, but one piece of software needed on servers and missing on guix is port knocking software.
-Jeremiah
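As a rough illustration, the five rules above can be turned into plain iptables and ip6tables invocations with a small shell function. The function name emit_firewall_rules is hypothetical, and the sketch deliberately only prints the commands so they can be reviewed before anything is applied.

```shell
#!/bin/sh
# Print the commands that would apply the suggested policy to both
# IPv4 (iptables) and IPv6 (ip6tables). Nothing is executed here.
emit_firewall_rules() {
    for ipt in iptables ip6tables; do
        echo "$ipt -P INPUT DROP"
        echo "$ipt -P FORWARD DROP"
        echo "$ipt -P OUTPUT ACCEPT"
        echo "$ipt -A INPUT -i lo -j ACCEPT"
        echo "$ipt -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT"
    done
}
```

Running emit_firewall_rules shows the ten commands; piping the output to sh as root would apply them. When doing that over a remote session, it is probably safer to add the ACCEPT rules before switching the INPUT policy to DROP.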
Re: Preparing the reduced bootstrap tarballs
> Indeed. "mes-seed" and "tinycc-seed" are remnants of the past; the only
> things we need are
>
>> What we would need here is something to build the things listed in
>> ‘%bootstrap-inputs’, namely:
>> ‘linux-libre-headers-stripped-4.14.26-i686-linux.tar.xz’ (easy :-)),
>> ‘mescc-tools-seed-XYZ.tar.gz’, and
>> ‘mes-stripped-0.18-0.08f04f5-i686-linux.tar.xz’

That is correct (with mescc-tools-seed adding steps and removing binaries over time).

> So if you like, please make that change. There is only one little
> thing: I have no (scripted) recipe to create mescc-tools-seed-XYZ. But
> wait: I have a great excuse for that...I was too lazy or too sloppy.

I do, in mescc-tools-seed: the script bootstrap.sh, when run with the option "sin", will build the mescc-tools-seed binaries using mescc-tools. The .M1 files are always generated by cc_x86.s using the C source files.

> The thing is, I used to build mescc-tools-seed, mes-seed and tinycc-seed
> manually from the mes+mescc+tinycc source trees.

I've been building them from the stage0 bootstrap tree (which as you can see is rather trivial: https://github.com/oriansj/talk-notes/blob/master/Current%20bootstrap%20map.dot )

> Jeremiah Orians is
> working to remove any need for mescc-tools-seed (esp. the forward
> dependency on Mes) but I don't think we're there yet.

We have already eliminated the forward dependency on Mes for the creation of the mescc-tools-seed.

> Anyway, I think we/I will have to put some work into scripting
> mescc-tools-seed or otherwise changing the mescc-tools-boot build.

Already done in bash and kaem, but not in guix yet (should be trivial).

> WDYT?

I think we will end up having several versions of mescc-tools-seed, as each architecture guix supports will end up needing a variant if we plan on keeping them small. (I also have no idea how to make a multi-arch fat elf binary.)

I am also curious whether there is any demand for the stripped versions of mescc-tools-seed, as those binaries are nearly half the size.

>> (do we really need an x86_64 version of this Mes?).
> No, I don't think so. I added it esp. to get a preview and enable
> future development of pure x86_64 bootstrap; but dependency-wise we
> should be able to drop it!

Also, AMD64 does support i386 binaries without issue.
- Jeremiah
Re: Preparing the reduced bootstrap tarballs
> I think it's important that the new bootstrap-tarballs be
> bit-reproducible, such that they can be independently verified by anyone
> who wishes to do so.

Every piece below M2-Planet has always been bit-reproducible. In fact, each piece is designed in a way that you could predict by hand what the resulting binary must be after any change. And once I finally complete stage0, you would also have the blueprints for making the virtual machine in hardware, could hand toggle in the bits for the hex0-monitor and would have absolute proof that no trusting trust or Nexus Intruder Class attacks have occurred in the creation of the binaries.

Every issue anyone is willing to bring, I will publicly address until all bootstrap roots (even on arbitrary hardware) lead to the proof that these binaries are perfectly reproducible and that they only behave in the manner explicitly specified by the standards to which they conform.

> In particular, *I* would like to independently verify them, on my own
> laptops where I have avoided using binary substitutes for a long time,
> and which I keep with me at all times.

Already done; here are the current steps for bootstrapping the mescc-tools-seed and the M2-Planet seed.M1:

git clone 'https://git.savannah.nongnu.org/git/stage0.git'
cd stage0
make test
cd ..
git clone 'https://git.savannah.nongnu.org/git/mescc-tools.git'
cd mescc-tools
make test
cd ..
git clone 'https://github.com/oriansj/mescc-tools-seed.git'
cd mescc-tools-seed
./bootstrap.sh sin

To generate the M2-Planet seed.M1 you need to export either mescc-tools-seed's blood-elf, M1 and hex2 or mescc-tools (via copying into your path or doing make install); then the steps to generate it are as follows:

git clone 'https://github.com/oriansj/M2-Planet.git'
cd M2-Planet
./bootstrap.sh refresh

Now you are done.

> My hope until now is that when we generated our existing bootstrap
> binaries in 2013, Guix was too marginal a project to attract the
> attention of hackers who might wish to compromise our bootstrap. In
> 2018, as Guix has become more popular, we might well be considered a
> worthy target of such efforts.

I like to go with the assumption that every binary is already compromised; but by going back to the basics we can find and rip out every single hook until we are finally secure.

I don't trust any hardware I can't or didn't make myself. And the only root of trust we have is the ability to work as a community, giving every member the ability to independently check our assumptions and point out our mistakes.

We will have false starts and failures of imagination, but by working together we will achieve a dream that is too hard to achieve alone but easy now that we have each other, helping us all strive toward a brighter future.
-Jeremiah
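Once the steps above have been run independently on two machines, the bit-reproducibility claim can be checked by comparing the resulting binaries byte for byte. The following is only a sketch: the function name compare_builds and the idea of collecting each machine's output into one directory are assumptions, not part of any of the bootstrap scripts.

```shell
#!/bin/sh
# Compare every regular file in build directory A against the file of
# the same name in build directory B, byte for byte.
# Returns 0 only if every file in A has an identical counterpart in B.
# Files that exist only in B are not examined.
compare_builds() {
    a=$1
    b=$2
    status=0
    for f in "$a"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        if cmp -s "$f" "$b/$name"; then
            echo "OK   $name"
        else
            echo "DIFF $name"
            status=1
        fi
    done
    return "$status"
}
```

For example, compare_builds ./seed-from-laptop-1 ./seed-from-laptop-2 (both directory names hypothetical) prints one line per file and exits non-zero if any file differs or is missing.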
Re: Preparing the reduced bootstrap tarballs
> However, my impression (correct me if I'm wrong) is that we are not yet
> able to bootstrap Guix exclusively from M2-Planet.

That is correct, as the step of bootstrapping MesCC from M2-Planet is not yet complete. However, once that is done, we can leverage Mes.c and gash to complete the bootstrap of guix from that trusted reproducible source in a reproducible fashion.

> For example, unless
> I'm mistaken, we still need Guile in our bootstrap, and I'm guessing
> that we are not yet able to build Guile exclusively from M2-Planet.
> Is that right?

We don't need it, so much as people wish to avoid tedious work. We already can bootstrap kaem without any shells or interpreters, and it can be used to run shell scripts that can perform the rest of the bootstrap of a lisp or a proper shell. I think that work has been skipped because it is less of a technical challenge.

> My only point is that if we cannot yet avoid blindly trusting
> precompiled binaries,

Depends on how restricted an environment you are willing to work in.

> I have higher confidence in our 2013 binaries than
> in binaries we would produce today, because (1) we are more likely to be
> a target today because Guix has become far more popular, (2) I expect
> that intelligence agencies have far more advanced tools today than they
> did in 2013, and (3) I expect that governmental policies have become far
> more favorable to permitting such attacks against projects such as ours.

1) Granted.

2) Not exactly; simply because the most advanced attack tool ever invented was the Nexus Intruder Program in 1958 (hardware that subverts software that later subverts hardware designs and more software [firmware, microcode, etc]). The tools might get more expensive, but the actual quality of attack tools depends on the teams and the market's demand for pumping out vulnerable products and bugs. (Like the recent Hard drive firmware attack, which leveraged the vendor's cost cutting process to hijack the drives and then lock out future attempts at recovery.)

3) Actually, Government agencies are depending more and more on "Open source tools" (their words, not mine) as software budgets have gotten tighter and third party vendors integrate them more and more into the commercial offerings purchased by Government agencies. Putting a backdoor in the software most Government agencies depend upon invites vulnerabilities in our own Intelligence Agencies' infrastructure and increases the probability that Spies will be identified before their flight to their target country leaves the ground. To do such would not only be suicidal for those Intelligence Agencies but would also make Cyberwarfare against the Countries they work for that much more effective.

Now that isn't to say they couldn't consider that an externality and doom us all, but nothing stays hidden when we can read the source and can DDC our entire bootstrap across arbitrary hardware/operating system combinations.
-Jeremiah
Re: Preparing the reduced bootstrap tarballs
> so, if I don't get it wrong, every skilled engineer will be able to
> build an "almost analogic" (zero bit of software preloaded) computing
> machine and use stage0/mes [1] as the "metre" [2] to calibrate all other
> computing machines (thanks to reproducible builds)?

Well, I haven't thought of it in those terms, but yes, I guess that is one of the properties of the plan.

> the first bit of code have to be "manually" introduced in the machine,
> right?

Correct, otherwise you'll have to deal with firmware/bios as a trust vector to be concerned about.

> for the lazier like me, what about a punched card? :-)

If someone is willing to figure out how to read a deck of punched cards without software, I'd be interested in learning more.

> I didn't know about Nexus Intruder attacks: could you please give me
> some links to the relevant bibliography?

I'll see if I can dig those up for you.

> so, having the scientific proof that binary conforms to source, there
> will be no need to trust (the untrustable)

Well, that is what someone else could do with it, but it is not a direct goal of the work.
-Jeremiah
Re: Preparing the reduced bootstrap tarballs
share some idle thoughts.

I actively encourage alternative perspectives and I love being told how I can do this better. I look forward to more ideas and suggestions from you on this subject later ^_^

Just an open reminder: our #bootstrappable channel is always looking for people interested in these sorts of topics, and we love hearing about what you have created in this regard.
-Jeremiah
RE: Trustworthiness of build farms (was Re: CDN performance)
understand the seed pieces, that unfortunately has reduced and that issue needs to be addressed.

> we should be clear that the current efforts fall far short of a proof,

Absolutely agree at this point; we only have the specification for which the proof must be written, and someone will need to write a formally verified version to verify that our roots of truth, which are approximations of the specification limited by the constraints of time and human effort, have not missed a subtle error.

> and there still remain several valid reasons not to place one's trust in
> substitute servers.

Personally I believe that, in terms of security and trust in the absolute senses of the words, you are beyond a shadow of a doubt correct; but the substitutes have never been about trust or security, only the convenience which many guix users want.

But that doesn't mean that there are not things we should start doing to make the task of compromising that trust harder. For example:

Cryptographic port knocking for administration of servers, requiring access to a physically separate hard token.

Cryptographically signed software white-listing on servers, with the signing keys being on a separate system that is offline and the binaries being signed being built from source on machines that have no internal state and run only software explicitly specified and required for the build.

Random system spot checks and wipes.

In short, anything that makes single point compromises worthless needs to be actively considered.

> Does that make sense?

Yes, and hopefully my perspective makes sense as well.
-Jeremiah
Re: Trustworthiness of build farms (was Re: CDN performance)
> If you could add an "In-Reply-To:" header to your responses, that would
> be very helpful. It's easy to add it manually if needed: just copy the
> "Message-ID:" header from the original message and replace "Message-ID:"
> with "In-Reply-To:". As is, it's very difficult for me to keep track of
> any conversation with you.

Sorry about that; it was not something I generally even consider, as emacs handles that bit of tracking for me, but I will try to add that in the future.

> I'm not worried about those languages. Very little code is written in
> them anyway, only a small part of the early bootstrap.

Yet that is also the point where trusting trust attacks can most effectively be inserted. Hence more concern and less trust needs to be placed there.

> My concern is the correspondence between the source code and machine code for
> the majority of the operating system and applications.

That seems more like a feature request for GCC and guile than for the bootstrap or the trustworthiness of the build farms.

> It's important to note that even a relatively obscure bug in the
> compiler is enough to create an exploitable bug in the machine code of
> compiled programs that's not present in the source code.

Absolutely, and we already have dozens of living examples due to performance optimizations done by GCC and Clang exploiting undefined behavior in the C language and in some cases violating the spec for the sake of compatibility.

> Such compiler bugs and the resulting exploits could be systematically
> searched for by well-resourced attackers.

And actively exploited, as they are already doing.

> So, if we want to *truly* solve the problem of exploitable bugs existing
> only in the machine code and not in the corresponding source, it is not
> enough to eliminate the possibility of code deliberately inserted in our
> toolchain that inserts trojan horses in other software.

Very true; we need formally defined languages which do not have undefined behavior, and rigorous tests that prevent optimizations from violating the spec. In essence, a series of acid tests that ensure the resulting mental model always exactly matches the computing model that the compilers will be based upon.

> To truly solve that problem, we need bug-free compilers.

Impossible for all but the simplest of languages, as the complexity of implementing a compiler/assembler/interpreter is ln(c)+a but the complexity of implementing a bug-free compiler/assembler/interpreter is (e^(c))! - a, where a is the complexity cost of supporting its host architecture.

> In practice, this requires provably correct compilers.

Which in practice turn out *NOT* to be bug free nor complete in regards to the standard specification.

Now, don't get me wrong; provably correct compilers are a step in the right direction, but the real solution is to first generate simplified languages that don't have undefined behavior and that have human model first behavior.

> Does that make sense?

Absolutely, and it is certainly something possible to do; but it is extremely human effort intensive and I don't see anyone willing to throw 2+ years of human effort at the problem outside of non-free Businesses like CompCert.

I'd love to see someone do it, and I'd even throw a few dollars into a pot to fund it, but it is so low down on my list of priorities that I'm not going to be touching it in the next 2 decades...
-Jeremiah
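For mail clients that do not do this automatically, the manual step described in the request quoted at the top of this message (copy the "Message-ID:" header and rename it to "In-Reply-To:") can be sketched in shell. The function name reply_header is hypothetical, and the sketch assumes the header sits on a single unfolded line.

```shell
#!/bin/sh
# Read a raw email on stdin and print the In-Reply-To header for a reply,
# derived from the first Message-ID header line found.
# Header names are case-insensitive, so "Message-Id:" is matched as well.
reply_header() {
    sed -n 's/^[Mm]essage-[Ii][Dd]:/In-Reply-To:/p' | head -n 1
}
```

Feeding a saved message to it, as in reply_header < original.eml (file name made up), prints the single line to paste into the reply's headers.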
Re: Trustworthiness of build farms (was Re: CDN performance)
> Where are you getting those complexity expressions from?

Approximation of developer effort spent on single pass workflows and bugfree libraries in the State of Michigan Welfare Eligibility System, extracted from its ClearCase commit history. (Thank god, I finally got them to convert to git after 3 years of wailing and gnashing of teeth.)

> Can you cite references to back them up?

I can cite the numbers I used to generate those approximate complexity equations.

> If not, can you explain how you arrived at them?

Simple graphing and curve matching of collected data.

> What is 'c'?

The number of 'Features' a program had. In short, I found it is far easier for a developer to add features to a program, but it is really really hard to make sure the existing functionality is bug free.

> If you're referring to the bugs found in CompCert, the ones I know about
> were actually bugs in the unverified parts of the toolchain.

I was referring to the bugs in the verified parts in regards to C's undefined behavior.

> In the past, its frontend was unverified, and several bugs were found there.
> Even today, it produces assembly code, and depends on an unverified
> assembler and linker.

Which, if the compiler has been formally proven to only output valid assembly (no 33bit offset loads), is less of a concern. (Let us just hope the assembler doesn't arbitrarily choose 8bit immediates for performance reasons when given a 32bit int.)

> Bugs can also exist in the specifications themselves, of course.

The most important class of bugs indeed.

> I'm not sure what "and human model first behavior" means, but if you
> mean that the semantics of languages should strive to match what a human
> would naturally expect, avoiding surprising or unintuitive behavior, I
> certainly agree.

That is exactly what I mean. The problem is that ultimately, to do useful things, one has to support things that violate that in very terrible ways (like supporting IEEE floating point, disabling garbage collection, etc).

> I consider Standard ML, and some subsets of Scheme and
> Lisp, to be such languages

They certainly do have wonderful properties, but I wouldn't say they qualify as matching the human model first behavior requirement (easy to verify by doing the 50 non-programmers first hand test). Ask a room full of non-programmers to use scheme to write a standard web CRUD app using an SQLite3 database backend, or do XML parsing, or generate a pdf/excel file. It doesn't look pretty, and most just give up on the offer of free money after a bit.

> If I understand correctly, what you don't expect to happen has already
> been done. CakeML is free software, and formally proven correct all the
> way down to the machine code. Moreover, it implements a language with
> an exceptionally clear semantics and no undefined behavior.

I don't deny that the CakeML team did an excellent job on their formally verified backend and their type inferencer. But most humans are not programmers, and for most programmers, programming by proof isn't in the realm of the intuitive.

> Anyway, you've made it fairly clear that you're not interested in this
> line of work, and that's fine.

It isn't so much that I am not interested, but rather that it is lower on my priorities.

> I appreciate the work you're doing
> nonetheless.

As I appreciate the work you do as well.
-Jeremiah
Re: Trustworthiness of build farms (was Re: CDN performance)
[...]'ll quickly discover there is no need to keep very much in one's head at all. Its real weakness is its type system and lack of runtime.

One must remember that a language's greatest strength is also its greatest weakness. Every desired feature must be paid for, usually in regards to other desired features.

Performance, correctness and ease of development. Pick one (the Industry picked Performance).

Things that should have been done differently if correctness was the goal:
Hardware support for typed memory
Hardware support for garbage collection
Hardware support protecting against memory corruption (ECC, rowhammer resistance circuits and SRAM instead of DRAM)
Barrel processor architectures (No risk from Spectre, Meltdown or Raiden)
Hardware Capability tagging of processes
Flatten virtualization down to user processes.
Stack space and Heap Space different from each other and Program space.
At least a single attempt in the last 60 years from any military on the planet to deal with the risks written in the Nexus Intruder Report published in 1958.

I could spend literal weeks ranting and raving about how modern hardware makes correctness impossible for all but the trivial, or for things so isolated from the hardware that performance makes them a non-starter for anything but journal articles which are never read and are forgotten within a generation.
-Jeremiah
Re: Trustworthiness of build farms (was Re: CDN performance)
> > Do you know where one can obtain a copy of this report? I did an
> > Internet search but couldn't find anything.
> me too
> Jeremiah: sorry if I insist (last time, promised!) but could you give us
> some more info about that report?

I am sorry for the delay; the Government shutdown really disabled access for me in regards to the archives in which it was found.

As I am currently unable to link that resource, I'll do my best to provide the key points. It was a top secret report for the Department of Defense, written in 1958 and declassified by the Clinton Administration.

1) Computers are being used to replace human thinking, and as computers are growing faster and faster in complexity, there is going to be a point in the future where computers will be required to design computers. It references back to a 1952 paper about lithography (that I couldn't find) and argues that it is likely that chips will replace single piece logic and thus provide the ultimate place for hiding malicious functionality.

2) It is possible to infect the software used in the designing of Computers on elements common to all computers, which will alter the circuits to provide weaknesses we can exploit and/or functionality to leverage that the computer designer, builder and owner do not know about.

3) If done on a large enough machine, there is room to include infectors for tools such as assemblers, linkers, loaders and compilers on functionality that can not be removed.

4) It then details how they could backdoor the Strela computer and how it could be leveraged to compromise future Soviet computers to ensure a permanent weapon against the Soviet Union.

5) Then it has a huge section of blacked out text.

6) Then a section of possible future hooks depending on how software evolves in the Soviet Union, thus allowing more pervasive hardware compromises and eliminating the possibility of trustworthy computing ever becoming possible on Soviet Computers.

7) Another big blacked out section.

8) Then the final section detailed a list of steps required for a lithography plant to be assembled by an Intelligence Agency to prevent their own infrastructure from being compromised by a similar Soviet attack, with an estimated spinup time of almost a Decade.

Examples included:
Running traces close to the transistors to create radio induced functionality, such as the intentional leaking of crypto secrets upon receiving a very specific frequency.
Allowing magic numbers in a set of memory addresses or registers to cause functionality to be engaged, such as disabling protections or giving a process privileges that would normally be restricted for security reasons.

I'm sorry, as I am likely missing a lot of the details and attacks. Once the Shutdown is done, I'll try again to find that paper for you.
-Jeremiah
RE: Creating a reliable bootstrap for building from source
> Thinking about it, what I want to achieve is that we can take the
> latest git tree and bootstrap by building guix and packages. This
> should be easy, since I have guix running, but it is not. And the main
> trouble is that the underlying build packages can differ over time. I
> am looking at gcc versions and guile versions. I.e., we are building
> on shifting sands. How unguixy!

It is worse than that: Janneke and I are still trying to build out a full source bootstrap.

Now mind you, we have gotten quite a bit down that rabbit hole. (I've built from a hex monitor to a Lexically scoped garbage compacting/collecting lisp, and Janneke built his rather impressive MES, which already supports large parts of the C language and enough to bootstrap some rather important pieces.)

But there still are many gaps left to close (how to bootstrap a 280 byte hex monitor without a hex monitor or hex assembler, stage0-vm downstrapping, MES tinycc bootstrapping, MES lisp bootstrapping, etc), and ultimately shifting sands are the only grounds we can be certain will be there. So we had better get comfortable minimizing our assumptions.
-Jeremiah
Re: Creating a reliable bootstrap for building from source
> What you are trying to do is even more heroic - bootstrapping all of
> guix :)

The real heroes are those that check our work, report bugs and give us insight into solving our most difficult problems. And we LOVE patches and pull requests ;-D

> What I want is achievable, simply have the build system as a tested
> binary available. A tested builder package that is known to work
> against the current tree. We pretty much have it, it is only not
> formalized and tested on the build farm

If what you want is achievable and going to help, then please do it. No need to ask, we always love people who take time to make things better.

-Jeremiah
Missed testing
I know this probably is not a popular premise, but we really need to take the time to actually test our example configurations prior to including them in our releases.

For example, if one were to go to the guixsd website, download the current release, verify that it was correct, burn it onto a DVD and attempt to install with

guix system init /etc/configuration/desktop.scm /mnt --fallback

the result is the following error:

/gnu/store/729zbb84cah3wf2fcsy4h17lqxxib5q-configuration-templates/desktop.scm:23:9: error: you may need these modules in the initrd for /dev/sda1: mptspi
hint: Try adding them to the 'initrd-modules' field of your 'operating-system' declaration, along these lines:

     (operating-system
       ;; ...
       (initrd-modules (append (list "mptspi")
                               %base-initrd-modules)))

If you think this diagnostic is inaccurate, use the '--skip-checks' option of 'guix system'

So if I copy the code into the file, it stops recognizing users; but if I pass --skip-checks, the system installs but boots to a guile repl.

It takes a bit to find that ,help works and then ,bournish, only to discover there is no readline (so I have to type everything by hand every time) and no tab completion. Which would have been fine if less/more were available, or if pipes (|) worked, or if cryptsetup were in the path so I wouldn't have to type the following line:

/gnu/store/slpv4rzcmf6lfzzjlhm4d3r1pkb2cx00-cryptsetup-static-1.7.5/sbin/cryptsetup

Then I discover /dev/sda1 doesn't even exist!!!

There is no documentation on how to mount and boot, let alone how to get shepherd to prompt for credentials for the luks volume...

If nothing else, we either need to include in the documentation how to mount a luks volume and resume boot, or ensure it works every time.

-Jeremiah
Re: Missed testing
> This depends on your hardware and the modules that the kernel loaded in
> response upon booting. There is no way to have a static resource as the
> example configuration reflect the modules that can be automatically
> loaded by the kernel on all hardware configurations out there.

Ok, that is fine. Now why isn't there commented out code in the example, with comments saying that?

Still not addressed is why the users section stops being defined when one copies and pastes that example text into the configuration. Nor the fact that luks boot with that example configuration never prompts for the luks password and just goes to a very unhappy place, dropping the user into a guile shell to sort things out; and we lack documentation on how to deal with that case.

Users are going to hit edge cases; when we write them, we really don't want the users to have to read 100,000+ lines of code to try to figure out how to deal with them.

-Jeremiah
Re: Missed testing
> Also, that doesn't help on initial installation which should be made
> much more user-friendly.

Fault tolerance is far more important than user-friendliness, because a reliable system is far easier to make user-friendly than a user-friendly system is to make fault tolerant.

> That sounds very strange and would be a very bad bug.

It is a very easy bug to reproduce: simply copy the text and paste it into the example config above the user field.

> I'm using luks home with current guix master and it prompts for my
> password.

Here is the complete procedure I followed to hit the bug:

# Steps for creating a guix vm image using qemu and guix bootstrap Image
GUIX_VERSION=0.16.0

# Step 0 get, verify and unpack guix bootstrap image
wget "https://alpha.gnu.org/gnu/guix/guixsd-install-$GUIX_VERSION.x86_64-linux.iso.xz"
wget "https://alpha.gnu.org/gnu/guix/guixsd-install-$GUIX_VERSION.x86_64-linux.iso.xz.sig"
gpg --verify "guixsd-install-$GUIX_VERSION.x86_64-linux.iso.xz.sig"
unxz -k "guixsd-install-$GUIX_VERSION.x86_64-linux.iso.xz"

# Step 1 create and start a vm disk image of appropriate format and size
qemu-img create -f qcow2 prototype.qcow2 20G
# start qemu
qemu-system-x86_64 -m 1024 -smp 1 -boot menu=on -enable-kvm -drive file=prototype.qcow2 -drive file="guixsd-install-$GUIX_VERSION.x86_64-linux.iso"

# Step 2 setup disk partitions
# Format virtual drive to have 1 large primary partition and mark it as
# bootable
echo -e "o\nn\np\n1\n\n\na\nw" | fdisk /dev/sda

# Setup encrypted volume
cryptsetup -v --cipher aes-xts-plain64 --key-size 512 --hash sha512 --iter-time 5 --use-random --verify-passphrase luksFormat /dev/sda1
# or if that takes too long to type:
cryptsetup -v -c aes-xts-plain64 -s 512 -h sha512 -i 5 --use-random -y luksFormat /dev/sda1
cryptsetup open /dev/sda1 root

# Format drive to allow its use
mkfs.ext4 /dev/mapper/root
# Label the volume for guix
e2label /dev/mapper/root root
# Mount the drive
mount /dev/mapper/root /mnt

# Step 3 setup network for download of packages and source code
# turn on networking
# vmware:: eno1636
ifconfig ens3 up
dhclient ens3

# Step 4 add tools required to make setup easier
# Set the default storage space for the setup on the drive itself
herd start cow-store /mnt/

# Step 5 replace the uuid with "/dev/sda1" and set bootloader to grub-bootloader
zile /etc/configuration/desktop.scm

# Step 6 Apply the configuration to the disk
guix system init /etc/configuration/desktop.scm /mnt --fallback

Please note the important difference: the entire drive is fully encrypted (even grub will prompt for the password to decrypt /boot).

> The installer can and should be made to automatically amend the system
> config by mptspi etc.

To the examples, that would be fine, but I have concerns about guix silently fixing configuration files.

-Jeremiah
M2-Planet 1.2.0 and mescc-tools 0.6.0 releases
With such wonderful enhancements for mescc-tools such as:

** Added
Added template ELF headers for ARM
Added initial support for ARM
Added official hex0 seed for AMD64
Added official hex1 seed for AMD64
Added support for
Added catm NASM prototype to simplify build
Added catm M1 prototype to reduce bootstrap dependency
Added catm hex0 prototype to eliminate bootstrap dependencies down to hex0
Added M0 NASM prototype to simplify build
Added M0 M1 prototype to reduce bootstrap dependency
Added M0 hex2 prototype to eliminate bootstrap dependencies down to hex2
Verified ARM port to support M2-Planet

** Changed
Updated build.sh and kaem.run to the current mescc-tools syntax
Reduced get_machine's build dependencies
Cleaned up x86 elf headers
Removed kaem's dependence on getopt
Replaced --Architecture with --architecture
changed get_machine's default output to filter machine names into known families
Reduced M1 null padding of strings to a single null for all architectures except Knight
Updated AMD64 bootstrap kaem.run to include steps from hex0 to M0

** Fixed
Fixed broken test9 thanks to janneke
Fixed wrong displacement calculations for ARM immediates
Fixed typo in license header
Fixed kaem.run to actually function and produce identical results
Fixed regression caused by linux 4.17
Removed false newline added in numerate_number for zero case
Fixed broken bootstrap script

** Removed
Removed final dependency on getopt
Removed need to know architecture numbers as that was a bad idea

and with the complete port of M2-Planet to ARM:

** Added
Added 24/24 working tests for armv7l
Port to ARMv7l and ARMv6l both work

** Changed
ELF-code segment now writable for ARMv7l without debug
Updated from mescc-tools from 0.5.2 to 0.6 (with changes in checksums due to alternate null padding)

** Fixed
Fixed unsigned division in ARMv7l port
Fixed non-uniform behavior across locales and *BSDs
Fixed broken stack in ARMv7l thanks to dd

-Jeremiah
M2-Planet v1.6.0 and mescc-tools v1.0.0 released
https://github.com/oriansj/M2-Planet https://savannah.nongnu.org/projects/mescc-tools A K+R C equivalent C compiler, assembler, linker, dwarf stub generator and shell able to produce fully standards compliant ELF binaries for Knight, x86, AMD64, armv7l and aarch64 and be bootstrapped from a sub 250byte hex0 hex assembler and a 737byte shell -Jeremiah
RE: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
> I think that's what mes-m2 rewrite [1] (not to be confused with mes wip-m2 > branch) is trying to achieve. My fault for that confusion. Wish I was faster at implementing syntax-case in C -_- > Outside of Guix we are working on bootstrap that does not depend on guile > driver and is driven only by hex-0 seed (357 bytes) kaem-optional-seed (737 > bytes) and any POSIX kernel. We love it ^_^ > At the moment it goes all the way up to Mes (tcc is now in progress). Eternal progress Oh and we are currently joking about replacing mes.c with a scheme written in Haskell because we bootstrapped a minimal Haskell too. https://github.com/oriansj/blynn-compiler/ Then the loop would be: a scheme interpreter written in Haskell running a C compiler written in scheme that can build the Haskell compiler able to build the original scheme interpreter. If we get it to enough guile compatibility; then it becomes: once you have Gnu Mes, you are already bootstrapped. ^_^ - Jeremiah
RE: [bootstrappable] Re: [Tinycc-devel] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
> If so, is libc malloc supposed to ensure alignment of allocated memory? > According to https://man7.org/linux/man-pages/man3/malloc.3.html yes. > @Janneke: So our mes libc malloc should be aligning the stuff--but it's not > doing it. So it's a bug in our libc. Looks like you'll have to waste 3.7bytes on average per malloc to always pad to the 8byte boundary. -Jeremiah
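The padding arithmetic can be sketched as follows. This is a minimal illustration in Python, not code from mes libc; the names align_up and padding are made up for the example. The average computed at the end assumes request sizes are spread evenly over every remainder modulo 8, which real allocation patterns need not follow, so a measured average can differ from it.

```python
def align_up(size: int, boundary: int = 8) -> int:
    """Round size up to the next multiple of boundary (a power of two)."""
    return (size + boundary - 1) & ~(boundary - 1)


def padding(size: int, boundary: int = 8) -> int:
    """Bytes wasted when a request of `size` bytes is padded to `boundary`."""
    return align_up(size, boundary) - size


# Average waste under the (assumed) uniform spread of sizes 1..8 modulo 8.
uniform_average = sum(padding(n) for n in range(1, 9)) / 8
```

An allocator that always hands out align_up(size) bytes, starting from an aligned base address, returns 8-byte aligned pointers at the cost of the padding shown.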
RE: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
> I see a fourth option, which is to keep both. :-) Might want to fix up the confusing naming in that case > In effect, it seems there are now two diverging projects. I think that’s > fine: more bootstrapping work and more diversity is better! Converging actually as they share the exact same goal of bootstrap from nothing and run MesCC > For Guix, the Scheme-based approach Janneke et al. have been pursing remains > the most attractive. And would be faster if MesCC running on guile was used as the lone bootstrap seed. > At any rate, work on Haskell will probably benefit Guix (and other distros I > guess!) to have a fully built-from-source Haskell platform. Indeed -Jeremiah
RE: [bootstrappable] ARM Unified Assembly Language - GNU as does some weird stuff
> (1) b #60
> It seems that GNU as ignores the immediate entirely and just always encodes
> #0 (to test, do ".syntax unified" and then "b #60" in GNU as). WTF?
> Likewise with bl, blx.

All assemblers (except M1) do that, because the linker is expected to populate that value at link time, using the symbol table, to allow code segment relocation.

> (2) push #4
> It works in GNU as--but is it specified by ARM to push the register r2 ?
> I think exposing ISA implementation details like that is a leaky
> abstraction--and no good can come from it.
> Likewise with pop, stm*, ldm*.

Well, ARM doesn't actually have push/pop instructions, only load and store instructions.

> GNU as fails to assemble these.

And no one cared enough to fix it for 30 years? Is it possible something was missed?

> (4) lsl r1, #4, #2
> GNU as encodes exactly the same as "lsl r1, #4"--drops the "#2" silently.

4 << 2 is 16. Log2(16) == 4; sounds about right

-Jeremiah
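To illustrate why the branch field is only meaningful once addresses are known, here is a small Python sketch of how an unconditional A32 B/BL word can be computed from the instruction address and the target address. The function name is invented for the example, it covers only the always-executed B and BL forms (not blx), and it is not taken from GNU as, M1 or hex2.

```python
def encode_arm_branch(instr_addr: int, target_addr: int, link: bool = False) -> int:
    """Encode an unconditional A32 B (or BL) instruction as a 32-bit word."""
    # In ARM state the PC reads as the instruction address plus 8.
    offset = target_addr - (instr_addr + 8)
    if offset % 4 != 0:
        raise ValueError("A32 branch targets must be word aligned")
    imm24 = offset >> 2
    if not -(1 << 23) <= imm24 < (1 << 23):
        raise ValueError("branch target out of range for a 24-bit field")
    # Condition AL (0xE), then 0b101, then the link bit.
    opcode = 0xEB000000 if link else 0xEA000000
    return opcode | (imm24 & 0x00FFFFFF)
```

Because the result depends on both addresses, a tool that does not yet know the final layout can only leave the 24-bit field as a placeholder for a later stage to fill in.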
RE: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
>>> In effect, it seems there are now two diverging projects. I think that’s >>> fine: more bootstrapping work and more diversity is better! >> Converging actually as they share the exact same goal of bootstrap >> from nothing and run MesCC > My understanding is that they are nevertheless two different development > branches; is that right? There was a lot of cross-pollination between them during the original M2-Planet build attempt on Mes.c >>> For Guix, the Scheme-based approach Janneke et al. have been pursing >>> remains the most attractive. >> And would be faster if MesCC running on guile was used as the lone bootstrap >> seed. > What do you mean? Faster in what sense? Guile is a faster scheme than mes.c -Jeremiah
RE: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'
>>> Using this post as inspiration I replaced diffutils-mesboot with >>> gash-utils-boot. diffutils-mesboot provided cmp and diff, both of >>> which are available in gash-utils. > It would be better if gash-utils worked with mes though. Now they all run in > guile. > In live-bootstrap project (bootstrapping from hex0 with just kaem driver) we > had to skip gash and all gash-utils. Fortunately we now mostly got relevant > GNU utils. Hence the overly ambitious goal of mes-m2 (Which might end up being a dead end) and a proper scheme written in the Haskell Subset supported by blynn-compiler which has been bootstrapped. We have to think long term here, because we are going to have to support the bootstrap forever. And porting to new architectures and Operating Systems is going to be something we will have to deal with. - Jeremiah
RE: Mes 0.14 released
> I am pleased to announce the release of Mes 0.14, representing 98 commits > over 4 weeks. Mes+MesCC now compiles a self-hosting TinyCC that has only > been slightly patched. > This means that we can now build a tcc that depends only on a 1MB ASCII M1 > seed. GuixSD currently uses a ~250MB binary seed to build gcc. > Next targets are: build gcc using this almost full-source bootstrapped tcc, > and reduce the 1MB ASCII M1 seed to ~100KB of M2 source, which is a > restricted subset of C. > Packages are available from Guix's wip-bootstrap branch. Amazing work as always Janneke - Jeremiah Orians
RE: [bootstrappable] Mes 0.15 released
> I am pleased to announce the release of Mes 0.15, representing 45 commits > over 3 weeks. The GNU toolchain is getting bootstrapped! Great work as always Janneke -Jeremiah
M2-Planet latest release
Today I am proud to announce M2-Planet version 0.2.0
https://github.com/oriansj/M2-Planet

The world's simplest C compiler with support for:
structs with sizeof support
anonymous unions (inside of structs)
arrays
Inline assembly
Gotos
for, while and do loops with optional breaks
bitshifting
bitwise operations
escaped strings
Passable function pointers

Written and self-hosting in a lovely C99 subset
optional dwarf footers (thanks to mescc-tools blood-elf) allowing for objdump and gdb to play nicely
and 100% deterministic output

Able to be bootstrapped from a trivial Macro-assembler and hex2-linker (which when hand made are under 3KB total) which can be found here:
https://github.com/oriansj/mescc-tools
or via any C compiler that supports only 60% of the features of M2-Planet
RE: [bootstrappable] Re: M2-Planet latest release
> Looks nice! How do it compare to tcc? (http://www.tinycc.org/)

Well at only 1,607 lines of code:
an order of magnitude smaller
fewer dependencies
simpler build
less complete support for c99
No optimization phase
No preprocessor (nor need for one)
Doesn't support // line comments

Generally what you'd expect for a compiler optimized for bootstrapping bigger compilers

-Jeremiah
Cell phone: (517) 896-2948
On Signal and Riot
Re: bootstrap integration strategies
> I think that's the main difficulty. I think we'd rather not have
> separate bootstrap paths for Intel GNU/Linux on one hand, and everything
> else on the other hand.

Well, due to the design of mescc-tools, the bootstrap paths only have to be divergent up to the M1-macro level. After that, we could simply use flags to make the source work on different platforms.

> Yet, we know that porting what you already did on x86-linux-gnu to
> GNU/Hurd and ARMv7 and AArch64 etc. is going to be a lot of non-trivial
> work (especially since historical versions of the GNU toolchain did not
> support AArch64, for instance.)

Nor RISC-V, but that is likely to be a much bigger issue in terms of bootstrapping.

> Waiting for this to be "solved" (and we don't even know how) would
> equate to a status quo. But obviously, it'd be sad to have all this
> work already done on Intel and not be able to benefit from it.

Actually, the work for the stage0 bootstrap steps has already been done on non-x86 hardware (the Knight platform, to be precise), and the engineering decisions involved were explicitly selected to minimize porting and cross-platform bootstrapping effort. M1-macro and hex2-linker only need flags to be set to build for all of the different supported platforms.

> So perhaps we'll have to get over it and have a different bootstrap path
> on x86-linux-gnu.

A multiway bootstrap path that exceeds the requirements of DDC, actually.

> (BTW, I suspect we can get away with using 32-bit bootstrap binaries on
> both i686/x86_64 and armv7/aarch64, no?)

For AMD64, absolutely; with ARM, however, I am not familiar enough to say.

> Gash seems to be a low-hanging fruit and a relatively easy thing,
> because it's architecture-independent. How
> far is it from being able to run typical 'configure' scripts?
Well, we would have to replace the parser at a bare minimum.

> I think the day it's able to run 'configure' scripts, we can switch to
> it right away without further ado, and then incrementally improve it as
> we stumble upon limitations and bugs.

Well, we only need Gash to get to the build make and bash level; after that its scope can be limited. In theory, someone could hand replace the make build script with a custom version that gash can use right now, instead of us enhancing gash.

> There's also another option you didn't mention: ditching the 2.0
> bootstrap Guile in favor of Mes. That can be done in several steps:
> 1. Replace the guile-2.0.*.xz binary tarballs with Mes, and add a step
> that builds Guile 2.x using our big bootstrap GCC binary.

Slow but possible

> 2. Same, but build Guile 2.x, libgc, etc. using MesCC.

MesCC can't directly build Guile yet, but I do enjoy that ambition ;-)

> This could allow us to remove quite a lot of MiBs from our binary seeds.

FTFY

At this point, we effectively have a rope bridge to full bootstrappability. But we still have a lot of details to hammer out, like:
getting basic ARM support and having the ARM and x86 binaries verify each other's bootstrap;
finding 6502, z80, 8051, 68K, VAX, pdp11, Alpha, MIPS, SPARC and PowerPC/Power developer(s) to do stage0 work for their platforms and perform the cross verify steps;
hammering out cross-platform build details for MesCC and M2-Planet.

Jeremiah Orians
RE: bootstrap integration strategies
> Sounds nice. I wonder if Jan was referring to something else then?

Probably alternate operating systems like Hurd is my guess, but I'm probably wrong.

> There’s still the question of GNU/Hurd, though, which requires a vastly
> different libc.

Fortunately Janneke has done a good job making that selectable.

> So far the initial ports of Guix to non-x86 were done through
> cross-compilation (info "(guix) Porting"). So in a way, the binary
> seeds for these platforms were built from source; we just “cut” the
> source-to-binary connection by making those binaries the root of the
> dependency graph on these platforms.

Thank you for clarifying

> Maybe that’s something we’ll have to live with on new architectures.

Unless you want to make qemu a root dependency

> So, problem solved? Or am I missing something? :-)

Mescc-tools can build valid binaries for all instruction sets with sane immediate representations (RISC-V is the only exception here; hopefully they fix that). But definition files need to be written and tests generated.

> I think the ‘wip-bootstrap’ branch does not use M1 at this point, does it?

M1 and Hex2 are core pieces of mescc-tools and are required for MesCC to produce binaries, as MesCC outputs M1-macro files (the .S files).

> I wonder what it would take to fix that. After all, compiling libguile
> must not be much harder than compiling tcc, no?

Janneke knows far better than me on this one

> One thing at a time. :-)

But this is lots of fun :D

> IMO what matters most at this point is to come up with a plan that allow
> us to incrementally reduce the size of our binary seeds. A port of
> M1/stage0 to Z80 can wait. ;-) So we really need a list of actionable
> items in the short term to start taking advantage of all the work that’s

Ok, how does this sound:

We walk the bootstrap binaries towards MesCC (this will add more bootstrap binaries in the short term). Then we eliminate them one at a time until guile and mescc-tools are the only binaries that remain.

Mescc is a scheme program which is interpreted by either guile or mes.c, and thus could leverage guile until mes.c is in a state where it can replace guile for: guix, mescc and GASH.

Thus the making of the bootstrap binaries can follow 2 possible paths:
1) Fast, via standard C compilers: just build mes.c and mescc-tools and be done.
2) Platform specific Stage0, which starts with hex0 (~200B) -> hex1 (~500B) -> hex2 (~1KB) -> M0 (~2KB) -> [Everything here on is platform neutral] -> M2-Planet (~16KB) -> mes.c + mescc-tools (M1 and Hex2)

The best part is that all of the binary seeds of all of the platforms will be able to build the binary seeds for all of the other platforms with bit for bit identical results (which eliminates hardware based Trusting Trust attacks avoiding detection).

Jeremiah Orians
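The cross-verification idea can be sketched in a few lines of Python: every builder produces what should be the same seed, and any disagreement shows up as more than one digest. The function names and the builder labels in the usage note are illustrative assumptions, not part of stage0 or mescc-tools.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def cross_verify(builds: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group builder names by the digest of the binary each one produced."""
    groups = defaultdict(list)
    for builder, binary in builds.items():
        groups[sha256_hex(binary)].append(builder)
    return dict(groups)


def all_agree(builds: Dict[str, bytes]) -> bool:
    """True only when at least one build exists and all digests match."""
    return len(cross_verify(builds)) == 1
```

For example, calling all_agree with the same seed bytes built on "x86", "armv7l" and "knight" returns True only if the three outputs are bit for bit identical; a builder that lands in a group of its own is the one to investigate.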
RE: bootstrap integration strategies
> I agree. We need to make sure, though, that the Guix build infrastructure
> doesn’t add more complicated packages to the environment that are not needed.

Especially since those are the varieties that no one wants to be responsible for maintaining.

> Right. We would need to cut out Guile on the build side.

Which would significantly increase system resource requirements, unless we make this a 1 time only cost sort of thing.

> I would love to take a closer look again before merging it.
> Unfortunately, these days I’m a bit short on time as I’m on “vacation”
> with other plans imposed on my schedule.

Ricardo, we love you dearly but please, for the love of all that is holy: get back to that vacation! *cracks whip*

Burnout is a real thing, and believe me when I say bootstrapping is a marathon.

Jeremiah Orians
RE: [rb-general] A major milestone in bootstrapping
> Interesting... I'm looking at
> https://github.com/oriansj/M2-Planet/blob/master/seed.M1
> How was it written? It seems like a monumental task to write all that and
> keep enough context in one's head!
> Then again, I have never written
> assembly before...

If you'll notice
https://github.com/oriansj/M2-Planet/blob/master/seed.M1#L2
It explicitly says it was generated from stage0
https://savannah.nongnu.org/projects/stage0/

Specifically cc_x86
http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage2/cc_x86.s
Which was built by M0:
http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage1/M0-macro.hex2
Which was built by hex2:
http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage1/stage1_assembler-2.hex1
Which was built by hex1:
http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage1/stage1_assembler-1.hex0
Which was built by hex0:
http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage1/stage1_assembler-0.hex0
Which was the 250byte seed used

Well, the work started back in 2016 with
http://git.savannah.nongnu.org/cgit/stage0.git/tree/Linux%20Bootstrap/hex0.s
and
http://git.savannah.nongnu.org/cgit/stage0.git/tree/Linux%20Bootstrap/hex0.hex

It was written one function at a time, with the arguments passed in registers and careful preservation of everything passed.

- Jeremiah Orians
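For readers who have not seen the bottom of that chain, here is a rough Python model of what a hex0-style assembler does. It assumes the conventions used in the stage0 hex0 sources (pairs of hex digits become bytes, '#' and ';' start a comment that runs to the end of the line, everything else is ignored). It is a sketch for illustration, not the seed itself, and raising an error on a dangling digit is a choice made here rather than documented hex0 behaviour.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def hex0(source: str) -> bytes:
    """Translate commented hex text into the raw bytes it describes."""
    out = bytearray()
    high = None  # first nibble of the byte being assembled, if any
    for line in source.splitlines():
        for ch in line:
            if ch in "#;":
                break  # the rest of the line is a comment
            if ch in HEX_DIGITS:
                nibble = int(ch, 16)
                if high is None:
                    high = nibble
                else:
                    out.append((high << 4) | nibble)
                    high = None
            # any other character (spaces, tabs, punctuation) is skipped
    if high is not None:
        raise ValueError("odd number of hex digits in input")
    return bytes(out)
```

Feeding it a line such as "7F 45 4C 46 # ELF magic" yields the four bytes of the ELF magic number, which is the sense in which a hex0 source file and the binary it produces carry the same information.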
RE: [bootstrappable] Re: GNU Mes 0.18 released
> What is M2 Planet? My project https://github.com/oriansj/M2-Planet That is both written in Assembly: http://git.savannah.nongnu.org/cgit/stage0.git/tree/stage2/cc_x86.s (That can be bootstrapped from a hex0 monitor) And in C (That it can self-host from) That way you can build it with any C compiler you like or bootstrap it from stage0 https://savannah.nongnu.org/projects/stage0/ -Jeremiah
RE: [bootstrappable] prototyping the full source bootstrap path
> Now that MesCC starts to build TinyCC that starts to pass a large set of the > mescc C tests, it's time to get walking the bootstrap path. > Attached*) is my initial attempt for the full source bootstrap path in > GuixSD; to try it, do Very nicely done Janneke >The starting point is Jeremiah Orian's stage0 self hosting hex assembler. The >binary seed of our bootstrap is made explicit in an additional source > download: stage0-seed (of ~400 bytes). This binary that is identical with > it's ASCII source can be considered "source". :D > There are still many gaps in our full source bootstrap path to-be, I "filled" > these by adding additional binary seeds: mescc-tools-seed and > mes-seed. We are working to remove these, that will take some time. I'll try to schedule some time to hammer on these
RE: [bootstrappable] Re: prototyping the full source bootstrap path
> Yeah, the mean reason to do it in Guix packages is that it becomes impossible > to cheat. However, coding the bootstrap path in Guix > means that we depend on some form of Guile...hmm. Easy to break, simply allow each piece to be able to be built using only a trivial shell script
RE: [bootstrappable] Re: prototyping the full source bootstrap path
> It wouldn’t really help in that mescc+/guilecc is just as capable as the > earlier mescc, no? There is however a real difference in terms of performance, guile is simply faster > Indeed, Guile needs a C compiler. Technically, it could be built from a lisp compiler > In general, we need a C compiler early on… unless we have replacements for > Bash Rain1 is working on that > Coreutils, etc. written in Guile or Mes, which would allow us to strip bits > of the tip of the DAG. Actually we only need a core subset of coreutils that mescc can compile, which ironically is a small job should one use openbsd's core as a base Essentially build from the stage0-seed to mescc, build the essentials only, then simply add pieces as we need them. No need to make this any more complex than it already is.
RE: [bootstrappable] Re: prototyping the full source bootstrap path
> Plus there is another angle on this. MesCC, the bootstrap C compiler in > Scheme, is not a intended to be used beyond bootstrapping. And probably will lose features over time not directly related to the act of bootstrapping itself > A C compiler on top of Guile however, could be a very interesting project and > could easily target gcc; possibly attempt C++. Well mescc is well on its way to that > I may just be dreaming... I hope not > Hmm, it's my understanding that Guile is pretty heavily tied to libguile/*.c. > What makes you think that it's possible for Guile to run > without libguile/*.c? https://wingolog.org/archives/2016/01/11/the-half-strap-self-hosting-and-guile Specifically "The bootstrap C interpreter in libguile loads the Scheme compiler and builds eval.go from eval.scm" Thus by simply having a scheme compiler able to compile eval.scm, we can skip the libguile/*.c Assuming I interpreted that situation correctly
RE: [bootstrappable] Re: prototyping the full source bootstrap path
> Jan is correct that Guile is still heavily tied to its C code. It's true > that Guile's compiler is written in Scheme and that > the C evaluator is used only during bootstrapping, but the C bootstrap > evaluator is only a small piece of libguile. > The majority of libguile is still needed. Notably, the entire runtime, the > VM, and implementations of many data > structures and other libraries are written in C. Thank you for correcting my mistaken assumption.
RE: GNU Mes 0.24 released
>> The common objection is: "you're building from source but you're not
>> gonna audit all that source code anyway, so why bother?" I think it's
>> akin to security by obscurity. That we collectively can and do fiddle
>> with all this code makes a practical difference; that this is all
>> transparent means that backdoors become harder to hide.

Well, everything from the root binaries to GNU Mes (along with the extras such as sha256sum, ungz and untar), if printed on single sided paper at size 12 font, would be only 171 pages. So not that hard after all. After that, you can leverage sha256sums and chains of trust to do the rest.

> I saw a project a while ago with an interesting approach that looks very
> interesting for tackling this problem: crowd-sourced, social code
> review:
> https://github.com/crev-dev/crev

Looks interesting

-Jeremiah
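As a small illustration of the "leverage sha256sums" step, the following Python sketch checks a set of artifacts against a manifest in the "digest  name" layout that sha256sum prints. The function name and the in-memory dictionary of artifacts are assumptions made to keep the example self-contained; a real check would read the files from disk and would obtain the manifest from someone you already trust.

```python
import hashlib
from typing import Dict


def verify_manifest(manifest_text: str, artifacts: Dict[str, bytes]) -> Dict[str, bool]:
    """Return {name: True/False} for every entry listed in the manifest."""
    results = {}
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum prefixes binary-mode names with '*'
        data = artifacts.get(name)
        # A missing artifact counts as a failed check rather than an error.
        results[name] = (
            data is not None
            and hashlib.sha256(data).hexdigest() == expected.lower()
        )
    return results
```

Anything reported as False either changed after the manifest was written or was never supplied, which is exactly the signal a chain of trust is meant to surface.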
Announcing mescc-tools-seed v1.0
I am pleased to announce version 1.0 of mescc-tools-seed https://github.com/oriansj/mescc-tools-seed For those not familiar it is the full bootstrap of a cross-platform C compiler (written in C) from hex https://github.com/oriansj/M2-Planet https://savannah.nongnu.org/projects/mescc-tools and once this piece is done we will have a full bootstrap from hex to GCC https://github.com/oriansj/mes-m2 -Jeremiah
RE: [bootstrappable] GNU Mes 0.22 released
> We are pleased to announce the release of GNU Mes 0.22, representing > 57 commits over 8 weeks. Great job as always Janneke, stage0's 0.3.0 release last week was far less impressive. -Jeremiah
mescc-tools v0.7.0 released
Today I am proud to announce the release of mescc-tools v0.7.0 https://savannah.nongnu.org/projects/mescc-tools https://github.com/oriansj/mescc-tools Featuring AArch64 support, fixes for outstanding ARMv7l bugs and the elimination of all segfaults found via fuzzing and static code analysis Not to mention: major enhancements to kaem thanks to fosslinux Reproducible friendly tarball generation thanks to Janneke Andrius Štikonas fixing a lot of my typos -Jeremiah
Stage0 Release 0.4.0, M2-Planet Release 1.5.0 and mescc-tools-seed Release 1.2
Today I am pleased to announce the following releases:

https://savannah.nongnu.org/projects/stage0/
https://github.com/oriansj/stage0
In stage0 we have gained hand written (in assembly) C compilers for:
AMD64 :: cc_amd64.s
Knight :: cc_knight-native.s and cc_knight-posix.s
ARMv7l :: cc_armv7l.s
AArch64 :: cc_aarch64.s
Along with C High level prototypes for all of them
(All of the above was completed in a 4 hour speed run)

https://github.com/oriansj/M2-Planet
In M2-Planet we gained more posix primitives and, thanks to recent fuzzing, have dropped a considerable number of possible segfaults.
Knight-native gained support for large binaries (2GB)
Oh, and to bury the lede a bit: Deesix single handedly ported AArch64 support into M2-Planet! (up next RISC-V [32 and 64bit])

https://github.com/oriansj/mescc-tools-seed
In mescc-tools-seed we eliminated all binaries (not that there was much anyway)
So you'll have to clone https://github.com/oriansj/bootstrap-seeds if you want a generated 357byte hex0 binary

- Jeremiah
RE: Stage0 Release 0.4.0, M2-Planet Release 1.5.0 and mescc-tools-seed Release 1.2
> WOW! That is an impressive list.

And in 4 hours. Really ends the debate about how to bootstrap C compilers, in my book.

> Is there something I can play with on my 32-bit ppc machine to add to the
> list?

Well, if you are willing to do some testing, it is really trivial to get a new architecture into mescc-tools and M2-Planet (I can handle the conversion to cc_* rather quickly).

First you'll want to figure out how the architecture expresses immediates and offsets; hex2 is the place where those architecture-specific details are leveraged (https://github.com/oriansj/mescc-tools/blob/master/hex2_linker.c). rasm2 and gdb are extremely handy here. This will give you what you need to make the following work (powerpc32 can be whatever name you want, really):

    hex2 --architecture powerpc32 --{Big/Little}Endian -f elf-header.hex2 -f test.hex2 --exec_enable -o binary

which will generate working binaries for that architecture from hex2 sources.

Next you'll want to figure out how the bits of the instructions are actually laid out; M1 definitions are a quick way to get there (https://github.com/oriansj/mescc-tools/blob/master/M1-macro.c). This will give you what you need to make the following work:

    M1 --architecture powerpc32 --{Big/Little}Endian -f defs.M1 -f test.M1 -o test.hex2

The result can be passed to hex2 (with the previously created elf-header) to generate a working binary for that architecture.

After that, one needs to figure out which M1 defs are required. This should give you a simple example of what is needed:
https://github.com/oriansj/stage0/blob/master/stage2/High_level_prototypes/cc_x86/cc_core.c
It converts C code into M1 output, which at this point you know how to turn into a working binary.

After all of that, one simply takes the M1 strings in M2-Planet, which we know generate working code, takes a copy of cc_x86.s, replaces the strings, fixes the local and argument offsets (and changes a few type sizes for 64-bit targets), and you are done, with a working C compiler for powerpc32 written in assembly.

One trick: after you get the architecture into M2-Planet, have it build itself for your target instruction set and put that output into a file (I used foo1). Then cat all of the inputs used into another file (I used foo.c). Here is the command I used with knight-native:

    M1 --architecture knight-native --BigEndian -f High_level_prototypes/defs -f stage2/cc_knight-native.s -o scratch/cc_knight-native.hex2 &&
    hex2 --architecture knight-native --BigEndian -f scratch/cc_knight-native.hex2 -o scratch/cc_knight-native &&
    ./bin/vm --rom scratch/cc_knight-native --tape_01 foo.c --memory 10M &&
    meld foo1 tape_02

If you need further clarification, I am more than happy to help. Plus, there are some wonderful people on #bootstrappable who are able to help you work through the ugly details.

-Jeremiah
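For anyone following along, the M1 and hex2 steps described above can be sketched as a small helper that only builds the two command lines and does not run anything. This is a rough illustration, not part of mescc-tools: the function name `toolchain_commands` and the default file names are assumptions, and the only flags used are the ones that appear in the message. An architecture name such as powerpc32 would only be accepted by the real tools once support for it has been added.

```python
def toolchain_commands(arch, big_endian, defs="defs.M1", source="test.M1",
                       elf_header="elf-header.hex2", output="binary"):
    """Return the M1 and hex2 invocations as argv lists (nothing is executed)."""
    endian = "--BigEndian" if big_endian else "--LittleEndian"
    # M1 output feeds hex2, so derive the intermediate file name from the source
    hex2_file = source.rsplit(".", 1)[0] + ".hex2"
    m1 = ["M1", "--architecture", arch, endian,
          "-f", defs, "-f", source, "-o", hex2_file]
    hex2 = ["hex2", "--architecture", arch, endian,
            "-f", elf_header, "-f", hex2_file, "--exec_enable", "-o", output]
    return [m1, hex2]


if __name__ == "__main__":
    # Print the two commands for a hypothetical big-endian powerpc32 target
    for argv in toolchain_commands("powerpc32", big_endian=True):
        print(" ".join(argv))
```

Keeping the commands as argv lists makes it easy to hand them to `subprocess.run` later, once the architecture really exists in both tools.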
Re: [Proposal] The Formal Methods in GNU Guix Working Group
> I'm interested on this topic and I will try to help as much as I can.

Good.

> The original idea of Brett is very interesting. In my case I would do the
> base compiler implemented in C and using yacc (for example) to implement the
> grammar.
> But it won't make sense in a community like Guix where most people know
> Scheme rather than C/C++.
> So it may make sense to write a small C compiler for Scheme and then write
> the ML bootstrap compiler in Scheme, similar to what Guix does to bootstrap
> itself with nyacc.

Well, we already have a C compiler written in Scheme; it is called GNU Mes (MesCC, to be precise).
We also have a Scheme, written in C, that is bootstrappable from nothing:
https://github.com/oriansj/mes-m2
https://github.com/oriansj/mescc-tools-seed

> This will solve more problems than Guix itself, because it seems this
> bootstrapping problem comes historically from the very first implementations
> of ML.

Let us hope.

> As we talked yesterday with Brett via chat, PolyML is the only one that has
> been packaged in Guix but it is very tricky, because they have on the repo
> the binaries to boostrap itself.

That needs to be fixed.

> Writting a Scheme compiler should be easy, if we don't care about
> optimization techniques. It doesn't need that requirements.
> But if you need any help in the low level area, I can help you guys.

Well, I need more help in the high-level areas.

-Jeremiah
RE: [Proposal] The Formal Methods in GNU Guix Working Group
> The term "nothing" is mitigated; i.e. "nothing" means: a booted system
> running a (linux) kernel. Right?

No, I mean bootstrapped from bare metal.
No Kernel
No firmware
No microcode
No Bios
Just individual TTL logic circuits
https://github.com/oriansj/stage0

> Thank you for all your contributions!
> e.g., hex0 is amazing! :-)

Hex0 is only about 3 hours of work.
This, however, is taking me months:
https://github.com/oriansj/mes-m2
and when done will result in solving multiple bootstrapping problems.

-Jeremiah
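For readers who have not looked at hex0: it is small because the job it does is small. The sketch below is a simplified Python model of that job, assuming the usual hex0 input conventions (pairs of hex digits become bytes, `#` and `;` start a comment that runs to the end of the line, everything else is ignored). The function name `hex0_to_bytes` is made up for illustration, and the real hex0 is a hand-written binary, not Python.

```python
def hex0_to_bytes(text):
    """Simplified model of hex0: hex digit pairs become bytes, comments are dropped."""
    out = bytearray()
    high = None          # pending high nibble, if one has been read
    in_comment = False
    for ch in text:
        if in_comment:
            # a comment ends at the newline; its contents are never decoded
            in_comment = ch != "\n"
            continue
        if ch in "#;":
            in_comment = True
        elif ch in "0123456789abcdefABCDEF":
            nibble = int(ch, 16)
            if high is None:
                high = nibble
            else:
                out.append(high * 16 + nibble)
                high = None
        # any other character (whitespace and so on) is ignored
    return bytes(out)


if __name__ == "__main__":
    sample = "7F 45 4C 46 # ELF magic\n01 ; one more byte\n"
    print(hex0_to_bytes(sample))
```

The point of such a tiny format is that the seed binary which reads it can be checked by hand, byte by byte.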
RE: [Proposal] The Formal Methods in GNU Guix Working Group
> ISTR someone made an initrd with guile in it, and "booted to guile."

Yep, amazing work that it is; someone also made emacs into an initrd too.

> If that is so, does that not suggest that "from nothing" could be independent
> of a running kernel?

Actually no, bare metal is a bit different from running on top of a POSIX system.
But what is even more fun is that we made a 737-byte initrd/shell:
https://github.com/oriansj/bootstrap-seeds/blob/master/kaem-optional-seed.hex0
which means one can bootstrap on POSIX from hex0 and the above all the way to mes-m2.

> Wouldn't that be cool? !! ;-)

Indeed, which is why the Linux/POSIX bootstrap problem can be solved with M2-Planet and stage0.
We need more help, though, to get it all done faster.

-Jeremiah
RE: [Proposal] The Formal Methods in GNU Guix Working Group
>> > The term "nothing" is mitigated; i.e. "nothing" means: a booted system
>> > running a (linux) kernel. Right?
>> No, I mean bootstrapped from bare metal.
>> No Kernel
>> No firmware
>> No microcode
>> No Bios
>> Just individual TTL logic circuits
> Is it ready yet?

Well, the parts all the way to M2-Planet are done and verified on the virtual machine:
https://github.com/oriansj/stage0

However, we need to write a portable POSIX to remove Linux from Guix's bootstrap (ideally buildable via M2-Planet; discussion on which route to take is ongoing).

Then implement the design on an FPGA via http://www.clifford.at/icestorm/

Finally, I'll have to convince my wife to let me spend $10K to implement the design in hardware using https://libresilicon.com/
(But at least I'll have plenty of free chips to share with the world.)

> This should be great. :-)

That is the plan, but we really could use more Scheme programmers.
(Ambitious goals and all that.)

-Jeremiah
RE: [bootstrappable] GNU Mes 0.26.1 released
> We are happy to announce the release of GNU Mes 0.26.1.

Great work everyone ^_^

-Jeremiah