Need help embedding Guile
Hi all,

I'm in the process of embedding Guile in an application and although I seem to have the essentials working, I'd appreciate some confirmation of the validity of my approach and also some tips on a couple of loose ends.

I won't bore you with the specifics of my application; for the purposes of the discussion, its most important characteristic is that it uses Guile as a frontend of sorts. By this, I mean that a Scheme program is executed, which creates objects in the C (well actually, and out of necessity, C++) domain. These objects represent geometric operations, in the form of a graph, and are evaluated once the Scheme program has terminated. Since the evaluation can take a long time and the Scheme code itself, in simply creating nodes in the graph, is expected to run to completion quite quickly, even though it can be conceptually complex, the emphasis on the Scheme side is on debuggability instead of efficiency.

My aim is to be able to load a Scheme program from a file, run it to have the graph created and then clean up. On error, I'd like to print out diagnostic information in the form of an error message with as accurate as possible a source location and a stack trace. (I'd also like to print the latter with my own formatting to match the rest of the output of the application.)

Although perhaps other approaches are possible, I have, for now, chosen to leave memory management to the C++ side, so that my foreign objects need custom finalization.
The basic layout of my current implementation, with uninteresting portions left out, is the following (where `run_scheme' is called by the main program to run a Scheme script):

    struct context {
        char *input, **first, **last;
    };

    int run_scheme(const char *input, char **first, char **last)
    {
        struct context context = {const_cast<char *>(input), first, last};

        scm_with_guile(&run_scheme_from_guile, &context);

        return 0;
    }

    static void *run_scheme_from_guile(void *data)
    {
        struct context *context = static_cast<struct context *>(data);

        scm_set_automatic_finalization_enabled(0);

        // Define some foreign objects types and subroutines.
        // [...]

        scm_set_program_arguments(
            context->last - context->first, context->first, context->input);

        scm_c_catch(SCM_BOOL_T,
                    run_body, reinterpret_cast<void *>(context),
                    post_handler, nullptr,
                    pre_handler, nullptr);

        scm_gc();
        scm_run_finalizers();

        return nullptr;
    }

    static SCM run_body(void *data)
    {
        struct context *context = static_cast<struct context *>(data);

        scm_primitive_eval(
            scm_list_2(
                scm_from_latin1_symbol("load"),
                scm_from_latin1_string(context->input)));

        return SCM_UNSPECIFIED;
    }

    static SCM pre_handler(void *data, SCM key, SCM args)
    {
        SCM s = scm_make_stack(SCM_BOOL_T, SCM_EOL);
        SCM p = scm_current_error_port();

        scm_print_exception(p, SCM_BOOL_F, key, args);
        scm_display_backtrace(s, p, SCM_BOOL_F, SCM_BOOL_F);

        return SCM_BOOL_T;
    }

    static SCM post_handler(void *data, SCM key, SCM args)
    {
        return SCM_BOOL_T;
    }

Actually, my code in `pre_handler' is not quite what is shown above, as I print the stack with my own formatting, but let's leave that for later.

As I said, this seems to be working, but certain points are unclear to me after reading all the documentation I could find and snooping around in Guile's source code:

1. The manual is not very specific about how and when finalizers are run. The approach above seems to correctly finalize all objects created as the Scheme code executes, but if references are kept, say via (define), they are not finalized and I get memory leaks. Is there some way to arrange for the complete deinitialization of Guile after I've finished evaluating Scheme code and making sure that all finalizers are run?

2. If, in `run_body', I simply do

       scm_c_primitive_load(context->input);

   then the code is evaluated, but on error I get no locations in the stack trace. The error is said to have occurred "in an unknown file" with no line numbers. Evaluating `load' as shown above seems to produce proper source locations in the stack trace. Is there something else I should preferably be doing?

3. More generally, is there a preferable way to go about embedding Guile for my use case?

Thanks in advance for any pointers,
Dimitris
Re: Need help embedding Guile
Dimitris Papavasiliou wrote on Tue 21-12-2021 at 11:12 [+]:
> [...]
> 1. The manual is not very specific about how and when finalizers are run. The
> approach above seems to correctly finalize all objects created as the Scheme
> code executes, but if references are kept, say via (define), they are not
> finalized and I get memory leaks. Is there some way to arrange for the
> complete deinitialization of Guile after I've finished evaluating Scheme code
> and making sure that all finalizers are run?

The manual is not very specific on when finalizers are run, because there aren't many formal guarantees (e.g., BDW-GC is a conservative GC, so it might think an object is not finalizable even though it is).

About deinitialising Guile: I don't know.

About finalizers: No. From the BDW-GC FAQ:

    I want to ensure that all my objects are finalized and reclaimed before process exit. How can I do that?

    You can't, and you don't really want that. This would require finalizing reachable objects. Finalizers run later would have to be able to handle this, and would have to be able to run with randomly broken libraries, because the objects they rely on were previously finalized. In most environments, you would also be replacing the operating system's mechanism for very efficiently reclaiming process memory at process exit with a significantly slower mechanism.

    You do sometimes want to ensure that certain particular resources are explicitly reclaimed before process exit, whether or not they become unreachable. Programming techniques for ensuring this are discussed in ``Destructors, Finalizers, and Synchronization'', Proceedings of the 2003 ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Jan. 2003, pp. 262-272. Official version. Technical report version. HTML slides. PDF slides.

> 2. If, in `run_body', I simply do
>
>     scm_c_primitive_load(context->input);
>
> then the code is evaluated, but on error I get no locations in the stack
> trace. The error is said to have occurred "in an unknown file" with no line
> numbers. Evaluating `load' as shown above, seems to produce proper source
> locations in the stack trace. Is there something else I should be preferably
> doing?

For bootstrapping reasons, there are multiple readers and evaluators in Guile, of varying debuggability. I'm not 100% sure, but I think the 'primitive-load' reader+evaluator has low debuggability and the 'load' procedure has higher debuggability?

> 3. More generally, is there a preferable way to go about embedding Guile for my
> use case?

Instead of reinitialising and deinitialising Guile repeatedly (seems inefficient!), I would suggest initialising Guile once at program start and doing Guile stuff whenever needed. However, that might be incompatible with your memory management approach ...

Greetings,
Maxime.
Re: Need help embedding Guile
Hi,

Maxime Devos wrote on Tue 21-12-2021 at 11:37 [+]:
> > approach above seems to correctly finalize all objects created as the Scheme
> > code executes, but if references are kept, say via (define), they are not
> > finalized and I get memory leaks.

You can (set! the-global-variable #f) to clear the reference, though there are still no formal guarantees that BDW-GC will collect things.

Greetings,
Maxime
Re: Need help embedding Guile
Maxime Devos wrote on Tue 21-12-2021 at 11:37 [+]:
> About finalizers: No. From the BDW-GC faq: [...]

I misread your question; this answer doesn't apply exactly to your question. However, there are still no formal guarantees that BDW-GC will collect everything.

Greetings,
Maxime
Re: Need help embedding Guile
Hi Maxime,

Many thanks for your response; it was very helpful. Unfortunately I'm now not so sure that I have the basics of embedding Guile more or less working and, even worse, I'm not really sure Guile is meant to work in the way I'm trying to use it.

The idea is that the C++ program, after some initialization, loads and evaluates one or more Scheme files (provided by the user as command line arguments). During the course of their evaluation, these create objects on the C++ side (representing the work that is to be done) and, once they're evaluated, the work of Guile is done. At that point, ideally, I'd like to deinitialize/terminate Guile, both to reclaim resources which are no longer necessary and to ensure that it plays no further role in the execution of the rest of the program. As far as I can see, this is not possible.

Furthermore, as you have pointed out, I cannot ensure that all created foreign objects are finalized. The idea here seems to be that some objects might still have been reachable at the very end, as far as the GC can tell, and will at any rate be reclaimed by the operating system when the process exits. But in my case, where the role of the embedded language is restricted to the initial phase of the embedding program's execution, this not only needlessly removes control of these resources from the embedding program; the unreclaimed objects also show up as memory leaks in tools like MemorySanitizer and Valgrind, which is a big problem in itself. Given the inability to tear down/kill Guile explicitly, I can't see a way around this.

That is not entirely true. I could perhaps keep track of objects that need to be finalized myself, and finalize them manually after Scheme code evaluation is done. This also seems to be what's recommended in one of the sources you quoted:

On Tuesday, December 21st, 2021 at 1:37 PM, Maxime Devos wrote:
> You do sometimes want to ensure that certain particular resources are
> explicitly reclaimed before process exit, whether or not they become
> unreachable. Programming techniques for ensuring this are discussed in
> ``Destructors, Finalizers, and Synchronization'', Proceedings of the
> 2003 ACM SIGPLAN-SIGACT Symposium on Principles of Programming
> Languages, Jan. 2003, pp. 262-272. Official version. Technical report
> version. HTML slides. PDF slides.

I'm still not sure I find this to be a satisfactory approach though. Not only is it non-trivial in terms of implementation, it also feels like I'm going against the current, trying to bend Guile into a role it's not meant to serve.

> You can (set! the-global-variable #f) to clear the reference,
> though still there are no formal guarantees BDW-GC will collect things.

True. Furthermore, the Scheme code, being user-provided, could establish any number of references in various ways. As far as I can see, there is no way to completely "clear" an environment, i.e. remove all bindings contained in it, or to somehow delete it altogether or communicate to the GC that it is no longer reachable, so again I see no way around this.

If anyone has any comments or ideas, they would be most welcome.

Dimitris

PS:

> Due to bootstrapping reasons, there are multiple readers and evaluators
> in Guile, of varying debugability. I'm not 100% sure, but I think
> the 'primitive-load' reader+evaluator has low debugability and the
> 'load' procedure has higher debugability?

I'll leave diagnostics for later, perhaps for a different thread, as the matters discussed above seem to be more serious (and potentially fatal).
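
The manual bookkeeping mentioned in this message (tracking the objects that need finalization and destroying the leftovers once Scheme evaluation is done) might look roughly like the sketch below. It is only an illustration under stated assumptions: `Node`, `NodeRegistry` and `g_live_nodes` are hypothetical names that do not come from the thread, no libguile calls are shown, and the sketch assumes that no GC finalizer runs after `release_all` has been called.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>

// Counts live Node instances, only so the effect is observable.
int g_live_nodes = 0;

// Hypothetical stand-in for a C++ object created from Scheme code.
struct Node {
    explicit Node(int id) : id(id) { ++g_live_nodes; }
    ~Node() { --g_live_nodes; }
    int id;
};

// Owns every Node handed out to the scripting side.
class NodeRegistry {
public:
    // Would be called by the subroutine that constructs a foreign object.
    Node *create(int id) {
        auto node = std::make_unique<Node>(id);
        Node *raw = node.get();
        nodes_.emplace(raw, std::move(node));
        return raw;
    }

    // Would be called by the GC finalizer, if the collector ever runs it.
    // Returns false when the node is not (or no longer) registered.
    bool release(Node *raw) { return nodes_.erase(raw) == 1; }

    // Called once after evaluation: destroys whatever was never finalized
    // and returns how many objects that was.
    std::size_t release_all() {
        const std::size_t remaining = nodes_.size();
        nodes_.clear();
        assert(nodes_.empty());
        return remaining;
    }

private:
    std::unordered_map<Node *, std::unique_ptr<Node>> nodes_;
};
```

With this shape, objects the collector does finalize are destroyed early through `release`, and objects still referenced from Scheme (for instance via a top-level define) are destroyed by the single `release_all` call, so leak checkers should see nothing outstanding from this registry.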
Re: Need help embedding Guile
On Tue, 21 Dec 2021, Dimitris Papavasiliou wrote:

> The idea is that the C++ program, after some initialization, loads and
> evaluates one or more Scheme files (provided by the user as command line
> arguments). During the course of their evaluation, these create objects on
> the C++ side (representing the work that is to be done) and, once they're
> evaluated the work of Guile is done. At that point, ideally, I'd like to
> deinitialize/terminate Guile, both to reclaim resources which are no longer
> necessary and to ensure that it plays no further role in the execution of
> the rest of the program. As far as I can see, this is not possible.

From this description, what I understand is that you want to use Scheme as a configuration file for batching the operations to be done in a second phase in C++. However, I fail to see why you need to finalize these objects, since you're going to use them in your second phase?

> If anyone has any comments or ideas, they would be most welcome.

One way I can think of would be to fork the process and create your C++ objects in a shared memory area between the parent and the child. Once Guile is done reading your inputs, the child process dies and all its memory is reclaimed by the OS.

> I'd appreciate some confirmation of the validity of my approach and
> also some tips on a couple of loose ends.

I think it's a valid approach.

-- 
Olivier Dion
Polymtl
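
The fork-and-shared-memory idea from this reply can be sketched on a POSIX system as follows. Everything here is an assumption made for illustration: `Operation`, `SharedPlan`, `build_plan` and `run_frontend_in_child` are hypothetical names, `build_plan` merely stands in for the place where the child would initialize Guile and evaluate the Scheme program, and the shared data is a fixed-size array of plain structs because pointers into the child's private heap would not be valid in the parent.

```cpp
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>

// Hypothetical flat description of one unit of work (no pointers inside).
struct Operation {
    int kind;
    double args[3];
};

// Lives in memory shared between parent and child.
struct SharedPlan {
    std::size_t count;
    Operation ops[64];
};

// Stand-in for "run the Scheme frontend"; only the child calls this.
static void build_plan(SharedPlan *plan) {
    assert(plan != nullptr);
    plan->ops[0] = {1, {1.0, 2.0, 3.0}};
    plan->ops[1] = {2, {0.5, 0.0, 0.0}};
    plan->count = 2;
}

// Returns the filled plan (the caller munmap()s it) or nullptr on failure.
SharedPlan *run_frontend_in_child() {
    void *mem = mmap(nullptr, sizeof(SharedPlan), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return nullptr;
    SharedPlan *plan = static_cast<SharedPlan *>(mem);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(SharedPlan));
        return nullptr;
    }
    if (pid == 0) {
        // Child: build the plan, then exit without running parent cleanup.
        build_plan(plan);
        _exit(0);
    }

    // Parent: wait for the child; its private memory is reclaimed by the OS.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        munmap(mem, sizeof(SharedPlan));
        return nullptr;
    }
    return plan;
}
```

In this arrangement the parent never initializes the interpreter at all, so nothing of it remains in the process that performs the long-running second phase; the cost is that the work description has to be serialized into a pointer-free layout.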
WIKID 1.4 available
release notes:

  I forgot to mention in NEWS that the configure script now also checks for the required modules (mentioned in README) and errors out if they are not found.

  Tested w/ w3m 0.5.3+git20210102 and Guile 2.2.7.

  Obligatory refresher link:
  https://en.wikipedia.org/wiki/Common_Gateway_Interface

README excerpt:

  This directory contains WIKID, comprising:
  - a wiki daemon
  - a CGI front-end
  - an administrative tool
  - some (sample) seed pages
  - documentation

NEWS for 1.4 (2021-12-20):

  - socket I/O uses bytevectors for Guile 2.x

    If the Guile is 2.x (or later), socket I/O procedures (e.g., ‘send’ and ‘recv!’) take care to convert strings to and from bytevectors for the actual I/O operation.

    BTW, this change surfaced a buglet in Guile 2.x ‘recvfrom!’ arg processing, namely a classic OBOE in the END parameter check.

  - convenience script "try-with-w3m" during "make check"

    The configure script now checks for w3m(1) and if it's found, "make check" creates script "check-wikid.d/try-with-w3m" and mentions it with a "Hint" right before going into the wait loop.

    If, OTOH, you don't have w3m installed, well then, install it and try again!

  - bootstrap/maintenance tools

    upgraded:

      Guile-BAUX 20211208.0839.a5245e7
      GNU gnulib 2021-12-10 21:54:54
      GNU Autoconf 2.71
      GNU Automake 1.16.5
      GNU texinfo 6.8

    as before:

      (none)

source code in dir:

  https://www.gnuvola.org/software/wikid/

-- 
Thien-Thi Nguyen ---
 (defun responsep (query)           ; (2021) Software Libero
   (pcase (context query)           ; = Dissenso Etico
     (`(technical ,ml) (correctp ml))
     ...))                          748E A0E8 1CB8 A748 9BFA
--- 6CE4 6703 2224 4C80 7502