Hi all,
it's me, the handin server guy, again. Sorry to bother you.

Our handin server started "crashing" with "bad variable linkage" errors at 
deadline time (presumably under somewhat high load), and since it happened 
twice, I thought I'd report it. Any ideas on what's causing this?

After this "crash", the server keeps running, but rejects all submissions 
because the same checker keeps failing to load.

==

[1|2015-11-23T14:51:31] (re)loading module from (file /var/handin_config/info1-teaching-material/checkers/06-Datentypen/REDACTED-USER-NAME/../checker.rkt)
[1|2015-11-23T14:51:33] ERROR: link: bad variable linkage;
[1|2015-11-23T14:51:33]  reference to a variable that is uninitialized
[1|2015-11-23T14:51:33]   reference phase level: 0
[1|2015-11-23T14:51:33]   variable module: "/var/handin_home/handin/handin-server/checker.rkt"
[1|2015-11-23T14:51:33]   variable phase: 0
[1|2015-11-23T14:51:33]   reference in module: "/var/handin_config/info1-teaching-material/checkers/checker-extras.rkt"
[1|2015-11-23T14:51:33]   in: submission-eval

Bigger log fragment available at 
https://gist.github.com/Blaisorblade/7f9c6e7f4f456b588a8a

Other info:
- Restarting the server does fix the error. Somehow.
- For those unfamiliar with the handin server: it has code that automatically 
reloads checkers, as the log above shows 
(https://github.com/ps-tuebingen/handin/blob/master/handin-server/private/reloadable.rkt).
But that reloading doesn't fix the problem.
- Googling suggests that stale compiled code might be the cause. But the source 
code hadn't changed. (Also, I found no description of how this error arises.)
- Since the server sometimes gets "stuck", I built a trivial watchdog (a 
cronjob) that restarts the server if the status server becomes too slow. The 
error above happened after the server was restarted by the watchdog.
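
To check the stale-bytecode theory on the machine, something like the 
following could compare timestamps. It's only a rough sketch: it assumes the 
default layout where `foo.rkt` compiles to `compiled/foo_rkt.zo`, the helper 
names `zo-path-for` and `zo-older-than-source?` are made up, and a plain 
timestamp comparison is cruder than whatever check the compilation manager 
actually performs.

```racket
#lang racket/base

;; Hypothetical helper: bytecode path for `src` under the default layout,
;; i.e. dir/foo.rkt -> dir/compiled/foo_rkt.zo (only handles .rkt sources).
(define (zo-path-for src)
  (define-values (dir name must-be-dir?) (split-path src))
  (define zo-name
    (string-append
     (regexp-replace #rx"[.]rkt$" (path->string name) "_rkt")
     ".zo"))
  (build-path (if (path? dir) dir (current-directory)) "compiled" zo-name))

;; Hypothetical helper: #t when a .zo file exists and its modification
;; time is older than that of its source file.
(define (zo-older-than-source? src)
  (define zo (zo-path-for src))
  (and (file-exists? zo)
       (< (file-or-directory-modify-seconds zo)
          (file-or-directory-modify-seconds src))))
```

For example, `(zo-older-than-source? 
"/var/handin_config/info1-teaching-material/checkers/checker-extras.rkt")` 
would flag the shared module if its bytecode predates the source.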

One set of hypotheses:
Is it possible that stopping the server at the wrong moment corrupts compiled 
files? (But then, why does the first restart not fix the problem?)
Do you take care to make compilation atomic with `rename`?
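
To be concrete about what I mean by "atomic with `rename`", here is a minimal 
sketch of the pattern. It is not taken from the compiler or the handin server, 
and `write-file-atomically` is a made-up name: write to a temporary file in 
the destination directory, then rename it over the target, so that a process 
killed mid-write leaves the old file intact. I'm assuming a POSIX filesystem, 
where the rename is atomic.

```racket
#lang racket/base
(require racket/file)

;; Hypothetical helper: replace `dest` with `content` (a byte string) so that
;; readers see either the old file or the complete new one.
(define (write-file-atomically dest content)
  (define-values (dir name must-be-dir?) (split-path dest))
  ;; Create the temporary file next to `dest`, so the rename stays on one
  ;; filesystem (assumption: renames within a filesystem are atomic).
  (define tmp
    (make-temporary-file "atomic~a.tmp" #f
                         (if (path? dir) dir (current-directory))))
  (call-with-output-file tmp #:exists 'truncate
    (lambda (out) (write-bytes content out)))
  ;; The third argument #t allows replacing an existing `dest`.
  (rename-file-or-directory tmp dest #t))
```

If the process dies before the rename, only a leftover temporary file remains, 
and `dest` still holds its previous contents.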

However, according to the docs, the server is designed to survive brutal restarts.

One non-standard thing I do is that I have a `checker-extras.rkt` module with 
some utilities shared across checkers*, and that's not deployed as part of the 
server (for various reasons), but together with the checkers, so it's loaded 
with (require "../checker-extras.rkt"), and seems to be compiled, probably when 
starting the server. Could this interfere badly with the reloading code or with 
restarting?

*I'm aware of your checker utilities, but here we have slightly different 
requirements.
