Re: How does compiling work?

Devin Asay Thu, 08 Sep 2011 10:31:12 -0700

Great summary, Richard! This is going into my teaching notes file.

Devin


On Sep 8, 2011, at 8:00 AM, Richard Gaskin wrote:

> Julian Ohrt wrote:
> 
>> Is there any documentation how compiling of livecode works internally?
>> Is it a compiler which can produce native code (for Windows, Linux,
>> etc.)? Are the scripts packaged within the executable together with an
>> interpreter and interpreted at run time? Or is it more like a virtual
>> machine approach?
> 
> Yes, I think it could be said that LiveCode has more in common with a virtual 
> machine than almost any other metaphor.
> 
> My understanding of the under-the-hood mechanics is very limited, but that 
> won't stop me from trying. :)
> 
> There are many layers to code execution and the languages which work at each 
> level, which could be summarized as:
> 
> - CPU instruction set/Object code:  the instructions the processor is able to 
> handle on its own, purely binary code; these are very primitive, consisting 
> largely of moving stuff from one memory location to another, some basic math 
> routines, etc.  Most mortals never write machine code directly, relying on 
> assemblers or compilers to translate their more human-readable code into 
> machine instructions.
> 
> - Assembler:  a way of working directly with the CPU instruction set, but 
> with the advantage of using mnemonic labels for the instructions ("MOVE" 
> rather than "0111010").  Generally speaking, there is a one-to-one 
> relationship between Assembler instructions and machine instructions.
> 
> - C: Designed as a substitute for Assembler, C allows you to execute many 
> hundreds or even thousands of machine instructions with relatively little 
> code, but it's still somewhat close to the CPU in terms of memory management, 
> data types, options for register use, etc.
> 
> - C++/C#/Objective-C:  languages derived from C that add object-oriented 
> programming, executing many more instructions per line of code and usually 
> involving frameworks that handle many of the common tasks an application 
> will perform.
> 
> - Scripting: Instructions written in very high-level languages which often 
> completely automate things like memory management, type conversion, garbage 
> collection, etc., triggering a great many machine instructions for each line 
> of code, favoring developer convenience at a small cost to efficiency and 
> memory.
> 
> With each step up through these levels, the number of machine instructions 
> triggered by a line of code is generally higher, meaning ever more of the 
> work is done by the system rather than the programmer.
> 
> Much of the LiveCode engine is written in C++ (with some portions in straight 
> C, I believe), and the LiveCode scripting language is often compiled to an 
> intermediary bytecode, which in the list above might be between C++ and 
> Scripting.
> 
> Bytecode is very different from true object code, in that object code 
> represents the instructions as the CPU itself expects to handle them, while 
> bytecode still needs an intermediary mechanism (such as the LiveCode engine) 
> to translate it into machine instructions.
> 
> Bytecode representations are much closer to machine instructions than 
> scripts are, which often makes translating them at runtime as simple as 
> jumping to a routine found in a densely packed and highly optimized lookup 
> table.
> 
> Moreover, bytecode represents a fairly small subset of the instructions 
> compiled from your script; in many cases they jump directly into compiled 
> object code in the engine, which was written in C++ and compiled to machine 
> code using some of the best modern compilers. So in effect, as Ousterhout puts 
> it in his seminal paper on scripting (see 
> <http://www.stanford.edu/~ouster/cgi-bin/papers/scripting.pdf>), good 
> scripting languages are often just a sort of "glue" between true 
> machine-compiled routines.  Bytecode makes that glue smaller and more 
> efficient.
> 
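To make the lookup-table idea concrete for my notes, here is a toy sketch in Python. It is not LiveCode's actual bytecode or engine code: the opcode names (PUSH, ADD, MUL, UPPER) and the tuple layout are invented for illustration. The point is only that each bytecode is a small number used to jump straight to a routine that already exists in the host, so the bytecode acts as glue between those routines.

```python
# Toy stack-machine interpreter (illustrative only; the opcodes and layout
# are hypothetical and are not LiveCode's internal format).
import operator

PUSH, ADD, MUL, UPPER = range(4)

def op_push(stack, arg):
    # Place a literal value on the stack.
    stack.append(arg)

def op_binary(fn):
    # Build a handler that pops two values and pushes fn(left, right).
    def handler(stack, _arg):
        right = stack.pop()
        left = stack.pop()
        stack.append(fn(left, right))
    return handler

def op_upper(stack, _arg):
    # Replace the top of the stack with its upper-case string form.
    stack.append(str(stack.pop()).upper())

# Dispatch table: the opcode number indexes directly into host routines.
DISPATCH = [op_push, op_binary(operator.add), op_binary(operator.mul), op_upper]

def run(bytecode):
    stack = []
    for opcode, arg in bytecode:
        DISPATCH[opcode](stack, arg)  # one table lookup per instruction
    return stack.pop()

# Bytecode for (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
result = run(program)
print(result)
```

Nothing in the loop inspects text; all the interpreter does per instruction is index into the table and call what it finds there.
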
> The scripts you write in LiveCode are what gets saved with the file (at least 
> that's what I see when I look at a saved stack file; I can find the scripts 
> but if the bytecode gets saved with it it's amazingly small because I can't 
> find it at all).
> 
> It's my understanding that when a stack is opened, its scripts are compiled 
> to bytecode as the stack's object records are unpacked and the message path 
> is set up.  This "runtime compilation" involves parsing your script and 
> translating that into binary tokens that execute much more efficiently.  When 
> executing, this bytecode is translated to direct machine instructions on the 
> fly, but as you can see with LiveCode's blazing performance, neither the 
> runtime compilation to bytecode nor the translation of the bytecode into 
> machine instructions is particularly costly.  And by separating the tasks, 
> the more costly parsing of the script is done only once, which is one of the 
> reasons why LC outperforms fully-interpreted systems (another reason is 
> careful pruning of the lookup table used in that parsing and in the 
> subsequent bytecode jumps, but that's another story).
> 
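The "parse once, run many times" point above can be sketched with a made-up three-word script language in Python. This is only an analogy under my own assumptions: the keywords (add, mul, sub), the token format, and the parse counter are all hypothetical and say nothing about how the LiveCode engine stores its tokens.

```python
# Compile-once versus parse-every-time, using a hypothetical mini-language.
import operator

OPS = {"add": operator.add, "mul": operator.mul, "sub": operator.sub}
parse_count = 0  # counts how often the costly text-parsing step runs

def parse_line(line):
    # Costly step: split the text, look up the keyword, convert the operand.
    global parse_count
    parse_count += 1
    word, operand = line.split()
    return OPS[word], int(operand)

def compile_script(source):
    # Parse every non-blank line once and keep the resulting tokens.
    return [parse_line(line) for line in source.splitlines() if line.strip()]

def run_compiled(tokens, value):
    # Execute pre-parsed tokens; no text handling happens here.
    for fn, operand in tokens:
        value = fn(value, operand)
    return value

def run_interpreted(source, value):
    # Fully interpreted: re-parse the text on every single run.
    for line in source.splitlines():
        if line.strip():
            fn, operand = parse_line(line)
            value = fn(value, operand)
    return value

SCRIPT = "add 5\nmul 3\nsub 1"

tokens = compile_script(SCRIPT)
compiled_results = [run_compiled(tokens, n) for n in range(100)]
parses_when_compiled = parse_count

parse_count = 0
interpreted_results = [run_interpreted(SCRIPT, n) for n in range(100)]
parses_when_interpreted = parse_count

print(parses_when_compiled, parses_when_interpreted)
```

Both approaches produce the same answers; the difference is only in how many times the text has to be parsed to get them.
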
> In fact, since so much of the actual execution takes place in the engine's 
> machine-compiled code, performance for many tasks is on par with other 
> systems where you have to wait for a compiler every time you change your 
> code. :)
> 
> There are exceptions to the general rule that script statements are 
> translated to bytecode in advance of execution.  For example, the "do" 
> command and the "value" function both require parsing during execution, since 
> they work with strings whose values cannot be known in advance, and therefore 
> cannot be compiled in advance.
> 
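A rough Python analogy for the "do" and "value" exception, for anyone who wants to see the difference side by side. Python's built-in compile() and eval() stand in for the runtime parsing step; the function names and the sample expression are my own, and this is not a description of how LiveCode implements either token.

```python
# Precompiled code versus a string that can only be parsed at run time.

def precompiled(x):
    # This body is compiled to bytecode once, when the function is defined.
    return x * 2 + 1

def evaluate_at_runtime(expression, x):
    # The text is unknown until this call, so parsing has to happen here.
    code = compile(expression, "<runtime string>", "eval")
    # Evaluate with no builtins and only the variable x visible.
    return eval(code, {"__builtins__": {}}, {"x": x})

# The expression is assembled from pieces while the program is running.
expression = "x * " + str(2) + " + 1"

print(precompiled(5), evaluate_at_runtime(expression, 5))
```

The two calls return the same value, but the second pays for a parse on every call because the string could be different each time.
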
> But those tokens also make good examples of LiveCode's efficiency: while 
> technically slower than alternative syntax which can be precompiled to 
> bytecode, the time it takes the engine to parse those expressions and 
> translate them into a form which can be executed is usually measured in 
> microseconds, sometimes fractions of microseconds.
> 
> Along those lines, compare the time it takes LiveCode to compile a script 
> when you push the script editor's "Compile" button to compilation times in 
> almost any other system.  With each script compiled to bytecode separately, 
> and with its means of doing so being rather well tuned over a great many 
> years, it's almost instantaneous - you'll never wait for a progress bar when 
> compiling in LiveCode. :)
> 
> 
> In summary, LiveCode attempts to find a sweet spot between raw performance 
> and developer convenience.  You could write faster-executing code in 
> Assembler, but who would want to?  Even using languages like C++ will often 
> take orders of magnitude more development time to accomplish similar goals.  
> LiveCode's two-step compilation allows for blazing fast performance with 
> nearly unprecedented return on your development time.
> 
> IMO, an almost ideal sweet spot indeed.
> 
> --
> Richard Gaskin
> Fourth World
> LiveCode training and consulting: http://www.fourthworld.com
> Webzine for LiveCode developers: http://www.LiveCodeJournal.com
> LiveCode Journal blog: http://LiveCodejournal.com/blog.irv
> 
> _______________________________________________
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription 
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University




