debugger API PDD, v1.1

Dave Storrs Tue, 04 Sep 2001 18:29:09 -0700
=head1 TITLE

API for the Perl 6 debugger.

=head1 VERSION

1.1

=head2 CURRENT

     Maintainer: David Storrs ([EMAIL PROTECTED])
     Class: Internals
     PDD Number: ?
     Version: 1.1
     Status: Developing
     Last Modified: August 18, 2001
     PDD Format: 1
     Language: English

=head2 HISTORY

=over 4

=item Version 1.1

=item Version 1

First version

=back

=head1 CHANGES

1.1 - Minor edits throughout
        - Explicit and expanded list of how breakpoints may be set
        - Explicit mention of JIT compilation
        - Added mention of edit-and-continue functionality
        - Added "remote debugging" section.
        - Added "multithreaded debugging" section
        
1 None. First version

=head1 ABSTRACT

This PDD describes the API for the Perl 6 debugger.

=head1 DESCRIPTION

The following is a simple English-language description of the
functionality that we need.  Implementation is described in a later
section.  Descriptions are broken out by which major system will need
to provide the functionality (interpreter, optimizer, etc) and the
major systems are arranged in (more or less) the order in which the
code passes through them.  Within each section, functionality is
arranged according to (hopefully) logical groupings.


=head2 Compiler

=head3 Generating Code on the Fly

=over 4

=item *

Compile and return the bytecode stream for a given expression. Used
for evals of user-specified code and edit/JIT compiling of source.
Should be able to compile in any specified context (e.g., scalar or
list).

=item *

Show the bytecode stream emitted by a particular expression, either a
part of the source or user-specified.  (This is basically just the
above method with a 'print' statement wrapped around it.)

=item *

Do JIT compilation of source at runtime (this is implied by the first
item in this list, but it seemed better to mention it explicitly).

=back # Closes 'Generating Code on the Fly' section



=head2 Optimizer

=head3 Generating and Comparing Optimizations

=over 4

=item *

Optimize a specified bytecode stream in place.

=item *

Return an optimized copy of the specified bytecode stream.

=item *

Show the diffs between two bytecode streams (presumably pre- and
post-optimization versions of the same stream).

=back # Closes 'Generating and Comparing Optimizations' section
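
As a rough illustration of the "show the diffs" item above, the
following Python sketch compares two opcode listings with the standard
C<difflib> module.  The opcode names and the C<diff_streams> helper are
invented for the example; the real bytecode format is not defined by
this PDD.

```python
import difflib


def diff_streams(before, after):
    """Return unified-diff lines between two opcode listings."""
    return list(difflib.unified_diff(before, after,
                                     fromfile="unoptimized",
                                     tofile="optimized",
                                     lineterm=""))


# Hypothetical opcode listings, one opcode per entry.
unoptimized = ["push 2", "push 3", "add", "print"]
optimized = ["push 5", "print"]

for line in diff_streams(unoptimized, optimized):
    print(line)
```

A line-oriented diff is only one possible presentation; a real
implementation might prefer to align streams by source line instead.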




=head2 Interpreter

=head3 Manipulating the Bytecode Stream

=over 4

=item *

Display the bytecodes for a particular region.

=item *

Fetch the next bytecode from the indicated stream.

// @@NOTE: from a design perspective, this is nicer than doing
"(*bcs)" everywhere, but we definitely don't want to pay a function
call overhead every time we fetch a bytecode.  Can we rely on all
compilers to inline this properly?

=item *

Append/prepend all the bytecodes in 'source_stream' to 'dest_stream'.
Used for things like JIT compilation.

=back # Closes 'Manipulating the Bytecode Stream' section
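
The fetch, append, and prepend operations above could look roughly
like the following Python sketch.  The C<BytecodeStream> class and its
method names are assumptions made for illustration, and opcodes are
modelled as plain strings.

```python
from collections import deque


class BytecodeStream:
    """Toy stand-in for a bytecode stream; opcodes are plain strings."""

    def __init__(self, ops=()):
        self._ops = deque(ops)

    def fetch_next(self):
        """Remove and return the next bytecode, or None at end of stream."""
        return self._ops.popleft() if self._ops else None

    def append_stream(self, source):
        """Append all bytecodes of 'source' after the existing ones."""
        self._ops.extend(source._ops)

    def prepend_stream(self, source):
        """Prepend all bytecodes of 'source', preserving their order."""
        # extendleft() inserts items one at a time at the front, which
        # reverses them, so feed it the source in reverse.
        self._ops.extendleft(reversed(source._ops))

    def listing(self):
        """Return the remaining bytecodes as a list, for display."""
        return list(self._ops)


dest = BytecodeStream(["c", "d"])
dest.append_stream(BytecodeStream(["e"]))
dest.prepend_stream(BytecodeStream(["a", "b"]))
print(dest.listing())
```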



=head3 Locating Various Points in the Code

=over 4

=item *

Locate the beginning of the next Perl expression in the specified
bytecode stream (which could be, but is not necessarily, the head of
the stream).

=item *

Locate the beginning of the next Perl source line in the specified
bytecode stream (which could be, but is not necessarily, the head of
the stream).

=item *

Search the specified bytecode stream for the specified bytecode.
Return the original bytecode stream, less everything up to the located
bytecode.

// @@NOTE: Should the return stream include the searched-for bytecode
or not?  In general, I think this will be used to search for 'return'
bytecodes, in order to support the "step out of function"
functionality. In that case, it would be more convenient if the return
were B<not> there.

=item *

Search the specified bytecode stream for the specified line number.
This line may appear in the current module (the default), or in
another module, which must then be specified.

=item *

Search the specified bytecode stream for the beginning of the
specified subroutine.

=item *

Locate the beginning of the source line which called the function for
which the current stack frame was created.

=item *

Locate the next point, or all points, where a specified file is 'use'd
or 'require'd.

=back # Closes 'Locating Various Points in the Code' section.
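
The bytecode search described above, including the open question of
whether the located bytecode belongs in the result, can be sketched in
Python as follows.  The C<seek_bytecode> name and the C<include_match>
flag are assumptions for the example; the stream is modelled as a list
of opcode strings.

```python
def seek_bytecode(stream, target, include_match=False):
    """Return the part of 'stream' that follows the first 'target'.

    With include_match=True the result starts at the match itself.
    Returns None when 'target' does not occur in the stream.
    """
    try:
        index = stream.index(target)
    except ValueError:
        return None
    return stream[index:] if include_match else stream[index + 1:]


# Hypothetical opcodes for a sub body followed by the caller's code.
ops = ["load", "add", "return", "store", "print"]
print(seek_bytecode(ops, "return"))
print(seek_bytecode(ops, "return", include_match=True))
```

Making inclusion a flag would let "step out of function" use the
default while other callers keep the matched bytecode.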



=head3 Moving Through the Code

=over 4

=item *

Continue executing code; stop at the end of the code or at the first
breakpoint encountered.

=item *

Continue up to a specified line, ignoring breakpoints on the way.

=item *

In the source which produced a specified bytecode stream, search
forwards for a specified pattern.

=item *

In the source which produced a specified bytecode stream, search
backwards for a specified pattern.

=item *

In the source which produced a specified bytecode stream, search
forwards for lines where a specified expression is satisfied.

=item *

In the source which produced a specified bytecode stream, search
backwards for lines where a specified expression is satisfied.

=back # Closes 'Moving through the Code'




=head3 Variable and Code Manipulation

=over 4

=item *

List all subroutines in a particular module (or all modules).

=item *

Locate the file containing the definition of the specified function or
method.


=item *

Fetch an element from a specified (default: main::) symbol table.

=item *

Fetch all elements from a specified (default: main::) symbol table.

=item *

Set an element of a specified symbol table.

=item *

Retrieve an element from a specified scratchpad.  Scratchpads must be
identifiable in a human-readable way.

=item *

Set an element in a specified scratchpad.  Regardless of how
scratchpads are identified internally, they must provide a
human-convenient interface; one possibility is to address them as:
(pad:<line num>[,filename]).  For example, (pad:78) would refer to the
pad created on line 78 of the current file, while (pad:78,Foo::Bar.pm)
would be the pad created on line 78 of Foo::Bar.pm.

=item *

Fetch code or data from a closure.

=item *

Modify the code and/or data of a specified closure.  Closures must be
identifiable in a human-readable way.

=item *

Execute the next line and name the closure that is produced on that
line.  This name would be attached to the closure structure and could
be used to call debugger commands on the closure in a convenient,
human-readable way.

=item *

Retrieve the stack.  (Probably we will want to limit this to something
like "Retrieve the previous/next stack frame" so that we don't swamp
ourselves; we can always iterate to get more frames.)

=item *

Designate a particular frame within the stack as the "current" frame.
This is mainly intended to allow us to evaluate variables from within
the context of the stack frame where they were created, modified, etc.

=item *

Specify what line in the current function should be executed next.

=item *

Erase all frames above the current one from the stack; restart
execution from the top of this frame.  This is intended to support the
"Specify what line will run next" functionality, up to and including
the ability to jump out of the current function.

=item *

Get/set the string encoding of a specified string (e.g., UTF-8 or
UTF-16).

=item *

Get/set the normalization of a specified Unicode string.

=item *

Get the version number of the interpreter.

=back # Closes 'Variable and Code Manipulation'
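
The proposed scratchpad address form, (pad:<line num>[,filename]),
could be parsed along the lines of the Python sketch below.  The
C<parse_pad_address> function, its return shape, and the file name
C<main.pl> are assumptions for illustration only.

```python
import re

# Matches "(pad:78)" or "(pad:78,Foo::Bar.pm)".
PAD_ADDRESS = re.compile(r"^\(pad:(\d+)(?:,\s*([^)]+))?\)$")


def parse_pad_address(text, current_file):
    """Return (filename, line) for a scratchpad address string.

    The filename falls back to 'current_file' when it is omitted.
    """
    match = PAD_ADDRESS.match(text.strip())
    if match is None:
        raise ValueError(f"not a scratchpad address: {text!r}")
    line = int(match.group(1))
    filename = match.group(2).strip() if match.group(2) else current_file
    return filename, line


print(parse_pad_address("(pad:78)", "main.pl"))
print(parse_pad_address("(pad:78,Foo::Bar.pm)", "main.pl"))
```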




=head2 Regex Engine

=over 4

=item *

Get contents of $1, $2, $3....

=item *

Get contents of $`, $&, and $' (pre-match, matched string,
post-match).

=item *

Get the matched portion of the string (i.e., how much of the string
has been used up).

=item *

Get unmatched portion of the string.

=item *

Get the complete state machine of the specified regex.  Display it in
some human-readable format.

=item *

Get the next state in the state machine.

=item *

Match the next character in the regex.

=back # Closes 'Regex Engine' section.




=head2 Garbage Collector

=over 4

=item *

Perform a GC pass immediately (if we intend to support more than one
style of GC, then it should be possible to specify which variety of GC
is performed).

=item *

Set a lock forbidding the GC from running until the lock is released.

=item *

Release a previously set lock on the GC.

=item *

Display the current state of the GC lock.

=item *

Show which PMCs were visited on a particular GC pass.

=item *

Show whether a particular PMC was visited on a particular pass.

=item *

Show when memory was moved, which memory it was, and where it was
moved to.

=item *

Show when and which memory was reclaimed on a particular pass.

=item *

Show how much memory was allocated by a particular
line/statement/sub/program, and how (i.e., in what structs).

=back # Closes 'Garbage Collector' section
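
The lock, release, and display-state items above can be illustrated
with Python's own C<gc> module standing in for the Perl 6 collector.
The C<GCLock> class is hypothetical.  Note that in this stand-in the
lock only suppresses automatic passes; an explicit C<gc.collect()>
still runs, which may or may not be the behaviour wanted here.

```python
import gc


class GCLock:
    """Counting lock: automatic GC stays off while any holder remains."""

    def __init__(self):
        self._holders = 0
        self._was_enabled = True

    def acquire(self):
        """Take the lock, disabling automatic GC on first acquisition."""
        if self._holders == 0:
            self._was_enabled = gc.isenabled()
            gc.disable()
        self._holders += 1

    def release(self):
        """Drop one hold; restore the previous GC setting on the last."""
        if self._holders == 0:
            raise RuntimeError("GC lock is not held")
        self._holders -= 1
        if self._holders == 0 and self._was_enabled:
            gc.enable()

    def state(self):
        """Describe the lock for display in the debugger."""
        return "locked" if self._holders else "unlocked"


def collect_now():
    """Force an immediate full pass; returns the unreachable-object count."""
    return gc.collect()


lock = GCLock()
lock.acquire()
print(lock.state())
lock.release()
print(lock.state())
```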





=head2 Profiler

Until now, the Perl core has included a debugger but not a profiler.
Perl 6 seems like an excellent time to add a profiler in core, parallel
to the debugger.  The profiler should not be loaded unless
specifically asked for (we don't want to pay for the overhead), but it
should be available.  Assuming that people agree with this, here are
some proposed hooks into the profiler.

=over 4

=item *

Show data type conversions throughout the program.

=item *

Show how much CPU time was spent on the program.

=item *

Show how much CPU time was spent in each subroutine.

=item *

Show how much wall-clock time was spent on the program.

=item *

Show how much wall-clock time was spent in each sub.

=item *

Show how much system time was spent in each sub.

=item *

Show how much CPU time was spent on disk I/O, not including virtual
memory.

=item *

Show how much CPU time was spent on virtual memory.

=item *

Show how much CPU time was spent on memory management.

=item *

Show minimum, average, and maximum amount of memory used after the
beginning of execution.

=item *

Show minimum, average, and maximum number of filehandles
simultaneously open during execution.

=back # Closes 'Profiler' section
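
The difference between per-sub CPU time and per-sub wall-clock time in
the list above can be sketched in Python with C<time.process_time()>
and C<time.perf_counter()>.  The C<profiled> decorator and the
C<totals> table are invented for the example, and recursive calls
would be counted more than once in this simple form.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated totals per sub name: [calls, CPU seconds, wall seconds].
totals = defaultdict(lambda: [0, 0.0, 0.0])


def profiled(func):
    """Record call count, CPU time, and wall-clock time for 'func'."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        cpu_start = time.process_time()
        wall_start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = totals[func.__name__]
            entry[0] += 1
            entry[1] += time.process_time() - cpu_start
            entry[2] += time.perf_counter() - wall_start
    return wrapper


@profiled
def busy(n):
    """CPU-bound work: CPU time and wall time should be similar."""
    return sum(i * i for i in range(n))


@profiled
def idle(seconds):
    """Sleeping: wall time grows while CPU time stays near zero."""
    time.sleep(seconds)


busy(100000)
idle(0.05)
for name, (calls, cpu, wall) in totals.items():
    print(f"{name}: calls={calls} cpu={cpu:.4f}s wall={wall:.4f}s")
```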




=head2 Edit-and-Continue

When stepping through code, it should be possible to locate a mistake,
correct it, and have the interpreter continue without having to
restart.  This feature is given a separate section because it involves
interactions among all the other components: compiler, optimizer,
interpreter, and (sometimes) regex engine.



=head2 Internal to the Debugger

=head3 Breakpoints, Watch Expressions, Actions, Commands-around-prompt

The Perl 5 debugger contains four constructs that are almost but not
quite identical: breakpoints, watch expressions, actions, and what I
am calling "commands-around-prompt."

Breakpoints are attached to a particular line of source code and have
a condition associated with them (which, in many cases, is simply
'1'); when the given line of source is about to be executed, the
condition is evaluated. If it evaluates to a true value, then
execution pauses before the line is executed.

Watch expressions are placed on a particular variable to show when it
changes.

Actions are Perl expressions that are attached to a particular line of
source; the expression is evaluated just before the source line is
executed.

"Commands-around-prompt" come in four varieties: debugger commands
before, debugger commands after, Perl commands before, and Perl
commands after the prompt.  In each case, a command is run every time
the debugger prompt is displayed, either just before or just after (as
appropriate) the prompt itself appears.


=head3 Breakpoints

=over 4

=item *

Create a breakpoint and append it to the list of breakpoints. A
breakpoint should always have a condition associated with it (this
condition will be '1' unless the user specifies a condition). Unless
the condition evaluates to true, the breakpoint does not activate.
The user should be able to specify the breakpoint's location in as
many ways as possible, including by:

=over 4 # Ways to identify a breakpoint location

=item line number (e.g.: 78)

=item line number in another file (e.g.: "Foo::Bar.pm":78)

This should load the other file if it is not already loaded (under
most circumstances it will be).

=item subroutine name (e.g.: my_print)

This would stop just before the first line of code in the subroutine.

=item eval, by subroutine and repetition (e.g.: my_print:(eval 2,3))

The above example would stop just before performing the third
repetition of the second 'eval' in the 'my_print' function (assuming
that the eval is in a loop of some sort).  If the interpreter can
determine that the eval will not actually execute three times before
the function exits, it should produce a warning when you try to set
such a breakpoint.

=item closure by line number, instance number, and (optionally)
filename (e.g.: (closure 78, 2, "Foo::Bar.pm"))

The above example would stop immediately before any line that would
execute the closure that was produced on line 78 of file
"Foo::Bar.pm", on the second occasion that a closure was produced on
that line.  In general, if line 78 produces only one closure (which is
normally the case) then this will refer to the closure produced the
second time that line 78 was executed.

=item closure by variable name (e.g.: (closure "$wurzle"))

The above example would stop before executing any line that would
execute the closure stored in the '$wurzle' variable.

=item closure by name (e.g.: (closure "wurzle"))

The above example would stop before executing any line that would
execute the closure named 'wurzle'.  See the 'Interpreter' section
above for how to name a closure.

=item package name, expressed as a (possibly negated) regex (e.g.:
/Foo::Bar/ or !/^Bar::Baz\b/)

The first example above would stop immediately before executing any
line of code in the 'Foo::Bar' package.  The second example would stop
before executing any line of code that was NOT in the 'Bar::Baz'
package or one of its descendants.

=item file name (e.g.: "Foo::Bar.pm")

The above example would stop immediately before executing any line of
code in the Foo::Bar.pm file (which might differ from the Foo::Bar
package).

=back # Closes 'Ways to identify a breakpoint location'

=item *

Search the list for a particular breakpoint; if found, return it.

=item *

Display the list of breakpoints.

=item *

Remove one entry from the list of breakpoints.

=item *

Clear the list of breakpoints.

=back # Closes 'Breakpoints' section
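
A minimal Python sketch of the breakpoint list operations above
follows.  It models only file-and-line locations, represents the
condition as a callable over a dictionary of variables, and defaults
it to always-true in the spirit of the '1' default.  All class and
method names are assumptions for illustration.

```python
class Breakpoint:
    """A breakpoint at (filename, line) with an optional condition."""

    def __init__(self, filename, line, condition=None):
        self.filename = filename
        self.line = line
        # With no condition given, always stop (the '1' default).
        self.condition = condition if condition is not None else (lambda env: True)

    def should_stop(self, filename, line, env):
        """True when the location matches and the condition holds."""
        return ((filename, line) == (self.filename, self.line)
                and bool(self.condition(env)))


class BreakpointList:
    """Create, search, remove, and clear breakpoints."""

    def __init__(self):
        self._items = []

    def add(self, breakpoint):
        self._items.append(breakpoint)
        return breakpoint

    def find(self, filename, line):
        """Return the first breakpoint at this location, or None."""
        for breakpoint in self._items:
            if (breakpoint.filename, breakpoint.line) == (filename, line):
                return breakpoint
        return None

    def remove(self, breakpoint):
        self._items.remove(breakpoint)

    def clear(self):
        self._items.clear()

    def hits(self, filename, line, env):
        """Breakpoints that should pause execution before this line."""
        return [b for b in self._items if b.should_stop(filename, line, env)]


points = BreakpointList()
points.add(Breakpoint("Foo::Bar.pm", 78))
points.add(Breakpoint("main.pl", 10, condition=lambda env: env["x"] > 5))
print(len(points.hits("main.pl", 10, {"x": 7})))
```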



=head3 Watch Expressions

The basic intent of a watch expression is to keep track of the value
and attributes of a variable.  Whenever these things change, the
debugger should stop executing and print a message telling you what
happened.  This message will be one of the following (note that
variable names to the left of the ':' in these messages are not to be
interpolated; items to the right of the ':' have been given example
values):

=over 4 # Watch expression messages

=item $foo created

If a watch expression is set on a lexical variable before it has been
created, this message will be printed when that variable comes into
scope.  @@NOTE: Is this feasible?

=item $foo lost scope

When a watch expression is set on a variable, and control flow exits
the scope of that variable, this message will be printed.

=item $foo garbage collected

=item Value of $foo changed; from, to: 7, 15

This message is shown whenever a new value is stored to a scalar.

=item Length of @foo changed; from, to: 2, 7 (result of 'push', line
32 of Foo::Bar.pm)

This message is shown whenever the length of an array changes, and
tells you why and where it changed.

=item New key added to %foo; key, value: 'dict_filepath',
'/usr/dict/words'

This message is shown whenever one or more new key(s) are added to an
existing hash.

=item Key(s) deleted from %foo: 'dict_filepath', 'password_filepath',
'blarg'

This message is shown whenever one or more keys are deleted from an
existing hash, and tells you what those keys were.

=item Value of key(s) in %foo changed; from, to: 'dict_filepath' =>
'/blah/foo/bar.dict', 'baz' => 'jaz'

This message is shown whenever the values of one or more keys are
changed within an existing hash, and tells you what those keys were
and what their new values are.

=back # Close 'Watch expression messages'

=over 4

=item *

Create a watch expression and append it to the list of watch
expressions. The user should be able to specify what to watch in as
many ways as possible, including by:

=over 4 # Ways to identify a watch expression location

=item package variable name (e.g.: $wurzle, @wurzle, or %wurzle)

=item package variable name by package (e.g.: $Foo::Bar::wurzle)

=item package or lexical variable name by (optionally) file and
(required) line number (e.g.: "Foo::Bar.pm, 23":$wurzle OR 23:$wurzle)

If the file is not specified then the currently-viewed file will be
assumed.  If the file B<is> specified, this should load the other file
if it is not already loaded (under most circumstances it will be).

=item lexical variable name by subroutine name (e.g.:
&my_print::$wurzle OR &Foo::Bar::my_print::$wurzle)

This would set a watch expression on the first variable named $wurzle
declared inside the 'my_print' function.  If more than one variable
named $wurzle is declared inside 'my_print', a warning message should
be shown to that effect when the watch expression is created, even if
warnings are not currently enabled.

=item lexical variable name by (optionally) file name and (required)
line number (e.g.: $wurzle, 78 OR $wurzle, "Foo::Bar.pm":78)

The first example would set a watch expression on the variable named
$wurzle which is created on line 78 of the currently-viewed file.  The
second example would set a watch expression on the variable $wurzle
which is declared on line 78 of "Foo::Bar.pm".  In either case, if
$wurzle is not actually created on that line, a warning message should
be displayed, even if warnings are not currently enabled.

=item closure by line number, instance number, and (optionally)
filename (e.g.: (closure 78, 2, "Foo::Bar.pm"))

The above example would stop immediately before any line that would
execute the closure that was produced on line 78 of file
"Foo::Bar.pm", on the second occasion that that line was executed.

=item closure by variable name (e.g.: (closure "$wurzle"))

The above example would stop before executing any line that would
execute the closure stored in the '$wurzle' variable.

=item closure by name (e.g.: (closure "wurzle"))

The above example would stop before executing any line that would
execute the closure named 'wurzle'.  See the 'Interpreter' section
above for how to name a closure.

=item package name, expressed as a (possibly negated) regex (e.g.:
/Foo::Bar/ or !/^Bar::Baz(?<{::}(.*))/)

The first example above would stop immediately before executing any
line of code in the 'Foo::Bar' package.  The second example would stop
before executing any line of code that was NOT in the 'Bar::Baz'
package or one of its descendants.

=item file name (e.g.: "Foo::Bar.pm")

The above example would stop immediately before executing any line of
code in the Foo::Bar.pm file (which might differ from the Foo::Bar
package).

=back # Closes 'Ways to identify a watch expression location'

=item *

Search the list for a particular watch expression; if found, return
it.

=item *

Display the list of watch expressions.

=item *

Remove one entry from the list of watch expressions.

=item *

Clear the list of watch expressions.

=back # Closes 'Watch Expressions' section
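
The three hash-related watch messages listed earlier in this section
could be produced by comparing a snapshot of the hash with its current
contents, as in the Python sketch below.  The C<hash_watch_messages>
function is hypothetical, Python dictionaries stand in for Perl
hashes, and the message wording follows the examples above.

```python
def hash_watch_messages(name, old, new):
    """Return watch messages describing how a hash changed."""
    messages = []
    added = [k for k in new if k not in old]
    deleted = [k for k in old if k not in new]
    changed = [k for k in new if k in old and old[k] != new[k]]
    for key in added:
        messages.append(
            f"New key added to {name}; key, value: {key!r}, {new[key]!r}")
    if deleted:
        keys = ", ".join(repr(k) for k in deleted)
        messages.append(f"Key(s) deleted from {name}: {keys}")
    if changed:
        pairs = ", ".join(f"{k!r} => {new[k]!r}" for k in changed)
        messages.append(
            f"Value of key(s) in {name} changed; from, to: {pairs}")
    return messages


before = {"dict_filepath": "/usr/dict/words", "blarg": 1}
after = {"dict_filepath": "/blah/foo/bar.dict", "baz": "jaz"}
for message in hash_watch_messages("%foo", before, after):
    print(message)
```

Snapshot comparison is only the simplest approach; hooking the store
and delete operations directly would avoid copying large hashes.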



=head3 Actions

=over 4

=item *

Create an action and append it to the list of actions. An action may
have a condition associated with it (this condition defaults to '1'
unless otherwise specified by the user); if the condition is not met,
then the action does not trigger. Actions are specified by:

=over 4 # Ways to identify an action location

=item file name and line number, where the file name defaults to the
currently-viewed file (e.g.: 78 or "Foo::Bar.pm":78)

=back # Closes 'Ways to identify an action location'

=item *

Search the list for a particular action; if found, return it.

=item *

Display the list of actions.

=item *

Remove one entry from the list of actions.

=item *

Clear the list of actions.

=back # Closes 'Actions' section




=head3 Commands-Around-Prompt

=over 4

=item *

Create a command-around-prompt.

=item *

Append a command-around-prompt to either the 'pre-prompt' or the
'post-prompt' list.

=item *

Evaluate all the commands-around-prompt in the 'pre-prompt' list.

=item *

Evaluate all the commands-around-prompt in the 'post-prompt' list.

=item *

Display the 'pre-prompt' or 'post-prompt' list of
commands-around-prompt.

=item *

Remove one entry from a list of commands-around-prompt.

=item *

Clear the list of commands-around-prompt.

=back # Closes 'Commands-Around-Prompt' section




=head3 The Debugger's Command History

=over 4

=item *

Set and get the size of the command history.

=item *

Get Nth most recent item from the command history.

=item *

Get Nth most recent item from the command history that (does | does
not) match PATTERN.

=item *

Add an element to the command history.

=item *

Display complete command history.

=item *

Display an item from the command history.

=item *

Display last N items from the command history.

=back # Closes 'The Debugger's Command History'
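
The history operations above map naturally onto a bounded queue.  The
Python sketch below is one possible shape; the C<CommandHistory> class
and its method names are assumptions, and the sample entries are
merely Perl 5 style debugger commands used as data.

```python
import re
from collections import deque


class CommandHistory:
    """Bounded command history; the newest entry is at the right."""

    def __init__(self, size=100):
        self._items = deque(maxlen=size)

    def get_size(self):
        return self._items.maxlen

    def set_size(self, size):
        # Rebuilding the deque keeps only the newest 'size' entries.
        self._items = deque(self._items, maxlen=size)

    def add(self, command):
        self._items.append(command)

    def nth_recent(self, n, pattern=None, negate=False):
        """Return the nth most recent entry (1 = newest), or None.

        With 'pattern', only entries that match (or, with negate=True,
        do not match) are counted.
        """
        candidates = reversed(self._items)
        if pattern is not None:
            regex = re.compile(pattern)
            candidates = (c for c in candidates
                          if bool(regex.search(c)) != negate)
        for position, command in enumerate(candidates, start=1):
            if position == n:
                return command
        return None

    def last(self, n):
        """Return the last n entries, oldest first."""
        return list(self._items)[-n:] if n > 0 else []


history = CommandHistory(size=3)
for command in ["b 78", "c", "p $x", "s"]:
    history.add(command)
print(history.last(3))
print(history.nth_recent(1, pattern=r"^p"))
```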



=head3 Debugger Help

=over 4

=item *

Display a short summary of the debugger commands.

=item *

Display the full listing of the debugger commands.

=item *

Set the program that is used to page long displays (e.g. 'more' or
'less').

=item *

Set the program that is used to show man pages.

=item *

Run 'man' on a specified topic.

=back # Closes 'Debugger Help'



=head3 Command Aliases

=over 4

=item *

Create an alias that sets one user-specified command to run another.

=item *

Delete an alias.

=item *

Show an alias.

=item *

Show all aliases.

=back # Closes 'Aliases'



=head3 Pretty-Printing

=over 4

=item *

Pretty-print the name and type (e.g. HASH) of the variable to which a
reference points.

=item *

Pretty-print the dump of an object variable (all methods, vars, etc).

=item *

Pretty-print the hierarchy of all classes currently loaded.

=item *

Pretty-print the inheritance tree for a particular class.

=item *

Pretty-print the inheritance tree for a particular object.

=item *

Pretty-print the methods callable via specified object or its
ancestors.

=item *

Pretty-print a Unicode string in its current encoding/normalization.
(Useful to determine exactly how many glyphs are in it.)

=back # Closes 'Pretty-Printing'



=head3 Miscellaneous

=over 4

=item *

Quit debugging, exit the debugger.

=item *

Rerun the debugger on the current program; it should be possible to
specify whether the program should be reparsed or not.

=item *

Page a stream to the screen.

=item *

Grep a list of strings for elements that do or do not match a pattern.

=item *

Set or get the value of a debugger option.

=back # Closes 'Miscellaneous'



=head2 Remote Debugging

=over 4

=item *

Attach to an interpreter running on a remote machine.  The default
would be that this would need to be done in a secure way, where the
connection is refused unless a challenge/response is met, and all
communications are encrypted.  If the remote program was specifically
instructed (via a command- or shebang-line switch) to permit insecure
connections, then that option would also be available.  'Attaching'
implies that the debugger gains access to the optree and event loop
(among other things) on the remote machine.

=item *

Intercept communications between two hosts to which the local debugger
is connected.  This is primarily intended for debugging communication
protocols between a client/server pair of hosts, where the debugger
may be running on a third host.

=back # Closes 'Remote Debugging'
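
One common way to meet a challenge/response requirement like the one
above is an HMAC over a random nonce with a pre-shared secret.  The
Python sketch below shows that technique only; it is not a proposal
for the actual wire protocol, the function names are invented, and it
does not cover the encryption of subsequent traffic.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Debugged process: produce a fresh random nonce per attempt."""
    return secrets.token_bytes(32)


def answer_challenge(shared_secret, challenge):
    """Debugger: prove knowledge of the secret without sending it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()


def verify_response(shared_secret, challenge, response):
    """Debugged process: accept only a correct response."""
    expected = answer_challenge(shared_secret, challenge)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, response)


secret = b"example shared secret"  # assumed to be provisioned out of band
challenge = make_challenge()
response = answer_challenge(secret, challenge)
print(verify_response(secret, challenge, response))
```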



=head2 Multithreaded Debugging

=over 4

=item *

Pause all threads.

=item *

Pause an individual thread.

=item *

Examine the state of a thread.

State includes:

=over 4

=item Current location of control flow in the thread

=item Memory currently held by the thread, and to what it is allocated
(i.e., which variables)

=item Mutexes or other locks held by the thread

=item Whether the thread is currently waiting on something and, if so,
what

=item All items in the Garbage Collection section should be applicable
on a per-thread basis

=item Set Watches/Actions/Breakpoints/Commands-Around-Prompt on a
per-thread basis

=back # Closes 'State of a thread'

=item *

Kill one or more threads.

=back # End of 'Multithreaded Debugging'



=head1 IMPLEMENTATION

Aye, here's the rub.