New submission from Arnaud Fontaine:

I know that compiler module is deprecated since 2.6, but I found a regression 
introduced in version 2.7 by the following cleanup commit (replacing pyassem.py 
modified in this commit by the one just before does not trigger this bug):

http://hg.python.org/releasing/2.7.6/rev/42faa8054c3d

Actually, there is nothing wrong when executing the generated bytecode, it's 
just that tracebacks and anything that display source code from the generated 
bytecode object gets sometimes completely wrong when there is if/elif/else 
because of compiler.pyassem.order_blocks().

This issue was found when noticing that tracebacks in Zope Python Script (using 
RestrictedPython which in turn uses compiler module) were displaying completely 
wrong line information and stepping with pdb into the source code was 
displaying wrong source code and line information too.

After investigating, I found out that the problem is actually in compiler 
module and wrote a small program triggering the bug (I know that some variables 
are undefined but it does not matter here):

if if1:
  if1_entered = 1

if if2:
  if subif1:
    subif1_entered = 1
  elif subif2:
    subif2_entered = 1

  subif_ended = 1

Using compiler module to generate the bytecode with the program attached 
returns two different bytecodes:

$ python2.7 compiler_wrong_ordering_block_test.py 
====================================
  1           0 LOAD_NAME                0 (if1)
              3 POP_JUMP_IF_FALSE       30

  2           6 LOAD_CONST               1 (1)
              9 STORE_NAME               1 (if1_entered)
             12 JUMP_FORWARD            15 (to 30)

  7     >>   15 LOAD_NAME                2 (subif2)
             18 POP_JUMP_IF_FALSE       51

  8          21 LOAD_CONST               1 (1)
             24 STORE_NAME               3 (subif2_entered)
             27 JUMP_FORWARD            21 (to 51)
        >>   30 LOAD_NAME                4 (if2)
             33 POP_JUMP_IF_FALSE       60
             36 LOAD_NAME                5 (subif1)
             39 POP_JUMP_IF_FALSE       15
             42 LOAD_CONST               1 (1)
             45 STORE_NAME               6 (subif1_entered)
             48 JUMP_FORWARD             0 (to 51)

 10     >>   51 LOAD_CONST               1 (1)
             54 STORE_NAME               7 (subif_ended)
             57 JUMP_FORWARD             0 (to 60)
        >>   60 LOAD_CONST               0 (None)
             63 RETURN_VALUE        
====================================
  1           0 LOAD_NAME                0 (if1)
              3 POP_JUMP_IF_FALSE       15

  2           6 LOAD_CONST               1 (1)
              9 STORE_NAME               1 (if1_entered)
             12 JUMP_FORWARD             0 (to 15)

  4     >>   15 LOAD_NAME                2 (if2)
             18 POP_JUMP_IF_FALSE       60

  5          21 LOAD_NAME                3 (subif1)
             24 POP_JUMP_IF_FALSE       36

  6          27 LOAD_CONST               1 (1)
             30 STORE_NAME               4 (subif1_entered)
             33 JUMP_FORWARD            15 (to 51)

  7     >>   36 LOAD_NAME                5 (subif2)
             39 POP_JUMP_IF_FALSE       51

  8          42 LOAD_CONST               1 (1)
             45 STORE_NAME               6 (subif2_entered)
             48 JUMP_FORWARD             0 (to 51)

 10     >>   51 LOAD_CONST               1 (1)
             54 STORE_NAME               7 (subif_ended)
             57 JUMP_FORWARD             0 (to 60)
        >>   60 LOAD_CONST               0 (None)
             63 RETURN_VALUE        

As you can see from the output above, the first generated bytecode gets all the 
line numbers (lnotab) messed up because 'subif2' block (and its children too) 
is emitted *before* its 'dominator' block, namely 'if2'.

Now, compiler.pyassem.order_blocks() arranges blocks to be emitted within a 
'dominators' dict where the key is a block and the value a list of blocks which 
must be emitted *before* that one. However, for at least 'if/elif' blocks 
(created by compiler.pycodegen.CodeGenerator().visitIf()), the 'elif' block 
(subif2) does not have prev/next (where prev and next define respectively the 
parent and children blocks) sets to its parent (if2) as it is only set for the 
first 'if' (subif1) and furthermore for some reasons I don't know the code 
asserts that next list length must be equal to 1 anyway (anyone knows why 
BTW?)...

Thus, when creating 'dominators' dict using next/prev block attributes, the 
'elif' block (subif2) ends up being considered kind of an orphan block because 
next/prev is not set and thus could be emitted anytime... Enabling debug mode 
in compiler.pyassem and adding a print call in order_blocks() to display 
'dominators' dict confirms this (eg 'subif2' block (<block id=9>) dominators 
value list is empty):

=> Blocks:
        ('SET_LINENO', 0)
        ('SET_LINENO', 1)
        ('LOAD_NAME', 'if1')
        ('POP_JUMP_IF_FALSE', <block id=3>)
end <block id=0>
    next [<block id=4>]
    prev []
    [<block id=3>, <block id=4>]
<block id=4>
        ('SET_LINENO', 2)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'if1_entered')
        ('JUMP_FORWARD', <block id=2>)
end <block id=4>
    next []
    prev [<block id=0>]
    [<block id=2>]
<block id=3>
end <block id=3>
    next [<block id=2>]
    prev []
    [<block id=2>]
<block id=2>
        ('SET_LINENO', 4)
        ('LOAD_NAME', 'if2')
        ('POP_JUMP_IF_FALSE', <block id=6>)
end <block id=2>
    next [<block id=7>]
    prev [<block id=3>]
    [<block id=6>, <block id=7>]
end <block id=7>
    next [<block id=10>]
    prev [<block id=2>]
    [<block id=9>, <block id=10>]
<block id=10>
        ('SET_LINENO', 6)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'subif1_entered')
        ('JUMP_FORWARD', <block id=8>)
end <block id=10>
    next []
    prev [<block id=7>]
    [<block id=8>]
<block id=9>
        ('SET_LINENO', 7)
        ('LOAD_NAME', 'subif2')
        ('POP_JUMP_IF_FALSE', <block id=11>)
end <block id=9>
    next [<block id=12>]
    prev []
    [<block id=11>, <block id=12>]
<block id=12>
        ('SET_LINENO', 8)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'subif2_entered')
        ('JUMP_FORWARD', <block id=8>)
end <block id=12>
    next []
    prev [<block id=9>]
    [<block id=8>]
<block id=11>
end <block id=11>
    next [<block id=8>]
    prev []
    [<block id=8>]
<block id=8>
        ('SET_LINENO', 10)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'subif_ended')
        ('JUMP_FORWARD', <block id=5>)
end <block id=8>
    next []
    prev [<block id=11>]
    [<block id=5>]
<block id=6>
end <block id=6>
    next [<block id=5>]
    prev []
    [<block id=5>]
<block id=5>
        ('LOAD_CONST', None)
        ('RETURN_VALUE',)

=> Dominators:
{<block id=0>: set([]),
 <block id=2>: set([<block id=3>, <block id=4>]),
 <block id=3>: set([<block id=4>]),
 <block id=4>: set([<block id=0>]),
 <block id=5>: set([<block id=6>, <block id=8>]),
 <block id=6>: set([<block id=8>]),
 <block id=7>: set([<block id=2>]),
 <block id=8>: set([<block id=10>, <block id=11>, <block id=12>]),
 <block id=9>: set([]),
 <block id=10>: set([<block id=7>]),
 <block id=11>: set([<block id=10>, <block id=12>]),
 <block id=12>: set([<block id=9>])}

I have tried to write a patch for this issue but to be honest I could not come 
up with something nice, any idea about implementing a proepr fix is welcome?

----------
components: Library (Lib)
files: compiler_wrong_ordering_block_test.py
messages: 211124
nosy: arnau, nascheme, pitrou
priority: normal
severity: normal
status: open
title: Wrong line information in bytecode generated by compiler module
type: behavior
versions: Python 2.7
Added file: 
http://bugs.python.org/file34064/compiler_wrong_ordering_block_test.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20613>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to