Re: [Python-Dev] Need help to fix HTTP Header Injection vulnerability

2019-04-10 Thread Wes Turner
1. Is there a library of URL / Header injection tests e.g. for fuzzing that
we could generate additional test cases with or from?

2. Are requests.get() and requests.post() also vulnerable?

3. Despite the much-heralded utility of the UNIX pipe protocols, filenames
containing newlines (the de facto line-record delimiter) are possible:
"file"$'\n'"name"

Should filenames containing newlines and control characters require a kwarg
to be non-None in order to be passed through unescaped to the HTTP request?
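
A minimal sketch of the failure mode in question (the path value here is
hypothetical, and the interpolation only approximates how http.client
builds the request line):

    path = "/search?q=x HTTP/1.1\r\nX-Injected: evil"  # attacker-controlled
    request_line = "%s %s %s\r\n" % ("GET", path, "HTTP/1.1")
    print(request_line)
    # The embedded CRLF ends the request line early and smuggles in an
    # extra "X-Injected" header line.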

On Wednesday, April 10, 2019, Karthikeyan wrote:

> Thanks Gregory. I think it's a good tradeoff to ensure this validation
> only for URLs of the http scheme.
>
> I also agree that handling newlines has been a little problematic over the
> years, and the discussion over the level at which validation should occur
> has also prolonged some of the patches. https://bugs.python.org/issue35906
> is another similar case where splitlines is used, but it's better to raise
> an error, and the proposed fix could be used there too. Victor seems to
> have written a PR similar to the linked one for other urllib functions to
> fix a similar newline attack in ftplib, which was eventually fixed only in
> ftplib:
>
> * https://bugs.python.org/issue30713
> * https://bugs.python.org/issue29606
>
> Searching also brings up multiple issues, one a duplicate of another,
> leaving these attacks scattered over the tracker and some edge cases
> missed. Slightly off topic: the last time I reported a cookie-related
> issue where the policy can be overridden by a third-party library, I was
> asked to fix it in the stdlib itself, since adding fixes to libraries
> creates a maintenance burden for downstream libraries to keep up with
> upstream. With urllib being a heavily used module across the ecosystem,
> it's good to have a fix land in the stdlib that secures downstream
> libraries, encouraging users to upgrade Python too.
>
> Regards,
> Karthikeyan S
>


Re: [Python-Dev] Need help to fix HTTP Header Injection vulnerability

2019-04-10 Thread Victor Stinner
Hi,

I dug into the Python code history and the bug tracker. I would like to
say that this issue has been a work in progress since 2004. Different fixes
have been pushed, but there are *A LOT* of open issues:
https://bugs.python.org/issue30458#msg339846

I would suggest discussing on https://bugs.python.org/issue30458
rather than here, just to avoid duplicating discussions ;-)

Note: the whole class of issues (HTTP Header Injection) has at least 3
CVEs: CVE-2016-5699, CVE-2019-9740, CVE-2019-9947. I changed the bpo-30458
title to "[security][CVE-2019-9740][CVE-2019-9947] HTTP Header
Injection (follow-up of CVE-2016-5699)".

Victor

On Wed, Apr 10, 2019 at 12:20, Wes Turner wrote:
>
> 1. Is there a library of URL / Header injection tests e.g. for fuzzing that 
> we could generate additional test cases with or from?
>
> 2. Are requests.get() and requests.post() also vulnerable?
>
> 3. Despite the much-heralded utility of the UNIX pipe protocols, filenames
> containing newlines (the de facto line-record delimiter) are possible:
> "file"$'\n'"name"
>
> Should filenames containing newlines and control characters require a kwarg 
> to be non-None in order to be passed through unescaped to the HTTP request?
>
On Wednesday, April 10, 2019, Karthikeyan wrote:
>>
>> Thanks Gregory. I think it's a good tradeoff to ensure this validation
>> only for URLs of the http scheme.
>>
>> I also agree that handling newlines has been a little problematic over
>> the years, and the discussion over the level at which validation should
>> occur has also prolonged some of the patches.
>> https://bugs.python.org/issue35906 is another similar case where
>> splitlines is used, but it's better to raise an error, and the proposed
>> fix could be used there too. Victor seems to have written a PR similar to
>> the linked one for other urllib functions to fix a similar newline attack
>> in ftplib, which was eventually fixed only in ftplib:
>>
>> * https://bugs.python.org/issue30713
>> * https://bugs.python.org/issue29606
>>
>> Searching also brings up multiple issues, one a duplicate of another,
>> leaving these attacks scattered over the tracker and some edge cases
>> missed. Slightly off topic: the last time I reported a cookie-related
>> issue where the policy can be overridden by a third-party library, I was
>> asked to fix it in the stdlib itself, since adding fixes to libraries
>> creates a maintenance burden for downstream libraries to keep up with
>> upstream. With urllib being a heavily used module across the ecosystem,
>> it's good to have a fix land in the stdlib that secures downstream
>> libraries, encouraging users to upgrade Python too.
>>
>> Regards,
>> Karthikeyan S



-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Victor Stinner
On Tue, Apr 9, 2019 at 22:16, Steve Dower wrote:
> What are the other changes that would be required?

I don't know.

> And is there another
> way to get the same functionality without ABI modifications?

Py_TRACE_REFS is a doubly linked list of *all* Python objects. To get
this functionality, you need to store the list somewhere. I don't know
how to maintain such a list outside the PyObject structure.

One solution would be to enable Py_TRACE_REFS in release mode. Does
anyone want to add 16 bytes to every PyObject? I don't want that :-)


> I think it's worthwhile if we can really get to debug and non-debug
> builds being ABI compatible. Getting partway there in this case doesn't
> seem to offer any benefits.

Disabling Py_TRACE_REFS by default in debug mode reduces the Python
memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16
bytes on 64-bit platforms.

I don't think that I ever used sys.getobjects(), whereas many projects
use gc.get_objects() which is also available in release builds (not
only in debug builds).
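
For comparison, a small sketch of the two APIs (sys.getobjects() only
exists in Py_TRACE_REFS builds; its exact signature is from the
debug-build docs, so treat it as an assumption):

    import gc
    import sys

    # Release builds: only objects tracked by the cycle collector
    # (containers such as lists, dicts, class instances).
    tracked = gc.get_objects()

    # Py_TRACE_REFS builds only: walks the doubly linked list of *all*
    # objects; the argument 0 means "no limit".
    if hasattr(sys, "getobjects"):
        everything = sys.getobjects(0)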

I'm quite sure that almost nobody uses debug builds because the ABI is
incompatible.

The main question is whether anyone has ever used Py_TRACE_REFS. Does
someone use sys.getobjects() or the PYTHONDUMPREFS environment variable?

Using PYTHONDUMPREFS=1 on a debug build (with Py_TRACE_REFS) simply
crashes Python 3.7 at exit. So I don't think that anyone uses it :-)


I wrote PR 12614 to remove all code related to Py_TRACE_REFS. I wrote
it to see which code depends on it:

commit 63509498761a0e7f72585a8cd7df325ea2abd1b2 (HEAD -> remove_trace_refs, origin/remove_trace_refs)
Author: Victor Stinner
Date:   Thu Mar 28 23:26:58 2019 +0100

WIP: bpo-36465: Remove Py_TRACE_REFS special build

Remove _ob_prev and _ob_next fields of PyObject when Python is
compiled in debug mode to make debug ABI closer to the release ABI.

Remove:

* sys.getobjects()
* PYTHONDUMPREFS environment variable
* _PyCoreConfig.dump_refs
* PyObject._ob_prev and PyObject._ob_next fields
* _PyObject_HEAD_EXTRA and _PyObject_EXTRA_INIT macros
* _Py_AddToAllObjects()
* _Py_PrintReferenceAddresses()
* _Py_PrintReferences()

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] Need help to fix HTTP Header Injection vulnerability

2019-04-10 Thread Karthikeyan
> 1. Is there a library of URL / Header injection tests e.g. for fuzzing
> that we could generate additional test cases with or from?


https://github.com/swisskyrepo/PayloadsAllTheThings seems to contain
payload-related material, but I'm not sure how useful it is for URL parsing.

>
> 2. Are requests.get() and requests.post() also vulnerable?
>

urllib3 seems to be vulnerable, as noted in
https://bugs.python.org/issue36276#msg337837 . requests uses urllib3 under
the hood. The last time I checked, requests passed the encoded URL to
urllib3, where this doesn't seem to be exploitable, but I could be wrong.

-- 
Regards,
Karthikeyan S


[Python-Dev] (no subject)

2019-04-10 Thread Robert Okadar
Hi community,

I have developed a tkinter GUI component with Python v3.7. It runs very
well on Linux, but I am seeing a huge performance impact on Windows 10.
While on Linux almost real-time performance is achieved, on Windows it is
slow to an unusable level.

The code is somewhat stripped down from the original, but the performance
difference is the same anyway. The columns can be resized by clicking on
the column border and dragging it. Resizing works only from the top row
(but it resizes the entire column).
In this demo, all bindings are avoided to exclude their influence on the
component's performance, and thus they are not included. If you resize the
window (i.e., if you maximize it), you must call the function table.fit()
from the IDLE shell.

Does anyone know where this huge difference in performance comes from?
Can anything be done about it?

All the best,
--
Robert Okadar
IT Consultant

Schedule an online meeting with me!

Visit aranea-mreze.hr or call +385 91 300 8887

import tkinter

class Resizer(tkinter.Frame):
    def __init__(self, info_grid, master, **cnf):
        self.table_grid = info_grid
        tkinter.Frame.__init__(self, master, **cnf)
        # The event names were stripped by the mail archive; the bindings
        # below are plausible reconstructions for drag-to-resize behavior.
        self.bind('<B1-Motion>', self.resize_column)
        self.bind('<ButtonPress-1>', self.resize_start)
        self.bind('<ButtonRelease-1>', self.resize_end)
        self._resizing = False

        self.bind('<Destroy>', self.onDestroyEvent)

    def onDestroyEvent(self, event):
        self.table_grid = []

    def resize_column(self, event, width=None):
        #if self._resizing:
        top = self.table_grid.Top
        grid = self.table_grid._grid
        col = self.master.grid_info()["column"]
        if not width:
            width = self._width + event.x_root - self._x_root
        top.columnconfigure(col, minsize=width)
        grid.columnconfigure(col, minsize=width)

    def resize_start(self, event):
        top = self.table_grid.Top
        self._resizing = True
        self._x_root = event.x_root
        col = self.master.grid_info()["column"]
        self._width = top.grid_bbox(row=0, column=col)[2]
        #print(event.__dict__)
        #print(top.grid_bbox(row=0, column=col))

    def resize_end(self, event):
        pass
        #self.table_grid.xscrollcommand()
        #self.table_grid.column_resize_callback(col, self)


class Table(tkinter.Frame):
    # The color values were truncated by the archive ("#FF", "#EE", ...);
    # six-digit values are restored here.
    bgcolor0 = "#FFFFFF"
    bgcolor1 = "#EEEEEE"

    def __init__(self, master, columns=10, rows=20, width=100, **kw):
        tkinter.Frame.__init__(self, master, **kw)
        self.columns = []
        self._width = width
        self._grid = grid = tkinter.Frame(self, bg="#CCCCCC")
        self.Top = top = tkinter.Frame(self, bg="#DDDDDD")
        self.create_top(columns)
        self.create_grid(rows)

        #self.bind('<Configure>', self.on_table_configure)
        #self.bind('<Map>', self.on_table_map)

        top.pack(anchor='nw')   #, expand = 1, fill = "both")
        grid.pack(anchor='nw')  # fill = "both", expand = 1

    def create_top(self, columns):
        # The original definition was lost when the archive truncated the
        # message; this is a minimal reconstruction so that the demo runs:
        # one header frame per column, each carrying a Resizer on its right
        # edge (fit() expects a .resizer attribute on every top-row frame).
        for j in range(columns):
            frame = tkinter.Frame(self.Top, width=self._width, height=20,
                                  bg="#DDDDDD")
            frame.grid_propagate(False)
            frame.grid(row=0, column=j, sticky="we", padx=2)
            frame.resizer = Resizer(self, frame, width=4, bg="#999999",
                                    cursor="sb_h_double_arrow")
            frame.resizer.pack(side="right", fill="y")
            self.columns.append(frame)

    def on_table_map(self, event):
        theight = self.winfo_height()

    def fit(self):  # on_table_configure(self, event):
        for frame in self.Top.grid_slaves(row=0):
            frame.resizer.resize_column(None, width=frame.winfo_width())
        theight = self.winfo_height()
        fheight = self._grid.winfo_height() + self.Top.winfo_height()
        #print(theight, fheight)
        if theight > fheight:
            rheight = self.grid_array[0][0].winfo_height()
            amount = int((-fheight + theight) / rheight)
            #print(rheight, amount)
            for i in range(amount):
                self.add_row()
            self.update()

    def add_row(self, amount=1):
        columnsw = self.columns
        row = []
        i = len(self.grid_array)
        for j in range(len(columnsw)):
            bg = self.bgcolor0
            if i % 2 == 1:
                bg = self.bgcolor1
            entry = tkinter.Label(self._grid, bg=bg, text='%i %i' % (i, j))
            entry.grid(row=i, column=j, sticky="we", padx=2)
            row.append(entry)
        self.grid_array.append(row)

    def create_grid(self, height):
        #grid.grid(row = 0, column = 0, sticky = "nsew")
        columnsw = self.columns  # = self.Top.grid_slaves(row = 1)
        self.grid_array = []
        for i in range(height):
            row = []
            for j in range(len(columnsw)):
                bg = self.bgcolor0
                if i % 2 == 1:
                    bg = self.bgcolor1
                #entry = self.EntryClass(False, self, self._grid, bg = bg,
                #                        width = 1)
                entry = tkinter.Label(self._grid, bg=bg,
                                      text='%i %i' % (i, j))
                entry.grid(row=i, column=j, sticky="we", padx=2)
                row.append(entry)
            # The message was truncated here by the archive; by analogy with
            # add_row() it presumably appended the completed row:
            self.grid_array.append(row)


if __name__ == '__main__':
    # Minimal driver (reconstructed; the original tail of the message,
    # including any driver code, was lost to truncation).
    root = tkinter.Tk()
    table = Table(root)
    table.pack(fill="both", expand=True)
    root.mainloop()

Re: [Python-Dev] (no subject)

2019-04-10 Thread Steven D'Aprano
Hi Robert,

This mailing list is for the development of the Python interpreter, not 
a general help desk. There are many other forums where you can ask for 
help, such as the comp.lang.python newsgroup, Stackoverflow, /r/python 
on Reddit, the IRC channel, and more.

Perhaps you can help us, though. I presume you signed up to this mailing
list via the web interface at

https://mail.python.org/mailman/listinfo/python-dev

Is there something we could do to make it more clear that this is not 
the right place to ask for help?


-- 
Steven


Re: [Python-Dev] PEP 590 discussion

2019-04-10 Thread Petr Viktorin

Hello!
I've had time for a more thorough reading of PEP 590 and the reference 
implementation. Thank you for the work!
Overall, I like PEP 590's direction. I'd now describe the fundamental 
difference between PEP 580 and PEP 590 as:

- PEP 580 tries to optimize all existing calling conventions
- PEP 590 tries to optimize (and expose) the most general calling 
convention (i.e. fastcall)


PEP 580 also does a number of other things, as listed in PEP 579. But I 
think PEP 590 does not block future PEPs for the other items.
On the other hand, PEP 580 has a much more mature implementation -- and 
that's where it picked up real-world complexity.


PEP 590's METH_VECTORCALL is designed to handle all existing use cases, 
rather than mirroring the existing METH_* varieties.
But both PEPs require the callable's code to be modified, so requiring 
it to switch calling conventions shouldn't be a problem.


Jeroen's analysis from 
https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems 
to miss a step at the top:


a. CALL_FUNCTION* / CALL_METHOD opcode
  calls
b. _PyObject_FastCallKeywords()
  which calls
c. _PyCFunction_FastCallKeywords()
  which calls
d. _PyMethodDef_RawFastCallKeywords()
  which calls
e. the actual C function (*ml_meth)()

I think it's more useful to say that both PEPs bridge a->e (via 
_Py_VectorCall or PyCCall_Call).



PEP 590 is built on a simple idea, formalizing fastcall. But it is 
complicated by PY_VECTORCALL_ARGUMENTS_OFFSET and 
Py_TPFLAGS_METHOD_DESCRIPTOR.
As far as I understand, both are there to avoid an intermediate
bound-method object for LOAD_METHOD/CALL_METHOD. (They do try to be
general, but I don't see any other use case.)

Is that right?
(I'm running out of time today, but I'll write more on why I'm asking, 
and on the case I called "impossible" (while avoiding creation of a 
"bound method" object), later.)



The way `const` is handled in the function signatures strikes me as too
fragile for a public API.
I'd like if, as much as possible, PY_VECTORCALL_ARGUMENTS_OFFSET was 
treated as a special optimization that extension authors can either opt 
in to, or blissfully ignore.

That might mean:
- vectorcall, PyObject_VectorCallWithCallable, PyObject_VectorCall, 
PyCall_MakeTpCall all formally take "PyObject *const *args"
- a naïve callee must do "nargs &= ~PY_VECTORCALL_ARGUMENTS_OFFSET" 
(maybe spelled as "nargs &= PY_VECTORCALL_NARGS_MASK"), but otherwise 
writes compiler-enforced const-correct code.
- if PY_VECTORCALL_ARGUMENTS_OFFSET is set, the callee may modify 
"args[-1]" (and only that, and after the author has read the docs).



Another point I'd like some discussion on is that the vectorcall function
pointer is per-instance. It looks like this is only useful for type
objects, but it will add a pointer to every new-style callable object
(including functions). That seems wasteful.
Why not have a per-type pointer, and for types that need it (like 
PyTypeObject), make it dispatch to an instance-specific function?



Minor things:
- "Continued prohibition of callable classes as base classes" -- this 
section reads as a final. Would you be OK wording this as something 
other PEPs can tackle?
- "PyObject_VectorCall" -- this looks extraneous, and the reference 
imlementation doesn't need it so far. Can it be removed, or justified?
- METH_VECTORCALL is *not* strictly "equivalent to the currently 
undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the 
ARGUMENTS_OFFSET complication).
- I'd like to officially call this PEP "Vectorcall", see 
https://github.com/python/peps/pull/984




Mark, what are your plans for next steps with PEP 590? If a volunteer 
wanted to help you push this forward, what would be the best thing to 
work on?


Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or 
should address?



Re: [Python-Dev] Need help to fix HTTP Header Injection vulnerability

2019-04-10 Thread Ivan Pozdeev via Python-Dev


On 10.04.2019 7:30, Karthikeyan wrote:
> Thanks Gregory. I think it's a good tradeoff to ensure this validation
> only for URLs of the http scheme.
>
> I also agree that handling newlines has been a little problematic over the
> years, and the discussion over the level at which validation should occur
> has also prolonged some of the patches. https://bugs.python.org/issue35906
> is another similar case where splitlines is used, but it's better to raise
> an error, and the proposed fix could be used there too. Victor seems to
> have written a PR similar to the linked one for other urllib functions to
> fix a similar newline attack in ftplib, which was eventually fixed only in
> ftplib:
>
> * https://bugs.python.org/issue30713
> * https://bugs.python.org/issue29606
>
> Searching also brings up multiple issues, one a duplicate of another,
> leaving these attacks scattered over the tracker and some edge cases
> missed. Slightly off topic: the last time I reported a cookie-related
> issue where the policy can be overridden by a third-party library, I was
> asked to fix it in the stdlib itself, since adding fixes to libraries
> creates a maintenance burden for downstream libraries to keep up with
> upstream. With urllib being a heavily used module across the ecosystem,
> it's good to have a fix land in the stdlib that secures downstream
> libraries, encouraging users to upgrade Python too.


Validation should occur whenever user data crosses a trust boundary --
i.e. when the library starts to assume that an extracted chunk now
contains something valid.

https://tools.ietf.org/html/rfc3986 defines valid syntax (incl. valid
characters) for every part of a URL -- _of any scheme_ (FYI, \r\n are
invalid everywhere, and the test code for `data:` that Karthikeyan
referred to is raw data to compare against rather than a part of a URL).
It also obsoletes all the RFCs that the current code is written against.

AFAICS, the urllib.split* fns (obsoleted as public in 3.8) are used by
both urllib and urllib2 to parse URLs. They can be made to each validate
the chunk that they split off. urlparse can validate the entire URL
altogether.

Also, all modules ought to use the same code (urlparse looks like the best
candidate) to parse URLs -- this will minimize the attack surface.
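
A rough sketch of the kind of check being discussed (not any of the actual
patches; it simply rejects ASCII control characters, which covers \r and
\n, before splitting):

    from urllib.parse import urlsplit

    def strict_urlsplit(url):
        # Reject control characters anywhere in the URL, per the RFC 3986
        # character set (simplified here).
        if any(ord(ch) < 0x20 or ch == "\x7f" for ch in url):
            raise ValueError("control character in URL: %r" % url)
        return urlsplit(url)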

I think I can look into this later this week.

Fixing this is going to break code that relies on the current code
accepting invalid URLs. But the docs have never said that, e.g. in
urlopen, anything apart from a (valid) URL is accepted (in particular,
this implies that the user is responsible for escaping stuff properly
before passing it). So I would say that we are within our rights here,
and whoever is relying on those quirks is, and has always been, in
unsupported territory.
Determining which of those quirks are exploitable and which are not, to
fix just the former, is an incomparably larger, more error-prone and
avoidable amount of work. If anything, the history of the issue referenced
by previous posters clearly shows that this is too much to ask from the
Python team.

I also see other undocumented behavior, like accepting '>' (also
obsoleted as public in 3.8), which I would like to address, but it's of
no harm.

--

Regards,
Ivan



Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Steve Dower

On 10Apr2019 0401, Victor Stinner wrote:
> On Tue, Apr 9, 2019 at 22:16, Steve Dower wrote:
>> What are the other changes that would be required?
>
> I don't know.
>
>> And is there another
>> way to get the same functionality without ABI modifications?
>
> Py_TRACE_REFS is a doubly linked list of *all* Python objects. To get
> this functionality, you need to store the list somewhere. I don't know
> how to maintain such a list outside the PyObject structure.

There's certainly no more convenient way to do it. Maybe if we had
detached reference counts it would be easier, but it would likely still
result in ABI compatibility issues between debug builds of extensions
and release builds of Python (the most common scenario, in my experience).



> One solution would be to enable Py_TRACE_REFS in release mode. Does
> anyone want to add 16 bytes to every PyObject? I don't want that :-)

Yeah, nobody suggested that anyway :)

>> I think it's worthwhile if we can really get to debug and non-debug
>> builds being ABI compatible. Getting partway there in this case doesn't
>> seem to offer any benefits.
>
> Disabling Py_TRACE_REFS by default in debug mode reduces the Python
> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16
> bytes on 64-bit platforms.

Right, except it's debug mode.


> I don't think that I ever used sys.getobjects(), whereas many projects
> use gc.get_objects() which is also available in release builds (not
> only in debug builds).
>
> I'm quite sure that almost nobody uses debug builds because the ABI is
> incompatible.


There were just over 250,000 downloads of the prebuilt debug binaries 
for Windows (which are optional in the installer and turned off by 
default) in March. Whether they are being used is another question, but 
I know for sure at least a few people who use them.


When you want to use a debug build of your extension module, using a 
debug build of CPython is the only way to do it. So unless we can get 
rid of *all* the ABI incompatibilities, a debug build of CPython is 
still going to be necessary and disabling/removing reference tracking 
doesn't provide any benefit.



> The main question is whether anyone has ever used Py_TRACE_REFS. Does
> someone use sys.getobjects() or the PYTHONDUMPREFS environment variable?
>
> Using PYTHONDUMPREFS=1 on a debug build (with Py_TRACE_REFS) simply
> crashes Python 3.7 at exit. So I don't think that anyone uses it :-)


How do we track reference leaks in the buildbots? Can/should we be using 
this?


It doesn't crash on Python 3.8, so I suspect fixing the bug is a better 
option than using it as an excuse to remove the feature. From a quick 
test, it seems that a tuple element is being freed but not removed from 
the tuple, so it's probably a double-decref bug somewhere in 3.7.


Cheers,
Steve


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Steve Dower

On 10Apr2019 1109, Steve Dower wrote:
> On 10Apr2019 0401, Victor Stinner wrote:
>>> I think it's worthwhile if we can really get to debug and non-debug
>>> builds being ABI compatible. Getting partway there in this case doesn't
>>> seem to offer any benefits.
>>
>> Disabling Py_TRACE_REFS by default in debug mode reduces the Python
>> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16
>> bytes on 64-bit platforms.
>
> Right, except it's debug mode.


I left this comment unfinished :)

It's debug mode, and so you should expect less efficient memory and CPU 
usage. That's why we have two modes - so that it's easier to debug issues.


Now, if debug mode was unusably slow or had way too much overhead, we'd 
want to fix that. But it isn't unusable, so reducing memory usage at the 
cost of making debugging harder is not compelling.


Cheers,
Steve


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Guido van Rossum
I recall finding memory leaks using this. (E.g. I remember a leak in Zope
due to a cache that was never pruned.)

But presumably gc.get_objects() would have been sufficient. (IIRC it didn't
exist at the time.)

On Wed, Apr 10, 2019 at 11:48 AM Steve Dower wrote:

> On 10Apr2019 1109, Steve Dower wrote:
> > On 10Apr2019 0401, Victor Stinner wrote:
> >>> I think it's worthwhile if we can really get to debug and non-debug
> >>> builds being ABI compatible. Getting partway there in this case doesn't
> >>> seem to offer any benefits.
> >>
> >> Disabling Py_TRACE_REFS by default in debug mode reduces the Python
> >> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16
> >> bytes on 64-bit platforms.
> >
> > Right, except it's debug mode.
>
> I left this comment unfinished :)
>
> It's debug mode, and so you should expect less efficient memory and CPU
> usage. That's why we have two modes - so that it's easier to debug issues.
>
> Now, if debug mode was unusably slow or had way too much overhead, we'd
> want to fix that. But it isn't unusable, so reducing memory usage at the
> cost of making debugging harder is not compelling.
>
> Cheers,
> Steve


-- 
--Guido van Rossum (python.org/~guido)
Pronouns: he/him/his (why is my pronoun here?)



Re: [Python-Dev] Need help to fix HTTP Header Injection vulnerability

2019-04-10 Thread Gregory P. Smith
On Wed, Apr 10, 2019 at 11:00 AM Ivan Pozdeev via Python-Dev
<[email protected]> wrote:

>
> On 10.04.2019 7:30, Karthikeyan wrote:
>
>> Thanks Gregory. I think it's a good tradeoff to ensure this validation
>> only for URLs of the http scheme.
>>
>> I also agree that handling newlines has been a little problematic over
>> the years, and the discussion over the level at which validation should
>> occur has also prolonged some of the patches.
>> https://bugs.python.org/issue35906 is another similar case where
>> splitlines is used, but it's better to raise an error, and the proposed
>> fix could be used there too. Victor seems to have written a PR similar to
>> the linked one for other urllib functions to fix a similar newline attack
>> in ftplib, which was eventually fixed only in ftplib:
>>
>> * https://bugs.python.org/issue30713
>> * https://bugs.python.org/issue29606
>>
>> Searching also brings up multiple issues, one a duplicate of another,
>> leaving these attacks scattered over the tracker and some edge cases
>> missed. Slightly off topic: the last time I reported a cookie-related
>> issue where the policy can be overridden by a third-party library, I was
>> asked to fix it in the stdlib itself, since adding fixes to libraries
>> creates a maintenance burden for downstream libraries to keep up with
>> upstream. With urllib being a heavily used module across the ecosystem,
>> it's good to have a fix land in the stdlib that secures downstream
>> libraries, encouraging users to upgrade Python too.
>
> Validation should occur whenever user data crosses a trust boundary --
> i.e. when the library starts to assume that an extracted chunk now
> contains something valid.
>
> https://tools.ietf.org/html/rfc3986 defines valid syntax (incl. valid
> characters) for every part of a URL -- _of any scheme_ (FYI, \r\n are
> invalid everywhere, and the test code for `data:` that Karthikeyan
> referred to is raw data to compare against rather than a part of a URL).
> It also obsoletes all the RFCs that the current code is written against.
>
> AFAICS, the urllib.split* fns (obsoleted as public in 3.8) are used by
> both urllib and urllib2 to parse URLs. They can be made to each validate
> the chunk that they split off. urlparse can validate the entire URL
> altogether.
>
> Also, all modules ought to use the same code (urlparse looks like the
> best candidate) to parse URLs -- this will minimize the attack surface.
>
> I think I can look into this later this week.
>
My PR as of last night cites that RFC and does validation in http.client
while constructing the protocol request payload.  Doing it within the split
functions was an initial hack that looked like it might work, but it didn't
feel right, as that isn't what people expect of those functions; that
turned out to be the case as I tested things, given our mess of code paths
for opening URLs. But they all end with http.client, so yay!

I did *not* look at any of the async http client code paths (legacy
asyncore or new asyncio).  If there is an issue there, those deserve to
have their own bugs filed.

As for third-party PyPI libraries such as urllib3, they are on their own to
fix bugs.  If they happen to use a code path that a stdlib fix helps, good
for them, but honestly they are much better off making and shipping their
own update to avoid the bug.  Users can get it much sooner, as it's a mere
pip install -U away rather than a Python runtime upgrade.

> Fixing this is going to break code that relies on the current code
> accepting invalid URLs. But the docs have never said that, e.g. in
> urlopen, anything apart from a (valid) URL is accepted (in particular,
> this implies that the user is responsible for escaping stuff properly
> before passing it). So I would say that we are within our rights here,
> and whoever is relying on those quirks is, and has always been, in
> unsupported territory.
>
Yep. Even http.client.HTTPConnection.request names the function parameter
"url", so anyone embedding whitespace, newlines, and HTTP protocol strings
within that is well outside of supported territory (as one example, our
own test_xmlrpc was taking advantage of this to test a malformed request).

I suggest following up on https://bugs.python.org/issue30458 rather than in
this thread. The thread did its job: it directed our eyeballs at the
problems. :)

-gps

> Determining which of those quirks are exploitable and which are not, to
> fix just the former, is an incomparably larger, more error-prone and
> avoidable amount of work. If anything, the history of the issue referenced
> by previous posters clearly shows that this is too much to ask from the
> Python team.
>
> I also see other undocumented behavior, like accepting '>' (also
> obsoleted as public in 3.8), which I would like to address, but it's of
> no harm.
>
> --
>
> Regards,
> Ivan

Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Nathaniel Smith
On Wed, Apr 10, 2019, 04:04 Victor Stinner wrote:

> On Tue, Apr 9, 2019 at 22:16, Steve Dower wrote:
> > What are the other changes that would be required?
>
> I don't know.
>
> > And is there another
> > way to get the same functionality without ABI modifications?
>
> Py_TRACE_REFS is a doubly linked list of *all* Python objects. To get
> this functionality, you need to store the list somewhere. I don't know
> how to maintain such a list outside the PyObject structure.
>

I assume these pointers get updated from some generic allocation/free code.
Could that code instead overallocate by 16 bytes, use the first 16 bytes to
hold the pointers, and then return the PyObject* as (actual allocated
pointer + 16)? Basically the "container_of" trick.

> I don't think that I ever used sys.getobjects(), whereas many projects
> use gc.get_objects() which is also available in release builds (not
> only in debug builds).


Can anyone explain what pydebug builds are... for? Confession: I've never
used them myself, and don't know why I would want to.

(I have to assume that most of Steve's Windows downloads are from folks who
thought they were downloading a python debugger.)

-n


Re: [Python-Dev] (no subject)

2019-04-10 Thread Robert Okadar
Hi Steven,

Thank you for pointing me in the right direction. I will search for help
in the places you mentioned.

I'm not sure how we can help you with developing the Python interpreter,
as I doubt we have any knowledge that this project might use. When I say
'we', I mean my colleague and me.

All the best,
--
Robert Okadar
IT Consultant

Schedule an online meeting with me!

Visit aranea-mreze.hr or call +385 91 300 8887


On Wed, 10 Apr 2019 at 17:36, Steven D'Aprano wrote:

> Hi Robert,
>
> This mailing list is for the development of the Python interpreter, not
> a general help desk. There are many other forums where you can ask for
> help, such as the comp.lang.python newsgroup, Stackoverflow, /r/python
> on Reddit, the IRC channel, and more.
>
> Perhaps you can help us though, I presume you signed up to this mailing
> list via the web interface at
>
> https://mail.python.org/mailman/listinfo/python-dev
>
> Is there something we could do to make it more clear that this is not
> the right place to ask for help?
>
>
> --
> Steven
>


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Brett Cannon
On Wed, Apr 10, 2019 at 12:30 PM Nathaniel Smith wrote:

> On Wed, Apr 10, 2019, 04:04 Victor Stinner wrote:
>
>> On Tue, Apr 9, 2019 at 22:16, Steve Dower wrote:
>> > What are the other changes that would be required?
>>
>> I don't know.
>>
>> > And is there another
>> > way to get the same functionality without ABI modifications?
>>
>> Py_TRACE_REFS is a doubly linked list of *all* Python objects. To get
>> this functionality, you need to store the list somewhere. I don't know
>> how to maintain such a list outside the PyObject structure.
>>
>
> I assume these pointers get updated from some generic allocation/free
> code. Could that code instead overallocate by 16 bytes, use the first 16
> bytes to hold the pointers, and then return the PyObject* as (actual
> allocated pointer + 16)? Basically the "container_of" trick.
>
> I don't think that I ever used sys.getobjects(), whereas many projects
>> use gc.get_objects() which is also available in release builds (not
>> only in debug builds).
>
>
> Can anyone explain what pydebug builds are... for? Confession: I've never
> used them myself, and don't know why I would want to.
>

There are a bunch of extra things done in a debug build, e.g. all freed
memory is blanked out with a known pattern so it's easy to tell when you're
reading from freed memory (and have thus probably messed up your
refcounts). And then various extras are tossed onto the sys module to help
with things. Basically, anything people have found useful that requires
being compiled in typically gets clumped in under the debug build.

-Brett


>
> (I have to assume that most of Steve's Windows downloads are from folks
> who thought they were downloading a python debugger.)
>
> -n


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Steve Dower

On 10Apr2019 1227, Nathaniel Smith wrote:
> On Wed, Apr 10, 2019, 04:04 Victor Stinner wrote:
>> I don't think that I ever used sys.getobjects(), whereas many projects
>> use gc.get_objects() which is also available in release builds (not
>> only in debug builds).
>
> Can anyone explain what pydebug builds are... for? Confession: I've
> never used them myself, and don't know why I would want to.
>
> (I have to assume that most of Steve's Windows downloads are from folks
> who thought they were downloading a python debugger.)


They're for debugging :)

In general, debug builds are meant for faster inner-loop development. 
They generally do incremental builds properly and much faster by 
omitting most optimisations, which also enables source mapping to be 
more accurate when debugging. Assertions are typically enabled so that 
you are notified when a precondition is first identified rather than 
when it causes the crash (compiling these out later means you don't pay 
a runtime cost once you've got the inputs correct - generally these are 
used for developer-controlled values, rather than user-provided ones).


So the idea is that you can quickly edit, build, debug, fix your code in 
a debug configuration, and then use a release configuration for the 
actual released build. Full release builds may take 2-3x longer than 
full debug builds, given the extra effort they make at optimisation, and 
very often can't do minimal incremental builds at all (so they may be 
10-100x slower if you only modified one source file). But because the 
builds behave functionally equivalently, you can iterate with the faster 
configuration and get more done.


(Disclaimer: I do most of my work on Windows where this has been 
properly developed. What I hear from non-Windows developers is that 
other tools can't actually handle this kind of workflow properly. Sorry.)


The reason we ship debug Python binaries is because debug builds use a 
different C Runtime, so if you do a debug build of an extension module 
you're working on it won't actually work with a non-debug build of CPython.


While it's possible that people misread "Download debug binaries" (the 
text in the installer) and think that it's an actual debugger, I'd 
suggest that your total lack of context here means you should avoid 
making assumptions about users you know nothing about.


Cheers,
Steve


Re: [Python-Dev] (no subject)

2019-04-10 Thread Terry Reedy

On 4/10/2019 7:24 AM, Robert Okadar wrote:
> Hi community,
>
> I have developed a tkinter GUI component with Python v3.7. It runs very
> well on Linux, but I am seeing a huge performance impact on Windows 10.
> While on Linux almost real-time performance is achieved, on Windows it is
> slow to an unusable level.
>
> The code is somewhat stripped down from the original, but the performance
> difference is the same anyway. The columns can be resized by clicking on
> the column border and dragging it. Resizing works only from the top row
> (but it resizes the entire column).
> In this demo, all bindings are avoided to exclude their influence on the
> component's performance, and thus they are not included. If you resize
> the window (i.e., if you maximize it), you must call the function
> table.fit() from the IDLE shell.
>
> Does anyone know where this huge difference in performance comes from?
> Can anything be done about it?


For reasons explained by Steve, please send this instead to python-list
https://mail.python.org/mailman/listinfo/python-list
To access python-list as a newsgroup, skip comp.lang.python and use 
newsgroup gmane.comp.python.general at news.gmane.org.


I will respond there after testing/verifying and perhaps searching 
bugs.python.org for a similar issue.


--
Terry Jan Reedy



Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Terry Reedy

On 4/10/2019 2:45 PM, Steve Dower wrote:
> It's debug mode, and so you should expect less efficient memory and CPU
> usage.

On my Windows machine, 'python -m test -ugui' takes about twice as long.

> That's why we have two modes - so that it's easier to debug issues.


--
Terry Jan Reedy



Re: [Python-Dev] (no subject)

2019-04-10 Thread MRAB

On 2019-04-10 22:00, Terry Reedy wrote:
> On 4/10/2019 7:24 AM, Robert Okadar wrote:
>> Hi community,
>>
>> I have developed a tkinter GUI component with Python v3.7. It runs very
>> well on Linux, but I am seeing a huge performance impact on Windows 10.
>> While on Linux almost real-time performance is achieved, on Windows it
>> is slow to an unusable level.
>>
>> The code is somewhat stripped down from the original, but the
>> performance difference is the same anyway. The columns can be resized
>> by clicking on the column border and dragging it. Resizing works only
>> from the top row (but it resizes the entire column).
>> In this demo, all bindings are avoided to exclude their influence on
>> the component's performance, and thus they are not included. If you
>> resize the window (i.e., if you maximize it), you must call the
>> function table.fit() from the IDLE shell.
>>
>> Does anyone know where this huge difference in performance comes from?
>> Can anything be done about it?
>
> For reasons explained by Steve, please send this instead to python-list
> https://mail.python.org/mailman/listinfo/python-list
> To access python-list as a newsgroup, skip comp.lang.python and use
> newsgroup gmane.comp.python.general at news.gmane.org.
>
> I will respond there after testing/verifying and perhaps searching
> bugs.python.org for a similar issue.


ttk has Treeview, which can be configured as a table.
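
For example, a minimal sketch (not tied to the original component):

    import tkinter as tk
    from tkinter import ttk

    root = tk.Tk()
    # show="headings" hides the tree column, so the widget behaves as a
    # plain table with clickable column headings.
    table = ttk.Treeview(root, columns=("c0", "c1"), show="headings")
    table.heading("c0", text="Column 0")
    table.heading("c1", text="Column 1")
    for i in range(20):
        table.insert("", "end", values=(i, i * i))
    table.pack(fill="both", expand=True)
    root.mainloop()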


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Victor Stinner
On Wed, Apr 10, 2019 at 20:09, Steve Dower wrote:
> > The main question is if anyone ever used Py_TRACE_REFS? Does someone
> > use sys.getobjects() or PYTHONDUMPREFS environment variable?
> >
> > Using PYTHONDUMPREFS=1 on a debug build (with Py_TRACE_REFS) does
> > simply crash Python 3.7 at exit. So I don't think that anyone use it
> > :-)
>
> How do we track reference leaks in the buildbots? Can/should we be using
> this?

Ah, maybe there is a misunderstanding. You don't need Py_TRACE_REFS to
track memory leaks: "python3 -m test -R 3:3" works without it.
test_regrtest contains a unit test for reference leaks (I know because I
wrote the test :-)), and you can see that the test passes on my
PR. I also checked manually by adding a memory leak to a test: it is
still detected :-)

regrtest uses sys.gettotalrefcount(), sys.getallocatedblocks() and
support.fd_count() to track reference, memory and file descriptor
leaks. None of these functions are related to Py_TRACE_REFS.
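
The idea, reduced to a sketch (this is not the actual regrtest code, and
sys.gettotalrefcount() only exists in debug builds):

    import sys

    def leak_deltas(func, warmups=1, repeats=3):
        # Run func a few times to warm up caches, then record how the
        # total reference count changes per run; consistently positive
        # deltas suggest a reference leak.
        deltas = []
        for i in range(warmups + repeats):
            before = sys.gettotalrefcount()
            func()
            after = sys.gettotalrefcount()
            if i >= warmups:
                deltas.append(after - before)
        return deltas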

Again, the question is who relies on Py_TRACE_REFS. If nobody relies on
it, I don't see the point of keeping this expensive feature (at least,
not by default).

> It doesn't crash on Python 3.8, so I suspect fixing the bug is a better
> option than using it as an excuse to remove the feature.

That's not what I said. I only said that it seems that nobody uses
PYTHONDUMPREFS, since it has been broken for a long time. It's just a hint
about the usage of Py_TRACE_REFS.

I don't propose to remove the feature, but to disable it by default.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Victor Stinner
On Wed, Apr 10, 2019 at 21:45, Brett Cannon wrote:
>> Can anyone explain what pydebug builds are... for? Confession: I've never
>> used them myself, and don't know why I would want to.
>
> There are a bunch of extra things done in a debug build, e.g. all freed
> memory is blanked out with a known pattern so it's easy to tell when
> you're reading from freed memory (and have thus probably messed up your
> refcounts).

Since the debug build ABI is incompatible, it's not easy to use a
debug build. For that reason, I have been working for a few years to add
such debugging features to the regular release build. For example, you can
now get these debug hooks on memory allocators using the
PYTHONMALLOC=debug environment variable since Python 3.6.

Since such debug features are not easy to discover (especially if you
don't read What's New In Python 3.x closely), I added a generic "-X
dev" command line option to enable a "development mode". It enables
various similar features to debug code:
https://docs.python.org/dev/using/cmdline.html#id5

Effects of the developer mode:

* Add the default warning filter, as with -W default.
* Install debug hooks on memory allocators: see the
PyMem_SetupDebugHooks() C function.
* Enable the faulthandler module to dump the Python traceback on a crash.
* Enable asyncio debug mode.
* Set the dev_mode attribute of sys.flags to True.

See also https://pythondev.readthedocs.io/debug_tools.html where I
started to document these debug tools and how to use them.
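
A small sketch of detecting and partially approximating the development
mode from inside a program (the memory-allocator hooks can only be
installed at startup, so this is not a full replacement for -X dev):

    import faulthandler
    import sys
    import warnings

    print(sys.flags.dev_mode)         # True only when started with -X dev

    # Approximations of two dev-mode effects:
    faulthandler.enable()             # dump the Python traceback on a crash
    warnings.simplefilter("default")  # like the -W default filter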


> And then various extras are tossed onto the sys module to help with
> things. Basically, anything people have found useful that requires being
> compiled in typically gets clumped in under the debug build.

The debug build still contains many features which are useful for
debugging C extensions. For example, it adds sys.gettotalrefcount(), which
is a convenient way to detect reference leaks. This function requires
Py_REF_DEBUG, which modifies Py_INCREF() to add "_Py_RefTotal++;". It
has an impact on overall Python performance and should not be enabled in
the release build.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] PEP 590 discussion

2019-04-10 Thread Jeroen Demeyer

On 2019-04-10 18:25, Petr Viktorin wrote:
> Hello!
> I've had time for a more thorough reading of PEP 590 and the reference
> implementation. Thank you for the work!


And thank you for the review!


> I'd now describe the fundamental
> difference between PEP 580 and PEP 590 as:
> - PEP 580 tries to optimize all existing calling conventions
> - PEP 590 tries to optimize (and expose) the most general calling
> convention (i.e. fastcall)


And PEP 580 has better performance overall, even for METH_FASTCALL. See 
this thread:

https://mail.python.org/pipermail/python-dev/2019-April/156954.html

Since these PEPs are all about performance, I consider this a very 
relevant argument in favor of PEP 580.



> PEP 580 also does a number of other things, as listed in PEP 579. But I
> think PEP 590 does not block future PEPs for the other items.
> On the other hand, PEP 580 has a much more mature implementation -- and
> that's where it picked up real-world complexity.

About complexity, please read what I wrote in
https://mail.python.org/pipermail/python-dev/2019-March/156853.html

I claim that the complexity in the protocol of PEP 580 is a good thing,
as it removes complexity from other places, in particular from the users
of the protocol (better to have a complex protocol that's simple to use
than a simple protocol that's complex to use).


As a more concrete example of the simplicity that PEP 580 could bring, 
CPython currently has 2 classes for bound methods implemented in C:

- "builtin_function_or_method" for normal C methods
- "method-descriptor" for slot wrappers like __eq__ or __add__

With PEP 590, these classes would need to stay separate to get maximal 
performance. With PEP 580, just one class for bound methods would be 
sufficient and there wouldn't be any performance loss. And this extends 
to custom third-party function/method classes, for example as 
implemented by Cython.



> PEP 590's METH_VECTORCALL is designed to handle all existing use cases,
> rather than mirroring the existing METH_* varieties.
> But both PEPs require the callable's code to be modified, so requiring
> it to switch calling conventions shouldn't be a problem.


Agreed.


> Jeroen's analysis from
> https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems
> to miss a step at the top:
>
> a. CALL_FUNCTION* / CALL_METHOD opcode
>    calls
> b. _PyObject_FastCallKeywords()
>    which calls
> c. _PyCFunction_FastCallKeywords()
>    which calls
> d. _PyMethodDef_RawFastCallKeywords()
>    which calls
> e. the actual C function (*ml_meth)()
>
> I think it's more useful to say that both PEPs bridge a->e (via
> _Py_VectorCall or PyCCall_Call).


Not quite. For a builtin_function_or_method, we have with PEP 580:

a. call_function()
calls
d. PyCCall_FastCall
which calls
e. the actual C function

and with PEP 590 it's more like:

a. call_function()
calls
c. _PyCFunction_FastCallKeywords
which calls
d. _PyMethodDef_RawFastCallKeywords
which calls
e. the actual C function

Level c. above is the vectorcall wrapper, which is a level that PEP 580 
doesn't have.



> The way `const` is handled in the function signatures strikes me as too
> fragile for a public API.


That's a detail which shouldn't influence the acceptance of either PEP.


> Why not have a per-type pointer, and for types that need it (like
> PyTypeObject), make it dispatch to an instance-specific function?


That would be exactly https://bugs.python.org/issue29259

I'll let Mark comment on this.


> Minor things:
> - "Continued prohibition of callable classes as base classes" -- this
> section reads as final. Would you be OK wording this as something
> other PEPs can tackle?
> - "PyObject_VectorCall" -- this looks extraneous, and the reference
> implementation doesn't need it so far. Can it be removed, or justified?
> - METH_VECTORCALL is *not* strictly "equivalent to the currently
> undocumented METH_FASTCALL | METH_KEYWORD flags" (it has the
> ARGUMENTS_OFFSET complication).
> - I'd like to officially call this PEP "Vectorcall", see
> https://github.com/python/peps/pull/984


Those are indeed details which shouldn't influence the acceptance of 
either PEP. If you go with PEP 590, then we should discuss this further.



> Mark, what are your plans for next steps with PEP 590? If a volunteer
> wanted to help you push this forward, what would be the best thing to
> work on?


Personally, I think what we need now is a decision between PEP 580 and
PEP 590 (there is still the possibility of rejecting both, but I really
hope that this won't happen). There is a lot of work that still needs to
be done after either PEP is accepted, such as:

- finish and merge the reference implementation
- document everything
- use the protocol in more classes where it makes sense (for example,
staticmethod, wrapper_descriptor)
- use this in Cython
- handle more issues from PEP 579

I volunteer to put my time into this, regardless of which PEP is
accepted. Of course, I still think that PEP 580 is better, but I also
want this functionality.

Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Nathaniel Smith
On Wed, Apr 10, 2019 at 1:50 PM Steve Dower wrote:
>
> On 10Apr2019 1227, Nathaniel Smith wrote:
> > On Wed, Apr 10, 2019, 04:04 Victor Stinner wrote:
> > I don't think that I ever used sys.getobjects(), whereas many projects
> > use gc.get_objects() which is also available in release builds (not
> > only in debug builds).
> >
> >
> > Can anyone explain what pydebug builds are... for? Confession: I've
> > never used them myself, and don't know why I would want to.
> >
> > (I have to assume that most of Steve's Windows downloads are from folks
> > who thought they were downloading a python debugger.)
>
> They're for debugging :)
>
> In general, debug builds are meant for faster inner-loop development.
> They generally do incremental builds properly and much faster by
> omitting most optimisations, which also enables source mapping to be
> more accurate when debugging. Assertions are typically enabled so that
> you are notified when a precondition is first identified rather than
> when it causes the crash (compiling these out later means you don't pay
> a runtime cost once you've got the inputs correct - generally these are
> used for developer-controlled values, rather than user-provided ones).
>
> So the idea is that you can quickly edit, build, debug, fix your code in
> a debug configuration, and then use a release configuration for the
> actual released build. Full release builds may take 2-3x longer than
> full debug builds, given the extra effort they make at optimisation, and
> very often can't do minimal incremental builds at all (so they may be
> 10-100x slower if you only modified one source file). But because the
> builds behave functionally equivalently, you can iterate with the faster
> configuration and get more done.

Sure, I'm familiar with the idea of debug and optimization settings in
compilers. I build python with custom -g and -O flags all the time. (I
do it by setting OPT when running configure.) It's also handy that
many Linux distros these days let you install debug metadata for all
the binaries they ship – I've used that when debugging third-party
extension modules, to get a better idea of what was happening when a
backtrace passes through libpython. But --with-pydebug is a whole
other thing beyond that, that changes the ABI, has its own wheel tags,
requires special cases in packages that use ctypes to access PyObject*
internals, and appears to be almost entirely undocumented.

It sounds like --with-pydebug has accumulated a big grab bag of
unrelated features, mostly stuff that was useful at some point for
some CPython dev trying to debug CPython itself? It's clearly not
designed with end users as the primary audience, given that no-one
knows what it actually does and that it makes third-party extensions
really awkward to run. If that's right then I think Victor's plan of
to sort through what it's actually doing makes a lot of sense,
especially if we can remove the ABI breaking stuff, since that causes
a disproportionate amount of trouble.

> The reason we ship debug Python binaries is because debug builds use a
> different C Runtime, so if you do a debug build of an extension module
> you're working on it won't actually work with a non-debug build of CPython.

...But this is an important point. I'd forgotten that MSVC has a habit
of changing the entire C runtime when you turn on the compiler's
debugging mode. (On Linux, we generally don't bother rebuilding the C
runtime unless you're actually debugging the C runtime, and anyway if
you do want to switch to a debug version of the C runtime, it's ABI
compatible so your program binaries don't have to be rebuilt.)

Is it true that if the interpreter is built against ucrtd.lib, and an
extension module is built against ucrt.lib, then they'll have
incompatible ABIs and not work together? And that this detail is part
of what's been glommed together into the "d" flag in the soabi tag on
Windows?

Is it possible for the Windows installer to include PDB files (/Zi
/DEBUG) to allow debuggers to understand the regular release
executable? (That's what I would have expected to get if I checked a
box labeled "Download debug binaries".)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-10 Thread Serhiy Storchaka

On 10.04.19 14:01, Victor Stinner wrote:
> Disabling Py_TRACE_REFS by default in debug mode reduces the Python
> memory footprint. Py_TRACE_REFS costs 2 pointers per PyObject: 16
> bytes on 64-bit platforms.

Doesn't the memory allocator in debug mode have an even larger cost per
allocated block?

