[issue47149] DatagramHandler doing DNS lookup on every log message
New submission from Bruce Merry:

logging.DatagramHandler uses socket.sendto to send its messages. If the given address is a hostname rather than an IP address, it does a DNS lookup for every log message. I suspect that fixing issue 14855 will also fix this, since fixing that issue requires resolving the hostname (to determine whether it is an IPv4 or IPv6 address) in order to create a suitable socket.

I've run into this on 3.8, but I'm tagging 3.10 since the code still looks the same.

----------
components: Library (Lib)
messages: 416247
nosy: bmerry
priority: normal
severity: normal
status: open
title: DatagramHandler doing DNS lookup on every log message
versions: Python 3.10

___ Python tracker <https://bugs.python.org/issue47149> ___
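As a workaround on the application side, the lookup can be done once up front and the numeric address passed to the handler. A minimal sketch, assuming the logging server's address does not need to track DNS changes (the host, port and logger name here are placeholders):

```
import logging
import logging.handlers
import socket

host, port = "logs.example.com", 9000  # hypothetical logging server

# Resolve once at startup so that sendto() always gets a numeric address
# and never triggers a per-message DNS query.
ip = socket.getaddrinfo(host, port, proto=socket.IPPROTO_UDP)[0][4][0]

handler = logging.handlers.DatagramHandler(ip, port)
logging.getLogger("app").addHandler(handler)
```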
[issue47149] DatagramHandler doing DNS lookup on every log message
Change by Bruce Merry:

----------
type:  -> performance

___ Python tracker <https://bugs.python.org/issue47149> ___
[issue47149] DatagramHandler doing DNS lookup on every log message
Bruce Merry added the comment:

> If you don’t look it up every time, how do you deal with DNS timeouts?

Do you mean expiring the IP address when the TTL is reached? I suppose that could be an issue for a long-running service, and I don't have a good answer to that. Possibly these days, with service meshes and load balancers, it is less of a concern, since a logging server can move without changing its IP address.

But it's important for a logging system not to block the service doing the logging (which is one reason for using UDP in the first place). I only discovered this issue because of some flaky DNS servers that would occasionally take several seconds to answer a query, blocking the whole asyncio event loop while the handler waited.

At a minimum it would be useful to document the behaviour, so that users know it's something to be concerned about when using DatagramHandler.

___ Python tracker <https://bugs.python.org/issue47149> ___
[issue47149] DatagramHandler doing DNS lookup on every log message
Bruce Merry added the comment:

> Yes, that's what I mean. Isn't the resolver library smart enough to cache
> lookups and handle the TTL timeout by itself?

Apparently not in this case - with tcpdump I can see the DNS requests being fired off several times a second. I'll need to check what the TTL actually is, though.

___ Python tracker <https://bugs.python.org/issue47149> ___
[issue47149] DatagramHandler doing DNS lookup on every log message
Bruce Merry added the comment:

> Hmm. I'm not sure we should try to work around a bad resolver issue. What's
> your platform, and how did you install Python?

Fair point. It's Ubuntu 20.04, running inside Docker, with the default Python (3.8). I've also reproduced it outside Docker (again Ubuntu 20.04 with the system Python). The TTL is 30s, so I'm not sure why systemd-resolved isn't caching it for messages logged several times a second.

Even if the system has a local cache, though, it's not ideal for logging to block when the TTL expires, particularly in an event-driven (asyncio) service. Updating the address in a background thread while continuing to log to the old address might be better. But my use case is particularly real-time (even 10ms of latency is problematic), and maybe that shouldn't drive the default behaviour.

I blame the lack of standard POSIX functions for doing DNS lookups asynchronously and in a way that provides TTL information to the client.

___ Python tracker <https://bugs.python.org/issue47149> ___
[issue47149] DatagramHandler doing DNS lookup on every log message
Bruce Merry added the comment:

> But it's going to be non-trivial, I fear.

Yeah. Maybe some documentation is achievable in the short term, though, so that users who care more about latency than about tracking DNS changes know that they should do the lookup themselves?

___ Python tracker <https://bugs.python.org/issue47149> ___
[issue37141] Allow multiple separators in Stream.readuntil
Bruce Merry added the comment:

I finally have permission from my employer to sign the contributor agreement, so I'll take a stab at this when I have some free time (unless somebody else gets to it first).

___ Python tracker <https://bugs.python.org/issue37141> ___
[issue37141] Allow multiple separators in Stream.readuntil
Change by Bruce Merry:

----------
keywords: +patch
pull_requests: +16008
stage: test needed -> patch review
pull_request: https://github.com/python/cpython/pull/16429

___ Python tracker <https://bugs.python.org/issue37141> ___
[issue37141] Allow multiple separators in Stream.readuntil
Bruce Merry added the comment:

I've submitted a PR: https://github.com/python/cpython/pull/16429

___ Python tracker <https://bugs.python.org/issue37141> ___
[issue38242] Revert the new asyncio Streams API
Change by Bruce Merry:

----------
nosy: +bmerry

___ Python tracker <https://bugs.python.org/issue38242> ___
[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
Bruce Merry added the comment:

Re-opening because the patch that fixed this has just been reverted due to bpo-42853.

----------
status: closed -> open

___ Python tracker <https://bugs.python.org/issue36050> ___
[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB
Bruce Merry added the comment:

This fix is going to cause a regression of bpo-36050. Would it not be possible to fix this in _ssl.c instead, by breaking a large read into multiple smaller calls to SSL_read? Fixing it at the SSL layer seems more appropriate than working around it at the HTTP layer, which impacts the performance of all HTTP fetches (whether or not they use TLS, and whether or not they exceed 2GB).

----------
nosy: +bmerry

___ Python tracker <https://bugs.python.org/issue42853> ___
[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB
Bruce Merry added the comment:

> It seems like we could have support for OpenSSL 1.1.1 at that level with a
> compile time fallback for previous OpenSSL versions that break up the work.
> Would hope this solution also yields something we can backport more easily

I'd have to look at exactly how the SSL_read API works, but I think that once we're in C and can read into regions of a buffer, reading in 2GB chunks is unlikely to cause a performance hit (unlike the original bpo-36050, where Python had to read a bunch of separate buffers and then join them together). So trying to have 3.9 support both SSL_read_ex AND a fallback sounds like it adds complexity, and risks inconsistency if the fallback doesn't perfectly mimic the SSL_read_ex path, for very little gain.

If no one else steps up sooner I can probably work on a patch, but before sinking time into it I'd like to hear whether there is agreement that this is a reasonable approach, and ideally have a volunteer to review it (hopefully someone familiar with OpenSSL, since I've only briefly dealt with it years ago and crypto isn't somewhere you want to make mistakes).

___ Python tracker <https://bugs.python.org/issue42853> ___
[issue42853] `OverflowError: signed integer is greater than maximum` in ssl.py for files larger than 2GB
Bruce Merry added the comment:

> A patch would not land in Python 3.9 since this would be a new feature and
> out-of-scope for a released version.

I see it as a fix for this bug. While there is already a fix, it regresses another bug (bpo-36050), so this would be a better fix.

> Do you really want to store gigabytes of downloads in RAM instead of doing
> chunked reads and store them on disk?

I work on HPC applications where large quantities of data are stored in an S3-compatible object store and fetched over HTTP at 25Gb/s for processing. The data access layer tries very hard to avoid even making extra copies in memory (which is what caused me to file bpo-36050 in the first place), as it makes a significant difference at those speeds. Buffering to disk would be right out.

> then there are easier and better ways to deal with large buffers

Your example code is probably fine if one is working directly on an SSLSocket, but http.client wraps it in a buffered reader (via `socket.makefile`), and that implements `readinto` by reading into a temporary buffer and copying it (https://github.com/python/cpython/blob/8d0647485db5af2a0f0929d6509479ca45f1281b/Modules/_io/bufferedio.c#L88), which would add overhead.

I appreciate that what I'm proposing is a relatively complex change for a released version. A less intrusive option would be to change MAXAMOUNT in http.client from 1MiB to 2GiB-1 byte (as suggested by @matan1008). That would still leave 3.9 slower than 3.8 when reading >2GiB responses over plain HTTP, but at least everything in the range [1MiB, 2GiB) would operate at full speed (which is the region I actually care about).

___ Python tracker <https://bugs.python.org/issue42853> ___
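For reference, the kind of chunking being discussed looks roughly like this at the application level, working directly on an SSLSocket rather than through http.client's buffered reader. This is only a sketch; read_exact and CHUNK are illustrative names, not an existing API:

```
import ssl

# Stay below the 2 GiB that a single SSL_read call can handle.
CHUNK = (1 << 31) - 1

def read_exact(sock: ssl.SSLSocket, n: int) -> bytearray:
    """Read exactly n bytes, issuing reads of at most CHUNK bytes each."""
    buf = bytearray(n)
    view = memoryview(buf)
    pos = 0
    while pos < n:
        got = sock.recv_into(view[pos:pos + CHUNK])
        if got == 0:
            raise EOFError(f"connection closed after {pos} of {n} bytes")
        pos += got
    return buf
```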
[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
Bruce Merry added the comment:

> There is nothing to do here.

Will you accept patches to fix this for 3.9? I'm not clear whether the "bug fixes only" status of 3.9 allows for fixing performance regressions.

___ Python tracker <https://bugs.python.org/issue36050> ___
[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
Bruce Merry added the comment:

> Will you accept patches to fix this for 3.9? I'm not clear whether the "bug
> fixes only" status of 3.9 allows for fixing performance regressions.

Never mind, I see you already answered this on bpo-42853 (as a no). Thanks for taking the time to answer my questions; I'll just have to skip Python 3.9 for this particular application and go straight to 3.10.

___ Python tracker <https://bugs.python.org/issue36050> ___
[issue21644] Optimize bytearray(int) constructor to use calloc()
Bruce Merry added the comment:

> I abandonned the issue because I didn't have time to work on it. If you want,
> you can open a new issue for that.

If I make a pull request and run some microbenchmarks, will you (or some other core dev) have time to review it? I've had a bad experience before with a PR that I'm still unable to get reviewed after several years, so I'd like to get at least a tentative agreement before I invest time in it.

___ Python tracker <https://bugs.python.org/issue21644> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

> It seems we can release GIL during iterating the buffer array.

That's what I had in mind. Naturally it would require a bit of benchmarking to pick a threshold such that the small case doesn't lose performance due to locking overheads. If no one else is working on it, I can give that a try early next year.

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Change by Bruce Merry:

----------
keywords: +patch
pull_requests: +17193
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17757

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

If we want to be conservative, we could only drop the GIL if all the buffers pass the PyBytes_CheckExact test. Presumably that won't encounter any of these problems, because bytes objects are immutable?

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Change by Bruce Merry:

Added file: https://bugs.python.org/file48812/new.csv

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Change by Bruce Merry:

Added file: https://bugs.python.org/file48811/old.csv

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Change by Bruce Merry:

Added file: https://bugs.python.org/file48813/benchjoin.py

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

I've attached a benchmark script and CSV results for master (whichever version that was at the point I forked) and for a build with unconditional dropping of the GIL. It shows up to a 3x performance improvement when using 4 threads. That's on my home desktop, which is quite old (Sandy Bridge); I'm expecting more significant gains on server CPUs, whose memory systems are optimised for multi-threaded workloads. The columns are chunk size, number of chunks, number of threads, and per-thread throughput.

There are also cases where using multiple threads is a slowdown, but I think that's an artifact of the benchmark. It repeatedly joins the same strings, so performance is higher when they all fit in the cache; with 4 threads executing in parallel, the working set is 4x larger and may cease to fit in cache. In real-world usage one is unlikely to be joining the same strings again and again.

In the single-threaded case, the benchmark seems to show that for 64K+ the performance is improved by dropping the GIL (which I'm guessing must be statistical noise, since there shouldn't be anything contending for it), which is my reasoning behind the 65536 threshold.

I'll take a look at extra unit tests soon. Do you know off the top of your head where to look for existing `join` tests to add to?

___ Python tracker <https://bugs.python.org/issue36051> ___
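The attached benchjoin.py is not reproduced here, but the shape of the measurement is roughly the following; in the real script the chunk size, chunk count and thread count are swept over a grid (the numbers below are placeholders):

```
import threading
import time

def bench_join(chunk_size, num_chunks, num_threads, repeats=100):
    """Return mean per-thread b''.join throughput in bytes/s."""
    chunks = [b"x" * chunk_size] * num_chunks
    total = chunk_size * num_chunks
    results = [0.0] * num_threads

    def worker(idx):
        start = time.perf_counter()
        for _ in range(repeats):
            b"".join(chunks)
        elapsed = time.perf_counter() - start
        results[idx] = repeats * total / elapsed

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results) / num_threads

print(bench_join(chunk_size=65536, num_chunks=1024, num_threads=4))
```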
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

> I'll take a look at extra unit tests soon. Do you know off the top of your
> head where to look for existing `join` tests to add to?

Never mind, I found it: https://github.com/python/cpython/blob/92709a263e9cec0bc646ccc1ea051fc528800d8d/Lib/test/test_bytes.py#L535-L559

Do you think it would be sufficient to change the stress test from joining 1000 items to joining 10 items?

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

> Do you think it would be sufficient to change the stress test from joining
> 1000 items to joining 10 items?

Actually that won't work, because the existing stress test uses a non-empty separator. I'll add another version of that stress test that uses an empty separator.

___ Python tracker <https://bugs.python.org/issue36051> ___
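Something along these lines is what I have in mind - this is only a sketch of such a test, not the one that was actually added, and the sizes are arbitrary placeholders chosen to comfortably exceed any plausible size threshold:

```
import unittest

class JoinStressTest(unittest.TestCase):
    def test_join_stress_empty_separator(self):
        chunk = b"abcde" * 1000          # 5 KB per item
        items = [chunk] * 2000           # ~10 MB joined in total
        joined = b"".join(items)
        self.assertEqual(len(joined), len(chunk) * len(items))
        self.assertEqual(joined, chunk * len(items))

if __name__ == "__main__":
    unittest.main()
```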
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

I'm realising that the benchmark makes it difficult to see what's going on, because it doesn't separate overhead costs (slowdowns because releasing/acquiring the GIL is not free, particularly when contended) from cache effects (slowdowns because parallel threads create more cache pressure than threads that take turns). inada.naoki's version of the benchmark is better here because it uses the same input data for all the threads, but the output data will still be different in each thread.

For example, on my system I see a big drop in speedup (although I still get a speedup) with the new benchmark once the buffer size reaches 2MB per thread, which is not surprising with an 8MB L3 cache.

My feeling is that we should try to ignore cache effects when picking a threshold, because we can't predict them generically (they'll vary by workload, thread count, CPU etc.), whereas users can benchmark specific use cases to decide whether multithreading gives them a benefit. If the threshold is too low then users can always choose not to use multi-threading (and in general one doesn't expect much from it in Python), but if the threshold is too high then users have no recourse.

That being said, 65536 does still seem a bit low based on the results available. I'll try to write a variant of the benchmark in which other threads just spin in Python without creating memory pressure, to see if that gives a different picture. I'll also run the benchmark on a server CPU when I'm back at work.

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

I've written a variant of the benchmark in which one thread does joins and the other does unrelated CPU-bound work that doesn't touch memory much. It also didn't show much benefit for thresholds below 512KB. I still want to test things on a server-class CPU, but based on the evidence so far I'm okay with a 1MB threshold.

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

I ran the test on a Xeon machine (Skylake-XP) and it also looks like performance is only improved from 1MB up (somewhat to my surprise, given how poor single-threaded memcpy performance is on that machine). So I've updated the pull request with that threshold.

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue36051] Drop the GIL during large bytes.join operations?
Bruce Merry added the comment:

I think I've addressed the concerns that were raised in this bug, but let me know if I've missed any.

___ Python tracker <https://bugs.python.org/issue36051> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
Bruce Merry added the comment:

Good catch! I'll take a look this week to see what makes sense for the use case for which I originally proposed this optimisation.

___ Python tracker <https://bugs.python.org/issue39974> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
Bruce Merry added the comment:

> static_buffers is not a static variable. It is auto local variable.
> So I think other thread don't hijack it.

Oh yes, quite right. I should have looked closer at the code before commenting. I think this can be closed as not-a-bug, unless +tzickel has example code that gives the wrong output?

> perhaps add an if to check if the backing object is really mutable ?
> (Py_buffer.readonly)

It's not just the buffer data being mutable that's an issue, it's the owning object. It's possible for an object to expose a read-only buffer but also allow the buffer (including its size or address) to be mutated through its own API.

> Also, semi related, (dunno where to discuss it), would a good .join()
> optimization be to add an optional length parameter, like .join(iterable, length=10)

You could always open a separate bug for it, but I can't see it catching on, given that one needs to modify one's code for it.

___ Python tracker <https://bugs.python.org/issue39974> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
Bruce Merry added the comment:

+tzickel I'd suggest reading the discussion in issue 36051, and maybe raising a new issue about it if you still have concerns. In short, dropping the GIL in more bytes.join cases wouldn't necessarily be wrong, but it might break code that assumed bytes.join is atomic, even though that has never been claimed.

___ Python tracker <https://bugs.python.org/issue39974> ___
[issue41002] HTTPResponse.read with amt is slow
New submission from Bruce Merry:

I've run into this on 3.8, but the code on Git master doesn't look significantly different, so I assume it still applies. I'm happy to work on a PR for this.

When http.client.HTTPResponse.read is called with a specific amount to read, it goes down this code path:

```
        if amt is not None:
            # Amount is given, implement using readinto
            b = bytearray(amt)
            n = self.readinto(b)
            return memoryview(b)[:n].tobytes()
```

That's pretty inefficient, because

- `bytearray(amt)` will first zero-fill some memory;
- `tobytes()` will make an extra copy of this memory;
- if amt is big enough, it'll cause the temporary memory to be allocated from the kernel, which will *also* zero-fill the pages for security.

A better approach would be to use the read method of the underlying fp. I have a micro-benchmark (attached) showing that for a 1GB body, reading the whole body with the amount given explicitly rather than omitted reduces throughput from 3GB/s to 1GB/s.

For some unknown reason the requests library likes to read the body in 10KB chunks even if the user has requested the entire body, so this will help here too (although the gains probably won't be as big, because 10KB is really too small to amortise all the accounting overhead).

Output from my benchmark, run against a 1GB file on localhost:

httpclient-read:        3019.0 ± 63.8 MB/s
httpclient-read-length: 1050.3 ± 4.8 MB/s
httpclient-read-raw:    3150.3 ± 5.3 MB/s
socket-read:            3134.4 ± 7.9 MB/s

----------
components: Library (Lib)
files: httpbench-simple.py
messages: 371732
nosy: bmerry
priority: normal
severity: normal
status: open
title: HTTPResponse.read with amt is slow
versions: Python 3.8
Added file: https://bugs.python.org/file49239/httpbench-simple.py

___ Python tracker <https://bugs.python.org/issue41002> ___
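Roughly, the direction I have in mind is the following. This is a sketch of a fragment of HTTPResponse.read, not a runnable snippet and not necessarily the final patch; it omits the handling needed for chunked transfer-encoding:

```
        if amt is not None:
            if self.length is not None and amt > self.length:
                # Clip the read to the remaining body length.
                amt = self.length
            # Read straight from the underlying buffered socket file instead
            # of round-tripping through bytearray + readinto + tobytes.
            data = self.fp.read(amt)
            if self.length is not None:
                self.length -= len(data)
                if not self.length:
                    self._close_conn()
            return data
```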
[issue41002] HTTPResponse.read with amt is slow
Change by Bruce Merry:

----------
type:  -> performance

___ Python tracker <https://bugs.python.org/issue41002> ___
[issue41002] HTTPResponse.read with amt is slow
Change by Bruce Merry:

----------
keywords: +patch
pull_requests: +20124
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/20943

___ Python tracker <https://bugs.python.org/issue41002> ___
[issue41002] HTTPResponse.read with amt is slow
Bruce Merry added the comment:

> (perhaps 'MB/s's are wrong).

Why, are you getting significantly different results? Just in case it's confusing: the results are reported as A ± B MB/s, where A is the mean and B is the standard deviation of the mean. So it's about 3GB/s when no length is passed, and 1GB/s when a length is passed.

___ Python tracker <https://bugs.python.org/issue41002> ___
[issue32528] Change base class for futures.CancelledError
Bruce Merry added the comment:

FYI this has just bitten me after updating my OS to one that ships Python 3.8. It is code that was written with asyncio cancellation in mind, and that expected CancelledError to be caught with "except Exception" (the exception block unwound incomplete operations before re-raising the exception).

It's obviously too late to do anything about Python 3.8, but I'm mentioning this as a data point in support of having a deprecation period if similar changes are made in future. On the plus side, while fixing up my code and checking all instances of "except Exception" I found some places where this change did fix latent cancellation bugs. So I'm happy with the change, just a little unhappy that it came as a surprise.

----------
nosy: +bmerry

___ Python tracker <https://bugs.python.org/issue32528> ___
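To make the behaviour change concrete, here is a minimal illustration of the pattern that broke; the print calls are stand-ins for real cleanup code:

```
import asyncio

async def worker():
    try:
        await asyncio.sleep(3600)   # some long operation
    except Exception:
        # On 3.7, CancelledError inherited from Exception, so cancellation
        # landed here and the cleanup-then-reraise pattern worked.
        # On 3.8+, it inherits from BaseException and skips this block.
        print("cleanup on cancellation")
        raise

async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("task cancelled")

asyncio.run(main())
```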
[issue21644] Optimize bytearray(int) constructor to use calloc()
Bruce Merry added the comment:

Was this abandoned just because nobody had the time, or was there a problem with the approach? I independently wanted this optimisation, and have ended up implementing something very similar to what was reverted in https://hg.python.org/lookup/dff6b4b61cac.

In a benchmark that creates a large bytearray and then fills it with socket.readinto, I'm seeing a 2x performance improvement on Linux, and from some quick benchmarking it seems to be just as fast as the old code for small arrays that are allocated from the pool.

----------
nosy: +bmerry

___ Python tracker <https://bugs.python.org/issue21644> ___
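For context, the benchmark is roughly of this shape (a simplified sketch, not the actual code I ran; the buffer size is a placeholder):

```
import socket
import threading
import time

N = 256 * 1024 * 1024   # large enough that the allocation comes from the OS

def recv_exact(sock, n):
    buf = bytearray(n)   # zero-filled today; calloc would make this nearly free
    view = memoryview(buf)
    pos = 0
    while pos < n:
        pos += sock.recv_into(view[pos:])
    return buf

a, b = socket.socketpair()
sender = threading.Thread(target=b.sendall, args=(bytes(N),))
sender.start()
start = time.perf_counter()
recv_exact(a, N)
print(f"{N / (time.perf_counter() - start) / 1e9:.2f} GB/s")
sender.join()
```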
[issue32052] Provide access to buffer of asyncio.StreamReader
Bruce Merry added the comment:

Ok, I'll open a separate issue to allow a tuple of possible separators.

----------
nosy: +bmerry

___ Python tracker <https://bugs.python.org/issue32052> ___
[issue37141] Allow multiple separators in StreamReader.readuntil
New submission from Bruce Merry:

Text-based protocols sometimes allow a choice of newline separator - I work with one that allows either \r or \n. Unfortunately that doesn't work with StreamReader.readuntil, which only accepts a single separator, so I've had to do some hacky things to obtain lines without having to

From discussion in issue 32052, it sounded like extending StreamReader.readuntil to support a tuple of separators would be feasible.

----------
components: asyncio
messages: 344397
nosy: asvetlov, bmerry, yselivanov
priority: normal
severity: normal
status: open
title: Allow multiple separators in StreamReader.readuntil
type: enhancement
versions: Python 3.8

___ Python tracker <https://bugs.python.org/issue37141> ___
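The usage I have in mind would look something like this - a sketch of the proposed API, not something that exists today (current readuntil accepts only a single bytes separator):

```
import asyncio

async def read_line(reader: asyncio.StreamReader) -> bytes:
    # Proposed: accept a tuple of candidate separators and return the data
    # up to and including whichever one appears first in the stream.
    return await reader.readuntil((b"\r", b"\n"))
```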
[issue37141] Allow multiple separators in StreamReader.readuntil
Bruce Merry added the comment:

I wasn't aware of that deprecation - it doesn't seem to be mentioned at https://docs.python.org/3.8/library/asyncio-stream.html. What is the replacement?

___ Python tracker <https://bugs.python.org/issue37141> ___
[issue37141] Allow multiple separators in StreamReader.readuntil
Bruce Merry added the comment:

Ok. Does the new Stream still have a similar interface for readuntil, i.e. is this still a relevant request against the new API? I'm happy to let deprecated APIs stay as-is.

___ Python tracker <https://bugs.python.org/issue37141> ___
[issue37141] Allow multiple separators in Stream.readuntil
Bruce Merry added the comment:

Ok, I've changed the issue title to refer to Stream. Since this would be a new feature, I assume it's off the table for 3.8, but I'll see if I get time to implement a PR in time for 3.9 (and get someone at work to sign off on the contributor agreement, which might be the harder part). Thanks for the quick and helpful responses.

----------
title: Allow multiple separators in StreamReader.readuntil -> Allow multiple separators in Stream.readuntil

___ Python tracker <https://bugs.python.org/issue37141> ___
[issue36050] Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
New submission from Bruce Merry:

While investigating poor HTTP read performance I discovered that reading all the data from a response with a content-length goes via _safe_read, which in turn reads in chunks of at most MAXAMOUNT (1MB) before stitching them together with b"".join.

This can really hurt performance for responses larger than MAXAMOUNT, because (a) the data has to be copied an additional time, and (b) the join operation doesn't drop the GIL, so this limits multi-threaded scaling.

I'm struggling to see any advantage in doing this chunking - it's not saving memory either (in fact it is wasting it). To give an idea of the performance impact, changing MAXAMOUNT to a very large value made a multithreaded test of mine go from 800MB/s to 2.5GB/s (which is limited by the network speed).

----------
components: Library (Lib)
messages: 336081
nosy: bmerry
priority: normal
severity: normal
status: open
title: Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
versions: Python 3.7

___ Python tracker <https://bugs.python.org/issue36050> ___
[issue36051] (Performance) Drop the GIL during large bytes.join operations?
New submission from Bruce Merry:

A common pattern in libraries doing I/O is to receive data in chunks, put them in a list, then join them all together using b"".join(chunks). For example, see http.client.HTTPResponse._safe_read. When the output is large, the memory copies can block the interpreter for a non-trivial amount of time and prevent multi-threaded scaling.

If the GIL could be dropped during the memcpys, it could improve parallel I/O performance in some high-bandwidth scenarios (issue 36050 mentions a case where I've run into this serialisation bottleneck in practice). Obviously it could hurt performance to drop the GIL for small cases. As far as I know, numpy uses thresholds to decide when it's worth dropping the GIL, and it seems to work fairly well.

----------
components: Interpreter Core
messages: 336082
nosy: bmerry
priority: normal
severity: normal
status: open
title: (Performance) Drop the GIL during large bytes.join operations?
versions: Python 3.7

___ Python tracker <https://bugs.python.org/issue36051> ___
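The pattern in question looks roughly like this (a generic sketch, not code from any particular library):

```
import socket

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Receive exactly n bytes by collecting chunks and joining them."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise EOFError("connection closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    # For a multi-gigabyte n, this join is one long single-threaded memcpy
    # performed while holding the GIL.
    return b"".join(chunks)
```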
[issue32052] Provide access to buffer of asyncio.StreamReader
New submission from Bruce Merry:

While asyncio.StreamReader.readuntil is an improvement on only having readline, it is still quite limited, e.g. you cannot have multiple possible terminators. The real problem is that it's not possible to roll your own without accessing _underscore fields (other than by reading one byte at a time, which I'm guessing would be bad for performance).

I'm not sure exactly what a public API to assist would look like, but I think the following would be a good start:

1. A get_buffer method, that returns (self._buffer, self._eof); the caller must treat the buffer as readonly.
2. A wait_for_data method to wait for the return value of get_buffer to change (basically like the current _wait_for_data).
3. Access to the _limit attribute.

With that available, I think readuntil or more complex variants of it could be implemented externally using only the public interface (consumption of data from the buffer would be via readexactly rather than by messing with the buffer array directly).

----------
components: asyncio
messages: 306397
nosy: Bruce Merry, yselivanov
priority: normal
severity: normal
status: open
title: Provide access to buffer of asyncio.StreamReader
type: enhancement
versions: Python 3.7

___ Python tracker <https://bugs.python.org/issue32052> ___
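To sketch how an external readuntil-with-multiple-separators could be built on top of that, assuming the hypothetical get_buffer and wait_for_data methods proposed above (they are not part of the current StreamReader API):

```
import asyncio

async def read_until_any(reader: asyncio.StreamReader, separators) -> bytes:
    """Read up to and including whichever separator ends earliest."""
    searched = 0                                    # bytes of buffer already scanned
    while True:
        buffer, eof = reader.get_buffer()           # hypothetical API (item 1)
        end = None
        for sep in separators:
            # Re-scan only the unscanned tail, plus enough overlap to catch a
            # separator straddling the boundary, to avoid quadratic rescans.
            idx = buffer.find(sep, max(0, searched - len(sep) + 1))
            if idx != -1 and (end is None or idx + len(sep) < end):
                end = idx + len(sep)
        if end is not None:
            return await reader.readexactly(end)
        if eof:
            raise asyncio.IncompleteReadError(bytes(buffer), None)
        searched = len(buffer)
        await reader.wait_for_data()                # hypothetical API (item 2)
```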
[issue32395] asyncio.StreamReader.readuntil is not general enough
New submission from Bruce Merry:

I'd proposed one specific solution in issue 32052 which asvetlov didn't like, so as requested I'm filing a bug about the problem rather than the solution.

The specific case I have is reading a protocol in which either \r or \n can be used to terminate lines. With StreamReader.readuntil, it's only possible to specify one separator, so it can't easily be used (*). Some nice-to-have features, from specific to general:

1. Specify multiple alternate separators.
2. Specify a regex for a separator.
3. Specify a regex for the line.
4. Specify a callback that takes a string and returns the position of the end of the line, if any.

Of course, some of these risk quadratic-time behaviour if they have to check the whole buffer every time the buffer is extended, so that would need to be considered in the design. In the last case, the callback could take care of it itself by maintaining internal state.

(*) I actually have a solution for this case (https://github.com/ska-sa/aiokatcp/blob/bd8263cefe213003a218fac0dd8c5207cc76aeef/aiokatcp/connection.py#L44-L52), but it only works because \r and \n are semantically equivalent in the particular protocol I'm parsing.

----------
components: asyncio
messages: 308852
nosy: Bruce Merry, yselivanov
priority: normal
severity: normal
status: open
title: asyncio.StreamReader.readuntil is not general enough
type: enhancement
versions: Python 3.7

___ Python tracker <https://bugs.python.org/issue32395> ___
[issue32052] Provide access to buffer of asyncio.StreamReader
Bruce Merry added the comment:

A sequence of possible terminators would cover my immediate use case and would certainly be an improvement.

To facilitate more general use cases without exposing implementation details, would it be practical and maintainable to have a "putback" method that prepends data to the buffer? It might not be fast in all cases (e.g. it might have to make a copy of what's still in the buffer), but possibly BufferedReader could detect the common case (putting back a suffix of what was just read) and adjust its offsets into its internal buffer (although I'm not at all familiar with BufferedReader, so feel free to tell me I'm talking nonsense).

___ Python tracker <https://bugs.python.org/issue32052> ___