Which Python System is affected? (Was: What does the Async Detour usually cost)

2025-06-23 Thread Mild Shock

Hi,

I tested this one:

Python 3.11.11 (0253c85bf5f8, Feb 26 2025, 10:43:25)
[PyPy 7.3.19 with MSC v.1941 64 bit (AMD64)] on win32

I haven't tested this one yet, because it is usually slower:

Python 3.14.0b2 (tags/v3.14.0b2:12d3f88, May 26 2025, 13:55:44)
[MSC v.1943 64 bit (AMD64)] on win32

Bye

--
https://mail.python.org/mailman3//lists/python-list.python.org


Re: async I/O via threads is extremely slow (Was: Does Python Need Virtual Threads?)

2025-06-23 Thread Left Right via Python-list
I honestly have no idea what's being measured, but here are some
numbers to compare this to, and then some explanation about async I/O
in general.

1. No I/O to a local disk on a modern controller should take
milliseconds. The time you are aiming for is below a millisecond. That
is, writing a block to disk should take less than a millisecond.
2. Linux and some other operating systems can work
asynchronously with network I/O through calls from the epoll family
(the story of these system calls is ironic, given how many failed
iterations it took to arrive at what we have today, only for it to be
abandoned in favor of io_uring anyway). I/O to disk, unless you are
using io_uring, is never asynchronous. Using threads may help *other*
code to run asynchronously, but I/O inside threads is still blocking.
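
To illustrate point 2, here is a minimal sketch, assuming Python 3.9+ for
asyncio.to_thread. The names blocking_io and ticker are hypothetical, and
time.sleep merely stands in for a blocking disk call. The worker thread is
blocked for the whole call, while other code on the event loop keeps running:

```python
import asyncio
import time


def blocking_io(seconds: float) -> str:
    # Stand-in for a blocking disk call: it blocks the thread that runs it.
    time.sleep(seconds)
    return "done"


async def ticker(ticks: list, interval: float, count: int) -> None:
    # "Other" code that keeps running on the event loop thread.
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.perf_counter())


async def main():
    ticks = []
    start = time.perf_counter()
    # The blocking call runs in a worker thread; the ticker runs on the loop.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_io, 0.2),
        ticker(ticks, 0.02, 5),
    )
    elapsed = time.perf_counter() - start
    return result, len(ticks), elapsed


if __name__ == "__main__":
    result, tick_count, elapsed = asyncio.run(main())
    print(f"result={result} ticks={tick_count} elapsed={elapsed:.3f}s")
```

The call itself does not get any faster this way; it only stops holding up
the rest of the event loop.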

I have no idea what

stats = await asyncio.to_thread(os.stat, url)

may be doing. But, looking at these stats:
https://en.wikipedia.org/wiki/IOPS, 2 seconds should be enough, on a
modern server to write... a lot of data. Let's be modest and assume
you are using an Intel SSD with 5K IOPS, then you should be able to
write 320 MiB in this time, unless I miscounted the zeroes :D
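
The arithmetic behind that estimate can be sketched as follows. The message
does not state a block size, so the sizes below are assumptions chosen for
illustration; a figure close to 320 MiB comes out at roughly 32 KiB per
operation, and much less at 4 KiB:

```python
KIB = 1024
MIB = 1024 * KIB


def bytes_written(iops: int, seconds: int, block_size: int) -> int:
    # Total bytes = operations per second * seconds * bytes per operation.
    return iops * seconds * block_size


if __name__ == "__main__":
    # 5K IOPS for 2 seconds, with assumed (hypothetical) block sizes.
    for block_kib in (4, 32, 128):
        total = bytes_written(5_000, 2, block_kib * KIB)
        print(f"{block_kib:>3} KiB blocks: {total / MIB:.1f} MiB in 2 s")
```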

In general though... I think that threads and I/O are
orthogonal subjects, and trying to come up with any sort of metric to measure
both at the same time somehow... is just not going to give you any
sensible description of the situation. So... maybe rethink the
problem? Maybe eliminate threads from the equation?

Also... I have no idea why Python needs async/await. It's a very
confusing and unwieldy interface to epoll. I never found a practical
reason to use this, unless in the situation where someone else used
this in their library, and I had to use the library. All in all, it's
better to pretend this part of Python doesn't exist. It's pointless
and poorly written on top of that.



What does the Async Detour usually cost (Was: What does stats = await asyncio.to_thread(os.stat, url) do?)

2025-06-23 Thread Mild Shock

Hi,

I have some data on what the async detour usually
costs. I just compared with another Java Prolog
that didn't do the thread thingy.

Reported measurement with the async Java Prolog:

> JDK 24: 50 ms (using Threads, not yet VirtualThreads)

New additional measurement with an alternative Java Prolog:

JDK 24: 30 ms (no Threads)

But even the version using threads is quite optimized:
it basically reuses its own thread and uses a mutex
somewhere, so it doesn't really create a new secondary
thread unless a new task is spawned. Creating a second thread
is silly if tasks have their own thread. This is the
main potential of virtual threads in upcoming Java:
just run tasks inside virtual threads.

Bye

P.S.: But I should measure with more files, since
the 50 ms and 30 ms are quite small. Also, I am using a
warm run, so the files and their meta information are already
cached in operating system memory. I am trying to measure
only the async overhead, but maybe Python doesn't trust
the operating system memory and calls some disk
sync somewhere. I don't know. I don't open and close the
files, and I don't call any disk syncing. I only read
stats to get mtime and do some comparisons.



Re: What does stats = await asyncio.to_thread(os.stat, url) do? (Was: async I/O via threads is extremely slow)

2025-06-23 Thread Inada Naoki via Python-list
Other languages use a thread pool instead of creating a new thread.

In Python, loop.run_in_executor uses a thread pool.

https://docs.python.org/3.13/library/asyncio-eventloop.html#asyncio.loop.run_in_executor
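
A minimal sketch of that approach, assuming an explicit ThreadPoolExecutor
with a single worker. The helper names stat_all and run_demo and the sample
paths are hypothetical. As far as I know, asyncio.to_thread in the standard
library also submits to the loop's default executor, so this mainly makes
the pool explicit:

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


async def stat_all(paths, pool):
    loop = asyncio.get_running_loop()
    results = []
    for path in paths:
        # Each call is queued on the pool; an idle worker thread is reused.
        results.append(await loop.run_in_executor(pool, os.stat, path))
    return results


def run_demo(count=50):
    # Hypothetical input: stat the current directory `count` times.
    paths = [os.getcwd()] * count
    with ThreadPoolExecutor(max_workers=1) as pool:
        return asyncio.run(stat_all(paths, pool))


if __name__ == "__main__":
    stats = run_demo()
    print(len(stats), "stat results, first mtime:", stats[0].st_mtime)
```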

-- 
Inada Naoki  


What does stats = await asyncio.to_thread(os.stat, url) do? (Was: async I/O via threads is extremely slow)

2025-06-23 Thread Mild Shock

So what does this do:

stats = await asyncio.to_thread(os.stat, url)

Well, it calls the following in a separate, new secondary thread:

os.stat(url)

It happens that url is only a file path, and
the file path points to an existing file. So the
secondary thread computes the stats and terminates,
and the async framework hands the stats back to
the main thread that did the await, and the main
thread stops waiting and continues to run
cooperatively with the other tasks in the current
event loop. The test case measures the wall time.
The results are:

> node.js: 10 ms (usual Promises and stuff)
> JDK 24: 50 ms (using Threads, not yet VirtualThreads)
> pypy: 2000 ms

I am only using one main task, which runs
such await calls sequentially on a couple of files, not
more than 50 files.

I could compare with removing the async detour,
to quantify the async I/O detour overhead.
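
Such a comparison could be sketched roughly like this. The function names
and the sample paths are hypothetical, and this is not the actual test case
from the measurements above. Note that the asyncio.run timing also includes
event loop startup and shutdown, so it overstates the per-call detour:

```python
import asyncio
import os
import time


def stat_direct(paths):
    # Baseline: plain synchronous os.stat, no async detour.
    return [os.stat(p).st_mtime for p in paths]


async def stat_via_thread(paths):
    # One task, awaiting each call in sequence, as described above.
    mtimes = []
    for p in paths:
        stats = await asyncio.to_thread(os.stat, p)
        mtimes.append(stats.st_mtime)
    return mtimes


def timed_ms(fn):
    # Return the function result and the elapsed wall time in milliseconds.
    start = time.perf_counter()
    result = fn()
    return result, (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    paths = [os.getcwd()] * 50  # hypothetical stand-in for ~50 files
    _, direct_ms = timed_ms(lambda: stat_direct(paths))
    _, thread_ms = timed_ms(lambda: asyncio.run(stat_via_thread(paths)))
    print(f"direct: {direct_ms:.2f} ms, via to_thread: {thread_ms:.2f} ms")
```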



async I/O via threads is extremely slow (Was: Does Python Need Virtual Threads?)

2025-06-23 Thread Mild Shock

Hi,

async I/O in Python is extremely disappointing
and an annoying bottleneck.

The problem is that async I/O via threads is currently
extremely slow. I use a custom async I/O file property
predicate. It doesn't need to be async for file
system access. But for historical reasons
I made it async, since the same file property routine
might also do an HTTP HEAD request. But what I was
testing and comparing was a simple file system access
inside a wrapped thread that is async awaited.
Such a thread is called for a couple of directory
entries to check whether a directory tree needs
updates. Here are some measurements of this simple
task involving a little async I/O:

node.js: 10 ms (usual Promises and stuff)
JDK 24: 50 ms (using Threads, not yet VirtualThreads)
pypy: 2000 ms

So currently PyPy is 200 times slower than node.js
when it comes to async I/O. No files were read or
written in the test case; only "mtime" was read,
via this Python line:

stats = await asyncio.to_thread(os.stat, url)

Bye

Mild Shock wrote:

Concerning virtual threads, the only problem
I have with Java is that JDK 17 doesn't have them,
and some Linux distributions are stuck with JDK 17.

Otherwise it's not an idea that belongs solely
to Java; I think golang pioneered them with their
goroutines. I am planning to use them more heavily
when they become more widely available, and I don't
see any objection in principle to Python
having them as well. It would maybe make async I/O based
on async waiting for a thread more lightweight.
But this would only be important if you have a high
number of tasks.

Lawrence D'Oliveiro wrote:

Short answer: no.

Firstly, anybody appealing to Java as an example of how to design a
programming language should immediately be sending your bullshit detector
into the yellow zone.

Secondly, the link to a critique of JavaScript that dates from 2015, from
before the language acquired its async/await constructs, should be another
warning sign.

Looking at that Java spec, a “virtual thread” is just another name for
“stackful coroutine”. Because that’s what you get when you take away
implicit thread preemption and substitute explicit preemption instead.

The continuation concept is useful in its own right. Why not concentrate
on implementing that as a new primitive instead?