Re: Checking network input processing by Python for a multi-threaded server

2019-05-04 Thread Markus Elfring
>   Server.shutdown() sets a flag that tells the main server to /stop
> accepting new requests/.

Should this method perform a bit more resource management, depending on
the selected configuration (for example “socketserver.ThreadingMixIn”)?


>   So far as I can tell, for a threaded server, any threads/requests that
> were started and haven't completed their handlers will run to completion --
> however long that handler takes to finish the request. Whether
> threaded/forked/single-thread -- once a request has been accepted, it will
> run to completion.

This is good to know, and such behaviour also matches my expectations
for the software.
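
A minimal sketch (with names of my own choosing, separate from my test
scripts) which should make this behaviour visible: the request is accepted
before the shutdown call and still runs to completion afterwards.

import socket
import socketserver
import threading
import time

class SlowHandler(socketserver.StreamRequestHandler):
    def handle(self):
        time.sleep(2)                     # simulate a long-running request
        self.wfile.write(b"done\n")

server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), SlowHandler)
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

client = socket.create_connection((host, port))   # request gets accepted
time.sleep(0.5)                                    # let the handler thread start
server.shutdown()                                  # stop accepting new requests
print(client.makefile().readline())                # "done" still arrives after ~2 s
client.close()
server.server_close()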


> Executing .shutdown() will not kill unfinished handler threads.

I got a result like the following from another test variant
on a Linux system.

elfring@Sonne:~/Projekte/Python> time python3 test-statistic-server2.py
incidence|"available records"|"running threads"|"return code"|"command output"
80|6|1|0|
1|4|3|0|
1|4|4|0|
3|5|2|0|
1|5|3|0|
5|7|1|0|
1|3|4|0|
1|8|1|0|
1|4|1|0|
3|5|1|0|
1|4|2|0|
1|3|2|0|
1|6|2|0|

real    0m48,373s
user    0m6,682s
sys     0m1,337s


>> How do you think about to improve the distinction for the really
>> desired lock granularity in my use case?
>
>   If your request handler updates ANY shared data, YOU have to code the
> needed MUTEX locks into that handler.

I suggest stating such a software requirement more precisely.
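
A minimal sketch of the locking discipline meant here, with hypothetical
names: every handler that updates the shared list takes the same lock, and
the main thread takes it as well before reading.

import socketserver
import threading

records = []                        # shared data, later read by the main thread
records_lock = threading.Lock()     # one lock guards every access to it

class RecordingHandler(socketserver.StreamRequestHandler):
    def handle(self):
        line = self.rfile.readline().decode().rstrip()
        with records_lock:          # the handler itself must take the lock
            records.append(line)

def snapshot():
    with records_lock:              # the main thread uses the same lock
        return list(records)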


> You may also need to code logic to ensure any handler threads have completed

Could a class like “BaseServer” be responsible for determining whether
all additionally started threads (or background processes) have finished
their work as expected?
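
If I read the release notes for Python 3.7 correctly,
“socketserver.ThreadingMixIn” does remember its non-daemon handler threads
and joins them in server_close() (controlled by the class attribute
“block_on_close”), so a sequence like the following sketch should be
sufficient there; on older versions the threads are simply forgotten and
must be tracked by the caller.

import socketserver
import threading

class Handler(socketserver.BaseRequestHandler):
    def handle(self):
        pass                          # request processing would go here

server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), Handler)
server.daemon_threads = False         # keep the handler threads joinable
threading.Thread(target=server.serve_forever).start()

# ... clients connect and get handled here ...

server.shutdown()       # stop the serve_forever() loop (new requests only)
server.server_close()   # 3.7+ with block_on_close: should also wait for
                        # handler threads which are still running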


> before your main thread accesses the shared data

I became unsure at which point a specific Python list variable reflects
the record sets received from a single test command in a consistent way,
given the discussed data processing.


> -- that may require a different type of lock;

I am curious about the corresponding software adjustments.


> something that allows multiple threads to hold in parallel
> (unfortunately, Event() and Condition() aren't directly suitable)

Which data and process management approaches will ultimately be needed?


>   Condition() with a global counter (the counter needs its own lock)
> might work: handler does something like

How well do the programming interfaces of the available classes support
determining that submitted tasks have completely finished?
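
My attempt to spell out the suggested Condition-plus-counter pattern as a
sketch (hypothetical names; I let the internal lock of the Condition guard
the counter instead of giving it a separate lock): each handler calls
enter_handler() first and leave_handler() in a finally clause, and the
main thread waits until the count of active handlers drops to zero.

import threading

active_handlers = 0
state = threading.Condition()   # its internal lock also guards the counter

def enter_handler():
    global active_handlers
    with state:
        active_handlers += 1

def leave_handler():
    global active_handlers
    with state:
        active_handlers -= 1
        state.notify_all()      # wake the waiter for a fresh counter test

def wait_until_idle():
    with state:
        state.wait_for(lambda: active_handlers == 0)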


> There may still be a race condition on the last request if it is started
> between the .shutdown call and the counter test (i.e. the main submits
> .shutdown, the server starts a thread which doesn't get scheduled yet,
> main does the .acquire()s, finds the counter is 0 so assumes everything is
> done, and THEN the last thread gets scheduled and increments the counter).

Should the mentioned system constraints already be covered by the Python
function (or class) library?

Regards,
Markus
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Checking network input processing by Python for a multi-threaded server

2019-05-04 Thread Markus Elfring
> If you have a multi-threaded application and you want to be on
> the "safe side", you always use your own locks.

I suggest reconsidering the software expectations behind the word
“always”.
There are more software design options available.


> Python uses locks to protect its own data structures.
> Whether this protection is enough for your use of Python types
> depends on details you may not want to worry about.

I agree with such a general view.


> For example: most operations on Python types are atomic
> (if they do not involve some kind of "waiting" or "slow operation")
> *BUT* if they can destroy objects, then arbitrary code
> (from destructors) can be executed and then they are not atomic.

The safe handling of finalizers can pose development challenges of its own.


> As an example "list.append" is atomic (no object is destroyed),
> but "list[:] = ..." is not: while the list operation itself
> is not interrupted by another thread, the operation may destroy
> objects (the old list components) and other threads may get control
> before the assignment has finished.

How would you determine (with the help of the Python function/class library)
that previously submitted tasks were successfully executed?
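
A small sketch of how I currently understand this distinction (hypothetical
names): appending from the handlers needs no extra lock, while replacing
the whole contents is guarded, together with any reader that uses the same
lock.

import threading

records = []
records_lock = threading.Lock()

def handler_append(item):
    # According to the explanation above, a plain append destroys no object
    # and is therefore atomic with respect to other Python threads.
    records.append(item)

def replace_records(new_items):
    # Replacing the contents may destroy the old elements, whose destructors
    # can run arbitrary code, so other threads may be scheduled in between.
    with records_lock:
        records[:] = new_items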

Regards,
Markus
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Checking network input processing by Python for a multi-threaded server

2019-05-04 Thread Markus Elfring
>>> Server.shutdown() sets a flag that tells the main server to /stop
>>> accepting new requests/.
>>
>> Should this method perform a bit more resource management, depending on
>> the selected configuration (for example “socketserver.ThreadingMixIn”)?
>>
>   There isn't much more it can do

I see further software design possibilities.


> -- it has been a long standing axiom that killing threads is not recommended.

Killing threads would indeed raise various development challenges.


>>> You may also need to code logic to ensure any handler threads have completed
>>
>> Could a class like “BaseServer” be responsible for determining whether
>> all additionally started threads (or background processes) have finished
>> their work as expected?
>
>   You are asking for changes to the Python library for a use case
> that is not common.

I find this view questionable.


> Normally connections to a server are independent and do not share common data

Would you like to clarify the corresponding application statistics any further?


> -- if there is anything in common,

Do you identify any more shared functionality?


> it is likely stored in a database management system

A use case evolved where I also need to work with an ordinary Python list
variable as a simple storage interface.


> which itself will provide locking for updates,

It is nice when you can reuse such software.


> and the connection handler will have to be coded to handle retries
> if multiple connections try to update the same records.

I am not concerned about this aspect for my test case.


> Servers aren't meant to be started and shutdown at a rapid rate

Their run times can vary considerably.


> (it's called "serve_forever" for a reason).

Would another term be more appropriate?


>   If the socketserver module doesn't provide what you need,

It took a while to understand the observed software behaviour better.


> you are free to copy socketserver.py to some other file (myserver.py?),
> and modify it to fit your needs.

Would it help to clarify possible software extensions with the corresponding
maintainers?


> Maybe have the function that spawns handler threads append the thread ID
> to a list, have the function that cleans up a handler thread at the end
> send its ID via a Queue object,

This approach can be reasonable to some degree.


> and have the master periodically (probably the same loop that checks
> for shutdown), read the Queue, and remove the received ID from the list
> of active threads.

I imagine that there are nicer design options available for notifications
about thread terminations.


> On shutdown, you loop reading the Queue and removing IDs from the list
> of active threads until the list is empty.

Will a condition variable (or a semaphore) be more helpful here?
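
My attempt to turn the suggested bookkeeping into code (a sketch with
hypothetical names); a condition variable or semaphore variant would look
similar, the Queue mainly replaces the explicit signalling.

import queue
import threading

active_ids = []                    # only touched by the master thread
finished = queue.Queue()

def spawn_handler(target, *args):
    t = threading.Thread(target=_run_handler, args=(target,) + args)
    t.start()
    active_ids.append(t.ident)     # remember the thread ID

def _run_handler(target, *args):
    try:
        target(*args)
    finally:
        finished.put(threading.get_ident())   # announce completion

def drain_on_shutdown():
    while active_ids:
        active_ids.remove(finished.get())     # blocks until a handler finishes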


>> How well do the programming interfaces of the available classes support
>> determining that submitted tasks have completely finished?
>>
>   Read the library reference manual and, for those modules with Python
> source, the source files (threading.py, socketserver.py, queue.py...)
>
>   The simpler answer is that these modules DON'T...

I suggest improving the software situation a bit more.


> It is your responsibility.

This view can be partly appropriate.


> socketserver threading model is that the main server loops waiting
> for connection requests, when it receives a request it creates
> a handler thread, and then it completely forgets about the thread

I find that this technical detail could be documented better.


> -- the thread is independent and completes the handling of the request.

Would you not occasionally like to wait for the result of such
data processing?
(Are these threads joinable?)
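
As far as I can tell, the handler threads are ordinary threading.Thread
objects; they would become joinable if the server kept references to them,
for example along the lines of this sketch (names of my own choosing):

import socketserver
import threading

class TrackingServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """Keeps a reference to every handler thread so that it can be joined."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.handler_threads = []

    def process_request(self, request, client_address):
        t = threading.Thread(target=self.process_request_thread,
                             args=(request, client_address))
        t.daemon = self.daemon_threads
        self.handler_threads.append(t)
        t.start()

    def wait_for_handlers(self):
        for t in self.handler_threads:
            t.join()               # blocks until each handler has completed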


> If you need to ensure everything has finished before starting
> the next server instance, you will have to write the extensions
> to socketserver.

I might achieve something myself, but it can also be nice to work out
adjustments together with other developers.


>> Should the mentioned system constraints already be covered by the Python
>> function (or class) library?
>
>   Again, the model for socketserver, especially in threaded mode, is that
> requests being handled are completely independent. shutdown() merely stops
> the master from responding to new requests.

I find this understanding of the software situation also useful.


>   In a more normal situation, .shutdown() would be called
> and then the entire program would call exit.

* Do you expect any more functionality here than an exit from a single thread?

* Does this wording include aborting threads which were left over?


> It is NOT normal for a program to create a server, shut it down,
> only to then repeat the sequence.

Will your understanding of such a use case grow, too?

Regards,
Markus
-- 
https://mail.python.org/mailman/listinfo/python-list