Re: walrus with a twist :+= or ...

2021-10-29 Thread Peter J. Holzer
On 2021-10-28 19:48:06 -0400, Avi Gross via Python-list wrote:
> My right ALT key now lets me type in all kinds of nonsense like ⑦ and
> © and ß and ℵ0 and ⅔ and ≠ and ⇒ and ♬ and although :- makes ÷ I see
> :=  makes ≔ which is just a longer equals sign.

Ah, yes. That's another problem with using a large number of characters:
Some of them will be visually very similar, especially with monospaced
fonts.

I'm not getting any younger and with my preferred font (which is quite
small) I already have to squint to distinguish between : and ; or
between () and {}.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why so fast a web framework?

2021-10-29 Thread Benjamin Schollnick
>> Sometimes this group reminds me of a certain large company I worked for.
>> If they didn't have a solution addressing a problem, they'd pretend it
>> didn't matter and belittle anyone who questioned that version of reality.
>> 
> 
> That's not strictly true; what's happening here is that someone's
> published a cool-looking bar graph but nobody knows what it really
> means. I don't feel the need to delve into everyone's benchmarks to
> explain why Python is purportedly worse. If someone uses those kinds
> of numbers to decide which programming language to use, they have
> bigger problems.

If you dig a bit, the benchmark is scary…
As in stupid-scary.

It consists of 7 tests, from which a composite score is generated:

JSON Serialization - In this test, each response is a JSON serialization of a 
freshly-instantiated object that maps the key message to the value Hello, World!
Single Query - In this test, each request is processed by fetching a single row 
from a simple database table. That row is then serialized as a JSON response.
Multiple Queries - In this test, each request is processed by fetching multiple 
rows from a simple database table and serializing these rows as a JSON 
response. The test is run multiple times: testing 1, 5, 10, 15, and 20 queries 
per request. All tests are run at 512 concurrency.
Cached Queries - In this test, each request is processed by fetching multiple 
cached objects from an in-memory database (the cache having been populated from 
a database table either as needed or prior to testing) and serializing these 
objects as a JSON response. The test is run multiple times: testing 1, 5, 10, 
15, and 20 cached object fetches per request. All tests are run at 512 
concurrency. Conceptually, this is similar to the multiple-queries test except 
that it uses a caching layer.
Fortunes - In this test, the framework's ORM is used to fetch all rows from a 
database table containing an unknown number of Unix fortune cookie messages 
(the table has 12 rows, but the code cannot have foreknowledge of the table's 
size). An additional fortune cookie message is inserted into the list at 
runtime and then the list is sorted by the message text. Finally, the list is 
delivered to the client using a server-side HTML template. The message text 
must be considered untrusted and properly escaped and the UTF-8 fortune 
messages must be rendered properly.  Whitespace is optional and may comply with 
the framework's best practices.
Data Updates - This test exercises database writes. Each request is processed 
by fetching multiple rows from a simple database table, converting the rows to 
in-memory objects, modifying one attribute of each object in memory, updating 
each associated row in the database individually, and then serializing the list 
of objects as a JSON response. The test is run multiple times: testing 1, 5, 
10, 15, and 20 updates per request. Note that the number of statements per 
request is twice the number of updates since each update is paired with one 
query to fetch the object. All tests are run at 512 concurrency.  The response 
is analogous to the multiple-query test. 
Plaintext - In this test, the framework responds with the simplest of 
responses: a "Hello, World" message rendered as plain text. The size of the 
response is kept small so that gigabit Ethernet is not the limiting factor for 
all implementations. HTTP pipelining is enabled and higher client-side 
concurrency levels are used for this test (see the "Data table" view).
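For scale, the plaintext test amounts to little more than this (a minimal WSGI sketch of mine, not the benchmark's actual code):

```python
def app(environ, start_response):
    """Roughly what the plaintext test asks of a framework:
    a fixed, tiny body so payload size is never the bottleneck."""
    body = b"Hello, World!"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Exercise it without a server, via a stub start_response:
status_headers = []
result = app({}, lambda status, headers: status_headers.append((status, headers)))
print(result)   # [b'Hello, World!']
```

So the benchmark is measuring the framework's dispatch and I/O machinery, not any real application work.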

Here, I instead benchmark my Django gallery app using Apache Bench, and so 
forth.  I guess I’ve been over-achieving…

I have to admit, though, that these benchmarks certainly allow everyone to 
play.  

431   cherrypy   587   0.0% (0.0%)

Even cherrypy, with its 587 plain-text replies per second.  

The tasks seem deceptively simple.  

But looking closer, the data table for each task gives more details.  For 
example, the plaintext test is run 4 times, at 4 different client-side 
concurrency levels: 256, 1024, 4096, and 16384.   
That can’t be the concurrency/thread count, can it??!?  I could believe 1,000 - 
3,000: outrageously high, but believable.  But 16K worth of 
concurrency/threads?  I doubt that even Wikipedia has to dial it that high.

I have to give them points for providing API latency and framework overhead figures….  

- Benjamin





Re: New assignmens ...

2021-10-29 Thread Antoon Pardon



On 28/10/2021 at 19:36, Avi Gross via Python-list wrote:

> Now for a dumb question. Many languages allow a form of setting a variable
> to a value like:
>
>     assign(var, 5+sin(x))
>
> If we had a function that then returned var or the value of var, cleanly,
> then would that allow an end run on the walrus operator?
>
>     if (assign(sign, 5+sin(x)) <= assign(cosign, 5+cos(x))) …
>
> Not necessarily pretty and I am sure there may well be reasons it won’t
> work, but I wonder if it will work in more places than the currently
> minimal walrus operator.


This was the original code to illustrate the question:

if (self.ctr:=self.ctr-1)<=0

So if I understand your suggested solution it would be something like:

def setxattr(obj, attr, value):
setattr(obj, attr, value)
return value

if setxattr(self, 'ctr', self.ctr - 1) <= 0

Did I get that right?
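For what it's worth, that helper runs today, whereas the quoted walrus form does not: `:=` only binds plain names, not attributes. A quick runnable sketch (Counter is a made-up stand-in for the original class):

```python
def setxattr(obj, attr, value):
    # Set the attribute as a side effect, then hand the value back
    # so it can be used inline in an expression.
    setattr(obj, attr, value)
    return value

class Counter:
    def __init__(self, ctr):
        self.ctr = ctr

c = Counter(1)
# Stands in for the (currently illegal)  if (c.ctr := c.ctr - 1) <= 0:
if setxattr(c, 'ctr', c.ctr - 1) <= 0:
    print("expired")   # prints, since c.ctr is now 0
```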

--
Antoon Pardon.

   




Re: The task is to invent names for things

2021-10-29 Thread alister via Python-list
On Thu, 28 Oct 2021 00:41:41 +0200, Peter J. Holzer wrote:

> On 2021-10-27 12:41:56 +0200, Karsten Hilbert wrote:
>> Am Tue, Oct 26, 2021 at 11:36:33PM + schrieb Stefan Ram:
>> > xyzzy = lambda x: 2 * x
>> >   . Sometimes, this can even lead to "naming paralysis", where one
>> >   thinks excessively long about a good name. To avoid this naming
>> >   paralysis, one can start out with a mediocre name. In the course of
>> >   time, often a better name will come to one's mind.
>> 
>> In that situation, is it preferable to choose a nonsensical name over a
>> mediocre one ?
> 
> I don't know. A mediocre name conveys at least some information, and
> that seems to be better than none. On the other hand it might be just
> enough to lead the reader astray which wouldn't happen with a
> non-sensical name.
> 
> But since perfect names are hard to find, using nonsensical instead of
> mediocre names would mean choosing nonsensical names most of the time.
> So I'll stick with mediocre names if in doubt.
> 
> hp

Although if a mediocre name is chosen there is less impetus on the 
programmer to change it ("it's not great but it'll do"),
whereas a nonsense name sticks out like a sore thumb until it is 
corrected.
I am firmly undecided




-- 
Riches cover a multitude of woes.
-- Menander


Re: The task is to invent names for things

2021-10-29 Thread alister via Python-list
On Thu, 28 Oct 2021 00:38:17 +, Eli the Bearded wrote:

> In comp.lang.python, Peter J. Holzer  wrote:
>  ^^
> 
> Those all work. But if you are writing a new web framework and you name
> your method to log stuff to a remote server "Britney" because you were
> listening to the singer, that's not perfectly fine, even if you want to make
> "Oops, I did it again" jokes about your logged errors.

Although Oops would be a pretty reasonable name for a logger or error 
handler... 
> 
> 
> Elijah --
> naming is hard, unless it's easy





-- 
After a number of decimal places, nobody gives a damn.


Python script seems to stop running when handling very large dataset

2021-10-29 Thread Shaozhong SHI
Python script works well, but seems to stop running at a certain point when
handling very large dataset.

Can anyone shed light on this?

Regards, David


Re: Python script seems to stop running when handling very large dataset

2021-10-29 Thread dn via Python-list
On 30/10/2021 11.42, Shaozhong SHI wrote:
> Python script works well, but seems to stop running at a certain point when
> handling very large dataset.
> 
> Can anyone shed light on this?

Storage space?
Taking time to load/format/process data-set?

-- 
Regards,
=dn


Re: The task is to invent names for things

2021-10-29 Thread dn via Python-list
On 29/10/2021 07.07, Stefan Ram wrote:
>   The name should not be "optimized" for a certain use case
>   (as for the use in an if expression) only. "We", "have",
>   and "any" carry little information. A name should pack as
>   much information as possible into as few characters as
>   possible. So, for me, it'd be something like:

Although, does that not imply that "pack[ing]...information" is primary,
and "least characters..." secondary?

How else to define "readability" (wrt 'names')?


> if word_count:

Yes, but this does open the door to the 'gotchas' of truthiness, ie
there ain't no such thing as a free lunch/"silver bullet".

Whereas, names such as:

is_valid
words

help to imply a boolean and a collection, respectively (cf the micro-softy
way of imposing language-technicalities over naming-readability, eg
"bValid").

Thereafter, to counter the idea of not having too many names, there may
be an argument for setting-up one's data to be able to code:

if is_valid:
    while word_count:
        # pop word from words and do something...

so now we have a data-set which includes:

words: a collection, eg arriving to us from some input
word: one item from words, isolated for processing
word_count: a convenient expression of len( words )
is_valid: an indicator that the words received are ready for our process

(not that I 'like' "is_valid" but need a specific context to improve that)
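To make that concrete, the data-set above might be set up like this (a sketch; the sample data and the validity check are invented):

```python
words = ["spam", "eggs", "beans"]   # a collection, e.g. arriving from input
word_count = len(words)             # convenient expression of len(words)
is_valid = word_count > 0           # stand-in readiness check, context-dependent

if is_valid:
    while word_count:
        word = words.pop()          # one item, isolated for processing
        word_count -= 1
        print(word.upper())         # stand-in for "do something"
```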

The "is_" prefix appeals to me because it is so English-like (dare I say
COBOL-like?) that readability is in no doubt. At the same time, whilst
it could be considered an 'extra' name/extra-effort, it adds precision
to the logic.

YMMV!
Indeed some may wish to argue that the data-set includes unnecessary
verbiage, and that this in-and-of-itself might contribute to
cognitive-overload...

import this


>> So, the obvious solution is to ask the language, like Python, to allow
>> variables that are synonyms.
> 
>   Programs already have too many names in them. 
>   There is no need to add even more, especially
>   when they are equivalent, and the average human can
>   only hold 7 ± 2 objects (like names) in his short-
>   term memory.

+1

aka "cognitive overload"

@Martin's term is "naming paralysis", which fits as a component of
"analysis paralysis", cf 'let's get on with it'!

(which summed-up how I felt abstracting and refactoring a function this
morning - dithering and vacillating like I couldn't decide between
chocolate cake or chocolate pudding...)

Which point, when referred to synonym variables, gives the optimal
answer: "both" (cake and pudding!)

When you think about it, passing values to functions (ie an argument
becoming a parameter) can be considered a form of synonym.

(and, before someone disappears off on another unintended tangent, there
are some circumstances where it is not!)


>> really good names often end up being really long names
> 
>   If they are really long, they are not really good, 
>   /except/ when they have a really large scope.
> 
>   Names for a small scope can and should be short,
>   names for a large scope may be longer.

Interesting! Surely as something becomes more specific, there is more
meaning and/or greater precision to be embedded within the name. Thus:

operator

assignment_operator
walrus_operator


>   Some very short names have traditional meanings
>   and are ok when used as such:
> 
> for i in range( 10 ):

For historical (?hysterical) reasons I find myself agreeing with this.
(in FORTRAN any variable beginning with the letters "I" through "N" was
regarded as an integer - all others being real/floating-point - thus, it
is a familiar idiom.)

That said, this form of for-loop is reminiscent of other languages which
use "pointers" to access data-collections, etc, and keep track of where
the logic is within the process using "counters" - none of which are
necessary in pythonic for-loops.

Accordingly, there is an argument for using:

for ptr in range( 10 ):
for ctr in range( 10 ):

(or even the complete word, if that is the functionality required)

The above allowing for the fact that I have my own 'set' of 'standard
abbreviations' which are idiomatic and readable to me (!). One of which is:

for ndx in range( 10 ):

This abbreviation originated in the days when variable-names had to be
short. So, one tactic was to remove vowels*. Once again, more readable
(to me) than "i", but possibly less-than idiomatic to others...

That said, watching a video from EuroPython 2021, I was amused to see:

for idx in range( 10 ):

and doubly-amused when I experimented with the idea as-presented and
'copied' the code by 'translating' his "idx" into the English "index",
and whilst typing into my own REPL, cogno-translated the thought into my
own "ndx" expression.


Now, I don't know if the EuroPython author uses "idx" because that
relates somehow to his home-language. However, if we analyse such
abbreviations, their writing is a personal exercise. Whereas,
measuring/assessing "readability" involve neither "writing

Re: Python script seems to stop running when handling very large dataset

2021-10-29 Thread Dan Stromberg
On Fri, Oct 29, 2021 at 4:04 PM dn via Python-list 
wrote:

> On 30/10/2021 11.42, Shaozhong SHI wrote:
> > Python script works well, but seems to stop running at a certain point
> when
> > handling very large dataset.
> >
> > Can anyone shed light on this?
>
> Storage space?
> Taking time to load/format/process data-set?
>

It could be many things.

What operating system are you on?

If you're on Linux, you can use strace to attach to a running process to
see what it's up to.  Check out the -p option.  See
https://stromberg.dnsalias.org/~strombrg/debugging-with-syscall-tracers.html

macOS has dtruss, but it's a little hard to enable.  dtruss is similar to
strace.

Both of these tools are better for processes doing system calls (kernel
interactions).  They do not help nearly as much with CPU-bound processes.

It could also be that you're running out of virtual memory, and the
system's virtual memory system is thrashing.

Does the load average on the system go up significantly when the process
seems to get stuck?

You could try attaching to the process with a debugger, too, EG with pudb:
https://github.com/inducer/pudb/issues/31

Barring those, you could sprinkle some print statements in your code, to
see where it's getting stuck. This tends to be an iterative process, where
you add some prints, run, observe the result, and repeat.
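The sprinkle-prints approach works a bit better with timestamped logging, so a stall shows up as a gap between two log lines. A sketch (the function, data, and interval are made up):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")

def process(records, every=100_000):
    """Stand-in for the real per-record work, with periodic progress
    logging so you can see roughly where a long run stalls."""
    results = []
    for i, rec in enumerate(records):
        if i % every == 0:
            logging.info("processed %d records", i)
        results.append(rec * 2)     # stand-in for real work
    return results
```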

HTH.


Re: Python script seems to stop running when handling very large dataset

2021-10-29 Thread Paul Bryan
With so little information provided, not much light will be shed. When
it stops running, are there any errors? How is the dataset being
processed? How large is the dataset? How large a dataset can be
successfully processed? What libraries are being used? What version of
Python are you using? On what operating system? With how much memory?
How much disk space is used? How much is free? Are you processing
files or using a database? If the latter, what database? Does it write
intermediate files during processing? Can you monitor memory usage
during processing (e.g. with a system monitor) to see how much memory
is consumed?
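One way to answer the memory questions from inside the script itself is the standard library's tracemalloc. A sketch (the chunked processing here is hypothetical):

```python
import tracemalloc

tracemalloc.start()

def process_chunk(chunk):
    return [row.upper() for row in chunk]    # stand-in for real work

results = []
for chunk in (["a"] * 1000, ["b"] * 1000):   # stand-in for the real dataset
    results.extend(process_chunk(chunk))
    # Report allocation after each chunk; a runaway trend here points
    # at memory, a flat trend points elsewhere.
    current, peak = tracemalloc.get_traced_memory()
    print(f"current={current / 1e6:.2f}MB peak={peak / 1e6:.2f}MB")
```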


On Fri, 2021-10-29 at 23:42 +0100, Shaozhong SHI wrote:
> Python script works well, but seems to stop running at a certain
> point when
> handling very large dataset.
> 
> Can anyone shed light on this?
> 
> Regards, David



CWD + relative path + import name == resultant relative path?

2021-10-29 Thread Dan Stromberg
Is there a predefined function somewhere that can accept the 3 things on
the LHS in the subject line, and give back a resultant relative path -
relative to the unchanged CWD?

To illustrate, imagine:
1) You're sitting in /home/foo/coolprog
2) You look up the AST for /home/foo/coolprog/a/b/c.py
3) /home/foo/coolprog/a/b/c.py has a relative import of ..blah

Is there something preexisting that can figure out what path blah is
supposed to have, assuming 100% pure python code?

Or is it necessary to actually import to get that?  I think I could import
and then check __file__, but I'd rather avoid that if possible.

Thanks!
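The pure path arithmetic can be sketched like this (resolve_relative_import is a made-up helper, not stdlib; it assumes a relative import, and whether the result maps to blah.py or blah/__init__.py still needs a filesystem check):

```python
from pathlib import PurePosixPath

def resolve_relative_import(cwd, module_file, import_name):
    """Map a relative import like '..blah' written inside module_file
    onto a path relative to cwd, without importing anything."""
    rel = PurePosixPath(module_file).relative_to(cwd)
    package = list(rel.parent.parts)        # a/b/c.py -> ['a', 'b']
    # One leading dot means the current package; each extra dot goes up one.
    level = len(import_name) - len(import_name.lstrip("."))
    name = import_name.lstrip(".")
    if level > len(package):
        raise ImportError("relative import beyond top-level package")
    base = package[: len(package) - (level - 1)]
    return PurePosixPath(*(base + (name.split(".") if name else [])))

print(resolve_relative_import("/home/foo/coolprog",
                              "/home/foo/coolprog/a/b/c.py",
                              "..blah"))    # a/blah
```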


RE: New assignmens ...

2021-10-29 Thread Avi Gross via Python-list
Antoon,

As long as we understand that my suggestion is not meant to be taken seriously, 
your extension is along the lines I intended.

You might indeed have a family of helper functions whose purpose is to both make 
a change on the side and return the value to be used in a computation. Your 
specific implementation of something like that:

 def setxattr(obj, attr, value):
 setattr(obj, attr, value)
 return value

Would need to have access to the original object and change it in a way that 
propagates properly. So when you do this:

if setxattr(self, 'ctr', self.ctr - 1) <= 0 :

Then assuming passing it 'ctr' as a string makes sense, and the object self is 
passed by reference, I can see it working without a walrus operator.

But it is extra overhead. This being Python, setting values WITHIN an object is 
a challenge. I mean, there are ways to make a value readable but not writeable, 
or writeable only using a designated method, or an attempt to set the value may 
be intercepted and the interceptor may choose to do something different, such as 
ignoring the request if someone tries to set the time to thirteen o'clock, or 
even setting it to 1 o'clock instead. The above kind of code perhaps should not 
return value but obj.attr, so we see what was stored. But again, Python lets you 
intercept things in interesting ways, so I can imagine it showing something 
other than what you stored. 
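That kind of interception can be sketched with a property (Clock and its clamping rule are invented for illustration):

```python
class Clock:
    """Interceptor that refuses 'thirteen o'clock' by normalising
    onto a 12-hour dial instead of storing the value verbatim."""
    def __init__(self, hour=12):
        self._hour = hour

    @property
    def hour(self):
        return self._hour

    @hour.setter
    def hour(self, value):
        self._hour = value % 12 or 12   # 13 -> 1, 12 -> 12

clock = Clock()
clock.hour = 13
print(clock.hour)   # 1
```

Which is exactly why `setxattr(obj, attr, value)` returning `value` rather than `getattr(obj, attr)` can silently disagree with what the object actually holds.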

Sigh

As noted, the general case implemented walrus style may have challenges and 
even efforts like the above may not always be straightforward.

Language design is not as trivial as some think and like with many things, 
adding a neat new feature may open up holes including security holes if people 
figure out how to abuse it. Shutting down some such abilities is exactly why 
people code defensively and try to hide the inner aspects of an object by doing 
things like having a proxy in front of it and creating getters and setters.


-Original Message-
From: Python-list  On 
Behalf Of Antoon Pardon
Sent: Friday, October 29, 2021 10:04 AM
To: python-list@python.org
Subject: Re: New assignmens ...



On 28/10/2021 at 19:36, Avi Gross via Python-list wrote:
> Now for a dumb question. Many languages allow a form of setting a variable to 
> a value like:
>
>  assign(var, 5+sin(x))
>
> If we had a function that then returned var or the value of var, cleanly, 
> then would that allow an end run on the walrus operator?
>
> if (assign(sign, 5+sin(x)) <= assign(cosign, 5+cos(x))) …
>
> Not necessarily pretty and I am sure there may well be reasons it won’t work, 
> but I wonder if it will work in more places than the currently minimal walrus 
> operator.

This was the original code to illustrate the question:

 if (self.ctr:=self.ctr-1)<=0

So if I understand your suggested solution it would be something like:

 def setxattr(obj, attr, value):
 setattr(obj, attr, value)
 return value

 if setxattr(self, 'ctr', self.ctr - 1) <= 0

Did I get that right?

-- 
Antoon Pardon.





Re: walrus with a twist :+

2021-10-29 Thread Bob Martin
On 28 Oct 2021 at 18:52:26, "Avi Gross"  wrote:
>
> Ages ago, IBM used a different encoding than ASCII called EBCDIC (Extended
> Binary Coded Decimal Interchange Code) which let them use all 8 bits and
> thus add additional symbols: ± ¦ ¬

IBM started using EBCDIC with System 360 and it is still used on mainframes.



Re: Python script seems to stop running when handling very large dataset

2021-10-29 Thread Grant Edwards
On 2021-10-29, Shaozhong SHI  wrote:
> Python script works well, but seems to stop running at a certain point when
> handling very large dataset.
>
> Can anyone shed light on this?

No.

Nobody can help you with the amount of information you have provided.

--
Grant


