Fwd: keying by identity in dict and set

2019-10-26 Thread Steve White
Hi Dieter,

I'm sure that 99% of all use of 'dict' in Python is exactly this.  The
vast majority of my own Python code is such, and that is as it should
be.

Here I have an application where I can do something really cool and
useful, by keying on identity.  The built-in Python structures are
pretty limited, but I'm making a package for use by other people, and
I strongly prefer them to use familiar Python structures with it,
rather than having to learn something new, and I strongly prefer to
use off-the-shelf, tested structures, rather than rolling my own.

I spent some days trying to make it work in more conventional ways ---
nothing worked as well, as cleanly.

I am not advocating this style of programming.  I want the flexibility
to use it, when it is called for.  Yes, it has its limitations, but
limitations are to be understood and worked with.

And it is now obvious to me that somebody went to great pains to make
sure the Python 'dict' does in fact support this.  It works
splendidly.  This is no accident.

It's instructive to compare Python's containers to the container
libraries of other languages, Java for instance.  In Java, there are
many kinds of container classes, which permits finding one that is
optimal for a given application.  In contrast, Python has only a
handful.  But they are meant to be very flexible, and fairly well
optimized for most applications.

Yet the documentation *not only* suggests that 'dict' and 'set' cannot
be used for keying by identity, it gives no insight whatever into how
their internal hash algorithms use __hash__() and __eq__().  This
results in hundreds of postings by people who want to just do the
right thing, or who want to do something a little different, often
being answered by people who themselves scarcely understand what is
really going on.  While researching this question, I found several
places where somebody asked about doing something like what I
described here, but they never got a useful answer.  I also found posts
that advocate returning the value of id() in __hash__(), without
explaining how __eq__() should then be overloaded.

A little documentation would have saved me personally days of work.
It would be helpful to know:
  * under what conditions can one expect a "perfect hash", that is,
one where __eq__() will never be called?
  * is it sufficient to return the value of the key object's id()
function to produce a perfect hash?
  * when might it be useful to consider keying by identity?
  * what are the limitations of programming this way?
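
As a rough illustration of the first two questions, here is a minimal
sketch. It assumes CPython's dict behaviour; the class name Node and the
eq_calls counter are made up for this example. When __hash__() returns
id(self), distinct live objects get distinct hashes, and a lookup with the
very same object is matched by identity, so the dict does not call __eq__()
here, while == still compares values.

```python
class Node:
    """Hypothetical class: hashed by identity, compared by value."""
    eq_calls = 0  # counts every call to __eq__

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Identity-based hash: unique among objects that are alive together.
        return id(self)

    def __eq__(self, other):
        Node.eq_calls += 1
        return isinstance(other, Node) and self.value == other.value


a, b = Node(1), Node(1)
table = {a: "a", b: "b"}          # two keys, although a == b
found = (table[a], table[b])      # lookups with the original objects
calls_during_dict_use = Node.eq_calls
values_equal = (a == b)           # this explicit comparison does call __eq__
print(len(table), found, calls_during_dict_use, values_equal)
```

One limitation shows up immediately: a key can only be found again with the
original object, never with an equal copy.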

Thanks!

On Sat, Oct 26, 2019 at 7:17 AM dieter wrote:
>
> Steve White writes:
> > Regarding my question
> >  "Is there some fatal reason that this approach must never be
> > used, besides the lack of documentary support for it?"
> > It finally dawned on me that there is a use-case for containers that
> > would preclude using object identity for keys.  That is, if the object
> > is to be serialized, or otherwise stored past the run-time of the
> > program.  Of course, all the identities (all the id() values) will be
> > meaningless once the current run ends.
>
> One motivation to base dict key management on equality
> (rather than identity) are literals:
>
> Consider a dict "d" with at some place
> `d["my example key"] = 1` and at a different place
> (maybe another function, another module) you access
> `d["my example key"]`. You would expect to get `1`
> as result as for your eyes the two literals are equal.
> Would the key management be based on identity, then
> you could get either the expected `1` or a `KeyError`.
> The reason: Python does not manage (most) literals globally;
> this means, if you use the same literal in different places
> you may (or may not) have non-identical objects.
>
> Basing on equality, you are also more flexible than
> with identity, because you can change the equality
> rules for a class while you cannot change the identity rules.
> Thus, if you need identity based key management,
> define your `__eq__` accordingly.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


syntax for ellipsis like as "net.blobs['data'].data[...] = transformed_image"

2019-10-26 Thread xuanwu348
Hi buddies


Have a good weekend!
I have read some code in caffe, but I am confused by "net.blobs['data'].data[...] 
= transformed_image".
The code can be found at this link:
https://github.com/BVLC/caffe/blob/master/examples/00-classification.ipynb
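import os
import caffe

# caffe_root and transformed_image are assumed to be defined earlier in the notebook.
model_def = os.path.join(caffe_root, "models/bvlc_alexnet/deploy.prototxt")
model_weights = os.path.join(caffe_root,
                             "models/bvlc_alexnet/bvlc_alexnet.caffemodel")
net = caffe.Net(model_def,
                model_weights,
                caffe.TEST)
net.blobs['data'].reshape(50,
                          3,
                          227, 227)
# [...] addresses the whole existing array, so the data is copied into it in place.
net.blobs['data'].data[...] = transformed_image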


1. For ellipsis, is there syntax in Python like "net.blobs['data'].data[...]"?
I have checked it, and the type is "".
2. Could you provide some examples where an ellipsis is needed?


But in Python, ellipsis is defined as below, and I could not find its relationship to 
"net.blobs['data'].data[...]":
Help on ellipsis object:


class ellipsis(object)
 |  Methods defined here:
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |
 |  __reduce__(...)
 |      helper for pickle
 |
 |  __repr__(self, /)
 |      Return repr(self).


Thanks for your help!


Best Regards





fileinput

2019-10-26 Thread Pascal
I have a small python (3.7.4) script that should open a log file and
display its content but as you can see, an encoding error occurs :

---

import fileinput
import sys
try:
    source = sys.argv[1:]
except IndexError:
    source = None
for line in fileinput.input(source):
    print(line.strip())

---

python3.7.4 myscript.py myfile.log
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
invalid continuation byte

python3.7.4 myscript.py < myfile.log
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
invalid continuation byte

---

I add the encoding hook to overcome the error but this time, the script
reacts differently depending on the input used :

---

import fileinput
import sys
try:
    source = sys.argv[1:]
except IndexError:
    source = None
for line in fileinput.input(source,
        openhook=fileinput.hook_encoded("utf-8", "ignore")):
    print(line.strip())

---

python3.7.4 myscript.py myfile.log
first line of myfile.log
...
last line of myfile.log

python3.7.4 myscript.py < myfile.log
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
invalid continuation byte

python3.7.4 myscript.py /dev/stdin < myfile.log
first line of myfile.log
...
last line of myfile.log

python3.7.4 myscript.py - < myfile.log
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
invalid continuation byte

---

does anyone have an explanation and/or solution ?


Re: Congratulations to @Chris

2019-10-26 Thread Jason Friedman
>
> Chris Angelico: [PSF's] 2019 Q2 Community Service Award Winner
> http://pyfound.blogspot.com/2019/10/chris-angelico-2019-q2-community.html
>
> ...and for the many assistances and pearls of wisdom he has contributed
> 'here'!
> --
> Regards,
> =dn
>
Agreed.


web scraper

2019-10-26 Thread joseph pareti
Thank you so much for your very valuable guidance on my python experiment.
Meanwhile the problems I reported before have been solved.

This is part of a program that extracts specific information from bank
transaction records, and one functionality I still need to implement is a *web
scraper*:

The eureka guide "web scraping with python" provides some insights, which are
however tied to a specific website: by associating the "inspected" web page
with the code shown in the Eureka page, one can build the algorithm, but
can it be generalized?
Not all tags are in the form of [...] to just replace those tags in the code,
should one process a different website? [...]
<https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniqBStoreParam1=val1&wid=11.productCard.PMU_V2>
contains all the needed data, while my case requires a lookup of one item
at a time, namely:
(i) loop over a bunch of ISIN codes
(ii) access a specific website (=morningstar?), that does the
ISIN-to-fund-name translation
(iii) "inspect" that page containing the result and grab the fund name.

I would appreciate any advice on how to program all this. Thanks.
-- 
Regards,
Joseph Pareti - Artificial Intelligence consultant
Joseph Pareti's AI Consulting Services
https://www.joepareti54-ai.com/
cell +49 1520 1600 209
cell +39 339 797 0644


Re: fileinput

2019-10-26 Thread Peter Otten
Pascal wrote:

> I have a small python (3.7.4) script that should open a log file and
> display its content but as you can see, an encoding error occurs :
> 
> ---
> 
> import fileinput
> import sys
> try:
>     source = sys.argv[1:]
> except IndexError:
>     source = None
> for line in fileinput.input(source):
>     print(line.strip())
> 
> ---
> 
> python3.7.4 myscript.py myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
> 
> python3.7.4 myscript.py < myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
> 
> ---
> 
> I add the encoding hook to overcome the error but this time, the script
> reacts differently depending on the input used :
> 
> ---
> 
> import fileinput
> import sys
> try:
>     source = sys.argv[1:]
> except IndexError:
>     source = None
> for line in fileinput.input(source,
>         openhook=fileinput.hook_encoded("utf-8", "ignore")):
>     print(line.strip())
> 
> ---
> 
> python3.7.4 myscript.py myfile.log
> first line of myfile.log
> ...
> last line of myfile.log
> 
> python3.7.4 myscript.py < myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
> 
> python3.7.4 myscript.py /dev/stdin < myfile.log
> first line of myfile.log
> ...
> last line of myfile.log
> 
> python3.7.4 myscript.py - < myfile.log
> Traceback (most recent call last):
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> invalid continuation byte
> 
> ---
> 
> does anyone have an explanation and/or solution ?

'-' or no argument tells fileinput to use sys.stdin. That stream is already 
text, decoded using Python's default I/O encoding, so the open hook is not called.
You can override the default encoding by setting the environment variable

PYTHONIOENCODING=UTF8:ignore
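
If setting the environment variable is not convenient, one alternative is to
read the underlying binary stream and decode it with an explicit error
handler. This is only a sketch, not something fileinput provides:
decode_lines is a hypothetical helper name, and the BytesIO below stands in
for sys.stdin.buffer, which is what a real script would pass.

```python
import io


def decode_lines(binary_stream, encoding="utf-8", errors="ignore"):
    """Yield stripped text lines from a binary stream.

    Bytes that cannot be decoded are handled according to `errors`
    ("ignore" drops them) instead of raising UnicodeDecodeError.
    """
    text = io.TextIOWrapper(binary_stream, encoding=encoding, errors=errors)
    try:
        for line in text:
            yield line.strip()
    finally:
        # Leave the underlying binary stream open for the caller.
        text.detach()


# Stand-in for sys.stdin.buffer: the lone 0xe8 byte is invalid UTF-8 here.
sample = io.BytesIO(b"first line\ncaf\xe8 latte\nlast line\n")
lines = list(decode_lines(sample))
print(lines)
```

In the script from the question, the loop would then iterate over
decode_lines(sys.stdin.buffer) when no file name is given.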



Re: web scraper

2019-10-26 Thread Michael Torrie
On 10/25/19 9:19 AM, joseph pareti wrote:
> but can it be generalized?
> Not all tags are in the form of [...] to just replace those tags in the code, should
> one process a different website?

Not really, no.  There is not an easy way to generalize this sort of web
scraping.  There are many different ways to use html tags.  And each web
site is going to use a different scheme for defining CSS ids and
classes.  Really web scraping is customized to each website, and it's
prone to breaking as the website can change itself at any time.

The only reliable way to access and process information is if a web site
offers a nice stable web services API you can use.


Re:syntax for ellipsis like as "net.blobs['data'].data[...] = transformed_image"

2019-10-26 Thread xuanwu348
Thanks,
I have found the answer at 
https://stackoverflow.com/questions/772124/what-does-the-python-ellipsis-object-do

This came up in another question recently. I'll elaborate on my answer from 
there:

Ellipsis is an object that can appear in slice notation. For example:

myList[1:2, ..., 0]


Its interpretation is purely up to whatever implements the __getitem__ function 
and sees Ellipsis objects there, but its main (and intended) use is in the 
numeric python extension, which adds a multidimensional array type. Since there 
is more than one dimension, slicing becomes more complex than just a start 
and stop index; it is useful to be able to slice in multiple dimensions as 
well. E.g., given a 4x4 array, the top left area would be defined by the slice 
[:2,:2]:

>>> a
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

>>> a[:2,:2]  # top left
array([[1, 2],
       [5, 6]])


Extending this further, Ellipsis is used here to indicate a placeholder for the 
rest of the array dimensions not specified. Think of it as indicating the full 
slice [:] for all the dimensions in the gap it is placed, so for a 3d array, 
a[...,0] is the same as a[:,:,0] and for 4d, a[:,:,:,0], similarly, a[0,...,0] 
is a[0,:,:,0] (with however many colons in the middle make up the full number 
of dimensions in the array).

Interestingly, in python3, the Ellipsis literal (...) is usable outside the 
slice syntax, so you can actually write:

>>> ...
Ellipsis


Other than the various numeric types, no, I don't think it's used. As far as 
I'm aware, it was added purely for numpy use and has no core support other than 
providing the object and corresponding syntax. The object being there didn't 
require this, but the literal "..." support for slices did.
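
To see what an object actually receives for "[...]", here is a small
sketch in plain Python. Probe and expand_ellipsis are hypothetical names
invented for this example, and the expansion only imitates the rule described
above (fill the gap with full slices); it is not NumPy's own code. In NumPy
itself, "a[...] = value" addresses the whole existing array, which is why the
caffe line copies the image into the blob in place.

```python
class Probe:
    """Hypothetical helper: returns whatever index __getitem__ receives."""

    def __getitem__(self, index):
        return index


def expand_ellipsis(index, ndim):
    """Replace one Ellipsis in `index` with full slices so it has `ndim` entries."""
    if not isinstance(index, tuple):
        index = (index,)
    if index.count(Ellipsis) > 1:
        raise IndexError("an index can only have a single ellipsis")
    if Ellipsis not in index:
        # No ellipsis: pad the missing trailing dimensions with full slices.
        return index + (slice(None),) * (ndim - len(index))
    pos = index.index(Ellipsis)
    fill = (slice(None),) * (ndim - (len(index) - 1))
    return index[:pos] + fill + index[pos + 1:]


p = Probe()
whole = p[...]                               # the Ellipsis object itself
last_axis = expand_ellipsis(p[..., 0], 3)    # like a[:, :, 0]
both_ends = expand_ellipsis(p[0, ..., 0], 4) # like a[0, :, :, 0]
print(whole, last_axis, both_ends)
```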









Re: keying by identity in dict and set

2019-10-26 Thread Random832
On Sat, Oct 19, 2019, at 07:31, Steve White wrote:
> Hi,
> 
> I have an application that would benefit from object instances
> distinguished by identity being used in dict's and set's. To do this,
> the __hash__ method must be overridden, the obvious return value being
> the instance's id.
> 
> This works flawlessly in extensive tests on several platforms, and on
> a couple of different Python versions and implementations.
> 
> The documentation seems to preclude a second requirement, however.
> 
> I also want to use the == operator on these objects to mean a natural
> comparison of values, different from identity, so that two instances
> comparing equivalent does not imply that they are identical.

I'd like to jump in to this thread to note that while this is reasonably easily 
achieved with a custom mapping class that uses a dict along with a wrapper 
class that stores the identity...

I once tried to make a WeakKeyDictionary that was keyed by identity and had no 
end of trouble.
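
A minimal sketch of the first idea, a mapping built on a dict plus a wrapper
that stores the identity, might look like this. The names _IdKey and
IdentityDict are made up for illustration; the wrapper holds a strong
reference to the key, so this is not the weak-reference variant.

```python
from collections.abc import MutableMapping


class _IdKey:
    """Wrapper that hashes and compares by the identity of the wrapped object."""
    __slots__ = ("obj",)

    def __init__(self, obj):
        self.obj = obj  # strong reference keeps id(obj) valid while stored

    def __hash__(self):
        return id(self.obj)

    def __eq__(self, other):
        return isinstance(other, _IdKey) and self.obj is other.obj


class IdentityDict(MutableMapping):
    """Mapping keyed by object identity; keys keep their own __eq__ for ==."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[_IdKey(key)]

    def __setitem__(self, key, value):
        self._data[_IdKey(key)] = value

    def __delitem__(self, key):
        del self._data[_IdKey(key)]

    def __iter__(self):
        return (wrapped.obj for wrapped in self._data)

    def __len__(self):
        return len(self._data)


a = [1, 2]
b = [1, 2]            # equal to a, but a different object
d = IdentityDict()
d[a] = "first"
d[b] = "second"
print(a == b, len(d), d[a], d[b])
```

Because the keys are wrapped, even unhashable objects such as lists can be
used, and the key classes need no special __hash__ of their own.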


Re: How to decode UTF strings?

2019-10-26 Thread Eli the Bearded
In comp.lang.python, DFS   wrote:
> On 10/25/2019 10:57 PM, MRAB wrote:
>> Here's a simple example, based in your code:
>> 
>> from email.header import decode_header
>> 
>> def test(header, default_encoding='utf-8'):
>>     parts = []
>> 
>>     for data, encoding in decode_header(header):
>>         if isinstance(data, str):
>>             parts.append(data)
>>         else:
>>             parts.append(data.decode(encoding or default_encoding))
>> 
>>     print(''.join(parts))
>> 
>> test('=?iso-8859-9?b?T/B1eg==?= ')
>> test('=?utf-8?Q?=EB=AF=B8?= ')
>> test('=?GBK?B?0Pu66A==?= ')
>> test('=?UTF-8?B?zp3Or866zr/PgiDOks6tz4HOs86/z4I=?= ')
> I don't think it's working:

It's close. Just ''.join should be ' '.join.

> $ python decode_utf.py
> O≡uz
> 미
> ╨√║Φ
> Νίκος Βέργος

Is your terminal UTF-8? I think not.
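
To check the decoding separately from the terminal, one option is to let the
standard library assemble the string and inspect the result in code rather
than on screen. This is a sketch using email.header.make_header;
decode_mime_header is a hypothetical helper name, and the two samples are the
encoded words from the test calls above without the trailing part.

```python
from email.header import decode_header, make_header


def decode_mime_header(header):
    """Decode RFC 2047 encoded words in a header value into a str."""
    return str(make_header(decode_header(header)))


turkish = decode_mime_header('=?iso-8859-9?b?T/B1eg==?=')
korean = decode_mime_header('=?utf-8?Q?=EB=AF=B8?=')

# ascii() shows the code points even if the terminal cannot display them.
print(ascii(turkish), ascii(korean))
```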

Elijah
--
answered with C code to do this in comp.lang.c


pip3 install keyboard did not work in my environment

2019-10-26 Thread tommy yama
Hi,

Can the keyboard module be installed with pip3?

Regards,