Re: decorat{or,ion}

2018-05-19 Thread Steven D'Aprano
On Fri, 18 May 2018 18:31:16 -0700, Mike McClain wrote:

> Let's say I want something that does most or all of foo's functionality
> plus a little more and maybe tweek some of foo's output, so I write a
> wrapper around foo and call it bar. If inside bar are the call to foo,
> as well as methods baz(), buz() and bug() that make their magic and bar
> ends up performing as I want.

I *think* you are describing something like this:

def foo(x):
    return x + 1

def bar(arg):
    a = baz(arg)  # do some magic
    result = foo(a)  # call the important function
    return buz(result)  # and a bit more magic


No methods are required! We can simply talk about ordinary functions.

(Methods are, in a sense, just like normal functions except they carry 
around some extra state, namely the object that owns them.)

Is that what you mean?


> If I understood correctly baz(), buz() and bug() plus the glue
> that is bar are decorations around foo or bar is a decorator of foo.

Typically, we wouldn't use the term "decorator" or "decoration" to 
describe a hand-written function like bar(), even if it calls foo(). 
Normally the "decorator" terminology is reserved for one of two things:

(1) The software design pattern of using a factory function to "wrap" one 
function inside an automatically generated wrapper function that provides 
the additional functionality:


def factory(func):
    # Wrap func() to force it to return zero instead of negative
    def wrapped(arg):
        result = func(arg)
        if result < 0:
            result = 0
        return result
    # return the wrapped function
    return wrapped


(2) the syntax for applying such a factory function:

@factory
def myfunction(x):
    return 5 - x
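
To make the two meanings concrete, here is a minimal, self-contained sketch. It repeats the factory so that it runs on its own; the name otherfunction is made up purely for illustration. It shows that the @ syntax amounts to calling the factory and rebinding the function's name to the result.

```python
def factory(func):
    # Wrap func() so that negative results are clamped to zero.
    def wrapped(arg):
        result = func(arg)
        if result < 0:
            result = 0
        return result
    return wrapped


@factory
def myfunction(x):
    return 5 - x


# The @ line above is shorthand for defining the function first and
# then rebinding its name to whatever the factory returns:
def otherfunction(x):  # hypothetical name, for comparison only
    return 5 - x

otherfunction = factory(otherfunction)

print(myfunction(3))      # 5 - 3 is positive, so it passes through
print(myfunction(10))     # 5 - 10 is negative, so the wrapper returns 0
print(otherfunction(10))  # same behaviour without the @ syntax
```

Either spelling leaves the name bound to the wrapper, not to the original function.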


Does this help?

I know these concepts are sometimes tricky. Concrete examples often make 
them easier to understand. Feel free to ask more questions as needed.


-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: seeking deeper (language theory) reason behind Python design choice

2018-05-19 Thread Peter J. Holzer
On 2018-05-16 00:04:06 +, Steven D'Aprano wrote:
> On Tue, 15 May 2018 22:21:15 +0200, Peter J. Holzer wrote:
> > On 2018-05-15 00:52:42 +, Steven D'Aprano wrote:
> [...]
> >> By 1991 there had already been *decades* of experience with C
> > 
> > About one and a half decades.
> 
> That would still be plural decades.
> 
> C's first release was 1972, so more like 19 years than 15.

I thought it was 1974, but Ritchie writes "By early 1973, the essentials
of modern C were complete." So you are closer than me. But we don't
know whether those early users of C tended to confuse “=” and “==”.
(Ritchie did change the compound assignments from “=+”, “=-”, etc. to
“+=”, “-=”, etc., so at the early stages he did change syntax if it
proved to be error-prone. He didn't change the precedence of “&” and “|”
after introducing “&&” and “||”, though.) C spread beyond the Unix team
in the mid to late 70s.

> >> proving that the "=" assignment syntax is dangerously confusable with
> >> == and a total bug magnet when allowed as expressions as well, so it
> >> was perfectly reasonable to ban it from the language.
> > 
> > Experience? Yes. Data? I doubt it.
> 
> I'm using "proving" informally above, not in the sense of actual legal or 
> scientific standards of proof.

“Proof by assertion”? I am not a native English speaker, but I don't
think just claiming that something is the case constitutes a “proof”
even in the most informal use of the language.

Humans are notoriously bad at estimating risk. We often talk a lot about
and take rather extreme measures to avoid relatively small risks (e.g.
being killed in a terrorist attack) while mostly ignoring much larger
risks (e.g. being killed in a car accident). Yes, bugs of this class
were found in the wild. Yes, compiler writers added warnings about
suspicious use of assignments in a boolean context. Yes, Guido avoided
assignment expressions because he thought they were too dangerous. But
does any of this prove that assignment expressions are “a total bug
magnet”? I don't think so.


> If you are going to object to that, remember that neither is there 
> scientific data proving that assignment as an expression is useful, so we 
> can always just reject the construct as "useless until proven otherwise" 
> :-)

It is used, therefore it is useful ;-).

I think it is very useful in C which reports exceptional conditions
(errors, end of file, ...) by returning reserved values. So you often
want to combine an assignment and a test. The vast majority of
assignment sub-expressions you'll see in a C program are of this type.

I think it would be much less useful in Python, because Python has
exceptions and generators (and these are used by both the standard
library and idiomatic Python code). So most situations where you would
use an assignment sub-expression in C simply don't arise. As I said, I
don't miss the feature in Python.
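
A small sketch of what that looks like in practice. The stream contents and the chunk size are arbitrary stand-ins. Where C would combine an assignment and a test against a reserved return value, Python code usually iterates or catches an exception.

```python
import io
from functools import partial

stream = io.BytesIO(b"abcdefghij")  # stand-in for a real file

# C would write something like: while ((n = read(fd, buf, 4)) > 0) { ... }
# iter(callable, sentinel) keeps calling stream.read(4) until it
# returns the sentinel b"", so no assignment-and-test is needed.
chunks = [chunk for chunk in iter(partial(stream.read, 4), b"")]
print(chunks)

# Errors arrive as exceptions rather than as reserved return values.
try:
    value = int("not a number")
except ValueError:
    value = None
print(value)
```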


> > I have been programming in C since the mid-80's [...]
> > I guess I could write a script which
> > searches through all my repositories for changesets where “=” was
> > replaced by “==”. Not sure what that would prove.)
> 
> You were using version control repositories in the 1980s, and still have 
> access to them? I'm impressed.

I started using Unix in early 1987 and SCCS shortly afterwards. So this
year may mark my 30th anniversary as a version control system user. But
I don't know if I still have any of my SCCS repos, and if I have, the
disks and tapes probably aren't readable any more.

But I wasn't actually thinking of going that far back. I am not much
interested whether I made that kind of error as a C newbie in 1987. I am
interested whether I made it as an experienced C and Perl programmer.
So I was thinking of analyzing repos which are moderately current. If I
could find cases where I wrote “=” instead of “==”, I'd have to admit
that a) I do make that error at least occasionally and b) I don't always
find it before committing. But that still wouldn't prove anything: I
would have to compare it to other bug classes, and most importantly,
whether a single programmer (me) tends to make (or not make) a specific
error doesn't say anything about the entirety of programmers. And it's
the latter that counts.


> > OTOH, despite having used C and Perl long before Python, I don't miss
> > assignment expressions. Every language has its idioms, and just because
> > you write stuff like “if ((fd = open(...)) == -1)” a lot in C doesn't
> > mean you have to be able to write that in Python.
> 
> I'm not a C coder, but I think that specific example would be immune to 
> the bug we are discussing, since (I think) you can't chain assignments in 
> C. Am I right?
> 
> fd = open(...)) = -1
> 
> would be a syntax error.

Yes, because you omitted a parenthesis ;-). But yes, that would be an error
(although not a syntax error - the syntax is fine), because the result
of an assignment isn't an l-value in standard C and cannot be assigned
to. Some compilers (among them GCC)

Re: seeking deeper (language theory) reason behind Python design choice

2018-05-19 Thread Peter J. Holzer
On 2018-05-16 01:26:38 +0100, bartc wrote:
> On 15/05/2018 21:21, Peter J. Holzer wrote:
> > I have been programming in C since the mid-80's and in Perl since the
> > mid-90's (both languages allow assignment expressions). I accumulated my
> > fair share of bugs in that time, but AFAIR I made this particular error
> > very rarely (I cannot confidently claim that I never made it). Clearly
> > it is not “a total bug magnet” in my experience. There are much bigger
> > problems in C and Perl (and Python, too). But of course my experience is
> 
> All those languages use = for assignment and == for equality.
> 
> If like me you normally use a language where = means equality (and := is
> used for assignment), then you're going to get it wrong more frequently when
> using C or Python (I don't use Perl).

Absolutely. These days I program mostly in Python and Perl and find that
I often omit semicolons in Perl. If I was programming in Pascal and C I
probably would mix up “:=”, “=” and “==”. But I don't. All the programming
languages I have used regularly (C, Perl, Java, JavaScript, Python) use
the same operators for assignment and comparison. So my fingers know what
to type.

(I wonder whether the notion that “=” and “==” are easy to mix up stems
from the early days of C when C was an outlier (most other languages at
the time used “=” for equality). Now C is mainstream and it's those other
languages that seem odd.)

> You might get it wrong anyway because = is used for equality in the real
> world too.

Not after a few years of programming. Probably not even after a few
weeks of programming. You develop muscle memory quite quickly. 

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




Re: seeking deeper (language theory) reason behind Python design choice

2018-05-19 Thread Marko Rauhamaa
"Peter J. Holzer" :
> (I wonder whether the notion that “=” and “==” are easy to mix up
> stems from the early days of C when C was an outlier (most other
> languages at the time used “=” for equality). Now C is mainstream and
> it's those other languages that seem odd.)

I occasionally mix them up one way or another, whether by typing them
wrong accidentally or through some copy-and-paste mishap. Typos of all
kind happen all the time, but the "="/"==" mixup isn't easy for the eye
to spot and it doesn't create a syntax error.

A famous example:

   +   if ((options == (__WCLONE|__WALL)) && (current->uid = 0))
   +   retval = -EINVAL;

   <https://lwn.net/Articles/57135/>


Marko


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 02:26, Chris Angelico wrote:
> On Sat, May 19, 2018 at 11:10 AM, bartc  wrote:
>> The .ppm (really .pbm) file which was the subject of this sub-thread has its
>> header defined using ASCII. I don't think an EBCDIC 'P4' etc will work.
>
> "Defined using ASCII" is a tricky concept. There are a number of file
> formats that have certain parts defined because of ASCII mnemonics,
> but are actually defined numerically. The PNG format begins with the
> four bytes 89 50 4E 47, chosen because three of those bytes represent
> the letters "PNG" in ASCII. But it's defined as those byte values. The
> first three represent "i&+" in EBCDIC, and that would be just as
> valid, because you get the correct bytes.
>
> Your file contains bytes. Not text.
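
The byte values quoted above can be checked with a short sketch. It assumes Python's cp037 codec as a representative EBCDIC code page; other EBCDIC variants may differ in some positions.

```python
# The four bytes quoted above, written as hex.
signature = bytes.fromhex("89504E47")

# Bytes 2 to 4 read as "PNG" when interpreted as ASCII.
as_ascii = signature[1:4].decode("ascii")

# The first three bytes read as "i&+" under cp037 (one EBCDIC code page).
as_ebcdic = signature[:3].decode("cp037")

print(as_ascii, as_ebcdic)
```

The bytes are the same in both cases; only the interpretation changes.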


Not you understand why some of us don't bother with 'text mode' files.

However if you have an actual EBCDIC system and would to read .ppm 
files, then you will have trouble reading the numeric parameters as they 
are expressed using sequences of ASCII digits.


The simplest way would be to pass each byte through an ASCII to EBCDIC 
lookup table (so that code 0x37 for ASCII '7', which is EOT in EBCDIC, 
is turned into 0xF8 which is EBCDIC '7').


But then you are acknowledging the file is, in fact, ASCII.

--
bartc


Re: seeking deeper (language theory) reason behind Python design choice

2018-05-19 Thread Peter J. Holzer
On 2018-05-19 11:38:09 +0300, Marko Rauhamaa wrote:
> "Peter J. Holzer" :
> > (I wonder whether the notion that “=” and “==” are easy to mix up
> > stems from the early days of C when C was an outlier (most other
> > languages at the time used “=” for equality). Now C is mainstream and
> > it's those other languages that seem odd.)
> 
> I occasionally mix them up one way or another, whether by typing them
> wrong accidentally or through some copy-and-paste mishap. Typos of all
> kind happen all the time, but the "="/"==" mixup isn't easy for the eye
> to spot and it doesn't create a syntax error.
> 
> A famous example:
> 
>+   if ((options == (__WCLONE|__WALL)) && (current->uid = 0))
>+   retval = -EINVAL;

It's famous, but it is also a bad example:

1) It wasn't an accident, it was deliberate.
2) It was spotted.

hp



Re: what does := means simply?

2018-05-19 Thread Peter J. Holzer
On 2018-05-19 11:33:26 +0100, bartc wrote:
> On 19/05/2018 02:26, Chris Angelico wrote:
> > On Sat, May 19, 2018 at 11:10 AM, bartc  wrote:
> > > The .ppm (really .pbm) file which was the subject of this sub-thread has 
> > > its
> > > header defined using ASCII. I don't think an EBCDIC 'P4' etc will work.
> > 
> > "Defined using ASCII" is a tricky concept. There are a number of file
> > formats that have certain parts defined because of ASCII mnemonics,
> > but are actually defined numerically. The PNG format begins with the
> > four bytes 89 50 4E 47, chosen because three of those bytes represent
> > the letters "PNG" in ASCII. But it's defined as those byte values. The
> > first three represent "i&+" in EBCDIC, and that would be just as
> > valid, because you get the correct bytes.
> > 
> > Your file contains bytes. Not text.
> 
> Not you understand why some of us don't bother with 'text mode' files.

"Not" or "Now"?

Yesterday you claimed that you worked with them for 40 years.

> However if you have an actual EBCDIC system and would to read .ppm files,
> then you will have trouble reading the numeric parameters as they are
> expressed using sequences of ASCII digits.
> 
> The simplest way would be to pass each byte through an ASCII to EBCDIC
> lookup table (so that code 0x37 for ASCII '7', which is EOT in EBCDIC, is
> turned into 0xF8 which is EBCDIC '7').

(EBCDIC '7' is actually 0xF7)

I think the simplest way would be to perform the calculation by hand
(new_value = old_value * 10 + next_byte - 0x30). At least in a language
which lets me process individual bytes easily. That would even work on
both ASCII and EBCDIC based systems (and on every other platform, too).
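
A minimal sketch of that calculation in Python. The function name parse_decimal is made up for illustration. It works purely on byte values, so it never needs to know which character set the host system uses.

```python
def parse_decimal(data, pos=0):
    """Parse a run of ASCII-coded digits from a bytes object.

    Returns (value, index of the first byte that is not a digit).
    """
    value = 0
    # 0x30..0x39 are the byte values the format uses for '0'..'9'.
    while pos < len(data) and 0x30 <= data[pos] <= 0x39:
        value = value * 10 + data[pos] - 0x30
        pos += 1
    return value, pos


width, pos = parse_decimal(b"1024 768\n")
height, pos = parse_decimal(b"1024 768\n", pos + 1)  # skip the separator
print(width, height)
```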


> But then you are acknowledging the file is, in fact, ASCII.

No need to acknowledge anything. Yes, the inventor of the format was
obviously thinking in ASCII, but to implement it you don't have to do
that. Just read bytes and extract information from them.

hp



Re: what does := means simply?

2018-05-19 Thread Chris Angelico
On Sat, May 19, 2018 at 8:33 PM, bartc  wrote:
> On 19/05/2018 02:26, Chris Angelico wrote:
>>
>> On Sat, May 19, 2018 at 11:10 AM, bartc  wrote:
>
>
>>> The .ppm (really .pbm) file which was the subject of this sub-thread has
>>> its
>>> header defined using ASCII. I don't think an EBCDIC 'P4' etc will work.
>>>
>>
>> "Defined using ASCII" is a tricky concept. There are a number of file
>> formats that have certain parts defined because of ASCII mnemonics,
>> but are actually defined numerically. The PNG format begins with the
>> four bytes 89 50 4E 47, chosen because three of those bytes represent
>> the letters "PNG" in ASCII. But it's defined as those byte values. The
>> first three represent "i&+" in EBCDIC, and that would be just as
>> valid, because you get the correct bytes.
>>
>> Your file contains bytes. Not text.
>
>
> Not you understand why some of us don't bother with 'text mode' files.
>
> However if you have an actual EBCDIC system and would to read .ppm files,
> then you will have trouble reading the numeric parameters as they are
> expressed using sequences of ASCII digits.
>
> The simplest way would be to pass each byte through an ASCII to EBCDIC
> lookup table (so that code 0x37 for ASCII '7', which is EOT in EBCDIC, is
> turned into 0xF8 which is EBCDIC '7').
>
> But then you are acknowledging the file is, in fact, ASCII.

Cool! So what happens if you acknowledge that a file is ASCII, and
then it starts with a byte value of E3 ?

ChrisA


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 12:38, Chris Angelico wrote:
> On Sat, May 19, 2018 at 8:33 PM, bartc  wrote:
>> But then you are acknowledging the file is, in fact, ASCII.
>
> Cool! So what happens if you acknowledge that a file is ASCII, and
> then it starts with a byte value of E3 ?


It depends.

If this is a .ppm file I'm trying to read, and it starts with anything 
other than 'P' followed by one of '1','2','3','4','5','6' (by which I 
mean the ASCII codes for those), then it's a bad ppm file.


What are you really trying to say here?

Out of interest, how would Python handle the headers for binary file 
formats P4, P5, P6? I'd have a go but I don't want to waste half the day 
trying to get past the language.


It is quite possible to deal with files, including files which are 
completely or partially text, a byte at a time, without having silly 
restrictions put on them by the language.


Here's the palaver I had to go through last time I wrote such a file 
using Python, and this is just for the header:


s="P6\n%d %d\n255\n" % (hdr.width, hdr.height)
sbytes=array.array('B',list(map(ord,s)))
f.write(sbytes)

Was there a simpler way to do this? Most likely, but you 
have to find it first! Here's how I write it elsewhere:


  println @f, "P6"
  println @f, width,height
  println @f, 255

It's simpler because it doesn't get tied up in knots in trying to make 
text different from bytes, bytearrays or array.arrays.


--
bartc


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 12:33, Peter J. Holzer wrote:
> On 2018-05-19 11:33:26 +0100, bartc wrote:
>> Not you understand why some of us don't bother with 'text mode' files.
>
> "Not" or "Now"?

Now.

> Yesterday you claimed that you worked with them for 40 years.

Text files, yes. Not 'text mode' which is something inflicted on us by
the C library.

(All my current programs can deal with lf or cr/lf line endings. I
dropped cr-only line endings as I hadn't seen such a file since the 90's.)

>> However if you have an actual EBCDIC system and would to read .ppm files,
>> then you will have trouble reading the numeric parameters as they are
>> expressed using sequences of ASCII digits.
>
> I think the simplest way would be perform the calculation by hand
> (new_value = old_value * 10 + next_byte - 0x30). At least in a language
> which lets me process individual bytes easily. That would even work on
> both ASCII and EBCDIC based systems (and on every other platform, too).


/The/ simplest? Don't forget the numbers can be several digits each. 
Here's how I read them (NOT Python):


readln @f, width, height

Would it work in an EBCDIC based system? Probably not. But, who cares? 
(I can't say I've never used such a system, but that was some ancient 
mainframe from the 70s. But I'm pretty certain I won't ever again.)


(Perhaps someone who has access to an EBCDIC system can try a .PPM 
reader to see what happens. I suspect that won't work either.)


--
bartc


EuroPython 2018: Call for Proposals closes on Sunday

2018-05-19 Thread M.-A. Lemburg
We would like to remind you that our two-week call for proposals (CFP)
closes on Sunday, May 20.

If you’d like to submit a talk, please see our CFP announcement for
details:

https://blog.europython.eu/post/173666124852/europython-2018-call-for-proposals-cfp-is-open

Submissions are possible via the CFP page on the EuroPython 2018
website:

https://ep2018.europython.eu/en/call-for-proposals/

We’d like to invite everyone to submit proposals for talks, trainings,
panels, etc. Looking at the submission counts, we are especially
looking for more trainings submissions (note that you get a free
conference ticket and training pass as trainer of a selected
training).

Submissions will then go into a talk voting phase where EuroPython
attendees of previous years can vote on the submissions. The program
work group will then use these votes as basis for the final talk
selection and scheduling.

We expect to have the schedule available by end of May.

Please help us spread this reminder by sharing it on your social
networks as widely as possible. Thank you !

Link to the blog post:

https://blog.europython.eu/post/174049845802/europython-2018-call-for-proposals-closes-on

Tweet:

https://twitter.com/europython/status/997842991036424198

Enjoy,
--
EuroPython 2018 Team
https://ep2018.europython.eu/
https://www.europython-society.org/



(Not actually off-topic) Anyone here used Rebol or Red?

2018-05-19 Thread Steven D'Aprano
I'm looking for anyone with experience using either Rebol or its more 
modern fork, Red.

And yes, it is relevant to Python.


-- 
Steve



TypeError: expected string or Unicode object, NoneType found

2018-05-19 Thread subhabangalore
I wrote a small piece of following code 

import nltk
from nltk.corpus.reader import TaggedCorpusReader
from nltk.tag import CRFTagger
def NE_TAGGER():
    reader = TaggedCorpusReader('/python27/', r'.*\.pos')
    f1=reader.fileids()
    print "The Files of Corpus are:",f1
    sents=reader.tagged_sents()
    ls=len(sents)
    print "Length of Corpus Is:",ls
    train_data=sents[:300]
    test_data=sents[301:350]
    ct = CRFTagger()
    crf_tagger=ct.train(train_data,'model.crf.tagger')

This code is working fine. 
Now if I change the data size to say 500 or 3000 in  train_data by giving  
train_data=sents[:500] or
 train_data=sents[:3000] it is giving me the following error.

Traceback (most recent call last):
  File "", line 1, in 
NE_TAGGER()
  File "C:\Python27\HindiCRFNERTagger1.py", line 20, in NE_TAGGER
crf_tagger=ct.train(train_data,'model.crf.tagger')
  File "C:\Python27\lib\site-packages\nltk\tag\crf.py", line 185, in train
trainer.append(features,labels)
  File "pycrfsuite\_pycrfsuite.pyx", line 312, in 
pycrfsuite._pycrfsuite.BaseTrainer.append (pycrfsuite/_pycrfsuite.cpp:3800)
  File "stringsource", line 53, in 
vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string 
(pycrfsuite/_pycrfsuite.cpp:10738)
  File "stringsource", line 15, in 
string.from_py.__pyx_convert_string_from_py_std__in_string 
(pycrfsuite/_pycrfsuite.cpp:10633)
TypeError: expected string or Unicode object, NoneType found
>>> 

I have searched for solutions in web found the following links as,
https://stackoverflow.com/questions/14219038/python-multiprocessing-typeerror-expected-string-or-unicode-object-nonetype-f
or
https://github.com/kamakazikamikaze/easysnmp/issues/50

reloaded Python but did not find much help. 

I am using Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:22:17) [MSC 
v.1500 32 bit (Intel)] on win32

My O/S is, MS-Windows 7.

If any body may kindly suggest a resolution. 


Re: TypeError: expected string or Unicode object, NoneType found

2018-05-19 Thread Peter Otten
subhabangal...@gmail.com wrote:

> I wrote a small piece of following code
> 
> import nltk
> from nltk.corpus.reader import TaggedCorpusReader
> from nltk.tag import CRFTagger
> def NE_TAGGER():
> reader = TaggedCorpusReader('/python27/', r'.*\.pos')
> f1=reader.fileids()
> print "The Files of Corpus are:",f1
> sents=reader.tagged_sents()
> ls=len(sents)
> print "Length of Corpus Is:",ls
> train_data=sents[:300]
> test_data=sents[301:350]

Offtopic: note that sents[300] is neither in the training nor in the test 
data; Python uses half-open intervals.
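
A quick illustration of the half-open rule, using a list of integers as a stand-in for the real corpus:

```python
sents = list(range(400))       # stand-in for reader.tagged_sents()

train_data = sents[:300]       # indices 0 .. 299
test_data = sents[301:350]     # indices 301 .. 349

# Index 300 falls between the two slices and is used by neither.
print(300 in train_data, 300 in test_data)

# Starting the second slice at 300 closes the gap.
contiguous_test = sents[300:350]
print(contiguous_test[0], len(contiguous_test))
```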

> ct = CRFTagger()
> crf_tagger=ct.train(train_data,'model.crf.tagger')
> 
> This code is working fine.
> Now if I change the data size to say 500 or 3000 in  train_data by giving 
> train_data=sents[:500] or
>  train_data=sents[:3000] it is giving me the following error.

What about sents[:499], sents[:498], ...? 

I'm not an nltk user, but to debug the problem I suggest that you identify 
the exact index that triggers the exception, and then print it

print sents[minimal_index_that_causes_typeerror]

Perhaps you can spot a problem with the input data.
 
(In the spirit of the "offtopic" remark: if sents[:333] triggers the failure 
you have to print sents[332])
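
Since the error complains about a NoneType where a string was expected, one thing worth checking is whether some token in the data has no word or no tag. The following is only a sketch under that assumption: it treats the corpus as a sequence of sentences, each a list of (word, tag) pairs, and the helper name first_bad_sentence is made up for illustration.

```python
def first_bad_sentence(sents):
    """Return (index, offending pair) for the first sentence that
    contains a None word or a None tag, or None if all look fine."""
    for index, sent in enumerate(sents):
        for word, tag in sent:
            if word is None or tag is None:
                return index, (word, tag)
    return None


# Toy data standing in for the real corpus.
toy_sents = [
    [("a", "N"), ("b", "V")],
    [("c", None)],             # a token that lost its tag
    [("d", "N")],
]
print(first_bad_sentence(toy_sents))
```

If this finds nothing, the cause lies elsewhere and the index search described above is still the way to narrow it down.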


> Traceback (most recent call last):
>   File "", line 1, in 
> NE_TAGGER()
>   File "C:\Python27\HindiCRFNERTagger1.py", line 20, in NE_TAGGER
> crf_tagger=ct.train(train_data,'model.crf.tagger')
>   File "C:\Python27\lib\site-packages\nltk\tag\crf.py", line 185, in train
> trainer.append(features,labels)
>   File "pycrfsuite\_pycrfsuite.pyx", line 312, in
>   pycrfsuite._pycrfsuite.BaseTrainer.append
>   (pycrfsuite/_pycrfsuite.cpp:3800) File "stringsource", line 53, in
>   vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string
>   (pycrfsuite/_pycrfsuite.cpp:10738) File "stringsource", line 15, in
>   string.from_py.__pyx_convert_string_from_py_std__in_string
>   (pycrfsuite/_pycrfsuite.cpp:10633)
> TypeError: expected string or Unicode object, NoneType found
 
> 
> I have searched for solutions in web found the following links as,
> https://stackoverflow.com/questions/14219038/python-multiprocessing-typeerror-expected-string-or-unicode-object-nonetype-f
> or
> https://github.com/kamakazikamikaze/easysnmp/issues/50
> 
> reloaded Python but did not find much help.
> 
> I am using Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:22:17) [MSC
> v.1500 32 bit (Intel)] on win32
> 
> My O/S is, MS-Windows 7.
> 
> If any body may kindly suggest a resolution.




Re: TypeError: expected string or Unicode object, NoneType found

2018-05-19 Thread Terry Reedy

On 5/19/2018 12:47 PM, Peter Otten wrote:
> subhabangal...@gmail.com wrote:
>
>> I wrote a small piece of following code
>>
>> import nltk
>> from nltk.corpus.reader import TaggedCorpusReader
>> from nltk.tag import CRFTagger

To implement Peter's suggestion:

>> def NE_TAGGER():

def tagger(stop):

>>     reader = TaggedCorpusReader('/python27/', r'.*\.pos')
>>     f1=reader.fileids()
>>     print "The Files of Corpus are:",f1
>>     sents=reader.tagged_sents()
>>     ls=len(sents)
>>     print "Length of Corpus Is:",ls
>>     train_data=sents[:300]
>>     test_data=sents[301:350]
>
> Offtopic: not that sents[300] is neither in the training nor in the test
> data; Python uses half-open intervals.

    train_data=sents[:stop]
    test_data=sents[stop:stop+50]

>>     ct = CRFTagger()
>>     crf_tagger=ct.train(train_data,'model.crf.tagger')
>>
>> This code is working fine.
>> Now if I change the data size to say 500 or 3000 in  train_data by giving
>> train_data=sents[:500] or
>>  train_data=sents[:3000] it is giving me the following error.
>
> What about sents[:499], sents[:498], ...?

Do a rough binary search for the first stop value that raises.

tagger(400)
tagger(350 or 450, depending)
...

You could automate with bisect module, but bisection by eye should be
faster.
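
For anyone who does want to automate it, here is a hand-written binary search sketch rather than the bisect module. It assumes a hypothetical predicate fails(stop) that returns True when training on sents[:stop] raises, and it assumes that once a stop value fails, every larger one fails too.

```python
def first_failing_stop(fails, lo, hi):
    """Return the smallest stop in (lo, hi] for which fails(stop) is true.

    Assumes fails(lo) is False and fails(hi) is True.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid   # the first failure is at mid or below
        else:
            lo = mid   # everything up to mid is fine
    return hi


# Toy predicate: pretend sentence 332 is the bad one, so any slice
# sents[:stop] with stop >= 333 includes it and fails.
def toy_fails(stop):
    return stop > 332

stop = first_failing_stop(toy_fails, 300, 500)
print(stop, stop - 1)  # failing stop value, and the index to inspect
```

In real use the predicate would wrap the training call in try/except TypeError, which makes each probe as slow as one training run.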



> I'm not an nltk user, but to debug the problem I suggest that you identify
> the exact index that triggers the exception, and then print it
>
> print sents[minimal_index_that_causes_typeerror]
>
> Perhaps you can spot a problem with the input data.
>
> (In the spirit of the "offtopic" remark: if sents[:333] triggers the failure
> you have to print sents[332])

Or mentally subtract 1 from minimal failing stop value.

>> Traceback (most recent call last):
>>   File "", line 1, in
>>     NE_TAGGER()
>>   File "C:\Python27\HindiCRFNERTagger1.py", line 20, in NE_TAGGER
>>     crf_tagger=ct.train(train_data,'model.crf.tagger')
>>   File "C:\Python27\lib\site-packages\nltk\tag\crf.py", line 185, in train
>>     trainer.append(features,labels)
>>   File "pycrfsuite\_pycrfsuite.pyx", line 312, in
>>   pycrfsuite._pycrfsuite.BaseTrainer.append
>>   (pycrfsuite/_pycrfsuite.cpp:3800) File "stringsource", line 53, in
>>   vector.from_py.__pyx_convert_vector_from_py_std_3a__3a_string
>>   (pycrfsuite/_pycrfsuite.cpp:10738) File "stringsource", line 15, in
>>   string.from_py.__pyx_convert_string_from_py_std__in_string
>>   (pycrfsuite/_pycrfsuite.cpp:10633)
>> TypeError: expected string or Unicode object, NoneType found
>>
>> I have searched for solutions in web found the following links as,
>> https://stackoverflow.com/questions/14219038/python-multiprocessing-typeerror-expected-string-or-unicode-object-nonetype-f
>> or
>> https://github.com/kamakazikamikaze/easysnmp/issues/50
>>
>> reloaded Python but did not find much help.
>>
>> I am using Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:22:17) [MSC
>> v.1500 32 bit (Intel)] on win32
>>
>> My O/S is, MS-Windows 7.
>>
>> If any body may kindly suggest a resolution.

--
Terry Jan Reedy



Re: what does := means simply?

2018-05-19 Thread MRAB

On 2018-05-19 13:28, bartc wrote:
> On 19/05/2018 12:38, Chris Angelico wrote:
>> On Sat, May 19, 2018 at 8:33 PM, bartc  wrote:
>>> But then you are acknowledging the file is, in fact, ASCII.
>>
>> Cool! So what happens if you acknowledge that a file is ASCII, and
>> then it starts with a byte value of E3 ?
>
> It depends.
>
> If this is a .ppm file I'm trying to read, and it starts with anything
> other than 'P' followed by one of '1','2','3','4','5','6' (by which I
> mean the ASCII codes for those), then it's a bad ppm file.
>
> What are you really trying to say here?
>
> Out of interest, how would Python handle the headers for binary file
> formats P4, P5, P6? I'd have a go but I don't want to waste half the day
> trying to get past the language.
>
> It is quite possible to deal with files, including files which are
> completely or partially text, a byte at a time, without having silly
> restrictions put on them by the language.
>
> Here's the palaver I had to go through last time I wrote such a file
> using Python, and this is just for the header:
>
>   s="P6\n%d %d\n255\n" % (hdr.width, hdr.height)
>   sbytes=array.array('B',list(map(ord,s)))
>   f.write(sbytes)
>
> Was there a simple way to write way to do this? Most likely, but you
> have to find it first! Here's how I write it elsewhere:
>
> println @f, "P6"
> println @f, width,height
> println @f, 255
>
> It's simpler because it doesn't get tied up in knots in trying to make
> text different from bytes, bytearrays or array.arrays.


It's very simple:

s = b"P6\n%d %d\n255\n" % (hdr.width, hdr.height)
f.write(s)
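
As a self-contained version of the same idea: the sketch below writes the header to an in-memory buffer instead of a real file, and the function name write_ppm_header is made up for illustration. Formatting a bytes object with % needs Python 3.5 or later.

```python
import io


def write_ppm_header(f, width, height, maxval=255):
    # f must be opened in binary mode; the header is built directly
    # as bytes, so no separate encoding step is needed.
    f.write(b"P6\n%d %d\n%d\n" % (width, height, maxval))


buf = io.BytesIO()  # stand-in for open("image.ppm", "wb")
write_ppm_header(buf, 1024, 768)
print(buf.getvalue())
```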


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 20:47, Dennis Lee Bieber wrote:
> On Sat, 19 May 2018 13:28:41 +0100, bartc  declaimed the
> following:
>
>> Out of interest, how would Python handle the headers for binary file
>> formats P4, P5, P6? I'd have a go but I don't want to waste half the day
>> trying to get past the language.
>
> Based upon http://netpbm.sourceforge.net/doc/ppm.html
>
> P6  1024 768 255
>
> and
>
> P6
> # random comment
> 1024
> 768
> # another random comment
> 255
>
> are both valid headers.


The comments and examples here: 
https://en.wikipedia.org/wiki/Netpbm_format, and all actual ppm files 
I've come across, suggest the 3 parts of the header (2 parts for P1/P4) 
are on separate lines. That is, separated by newlines. The comments are 
a small detail that is not hard to deal with.
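
For comparison, a sketch of a Python reader that accepts either layout, including comments. It covers only the three-part headers (such as P6), it assumes a comment never starts in the middle of a token, and the function name read_ppm_header is made up for illustration.

```python
import io


def read_ppm_header(f):
    """Read magic, width, height and maxval from a binary file object.

    Tokens may be separated by any whitespace; '#' starts a comment
    that runs to the end of the line.
    """
    tokens = []
    token = b""
    while len(tokens) < 4:
        ch = f.read(1)
        if not ch:
            raise ValueError("unexpected end of header")
        if ch == b"#":
            f.readline()   # discard the rest of the comment line
            ch = b"\n"     # and treat the comment as whitespace
        if ch.isspace():
            if token:
                tokens.append(token)
                token = b""
        else:
            token += ch
    magic, width, height, maxval = tokens
    return magic.decode("ascii"), int(width), int(height), int(maxval)


one_line = io.BytesIO(b"P6  1024 768 255\n")
multi_line = io.BytesIO(
    b"P6\n# random comment\n1024\n768\n# another random comment\n255\n"
)
print(read_ppm_header(one_line))
print(read_ppm_header(multi_line))
```

The loop stops right after the single whitespace byte that follows maxval, so the file object is left positioned at the start of the pixel data.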


I think if ppm readers expect the 2-3 line format then generators will 
be less tempted to either stick everything on one line or stretch it 
across half a dozen. The point of ppm is simplicity after all.


And actually, a ppm reader I've just downloaded, an image viewer that 
deals with dozens of formats, had problems when I tried to put 
everything on one line. (I think it needs the signature on its own line.)



Reading an arbitrary PPM thereby is going to be tedious.


PPM was intended to be simple to read and to write (try TIFF, or JPEG, 
for something that is going to be a lot more work).
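
Handling both header layouts shown above takes only a short tokenizer. The
following is a sketch rather than a complete reader: the read_ppm_header
name is invented, and it makes the simplifying assumption that a "#"
comment can be treated as whitespace between tokens.

```python
import io

def read_ppm_header(stream):
    """Read magic, width, height and maxval from a binary stream."""
    tokens = []
    token = b""
    while len(tokens) < 4:
        ch = stream.read(1)
        if not ch:
            raise ValueError("unexpected end of header")
        if ch == b"#":
            stream.readline()     # discard the rest of the comment line
            ch = b"\n"            # and treat the comment as whitespace
        if ch.isspace():
            if token:
                tokens.append(token)
                token = b""
        else:
            token += ch
    magic, width, height, maxval = tokens
    return magic, int(width), int(height), int(maxval)

one_line = io.BytesIO(b"P6 1024 768 255\nDATA")
commented = io.BytesIO(
    b"P6\n# random comment\n1024\n768\n# another random comment\n255\nDATA"
)
```

The whitespace character that ends the fourth token is consumed by the
loop, so the stream is left positioned at the start of the pixel data.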



ppmfil = open("junk.ppm", "wb")


(ppmfil? We don't have 6-character limits any more.)


>>> header = struct.pack("3s27s",
...                      b"P6 ",
...                      bytes("%8s %8s %8s\n" %
...                            (width, height, maxval),
...                            "ASCII"))
>>> header
b'P6     1024      768      255\n'


Hmm, I'd write this elsewhere, if it's going to be one line, as just:

  println @f,"P6",width,height,"255"

I'm sure Python must be able to do something along these lines, even if 
it's:


  f.write("P6 "+str(width)+" "+str(height)+" 255\n")

with whatever is needed to make that string compatible with a binary 
file. I don't know what the struct.pack stuff is for; the header can 
clearly be free-format text.
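
One way to supply "whatever is needed" is to build the header as ordinary
text and encode it to ASCII just before writing. A minimal sketch; the
temporary file location and the pixel data are made up for the example.

```python
import os
import tempfile

width, height = 4, 2
header = "P6 " + str(width) + " " + str(height) + " 255\n"
pixels = bytes([255, 0, 0]) * (width * height)   # solid red, 3 bytes per pixel

path = os.path.join(tempfile.mkdtemp(), "junk.ppm")
with open(path, "wb") as f:
    f.write(header.encode("ascii"))   # text -> bytes for the binary file
    f.write(pixels)

with open(path, "rb") as f:
    written = f.read()
```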



And how would that language handle Unicode text?


That's not relevant here. (Where it might be relevant, then Unicode must 
be encoded as UTF8 within an 8-bit string.)


--
bartc


Re: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>

2018-05-19 Thread bellcanadardp
On Thursday, 29 January 2009 12:09:29 UTC-5, Anjanesh Lekshminarayanan  wrote:
> > It does auto-detect it as cp1252- look at the files in the traceback and
> > you'll see lib\encodings\cp1252.py. Since cp1252 seems to be the wrong
> > encoding, try opening it as utf-8 or latin1 and see if that fixes it.
> 
> Thanks a lot ! utf-8 and latin1 were accepted !

Hello, I am having the same issue. I believe the code was written for
Python 2 and I am running Python 3.6. At the interpreter I tried
f = open(filename, encoding="utf-8") and also latin-1, but when I run my
file I still get the error. Also, my line is at 7414. How do you find this
line? Is it better to try to run the .py file in Python 2? Thanks.


Re: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>

2018-05-19 Thread Chris Angelico
On Sun, May 20, 2018 at 8:58 AM,   wrote:
> On Thursday, 29 January 2009 12:09:29 UTC-5, Anjanesh Lekshminarayanan  wrote:
>> > It does auto-detect it as cp1252- look at the files in the traceback and
>> > you'll see lib\encodings\cp1252.py. Since cp1252 seems to be the wrong
>> > encoding, try opening it as utf-8 or latin1 and see if that fixes it.
>>
>> Thanks a lot ! utf-8 and latin1 were accepted !
>
> Hello, I am having the same issue. I believe the code was written for
> Python 2 and I am running Python 3.6. At the interpreter I tried
> f = open(filename, encoding="utf-8") and also latin-1, but when I run my
> file I still get the error. Also, my line is at 7414. How do you find this
> line? Is it better to try to run the .py file in Python 2? Thanks.

You're responding to something from 2009.

Your file is apparently not encoded the way you think it is. You'll
have to figure out what it ACTUALLY is.
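
A small sketch of one way to narrow this down with only the standard
library: read the file as bytes, attempt the decode, and turn the byte
position from the error into a line number. The find_bad_byte name and
the sample data are invented for the example. The sample uses a UTF-8
right double quotation mark, whose bytes are E2 80 9D; 0x9d is one of the
few byte values cp1252 leaves undefined.

```python
def find_bad_byte(data, encoding):
    """Return (line_number, byte_offset, byte_value) for the first
    undecodable byte, or None if the data decodes cleanly."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that failed to decode
        line_number = data.count(b"\n", 0, err.start) + 1
        return line_number, err.start, data[err.start]
    return None

# Normally: data = open(filename, "rb").read()
sample = "first line\nsecond \u201d line\n".encode("utf-8")
bad = find_bad_byte(sample, "cp1252")
```

Here bad holds the line number, byte offset and byte value of the first
failure, which gives a place to start looking in the file.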

ChrisA


Re: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>

2018-05-19 Thread Skip Montanaro
As Chris indicated, you'll have to figure out the correct encoding. You
might want to check out the chardet module (available on PyPI, I believe)
and see if it can come up with a better guess. I imagine there are other
encoding guessers out there. That's just one I'm familiar with.
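
If installing a detector is not an option, a much cruder stand-in (this
is not how chardet works) is to try a short list of candidate encodings
in order. The guess_encoding name and the candidate list are assumptions.
Note that latin-1 accepts every byte sequence, so it only makes sense as
a last resort, and a successful decode does not prove the guess is right.

```python
def guess_encoding(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes data without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

utf8_text = "caf\u00e9 \u201cquoted\u201d".encode("utf-8")
legacy_text = b"caf\xe9 \x93quoted\x94"   # cp1252 bytes, not valid UTF-8
```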

Skip


Re: decorat{or,ion}

2018-05-19 Thread Mike McClain
On Sat, May 19, 2018 at 08:22:59AM +0200, dieter wrote:
> Mike McClain  writes:

>
> An "object", in general, is something that can have attributes
> (holding the object state) and methods (often defining operations on
> the object state, but in some cases also general operations not
> related to the object state).
> In this sense, almost everything you handle in a Python script
> can be viewed as an object (variables are an exception;
> they, themselves, are not objects but their values are objects).
>
> In your example, "foo" and "bar" are functions.
> As such, they are also objects - but this feature (attributes,
> methods) is rarely used explicitly (you can use
> "dir()" to learn what attributes and methods
> your function has). You typically use functions in Python as
> you are used to using them in other programming languages.
>
>
> > ...
> > For that matter, are baz(), buz() and bug() methods if they only
> > help bar get the job done but don't have a place in bar's interface?
>
> Your functions are methods, if they are defined inside a class.
>
> Typically (there are exceptions), this means that their definition
> occurs at the top level of a "class" statement:
>
>    class CLS...:
>      ...
>      def method(self, ...):
>        ...
>
> In this case, "method" is a method of class "CLS"
> and its instances (called "CLS" objects).
>
> When you access a method of an object, then
> the function representing the method is wrapped;
> the wrapper remembers the object and the function and
> ensures that the object is automatically passed as
> the first parameter of the function ("self" in the example above)
> in method calls.
> Thus, the distinction between a method and a "normal" function
> is that for a method, the first parameter is passed automatically
> while in a "normal" function, the call must pass all arguments explicitly.
>
>
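
The wrapping described above can be seen directly at the interpreter. A
minimal sketch; the Greeter class and its names are invented for the
example.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, punctuation):
        return "Hello, " + self.name + punctuation

g = Greeter("Mike")

bound = g.greet                      # bound method: remembers g and the function
via_instance = bound("!")            # g is passed as self automatically
via_class = Greeter.greet(g, "!")    # plain function: self is passed by hand
```

Both calls produce the same string. The bound method keeps the instance
in its __self__ attribute and the underlying function in __func__.
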
Hi dieter,
Thanks for the response. This is still a foreign language to me and I need
all the help I can get. I'm reading the docs and doing the tutorial again,
but I still have more questions than answers.
If I understand what you said ('tain't necessarily so), I'll restate it in
pseudocode, since I've little feel for Python syntax yet.

Let's forget buz().

def bar(*args):
    (a,b,rest) = parseArgs(args)
    def baz(x):
        ...
    def bug(k,l,m):
        ...
    bug(foo(a), baz(b), rest)

In 'def bar()', baz & bug are simply functions.
Are they accessible outside bar?
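
For reference, names defined inside a function are local to it, so such
helpers are not reachable from outside unless the function hands them
out. A runnable sketch with stand-in bodies (the arithmetic is invented;
only the scoping matters):

```python
def bar(a, b):
    def baz(x):          # local to bar, created anew on each call
        return x * 2

    def bug(k, l):
        return k + l

    return bug(a, baz(b))

result = bar(1, 2)

try:
    baz                  # not defined at module level
    helper_visible = True
except NameError:
    helper_visible = False
```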


class bar(*args):
    (a,b,rest) = parseArgs(args)
    def baz(x):
        ...
    ...

In 'class bar()' I understand baz and bug should be named _baz_ , _bug_,
if they're not expected to be called from outside bar but there's nothing
to prevent one from doing so except manners. Also they're now methods while
still being functions and had to be declared 'def baz(self,x):'.

Feel free to laugh if what I'm saying is nonsense in python.

If I execute 'bar.baz(mu)', assuming mu is enough like b above for
baz not to throw an exception, can I expect the action of baz to change in a
manner similar to the change in parameters? Or does something in self
prevent that?
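
For reference, the usual convention for "internal" names is a single
leading underscore, and nothing in the language enforces it. A small
sketch with an invented Bar class, showing that a method's result can
depend both on its argument and on the instance passed as self:

```python
class Bar:
    def __init__(self, b):
        self.b = b

    def _baz(self, x):            # leading underscore: internal by convention
        return x + self.b

    def run(self, x):
        return self._baz(x) * 2

obj = Bar(10)
public_result = obj.run(1)        # intended entry point
private_result = obj._baz(1)      # still callable; only manners prevent it
other_result = Bar(100)._baz(1)   # same argument, different instance state
```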

How badly did I misunderstand?

Thanks,
Mike
--
I am going to go stand outside so if anyone asks about me,
tell them I'M OUTSTANDING!


Re: what does := means simply?

2018-05-19 Thread bartc

On 20/05/2018 01:39, Dennis Lee Bieber wrote:

On Sat, 19 May 2018 23:14:08 +0100, bartc  declaimed the
following:



The comments and examples here:
https://en.wikipedia.org/wiki/Netpbm_format, and all actual ppm files
I've come across, suggest the 3 parts of the header (2 parts for P1/P4)
are on separate lines. That is, separated by newlines. The comments are
a small detail that is not hard to deal with.



Wikipedia is not a definitive document...

http://netpbm.sourceforge.net/doc/ppm.html has
"""
Each PPM image consists of the following:

 A "magic number" for identifying the file type. A ppm image's magic
number is the two characters "P6".
 Whitespace (blanks, TABs, CRs, LFs).
 A width, formatted as ASCII characters in decimal.
 Whitespace.
 A height, again in ASCII decimal.
 Whitespace.
 The maximum color value (Maxval), again in ASCII decimal. Must be less
than 65536 and more than zero.
 A single whitespace character (usually a newline).
"""


I think if you are going to be generating ppm, then the best choice of 
format, for the widest acceptance, is to separate the header groups with 
a newline. (As I mentioned, my downloaded viewer needs a newline after 
the first group. My own viewer, which I only threw together the other 
day to test that benchmark, also expects the newlines. Otherwise I might 
need to do 5 minutes' coding to fix it.)


(Regarding those benchmarks 
(https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/mandelbrot.html), 
as far as I can tell every language generates the ppm file inline (no 
special ppm library), and they all generate the P4 signature on one line 
and width/height on the next line.


(Click on any source file and look for "P4". Most do it with less fuss 
than Python too.))



That all the ones you've seen have a certain layout may only mean that
the generating software used a common library implementation:
http://netpbm.sourceforge.net/doc/libnetpbm.html


Blimey, it makes a meal of it. I got the impression this was supposed to 
be a simple image format, and with the line-oriented all-text formats it 
was.


But it could be worse: they might have used XML.

--
bartc


Re: decorat{or,ion}

2018-05-19 Thread Mike McClain
On Sat, May 19, 2018 at 07:22:28AM +0000, Steven D'Aprano wrote:
> On Fri, 18 May 2018 18:31:16 -0700, Mike McClain wrote:
>

> I *think* you are describing something like this:
Real close!
> def foo(x):
>     return x + 1
>
> def bar(arg):
>     a = baz(arg)  # do some magic
>     result = bar(a)  # call the important function
      result = foo(a)  #small change
>     return buz(result)  # and a bit more magic



> Typically, we wouldn't use the term "decorator" or "decoration" to
> describe a hand-written function like bar(), even if it calls foo().
> Normally the "decorator" terminology is reserved for one of two things:
>
> (1) The software design pattern of using a factory function to "wrap" one
> function inside an automatically generated wrapper function that provides
> the extra additional functionality:
>
>
> def factory(func):
>     # Wrap func() to force it to return zero instead of negative
>     def wrapped(arg):
>         result = func(arg)
>         if result < 0:
>             result = 0
>         return result
>     # return the wrapped function
>     return wrapped
>
>
> (2) the syntax for applying such a factory function:
>
> @factory
> def myfunction(x):
>     return 5 - x

Too early to tell. Your definition of factory returns a function ref
so needs assignment unless used (execd) immediately. Yes, no?

def factory(func):
    # Wrap func() to a different kind of magic
    def bar(arg):
        a = baz(arg)  # do some magic
        result = func(a)  #small change
        return buz(result)  # and a bit more magic
    return bar

@factory
def foo(x):
    return something_outrageous

At this point it looks to me that I've created a nameless function
that can't be used. How does the assignment take place?
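
For reference, the @ line performs the assignment itself: the function is
defined, passed to factory, and whatever factory returns is bound to the
original name. A runnable sketch using stand-in baz and buz helpers
(their bodies are invented for the example):

```python
def baz(x):
    return x * 2          # stand-in "magic" applied before the call

def buz(x):
    return x + 100        # stand-in "magic" applied after the call

def factory(func):
    def bar(arg):
        a = baz(arg)
        result = func(a)
        return buz(result)
    return bar

@factory
def foo(x):
    return x + 1

# The decorator syntax above is equivalent to this explicit form:
def plain(x):
    return x + 1
plain = factory(plain)
```

So the name foo now refers to the wrapper, and calling foo(3) runs baz,
then the original foo, then buz. The wrapper's __name__ is "bar" unless
something like functools.wraps is used to copy the original name over.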

> Does this help?

If I've understood you correctly it helps one Heck of a lot.
If I haven't, no doubt the fault is mine.
I've used several procedural languages but you OOP folk think in
strange paths. So far it is a fun trip.

> I know these concepts are sometimes tricky. Concrete examples often make
> them easier to understand. Feel free to ask more questions as needed.
> --
> Steve

Walk with Light,
Mike
--
I am going to go stand outside so if anyone asks about me,
tell them I'M OUTSTANDING!


"Data blocks" syntax specification draft

2018-05-19 Thread Mikhail V
I have made up a printable PDF with the current version
of the syntax suggestion.

https://github.com/Mikhail22/Documents/blob/master/data-blocks-v01.pdf

After some of your comments I've made some further
re-considerations, e.g. element separation should
be now much simpler.
A lot of examples with comparison included.


Comments, suggestions are welcome.


M


Re: "Data blocks" syntax specification draft

2018-05-19 Thread Chris Angelico
On Sun, May 20, 2018 at 12:58 PM, Mikhail V  wrote:
> I have made up a printable PDF with the current version
> of the syntax suggestion.
>
> https://github.com/Mikhail22/Documents/blob/master/data-blocks-v01.pdf
>
> After some of your comments I've made some further
> re-considerations, e.g. element separation should
> be now much simpler.
> A lot of examples with comparison included.
>
>
> Comments, suggestions are welcome.
>

One comment.

I'm not interested in downloading a PDF. Can you rework your document
to be in a more textual format like Markdown or reStructuredText?
Since you're hosting on GitHub anyway, the rendering can be done
automatically.

ChrisA


Re: (Not actually off-topic) Anyone here used Rebol or Red?

2018-05-19 Thread Steven D'Aprano
On Sat, 19 May 2018 14:38:22 +0000, Steven D'Aprano wrote:

> I'm looking for anyone with experience using either Rebol or its more
> modern fork, Red.
> 
> And yes, it is relevant to Python.

Never mind, the Timbot has answered my question on the Python-Ideas list, 
so we're all good.



-- 
Steve
