date:20220307

Re: Execute in a multiprocessing child dynamic code loaded by the parent process

2022-03-07 Thread Barry




> On 7 Mar 2022, at 02:33, Martin Di Paola  wrote:
> 
> Yes but I think that unpickle (pickle.loads()) does that plus
> importing any module needed

Are you sure that unpickle will import code? I thought it did not do that.

Barry
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Execute in a multiprocessing child dynamic code loaded by the parent process

2022-03-07 Thread Martin Di Paola


I understand that yes, pickle.loads() imports any necessary module, but
only if it can be found in sys.path (as with any "import" statement).

Dynamic code loaded from a plugin (which we presume is *not* in
sys.path) will not be loaded.

Quick check. Run the following in one console:

>>> import multiprocessing
>>> import multiprocessing.reduction
>>>
>>> import pickle
>>> pickle.dumps(multiprocessing.reduction.ForkingPickler)


In a separate Python console, run the following:

>>> import pickle
>>> import sys
>>>
>>> 'multiprocessing' in sys.modules
False
>>>
>>> pickle.loads(<bytes printed by pickle.dumps in the first console>)
>>>
>>> 'multiprocessing' in sys.modules
True

So the last check shows that pickle.loads imports any necessary module.
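
The two-console check can also be scripted. This is only a sketch: it assumes a fresh interpreter started with `-c` has not already imported multiprocessing, and the names `CHILD_CODE`, `payload`, `before` and `after` are made up for this illustration.

```python
import multiprocessing.reduction
import pickle
import subprocess
import sys

# A class is pickled by reference: the payload stores only the module path
# and the class name, not the code itself.
payload = pickle.dumps(multiprocessing.reduction.ForkingPickler)

# Script for a fresh interpreter: record whether 'multiprocessing' is loaded
# before and after unpickling the payload read from stdin.
CHILD_CODE = """
import pickle, sys
data = sys.stdin.buffer.read()
before = 'multiprocessing' in sys.modules
pickle.loads(data)
after = 'multiprocessing' in sys.modules
print(before, after)
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    input=payload,
    capture_output=True,
    check=True,
)
before, after = result.stdout.decode().split()
print("loaded before unpickling:", before)
print("loaded after unpickling: ", after)
```

The import succeeds here only because multiprocessing can be found through sys.path in the child; a plugin loaded from an arbitrary file would not be found this way.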

Martin.

On Mon, Mar 07, 2022 at 08:28:15AM +, Barry wrote:




On 7 Mar 2022, at 02:33, Martin Di Paola  wrote:

Yes but I think that unpickle (pickle.loads()) does that plus
importing any module needed


Are you sure that unpickle will import code? I thought it did not do that.

Barry

--
https://mail.python.org/mailman/listinfo/python-list

Re: Behavior of the for-else construct

2022-03-07 Thread Grant Edwards

On 2022-03-07, Peter J. Holzer  wrote:
> On 2022-03-06 18:34:39 -0800, Grant Edwards wrote:
>> On 2022-03-06, Avi Gross via Python-list  wrote:
>> > Python is named after a snake right?
>> 
>> No. It's named after a comedy troupe.
>
> He actually wrote that two sentences later.

Yes, I missed that. His messages wrap very strangely in my newsreader.

--
Grant

-- 
https://mail.python.org/mailman/listinfo/python-list

always return the same pdf

2022-03-07 Thread Gonzalo V

Hello everyone.
I have uploaded a Django app to an Ubuntu 18.04 server and it gives me the
same PDF every time the view is called. To generate the PDF it receives
different strings, but it gives me the same PDF. Could you give me some idea
of what is happening?

Thanks, everyone.

@never_cached
def generar_pdf(request):
    prueba = request.session.get('contenedor')
    cantidad_preguntas = prueba['cantidad_preguntas']
    archivo_salida = open("prueba.tex", "w")

    archivo_salida.write("\\documentclass[10pt,oneside,letterpaper]{article}")
    archivo_salida.write("\\usepackage[utf8x]{inputenc}")

    # does more and more unimportant things with LaTeX that work fine

    archivo_a_descargar = open("prueba.pdf", "rb")
    respuesta = HttpResponse(archivo_a_descargar, content_type='application/pdf')
    respuesta['Content-Disposition'] = 'attachment; filename="{0}"'.format(
        archivo_a_descargar.name)

    return respuesta
Saludos,
Gonzalo
-- 
https://mail.python.org/mailman/listinfo/python-list

strange problem building non-pure wheel for apple M1 arm64

2022-03-07 Thread Robin Becker

I use cibuildwheel to build extensions with a github action. For the macos 11.0 arm64 build I get a strange message from 
the load command. So I am looking for assistance to try and figure out what is going wrong.


The cibuild action uses the latest pip 21.2.4 and latest setuptools etc.

I use brew to install freetype version 2.11.1.

The compilations look like this
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -g -arch 
arm64 -DRENDERPM_FT -DLIBART_COMPILATION -DLIBART_VERSION=2.3.21 -Isrc/rl_addons/renderPM 
-Isrc/rl_addons/renderPM/libart_lgpl -Isrc/rl_addons/renderPM/gt1 -I/usr/local/include/freetype2 
-I/Library/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/rl_addons/renderPM/_renderPM.c -o 
build/temp.macosx-11.0-arm64-3.9/src/rl_addons/renderPM/_renderPM.o


This is the load command, split over multiple lines for readability; the strange error follows it:

gcc -bundle -undefined dynamic_lookup -g -arch arm64
build/temp.macosx-11.0-arm64-3.9/src/rl_addons/renderPM/_renderPM.o
build/temp.macosx-11.0-arm64-3.9/src/rl_addons/renderPM/gt1/gt1-dict.o

build/temp.macosx-11.0-arm64-3.9/src/rl_addons/renderPM/gt1/gt1-namecontext.o
'''other compiled code

build/temp.macosx-11.0-arm64-3.9/src/rl_addons/renderPM/libart_lgpl/art_vpath_dash.o
-L/usr/local/lib
-L/usr/lib
-L/Library/Frameworks/Python.framework/Versions/3.9/lib
-lfreetype -o 
build/lib.macosx-11.0-arm64-3.9/reportlab/graphics/_renderPM.cpython-39-darwin.so

ld: warning: ignoring file /usr/local/lib/libfreetype.dylib, building for macOS-arm64 but attempting to link with file 
built for macOS-x86_64


The above message seems bizarre; everything is compiled for arm64, but gcc 
doesn't want to use an arm64 dylib.
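
One way to see which architectures a given dylib actually contains is to read its Mach-O header. The sketch below is an illustration, not part of the build: the function name `macho_archs` is made up, and it only recognises 64-bit thin files and universal ("fat") files, using the magic numbers and CPU type values from the Mach-O headers as I understand them.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit Mach-O, stored little-endian on x86_64 and arm64
FAT_MAGIC = 0xCAFEBABE    # universal ("fat") binary, header stored big-endian
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_archs(data):
    """Return the architecture names found in the leading bytes of a Mach-O file."""
    if len(data) < 8:
        return []
    (magic_le,) = struct.unpack_from("<I", data, 0)
    if magic_le == MH_MAGIC_64:
        # Thin file: the CPU type is the second 32-bit field of the header.
        (cputype,) = struct.unpack_from("<I", data, 4)
        return [CPU_NAMES.get(cputype, hex(cputype))]
    magic_be, count = struct.unpack_from(">II", data, 0)
    if magic_be == FAT_MAGIC:
        archs = []
        for index in range(count):
            offset = 8 + 20 * index  # each fat_arch entry is five 32-bit fields
            if offset + 4 > len(data):
                break
            (cputype,) = struct.unpack_from(">I", data, offset)
            archs.append(CPU_NAMES.get(cputype, hex(cputype)))
        return archs
    return []  # not a format this sketch understands
```

On the build machine it could be called as `macho_archs(open("/usr/local/lib/libfreetype.dylib", "rb").read(4096))`. If that returns only x86_64, the warning is accurate and the linker is skipping a library that has no arm64 slice.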

Can macos experts assist?
--
Robin Becker
--
https://mail.python.org/mailman/listinfo/python-list

Re: C API PyObject_CallFunctionObjArgs returns incorrect result

2022-03-07 Thread Jen Kris via Python-list

Thank you MRAB for your reply.

Regarding your first question, pSentence is a list.  In the nltk library, 
nltk.word_tokenize takes a string, so we convert sentence to string before we 
call nltk.word_tokenize:

>>> sentence = " ".join(sentence)
>>> pt = nltk.word_tokenize(sentence)
>>> print(sentence)
[ Emma by Jane Austen 1816 ]

But with the C API it looks like this:

PyObject *pSentence = PySequence_GetItem(pSents, sent_count);
PyObject* str_sentence = PyObject_Str(pSentence);  // Convert to string

// See what str_sentence looks like:
PyObject* repr_str = PyObject_Repr(str_sentence);  
PyObject* str_str = PyUnicode_AsEncodedString(repr_str, "utf-8", "~E~");  
const char *bytes_str = PyBytes_AS_STRING(str_str);
printf("REPR_String: %s\n", bytes_str); 

REPR_String: "['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"
So the two string representations are not the same – or at least the   
PyUnicode_AsEncodedString is not the same, as each item is surrounded by single 
quotes. 

Assuming that the conversion to bytes object for the REPR is an accurate 
representation of str_sentence, it looks like I need to strip the quotes from 
str_sentence before “PyObject* pWTok = PyObject_CallFunctionObjArgs(pNltk_WTok, 
str_sentence, 0).”   

So my questions now are (1) is there a C API function that will convert a list 
to a string exactly the same way as ‘’.join, and if not then (2) how can I 
strip characters from a string object in the C API? 

Thanks.



Mar 6, 2022, 17:42 by pyt...@mrabarnett.plus.com:

> On 2022-03-07 00:32, Jen Kris via Python-list wrote:
>
>> I am using the C API in Python 3.8 with the nltk library, and I have a 
>> problem with the return from a library call implemented with 
>> PyObject_CallFunctionObjArgs.
>>
>> This is the relevant Python code:
>>
>> import nltk
>> from nltk.corpus import gutenberg
>> fileids = gutenberg.fileids()
>> sentences = gutenberg.sents(fileids[0])
>> sentence = sentences[0]
>> sentence = " ".join(sentence)
>> pt = nltk.word_tokenize(sentence)
>>
>> I run this at the Python command prompt to show how it works:
>>
> sentence = " ".join(sentence)
> pt = nltk.word_tokenize(sentence)
> print(pt)
>
>> ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']
>>
> type(pt)
>
>> 
>>
>> This is the relevant part of the C API code:
>>
>> PyObject* str_sentence = PyObject_Str(pSentence);
>> // nltk.word_tokenize(sentence)
>> PyObject* pNltk_WTok = PyObject_GetAttrString(pModule_mstr, "word_tokenize");
>> PyObject* pWTok = PyObject_CallFunctionObjArgs(pNltk_WTok, str_sentence, 0);
>>
>> (where pModule_mstr is the nltk library).
>>
>> That should produce a list with a length of 7 that looks like it does on the 
>> command line version shown above:
>>
>> ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']
>>
>> But instead the C API produces a list with a length of 24, and the REPR 
>> looks like this:
>>
>> '[\'[\', "\'", \'[\', "\'", \',\', "\'Emma", "\'", \',\', "\'by", "\'", 
>> \',\', "\'Jane", "\'", \',\', "\'Austen", "\'", \',\', "\'1816", "\'", 
>> \',\', "\'", \']\', "\'", \']\']'
>>
>> I also tried this with PyObject_CallMethodObjArgs and PyObject_Call without 
>> success.
>>
>> Thanks for any help on this.
>>
> What is pSentence? Is it what you think it is?
> To me it looks like it's either the list:
>
>  ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']
>
> or that list as a string:
>
>  "['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"
>
> and that what you're tokenising.
> -- 
> https://mail.python.org/mailman/listinfo/python-list
>

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: C API PyObject_CallFunctionObjArgs returns incorrect result

2022-03-07 Thread Chris Angelico

On Tue, 8 Mar 2022 at 04:06, Jen Kris via Python-list
 wrote:
> But with the C API it looks like this:
>
> PyObject *pSentence = PySequence_GetItem(pSents, sent_count);
> PyObject* str_sentence = PyObject_Str(pSentence);  // Convert to string
>
> PyObject* repr_str = PyObject_Repr(str_sentence);

You convert it to a string, then take the representation of that. Is
that what you intended?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: C API PyObject_CallFunctionObjArgs returns incorrect result

2022-03-07 Thread Jen Kris via Python-list


The PyObject str_sentence is a string representation of a list.  I need to 
convert the list to a string like "".join because that's what the library call 
takes.  


Mar 7, 2022, 09:09 by ros...@gmail.com:

> On Tue, 8 Mar 2022 at 04:06, Jen Kris via Python-list
>  wrote:
>
>> But with the C API it looks like this:
>>
>> PyObject *pSentence = PySequence_GetItem(pSents, sent_count);
>> PyObject* str_sentence = PyObject_Str(pSentence);  // Convert to string
>>
>> PyObject* repr_str = PyObject_Repr(str_sentence);
>>
>
> You convert it to a string, then take the representation of that. Is
> that what you intended?
>
> ChrisA
> -- 
> https://mail.python.org/mailman/listinfo/python-list
>

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: C API PyObject_CallFunctionObjArgs returns incorrect result

2022-03-07 Thread Chris Angelico

On Tue, 8 Mar 2022 at 04:13, Jen Kris  wrote:
>
>
> The PyObject str_sentence is a string representation of a list.  I need to 
> convert the list to a string like "".join because that's what the library 
> call takes.
>

What you're doing is the equivalent of str(sentence), not
"".join(sentence). Since the join method is part of the string
protocol, you'll find it here:

https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_Join
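
The difference is easy to see from Python itself. A small sketch, using the sentence from earlier in the thread; the variable names `as_str` and `joined` are only for illustration:

```python
sentence = ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']

as_str = str(sentence)       # what PyObject_Str(pSentence) computes
joined = " ".join(sentence)  # what PyUnicode_Join(separator, pSentence) computes

print(as_str)
print(joined)
```

Only the second form is plain text that word_tokenize can split into the original seven tokens; the first still contains the list's brackets, quotes and commas.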

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

RE: Behavior of the for-else construct

2022-03-07 Thread Schachner, Joseph

Can someone please change the topic of this thread?  No longer about for-else.


Teledyne Confidential; Commercially Sensitive Business Data

-Original Message-
From: Dennis Lee Bieber  
Sent: Sunday, March 6, 2022 1:29 PM
To: python-list@python.org
Subject: Re: Behavior of the for-else construct

On Sun, 6 Mar 2022 17:39:51 +0100, "Peter J. Holzer"  
declaimed the following:

>
>(* *) for comments was actually pretty commonly used - maybe because it 
>stands out more than { }. I don't know if I've ever seen (. .) instead 
>of [ ].
>
Or some terminals provided [ ] but not { }  

Modula-2 appears to have fixed on (* *) for comments, and only [ ] for 
indexing.

Consider the potential mayhem going from a language where { } are 
comment delimiters to one where they are block delimiters 


>C also has alternative representations for characters not in the 
>common subset of ISO-646 and EBCDIC. However, the trigraphs are 
>extremely ugly (e.g ??< ??> instead of { }). I have seen them used (on 
>an IBM/390 system with an EBCDIC variant without curly braces) and it's 
>really no fun to read that.
>
My college mainframe used EBCDIC, but the available languages did not 
include C or Pascal. We had APL, FORTRAN-IV (in full separate compilation form, 
and FLAG [FORTRAN Load and Go] which was a "all in one file, compile & run" 
used by first year students), COBOL (74?), BASIC, SNOBOL, Meta-Symbol and AP 
(both assemblers, though Meta-Symbol could, provided the proper definition 
file, generate absolute binary code for pretty much any processor), and 
something called SL-1 (Simulation Language-1, which produced FORTRAN output for 
discrete event models).

UCSD Pascal, and PDP-11 assembly were run on a pair of LSI-11 systems.
Assembly used for the operating system principles course.

I didn't encounter "real" C until getting a TRS-80 (first as integer 
LC, then Pro-MC), along with Supersoft LISP (on cassette tape!). (I had books 
for C and Ada before encountering compilers for them)


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Execute in a multiprocessing child dynamic code loaded by the parent process

2022-03-07 Thread Dieter Maurer

Martin Di Paola wrote at 2022-3-6 20:42 +:
>>Try to use `fork` as "start method" (instead of "spawn").
>
>Yes but no. Indeed with `fork` there is no need to pickle anything. In
>particular the child process will be a copy of the parent so it will
>have all the modules loaded, including the dynamic ones. Perfect.
>
>The problem is that `fork` is the default only on Linux. It works on
>macOS but it may lead to crashes if the parent process is multithreaded
>(and mine is!), and `fork` does not work on Windows.

Then, you must put the initialization (dynamically loading the modules)
into the function executed in the foreign process.

You could wrap the payload function in a class instance to achieve this.
In the foreign process, you call the instance, which first performs
the initialization and then executes the payload.
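
A minimal sketch of that idea, under some assumptions: the plugin is a single Python file, the names `PluginCall`, `run_in_child` and `dynamic_plugin` are made up for this example, and the class itself lives in a module the child can import normally.

```python
import importlib.util
import multiprocessing


class PluginCall:
    """Picklable callable: stores only a file path and a function name."""

    def __init__(self, plugin_path, func_name):
        self.plugin_path = plugin_path
        self.func_name = func_name

    def __call__(self, *args, **kwargs):
        # Initialization: load the plugin from its file in whichever process
        # makes the call. The file does not need to be on sys.path.
        spec = importlib.util.spec_from_file_location("dynamic_plugin", self.plugin_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Payload: run the requested function from the freshly loaded module.
        return getattr(module, self.func_name)(*args, **kwargs)


def run_in_child(plugin_path, func_name, arg):
    """Run the plugin function in a spawned child and return its result."""
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=1) as pool:
        return pool.apply(PluginCall(plugin_path, func_name), (arg,))
```

Only the path and the function name cross the process boundary, so nothing from the dynamically loaded module has to be pickled. With the "spawn" start method, `run_in_child` should be called from under an `if __name__ == "__main__":` guard.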
-- 
https://mail.python.org/mailman/listinfo/python-list

Non sequitur: Changing subject line... WAS: Behavior of the for-else construct

2022-03-07 Thread Dennis Lee Bieber

On Mon, 7 Mar 2022 18:07:42 +, "Schachner, Joseph"
 declaimed the following:

>Can someone please change the topic of this thread?  No longer about for-else.
>

Pretty much anyone can change the subject of the message when replying.

But if one is using a threaded client that threads on message
references

Message-ID: 
References: 
 <621325684.471007.1646354302...@mail.yahoo.com>
 <20220304225746.mmebv3myg5wbp...@hjp.at>
 <657845041.42944.1646437629...@mail.yahoo.com>
 <20220305001158.g7rmlyoxxtxuf...@hjp.at>
 
  <1mc72hll06itd6jnbgdherqb3thf1fk...@4ax.com>
 <20220306163951.2ozmrhfbtsktb...@hjp.at>
 
 



it will still appear under the parent message; it will only thread
differently if one's client merely sorts on subject and date/time.


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: C API PyObject_CallFunctionObjArgs returns incorrect result

2022-03-07 Thread MRAB

On 2022-03-07 17:05, Jen Kris wrote:

> Thank you MRAB for your reply.
>
> Regarding your first question, pSentence is a list.  In the nltk
> library, nltk.word_tokenize takes a string, so we convert sentence to
> string before we call nltk.word_tokenize:
>
> >>> sentence = " ".join(sentence)
> >>> pt = nltk.word_tokenize(sentence)
> >>> print(sentence)
> [ Emma by Jane Austen 1816 ]
>
> But with the C API it looks like this:
>
> PyObject *pSentence = PySequence_GetItem(pSents, sent_count);
> PyObject* str_sentence = PyObject_Str(pSentence); // Convert to string
>
> // See what str_sentence looks like:
> PyObject* repr_str = PyObject_Repr(str_sentence);
> PyObject* str_str = PyUnicode_AsEncodedString(repr_str, "utf-8", "~E~");
> const char *bytes_str = PyBytes_AS_STRING(str_str);
> printf("REPR_String: %s\n", bytes_str);
>
> REPR_String: "['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"
>
> So the two string representations are not the same – or at least the
> PyUnicode_AsEncodedString is not the same, as each item is surrounded
> by single quotes.
>
> Assuming that the conversion to bytes object for the REPR is an
> accurate representation of str_sentence, it looks like I need to strip
> the quotes from str_sentence before “PyObject* pWTok =
> PyObject_CallFunctionObjArgs(pNltk_WTok, str_sentence, 0).”
>
> So my questions now are (1) is there a C API function that will
> convert a list to a string exactly the same way as ‘’.join, and if not
> then (2) how can I strip characters from a string object in the C API?

Your Python code is joining the list with a space as the separator.

The equivalent using the C API is:

    PyObject* separator;
    PyObject* joined;

    /* Both calls return new references (or NULL on error). */
    separator = PyUnicode_FromString(" ");
    joined = PyUnicode_Join(separator, pSentence);
    Py_DECREF(separator);

Mar 6, 2022, 17:42 by pyt...@mrabarnett.plus.com:

On 2022-03-07 00:32, Jen Kris via Python-list wrote:

I am using the C API in Python 3.8 with the nltk library, and
I have a problem with the return from a library call
implemented with PyObject_CallFunctionObjArgs.

This is the relevant Python code:

import nltk
from nltk.corpus import gutenberg
fileids = gutenberg.fileids()
sentences = gutenberg.sents(fileids[0])
sentence = sentences[0]
sentence = " ".join(sentence)
pt = nltk.word_tokenize(sentence)

I run this at the Python command prompt to show how it works:

sentence = " ".join(sentence)
pt = nltk.word_tokenize(sentence)
print(pt)

['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']

type(pt)

This is the relevant part of the C API code:

PyObject* str_sentence = PyObject_Str(pSentence);
// nltk.word_tokenize(sentence)
PyObject* pNltk_WTok = PyObject_GetAttrString(pModule_mstr,
"word_tokenize");
PyObject* pWTok = PyObject_CallFunctionObjArgs(pNltk_WTok,
str_sentence, 0);

(where pModule_mstr is the nltk library).

That should produce a list with a length of 7 that looks like
it does on the command line version shown above:

['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']

But instead the C API produces a list with a length of 24, and
the REPR looks like this:

'[\'[\', "\'", \'[\', "\'", \',\', "\'Emma", "\'", \',\',
"\'by", "\'", \',\', "\'Jane", "\'", \',\', "\'Austen", "\'",
\',\', "\'1816", "\'", \',\', "\'", \']\', "\'", \']\']'

I also tried this with PyObject_CallMethodObjArgs and
PyObject_Call without success.

Thanks for any help on this.

What is pSentence? Is it what you think it is?
To me it looks like it's either the list:

['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']

or that list as a string:

"['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"

and that what you're tokenising.
-- 
https://mail.python.org/mailman/listinfo/python-list

--
https://mail.python.org/mailman/listinfo/python-list

Re: always return the same pdf

2022-03-07 Thread MRAB


On 2022-03-07 14:08, Gonzalo V wrote:

> Hello everyone.
> I have uploaded a Django app to an Ubuntu 18.04 server and it gives me the
> same PDF every time the view is called. To generate the PDF it receives
> different strings, but it gives me the same PDF. Could you give me some idea
> of what is happening?
>
> Thanks, everyone.
>
> @never_cached
> def generar_pdf(request):
>     prueba = request.session.get('contenedor')
>     cantidad_preguntas = prueba['cantidad_preguntas']
>     archivo_salida = open("prueba.tex", "w")
>
>     archivo_salida.write("\\documentclass[10pt,oneside,letterpaper]{article}")
>     archivo_salida.write("\\usepackage[utf8x]{inputenc}")
>
>     # does more and more unimportant things with LaTeX that work fine
>
>     archivo_a_descargar = open("prueba.pdf", "rb")
>     respuesta = HttpResponse(archivo_a_descargar, content_type='application/pdf')
>     respuesta['Content-Disposition'] = 'attachment; filename="{0}"'.format(
>         archivo_a_descargar.name)
>
>     return respuesta
>
> Saludos,
> Gonzalo


You're using relative paths. Are you sure that they are pointing to the 
correct files?


Is it actually generating the PDF?

You might think that when it generates the PDF it overwrites any 
existing file of that name but is it? Is it simply giving you the PDF 
that's already there?
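
Both questions can be checked from inside the view. The helper below is a hypothetical sketch (the name `pdf_status` and its return keys are made up here); it reports where a relative name really resolves and whether the file was modified after a given moment.

```python
import os
from pathlib import Path


def pdf_status(name, started_at):
    """Report where a relative path really points and whether the file is fresh."""
    # A relative name is resolved against the process's current working
    # directory, which on a server is often not the project directory.
    path = Path(name).resolve()
    exists = path.is_file()
    # The file counts as regenerated only if it was modified after started_at.
    fresh = exists and path.stat().st_mtime >= started_at
    return {"path": str(path), "cwd": os.getcwd(), "exists": exists, "fresh": fresh}
```

Record `time.time()` at the top of the view and log `pdf_status("prueba.pdf", that_time)` just before opening the file. If `fresh` is false, the view is serving an old PDF that was never regenerated. File timestamps can be coarse, so allowing a second of slack in the comparison may be sensible.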

--
https://mail.python.org/mailman/listinfo/python-list

Re: Behavior of the for-else construct

2022-03-07 Thread Antoon Pardon




Op 4/03/2022 om 02:08 schreef Avi Gross via Python-list:

If Python was being designed TODAY, I wonder if a larger set of key words would 
be marked as RESERVED for future expansion including ORELSE and even 
NEVERTHELESS.


I think a better solution would be to have reserved words written with letters from 
the mathematical letter block.

Something like:

𝐝𝐞𝐟 foo(bar):
    in = file(...)
    𝐟𝐨𝐫 line 𝐢𝐧 in:
        ...

--
https://mail.python.org/mailman/listinfo/python-list

Re: C API PyObject_CallFunctionObjArgs returns incorrect result

2022-03-07 Thread Jen Kris via Python-list

Thanks to MRAB and Chris Angelico for your help.  Here is how I implemented the 
string conversion, and it works correctly now for a library call that needs a 
list converted to a string (error handling not shown):

PyObject* separator = PyUnicode_FromString(" ");
// Equivalent of " ".join(sentence); returns a new reference
PyObject* str_join = PyUnicode_Join(separator, pSentence);
Py_DECREF(separator);
PyObject* pNltk_WTok = PyObject_GetAttrString(pModule_mstr, "word_tokenize");
// The variable argument list must be terminated with NULL rather than 0
PyObject* pWTok = PyObject_CallFunctionObjArgs(pNltk_WTok, str_join, NULL);

That produces what I need (this is the REPR of pWTok):

"['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"

Thanks again to both of you. 

Jen


Mar 7, 2022, 11:03 by pyt...@mrabarnett.plus.com:

> On 2022-03-07 17:05, Jen Kris wrote:
>
>> Thank you MRAB for your reply.
>>
>> Regarding your first question, pSentence is a list.  In the nltk library, 
>> nltk.word_tokenize takes a string, so we convert sentence to string before 
>> we call nltk.word_tokenize:
>>
>> >>> sentence = " ".join(sentence)
>> >>> pt = nltk.word_tokenize(sentence)
>> >>> print(sentence)
>> [ Emma by Jane Austen 1816 ]
>>
>> But with the C API it looks like this:
>>
>> PyObject *pSentence = PySequence_GetItem(pSents, sent_count);
>> PyObject* str_sentence = PyObject_Str(pSentence); // Convert to string
>>
>> ; See what str_sentence looks like:
>> PyObject* repr_str = PyObject_Repr(str_sentence);
>> PyObject* str_str = PyUnicode_AsEncodedString(repr_str, "utf-8", "~E~");
>> const char *bytes_str = PyBytes_AS_STRING(str_str);
>> printf("REPR_String: %s\n", bytes_str);
>>
>> REPR_String: "['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"
>>
>> So the two string representations are not the same – or at least the   
>> PyUnicode_AsEncodedString is not the same, as each item is surrounded by 
>> single quotes.
>>
>> Assuming that the conversion to bytes object for the REPR is an accurate 
>> representation of str_sentence, it looks like I need to strip the quotes 
>> from str_sentence before “PyObject* pWTok = 
>> PyObject_CallFunctionObjArgs(pNltk_WTok, str_sentence, 0).”
>>
>> So my questions now are (1) is there a C API function that will convert a 
>> list to a string exactly the same way as ‘’.join, and if not then (2) how 
>> can I strip characters from a string object in the C API?
>>
> Your Python code is joining the list with a space as the separator.
>
> The equivalent using the C API is:
>
>     PyObject* separator;
>     PyObject* joined;
>
>     separator = PyUnicode_FromString(" ");
>     joined = PyUnicode_Join(separator, pSentence);
>     Py_DECREF(separator);
>
>>
>> Mar 6, 2022, 17:42 by pyt...@mrabarnett.plus.com:
>>
>>  On 2022-03-07 00:32, Jen Kris via Python-list wrote:
>>
>>  I am using the C API in Python 3.8 with the nltk library, and
>>  I have a problem with the return from a library call
>>  implemented with PyObject_CallFunctionObjArgs.
>>
>>  This is the relevant Python code:
>>
>>  import nltk
>>  from nltk.corpus import gutenberg
>>  fileids = gutenberg.fileids()
>>  sentences = gutenberg.sents(fileids[0])
>>  sentence = sentences[0]
>>  sentence = " ".join(sentence)
>>  pt = nltk.word_tokenize(sentence)
>>
>>  I run this at the Python command prompt to show how it works:
>>
>>  sentence = " ".join(sentence)
>>  pt = nltk.word_tokenize(sentence)
>>  print(pt)
>>
>>  ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']
>>
>>  type(pt)
>>
>>  
>>
>>  This is the relevant part of the C API code:
>>
>>  PyObject* str_sentence = PyObject_Str(pSentence);
>>  // nltk.word_tokenize(sentence)
>>  PyObject* pNltk_WTok = PyObject_GetAttrString(pModule_mstr,
>>  "word_tokenize");
>>  PyObject* pWTok = PyObject_CallFunctionObjArgs(pNltk_WTok,
>>  str_sentence, 0);
>>
>>  (where pModule_mstr is the nltk library).
>>
>>  That should produce a list with a length of 7 that looks like
>>  it does on the command line version shown above:
>>
>>  ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']
>>
>>  But instead the C API produces a list with a length of 24, and
>>  the REPR looks like this:
>>
>>  '[\'[\', "\'", \'[\', "\'", \',\', "\'Emma", "\'", \',\',
>>  "\'by", "\'", \',\', "\'Jane", "\'", \',\', "\'Austen", "\'",
>>  \',\', "\'1816", "\'", \',\', "\'", \']\', "\'", \']\']'
>>
>>  I also tried this with PyObject_CallMethodObjArgs and
>>  PyObject_Call without success.
>>
>>  Thanks for any help on this.
>>
>>  What is pSentence? Is it what you think it is?
>>  To me it looks like it's either the list:
>>
>>  ['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']
>>
>>  or that list as a string:
>>
>>  "['[', 'Emma', 'by', 'Jane', 'Austen', '1816', ']']"
>>
>>  and that what you're tokenising.
>>  -- https://mail.python.org/mailman/listinfo/python-list
>>
> -- 
> https://mail.python.org/mailman/listinfo/python-list
>

-- 
https://mail.python.org/mailman/listinfo/python-list
