Re: removing BOM prepended by codecs?

2013-09-25 Thread Dave Angel
On 25/9/2013 06:38, J. Bagg wrote: > So it is just a random sequence of "junk". > > It will be a matter of finding the real start of the record (in this > case a %) and throwing the "junk" away. Please join the list. Your present habit of starting a new thread for each of your messages is getti

removing BOM prepended by codecs?

2013-09-25 Thread J. Bagg
So it is just a random sequence of "junk". It will be a matter of finding the real start of the record (in this case a %) and throwing the "junk" away. I was misled by the note in the codecs class that BOMs were being prepended. Should have looked more carefully.

Re: removing BOM prepended by codecs?

2013-09-24 Thread Piet van Oostrum
"J. Bagg" writes: > I've checked the original files using od and they don't have BOMs. > > I'll remove them in the servlet. The overhead is probably small enough > unless somebody is doing a massive search. We have a limit anyway to > prevent somebody stealing the entire set of data. > > I starte

Re: removing BOM prepended by codecs?

2013-09-24 Thread Chris Angelico
On Wed, Sep 25, 2013 at 4:43 AM, wrote: > - The *mark* (once the Unicode.org terminology in FAQ) indicating > a unicode encoded raw text file is neither a byte order mark, > nor a signature, it is an encoded code point, the encoded > U+FEFF, 'ZERO WIDTH NO-BREAK SPACE', code point. (Note, a > non

removing BOM prepended by codecs?

2013-09-24 Thread J. Bagg
My editor is JEdit. I use it on a Win 7 machine but have everything set up for *nix files as that is the machine I'm normally working on. The files are mailed to me as updates. The library where the indexers work does use MS computers, but this is restricted to EndNote with an exporter into the o

Re: removing BOM prepended by codecs?

2013-09-24 Thread wxjmfauth
On Tuesday, 24 September 2013 11:42:22 UTC+2, J. Bagg wrote: > I'm having trouble with the BOM that is now prepended to codecs files. > > The files have to be read by java servlets which expect a clean file > > without any BOM. > > > > Is there a w

Re: removing BOM prepended by codecs?

2013-09-24 Thread Peter Otten
J. Bagg wrote: > I've checked the original files using od and they don't have BOMs. > > I'll remove them in the servlet. The overhead is probably small enough > unless somebody is doing a massive search. We have a limit anyway to > prevent somebody stealing the entire set of data. > > I started

removing BOM prepended by codecs?

2013-09-24 Thread J. Bagg
I've checked the original files using od and they don't have BOMs. I'll remove them in the servlet. The overhead is probably small enough unless somebody is doing a massive search. We have a limit anyway to prevent somebody stealing the entire set of data. I started writing the Python search

Re: removing BOM prepended by codecs?

2013-09-24 Thread Dave Angel
On 24/9/2013 09:01, J. Bagg wrote: Why would you start a new thread? Just do a Reply-List (or Reply-All and remove the extra names) to the appropriate message on the existing thread. > I'm using: > > outputfile = codecs.open (fn, 'w+', 'utf-8', errors='strict') That won't be adding a BOM. It a

Re: removing BOM prepended by codecs?

2013-09-24 Thread Tim Golden
On 24/09/2013 14:01, J. Bagg wrote: > I'm using: > > outputfile = codecs.open (fn, 'w+', 'utf-8', errors='strict') Well for the life of me I can't make that produce a BOM on 2.7 or 3.4. In other words: import codecs with codecs.open("temp
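
A minimal sketch of the point being demonstrated above, assuming Python 2.6+/3.x (file names are illustrative): the plain 'utf-8' codec writes no BOM, 'utf-8-sig' does, and decoding with 'utf-8-sig' strips a leading BOM if one is present.

    import codecs

    # Plain 'utf-8' adds no BOM; 'utf-8-sig' prepends one on the first write.
    f = codecs.open("no_bom.txt", "w", "utf-8")
    f.write(u"%record data")
    f.close()
    f = codecs.open("with_bom.txt", "w", "utf-8-sig")
    f.write(u"%record data")
    f.close()

    print(repr(open("no_bom.txt", "rb").read()[:4]))    # starts with '%rec' - no BOM
    print(repr(open("with_bom.txt", "rb").read()[:4]))  # starts with '\xef\xbb\xbf' - BOM

    # Reading back with 'utf-8-sig' drops a leading BOM and is harmless without one.
    f = codecs.open("with_bom.txt", "r", "utf-8-sig")
    print(repr(f.read()))                               # u'%record data'
    f.close()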

removing BOM prepended by codecs?

2013-09-24 Thread J. Bagg
I'm using: outputfile = codecs.open (fn, 'w+', 'utf-8', errors='strict') to write as I know that the files are unicode compliant. I run the raw files that are delivered through a Python script to check the unicode and report problem characters which are then edited. The files use a whole vari

Re: removing BOM prepended by codecs?

2013-09-24 Thread Peter Otten
J. Bagg wrote: > I'm having trouble with the BOM that is now prepended to codecs files. > The files have to be read by java servlets which expect a clean file > without any BOM. > > Is there a way to stop the BOM being written? I think if you specify the byte order explic
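
A short sketch of that suggestion for the UTF-16 case, where the BOM actually carries the byte order (file names are illustrative): naming the byte order explicitly suppresses the BOM.

    import codecs

    f = codecs.open("bom.txt", "w", "utf-16")        # generic codec: writes a BOM
    f.write(u"abc")
    f.close()
    f = codecs.open("no_bom.txt", "w", "utf-16-le")  # explicit byte order: no BOM
    f.write(u"abc")
    f.close()

    print(repr(open("bom.txt", "rb").read()))     # starts with '\xff\xfe'
    print(repr(open("no_bom.txt", "rb").read()))  # 'a\x00b\x00c\x00'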

Re: removing BOM prepended by codecs?

2013-09-24 Thread Steven D'Aprano
On Tue, 24 Sep 2013 10:42:22 +0100, J. Bagg wrote: > I'm having trouble with the BOM that is now prepended to codecs files. > The files have to be read by java servlets which expect a clean file > without any BOM. > > Is there a way to stop the BOM being written? Of cou

removing BOM prepended by codecs?

2013-09-24 Thread J. Bagg
I'm having trouble with the BOM that is now prepended to codecs files. The files have to be read by java servlets which expect a clean file without any BOM. Is there a way to stop the BOM being written? It is seriously messing up my work as the servlets do not expect it to be there. I

Re: Proper use of the codecs module.

2013-08-16 Thread Chris Angelico
On Fri, Aug 16, 2013 at 3:02 PM, Andrew wrote: > I have a mixed binary/text file[0], and the text portions use a radically > nonstandard character set. I want to read them easily given information > about the character encoding and an offset for the beginning of a string. To add to all the inform

Re: Proper use of the codecs module.

2013-08-16 Thread Andrew
portions of > the file, then explicitly decode them. Okay, I'll do that. Given what you said about seek() and text mode below, I have no choice anyway. >> I would like to know how to correctly set up a new codec for >> reading files that have nonstandard encodings. > > I sugge

Re: Proper use of the codecs module.

2013-08-16 Thread Steven D'Aprano
converting the two bytes 0x0D0A to a single byte 0x0A). So best to stick to binary only, extract the "text" portions of the file, then explicitly decode them. > The descriptions of the codecs module and codecs.register() in > particular seem to suggest that this is already supported

Proper use of the codecs module.

2013-08-16 Thread Andrew
I have a mixed binary/text file[0], and the text portions use a radically nonstandard character set. I want to read them easily given information about the character encoding and an offset for the beginning of a string. The descriptions of the codecs module and codecs.register() in particular

Re: codecs in a chroot / without fs access

2012-01-10 Thread K Richard Pixley
On 1/9/12 16:41, Philipp Hagemeister wrote: I want to forbid my application to access the filesystem. The easiest way seems to be chrooting and dropping privileges. However, surprisingly, python loads the codecs from the filesystem on-demand, which makes my program crash: import os os.getuid

Re: codecs in a chroot / without fs access

2012-01-10 Thread Miki Tebeka
Another option is to copy the data to a location under the new chroot and register a new lookup function (http://docs.python.org/library/codecs.html#codecs.register). This way you can save some memory. -- http://mail.python.org/mailman/listinfo/python-list
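
A hedged sketch of that idea (the two codecs chosen here are just examples): load what you need while the filesystem is still reachable, then register a search function that serves the cached entries from memory after the chroot.

    import codecs
    import encodings.utf_8, encodings.latin_1   # import before chroot/privilege drop

    _preloaded = {
        "utf-8": encodings.utf_8.getregentry(),
        "latin-1": encodings.latin_1.getregentry(),
    }

    def _search(name):
        # lookup() hands us a normalized name; treat '_' and '-' alike here
        return _preloaded.get(name.replace("_", "-"))

    codecs.register(_search)

    # After chrooting, these no longer need to import anything from disk:
    u"\xe9".encode("utf-8")
    b"abc".decode("latin-1")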

codecs in a chroot / without fs access

2012-01-09 Thread Philipp Hagemeister
I want to forbid my application to access the filesystem. The easiest way seems to be chrooting and dropping privileges. However, surprisingly, python loads the codecs from the filesystem on-demand, which makes my program crash: >>> import os >>> os.getuid() 0 >>> os.c

Re: When to use codecs vs. io module (Python 2.7 and higher)

2010-12-01 Thread Antoine Pitrou
On Wed, 01 Dec 2010 09:55:01 -0500 pyt...@bdurham.com wrote: > Python 2.7 or higher: Looking for reasons/scenarios where one > should use the codecs vs. io module. > > Are there use cases that favor one specific module over the other > module? > > My understanding is that

When to use codecs vs. io module (Python 2.7 and higher)

2010-12-01 Thread python
Python 2.7 or higher: Looking for reasons/scenarios where one should use the codecs vs. io module. Are there use cases that favor one specific module over the other module? My understanding is that the io module is much faster than the codecs module (and can be used interchangably), but the

Re: Cross-platform detection of exceptions raised during file access via os, shutil, codecs, etc.

2010-06-09 Thread Thomas Jollans
On 06/09/2010 11:56 PM, pyt...@bdurham.com wrote: > I'm looking for some suggestions on how to detect exceptions raised > during common types of file access (os, shutil, codecs, etc.) on a > cross-platform basis. I'm looking for feedback relative to Python 2.6 > and 2.7 but

Cross-platform detection of exceptions raised during file access via os, shutil, codecs, etc.

2010-06-09 Thread python
I'm looking for some suggestions on how to detect exceptions raised during common types of file access (os, shutil, codecs, etc.) on a cross-platform basis. I'm looking for feedback relative to Python 2.6 and 2.7 but would also appreciate hearing of any Python 3.x specific behaviors. Un

Re: Programmatically discovering encoding types supported by codecs module

2010-03-28 Thread python
te: Wed, 24 Mar 2010 19:50:11 -0300 Subject: Re: Programmatically discovering encoding types supported by codecs module On Wed, 24 Mar 2010 14:58:47 -0300, wrote: >> After looking at how things are done in codecs.c and >> encodings/__init__.py I think you should enumerate all mo

Re: Programmatically discovering encoding types supported by codecs module

2010-03-24 Thread Gabriel Genellina
into this for me. Benjamin Kaplan made a similar observation. My reply to him included the snippet of code we're using to generate the actual list of encodings that our software will support (thanks to Python's codecs and encodings modules). I was curious as to whether both methods woul

Re: Programmatically discovering encoding types supported by codecs module

2010-03-24 Thread python
Benjamin Kaplan made a similar observation. My reply to him included the snippet of code we're using to generate the actual list of encodings that our software will support (thanks to Python's codecs and encodings modules). Your help is always appreciated :) Regards, Malcolm - Original

Re: Programmatically discovering encoding types supported by codecs module

2010-03-24 Thread python
Benjamin, > According to my brief messing around with the REPL, encodings.aliases.aliases > is a good place to start. I don't know of any way to get the Language column, > but at the very least that will give you most of the supported encodings and > any aliases they have. Thank you - that's e
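
A small sketch of that starting point (the Language column from the documentation table is not available programmatically):

    import encodings.aliases

    # aliases maps alias -> canonical codec name; the set of values is a
    # reasonable approximation of the supported encodings.
    for name in sorted(set(encodings.aliases.aliases.values())):
        print(name)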

Re: Programmatically discovering encoding types supported by codecs module

2010-03-24 Thread Gabriel Genellina
On Wed, 24 Mar 2010 13:17:16 -0300, wrote: Is there a way to programmatically discover the encoding types supported by the codecs module? For example, the following link shows a table with Codec, Aliases, and Language columns. http://docs.python.org/library/codecs.html#standard-encodings

Re: Programmatically discovering encoding types supported by codecs module

2010-03-24 Thread Benjamin Kaplan
On Wed, Mar 24, 2010 at 12:17 PM, wrote: > Is there a way to programmatically discover the encoding types supported by > the codecs module? > > For example, the following link shows a table with Codec, Aliases, and > Language columns. > http://docs.python.org/library/co

Programmatically discovering encoding types supported by codecs module

2010-03-24 Thread python
Is there a way to programmatically discover the encoding types supported by the codecs module? For example, the following link shows a table with Codec, Aliases, and Language columns. http://docs.python.org/library/codecs.html#standard-encodings I'm looking for a way to programmatically gen

Re: Missing codecs in Python 3.0

2009-06-03 Thread Benjamin Peterson
samwyse gmail.com> writes: > > I have a Python 2.6 program (a code generator, actually) that tries > several methods of compressing a string and chooses the most compact. > It then writes out something like this: > { encoding='bz2_codec', data = '...'} I

Re: Missing codecs in Python 3.0

2009-06-02 Thread Martin v. Löwis
samwyse wrote: > I have a Python 2.6 program (a code generator, actually) that tries > several methods of compressing a string and chooses the most compact. > It then writes out something like this: > { encoding='bz2_codec', data = '...'} > > I'm having two problems converting this to Py3. Firs

Re: Missing codecs in Python 3.0

2009-06-02 Thread Carl Banks
counterpart 2.6 doc page. > You can always use the `bz2` module instead. Or write your own > encoder/decoder for bz2 and register it with the `codecs` module. IIRC, they decided the codecs would only be used for bytes<->unicode encodings in Python 3.0 (which was their intended use all along),

Re: Missing codecs in Python 3.0

2009-06-02 Thread Chris Rebert
is gone forever from > the standard libraries? That appears to be the case. "bz2" is not listed on http://docs.python.org/3.0/library/codecs.html , but it is listed on the counterpart 2.6 doc page. You can always use the `bz2` module instead. Or write your own encoder/decoder for bz2 and
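
A sketch of that replacement under Python 3 (the hex-literal format is illustrative, not the original poster's): compress with the bz2 module directly instead of the removed 'bz2_codec'.

    import bz2, binascii

    data = "generated source text " * 40

    compressed = bz2.compress(data.encode("utf-8"))          # bytes in, bytes out
    literal = binascii.hexlify(compressed).decode("ascii")   # ASCII-safe literal to embed

    restored = bz2.decompress(binascii.unhexlify(literal)).decode("utf-8")
    assert restored == data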

Missing codecs in Python 3.0

2009-06-02 Thread samwyse
I have a Python 2.6 program (a code generator, actually) that tries several methods of compressing a string and chooses the most compact. It then writes out something like this: { encoding='bz2_codec', data = '...'} I'm having two problems converting this to Py3. First is the absence of the bz2

Re: codecs, csv issues

2008-08-22 Thread John Machin
On Aug 22, 11:52 pm, George Sakkis <[EMAIL PROTECTED]> wrote: > I'm trying to use codecs.open() and I see two issues when I pass > encoding='utf8': > > 1) Newlines are hardcoded to LINEFEED (ascii 10) instead of the > platform-specific byte(s). > > im

Re: codecs, csv issues

2008-08-22 Thread Peter Otten
George Sakkis wrote: > I'm trying to use codecs.open() and I see two issues when I pass > encoding='utf8': > > 1) Newlines are hardcoded to LINEFEED (ascii 10) instead of the > platform-specific byte(s). > > import codecs > f = codecs.open('
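
One workaround sketch (assuming Python 2.6+, where the io module is available): io.open performs the platform newline translation that codecs.open skips, while still encoding to UTF-8.

    import io

    f = io.open("tmp.txt", "w", encoding="utf-8")   # newline=None: '\n' -> os.linesep
    f.write(u"first\nsecond\n")
    f.close()

    print(repr(open("tmp.txt", "rb").read()))       # b'first\r\nsecond\r\n' on Windows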

codecs, csv issues

2008-08-22 Thread George Sakkis
I'm trying to use codecs.open() and I see two issues when I pass encoding='utf8': 1) Newlines are hardcoded to LINEFEED (ascii 10) instead of the platform-specific byte(s). import codecs f = codecs.open('tmp.txt', 'w', encoding='utf8') s

Re: Free software and video codecs

2008-08-12 Thread Ben Finney
[EMAIL PROTECTED] writes: > Do you have a recommendation for how I can convert the quicktime and > flash movies to ogg? That would produce an inferior result, since the Quicktime and Flash movies have (I assume) already been converted to a lossy codec http://en.wikipedia.org/wiki/Lossy_compressio

Re: Free software and video codecs (was: ANN: Chandler 1.0)

2008-08-12 Thread mimiyin
Hi, Do you have a recommendation for how I can convert the quicktime and flash movies to ogg? I've had a hard time finding conversion apps for the Mac.I'd also be happy to turn over the original movies to anyone who can help me convert them. Thanks! Mimi (Product Designer, Chandler Project - aka

Re: Free software and video codecs (was: ANN: Chandler 1.0)

2008-08-10 Thread Kay Schluehr
On 11 Aug., 07:24, Ben Finney <[EMAIL PROTECTED]> wrote: > Kay Schluehr <[EMAIL PROTECTED]> writes: > > On 11 Aug., 04:43, "SPE - Stani's Python Editor" > > <[EMAIL PROTECTED]> wrote: > > > As an open source project please be kind to Linux users and provide > > > also your screencasts in open sourc

Free software and video codecs (was: ANN: Chandler 1.0)

2008-08-10 Thread Ben Finney
Kay Schluehr <[EMAIL PROTECTED]> writes: > On 11 Aug., 04:43, "SPE - Stani's Python Editor" > <[EMAIL PROTECTED]> wrote: > > As an open source project please be kind to Linux users and provide > > also your screencasts in open source video standards such (as ogg > > video) instead of only mov and

How to create python codecs?

2008-08-06 Thread yrogirg
ion anywhere. I've tried to create a simple utf-to-utf codec for some symbols but it doesn't work. Here it is. import codecs ### Codec APIs class Codec(codecs.Codec): def encode(self,input,errors='strict'): return codecs.charmap_encode(input,errors,encoding_table) def deco
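
A self-contained sketch of a registered charmap-style codec (Python 2; the codec name 'tabswap' and the 0x09/0x88, 0x0A/0x85 mapping are illustrative, not from the original post):

    import codecs

    # 256-entry identity table, then swap TAB<->0x88 and LF<->0x85.
    _table = [unichr(i) for i in range(256)]
    _table[0x09], _table[0x88] = u'\x88', u'\x09'
    _table[0x0a], _table[0x85] = u'\x85', u'\x0a'
    decoding_table = u''.join(_table)
    encoding_table = codecs.charmap_build(decoding_table)

    class Codec(codecs.Codec):
        def encode(self, input, errors='strict'):
            return codecs.charmap_encode(input, errors, encoding_table)
        def decode(self, input, errors='strict'):
            return codecs.charmap_decode(input, errors, decoding_table)

    class StreamWriter(Codec, codecs.StreamWriter):
        pass

    class StreamReader(Codec, codecs.StreamReader):
        pass

    def _search(name):
        if name == 'tabswap':
            return codecs.CodecInfo(
                name='tabswap',
                encode=Codec().encode,
                decode=Codec().decode,
                streamreader=StreamReader,
                streamwriter=StreamWriter,
            )
        return None

    codecs.register(_search)

    print repr(u'a\tb\n'.encode('tabswap'))   # 'a\x88b\x85'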

Re: codecs / subprocess interaction: utf help requested

2007-06-10 Thread smitty1e
mode, all things going swimmingly. > > I don't know what's going on with the piping in the second version. > > It looks like the output of p0 gets converted to unicode at some > > point, > > Whatever gave you that idea? > > > but I might be misunderstandi

Re: codecs / subprocess interaction: utf help requested

2007-06-10 Thread John Machin
ond version. > It looks like the output of p0 gets converted to unicode at some > point, Whatever gave you that idea? > but I might be misunderstanding what's going on. The 4.8 > codecs module documentation doesn't really offer much enlightment, > nor google. About t

codecs / subprocess interaction: utf help requested

2007-06-10 Thread smitty1e
isunderstanding what's going on. The 4.8 codecs module documentation doesn't really offer much enlightenment, nor google. About the only other place I can think to look would be the unit test cases shipped with python. Sort of hoping one of the guru-level pythonistas can point to illuminati

Re: codecs - where are those on windows?

2006-11-04 Thread GHUM
s.builtin_module_names > ('__builtin__', '__main__', '_ast', '_bisect', '_codecs', > '_codecs_cn', '_codecs_hk', '_codecs_iso2022', '_codecs_jp', > '_codecs_kr', '_codecs_tw', ...
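
A quick sketch of the check being made here: codec modules compiled into the interpreter itself (and therefore into pythonXY.dll on Windows) show up in sys.builtin_module_names.

    import sys

    builtin_codec_modules = [m for m in sys.builtin_module_names
                             if m.startswith('_codecs')]
    print(builtin_codec_modules)   # e.g. ['_codecs', '_codecs_cn', '_codecs_hk', ...]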

Re: codecs - where are those on windows?

2006-10-30 Thread Fredrik Lundh
Paul Watson wrote: >> So, my question is: on Windows. where are those CJK codecs? Are they by >> any chance included in the 1.867.776 bytes of python24.dll ? > > If your installation directory is C:\Python25, then look in > > C:\Python25\lib\encodings that's on

Re: codecs - where are those on windows?

2006-10-30 Thread Paul Watson
GHUM wrote: > I stumbled upon a paragraph in python-dev about "reducing the size of > Python" for an embedded device: > > """ > In my experience, the biggest gain can be obtained by dropping the > rarely-used > CJK codecs (for Asian languages). Tha

codecs - where are those on windows?

2006-10-30 Thread GHUM
I stumbled upon a paragraph in python-dev about "reducing the size of Python" for an embedded device: """ In my experience, the biggest gain can be obtained by dropping the rarely-used CJK codecs (for Asian languages). That should sum up to almost 800K (uncompressed), I

Re: Python UTF-8 and codecs

2006-06-27 Thread Serge Orlov
On 6/27/06, Mike Currie <[EMAIL PROTECTED]> wrote: > Well, not really. It doesn't affect the result. I still get the error > message. Did you get a different result? Yes, the program successfully wrote the text file. Without magic abilities to read the screen of your computer I guess you now get ex

Re: Python UTF-8 and codecs

2006-06-27 Thread Mike Currie
Well, not really. It doesn't affect the result. I still get the error message. Did you get a different result? "Serge Orlov" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On 6/27/06, Mike Currie <[EMAIL PROTECTED]> wrote: >> Okay, >> >> Here is a sample of what I'm doing: >>

Re: Python UTF-8 and codecs

2006-06-27 Thread Serge Orlov
On 6/27/06, Mike Currie <[EMAIL PROTECTED]> wrote: > Okay, > > Here is a sample of what I'm doing: > > > Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on > win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> filterMap = {} > >>> for i in rang

Re: Python UTF-8 and codecs

2006-06-27 Thread Mike Currie
,255): ... filterMap[chr(i)] = chr(i) ... >>> filterMap[chr(9)] = chr(136) >>> filterMap[chr(10)] = chr(133) >>> filterMap[chr(136)] = chr(9) >>> filterMap[chr(133)] = chr(10) >>> line = '''this has ... tabsand line .

Re: Python UTF-8 and codecs

2006-06-27 Thread Mike Currie
>> >> I've tried using the codecs.open('foo.txt', 'rU', 'utf-8', >> errors='strict') >> and that doesn't work and I've also tried wrapping the file in an >> utf8_writer >> using codecs.lookup('utf8')

Re: Python UTF-8 and codecs

2006-06-27 Thread Serge Orlov
> > I've tried using the codecs.open('foo.txt', 'rU', 'utf-8', errors='strict') > and that doesn't work and I've also tried wrapping the file in an utf8_writer > using codecs.lookup('utf8') > > Any clues? Use unicode string
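
A sketch of that fix (Python 2; the translation table below is a guess at the poster's intent, swapping TAB/LF with 0x88/0x85): keep everything unicode, so the UTF-8 writer never has to ASCII-decode a byte string.

    import codecs

    filter_map = {u'\t': u'\x88', u'\n': u'\x85', u'\x88': u'\t', u'\x85': u'\n'}
    line = u'this has\ttabs and\nline feeds'
    translated = u''.join(filter_map.get(ch, ch) for ch in line)

    out = codecs.open('foo.txt', 'w', 'utf-8', errors='strict')
    out.write(translated)   # unicode in, UTF-8 bytes out: no "ascii codec" error
    out.close()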

Re: Python UTF-8 and codecs

2006-06-27 Thread Dennis Benzinger
Mike Currie wrote: > I'm trying to write out files that have utf-8 characters 0x85 and 0x08 in > them. Every configuration I try I get a UnicodeError: ascii codec can't > decode byte 0x85 in position 255: ordinal not in range(128) > > I've tried using the codecs.open('foo.txt', 'rU', 'utf-8',

Python UTF-8 and codecs

2006-06-27 Thread Mike Currie
I'm trying to write out files that have utf-8 characters 0x85 and 0x08 in them. Every configuration I try I get a UnicodeError: ascii codec can't decode byte 0x85 in position 255: ordinal not in range(128) I've tried using the codecs.open('foo.txt', 'rU', 'utf-8', errors='strict') and that do

Re: split() can help to read UTF-16 encoded file without codecs support, why?

2006-03-17 Thread Fuzzyman
Zhongjian Lu wrote: > Hi Guys, > > I was processing a UTF-16 coded file with BOM and was not aware of the > codecs package at first. I wrote the following code: > = Code 1 > for i in open("d:\python24\lzjtest.xml", 'r').readlin

split() can help to read UTF-16 encoded file without codecs support, why?

2006-03-17 Thread Zhongjian Lu
Hi Guys, I was processing a UTF-16 coded file with BOM and was not aware of the codecs package at first. I wrote the following code: = Code 1 for i in open("d:\python24\lzjtest.xml", 'r').readlines(): i = i.deco
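
For comparison, a sketch of the codecs-based version of that loop (the path is the one from the post; the per-line handling is a placeholder): the 'utf-16' codec consumes the BOM and decodes each line, so no manual workaround is needed.

    import codecs

    f = codecs.open("d:\\python24\\lzjtest.xml", "r", "utf-16")
    for line in f:
        pass   # placeholder: work with the decoded unicode line here
    f.close()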

frozen codecs

2006-02-23 Thread Robin Becker
I'm having a problem with freeze vs Python-2.4. I need to get various codecs into the freeze, but suspect I need to explicitly import them. Must I just import codecs or all the ones I'm likely to use? -- Robin Becker -- http://mail.python.org/mailman/listinfo/python-list

Re: decode unicode string using 'unicode_escape' codecs

2006-01-13 Thread aurora
Cool, it works! I have also done some due diligence to confirm that the utf-8 encoding would not introduce any Python escape accidentally. I have written a recipe in the Python cookbook: Efficient character escapes decoding http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466293 wy > Does this

Re: decode unicode string using 'unicode_escape' codecs

2006-01-13 Thread Steven Bethard
aurora wrote: > I have some unicode string with some characters encode using python > notation like '\n' for LF. I need to convert that to the actual LF > character. There is a 'unicode_escape' codec that seems to suit my purpose. > encoded = u'A\\nA' decoded = encoded.decode('unicod
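
A compact sketch of the codec being suggested there (works for ASCII-only escape text on both 2.x and 3.x):

    encoded = u'A\\nA'                      # two characters: backslash and 'n'
    decoded = encoded.encode('ascii').decode('unicode_escape')
    assert decoded == u'A\nA'               # now a real line feed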

decode unicode string using 'unicode_escape' codecs

2006-01-12 Thread aurora
parser decodes the bytes into an unicode string with UTF-8 encoding. Then it applies syntax run to decode the unicode characters '\n' to LF. The second is what I want. There must be something available to the Python interpreter that is not available to the user. So it there something

codecs

2005-11-15 Thread TK
Hi there, sys.stdout = codecs.lookup('utf-8')[-1](sys.stdout) What does this line mean? Thanks. o-o Thomas -- http://mail.python.org/mailman/listinfo/python-list

Re: codecs

2005-11-15 Thread André Malo
* TK <[EMAIL PROTECTED]> wrote: > sys.stdout = codecs.lookup('utf-8')[-1](sys.stdout) > What does this line mean? "Wrap stdout with a UTF-8 stream writer". See the codecs module documentation for details. nd -- http://mail.python.org/mailman/listinfo/python-list
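
The same line spelled more explicitly (a sketch; both forms are equivalent): codecs.lookup() returns (encoder, decoder, StreamReader, StreamWriter), so [-1] is the StreamWriter class, which is then instantiated around sys.stdout.

    import sys, codecs

    # Equivalent to: sys.stdout = codecs.lookup('utf-8')[-1](sys.stdout)
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
    print u'caf\xe9'   # Python 2: encoded to UTF-8 bytes on the way out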

Re: Monster python24.dll / stripping off asian codecs to separate package(s) ?

2005-11-04 Thread Robert
A bugfix release 2.3.6 would also be a good idea. I immediately hit 2 severe errors still present in 2.3 (fixed in 2.4) when going back to 2.3.5 (because of the monster python24.dll issue): * httplib.HTTP(x)Connect still inserts double Content-length headers, which breaks on many servers. * _ss

Monster python24.dll / stripping off asian codecs to separate package(s) ?

2005-11-04 Thread Robert
Martin v. Löwis wrote: > Robert wrote: > > Wouldn't it be an issue to think about if future win-python distributions > > should keep on including the asian codecs in the main-dll? > > Indeed, it would. As I said before: if somebody defines a clear, fair > policy

[ANN] Speed up Charmap codecs with fastcharmap module

2005-10-16 Thread Tony Nelson
Fastcharmap is a python extension module that speeds up Charmap codecs by about 5 times. <http://georgeanelson.com/fastcharmap.htm> Usage: import fastcharmap fastcharmap.hook('codec_name') Fastcharmap will then speed up calls that use that codec, such as unicode(

Re: Small python24.dll / how to strip off asian codecs to separate package(s) ?

2005-09-24 Thread Martin v. Löwis
Robert wrote: > Wouldn't it be an issue to think about if future win-python distributions > should keep on including the asian codecs in the main-dll? Indeed, it would. As I said before: if somebody defines a clear, fair policy which finds agreement in the community, I'm willi

Re: Small python24.dll / how to strip off asian codecs to separate package(s) ?

2005-09-24 Thread Robert
Thanks, I'll go for that. Wouldn't it be an issue to think about if future win-python distributions should keep on including the asian codecs in the main-dll? I see some reason in including smaller functional pyd's like zip or even ssl, etc. (as they are used regularly in projects >30%

Re: Small python24.dll / how to strip off asian codecs to separate package(s) ?

2005-09-23 Thread Martin v. Löwis
Robert wrote: > Or how to build one? Just download the source, and follow the instructions in PCBuild/readme.txt. Then, edit the pythoncore project to remove the files you don't want to include, and edit config.c to remove the dangling references. Regards, Martin -- http://mail.python.org/mailma

Small python24.dll / how to strip off asian codecs to separate package(s) ?

2005-09-22 Thread Robert
Updating py2exe'd software, I was impressed by python24.dll's footprint - double the size of python23.dll. Is there a version without/separate asian codecs (which seem to mainly blow up python24.dll)? Or how to build one? Robert -- http://mail.python.org/mailman/listinfo/python-list

Re: pythonXX.dll size: please split CJK codecs out

2005-08-23 Thread Neil Benn
Giovanni Bajo wrote: >Hello, > >python24.dll is much bigger than python23.dll. This was discussed already on >the newsgroup, see the thread starting here: >http://mail.python.org/pipermail/python-list/2004-July/229096.html > >I don't think I fully understand the reason why additional .pyd modules

Re: pythonXX.dll size: please split CJK codecs out

2005-08-22 Thread Martin v. Löwis
Thomas Heller wrote: > That seems to be true. But it will need zlib.pyd as soon if you try to > import from compressed zips. So, zlib can be thought as part of the > modules required for bootstrap. Right. OTOH, linking zlib to pythonXY means that you cannot build Python at all anymore unless you

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Thomas Heller
"Martin v. Löwis" <[EMAIL PROTECTED]> writes: > Ron Adam wrote: >> I would put the starting minimum boundary as: >> >>1. "The minimum required to start the python interpreter with no >> additional required files." >> >> Currently python 2.4 (on windows) does not yet meet that guideline, so >

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Ron Adam
Martin v. Löwis wrote: > Ron Adam wrote: > >>I would put the starting minimum boundary as: >> >> 1. "The minimum required to start the python interpreter with no >>additional required files." >> >>Currently python 2.4 (on windows) does not yet meet that guideline, so >>it seems some modules stil

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Martin v. Löwis
Ron Adam wrote: > I would put the starting minimum boundary as: > >1. "The minimum required to start the python interpreter with no > additional required files." > > Currently python 2.4 (on windows) does not yet meet that guideline, so > it seems some modules still need to be added while oth

Re: Revamping Python build system (Was: pythonXX.dll size: please split CJK codecs out)

2005-08-21 Thread Martin v. Löwis
Giovanni Bajo wrote: > You seem to ignore the fact that scons can easily generate VS.NET projects. I'm not ignoring it - I'm not aware of it. And also, I don't quite believe it until I see it. > But there is no technical reason why it has to be so. I work on several > portable projects, and they

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Ron Adam
Martin v. Löwis wrote: >>Can we at least undo this unfortunate move in time for 2.5? I would be >>grateful >>if *at least* the CJK codecs (which are like 1Mb big) are split out of >>python25.dll. IMHO, I would prefer having *more* granularity, rather than >>

Re: Revamping Python build system (Was: pythonXX.dll size: please split CJK codecs out)

2005-08-21 Thread Giovanni Bajo
Martin v. Löwis wrote: >> Out of curiosity, was this ever discussed among Python developers? >> Would something like scons qualify for this? OTOH, scons opens nasty >> self-bootstrapping issues (being written itself in Python). > > No. The Windows build system must be integrated with Visual Studio

Re: Revamping Python build system (Was: pythonXX.dll size: please split CJK codecs out)

2005-08-21 Thread Martin v. Löwis
Giovanni Bajo wrote: >>I'm sure Martin would be happy to consider a patch to make the build >>system more efficient. :) > > Out of curiosity, was this ever discussed among Python developers? Would > something like scons qualify for this? OTOH, scons opens nasty > self-bootstrapping issues (being w

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Martin v. Löwis
Giovanni Bajo wrote: > FWIW, this just highlights how inefficient your build system is. Everything > you > currently do by hand could be automated, including MSI generation. Also, you > describe the Windows procedure, which I suppose does not take into account > what needs to be done for other

Revamping Python build system (Was: pythonXX.dll size: please split CJK codecs out)

2005-08-21 Thread Giovanni Bajo
Michael Hoffman wrote: >> FWIW, this just highlights how inefficient your build system is. >> Everything you currently do by hand could be automated, including >> MSI generation. > I'm sure Martin would be happy to consider a patch to make the build > system more efficient. :) Out of curiosity

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Michael Hoffman
Giovanni Bajo wrote: > > FWIW, this just highlights how inefficient your build system is. Everything > you > currently do by hand could be automated, including MSI generation. I'm sure Martin would be happy to consider a patch to make the build system more efficient. :) > I'm willing to write

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Giovanni Bajo
decide. >> Can we at least undo this unfortunate move in time for 2.5? I would >> be grateful if *at least* the CJK codecs (which are like 1Mb big) >> are split out of python25.dll. IMHO, I would prefer having *more* >> granularity, rather than *less*. > > If somebod

Re: pythonXX.dll size: please split CJK codecs out

2005-08-21 Thread Martin v. Löwis
.5? I would be > grateful > if *at least* the CJK codecs (which are like 1Mb big) are split out of > python25.dll. IMHO, I would prefer having *more* granularity, rather than > *less*. If somebody would formulate a policy (i.e. conditions under which modules go into python2x.dll, vs. going

pythonXX.dll size: please split CJK codecs out

2005-08-20 Thread Giovanni Bajo
feature is a showstopper for adopting py2exe). Can we at least undo this unfortunate move in time for 2.5? I would be grateful if *at least* the CJK codecs (which are like 1Mb big) are split out of python25.dll. IMHO, I would prefer having *more* granularity, rather than *less*. +1 on spl

Re: Enumerate registered codecs

2005-07-31 Thread Martin v. Löwis
Paul Watson wrote: > The primary identifier and a descriptive string (localized) need to be > available at a minimum. Having aliases would be a plus. You will have to implement your own list. Getting the well-known aliases is possible through encodings.aliases, but a localized descriptive string

Re: Enumerate registered codecs

2005-07-31 Thread Paul Watson
John Machin wrote: > Paul Watson wrote: > >> I see the list of standard encodings in Python 2.4.1 documentation >> section 4.9.2. >> >> Is there a method to enumerate the registered codecs at runtime? > > This has been asked before, within the last coup

Re: Enumerate registered codecs

2005-07-31 Thread John Machin
Paul Watson wrote: > I see the list of standard encodings in Python 2.4.1 documentation > section 4.9.2. > > Is there a method to enumerate the registered codecs at runtime? This has been asked before, within the last couple of months AFAIR. Use Google to search for codec(s) in th

Enumerate registered codecs

2005-07-31 Thread Paul Watson
I see the list of standard encodings in Python 2.4.1 documentation section 4.9.2. Is there a method to enumerate the registered codecs at runtime? -- http://mail.python.org/mailman/listinfo/python-list

Re: Codecs

2005-07-10 Thread John Machin
o try to decode the file from "legacy" to Unicode -- until the first 'success' (defined how?)? But the file could be decoded by *several* codecs into Unicode without an exception being raised. Just a simple example: the encodings ['iso-8859-' + x for x in '124

Re: Codecs

2005-07-10 Thread Martin v. Löwis
Ivan Van Laningham wrote: > Hi All-- > As far as I can tell, after looking only at the documentation (and not > searching peps etc.), you cannot query the codecs to give you a list of > registered codecs, or a list of possible codecs it could retrieve for > you if you knew enough

Codecs

2005-07-08 Thread Ivan Van Laningham
Hi All-- As far as I can tell, after looking only at the documentation (and not searching peps etc.), you cannot query the codecs to give you a list of registered codecs, or a list of possible codecs it could retrieve for you if you knew enough to ask for them by name. Why not? It seems to me

singing the praises of unicode and codecs

2004-12-10 Thread Steven Bethard
I just wanted to thank Python for making encodings so easy! I recently discovered that one of the tools I use stores everything in UTF-8, and so I was getting some off-by-one errors because I was treating things as strings. I added def __unicode__(self): return str(self).decode('utf
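
A hedged completion of that snippet (Python 2; the class name and attribute are illustrative): the object keeps UTF-8 bytes internally and exposes them as unicode on demand.

    class Record(object):
        def __init__(self, raw):
            self.raw = raw                   # UTF-8 encoded byte string
        def __str__(self):
            return self.raw
        def __unicode__(self):
            return str(self).decode('utf-8')

    r = Record('caf\xc3\xa9')
    assert unicode(r) == u'caf\xe9'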