Static HTML documentation from docstrings

2012-02-07 Thread Florian Weimer
I'm slightly confused about docstrings and HTML documentation.  I used
to think that the library reference was (in part) generated from the
source code, but this does not seem to be the case.

Is there any tool support for keeping documentation and code in sync?
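
For context: pydoc can render a module's docstrings as static HTML, but
that looks separate from how the library reference itself is produced.
A minimal sketch (json is just an arbitrary example module):

  # Generate json.html in the current directory from the module's
  # docstrings; equivalent to running "python -m pydoc -w json".
  import pydoc
  import json

  pydoc.writedoc(json)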


Reversible malformed UTF-8 to malformed UTF-16 encoding

2019-03-19 Thread Florian Weimer
I've seen occasional proposals like this one coming up:

| I therefore suggested 1999-11-02 on the unic...@unicode.org mailing
| list the following approach. Instead of using U+FFFD, simply encode
| malformed UTF-8 sequences as malformed UTF-16 sequences. Malformed
| UTF-8 sequences consist exclusively of the bytes 0x80 - 0xff, and
| each of these bytes can be represented using a 16-bit value from the
| UTF-16 low-half surrogate zone U+DC80 to U+DCFF. Thus, the overlong
| "K" (U+004B) 0xc1 0x8b from the above example would be represented
| in UTF-16 as U+DCC1 U+DC8B. If we simply make sure that every UTF-8
| encoded surrogate character is also treated like a malformed
| sequence, then there is no way that a single high-half surrogate
| could precede the encoded malformed sequence and cause a valid
| UTF-16 sequence to emerge.

<http://hyperreal.org/~est/utf-8b/releases/utf-8b-20060413043934/kuhn-utf-8b.html>

Has this ever been implemented in any Python version?  I seem to
remember something like that, but all I could find was me talking
about this in 2000.

It's not entirely clear whether this is a good idea as the default
encoding for security reasons, but it might be nice to be able to read
XML or JSON which is not quite properly encoded, only nearly so,
without treating it as ISO-8859-1 or some other arbitrarily chosen
single-byte character set.


Re: Reversible malformed UTF-8 to malformed UTF-16 encoding

2019-03-19 Thread Florian Weimer
* MRAB:

> On 2019-03-19 20:32, Florian Weimer wrote:
>> I've seen occasional proposals like this one coming up:
>> 
>> | I therefore suggested 1999-11-02 on the unic...@unicode.org mailing
>> | list the following approach. Instead of using U+FFFD, simply encode
>> | malformed UTF-8 sequences as malformed UTF-16 sequences. Malformed
>> | UTF-8 sequences consist exclusively of the bytes 0x80 - 0xff, and
>> | each of these bytes can be represented using a 16-bit value from the
>> | UTF-16 low-half surrogate zone U+DC80 to U+DCFF. Thus, the overlong
>> | "K" (U+004B) 0xc1 0x8b from the above example would be represented
>> | in UTF-16 as U+DCC1 U+DC8B. If we simply make sure that every UTF-8
>> | encoded surrogate character is also treated like a malformed
>> | sequence, then there is no way that a single high-half surrogate
>> | could precede the encoded malformed sequence and cause a valid
>> | UTF-16 sequence to emerge.
>> 
>> <http://hyperreal.org/~est/utf-8b/releases/utf-8b-20060413043934/kuhn-utf-8b.html>
>> 
>> Has this ever been implemented in any Python version?  I seem to
>> remember something like that, but all I could find was me talking
>> about this in 2000.

> Python 3 has "surrogate escape". Have a read of PEP 383 -- Non-decodable 
> Bytes in System Character Interfaces.

Thanks, this is the information I was looking for.
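
For the archives, a quick check that surrogateescape behaves like the
scheme quoted above, mapping each stray byte 0xNN to the lone surrogate
U+DCNN and restoring it on re-encoding:

  # The overlong encoding of "K" from the quoted example.
  data = b"\xc1\x8b"

  # Decoding with surrogateescape turns each undecodable byte into a
  # lone surrogate instead of raising an error.
  text = data.decode("utf-8", "surrogateescape")
  print([hex(ord(c)) for c in text])      # ['0xdcc1', '0xdc8b']

  # Re-encoding with the same handler restores the original bytes.
  assert text.encode("utf-8", "surrogateescape") == data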


Compression module APIs

2010-05-06 Thread Florian Weimer
As far as I can see, the compression-related modules (gzip, zlib, bz2)
in Python 2.5 have three distinct APIs.  Is there really no unified
interface, or am I missing something?


Re: Compression module APIs

2010-05-06 Thread Florian Weimer
* Chris Rebert:

> On Thu, May 6, 2010 at 4:09 AM, Florian Weimer  wrote:
>> As far as I can see, the compression-related modules (gzip, zlib, bz2)
>> in Python 2.5 have three distinct APIs.  Is there really no unified
>> interface, or am I missing something?
>
> bz2.BZ2File and gzip.GzipFile both offer a file-like interface for
> reading/writing compressed files in their respective formats.
> The gzip module is already built on top of the zlib module, so there's
> no ZlibFile.
> zlib and bz2 also both offer compatible one-shot compress() and
> decompress() functions.
> So, the interfaces are sorta "unified", although it is true they're
> not grouped into a single generic "compression" module.

There is some overlap.  But there does not appear to be a way to
decompress a gzip stream on the fly (which is surprising, as this is a
fairly common operation), and there are no counterparts to
bz2.BZ2{Dec,C}ompressor.
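
That said, it looks like zlib's decompressobj can be coaxed into
handling the gzip framing itself, which would give on-the-fly
decompression; a rough sketch, assuming the underlying zlib honours
the 16 + MAX_WBITS convention:

  import zlib

  def iter_gunzip(chunks):
      # 16 + MAX_WBITS asks zlib to parse the gzip header and trailer
      # itself, so each incoming chunk can be fed to the decompressor
      # as it arrives instead of buffering the whole stream.
      d = zlib.decompressobj(16 + zlib.MAX_WBITS)
      for chunk in chunks:
          out = d.decompress(chunk)
          if out:
              yield out
      tail = d.flush()
      if tail:
          yield tail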


Re: SQLite is quite SQL compliant

2010-10-03 Thread Florian Weimer
* Ravi:

> The documentation of the sqlite module at 
> http://docs.python.org/library/sqlite3.html
> says:
>
> "...allows accessing the database using a nonstandard variant of the
> SQL..."
>
> But if you see SQLite website they clearly say at
> http://sqlite.org/omitted.html that only a small part of SQL is not
> implemented.

I think that page refers to SQL92, not some more recent version of the
standard.  There are also issues caused by SQLite's approach to
typing, e.g.

  SELECT 1 = '1';

returns a false value, where it would return true on other systems.

SQLite is a fine piece of software, but its SQL dialect has many
quirks.
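
A quick way to see this from Python's sqlite3 module:

  import sqlite3

  conn = sqlite3.connect(":memory:")
  # SQLite compares by storage class: the integer 1 never equals the
  # text value '1', so this prints 0 (false), unlike databases that
  # coerce one operand to the other's type.
  print(conn.execute("SELECT 1 = '1'").fetchone()[0])   # 0
  print(conn.execute("SELECT 1 = 1").fetchone()[0])     # 1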


Spreadsheet-style dependency tracking

2010-10-16 Thread Florian Weimer
Are there libraries which implement some form of spreadsheet-style
dependency tracking?  The idea is to enable incremental updates to
some fairly convoluted computation.  I hope that a general dependency
tracking framework would avoid making the computation even more
convoluted and difficult to change.

It would also be preferable if the whole dataset did not have to be
kept in memory for the whole computation.  (It's rather smallish,
though, so it wouldn't be impossible to keep it resident, I guess.)


Re: Spreadsheet-style dependency tracking

2010-10-17 Thread Florian Weimer
* Chris Torek:

> In article <87y69xbz6h@mid.deneb.enyo.de>
> Florian Weimer   wrote:
>>Are there libraries which implement some form of spreadsheet-style
>>dependency tracking?  The idea is to enable incremental updates to
>>some fairly convoluted computation.  I hope that a general dependency
>>tracking framework would avoid making the computation even more
>>convoluted and difficult to change.
>
> Don't know of any libraries myself, but I wrote some code to do
> topological sort for dependencies, which I can paste here.  It
> is also worth noting that a lot of spreadsheets cheat: they just
> repeat a sheet-wide computation until values stop changing (or a
> cycle count limit runs out).

I think most of the relevant implementations use dependency
information to speed up incremental recomputation.  For instance,
gnumeric seems to have code for this.  This is the part I'm most
interested in.  I already have got an explicit ordering of the
computations (fortunately, that part is fairly simple).
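
To make that concrete, here is a rough sketch (purely illustrative, not
any existing library) of the kind of dependency tracking I have in
mind: each cell records its dependencies, invalidation walks the
reverse edges, and get() re-evaluates only what has been dirtied:

  class Cell:
      def __init__(self, compute, deps=()):
          self.compute = compute        # callable taking the dependency values
          self.deps = list(deps)        # upstream cells
          self.dependents = []          # reverse edges, filled in below
          self.dirty = True
          self.value = None
          for d in self.deps:
              d.dependents.append(self)

      def invalidate(self):
          # Mark this cell and everything downstream as needing recomputation.
          self.dirty = True
          for c in self.dependents:
              if not c.dirty:
                  c.invalidate()

      def set(self, value):
          # Turn this cell into a constant and dirty its dependents.
          self.compute = lambda: value
          self.invalidate()

      def get(self):
          # Recompute lazily; clean cells return their cached value.
          if self.dirty:
              self.value = self.compute(*(d.get() for d in self.deps))
              self.dirty = False
          return self.value

  a = Cell(lambda: 2)
  b = Cell(lambda: 3)
  total = Cell(lambda x, y: x + y, deps=(a, b))
  print(total.get())   # 5
  a.set(10)
  print(total.get())   # 13; only a and total are re-evaluated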