[issue2927] expose html.parser.unescape
Tom Pinckney added the comment:

I don't think Django includes an HTML unescape. I'm not familiar with other frameworks. So I'd still find this useful to include in the stdlib.

--
___
Python tracker <http://bugs.python.org/issue2927>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2927] expose html.parser.unescape
Tom Pinckney added the comment:

New patch attached, tested against Python 3.2. This is my first Python patch, so apologies if I've done something wrong here. Feedback appreciated!

Changes:
* fit everything to 80 cols
* changed only the HTMLParser.unescape function instead of providing a standalone unescape function
* fixed the string literals in the test case so they work in Python 3k
* left the cp1252 hacks in, since it looks like they still work; if there's a problem with them, let me know. In practice I have to do this at work in order to make unescaping actual web pages work.

--
Added file: http://bugs.python.org/file19550/unescape.diff
[issue2927] expose html.parser.unescape
New submission from Tom Pinckney <[EMAIL PROTECTED]>:

There is currently a private method inside html.parser.HTMLParser that unescapes HTML &...; style escapes. It would be useful to expose this for other users who want to unescape a piece of HTML.

Additionally, many websites don't use proper Unicode or ISO-8859-1 encodings and accidentally use Microsoft Code Page 1252 extensions. I added code to map these to their appropriate Unicode values. The unescaping logic was slightly simplified too.

This is my first Python patch submission, so please let me know if I've done anything wrong. A new test case was also added for this functionality.

--
components: Library (Lib)
files: unescape.diff
keywords: patch
messages: 67102
nosy: thomaspinckney3
severity: normal
status: open
title: expose html.parser.unescape
type: feature request
versions: Python 2.6
Added file: http://bugs.python.org/file10383/unescape.diff
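The attached unescape.diff is not reproduced in this archive, so the following is only a minimal sketch of the idea described in the submission, not the patch itself. It assumes a standalone function named `unescape` and a helper named `_cp1252_fixup` (both hypothetical names), and it treats numeric references in the range 0x80 to 0x9F as Windows-1252 bytes where that code page defines them.

```python
import re
from html.entities import name2codepoint

# Matches &name;, &#123; and &#x7B; style references.
_ENTITY_RE = re.compile(r"&(#[xX]?[0-9a-fA-F]+|\w+);")


def _cp1252_fixup(codepoint):
    """Hypothetical helper: map 0x80-0x9F to the cp1252 character if defined."""
    if 0x80 <= codepoint <= 0x9F:
        try:
            return bytes([codepoint]).decode("cp1252")
        except UnicodeDecodeError:
            pass  # a few bytes (e.g. 0x81) are undefined in cp1252
    return chr(codepoint)


def unescape(text):
    """Replace HTML character references in text; leave unknown ones as-is."""
    def replace(match):
        ref = match.group(1)
        try:
            if ref[:2] in ("#x", "#X"):
                return _cp1252_fixup(int(ref[2:], 16))
            if ref[0] == "#":
                return _cp1252_fixup(int(ref[1:], 10))
            return chr(name2codepoint[ref])
        except (KeyError, ValueError, OverflowError):
            return match.group(0)  # unknown or malformed reference
    return _ENTITY_RE.sub(replace, text)
```

With this sketch, `unescape("&#150;")` yields an en dash (U+2013) rather than the C1 control character U+0096. Python 3.4 and later ship `html.unescape`, which is the function to prefer in real code.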
[issue3300] urllib.quote and unquote - Unicode issues
Tom Pinckney <[EMAIL PROTECTED]> added the comment:

I mentioned this in a brief python-dev discussion earlier this spring, but many popular websites such as Wikipedia and Facebook do use UTF-8 as their character encoding scheme for the path and argument portions of URLs.

I know there's no RFC that says this is what should be done, but in order to make urllib work out of the box on as many common websites as possible, I think defaulting to UTF-8 decoding makes a lot of sense. Possibly allow an optional charset argument to be passed into quote and unquote, but default to UTF-8 in the absence of an explicit character set being passed in?

--
nosy: +thomaspinckney3
Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3300>
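The behaviour proposed here (UTF-8 by default, with an optional character set argument) can be illustrated with the Python 3 `urllib.parse` functions, whose `quote` and `unquote` accept an `encoding` parameter. This is only a sketch: the `roundtrip` helper is a hypothetical name introduced for the example, and the Python 2 `urllib.quote` discussed in this issue did not have this parameter.

```python
from urllib.parse import quote, unquote


def roundtrip(path, encoding="utf-8"):
    """Percent-encode path, then decode it again with the same charset."""
    encoded = quote(path, encoding=encoding)      # "/" is left unescaped
    decoded = unquote(encoded, encoding=encoding)
    return encoded, decoded


# UTF-8 is used unless the caller names another charset explicitly.
utf8_encoded, utf8_decoded = roundtrip("/wiki/Z\u00fcrich")
latin1_encoded, latin1_decoded = roundtrip("/wiki/Z\u00fcrich", "latin-1")
```

Here the UTF-8 form is `/wiki/Z%C3%BCrich` and the Latin-1 form is `/wiki/Z%FCrich`; both decode back to the original path only when the same charset is used in both directions.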
[issue2637] urllib.quote() escapes characters unnecessarily and contrary to docs
Tom Pinckney <[EMAIL PROTECTED]> added the comment:

It also looks like urllib.quote() (and quote_plus()) do not properly handle unicode strings. By contrast, urllib.urlencode() properly converts unicode strings to UTF-8 encoded ASCII strings before calling urllib.quote() on them.

--
nosy: +thomaspinckney3
Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2637>
[issue4615] de-duping function in itertools
New submission from Tom Pinckney <[EMAIL PROTECTED]>:

Any interest in an itertools de-duping function? I find I have to write this over and over for different projects:

    def deduped(iterable, key=None):
        # Keys already seen; an item is yielded only the first time its key appears.
        keys = set()
        for x in iterable:
            if key:
                k = key(x)
            else:
                k = x
            if k in keys:
                continue
            keys.add(k)
            yield x

--
components: Library (Lib)
messages: 77477
nosy: thomaspinckney3
severity: normal
status: open
title: de-duping function in itertools
type: feature request
versions: Python 2.6
Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue4615>
[issue4615] de-duping function in itertools
Tom Pinckney <[EMAIL PROTECTED]> added the comment:

My latest need for something like this looked like the following:

    src1 = db_query(query_1)
    src2 = db_query(query_2)
    results = deduped(src1 + src2, key=lambda x: x.field2)

Basically, I wanted data from src1 if it existed and otherwise from src2, while preserving the order of src1 (I didn't care about the order of src2).

A previous example was reading from a file and wanting to de-dupe lines based on a field in each line. Again, order mattered to me, since I wanted to process the non-duped lines in the file in order.

A final example was generating a bunch of error messages from a variety of sources and then wanting to make sure there were no duplicate errors. Instead of:

    errors = set(errors)

I find this much clearer:

    errors = deduped(errors)

In reality, all of these examples probably do not need to be written as a generator. The lists being de-duped are probably not so huge in practice as to preclude instantiating a new list (given the reality of multi-gig RAM machines etc.). It just seemed particularly clear to write this using a yield.

An ordered dictionary would probably work for me too. I don't think a Bag would, given its lack of ordering. I do find it very simple to just be able to apply deduped() to any existing sequence/iterator and not have to be more verbose about explicitly iterating and filling in an ordered dictionary somehow.
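The ordered-dictionary alternative mentioned at the end of this comment could look roughly like the sketch below. The function name `deduped_list` is hypothetical, and the sketch builds a full list rather than yielding lazily, which matches the comment's observation that a generator is not strictly needed for these use cases.

```python
from collections import OrderedDict


def deduped_list(iterable, key=None):
    """Return the items of iterable with duplicates removed, first one wins."""
    seen = OrderedDict()
    for item in iterable:
        k = item if key is None else key(item)
        # setdefault keeps the first item stored under each key and
        # leaves the insertion order of keys untouched.
        seen.setdefault(k, item)
    return list(seen.values())


errors = deduped_list(["disk full", "timeout", "disk full"])
by_field = deduped_list(["a1", "b2", "a3"], key=lambda s: s[0])
```

Compared with the generator version, this trades laziness for a plain list result; the explicit loop is the verbosity the comment would like to avoid having to write each time.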
[issue13241] llvm-gcc-4.2 miscompiles Python (XCode 4.1 on Mac OS 10.7)
Tom Pinckney added the comment:

FWIW, clang from Xcode 4.3.2 build 4E2002 w/ command line tools built everything fine for me too (i.e., ./configure CC=clang).

LM-SJN-00377886:cpython tom$ uname -a
Darwin LM-SJN-00377886 11.3.0 Darwin Kernel Version 11.3.0: Thu Jan 12 18:47:41 PST 2012; root:xnu-1699.24.23~1/RELEASE_X86_64 x86_64

LM-SJN-00377886:cpython tom$ /usr/bin/clang --version
Apple clang version 3.1 (tags/Apple/clang-318.0.58) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin11.3.0
Thread model: posix

LM-SJN-00377886:cpython tom$ hg summary
parent: 76839:bb30116024ac tip
 Minor fix for test_multiprocessing
branch: default
commit: (clean)
update: (current)

--
nosy: +thomaspinckney3
Python tracker <http://bugs.python.org/issue13241>
[issue1508475] transparent gzip compression in urllib
Tom Pinckney added the comment:

What if this gzip decompression were optional and controlled via a flag or handler instead of being automagic? It's not entirely trivial to implement, so it is nice to have the option of this happening automatically if one wishes.

Then the caller would be aware that Content-length / Accept-encoding / Content-encoding etc. have been modified iff they requested gzip decompression.

--
nosy: +thomaspinckney3
Python tracker <http://bugs.python.org/issue1508475>
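As a rough illustration of the opt-in idea, the sketch below decompresses a response body only when the caller asks for it, and rewrites the headers only in that case. Everything here is an assumption made for the example: the `read_body` name, the `decompress` flag, and the use of a plain dict with exact-case header names are hypothetical and are not part of urllib.

```python
import gzip


def read_body(raw_body, headers, decompress=False):
    """Return (body, headers); gunzip only if the caller opted in."""
    is_gzip = headers.get("Content-Encoding", "").lower() == "gzip"
    if decompress and is_gzip:
        body = gzip.decompress(raw_body)
        # Headers change only on this opt-in path, so the caller knows
        # exactly when Content-Encoding / Content-Length were rewritten.
        new_headers = dict(headers)
        del new_headers["Content-Encoding"]
        new_headers["Content-Length"] = str(len(body))
        return body, new_headers
    return raw_body, headers


payload = gzip.compress(b"hello" * 10)
original = {"Content-Encoding": "gzip", "Content-Length": str(len(payload))}
plain_body, plain_headers = read_body(payload, original, decompress=True)
untouched_body, untouched_headers = read_body(payload, original)
```

A real handler would also need to send Accept-Encoding on the request and deal with case-insensitive header lookup; those parts are left out of this sketch.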
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Looking at the current docs for 3.3, it looks like there are a bunch of other ways that the docs could be clarified:

1) Proper documentation of the complete profile.Profile() and cProfile.Profile() interfaces.
2) Adding other examples to the quick start section at the top for how to process the resulting stats without printing them to stdout or writing them to a file.

I'll take a stab at this.

--
nosy: +thomaspinckney3
Python tracker <http://bugs.python.org/issue6696>
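One way to process profiler results without printing to stdout or writing a file, as item 2 asks for, is to hand `pstats.Stats` an in-memory stream. The sketch below assumes a hypothetical helper named `profile_to_string`; the `cProfile.Profile`, `pstats.Stats(..., stream=...)`, `sort_stats` and `print_stats` calls are the standard library interfaces.

```python
import cProfile
import io
import pstats


def profile_to_string(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    # The stream argument redirects the report away from sys.stdout.
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return result, buffer.getvalue()


total, report = profile_to_string(sum, range(1000))
```

The returned report is an ordinary string, so it can be logged, parsed, or attached to a test failure message instead of being printed.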
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

I took a stab at updating the docs based on the current profiler source. See the attached patch for a first draft. This is my first doc patch, so I would appreciate any feedback on the style and substance of my changes.

I tried to document more of the modules (for example the internal Profile object), arrange things a bit more logically IMHO, and generally be more precise about what is happening.

--
keywords: +patch
Added file: http://bugs.python.org/file2/patch.diff
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Thanks for the feedback! I'll incorporate it this weekend and make a new patch, though also feel free to just go ahead and make the changes yourself if you'd rather not wait for me.
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Draft of new updates based on Eric's feedback. I haven't done a final proof-reading over this patch as I wanted to upload it and see if I'm heading in the right direction first.

--
Added file: http://bugs.python.org/file29036/patch2.diff
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Updated based on Ezio's comments.

--
Added file: http://bugs.python.org/file29683/patch3.diff
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Another update based on comments. Removed links to cProfile.py and _lsprof.c.

--
Added file: http://bugs.python.org/file29756/patch4.diff
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Great! Just signed the contributor agreement.

On Apr 10, 2013, at 1:06 PM, Ezio Melotti wrote:

> Ezio Melotti added the comment:
>
> Last patch LGTM (except a couple of minor whitespace issues).
> Tom, can you sign the contributor agreement
> (http://www.python.org/psf/contrib/contrib-form/)?
>
> --
> stage: patch review -> commit review
> versions: -Python 3.2
[issue6696] Profile objects should be documented
Tom Pinckney added the comment:

Thanks everyone for helping me through my first Python patch submission.