[issue18219] csv.DictWriter is slow when writing files with large number of columns

2017-03-31 Thread Donald Stufft
Changes by Donald Stufft: pull_requests: +969

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Mariatta Wijaya added the comment: Thanks David. I uploaded patch to address your concern with the docs. Can you please check? Serhiy, with regards to applying docs and test to 3.5, does that require a different patch than what I have? Thanks. -- Added file: http://bugs.python.org/fil

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread R. David Murray
R. David Murray added the comment: Serhiy: I know you prefer applying test changes to the maint version, and I don't disagree, but there are others who prefer not to and we really don't have an official policy on it at this point. (We used to say no, a few years ago :) The doc change looks wr

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Shouldn't docs changes and new tests be added to 3.5?

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread INADA Naoki
INADA Naoki added the comment: committed. -- resolution: -> fixed stage: commit review -> resolved status: open -> closed

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Roundup Robot
Roundup Robot added the comment: New changeset 1928074e6519 by INADA Naoki in branch '3.6': Issue #18219: Optimize csv.DictWriter for large number of columns. https://hg.python.org/cpython/rev/1928074e6519 New changeset 6f1602dfa4d5 by INADA Naoki in branch 'default': Issue #18219: Optimize csv.

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Changes by Mariatta Wijaya: Added file: http://bugs.python.org/file45175/issue18219v8.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread STINNER Victor
STINNER Victor added the comment: > My mentor (Yury) prohibit it while I'm beginner. Oh right, trust your mentor more than me ;-)

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread INADA Naoki
INADA Naoki added the comment: > If you are confident (ex: if the change is simple, like this one), you can > push it directly. My mentor (Yury) prohibit it while I'm beginner. And as you saw, I missed PEP 8 violation :)

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Mariatta Wijaya added the comment: Inada-san, Victor, thank you. Here is the updated patch. -- Added file: http://bugs.python.org/file45174/issue18219v7.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread STINNER Victor
STINNER Victor added the comment: issue18219v6.patch: LGTM, but I added a minor PEP 8 comment. INADA Naoki: "LGTM, Thanks Mariatta. (But one more LGTM from coredev is required for commit)" If you are confident (ex: if the change is simple, like this one), you can push it directly. --

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread INADA Naoki
INADA Naoki added the comment: LGTM, Thanks Mariatta. (But one more LGTM from coredev is required for commit) -- nosy: +inada.naoki versions: -Python 3.5

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Changes by Mariatta Wijaya: Added file: http://bugs.python.org/file45173/issue18219v6.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Changes by Mariatta Wijaya: Added file: http://bugs.python.org/file45172/issue18219v5.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Changes by Mariatta Wijaya: Added file: http://bugs.python.org/file45170/issue18219v4.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-21 Thread Mariatta Wijaya
Changes by Mariatta Wijaya: Added file: http://bugs.python.org/file45169/issue18219v3.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread Mariatta Wijaya
Mariatta Wijaya added the comment: Thanks, Hugh. Please check the updated patch :) -- Added file: http://bugs.python.org/file45168/issue18219v2.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread Hugh Brown
Hugh Brown added the comment: Mariatta: Yes, that is what I was thinking of. That takes my 12 execution time down to 10 seconds. (Or, at least, a fix I did of this nature had that effect -- I have not timed your patch but it should be the same.)

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread Mariatta Wijaya
Mariatta Wijaya added the comment: Thanks Hugh, Are you thinking of something like the following? class DictWriter: def __init__(self, f, fieldnames, restval="", extrasaction="raise", dialect="excel", *args, **kwds): self._fieldnames = fieldnames# list of keys f

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread Hugh Brown
Hugh Brown added the comment: Fabulous. Looks great. Let's ship! It is not the *optimal* fix for 3.x platforms. A better fix would calculate the set of fieldnames only once in __init__ (or only as often as fieldnames is changed). But I stress that it is a robust change that works in versions
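The idea of computing the set of field names only once can be sketched with the public csv API alone. This is a minimal illustration, not the patch that was committed: the wrapper class `CheckedDictWriter` and its `_fieldset` attribute are hypothetical names, and the sketch assumes field names are not changed after construction. It asks the stdlib writer to skip its own check with extrasaction="ignore" and performs the check against a precomputed frozenset.

```python
import csv
import io


class CheckedDictWriter:
    """Hypothetical wrapper (not part of the csv module)."""

    def __init__(self, f, fieldnames, **kwds):
        # Build the lookup set once, instead of scanning a list per key.
        self._fieldset = frozenset(fieldnames)
        # extrasaction="ignore" tells csv.DictWriter not to run its own
        # unknown-field check; the wrapper does that check itself.
        self._writer = csv.DictWriter(
            f, fieldnames, extrasaction="ignore", **kwds
        )

    def writeheader(self):
        return self._writer.writeheader()

    def writerow(self, rowdict):
        # Set difference between a dict keys view and a frozenset.
        wrong_fields = rowdict.keys() - self._fieldset
        if wrong_fields:
            names = ", ".join(sorted(repr(k) for k in wrong_fields))
            raise ValueError(
                "dict contains fields not in fieldnames: " + names
            )
        return self._writer.writerow(rowdict)


buf = io.StringIO()
writer = CheckedDictWriter(buf, ["a", "b"])
writer.writeheader()
writer.writerow({"a": 1, "b": 2})
print(buf.getvalue())
```

The trade-off discussed in the thread still applies: a cached set goes stale if someone later reassigns or mutates the field names, which is why a per-row set operation was described as the more robust change.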

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread Mariatta Wijaya
Mariatta Wijaya added the comment: Hello, please review my patch. I used set subtraction to calculate wrong_fields, added more test cases, and clarify documentation with regards to extrasaction parameter. Please let me know if this works. Thanks :) -- nosy: +Mariatta Added file: http

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread SilentGhost
Changes by SilentGhost: stage: -> commit review versions: +Python 3.5, Python 3.6, Python 3.7 -Python 3.4

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2016-10-20 Thread Hugh Brown
Hugh Brown added the comment: I came across this problem today when I was using a 1000+ column CSV from a client. It was taking about 15 minutes to process each file. I found the problem and made this change: # wrong_fields = [k for k in rowdict if k not in self.fieldnames]

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-09-02 Thread Mikhail Traskin
Mikhail Traskin added the comment: Peter, thank you for letting me know that views work with list, I was not aware of this. This is indeed the best solution and it also keeps the DictWriter interface unchanged. Terry, attached patch contains the DictWriter change and a test case in test_csv.p

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-08-15 Thread Peter Otten
Peter Otten added the comment: Note that set operations on dict views work with lists, too. So the only change necessary is to replace wrong_fields = [k for k in rowdict if k not in self.fieldnames] with wrong_fields = rowdict.keys() - self.fieldnames (A backport to 2.7 would need to replace
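A small sketch of the dict-view behaviour described here, using made-up sample data (the names `rowdict` and `fieldnames` are local variables for illustration, not the attributes of a real writer). In Python 3, subtracting a list from a keys view is allowed and returns a plain set.

```python
# Sample row with one key that is not a declared field.
rowdict = {"id": 1, "name": "x", "extra": 2}
fieldnames = ["id", "name"]  # an ordinary list, not a set

# dict.keys() returns a set-like view; the right operand of "-" may be
# any iterable, so no explicit set(fieldnames) conversion is needed.
wrong_fields = rowdict.keys() - fieldnames

print(wrong_fields)        # {'extra'}
print(type(wrong_fields))  # <class 'set'>
```

Note that the result is an unordered set rather than a list, so code that formats the unknown fields into an error message should not rely on their order.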

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-08-14 Thread Mikhail Traskin
Mikhail Traskin added the comment: > What is the purpose in touching fieldnames [...] Wrapping the fieldnames property and tupleizing it guarantees that fieldnames and _fieldset fields are consistent. Otherwise, having a separate _fieldset field means that someone who is modifying the fieldnam
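The consistency argument can be illustrated with a minimal sketch. The class name `FieldnameHolder` is hypothetical and this is not the attached patch; it only shows how a property setter can rebuild a private `_fieldset` whenever `fieldnames` is reassigned, and how storing a tuple rules out in-place mutation that would bypass the setter.

```python
class FieldnameHolder:
    """Hypothetical stand-in for the relevant part of a DictWriter."""

    def __init__(self, fieldnames):
        # Assignment goes through the property setter below.
        self.fieldnames = fieldnames

    @property
    def fieldnames(self):
        return self._fieldnames

    @fieldnames.setter
    def fieldnames(self, value):
        # A tuple cannot be appended to, so the cached set can only
        # become stale through reassignment, which lands here again.
        self._fieldnames = tuple(value)
        self._fieldset = frozenset(self._fieldnames)


holder = FieldnameHolder(["a", "b"])
holder.fieldnames = ["x", "y", "z"]
print(holder.fieldnames)        # ('x', 'y', 'z')
print("y" in holder._fieldset)  # True
```

The cost is the one raised in the thread: callers that previously received their own list back from `fieldnames`, or mutated it in place, would see different behaviour.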

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-06-21 Thread Terry J. Reedy
Terry J. Reedy added the comment: What is the purpose in touching fieldnames, either in tuple-izing it or in making it private and wrapped with a property? If someone wants to modify it, that is up to them. In any case, this change is not germane to the issue and could break code, so I would n

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-06-15 Thread Mikhail Traskin
Mikhail Traskin added the comment: Any way is fine with me. If you prefer to avoid having public fieldset property, please use the attached patch. -- Added file: http://bugs.python.org/file30605/csvdictwriter.v2.patch

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-06-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I think there is no need in public fieldset property. Just use private self._fieldset field in private _dict_to_list() method. -- nosy: +serhiy.storchaka

[issue18219] csv.DictWriter is slow when writing files with large number of columns

2013-06-14 Thread Mikhail Traskin
New submission from Mikhail Traskin: _dict_to_list method of the csv.DictWriter objects created with extrasaction="raise" uses look-up in the list of field names to check if current row has any unknown fields. This results in O(n^2) execution time and is very slow if there are a lot of columns
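The quadratic behaviour described in this report can be reproduced outside the csv module with a rough sketch. The helper names `wrong_fields_list` and `wrong_fields_set` and the column count are assumptions chosen for illustration; absolute timings depend on the machine, so none are quoted here.

```python
import timeit


def wrong_fields_list(rowdict, fieldnames):
    # "k not in fieldnames" scans the list linearly for every key,
    # so a row with n known columns costs on the order of n * n steps.
    return [k for k in rowdict if k not in fieldnames]


def wrong_fields_set(rowdict, fieldset):
    # Hash-based membership makes each lookup roughly constant time.
    return [k for k in rowdict if k not in fieldset]


fieldnames = [f"col{i}" for i in range(1000)]
rowdict = dict.fromkeys(fieldnames, "")
fieldset = frozenset(fieldnames)

list_time = timeit.timeit(
    lambda: wrong_fields_list(rowdict, fieldnames), number=3
)
set_time = timeit.timeit(
    lambda: wrong_fields_set(rowdict, fieldset), number=3
)
print(f"list lookup: {list_time:.4f}s, set lookup: {set_time:.4f}s")
```

Both helpers return the same result; only the cost of each membership test differs, and the gap widens as the number of columns grows.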