Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
> > Running further trials indicate that the problem actually is related to
> > swapping. If I reduce the model size in the benchmark slightly so that
> > everything fits into the main memory, the problem disappears. Only when
> > the memory usage exceeds the 32GB that I have, Python 3.6 will acquire
> > way more memory (from the swap) than Python 3.5.
> >
> > Jan

It's very hard to believe... I think some factor other than swapping is causing the problem. Or can it not be reproduced on a 64GB RAM machine?
--
https://mail.python.org/mailman/listinfo/python-list
Re: pandas dataframe, find duplicates and add suffix
On 3/28/17, zljubi...@gmail.com wrote:
> In the dataframe
>
> import pandas as pd
>
> data = {'model': ['first', 'first', 'second', 'second', 'second', 'third', 'third'],
>         'dtime': ['2017-01-01_112233', '2017-01-01_112234', '2017-01-01_112234',
>                   '2017-01-01_112234', '2017-01-01_112234', '2017-01-01_112235',
>                   '2017-01-01_112235'],
>         }
> df = pd.DataFrame(data, index=['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg',
>                                'f.jpg', 'g.jpg'], columns=['model', 'dtime'])
>
> print(df.head(10))
>
>         model              dtime
> a.jpg   first  2017-01-01_112233
> b.jpg   first  2017-01-01_112234
> c.jpg  second  2017-01-01_112234
> d.jpg  second  2017-01-01_112234
> e.jpg  second  2017-01-01_112234
> f.jpg   third  2017-01-01_112235
> g.jpg   third  2017-01-01_112235
>
> Within a model, there are duplicate dtime values.
> For example, rows d and e are duplicates of the c row.
> Row g is a duplicate of the f row.
>
> For each duplicate (within a model) I would like to add a suffix (starting
> from 1) to the dtime value. Something like this:
>
>         model                dtime
> a.jpg   first  2017-01-01_112233
> b.jpg   first  2017-01-01_112234
> c.jpg  second  2017-01-01_112234
> d.jpg  second  2017-01-01_112234-1
> e.jpg  second  2017-01-01_112234-2
> f.jpg   third  2017-01-01_112235
> g.jpg   third  2017-01-01_112235-1
>
> How to do that?
> --
> https://mail.python.org/mailman/listinfo/python-list

I am not an expert, just played a little... This one could work:

    gb = df.groupby([df.model, df.dtime])
    df.dtime = df.dtime + gb.cumcount().apply(lambda a: str(-a) if a else '')

This one is probably more readable:

    df.dtime = df.dtime + [str(-a) if a else '' for a in gb.cumcount()]

I don't know which one is better in memory consumption and/or speed. This small dataframe gave me:

    %timeit -r 5 df.dtime + gb.cumcount().apply(lambda a: str(-a) if a else '')
    1000 loops, best of 5: 387 µs per loop
    %timeit -r 5 df.dtime + [str(-a) if a else '' for a in gb.cumcount()]
    1000 loops, best of 5: 324 µs per loop

PL.
--
https://mail.python.org/mailman/listinfo/python-list
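The groupby/cumcount idea can be put together as a small self-contained sketch (assuming pandas is available; cumcount() numbers rows 0, 1, 2, ... within each (model, dtime) group, and 0 maps to an empty suffix):

```python
import pandas as pd

data = {'model': ['first', 'first', 'second', 'second', 'second', 'third', 'third'],
        'dtime': ['2017-01-01_112233', '2017-01-01_112234', '2017-01-01_112234',
                  '2017-01-01_112234', '2017-01-01_112234', '2017-01-01_112235',
                  '2017-01-01_112235']}
df = pd.DataFrame(data, index=['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg',
                               'e.jpg', 'f.jpg', 'g.jpg'])

# Number duplicates within each (model, dtime) group: 0 for the first
# occurrence, 1 for the second, and so on.
counter = df.groupby(['model', 'dtime']).cumcount()

# Turn the counter into '', '-1', '-2', ... and append it to dtime.
df['dtime'] = df['dtime'] + counter.map(lambda n: '-{}'.format(n) if n else '')
print(df)
```

Computing `counter` before reassigning the column matters: the grouping must see the original dtime values, not the suffixed ones.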
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
On 3/29/17, Jan Gosmann wrote:
> On 28 Mar 2017, at 14:21, INADA Naoki wrote:
>
>> On Wed, Mar 29, 2017 at 12:29 AM, Jan Gosmann wrote:
>>
>> I suppose a smaller and faster benchmark is better for others looking
>> into it. I already stopped the Azure instance.
>> [...]
>> There are no maxrss differences in "smaller existing examples"?
>> [...]
>> I want to investigate RAM usage, without any swapping.
>
> Running further trials indicate that the problem actually is related to
> swapping. If I reduce the model size in the benchmark slightly so that
> everything fits into the main memory, the problem disappears. Only when
> the memory usage exceeds the 32GB that I have, Python 3.6 will acquire
> way more memory (from the swap) than Python 3.5.
>
> Jan
> --
> https://mail.python.org/mailman/listinfo/python-list

Could you add a table comparing time benchmarks when the memory usage is bigger? (If your hypothesis is true and the memory measurement tools are right, then the time difference has to be huge.)

Did you compare "pip list" results? There could be more differences between your environments (not only the Python version). For example, different numpy versions or some missing packages could change the game.

I tried to search for "except.*ImportError" in your repository, but I am not sure that it could change things significantly...
(https://github.com/ctn-archive/gosmann-frontiers2017/search?utf8=%E2%9C%93&q=ImportError&type=

This one seems suspicious - a sparse matrix class could be a game changer:

    try:
        from scipy.sparse import bsr_matrix
        assert bsr_matrix
    except (ValueError, ImportError):
        return False

)

This one doesn't seem suspicious to me (but who knows?):

    try:
        import faulthandler
        faulthandler.enable()
    except:
        pass

PL.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Python under PowerShell adds characters
On Thu, 30 Mar 2017 04:43 pm, Marko Rauhamaa wrote:

> Steven D'Aprano :
>
>> On Thu, 30 Mar 2017 07:29:48 +0300, Marko Rauhamaa wrote:
>>> I'd expect not having to deal with Unicode decoding exceptions with
>>> arbitrary input.
>>
>> That's just silly. If you have *arbitrary* bytes, not all
>> byte-sequences are valid Unicode, so you have to expect decoding
>> exceptions, if you're processing text.
>
> The input is not in my control, and bailing out may not be an option:

You have to deal with bad input *somehow*. You can't just say it will never happen. If bailing out is not an option, then perhaps the solution is not to read stdin as Unicode text, if there's a chance that it actually doesn't contain Unicode text. Otherwise, you have to deal with any errors. ("Deal with" can include the case of not dealing with them at all, and just letting your script raise an exception.)

> $ echo $'aa\n\xdd\naa' | grep aa
> aa
> aa
> $ echo $'\xdd' | python2 -c 'import sys; sys.stdin.read(1)'
> $ echo $'\xdd' | python3 -c 'import sys; sys.stdin.read(1)'
> Traceback (most recent call last):
>   File "", line 1, in
>   File "/usr/lib64/python3.5/codecs.py", line 321, in decode
>     (result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0:
> invalid continuation byte

As I said, what did you expect? You chose to read from stdin as Unicode text, then fed it something that wasn't Unicode text. That's no different from expecting to read a file name, then passing an ASCII NUL byte. Something is going to break, somewhere, so you have to deal with it.

I'm not sure if there are better ways, but one way of dealing with this is to bypass the text layer and read from the raw byte-oriented stream:

    [steve@ando ~]$ echo $'\xdd' | python3 -c 'import sys; print(sys.stdin.buffer.read(1))'
    b'\xdd'

You have a choice. The default choice is aimed at the most common use-case, which is that input will be text, but it's not the only choice.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

--
https://mail.python.org/mailman/listinfo/python-list
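Besides sys.stdin.buffer, the text layer itself can be made tolerant via error handlers. A minimal sketch of the options, simulating a non-UTF-8 stdin with io.BytesIO (the byte values are taken from the example in this thread):

```python
import io

raw = io.BytesIO(b'aa\n\xdd\naa')  # stand-in for sys.stdin.buffer

# Option 1: stay at the byte layer, as with sys.stdin.buffer.read().
data = raw.read()
print(data)  # b'aa\n\xdd\naa'

# Option 2: decode leniently; the invalid 0xdd byte becomes U+FFFD
# (the replacement character) instead of raising UnicodeDecodeError.
text = data.decode('utf-8', errors='replace')
print(repr(text))

# Option 3: 'surrogateescape' (what Python itself uses for file names
# and command-line arguments) keeps the bad byte recoverable: it
# round-trips back to b'\xdd' when re-encoded with the same handler.
text2 = data.decode('utf-8', errors='surrogateescape')
assert text2.encode('utf-8', errors='surrogateescape') == data
```

Which option is right depends on whether the bytes must survive unchanged (surrogateescape) or merely not crash the program (replace).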
Re: Text-mode apps (Was :Who are the "spacists"?)
On 30 March 2017 at 07:43, Chris Angelico wrote:
> On Thu, Mar 30, 2017 at 3:21 PM, Rick Johnson wrote:
>> On Sunday, March 26, 2017 at 2:53:49 PM UTC-5, Chris Angelico wrote:
>>> On Mon, Mar 27, 2017 at 6:25 AM, Mikhail V wrote:
>>>> On 26 March 2017 at 20:10, Steve D'Aprano wrote:
>>>>> On Mon, 27 Mar 2017 03:57 am, Mikhail V wrote:
>>
>>> """
>>> I, the undersigned, acknowledge that my program is
>>> intentionally excluding everyone who does not fit the
>>> following requirements: [choose all applicable]
>>>
>>> [ ] Speaks English exclusively
>>
>> Of course, your comment presupposing that every programmer
>> is fluent in every natural language. Which is not only
>> impractical, it's impossible.
>
> Nope. I can't speak Mandarin, but I can make absolutely sure that all
> my programs can accept Chinese characters. A friend of mine sent me an
> audio file with a name that included some Chinese, and I was able to
> handle it no problem.

Naming files is another point. Generally, if I can't speak Mandarin, I have no right to add support for it, since I know nothing about this language, nor do I know its symbols.

>>> [ ] Uses no diacritical marks
>>
>> Why is it my responsibiliy to encode my text with
>> pronuciation tutorials? Are we adults here or what?
>>
>>> [ ] Writes all text top-to-bottom, left-to-right
>>
>> Not my problem. Learn the King's English or go wait for
>> extinction to arrive.
>
> And these two cement your parochialism thoroughly in everyone's minds.
> "Pronunciation tutorials", eh? Sure. Tell that to everyone who speaks
> Spanish, Turkish, Norwegian, German, or Vietnamese, all of which use
> diacritical marks to distinguish between letters. English is the weird
> language in that it uses letter pairs instead of adorned letters (eg
> "ch" and "sh" instead of "ç" and "ş").

Because "pronunciation tutorials" is one of the rare excuses to use those special characters at all. Now, do you know how many phonetic systems linguists have invented over the past 200 years? Will you find them all in Unicode? And why would you need them today, if you can learn pronunciation from audio tutorials?

And the letter-pair usage is not there for fun. I think we've discussed this some time ago on python-ideas; it is merely a political problem, since every 'king' in each land suddenly thinks that he is a genius typographer and adds a few custom characters to Latin after he realizes that there is no sense in forcing everyone to use some outdated system, or even rolls his own bizarre system (e.g. Hangul).

>> Why don't you add these:
>>
>> [ ] Has the ability to read and comprehend at a high
>> school level.
>> [ ] Has functioning visual receptors.
>> [ ] Has a functioning brain.
>> [ ] Is not currently in a vegetative state
>
> Nah. If I did, I'd have to say "[ ] Is not trolling python-list" as well.

Call me a bigot, but I would say: [x] people, become adults finally and stop playing with your funny hieroglyphs, use the Latin set, and concentrate on real problems. If I produce an arcade game or an IDE, would it lose much if I don't include Unicode support? I personally would not do it even under fear of punishment.

Mikhail
--
https://mail.python.org/mailman/listinfo/python-list
Re: Text-mode apps (Was :Who are the "spacists"?)
On Thu, 30 Mar 2017 03:21 pm, Rick Johnson wrote:

> On Sunday, March 26, 2017 at 2:53:49 PM UTC-5, Chris Angelico wrote:
>> On Mon, Mar 27, 2017 at 6:25 AM, Mikhail V wrote:
>>> On 26 March 2017 at 20:10, Steve D'Aprano wrote:
>>>> On Mon, 27 Mar 2017 03:57 am, Mikhail V wrote:
>
>> I generally find that when people say that Unicode doesn't
>> solve their problems and they need to roll their own, it's
>> usually one of two possibilities: 1) "Their problems" are
>> all about simplicity. They don't want to have to deal with
>> all the complexities of real-world text, so they
>> arbitrarily restrict things.
>
> There are only so many hours in the day Chris. Not every
> programmer has the time to cater to every selfish desire of
> every potential client.

Oh, you're one of *those* coders. The ones who believe that if they personally don't need something, nobody needs it.

Listen, I'm 100% in favour of the open source model. I think that coders scratching their own itch is a great way to produce some really fantastic software. Look at the Linux kernel, and think about how that has evolved from Linus Torvalds scratching his own itch.

It can also produce some real garbage too, usually from the kind of coder whose answer to everything is "you don't need to do that". But whatever, it's a free country. If you don't want to support a subset of your potential users, or customers, that's entirely up to you.

The honest truth is that most software ends up languishing in obscurity, only used by a relative handful of people, so it's quite unlikely that you'll ever have any users wanting support for Old Persian or Ogham. But if you have any users at all, there's a good chance they'll want to write their name correctly even if they are called Zöe, or include the trademarked name of their Awesome™ product, or write the name of that hot new metal band THЯДSHËR, or use emoji, or to refer to ¢ and °F temperatures.

You might call it "selfish" for somebody to want to spell their name correctly, or write in their native language, but selfish or not, if you don't give your users the features they want, they are unlikely to use your software. Outside of the Democratic People's Republic of Trumpistan, the world is full of about seven billion people who don't have any interest in your ASCII-only software. It's not 1970 any more; the world is connected.

And the brilliant thing about Unicode is that for a little bit of effort you can support Zöe and her French girlfriends, and that Swedish metal band with the umlauts, and the President's Russian backers, and once you've done that, you get at least partial support for Hebrew and Chinese and Korean and Vietnamese and a dozen different Indian languages, and even Old Persian and Ogham, FOR FREE.

So if you're wanting to create "the best product you can", why *wouldn't* you use Unicode?

> You try to create the best product you can,
> but at the end of the process, there will always be
> someone (or a group of someones) who are unhappy with the
> result.

[...]

>> [ ] Speaks English exclusively
>
> Of course, your comment presupposing that every programmer
> is fluent in every natural language. Which is not only
> impractical, it's impossible.

Don't be silly. You don't have to be fluent in a language in order for your program to support users who are. All you have to do is not stop them from using their own native language by forcing them to use ASCII and nothing but ASCII.

Of course, if you want to *localise* your UI to their language, then you need somebody to translate error messages, menus, window titles, etc. I'll grant that's not always an easy job. But aren't you lucky: you speak one of a handful of lingua francas in the world, so the chances are your users will be pathetically grateful if all you do is let them type in their own language. Actual UI localisation is a bonus.

>> [ ] Uses no diacritical marks
>
> Why is it my responsibiliy to encode my text with
> pronuciation tutorials? Are we adults here or what?

Now you're just being absurd. Supporting diacritics doesn't mean you are responsible for teaching your users what they're for. They already know. That's why they want to use them.

Diacritics are for:

- distinguishing between words which look the same, but have
  different pronunciation;

- distinguishing between different letters of the alphabet, like
  dotted-i and dotless-ı (or ı and ı-with-a-dot, if you prefer),
  or a and å;

- distinguishing between words which look and sound the same but
  mean something different;

- and making band names look ǨØØĻ and annoy old fuddy-duddies.

>> [ ] Writes all text top-to-bottom, left-to-right
>
> Not my problem. Learn the King's English or go wait for
> extinction to arrive.

Which king? Harald V speaks Norwegian, Felipe VI speaks Spanish, Hamad bin Isa speaks whatever they speak in Bahrain (probably Arabic), Norodom Sihamoni speaks Cambodian, Vajiralongkorn speaks Thai, Mswati III speaks Swazi, Ab
Re: Text-mode apps (Was :Who are the "spacists"?)
On Fri, 31 Mar 2017 12:25 am, Mikhail V wrote:

> Call me a bigot

Okay. You're a bigot.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

--
https://mail.python.org/mailman/listinfo/python-list
Re: Text-mode apps (Was :Who are the "spacists"?)
On Fri, Mar 31, 2017 at 1:16 AM, Steve D'Aprano wrote:
> On Fri, 31 Mar 2017 12:25 am, Mikhail V wrote:
>
>> Call me a bigot
>
> Okay. You're a bigot.

+1 QOTD

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
add processing images in the model.py using django
I want to create simple image processing using Django. My task is easy: I have a user, a simple model, and that user can upload images into my project using an HTML form or a Django form; the images are then saved to upload_to='mypath' in the `upload` field of my model. But I have a question: I have some simple image processing in my views.py; if the processing completes successfully, it creates a new image, and I want to add that image to the `upload` field of my model. How to do that in Django?

models.py:

    class MyModel(models.Model):
        user = models.ForeignKey(User)
        upload = models.ImageField(upload_to='mypath/personal/folder/per/user')

views.py:

    def index(request):
        form = ImageUploadForm(request.POST or None, request.FILES or None)
        if request.method == "POST" and form.is_valid():
            image_file = request.FILES['image'].read()
            '''
            image processing
            new image
            '''
            return render_to_response("blog/success.html", {"new_image": new_image})
        return render_to_response('blog/images.html', {'form': form}, RequestContext(request))

--
https://mail.python.org/mailman/listinfo/python-list
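One part of this can be sketched in plain Python: deriving a name for the processed image. The helper below is hypothetical (not from the question's code); only the filename logic runs outside Django:

```python
import os

def processed_filename(original, suffix='_processed'):
    """Derive a name for the processed image next to the original,
    e.g. 'cat.jpg' -> 'cat_processed.jpg'.  Hypothetical helper."""
    root, ext = os.path.splitext(original)
    return root + suffix + ext

print(processed_filename('cat.jpg'))  # cat_processed.jpg
```

With a name in hand, the usual Django route for attaching generated bytes to an ImageField is, roughly, `instance.upload.save(processed_filename(name), ContentFile(new_bytes))` (with `from django.core.files.base import ContentFile`); FieldFile.save stores the file under upload_to and updates the model field in one step.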
django authentication multi upload files
I have created a simple Django auth project and I need to let the user upload some images. Multi-uploading images works:

views.py:

    from django.shortcuts import render
    from django.http import HttpResponse

    def Form(request):
        return render(request, "index/form.html", {})

    def Upload(request):
        for count, x in enumerate(request.FILES.getlist("files")):
            def process(f):
                with open('/Users/Michel/django_1.8/projects/upload/media/file_' + str(count), 'wb+') as destination:
                    for chunk in f.chunks():
                        destination.write(chunk)
            process(x)
        return HttpResponse("File(s) uploaded!")

But how do I make the multi-upload put images into a specific, unique folder for each user? First I used login_required, and in the destination I used user_directory_path. But how should the code in views.py be written to work with authentication per user - for example, uploads from user_1 go into the folder for user_1, and uploads from user_2 into the folder for user_2?

models.py:

    def user_directory_path(instance, filename):
        return 'user_{0}/{1}'.format(instance.user.id, filename)

    class MyModel(models.Model):
        user = models.ForeignKey(User, unique=True)
        upload = models.ImageField(upload_to=user_directory_path)

--
https://mail.python.org/mailman/listinfo/python-list
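The upload_to callable in the question is plain Python, so its behaviour can be checked without a running Django project. The SimpleNamespace objects below are stand-ins for the model instance Django would actually pass in:

```python
from types import SimpleNamespace

def user_directory_path(instance, filename):
    # Django calls this with the model instance and the original filename
    # whenever ImageField(upload_to=...) is given a callable.
    return 'user_{0}/{1}'.format(instance.user.id, filename)

# Stand-in for a MyModel instance whose user has id 7:
fake_instance = SimpleNamespace(user=SimpleNamespace(id=7))
print(user_directory_path(fake_instance, 'photo.jpg'))  # user_7/photo.jpg
```

In the view, the per-user routing then comes for free: with @login_required guaranteeing request.user is set, creating MyModel(user=request.user, upload=f) for each f in request.FILES.getlist('files') lets Django resolve the folder through upload_to, instead of hand-building paths with open().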
Re: Text-mode apps (Was :Who are the "spacists"?)
On 03/30/2017 08:14 AM, Steve D'Aprano wrote:
>> Why is it my responsibiliy to encode my text with
>> pronuciation tutorials? Are we adults here or what?
>
> Now you're just being absurd.

Ahh yes, good old RR with his reductio ad absurdum fallacies when he's lost the argument.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Text-mode apps (Was :Who are the "spacists"?)
On 30 March 2017 at 16:14, Steve D'Aprano wrote:
> On Thu, 30 Mar 2017 03:21 pm, Rick Johnson wrote:
>
>> On Sunday, March 26, 2017 at 2:53:49 PM UTC-5, Chris Angelico wrote:
>>> On Mon, Mar 27, 2017 at 6:25 AM, Mikhail V wrote:
>>>> On 26 March 2017 at 20:10, Steve D'Aprano wrote:
>>> [ ] Uses no diacritical marks
>>
>> Why is it my responsibiliy to encode my text with
>> pronuciation tutorials? Are we adults here or what?
>
> Now you're just being absurd. Supporting diacritics doesn't mean you are
> responsible for teaching your users what they're for. They already know.
> That's why they want to use them.
>
> Diacritics are for:
>
> - distinguishing between words which look the same, but have
>   different pronunciation;
>
> - distinguishing between different letters of the alphabet, like
>   dotted-i and dotless-ı (or ı and ı-with-a-dot, if you prefer),
>   or a and å;
>
> - distinguishing between words which look and sound the same but
>   mean something different;
>
> - and making band names look ǨØØĻ and annoy old fuddy-duddies.

Steve, it is not bad to want to spell your name using the spelling which you were taught in school. But it is bad to stay in the illusion that there is something good in using accents. As said, it _is_ selfish to force people to use e.g. umlauts and noun capitalisation in German. It is a big obstacle for reading and a burden for typing. Initially it has nothing to do with people's choice; it is politics only. I can speak and write German fluently, so I know how much better it would be without those odd spelling rules.

So don't mix up the spoken language and the writing system - a spoken language will never be extinct, but most writing systems will be obsolete and should be obsolete (you can call me a bigot again ;-)

Some other interesting aspects: if I localise some software in the Russian language and use Cyrillic letters, English speakers will not be able to read _anything_, but if I use Latin letters instead, then non-Russian users will at least be able to read something, so knowing some spoken Russian will be much more help to you.

A concrete software example - let's say I make an IDE. It is far more important to give the users mechanisms to customise the glyphs (e.g. edit math signs) than supporting Unicode. And therefore it is much more important to make those font format definitions transparent, non-bloated and easily editable. Where is it all?

Mikhail
--
https://mail.python.org/mailman/listinfo/python-list
Re: Text-mode apps (Was :Who are the "spacists"?)
On 30/03/17 16:57, Mikhail V wrote:
> Steve, it is not bad to want to spell your name using spelling which was
> taught you in the school. But it is bad to stay in illusion that there is
> something good in using accents.

*plonk*

--
Rhodri James *-* Kynesim Ltd
--
https://mail.python.org/mailman/listinfo/python-list
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
I reproduced the issue. This is a very usual memory-usage issue. Thrashing is just a result of the large memory usage.

After the 1st pass of optimization, RAM usage is 20GB+ on Python 3.5 and 30GB on Python 3.6. And Python 3.6 starts thrashing in the 2nd optimization pass.

I enabled tracemalloc during the 1st pass. Results are at the end of this mail. It seems frozenset() causes the regression, but I'm not sure yet. I don't know what the contents of the frozenset are yet. (I know almost nothing about this application.)

Jan, do you know what this is? Could you make a script which just runs `transitive_closure(edges)` with edges similar to `log_reduction.py spaun`?

I'll dig into it later, maybe next week.

---
Python 3.6.1
1191896 memory blocks: 22086104.2 KiB
  File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 85
    reachables[vertex] = frozenset(reachables[vertex])
  File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 410
    self.dependents = transitive_closure(self.dg.forward)
602986 memory blocks: 51819.1 KiB
  File "", line 14
  File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 634
    first_view=None, v_offset=0, v_size=0, v_base=None)

Python 3.5.3
1166804 memory blocks: 6407.0 KiB
  File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 85
    reachables[vertex] = frozenset(reachables[vertex])
  File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 410
    self.dependents = transitive_closure(self.dg.forward)
602989 memory blocks: 51819.3 KiB
  File "", line 14
  File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 634
    first_view=None, v_offset=0, v_size=0, v_base=None)
--
https://mail.python.org/mailman/listinfo/python-list
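The tracemalloc workflow used to produce reports like the ones above can be sketched in a few lines (the frozenset-building loop is a hypothetical stand-in for the optimizer's actual 1st pass):

```python
import tracemalloc

tracemalloc.start(10)  # record up to 10 frames of traceback per allocation

# Hypothetical stand-in for the real workload (the optimizer's 1st pass):
data = [frozenset(range(n)) for n in range(100)]

snapshot = tracemalloc.take_snapshot()

# Group live allocations by allocation traceback, largest first, which
# yields "N memory blocks: X KiB" entries with the allocating lines.
for stat in snapshot.statistics('traceback')[:2]:
    print('%d memory blocks: %.1f KiB' % (stat.count, stat.size / 1024))
    for line in stat.traceback.format():
        print(line)
```

Because tracemalloc attributes blocks to the allocating source line, a report dominated by a single line (here, the `frozenset(...)` call in optimizer.py) is a strong hint about where a regression between interpreter versions lives.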
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
Maybe this commit caused the regression:

https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6

Old:

    minused = (so->used + other->used)*2    (L619)

New:

    minused = so->used + other->used    (L620)
    minused = (minused > 5) ? minused * 2 : minused * 4;    (L293)

So the size of a small set is doubled.

    $ /usr/bin/python3
    Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
    [GCC 6.2.0 20160927] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> s = set(range(10))
    >>> sys.getsizeof(frozenset(s))
    736

    $ python3
    Python 3.6.0 (default, Dec 30 2016, 20:49:54)
    [GCC 6.2.0 20161005] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> s = set(range(10))
    >>> sys.getsizeof(frozenset(s))
    1248

On Fri, Mar 31, 2017 at 2:34 AM, INADA Naoki wrote:
> I reproduced the issue.
> This is a very usual memory-usage issue. Thrashing is just a result of
> the large memory usage.
>
> After the 1st pass of optimization, RAM usage is 20GB+ on Python 3.5 and
> 30GB on Python 3.6.
> And Python 3.6 starts thrashing in the 2nd optimization pass.
>
> I enabled tracemalloc during the 1st pass. Results are at the end of this
> mail. It seems frozenset() causes the regression, but I'm not sure yet.
> I don't know what the contents of the frozenset are yet. (I know almost
> nothing about this application.)
>
> Jan, do you know what this is?
> Could you make a script which just runs `transitive_closure(edges)` with
> edges similar to `log_reduction.py spaun`?
>
> I'll dig into it later, maybe next week.
>
> [...]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
Filed an issue: https://bugs.python.org/issue29949

Thanks for your report, Jan.

On Fri, Mar 31, 2017 at 3:04 AM, INADA Naoki wrote:
> Maybe this commit caused the regression:
>
> https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6
>
> Old:
>
>     minused = (so->used + other->used)*2    (L619)
>
> New:
>
>     minused = so->used + other->used    (L620)
>     minused = (minused > 5) ? minused * 2 : minused * 4;    (L293)
>
> So the size of a small set is doubled.
>
>     >>> import sys
>     >>> s = set(range(10))
>     >>> sys.getsizeof(frozenset(s))
>     736     # Python 3.5.2+
>
>     >>> sys.getsizeof(frozenset(s))
>     1248    # Python 3.6.0
>
> [...]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
That's great news. I'm busy with other things right now, but will look into your findings in more detail later.

On 03/30/2017 02:09 PM, INADA Naoki wrote:
> Filed an issue: https://bugs.python.org/issue29949
>
> Thanks for your report, Jan.
>
> [...]
--
https://mail.python.org/mailman/listinfo/python-list
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
FYI, this small patch may fix your issue:
https://gist.github.com/methane/8faf12621cdb2166019bbcee65987e99
--
https://mail.python.org/mailman/listinfo/python-list
error in syntax description for comprehensions?
https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
describes the syntax for comprehensions as

comprehension ::= expression comp_for
comp_for      ::= [ASYNC] "for" target_list "in" or_test [comp_iter]
comp_iter     ::= comp_for | comp_if
comp_if       ::= "if" expression_nocond [comp_iter]

Is the comp_for missing an argument after "in"? One has to follow the
definition of or_test and its components, but I can't find anything that
resolves to a single variable or expression.

Actually, I'm not sure what or_test would do there, either with or without
an additional element following "in".

Ross Boylan
--
https://mail.python.org/mailman/listinfo/python-list
Re: error in syntax description for comprehensions?
On Thursday, March 30, 2017 at 4:59:03 PM UTC-4, Boylan, Ross wrote:
> https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
> describes the syntax for comprehensions as
>
> comprehension ::= expression comp_for
> comp_for      ::= [ASYNC] "for" target_list "in" or_test [comp_iter]
> comp_iter     ::= comp_for | comp_if
> comp_if       ::= "if" expression_nocond [comp_iter]
>
> Is the comp_for missing an argument after "in"? One has to follow the
> definition of or_test and its components, but I can't find anything that
> resolves to a single variable or expression.
>
> Actually, I'm not sure what or_test would do there, either with or without
> an additional element following "in".

Syntax grammars can be obtuse. An or_test can be an and_test, which can be
a not_test, which can be a comparison. It continues from there. The whole
chain of "can be" is:

    or_test
    and_test
    not_test
    comparison
    or_expr
    xor_expr
    and_expr
    shift_expr
    a_expr
    m_expr
    u_expr
    power
    primary
    atom
    identifier

and identifier is what you are looking for.

--Ned.
--
https://mail.python.org/mailman/listinfo/python-list
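The derivation chain above can also be checked mechanically with the standard-library ast module: the part after "in" parses as an ordinary expression, so a bare identifier or an or-expression are both accepted. A small sketch:

```python
# Sketch: inspect how the iterable of a comprehension's comp_for is parsed.
# The names `a` and `b` are placeholders; the code is only parsed, not run.
import ast

tree = ast.parse("[x for x in a or b]", mode="eval")
comp = tree.body              # the ListComp node
gen = comp.generators[0]      # the single comp_for clause

print(type(gen.iter).__name__)   # -> BoolOp: "a or b" is a valid or_test
print(type(ast.parse("[x for x in a]", mode="eval")
           .body.generators[0].iter).__name__)  # -> Name: a bare identifier
```

So an identifier sits at the bottom of the "can be" chain, exactly as described.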
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
On 2017-03-30 19:04, INADA Naoki wrote:
> Maybe this commit caused the regression:
> https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6
>
> Old:
>     minused = (so->used + other->used)*2  (L619)
>
> New:
>     minused = so->used + other->used  (L620)
>     minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)
>
> So the size of a small set is doubled.
>
> $ /usr/bin/python3
> Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
> [GCC 6.2.0 20160927] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> s = set(range(10))
> >>> sys.getsizeof(frozenset(s))
> 736
>
> $ python3
> Python 3.6.0 (default, Dec 30 2016, 20:49:54)
> [GCC 6.2.0 20161005] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> s = set(range(10))
> >>> sys.getsizeof(frozenset(s))
> 1248

Copying a small set _might_ double its memory usage:

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = set(range(10))
>>> sys.getsizeof(s)
736
>>> sys.getsizeof(set(s))
736
>>> sys.getsizeof(set(set(s)))
736

Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = set(range(10))
>>> sys.getsizeof(s)
736
>>> sys.getsizeof(set(s))
1248
>>> sys.getsizeof(set(set(s)))
1248
--
https://mail.python.org/mailman/listinfo/python-list
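The interactive sessions above can be reproduced with a short script. Note that the absolute byte counts depend on the interpreter version, build, and platform, so none are hard-coded here; the point is to compare a set against copies made with set() and frozenset():

```python
# Sketch reproducing the measurement above: sizes of a small set and of
# copies of it.  Absolute numbers vary by Python version and platform.
import sys

s = set(range(10))
for label, obj in [("original set", s),
                   ("set(s) copy", set(s)),
                   ("frozenset(s) copy", frozenset(s))]:
    print("%-18s %d bytes" % (label, sys.getsizeof(obj)))
```

On an affected 3.6.x build, the two copies report roughly twice the size of the original; on 3.5.x all three match.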
Re: error in syntax description for comprehensions?
On 3/30/2017 4:57 PM, Boylan, Ross wrote:
> https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
> describes the syntax for comprehensions as
>
> comprehension ::= expression comp_for
> comp_for      ::= [ASYNC] "for" target_list "in" or_test [comp_iter]
> comp_iter     ::= comp_for | comp_if
> comp_if       ::= "if" expression_nocond [comp_iter]
>
> Is the comp_for missing an argument after "in"?

The or_test *is* the 'argument'.

> One has to follow the definition of or_test and its components,
> but I can't find anything that resolves to a single variable
> or expression.

An or_test *is* a single expression. Like all Python expressions, it
evaluates to a Python object. In this case, the object is passed to iter()
and so the object must be an iterable.

>>> a, b = None, range(3)
>>> a or b
range(0, 3)
>>> for i in a or b: print(i)

0
1
2

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
Re: Program uses twice as much memory in Python 3.6 than in Python 3.5
On 3/30/17, INADA Naoki wrote:
> Maybe this commit caused the regression:
>
> https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6
>
> Old:
>     minused = (so->used + other->used)*2  (L619)
>
> New:
>     minused = so->used + other->used  (L620)
>     minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)
>
> So the size of a small set is doubled.
>
> $ /usr/bin/python3
> Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
> [GCC 6.2.0 20160927] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> s = set(range(10))
> >>> sys.getsizeof(frozenset(s))
> 736
>
> $ python3
> Python 3.6.0 (default, Dec 30 2016, 20:49:54)
> [GCC 6.2.0 20161005] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> s = set(range(10))
> >>> sys.getsizeof(frozenset(s))
> 1248
>
> On Fri, Mar 31, 2017 at 2:34 AM, INADA Naoki wrote:
>> I reproduced the issue. This is a quite ordinary memory-usage issue;
>> the thrashing is just a result of the large memory usage.
>>
>> After the 1st optimization pass, RAM usage is 20GB+ on Python 3.5 and
>> 30GB on Python 3.6. And Python 3.6 starts thrashing in the 2nd
>> optimization pass.
>>
>> I enabled tracemalloc during the 1st pass. The results are at the end
>> of this mail. It seems frozenset() causes the regression, but I'm not
>> sure yet. I don't know what the contents of the frozenset are yet.
>> (I know almost nothing about this application.)
>>
>> Jan, do you know what this is? Could you make a script which just runs
>> `transitive_closure(edges)` with edges similar to `log_reduction.py spaun`?
>>
>> I'll dig into it later, maybe next week.
>>
>> ---
>> Python 3.6.1
>> 1191896 memory blocks: 22086104.2 KiB
>>   File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 85
>>     reachables[vertex] = frozenset(reachables[vertex])
>>   File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 410
>>     self.dependents = transitive_closure(self.dg.forward)
>> 602986 memory blocks: 51819.1 KiB
>>   File "", line 14
>>   File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 634
>>     first_view=None, v_offset=0, v_size=0, v_base=None)
>>
>> Python 3.5.3
>> 1166804 memory blocks: 6407.0 KiB
>>   File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 85
>>     reachables[vertex] = frozenset(reachables[vertex])
>>   File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 410
>>     self.dependents = transitive_closure(self.dg.forward)
>> 602989 memory blocks: 51819.3 KiB
>>   File "", line 14
>>   File "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py", line 634
>>     first_view=None, v_offset=0, v_size=0, v_base=None)
> --
> https://mail.python.org/mailman/listinfo/python-list

Interesting.

1. Using set_table_resize with a growth factor of 2 or 4 doesn't seem
optimal for constructing an (immutable) frozenset from a set.

2. A growth factor of 2 could itself be too big
( https://en.wikipedia.org/wiki/Dynamic_array#Growth_factor ,
https://github.com/python/cpython/blob/80ec8364f15857c405ef0ecb1e758c8fc6b332f7/Objects/listobject.c#L58 )

PL.
--
https://mail.python.org/mailman/listinfo/python-list
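The growth-factor trade-off PL points to can be sketched with a toy model: a larger factor means fewer reallocations but more over-allocated, wasted capacity. This models a generic dynamic array, not CPython's actual set or list resize logic, and the initial capacity of 8 is an arbitrary choice:

```python
# Toy model of dynamic-array growth: count how many reallocations occur
# while appending n_items one at a time, for a given growth factor.
# This is NOT CPython's resize logic; it only illustrates the trade-off.
def simulate_growth(n_items, factor, initial=8):
    """Return (number of reallocations, final capacity)."""
    capacity, reallocs = initial, 0
    for used in range(1, n_items + 1):
        if used > capacity:           # out of room: grow by `factor`
            capacity = int(capacity * factor)
            reallocs += 1
    return reallocs, capacity

for factor in (1.5, 2, 4):
    reallocs, cap = simulate_growth(1000, factor)
    print("factor %.1f: %d reallocations, final capacity %d"
          % (factor, reallocs, cap))
```

For 1000 appends, factor 4 reallocates least but leaves the most unused slots, while factor 1.5 reallocates most but tracks the live size closely, which is why the linked list-object code uses a growth pattern of roughly 1.125.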