Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread INADA Naoki
>
> Running further trials indicate that the problem actually is related to
> swapping. If I reduce the model size in the benchmark slightly so that
> everything fits into the main memory, the problem disappears. Only when the
> memory usage exceeds the 32GB that I have, Python 3.6 will acquire way more
> memory (from the swap) than Python 3.5.
>
> Jan
> --

It's very hard to believe...
I think some factor other than swap is causing the problem.
Or, can you reproduce it on a machine with 64GB of RAM?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pandas dataframe, find duplicates and add suffix

2017-03-30 Thread Pavol Lisy
On 3/28/17, zljubi...@gmail.com  wrote:
> In dataframe
>
> import pandas as pd
>
> data = {'model': ['first', 'first', 'second', 'second', 'second', 'third',
> 'third'],
> 'dtime': ['2017-01-01_112233', '2017-01-01_112234',
> '2017-01-01_112234', '2017-01-01_112234', '2017-01-01_112234',
> '2017-01-01_112235', '2017-01-01_112235'],
> }
> df = pd.DataFrame(data, index = ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg',
> 'e.jpg', 'f.jpg', 'g.jpg'], columns=['model', 'dtime'])
>
> print(df.head(10))
>
> model  dtime
> a.jpg   first  2017-01-01_112233
> b.jpg   first  2017-01-01_112234
> c.jpg  second  2017-01-01_112234
> d.jpg  second  2017-01-01_112234
> e.jpg  second  2017-01-01_112234
> f.jpg   third  2017-01-01_112235
> g.jpg   third  2017-01-01_112235
>
> within model, there are duplicate dtime values.
> For example, rows d and e are duplicates of the c row.
> Row g is duplicate of the f row.
>
> For each duplicate (within model) I would like to add suffix (starting from
> 1) to the dtime value. Something like this:
>
> model  dtime
> a.jpg   first  2017-01-01_112233
> b.jpg   first  2017-01-01_112234
> c.jpg  second  2017-01-01_112234
> d.jpg  second  2017-01-01_112234-1
> e.jpg  second  2017-01-01_112234-2
> f.jpg   third  2017-01-01_112235
> g.jpg   third  2017-01-01_112235-1
>
> How to do that?
> --
> https://mail.python.org/mailman/listinfo/python-list
>

I am not an expert, I just played a little...

This one could work:

gb = df.groupby([df.model, df.dtime])
df.dtime = df.dtime + gb.cumcount().apply(lambda a:str(-a) if a else '')

this one is probably more readable:
df.dtime = df.dtime + [str(-a) if a else '' for a in gb.cumcount()]

I don't know which one is better in memory consumption and/or speed.

This small dataframe gave me:

%timeit -r 5 df.dtime + gb.cumcount().apply(lambda a:str(-a) if a else '')
1000 loops, best of 5: 387 µs per loop

%timeit -r 5 df.dtime + [str(-a) if a else '' for a in gb.cumcount()]
1000 loops, best of 5: 324 µs per loop
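For completeness, here is a self-contained version of the cumcount approach (assuming pandas is available):

```python
import pandas as pd

data = {'model': ['first', 'first', 'second', 'second', 'second', 'third', 'third'],
        'dtime': ['2017-01-01_112233', '2017-01-01_112234', '2017-01-01_112234',
                  '2017-01-01_112234', '2017-01-01_112234', '2017-01-01_112235',
                  '2017-01-01_112235']}
df = pd.DataFrame(data,
                  index=['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg', 'f.jpg', 'g.jpg'],
                  columns=['model', 'dtime'])

# cumcount() numbers repeated (model, dtime) pairs 0, 1, 2, ...;
# only the repeats (count > 0) receive a "-N" suffix.
gb = df.groupby([df.model, df.dtime])
df.dtime = df.dtime + gb.cumcount().apply(lambda a: str(-a) if a else '')
print(df)
```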

PL.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread Pavol Lisy
On 3/29/17, Jan Gosmann  wrote:
> On 28 Mar 2017, at 14:21, INADA Naoki wrote:
>
>> On Wed, Mar 29, 2017 at 12:29 AM, Jan Gosmann 
>> wrote:
>>
>> I suppose smaller and faster benchmark is better to others looking for
>> it.
>> I already stopped the azure instance.
>> [...]
>> There are no maxrss difference in "smaller existing examples"?
>> [...]
>> I want to investigate RAM usage, without any swapping.
>
> Running further trials indicate that the problem actually is related to
> swapping. If I reduce the model size in the benchmark slightly so that
> everything fits into the main memory, the problem disappears. Only when
> the memory usage exceeds the 32GB that I have, Python 3.6 will acquire
> way more memory (from the swap) than Python 3.5.
>
> Jan
> --
> https://mail.python.org/mailman/listinfo/python-list

Could you add a table comparing time benchmarks when the memory is bigger?
(If your hypothesis is true and the memory measurement tools are right,
then the time difference has to be huge.)

Did you compare "pip list" results? There could be more differences in
your environments (not only the Python version). For example, different
numpy versions or some missing packages could change the game.

I tried to search for "except.*ImportError" in your repository, but I am
not sure it could change things significantly...

( 
https://github.com/ctn-archive/gosmann-frontiers2017/search?utf8=%E2%9C%93&q=ImportError&type=

This one seems suspicious - a sparse matrix class could be a game changer:

try:
    from scipy.sparse import bsr_matrix
    assert bsr_matrix
except (ValueError, ImportError):
    return False

)


This one doesn't seem suspicious to me (but who knows?):

try:
    import faulthandler
    faulthandler.enable()
except:
    pass

PL.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python under PowerShell adds characters

2017-03-30 Thread Steve D'Aprano
On Thu, 30 Mar 2017 04:43 pm, Marko Rauhamaa wrote:

> Steven D'Aprano :
> 
>> On Thu, 30 Mar 2017 07:29:48 +0300, Marko Rauhamaa wrote:
>>> I'd expect not having to deal with Unicode decoding exceptions with
>>> arbitrary input.
>>
>> That's just silly. If you have *arbitrary* bytes, not all
>> byte-sequences are valid Unicode, so you have to expect decoding
>> exceptions, if you're processing text.
> 
> The input is not in my control, and bailing out may not be an option:


You have to deal with bad input *somehow*. You can't just say it will never
happen. If bailing out is not an option, then perhaps the solution is not
to read stdin as Unicode text, if there's a chance that it actually doesn't
contain Unicode text. Otherwise, you have to deal with any errors.

("Deal with" can include the case of not dealing with them at all, and just
letting your script raise an exception.)



>$ echo $'aa\n\xdd\naa' | grep aa
>aa
>aa
>$ echo $'\xdd' | python2 -c 'import sys; sys.stdin.read(1)'
>$ echo $'\xdd' | python3 -c 'import sys; sys.stdin.read(1)'
>Traceback (most recent call last):
>  File "<string>", line 1, in <module>
>  File "/usr/lib64/python3.5/codecs.py", line 321, in decode
>(result, consumed) = self._buffer_decode(data, self.errors, final)
>UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0:
> invalid continuation byte

As I said, what did you expect? You choose to read from stdin as Unicode
text, then fed it something that wasn't Unicode text. That's no different
from expecting to read a file name, then passing an ASCII NUL byte.
Something is going to break, somewhere, so you have to deal with it.

I'm not sure if there are better ways, but one way of dealing with this is
to bypass the text layer and read from the raw byte-oriented stream:

[steve@ando ~]$ echo $'\xdd' | python3 -c 'import sys; print(sys.stdin.buffer.read(1))'
b'\xdd'


You have a choice. The default choice is aimed at the most common use-case,
which is that input will be text, but it's not the only choice.
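If you must go through the text layer anyway, the 'surrogateescape' error handler is another option worth knowing (a sketch; the raw bytes are faked here rather than read from stdin):

```python
# Decode arbitrary bytes with the 'surrogateescape' error handler:
# undecodable bytes become lone surrogates instead of raising
# UnicodeDecodeError, and they round-trip back to the same bytes.
raw = b'aa\n\xdd\naa'  # stand-in for sys.stdin.buffer.read()
text = raw.decode('utf-8', errors='surrogateescape')
print(ascii(text))  # the 0xDD byte survives as the surrogate '\udcdd'
assert text.encode('utf-8', errors='surrogateescape') == raw
```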



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Mikhail V
On 30 March 2017 at 07:43, Chris Angelico  wrote:
> On Thu, Mar 30, 2017 at 3:21 PM, Rick Johnson
>  wrote:
>> On Sunday, March 26, 2017 at 2:53:49 PM UTC-5, Chris Angelico wrote:
>>> On Mon, Mar 27, 2017 at 6:25 AM, Mikhail V  wrote:
>>> > On 26 March 2017 at 20:10, Steve D'Aprano  
>>> > wrote:
>>> >> On Mon, 27 Mar 2017 03:57 am, Mikhail V wrote:
>>

>>> """
>>> I, the undersigned, acknowledge that my program is
>>> intentionally excluding everyone who does not fit the
>>> following requirements: [choose all applicable]
>>>
>>> [ ] Speaks English exclusively
>>
>> Of course, your comment presupposing that every programmer
>> is fluent in every natural language. Which is not only
>> impractical, it's impossible.
>
> Nope. I can't speak Mandarin, but I can make absolutely sure that all
> my programs can accept Chinese characters. A friend of mine sent me an
> audio file with a name that included some Chinese, and I was able to
> handle it no problem.
>

Naming files is another point. Generally, if I can't speak Mandarin,
I have no right to add support for it, since I know nothing about the
language nor its symbols.

>>> [ ] Uses no diacritical marks
>>
>> Why is it my responsibiliy to encode my text with
>> pronuciation tutorials? Are we adults here or what?
>>
>>> [ ] Writes all text top-to-bottom, left-to-right
>>
>> Not my problem. Learn the King's English or go wait for
>> extinction to arrive.
>
> And these two cement your parochialism thoroughly in everyone's minds.
> "Pronunciation tutorials", eh? Sure. Tell that to everyone who speaks
> Spanish, Turkish, Norwegian, German, or Vietnamese, all of which use
> diacritical marks to distinguish between letters. English is the weird
> language in that it uses letter pairs instead of adorned letters (eg
> "ch" and "sh" instead of "ç" and "ş").

Because "pronunciation tutorials" are one of the rare excuses to use those
special characters at all. Now, do you know how many phonetic
systems linguists have invented over the past 200 years? Will you find
all of them in Unicode? And why do you need them today, if you can
learn pronunciation from audio tutorials?

And letter-pair usage is not there for fun; I think we discussed this
some time ago on python-ideas. It is merely a political problem,
since every 'king' in each land suddenly thinks that he is a genius
typographer and adds a few custom characters to Latin after he realizes
that there is no sense in forcing everyone to use some outdated system,
or even rolls his own bizarre system (e.g. Hangul).

>> What don't you add these:
>>
>> [ ] Has the ability to read and comprehend at a high
>> school level.
>> [ ] Has functioning visual receptors.
>> [ ] Has a functioning brain.
>> [ ] Is not currently in a vegetative state
>
> Nah. If I did, I'd have to say "[ ] Is not trolling python-list" as well.

Call me a bigot, but I would say:
[x] People, become adults finally and stop playing with your funny
hieroglyphs; use the Latin set, and concentrate on real problems.

If I produce an arcade game or an IDE, would it lose much if
I didn't include Unicode support? I personally would not include it,
even under fear of punishment.


Mikhail
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Steve D'Aprano
On Thu, 30 Mar 2017 03:21 pm, Rick Johnson wrote:

> On Sunday, March 26, 2017 at 2:53:49 PM UTC-5, Chris Angelico wrote:
>> On Mon, Mar 27, 2017 at 6:25 AM, Mikhail V  wrote:
>> > On 26 March 2017 at 20:10, Steve D'Aprano 
>> > wrote:
>> >> On Mon, 27 Mar 2017 03:57 am, Mikhail V wrote:
> 
>> I generally find that when people say that Unicode doesn't
>> solve their problems and they need to roll their own, it's
>> usually one of two possibilities:  1) "Their problems" are
>> all about simplicity. They don't want to have to deal with
>> all the complexities of real-world text, so they
>> arbitrarily restrict things.
> 
> There are only so many hours in the day Chris. Not every
> progammer has the time to cater to every selfish desire of
> every potential client.

Oh, you're one of *those* coders. The ones who believe that if they
personally don't need something, nobody needs it.

Listen, I'm 100% in favour of the open source model. I think coders
scratching their own itch is a great way to produce some really fantastic
software. Look at the Linux kernel, and think about how that has evolved
from Linus Torvalds scratching his own itch.

It can also produce some real garbage too, usually from the kind of coder
whose answer to everything is "you don't need to do that".

But whatever, it's a free country. If you don't want to support a subset of
your potential users, or customers, that's entirely up to you.

The honest truth is that most software ends up languishing in obscurity,
only used by a relative handful of people, so it's quite unlikely that
you'll ever have any users wanting support for Old Persian or Ogham.

But if you have any users at all, there's a good chance they'll want to
write their name correctly even if they are called Zöe, or include the
trademarked name of their Awesome™ product, or write the name of that hot
new metal band THЯДSHËR, or use emoji, or to refer to ¢ and °F
temperatures.

You might call it "selfish" for somebody to want to spell their name
correctly, or write in their native language, but selfish or not if you
don't give your users the features they want, they are unlikely to use your
software. Outside of the Democratic People's Republic of Trumpistan, the
world is full of about seven billion people who don't have any interest in
your ASCII-only software. It's not 1970 any more, the world is connected.

And the brilliant thing about Unicode is that for a little bit of effort you
can support Zöe and her French girlfriends, and that Swedish metal band
with the umlauts, and the President's Russian backers, and once you've done
that, you get at least partial support for Hebrew and Chinese and Korean
and Vietnamese and a dozen different Indian languages, and even Old Persian
and Ogham, FOR FREE.

So if you're wanting to create "the best product you can", why *wouldn't*
you use Unicode?


> You try to create the best product you can, 
> but at the end of the process, there will always be 
> someone (or a group of someones) who are unhappy with the
> result.


[...]
>> [ ] Speaks English exclusively
> 
> Of course, your comment presupposing that every programmer
> is fluent in every natural language. Which is not only
> impractical, it's impossible.

Don't be silly. You don't have to be fluent in a language in order for your
program to support users who are. All you have to do is not stop them from
using their own native language by forcing them to use ASCII and nothing
but ASCII.

Of course, if you want to *localise* your UI to their language, then you
need somebody to translate error messages, menus, window titles, etc. I'll
grant that's not always an easy job.

But aren't you lucky, you speak one of a handful of lingua francas in the
world, so the chances are your users will be pathetically grateful if all
you do is let them type in their own language. Actual UI localisation is a
bonus.


>> [ ] Uses no diacritical marks
> 
> Why is it my responsibiliy to encode my text with
> pronuciation tutorials? Are we adults here or what?

Now you're just being absurd. Supporting diacritics doesn't mean you are
responsible for teaching your users what they're for. They already know.
That's why they want to use them.

Diacritics are for:

- distinguishing between words which look the same, but have 
  different pronunciation;

- distinguishing between different letters of the alphabet, like 
  dotted-i and dotless-ı (or ı and ı-with-a-dot, if you prefer), 
  or a and å;

- distinguishing between words which look and sound the same but
  mean something different;

- and making band names look ǨØØĻ and annoy old fuddy-duddies.


>> [ ] Writes all text top-to-bottom, left-to-right
> 
> Not my problem. Learn the King's English or go wait for
> extinction to arrive.

Which king?

Harald V speaks Norwegian, Felipe VI speaks Spanish, Hamad bin Isa speaks
whatever they speak in Bahrain (probably Arabic), Norodom Sihamoni speaks
Cambodian, Vajiralongkorn speaks Thai, Mswati III speaks Swazi, Ab

Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Steve D'Aprano
On Fri, 31 Mar 2017 12:25 am, Mikhail V wrote:

> Call me a bigot

Okay. You're a bigot.


-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Chris Angelico
On Fri, Mar 31, 2017 at 1:16 AM, Steve D'Aprano
 wrote:
> On Fri, 31 Mar 2017 12:25 am, Mikhail V wrote:
>
>> Call me a bigot
>
> Okay. You're a bigot.

+1 QOTD

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


add processing images in the model.py using django

2017-03-30 Thread Xristos Xristoou
I want to create a simple image-processing app using Django. My task is easy: I
have a user and a simple model, and that user can upload images to my project
using an HTML form or a Django form; the images are saved to upload_to='mypath'
in the upload field of my model. But I have some questions:
I have some simple image processing in my views.py. If the processing completes
successfully, it creates a new image, and I want to add that image to the upload
field of my model.
How do I do that in Django?
models.py

class MyModel(models.Model):
    user = models.ForeignKey(User)
    upload = models.ImageField(upload_to='mypath/personal/folder/per/user')

views.py

def index(request):
    form = ImageUploadForm(request.POST or None, request.FILES or None)
    if request.method == "POST" and form.is_valid():
        image_file = request.FILES['image'].read()
        '''
        image processing -> new image
        '''
        return render_to_response("blog/success.html", {"new_image": new_image})
    return render_to_response('blog/images.html', {'form': form},
                              RequestContext(request))
-- 
https://mail.python.org/mailman/listinfo/python-list


django authentication multi upload files

2017-03-30 Thread Xristos Xristoou
I have created a simple Django auth project, and I need to let the user upload
some images (multi-upload of images from the internet).
views.py

from django.shortcuts import render
from django.http import HttpResponse

def Form(request):
    return render(request, "index/form.html", {})

def Upload(request):
    for count, x in enumerate(request.FILES.getlist("files")):
        def process(f):
            with open('/Users/Michel/django_1.8/projects/upload/media/file_'
                      + str(count), 'wb+') as destination:
                for chunk in f.chunks():
                    destination.write(chunk)
        process(x)
    return HttpResponse("File(s) uploaded!")
But how do I direct the multi-uploaded images into a specific, unique folder for
each user? First I use login_required, and for the destination I use
user_directory_path. But how do I write the views.py code to work with
authentication per user? For example, user_1's uploaded images should go into
the folder for user_1, and user_2's into the folder for user_2.
models.py

def user_directory_path(instance, filename):
    return 'user_{0}/{1}'.format(instance.user.id, filename)

class MyModel(models.Model):
    user = models.ForeignKey(User, unique=True)
    upload = models.ImageField(upload_to=user_directory_path)
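Stripped of Django, the per-user layout is just path construction plus chunked writes. Here is a stdlib-only sketch of that pattern (save_chunks and the directory layout are illustrative helpers, not Django API):

```python
import os
import tempfile

def save_chunks(chunks, user_id, filename, media_root):
    """Write an uploaded file's chunks into a per-user folder,
    mirroring upload_to=user_directory_path ('user_<id>/<filename>')."""
    dest_dir = os.path.join(media_root, 'user_{0}'.format(user_id))
    os.makedirs(dest_dir, exist_ok=True)   # create the user's folder on demand
    dest = os.path.join(dest_dir, filename)
    with open(dest, 'wb') as f:
        for chunk in chunks:               # same shape as f.chunks() in Django
            f.write(chunk)
    return dest

# Two "users" uploading files land in separate folders.
root = tempfile.mkdtemp()
p1 = save_chunks([b'abc', b'def'], 1, 'pic.jpg', root)
p2 = save_chunks([b'xyz'], 2, 'pic.jpg', root)
```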
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Michael Torrie
On 03/30/2017 08:14 AM, Steve D'Aprano wrote:
>> Why is it my responsibiliy to encode my text with
>> pronuciation tutorials? Are we adults here or what?
>
> Now you're just being absurd. 

Ahh yes, good old RR with his reductio ad absurdum fallacies when he's
lost the argument.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Mikhail V
On 30 March 2017 at 16:14, Steve D'Aprano  wrote:
> On Thu, 30 Mar 2017 03:21 pm, Rick Johnson wrote:
>
>> On Sunday, March 26, 2017 at 2:53:49 PM UTC-5, Chris Angelico wrote:
>>> On Mon, Mar 27, 2017 at 6:25 AM, Mikhail V  wrote:
>>> > On 26 March 2017 at 20:10, Steve D'Aprano 
>>> > wrote:

>>> [ ] Uses no diacritical marks
>>
>> Why is it my responsibiliy to encode my text with
>> pronuciation tutorials? Are we adults here or what?
>
> Now you're just being absurd. Supporting diacritics doesn't mean you are
> responsible for teaching your users what they're for. They already know.
> That's why they want to use them.
>
> Diacritics are for:
>
> - distinguishing between words which look the same, but have
>   different pronunciation;
>
> - distinguishing between different letters of the alphabet, like
>   dotted-i and dotless-ı (or ı and ı-with-a-dot, if you prefer),
>   or a and å;
>
> - distinguishing between words which look and sound the same but
>   mean something different;
>
> - and making band names look ǨØØĻ and annoy old fuddy-duddies.
>

Steve, it is not bad to want to spell your name using the spelling that
you were taught in school. But it is bad to stay under the illusion that
there is something good about using accents. As I said, it _is_ selfish
to force people to use e.g. umlauts and noun capitalisation in German.
It is a big obstacle for reading and a burden for typing.
Initially it had nothing to do with people's choice; it is politics only.
I can speak and write German fluently, so I know how much
better it would be without those odd spelling rules.

So don't mix up the spoken language and the writing system - a spoken
language will never be extinct, but most writing systems will be obsolete
and should be obsolete (you can call me a bigot again ;-)

Some other interesting aspects: if I localise software in Russian
and use Cyrillic letters, English speakers will not be able to read
_anything_, whereas if I use Latin letters instead, non-Russian users
will at least be able to read something; so knowing some spoken Russian
will then be much more help to you.

A concrete software example - let's say I make an IDE.
It is far more important to give the users mechanisms to customise
the glyphs (e.g. edit math signs) than to support Unicode.
And therefore it is much more important to make those font
format definitions transparent, non-bloated and easily editable.
Where is all of that?


Mikhail
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Text-mode apps (Was :Who are the "spacists"?)

2017-03-30 Thread Rhodri James

On 30/03/17 16:57, Mikhail V wrote:

> Steve, it is not bad to want to spell your name using spelling which
> was taught you in the school. But it is bad to stay in illusion that there
> is something good in using accents.


*plonk*

--
Rhodri James *-* Kynesim Ltd
--
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread INADA Naoki
I reproduced the issue.
This is a very usual memory-usage issue.  Thrashing is just a result of
the large memory usage.

After the 1st pass of optimization, RAM usage is 20GB+ on Python 3.5 and
30GB on Python 3.6.
And Python 3.6 starts thrashing in the 2nd optimization pass.

I enabled tracemalloc during the 1st pass.  Results are at the end of this
mail.
It seems frozenset() causes the regression, but I'm not sure yet.
I don't know what the contents of the frozenset are yet.  (I know almost
nothing about this application.)

Jan, do you know what this is?
Could you make a script which just runs `transitive_closure(edges)` with
edges similar to `log_reduction.py spaun`?

I'll dig into it later, maybe next week.
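A minimal sketch of this kind of tracemalloc instrumentation (the workload here is a stand-in, not the actual benchmark code):

```python
import tracemalloc

# Record allocation tracebacks up to 25 frames deep.
tracemalloc.start(25)

# Stand-in workload: the real benchmark builds frozensets inside
# transitive_closure(); this just allocates many small frozensets.
data = [frozenset(range(i, i + 10)) for i in range(1000)]

snapshot = tracemalloc.take_snapshot()
# Group allocations by traceback, largest first, like the report below.
for stat in snapshot.statistics('traceback')[:3]:
    print('%d memory blocks: %.1f KiB' % (stat.count, stat.size / 1024))
    for line in stat.traceback.format():
        print(line)
```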

---
Python 3.6.1
1191896 memory blocks: 22086104.2 KiB
  File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 85
reachables[vertex] = frozenset(reachables[vertex])
  File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 410
self.dependents = transitive_closure(self.dg.forward)
602986 memory blocks: 51819.1 KiB
  File "", line 14
  File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 634
first_view=None, v_offset=0, v_size=0, v_base=None)

Python 3.5.3
1166804 memory blocks: 6407.0 KiB
  File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 85
reachables[vertex] = frozenset(reachables[vertex])
  File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 410
self.dependents = transitive_closure(self.dg.forward)
602989 memory blocks: 51819.3 KiB
  File "", line 14
  File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 634
first_view=None, v_offset=0, v_size=0, v_base=None)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread INADA Naoki
Maybe this commit caused this regression.

https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6

Old:
minused = (so->used + other->used)*2  (L619)

New:
minused = so->used + other->used  (L620)
minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)

So the size of a small set is doubled.
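Rendered as Python, the two quoted C expressions look like this (illustrative only; the real code operates on the C set internals):

```python
def minused_old(so_used, other_used):
    # Python 3.5: always request twice the combined element count.
    return (so_used + other_used) * 2

def minused_new(so_used, other_used):
    # Python 3.6: small sets (<= 5 elements) request 4x instead of 2x,
    # so the table a small set asks for is doubled.
    minused = so_used + other_used
    return minused * 2 if minused > 5 else minused * 4

print(minused_old(3, 0), minused_new(3, 0))   # small set: request doubled
print(minused_old(10, 0), minused_new(10, 0)) # larger set: unchanged
```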

$ /usr/bin/python3
Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
[GCC 6.2.0 20160927] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = set(range(10))
>>> sys.getsizeof(frozenset(s))
736
>>>

$ python3
Python 3.6.0 (default, Dec 30 2016, 20:49:54)
[GCC 6.2.0 20161005] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import  sys
>>> s = set(range(10))
>>> sys.getsizeof(frozenset(s))
1248
>>>
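The growth is easy to observe directly (exact byte counts vary across Python versions and platforms, so treat the printed numbers as illustrative):

```python
import sys

# frozenset memory footprint grows with the hash-table size the
# interpreter chooses, which the resize heuristic above controls.
for n in (0, 5, 10, 100):
    s = frozenset(range(n))
    print(n, sys.getsizeof(s))
```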



On Fri, Mar 31, 2017 at 2:34 AM, INADA Naoki  wrote:
> I reproduced the issue.
> This is very usual, memory usage issue.  Slashing is just a result of
> large memory usage.
>
> After 1st pass of optimization, RAM usage is 20GB+ on Python 3.5 and
> 30GB on Python 3.6.
> And Python 3.6 starts slashing in 2nd optimization pass.
>
> I enabled tracemalloc while 1st pass.  Results is end of this mail.
> It seems frozenset() cause regression, but I'm not sure yet.
> I don't know what is contents of frozenset yet.  (I know almost
> nothing about this application).
>
> Jan, do you know about what this is?
> Could you make script which just runs `transitive_closure(edges)` with
> edges similar to
> `log_reduction.py spaun`?
>
> I'll dig into it later, maybe next week.
>
> [tracemalloc results quoted in full above; trimmed]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread INADA Naoki
Filed an issue: https://bugs.python.org/issue29949

Thanks for your report, Jan.

On Fri, Mar 31, 2017 at 3:04 AM, INADA Naoki  wrote:
> Maybe, this commit make this regression.
>
> https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6
>
> Old:
> minused = (so->used + other->used)*2  (L619)
>
> New:
> minused = so->used + other->used  (L620)
> minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)
>
> So size of small set is doubled.
>
> $ /usr/bin/python3
> Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
> [GCC 6.2.0 20160927] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> s = set(range(10))
> >>> sys.getsizeof(frozenset(s))
> 736
>
> $ python3
> Python 3.6.0 (default, Dec 30 2016, 20:49:54)
> [GCC 6.2.0 20161005] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> s = set(range(10))
> >>> sys.getsizeof(frozenset(s))
> 1248
>
>
>
> On Fri, Mar 31, 2017 at 2:34 AM, INADA Naoki  wrote:
>> I reproduced the issue.
>> This is very usual, memory usage issue.  Slashing is just a result of
>> large memory usage.
>>
>> After 1st pass of optimization, RAM usage is 20GB+ on Python 3.5 and
>> 30GB on Python 3.6.
>> And Python 3.6 starts slashing in 2nd optimization pass.
>>
>> I enabled tracemalloc while 1st pass.  Results is end of this mail.
>> It seems frozenset() cause regression, but I'm not sure yet.
>> I don't know what is contents of frozenset yet.  (I know almost
>> nothing about this application).
>>
>> Jan, do you know about what this is?
>> Could you make script which just runs `transitive_closure(edges)` with
>> edges similar to
>> `log_reduction.py spaun`?
>>
>> I'll dig into it later, maybe next week.
>>
>> ---
>> Python 3.6.1
>> 1191896 memory blocks: 22086104.2 KiB
>>   File 
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 85
>> reachables[vertex] = frozenset(reachables[vertex])
>>   File 
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 410
>> self.dependents = transitive_closure(self.dg.forward)
>> 602986 memory blocks: 51819.1 KiB
>>   File "", line 14
>>   File 
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 634
>> first_view=None, v_offset=0, v_size=0, v_base=None)
>>
>> Python 3.5.3
>> 1166804 memory blocks: 6407.0 KiB
>>   File 
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 85
>> reachables[vertex] = frozenset(reachables[vertex])
>>   File 
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 410
>> self.dependents = transitive_closure(self.dg.forward)
>> 602989 memory blocks: 51819.3 KiB
>>   File "", line 14
>>   File 
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 634
>> first_view=None, v_offset=0, v_size=0, v_base=None)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread Jan Gosmann
That's great news. I'm busy with other things right now, but will look 
into your findings in more detail later.



On 03/30/2017 02:09 PM, INADA Naoki wrote:

Filed an issue: https://bugs.python.org/issue29949

Thanks for your report, Jan.

On Fri, Mar 31, 2017 at 3:04 AM, INADA Naoki  wrote:

Maybe, this commit make this regression.

https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6

Old:
minused = (so->used + other->used)*2  (L619)

New:
minused = so->used + other->used  (L620)
minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)

So size of small set is doubled.

$ /usr/bin/python3
Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
[GCC 6.2.0 20160927] on linux
Type "help", "copyright", "credits" or "license" for more information.

import sys
s = set(range(10))
sys.getsizeof(frozenset(s))

736
$ python3
Python 3.6.0 (default, Dec 30 2016, 20:49:54)
[GCC 6.2.0 20161005] on linux
Type "help", "copyright", "credits" or "license" for more information.

import  sys
s = set(range(10))
sys.getsizeof(frozenset(s))

1248


On Fri, Mar 31, 2017 at 2:34 AM, INADA Naoki  wrote:

I reproduced the issue.
This is very usual, memory usage issue.  Slashing is just a result of
large memory usage.

After 1st pass of optimization, RAM usage is 20GB+ on Python 3.5 and
30GB on Python 3.6.
And Python 3.6 starts slashing in 2nd optimization pass.

I enabled tracemalloc while 1st pass.  Results is end of this mail.
It seems frozenset() cause regression, but I'm not sure yet.
I don't know what is contents of frozenset yet.  (I know almost
nothing about this application).

Jan, do you know about what this is?
Could you make script which just runs `transitive_closure(edges)` with
edges similar to
`log_reduction.py spaun`?

I'll dig into it later, maybe next week.

---
Python 3.6.1
1191896 memory blocks: 22086104.2 KiB
   File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 85
 reachables[vertex] = frozenset(reachables[vertex])
   File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 410
 self.dependents = transitive_closure(self.dg.forward)
602986 memory blocks: 51819.1 KiB
   File "", line 14
   File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 634
 first_view=None, v_offset=0, v_size=0, v_base=None)

Python 3.5.3
1166804 memory blocks: 6407.0 KiB
   File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 85
 reachables[vertex] = frozenset(reachables[vertex])
   File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 410
 self.dependents = transitive_closure(self.dg.forward)
602989 memory blocks: 51819.3 KiB
   File "", line 14
   File 
"/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
line 634
 first_view=None, v_offset=0, v_size=0, v_base=None)
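For reference, per-traceback statistics like the output above can be collected with the stdlib tracemalloc module; this is a minimal sketch, with a stand-in workload in place of the real optimizer run:

```python
import tracemalloc

# Keep up to 25 frames per allocation traceback.
tracemalloc.start(25)

# Stand-in workload; the real case would run transitive_closure(edges).
data = [frozenset(range(i)) for i in range(100)]

snapshot = tracemalloc.take_snapshot()
# Print the three allocation sites holding the most memory.
for stat in snapshot.statistics('traceback')[:3]:
    print("%d memory blocks: %.1f KiB" % (stat.count, stat.size / 1024))
    for line in stat.traceback.format():
        print(line)
```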


--
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread INADA Naoki
FYI, this small patch may fix your issue:
https://gist.github.com/methane/8faf12621cdb2166019bbcee65987e99
-- 
https://mail.python.org/mailman/listinfo/python-list


error in syntax description for comprehensions?

2017-03-30 Thread Boylan, Ross
https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
describes the syntax for comprehensions as
comprehension ::=  expression comp_for
comp_for  ::=  [ASYNC] "for" target_list "in" or_test [comp_iter]
comp_iter ::=  comp_for | comp_if
comp_if   ::=  "if" expression_nocond [comp_iter]

Is the comp_for missing an argument after "in"?
One has to follow the definition of or_test and its components, but I can't
find anything that resolves to a single variable or expression.

Actually, I'm not sure what or_test would do there either, with or without an
additional element following "in".

Ross Boylan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: error in syntax description for comprehensions?

2017-03-30 Thread Ned Batchelder
On Thursday, March 30, 2017 at 4:59:03 PM UTC-4, Boylan, Ross wrote:
> https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
> describes the syntax for comprehensions as
> comprehension ::=  expression comp_for
> comp_for  ::=  [ASYNC] "for" target_list "in" or_test [comp_iter]
> comp_iter ::=  comp_for | comp_if
> comp_if   ::=  "if" expression_nocond [comp_iter]
> 
> Is the comp_for missing an argument after "in"?
> One has to follow the definition of or_test and its components, but I can't
> find anything that resolves to a single variable or expression.
>
> Actually, I'm not sure what or_test would do there either, with or without an
> additional element following "in".


Syntax grammars can be obtuse. An or_test can be an and_test, which
can be a not_test, which can be a comparison.  It continues from there.
The whole chain of "can be" is:

or_test
and_test
not_test
comparison
or_expr
xor_expr
and_expr
shift_expr
a_expr
m_expr
u_expr
power
primary
atom
identifier

... and identifier is what you are looking for.
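A quick illustration (my example, not from the grammar docs): because the iterable after "in" is a full or_test, an expression such as `a or b` is legal there, not just a bare name:

```python
a = None
b = [1, 2, 3]

# The iterable here is the or_test `a or b`, which evaluates to b.
result = [x * 2 for x in a or b]
print(result)  # → [2, 4, 6]
```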

--Ned.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread MRAB

On 2017-03-30 19:04, INADA Naoki wrote:

Maybe this commit caused the regression.

https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6

Old:
minused = (so->used + other->used)*2  (L619)

New:
minused = so->used + other->used  (L620)
minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)

So the size of a small set is doubled.

$ /usr/bin/python3
Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
[GCC 6.2.0 20160927] on linux
Type "help", "copyright", "credits" or "license" for more information.

import sys
s = set(range(10))
sys.getsizeof(frozenset(s))

736




$ python3
Python 3.6.0 (default, Dec 30 2016, 20:49:54)
[GCC 6.2.0 20161005] on linux
Type "help", "copyright", "credits" or "license" for more information.

import sys
s = set(range(10))
sys.getsizeof(frozenset(s))

1248



Copying a small set _might_ double its memory usage.


Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 
bit (AMD64)] on win32

Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = set(range(10))
>>> sys.getsizeof(s)
736
>>> sys.getsizeof(set(s))
736
>>>
>>> sys.getsizeof(set(set(s)))
736


Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit 
(AMD64)] on win32

Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> s = set(range(10))
>>> sys.getsizeof(s)
736
>>> sys.getsizeof(set(s))
1248
>>>
>>> sys.getsizeof(set(set(s)))
1248
>>>

--
https://mail.python.org/mailman/listinfo/python-list


Re: error in syntax description for comprehensions?

2017-03-30 Thread Terry Reedy

On 3/30/2017 4:57 PM, Boylan, Ross wrote:

https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
describes the syntax for comprehensions as
comprehension ::=  expression comp_for
comp_for  ::=  [ASYNC] "for" target_list "in" or_test [comp_iter]
comp_iter ::=  comp_for | comp_if
comp_if   ::=  "if" expression_nocond [comp_iter]

Is the comp_for missing an argument after "in"?


The or_test *is* the 'argument'.


One has to follow the definition of or_test and its components,

> but I can't find anything that results to a single variable
> or expression.

An or_test *is* a single expression.  Like all python expressions, it 
evaluates to a python object.  In this case, the object is passed to 
iter() and so the object must be an iterable.


>>> a, b = None, range(3)
>>> a or b
range(0, 3)
>>> for i in a or b: print(i)

0
1
2

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Program uses twice as much memory in Python 3.6 than in Python 3.5

2017-03-30 Thread Pavol Lisy
On 3/30/17, INADA Naoki  wrote:
> Maybe this commit caused the regression.
>
> https://github.com/python/cpython/commit/4897300276d870f99459c82b937f0ac22450f0b6
>
> Old:
> minused = (so->used + other->used)*2  (L619)
>
> New:
> minused = so->used + other->used  (L620)
> minused = (minused > 5) ? minused * 2 : minused * 4;  (L293)
>
> So the size of a small set is doubled.
>
> $ /usr/bin/python3
> Python 3.5.2+ (default, Sep 22 2016, 12:18:14)
> [GCC 6.2.0 20160927] on linux
> Type "help", "copyright", "credits" or "license" for more information.
 import sys
 s = set(range(10))
 sys.getsizeof(frozenset(s))
> 736

>
> $ python3
> Python 3.6.0 (default, Dec 30 2016, 20:49:54)
> [GCC 6.2.0 20161005] on linux
> Type "help", "copyright", "credits" or "license" for more information.
 import  sys
 s = set(range(10))
 sys.getsizeof(frozenset(s))
> 1248

>
>
>
> On Fri, Mar 31, 2017 at 2:34 AM, INADA Naoki 
> wrote:
>> I reproduced the issue.
>> This is a very common memory usage issue.  Thrashing is just a result of
>> large memory usage.
>>
>> After the 1st optimization pass, RAM usage is 20GB+ on Python 3.5 and
>> 30GB on Python 3.6.
>> Python 3.6 then starts thrashing in the 2nd optimization pass.
>>
>> I enabled tracemalloc during the 1st pass.  The results are at the end
>> of this mail.
>> It seems frozenset() causes the regression, but I'm not sure yet.
>> I don't know what the contents of the frozenset are yet.  (I know almost
>> nothing about this application.)
>>
>> Jan, do you know what this is?
>> Could you make a script that just runs `transitive_closure(edges)` with
>> edges similar to
>> `log_reduction.py spaun`?
>>
>> I'll dig into it later, maybe next week.
>>
>> ---
>> Python 3.6.1
>> 1191896 memory blocks: 22086104.2 KiB
>>   File
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 85
>> reachables[vertex] = frozenset(reachables[vertex])
>>   File
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 410
>> self.dependents = transitive_closure(self.dg.forward)
>> 602986 memory blocks: 51819.1 KiB
>>   File "", line 14
>>   File
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 634
>> first_view=None, v_offset=0, v_size=0, v_base=None)
>>
>> Python 3.5.3
>> 1166804 memory blocks: 6407.0 KiB
>>   File
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 85
>> reachables[vertex] = frozenset(reachables[vertex])
>>   File
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 410
>> self.dependents = transitive_closure(self.dg.forward)
>> 602989 memory blocks: 51819.3 KiB
>>   File "", line 14
>>   File
>> "/home/inada-n/work/gosmann-frontiers2017/gosmann_frontiers2017/optimized/optimizer.py",
>> line 634
>> first_view=None, v_offset=0, v_size=0, v_base=None)
> --
> https://mail.python.org/mailman/listinfo/python-list
>

Interesting.

1. Using set_table_resize with a growth factor of 2 or 4 doesn't seem
optimal for constructing an (immutable) frozenset from a set.

2. A growth factor of 2 could also be too big (
https://en.wikipedia.org/wiki/Dynamic_array#Growth_factor
, https://github.com/python/cpython/blob/80ec8364f15857c405ef0ecb1e758c8fc6b332f7/Objects/listobject.c#L58
)
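The trade-off behind that growth-factor concern can be sketched with a toy simulation (an illustration only, not CPython's actual allocator): a larger factor means fewer reallocation copies but more slack capacity left over.

```python
def simulate(n_appends, factor):
    """Simulate repeated appends into a growable array.

    Returns (total elements copied during reallocations, final slack capacity).
    """
    capacity, size, copied = 1, 0, 0
    for _ in range(n_appends):
        if size == capacity:
            copied += size                     # a reallocation copies everything
            capacity = int(capacity * factor) + 1
        size += 1
    return copied, capacity - size

for factor in (1.5, 2.0, 4.0):
    copied, slack = simulate(10**5, factor)
    print(factor, copied, slack)
```

With factor 1.5 the total copy work is roughly 2n; with factor 2 it is roughly n, at the cost of up to ~50% unused capacity after the last resize.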

PL.
-- 
https://mail.python.org/mailman/listinfo/python-list