pip and different branches?

2013-05-20 Thread andrea crotti
We use github and we work on many different branches at the same time.

The problem is that we now have more than five repos, and the same
branches may exist across all of them.

Now we use pip and install requirements such as:
git+ssh://g...@github.com/repo.git@dev

The problem is that the requirements files are also under revision
control, and we constantly end up in the situation where, when we merge
branches, the branch settings get messed up because we forget to change
them.

I was looking for a solution that would allow me to:
- use the branch of the "main" repo for all the dependencies
- fall back on master if that branch doesn't exist

I thought about a few options:
1. create a wrapper for pip that manipulates the requirements files, which
   would then become templates.
   For that, however, I would need to know whether a branch exists or not,
   and I didn't find a way to do that without cloning the repo (see the
   sketch after this list).

2. modify pip to not fail when checking out a non-existing branch, so
   that if it's not found it falls back on master automatically.

3. use some git hook magic, but I'm not sure what exactly

4. stop using virtualenv + pip and use something smarter that handles
   this.
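
(For option 1, a possible way to check whether a branch exists without
cloning might be `git ls-remote`; a minimal, untested sketch, assuming ssh
access to the repos:)

import subprocess

def branch_exists(repo_url, branch):
    """Return True if `branch` exists on the remote `repo_url`."""
    out = subprocess.check_output(
        ['git', 'ls-remote', '--heads', repo_url, branch])
    # ls-remote prints nothing when the branch is not found
    return bool(out.strip())

# e.g. when rendering the requirements template:
# branch = branch if branch_exists(url, branch) else 'master'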

Any suggestions?
Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Getting a callable for any value?

2013-05-29 Thread andrea crotti

On 05/29/2013 06:46 PM, Croepha wrote:

Is there anything like this in the standard library?

class AnyFactory(object):
    def __init__(self, anything):
        self.product = anything
    def __call__(self):
        return self.product
    def __repr__(self):
        return "%s.%s(%r)" % (self.__class__.__module__,
                              self.__class__.__name__, self.product)


my use case is:
collections.defaultdict(AnyFactory(collections.defaultdict(AnyFactory(None))))




I think I would scratch my head for a good half an hour if I saw a
string like this, so I hope there isn't anything in the standard library
to do that :D
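
(For what it's worth, a plain lambda can already act as such a constant
factory; a tiny sketch of the same use case:)

import collections

# the equivalent of AnyFactory(None) as a default factory, with a lambda
d = collections.defaultdict(lambda: collections.defaultdict(lambda: None))
d['x']['y']  # -> None, both levels get created on demand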
-- 
http://mail.python.org/mailman/listinfo/python-list


decorator to fetch arguments from global objects

2013-06-18 Thread andrea crotti
Using a CouchDB server we have a different database object potentially for
every request.

We already set that db in the request object to make it easy to pass it
around from our django app, however it would be nice if I could set it once
in the API and automatically fetch it from there.

Basically I have something like

class Entity:
 def save_doc(db)
...

I would like basically to decorate this function in such a way that:
- if I pass a db object use it
- if I don't pass it in try to fetch it from a global object
- if both don't exist raise an exception

Now it kind of works already with the decorator below.
The problem is that the argument is positional, so I might end up passing it
twice.
So I have to enforce that 'db', if present, is passed as the first argument..

It would be a lot easier removing the db from the arguments but then it
would look too magic and I didn't want to change the signature.. any other
advice?

def with_optional_db(func):
    """Decorator that sets the database to the global current one if
    not passed in or if passed in and None
    """
    @wraps(func)
    def _with_optional_db(*args, **kwargs):
        func_args = func.func_code.co_varnames
        db = None
        # if it's defined in the first elements it needs to be
        # assigned to *args, otherwise to kwargs
        if 'db' in func_args:
            assert 'db' == func_args[0], "Needs to be the first defined"
        else:
            db = kwargs.get('db', None)

        if db is None:
            kwargs['db'] = get_current_db()

        assert kwargs['db'] is not None, "Can't have a not defined database"
        ret = func(*args, **kwargs)
        return ret

    return _with_optional_db
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python-django for dynamic web survey?

2013-06-18 Thread andrea crotti
Django makes your life a lot easier in many ways, but you still need some
time to learn it.
The task you're attempting is not trivial though; depending on your
experience it might take a while with any library/framework..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: decorator to fetch arguments from global objects

2013-06-18 Thread andrea crotti
2013/6/18 Wolfgang Maier 

> andrea crotti  gmail.com> writes:
>
> >
> >
> > Using a CouchDB server we have a different database object potentially
> for
> every request.
> >
> > We already set that db in the request object to make it easy to pass it
> around form our django app, however it would be nice if I could set it once
> in the API and automatically fetch it from there.
> >
> > Basically I have something like
> >
> > class Entity:
> >  def save_doc(db)
> > ...
> >
> > I would like basically to decorate this function in such a way that:
> > - if I pass a db object use it
> > - if I don't pass it in try to fetch it from a global object
> > - if both don't exist raise an exception
> >
> > Now it kinda of works already with the decorator below.
> > The problem is that the argument is positional so I end up maybe passing
> it twice.
> > So I have to enforce that 'db' if there is passed as first argument..
> >
> > It would be a lot easier removing the db from the arguments but then it
> would look too magic and I didn't want to change the signature.. any other
> advice?
> >
> > def with_optional_db(func):
> > """Decorator that sets the database to the global current one if
> > not passed in or if passed in and None
> > """
> >   wraps(func)
> > def _with_optional_db(*args, **kwargs):
> > func_args = func.func_code.co_varnames
> > db = None
> > # if it's defined in the first elements it needs to be
> > # assigned to *args, otherwise to kwargs
> > if 'db' in func_args:
> > assert 'db' == func_args[0], "Needs to be the first defined"
> > else:
> > db = kwargs.get('db', None)
> >
> > if db is None:
> > kwargs['db'] = get_current_db()
> >
> > assert kwargs['db'] is not None, "Can't have a not defined
> database"
> > ret = func(*args, **kwargs)
> > return ret
> >
> > return _with_optional_db
> >
>
> I'm not sure, whether your code would work. I get the logic for the db in
> kwargs case, but why are you checking whether db is in func_args? Isn't the
> real question whether it's in args ?? In general, I don't understand why
> you
> want to use .func_code.co_varnames here. You know how you defined your
> function (or rather method):
> class Entity:
> def save_doc(db):
> ...
> Maybe I misunderstood the problem?
> Wolfgang
>
>
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>


Well the point is that I could allow someone to not use "db" as an argument
of the function if he only wants to use the global db object..

Or at least I want to check that it's the first argument and not in another
position, just as a sanity check.

I might drop some magic and make it a bit simpler though; even a default
argument falling back on the global db could actually be good, and I would
not even need the decorator at that point (a minimal sketch below)..
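
Something like this is what I mean (the save call at the end is just a
hypothetical example):

def save_doc(self, db=None):
    # fall back on the global database when none is passed in
    if db is None:
        db = get_current_db()
    assert db is not None, "no database available"
    return db.save(self.to_dict())  # hypothetical save call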
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: decorator to fetch arguments from global objects

2013-06-18 Thread andrea crotti
2013/6/18 Terry Reedy 

> On 6/18/2013 5:47 AM, andrea crotti wrote:
>
>> Using a CouchDB server we have a different database object potentially
>> for every request.
>>
>> We already set that db in the request object to make it easy to pass it
>> around form our django app, however it would be nice if I could set it
>> once in the API and automatically fetch it from there.
>>
>> Basically I have something like
>>
>> class Entity:
>>   def save_doc(db)
>>
>
> If save_doc does not use an instance of Entity (self) or Entity itself
> (cls), it need not be put in the class.


I missed a self, it's a method actually..


>
>
>   ...
>>
>> I would like basically to decorate this function in such a way that:
>> - if I pass a db object use it
>> - if I don't pass it in try to fetch it from a global object
>> - if both don't exist raise an exception
>>
>
> Decorators are only worthwhile if used repeatedly. What you specified can
> easily be written, for instance, as
>
> def save_doc(db=None):
>   if db is None:
> db = fetch_from_global()
>   if isinstance(db, dbclass):
> save_it()
>   else:
> raise ValueError('need dbobject')
>
>
>
Yes, that's exactly why I want a decorator: to avoid all this boilerplate
for every function/method that uses a db object..
-- 
http://mail.python.org/mailman/listinfo/python-list


using SQLalchemy

2012-06-21 Thread andrea crotti
We have a very chaotic database (on MySQL) at the moment, with, for
example, table names used as keys to query other tables (and that's just
one example).

We are going to redesign it but first I still have to replace the
perl+vbscript system with only one Python program, so I still have to
deal with the current state.

I'm trying to use SQLalchemy and it looks absolutely great, but in
general as a policy we don't use external dependencies..

To try to make an exception in this case:
- are there any problems with SQLalchemy on Windows?
- are there any possibly drawbacks of using SQLalchemy instead of the
  MySqlDB interface?

  For the second point I guess that we might have a bit less fine
  tuning, but the amount of data is not that large and speed is also not a
  big issue (also because all the queries are extremely inefficient
  now).

  Any other possible issue?

Thanks,
Andrea
-- 
http://mail.python.org/mailman/listinfo/python-list


retry many times decorator

2012-06-28 Thread andrea crotti
Hi everyone, I'm replacing a perl system that has to work a lot with
databases and perforce (and a few other things).

This script has to run completely unsupervised, so it's important
that it doesn't just quit at the first failed attempt waiting for human
intervention..

They say that the network sometimes has problems so all over the code
there are things like:
until ($dbh = DBI->connect('...'))
{
    sleep( 5 * 60 );
}


Since I really don't want to do that I tried to do a decorator:

class retry_n_times:
    def __init__(self, ntimes=3, timeout=3, fatal=True):
        self.ntimes = ntimes
        self.timeout = timeout
        self.fatal = fatal

    def __call__(self, func):
        def _retry_n_times(*args, **kwargs):
            attempts = 0
            while True:
                logger.debug("Attempt number %s of %s" % (attempts,
                                                          func.__name__))
                ret = func(*args, **kwargs)
                if ret:
                    return ret
                else:
                    sleep(self.timeout)

                attempts += 1
                if attempts == self.ntimes:
                    logger.error("Giving up the attempts while running "
                                 "%s" % func.__name__)
                    if self.fatal:
                        exit(100)

        return _retry_n_times

which can be used as

@retry_n_times(ntimes=10)
def connect():
    try:
        conn = mysql_connection()
    except Exception:
        return False
    else:
        return True


So the function to be decorated has to return a boolean..  The problem
is that I would like to keep the exception message to report a bit
better what could be the problem, in case the retry fails.

Any idea about how to improve it?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: retry many times decorator

2012-06-28 Thread andrea crotti
> Returning a boolean isn't very Pythonic. It would be better, IMHO, if
> it could swallow a specified exception (or specified exceptions?)
> raised when an attempt failed, up to the maximum permitted number of
> attempts. If the final attempt fails, propagate the exception.
> --
> http://mail.python.org/mailman/listinfo/python-list


Yes, right, I also didn't like it..  Now it has become something like the
below, so I capture every possible exception (since it must be generic) and
log what exception was raised.  I'm not re-raising because if it fails
and it's fatal I should just exit, and if it's not fatal it should
just continue..

class retry_n_times:
    def __init__(self, ntimes=3, timeout=3, fatal=True):
        self.ntimes = ntimes
        self.timeout = timeout
        self.fatal = fatal

    def __call__(self, func):
        def _retry_n_times(*args, **kwargs):
            attempts = 0
            while True:
                logger.debug("Attempt number %s of %s" % (attempts,
                                                          func.__name__))
                try:
                    ret = func(*args, **kwargs)
                except Exception as e:
                    logger.error("Got exception %s with error %s" %
                                 (type(e), str(e)))
                    sleep(self.timeout)
                else:
                    return ret

                attempts += 1
                if attempts == self.ntimes:
                    logger.error("Giving up the attempts while running "
                                 "%s" % func.__name__)
                    if self.fatal:
                        exit(100)

        return _retry_n_times
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: retry many times decorator

2012-06-28 Thread Andrea Crotti

On 06/28/2012 06:43 PM, Steven D'Aprano wrote:

On Thu, 28 Jun 2012 17:26:36 +0100, andrea crotti wrote:


I disagree. If you make a coding error in your function, why do you think
it is useful to retry that buggy code over and over again? It's never
going to get less buggy unless you see the exception and fix the bug.

For any operation that you want to retry, identify the *temporary*
errors, catch them, and retry the request. *Permanent* errors should
immediately fail, without retrying. *Unexpected* errors should not be
caught, since they probably represent a bug in your code.


Ah well, maybe I wasn't clear, but I'm not going to retry random things:
I will only decorate the functions that I know for sure could go wrong
because of temporary network problems.

For example they told me that sometimes mysql just doesn't respond in
time for some reason, but there's nothing permanently wrong, so retrying
is the best option..

It would be good, of course, to also catch the exceptions that are known
to be permanent problems in the function at least, and leave the retry as
a last resort..

Thanks for the idea of the exponential backoff, which is also a better
name than timeout for the variable (rough sketch below)..
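
Something like this is what I have in mind for the backoff (just a sketch,
names made up; `temporary_errors` would be a tuple of the exception classes
known to be transient):

from time import sleep

def retry_with_backoff(func, temporary_errors, ntimes=3, initial_delay=5):
    """Retry func() on known-temporary errors, doubling the delay each
    time; any other exception propagates immediately."""
    delay = initial_delay
    for attempt in range(ntimes):
        try:
            return func()
        except temporary_errors:
            if attempt == ntimes - 1:
                raise  # give up and propagate on the last attempt
            sleep(delay)
            delay *= 2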

--
http://mail.python.org/mailman/listinfo/python-list


Re: retry many times decorator

2012-06-29 Thread andrea crotti
On the other hand, now that I think about it again, even supposing there
is a permanent error like MySQL being completely down, retrying
continuously won't do any harm, because the machine will not be able to do
anything else anyway; when someone fixes MySQL it would
restart again without human intervention.

So I think I could even just let it retry and maybe use an SMTPHandler
for the error logging, to make the notification of problems very
quick (a sketch below)..
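
Something along these lines for the SMTPHandler (host and addresses are
made up, `logger` is the module logger already used elsewhere):

import logging
import logging.handlers

smtp_handler = logging.handlers.SMTPHandler(
    mailhost='smtp.example.com',
    fromaddr='simulations@example.com',
    toaddrs=['admins@example.com'],
    subject='Simulation process error')
smtp_handler.setLevel(logging.ERROR)  # only ERROR and above get mailed
logger.addHandler(smtp_handler)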
-- 
http://mail.python.org/mailman/listinfo/python-list


adding a simulation mode

2012-07-04 Thread andrea crotti
I'm writing a program which has to interact with many external
resources, at least:
- mysql database
- perforce
- shared mounts
- files on disk

And the logic is quite complex, because there are many possible paths to
follow depending on some other parameters.
This program even needs to run on many virtual machines at the same time
so the interaction is another thing I need to check...

Now I have successfully managed to mock the database with sqlalchemy, using
only the fields I actually need, but now I would like to simulate
everything else as well.

For example, when simulating I would like to be able to pass a fake
database and a fake configuration, and get a log of what exactly would
happen.  But I'm not sure how to implement it..  One possibility would be
to have a global variable (PRETEND_ONLY = False) that should be checked
before every potentially system-dependent command.

For example

copytree(src, dest) becomes:

if not PRETEND_ONLY:
    copytree(src, dest)

But I don't like it too much because I would have to add a lot of
garbage around..

Another way is maybe to set sys.excepthook to something that catches
all the exceptions that would be thrown by these commands, but I'm not
sure it's a good idea either..

Any suggestion?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-04 Thread andrea crotti
2012/7/4 Steven D'Aprano :
>
> Then, have your code accept the thing as an argument.
>
> E.g. instead of having a hard-coded database connection, allow the
> database connection to be set (perhaps as an argument, perhaps as a
> config option, etc.).
>
> There are libraries to help with mocks, e.g.:
>
> http://garybernhardt.github.com/python-mock-comparison/

Ah yes this part is already done, I pass an object to the entry point
of the program which represents the database connection, which looks
like this:

class MockMysqlEngine(MySqlEngine):
    # TODO: make the engine more generic would avoid this dirty hack
    def __init__(self, *args, **kwargs):
        # self.engine = create_engine('sqlite:home/andrea/testdb.sqlite', echo=True)
        self.engine = create_engine('sqlite://', echo=True)
        self.meta = MetaData(bind=self.engine)
        self.session_maker = sessionmaker(bind=self.engine)


Now I populate the schema statically and fill it with some test data
too, but I'm also implementing a way to just pass some CSV files so
that other people can easily write test cases with other
possible database configurations.

(And I use mock for a few other things)


>
>
>> For example
>>
>> copytree(src, dest) becomes:
>> if not PRETEND_ONLY:
>> copytree(src, dest)
>
> Ewww :(
>
> Mocking the file system is probably the hardest part, because you
> generally don't have a "FileSystem" object available to be replaced. In
> effect, your program has one giant global variable, the file system.
> Worse, it's not even a named variable, it's hard-coded everywhere you use
> it.
>
> I don't know of any good solution for that. I've often thought about it,
> but don't have an answer.
>
> I suppose you could monkey-patch a bunch of stuff:
>
> if ONLY_PRETEND:
>  open = my_mock_open
>  copytree = my_mock_copytree
>  # etc.
> main()  # run your application
>
>
>
> but that would also be painful.
>

Yes, there is no easy solution apparently..  But I'm also playing
around with vagrant and virtual machine generation; supposing I'm able
to really control what will be on the machine at time X, creating it
on demand with what I need, it might be a good way to solve my
problems (a bit overkill and slow maybe, though).

I'll try the sys.excepthook trick first, any error should give me an
exception, so if I catch them all I think it might work already..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-04 Thread andrea crotti
> Yes there is no easy solution apparently..  But I'm also playing
> around with vagrant and virtual machine generations, suppose I'm able
> to really control what will be on the machine at time X, creating it
> on demand with what I need, it might be a good way to solve my
> problems (a bit overkill and slow maybe though).
>
> I'll try the sys.excepthook trick first, any error should give me an
> exception, so if I catch them all I think it might work already..


I actually thought that the sys.excepthook trick would be easy, but it's
not so trivial apparently:
this simple sample never reaches the print("here"), because even if
the exception is caught it still quits with return code=1.

I also tried to catch the signal, but same result; how do I make it
continue and just not complain?

The other option is of course to do a big try/except, but I would
prefer the excepthook solution..


import sys
from shutil import copy

def my_except_hook(etype, value, tb):
    print("got an exception of type", etype)


if __name__ == '__main__':
    sys.excepthook = my_except_hook
    copy('sdflsdk')
    print("here")
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-05 Thread andrea crotti
2012/7/5 Dieter Maurer :
>
> There is a paradigm called "inversion of control" which can be used
> to handle those requirements.
>
> With "inversion of control", the components interact on the bases
> of interfaces. The components themselves do not know each other, they
> know only the interfaces they want to interact with. For the interaction
> to really take place, a component asks a registry "give me a component
> satisfying this interface", gets it and uses the interface.
>
> If you follow this paradigm, it is easy to switch components: just
> register different alternatives for the interface at hand.
>
>
> "zope.interface" and "zope.component" are python packages that
> support this paradigm. Despite the "zope" in their name, they can be
> used outside of "Zope".
>
> "zope.interface" models interfaces, while "zope.component" provides
> so called "utilities" (e.g. "database utility", "filesystem utility", ...)
> and "adapters" and the corresponding registries.
>
>
> Of course, they contain only the infrastructure for the "inversion of control"
> paradigm. Up to you to provide the implementation for the various
> mocks.
>


Thanks that's a good point, in short I could do something like:


class FSActions:
    @classmethod
    def copy_directories(cls, src, dest):
        raise NotImplementedError

    @classmethod
    ...

And then have different implementations of this interface.  This would
work, but I don't really like the idea of constructing interfaces that
provide only the few things I need.

Instead of being good APIs they might become just random
functionalities put together to make my life easier, and at that point
maybe just some clear mocking might be even better..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: stuck in files!!

2012-07-06 Thread andrea crotti
2012/7/6 Chirag B :
> i want to kno how to link two applications using python for eg:notepad
> txt file and some docx file. like i wat to kno how to take path of
> those to files and run them simultaneously.like if i type something in
> notepad it has to come in wordpad whenever i run that code.
> --
> http://mail.python.org/mailman/listinfo/python-list

I don't think that "I want to know" ever led to useful answers;
it would not be polite even if people were actually paid to answer ;)

Anyway, it's quite an application/OS-specific question, and probably not
very easy either..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-12 Thread andrea crotti
One thing that I don't quite understand is why some calls make the whole
program quit even if I catch the exception.
For example this

try:
    copytree('sjkdf', 'dsflkj')
    Popen(['notfouhd'], shell=True)
except Exception as e:
    print("here")


behaves differently from:

try:
    Popen(['notfouhd'], shell=True)
    copytree('sjkdf', 'dsflkj')
except Exception as e:
    print("here")

because if copytree fails it quits anyway.
I also looked at the code but can't quite get why.. any idea?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-12 Thread andrea crotti
One way instead that might actually work is this

def default_mock_action(func_name):
    def _default_mock_action(*args, **kwargs):
        print("running {} with args {} and {}".format(func_name, args, kwargs))

    return _default_mock_action


def mock_fs_actions(to_run):
    """Take a function to run, and run it in an environment which
    mocks all the possibly dangerous operations
    """
    side_effect = [
        'copytree',
        'copy',
    ]

    acts = dict((s, default_mock_action(s)) for s in side_effect)

    with patch('pytest.runner.commands.ShellCommand.run',
               default_mock_action('run')):
        with patch.multiple('shutil', **acts):
            to_run()


So I can just pass the main function inside the mock like
mock_fs_actions(main)

and it seems to do the job, but I have to list manually all the things
to mock and I'm not sure it's the best idea anyway..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-12 Thread andrea crotti
Well that's what I thought, but I can't find any explicit exit
anywhere in shutil, so what's going on there?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-12 Thread andrea crotti
2012/7/12 John Gordon :
> In  andrea crotti 
>  writes:
>
>> Well that's what I thought, but I can't find any explicit exit
>> anywhere in shutil, so what's going on there?
>
> Try catching SystemExit specifically (it doesn't inherit from Exception,
> so "except Exception" won't catch it.)
>
> --

Ah yes, that actually works, but I think it's quite dodgy; why was it
done like this?
In shutil there is still no mention of SystemExit, and trying to raise
the individual exceptions by hand doesn't make it exit, so I would still
like to know how it is possible, just out of curiosity..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: adding a simulation mode

2012-07-13 Thread andrea crotti
2012/7/13 Steven D'Aprano :

> Well of course it does. If copytree fails, the try block ends and
> execution skips straight to the except block, which runs, and then the
> program halts because there's nothing else to be done.
>
> That at least is my guess, based on the described symptoms.
>

Well, I think that's what I was stupidly missing: I always had only one
possibly failing thing in a try/except block, and I took for granted that
it doesn't jump to the except block on the first error, but of course it
makes more sense if it does...

Thanks a lot
-- 
http://mail.python.org/mailman/listinfo/python-list


assertraises behaviour

2012-07-16 Thread andrea crotti
I find the behaviour of assertRaises used as a context manager a
bit surprising.

This small example doesn't fail, but the OSError exception is caught
even if not declared..
Is this the expected behaviour (from the doc I would say it's not)?
(Running on arch-linux 64 bits and Python 2.7.3, but it does the same
with Python 3.2.3.)

import unittest

class TestWithRaises(unittest.TestCase):
    def test_ass(self):
        with self.assertRaises(AssertionError):
            assert False, "should happen"
            raise OSError("should give error")


if __name__ == '__main__':
    unittest.main()
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: assertraises behaviour

2012-07-16 Thread andrea crotti
2012/7/16 Christian Heimes :
>
> The OSError isn't catched as the code never reaches the line with "raise
> OSError". In other words "raise OSError" is never executed as the
> exception raised by "assert False" stops the context manager.
>
> You should avoid testing more than one line of code in a with
> self.assertRaises() block.
>
> Christian
>
> --
> http://mail.python.org/mailman/listinfo/python-list

Ok, now it's clearer, and normally I only test one thing in the block.
But I find this quite counter-intuitive, because in the block I want
to be able to run something that raises the exception I expect (and
which I'm aware of), but if another exception is raised it *should*
show it and fail, in my opinion (small example below)..
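
For example, keeping only the expected statement inside the block makes
the other exception visible again, which is the behaviour I was after
(minimal example, same test as before):

import unittest

class TestWithRaises(unittest.TestCase):
    def test_ass(self):
        # only the statement expected to raise goes inside the block
        with self.assertRaises(AssertionError):
            assert False, "should happen"
        # anything raised out here is reported as an error, as expected
        raise OSError("now the test errors out visibly")

if __name__ == '__main__':
    unittest.main()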
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: assertraises behaviour

2012-07-16 Thread andrea crotti
Good, thanks, but there is something that implements this behaviour..
For example nose runs all the tests, and if there are failures it goes
on and shows the failed tests only at the end, so I think it is
possible to achieve somehow, is that correct?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Diagramming code

2012-07-16 Thread Andrea Crotti

On 07/16/2012 02:26 AM, hamilton wrote:

Is there any software to help understand python code ?

Thanks

hamilton


Sometimes to get some nice graphs I use gprof2dot
(http://code.google.com/p/jrfonseca/wiki/Gprof2Dot)
or doxygen (http://www.stack.nl/~dimitri/doxygen/).

gprof2dot analyses the output of the profiling that you get by running the
code through the python profiler, doing for example:

python -m cProfile -o $STATS $FNAME $@
$GPROF2DOT -f pstats $STATS | dot -T$TYPE -o $OUT

doxygen is more useful for C++ but it's also able to infer a few things
(without running) from a python project..

--
http://mail.python.org/mailman/listinfo/python-list


Re: assertraises behaviour

2012-07-17 Thread andrea crotti
2012/7/16 Peter Otten <__pete...@web.de>:
> No, I don't see how the code you gave above can fail with an OSError.
>
> Can you give an example that produces the desired behaviour with nose? Maybe
> we can help you translate it to basic unittest.
>
> --
> http://mail.python.org/mailman/listinfo/python-list


Well this is what I meant:

import unittest

class TestWithRaises(unittest.TestCase):
    def test_first(self):
        assert False

    def test_second(self):
        print("also called")
        assert True

if __name__ == '__main__':
    unittest.main()

In this case the second test is also run even if the first fails..
But that's probably easy because we just need to catch exceptions for
every method call, so it's not exactly the same thing..
http://mail.python.org/mailman/listinfo/python-list


Re: assertraises behaviour

2012-07-17 Thread Andrea Crotti
To clarify my "problem": I just thought that assertRaises, if used as a
context manager, should behave as follows:

- keep going if the declared exception is raised
- re-raise any other exception raised after the declared exception was caught

I was also confused by the doc a bit:
"Test that an exception is raised when /callable/ is called with any 
positional or keyword arguments that are also passed to assertRaises() 
. 
The test passes if /exception/ is raised, is an error if another 
exception is raised, or fails if no exception is raised"


which speaks about when it's not used as a context manager..

I understand why it's not possible and it's not a big issue though..
-- 
http://mail.python.org/mailman/listinfo/python-list


reloading code and multiprocessing

2012-07-19 Thread andrea crotti
We need to be able to reload code on a live system.  This live system
has a daemon process always running but it runs many subprocesses with
multiprocessing, and the subprocesses might have a short life...

Now I found a way to reload the code successfully, as you can see from
this testcase:


def func():
    from . import a
    print(a.ret())


class TestMultiProc(unittest.TestCase):
    def setUp(self):
        open(path.join(cur_dir, 'a.py'), 'w').write(old_a)

    def tearDown(self):
        remove(path.join(cur_dir, 'a.py'))

    def test_reloading(self):
        """Starting a new process gives a different result
        """
        p1 = Process(target=func)
        p2 = Process(target=func)
        p1.start()
        res = p1.join()
        open(path.join(cur_dir, 'a.py'), 'w').write(new_a)
        remove(path.join(cur_dir, 'a.pyc'))

        p2.start()
        res = p2.join()


As long as I import the code inside the function and make sure to remove the
"pyc" files, everything seems to work..
Are there any possible problems with this approach that I'm not seeing, or
is it safe?

Any other better ways otherwise?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reloading code and multiprocessing

2012-07-23 Thread andrea crotti
2012/7/20 Chris Angelico :
> On Thu, Jul 19, 2012 at 8:15 PM, andrea crotti
>  wrote:
>> We need to be able to reload code on a live system.  This live system
>> has a daemon process always running but it runs many subprocesses with
>> multiprocessing, and the subprocesses might have a short life...
>> ...
>> As long as I import the code in the function and make sure to remove the
>> "pyc" files everything seems to work..
>> Are there any possible problems which I'm not seeing in this approach or
>> it's safe?
>
> Python never promises reloading reliability, but from my understanding
> of what you've done here, it's probably safe. However, you may find
> that you're using the wrong language for the job; it depends on how
> expensive it is to spin off all those processes and ship their work to
> them. But if that's not an issue, I'd say you have something safe
> there. (Caveat: I've given this only a fairly cursory examination, and
> I'm not an expert. Others may have more to say. I just didn't want the
> resident Markov chainer to be the only one to respond!!)
>
> ChrisA
> --
> http://mail.python.org/mailman/listinfo/python-list


Thanks Chris, always nice to get a "human" answer ;)

Anyway, the only other problem I found is that if I start the
subprocesses after many other things are initialised, it might happen
that the reloading doesn't work correctly, is that right?

Because sys.modules will get inherited by the subprocesses and it
will not reimport what has already been imported, as far as I
understood..

So either I make sure I import everything only where it is needed, or (and
maybe better and more explicit) I remove manually from sys.modules all
the modules that I want to reload (a sketch below), what do you think?
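
Something like this is what I mean by cleaning sys.modules before spawning
(just a sketch, the package name is made up):

import sys

def purge_modules(prefix):
    """Drop the already-imported modules under `prefix`, so that the
    subprocess re-imports them from the (possibly updated) source."""
    for name in list(sys.modules):
        if name == prefix or name.startswith(prefix + '.'):
            del sys.modules[name]

# purge_modules('mypackage.simulations')  # hypothetical package
# p = Process(target=func)
# p.start()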
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reloading code and multiprocessing

2012-07-25 Thread andrea crotti
2012/7/23 Chris Angelico :
>
> That would probably be correct. However, I still think you may be
> fighting against the language instead of playing to its strengths.
>
> I've never fiddled with sys.modules like that, but I know some have,
> without problem.
>
> ChrisA
> --
> http://mail.python.org/mailman/listinfo/python-list


I would also like to avoid this in general, but we have many
subprocesses to launch and some of them might take weeks, so we need
to have a process which is always running, because there is never a
point in time where we can just say let's stop everything and start again..

Anyway if there are better solutions I'm still glad to hear them, but
I would also like to keep it simple..

Another thing we now need to figure out is how to communicate
with the live process..  For example we might want to submit something
manually, which should go through the main process.

The first idea is to have a separate process that opens a socket and
listens for data on a local port, with a defined protocol.

Then the main process can parse these commands and run them.
Are there easier ways otherwise?
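
One stdlib thing I'm looking at for this is multiprocessing.connection,
which seems to already provide the listener/client part; a rough sketch
(address, port and authkey are made up):

from multiprocessing.connection import Listener, Client

ADDRESS = ('localhost', 6000)   # made-up local port
AUTHKEY = 'secret'              # shared secret between the two sides

# in the always-running daemon process:
def serve_commands(handle):
    listener = Listener(ADDRESS, authkey=AUTHKEY)
    while True:
        conn = listener.accept()
        command = conn.recv()       # any picklable object
        conn.send(handle(command))  # reply with the result
        conn.close()

# from the manual submission side:
def submit(command):
    conn = Client(ADDRESS, authkey=AUTHKEY)
    conn.send(command)
    result = conn.recv()
    conn.close()
    return result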
-- 
http://mail.python.org/mailman/listinfo/python-list


Dumping all the sql statements as backup

2012-07-25 Thread andrea crotti
I have some long-running processes that do very long simulations and
at the end need to write things to a database.

At the moment there are sometimes network problems and we end up with
half the data in the database.

The half-data problem is probably solved easily with sessions and
sqlalchemy (a db transaction), but we would still like to be able to
keep a backup SQL file in case something goes badly wrong and we want to
re-run it manually..

This might also be useful if we have to roll back the db to a previous day
for some reason and we don't want to re-run the simulations..

Anyone did something similar?
It would be nice to do something like:

with CachedDatabase('backup.sql'):
    # do all your things
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Dumping all the sql statements as backup

2012-07-25 Thread andrea crotti
2012/7/25 Jack :
> Since you know the content of what the sql code is, why not just build
> the sql file(s) needed and store them so that in case of a burp you can
> just execute the code file. If you don't know the exact sql code, dump
> it to a file as the statements are constructed... The only problem you
> would run into in this scenario is duplicate data, which is also easily
> solvable by using transaction-level commits to the db.
> --
> http://mail.python.org/mailman/listinfo/python-list


Yes, but how do I construct them with SQLAlchemy?
One possible option I found is to enable the logging of some parts of
SQLAlchemy and use that log (echo=True in create_engine does
something similar), but maybe there is a better option (sketch below)..

But I probably need to filter only the insert/update/delete statements..

And in general the processes have to run independently, so in case of
database connection problems I would just let them retry until it
actually works.

When the transaction actually works, then in the backed-up log I can
add a marker (or archive the log), to avoid replaying it.
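
A rough sketch of the logging idea (file name made up): route the
'sqlalchemy.engine' logger to a file and filter it later.

import logging

# route the SQL statements emitted by the engine to a backup log file
handler = logging.FileHandler('backup.sql.log')
handler.setLevel(logging.INFO)

sqla_logger = logging.getLogger('sqlalchemy.engine')
sqla_logger.setLevel(logging.INFO)   # INFO level logs the SQL statements
sqla_logger.addHandler(handler)

# the INSERT/UPDATE/DELETE lines can then be grepped out of backup.sql.log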
-- 
http://mail.python.org/mailman/listinfo/python-list


regexps to objects

2012-07-27 Thread andrea crotti
I have some complex input to parse (with regexps), and I would like to
create nice objects directly from it.
The re module of course doesn't try to convert to any type, so I was
playing around to see if it's worth doing something like the below, where I
assign a constructor to every regexp and build an object from the
result..

Do you think it makes sense in general or how do you cope with this problem?

import re
from time import strptime

TIME_FORMAT_INPUT = '%m/%d/%Y %H:%M:%S'

def time_string_to_obj(timestring):
    return strptime(timestring, TIME_FORMAT_INPUT)


REGEXPS = {
    'num': ('\d+', int),
    'date': ('[0-9/]+ [0-9:]+', time_string_to_obj),
}


def reg_to_obj(reg, st):
    reg, constr = reg
    found = re.match(reg, st)
    return constr(found.group())


if __name__ == '__main__':
    print reg_to_obj(REGEXPS['num'], '100')
    print reg_to_obj(REGEXPS['date'], '07/24/2012 06:23:13')
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reloading code and multiprocessing

2012-07-27 Thread andrea crotti
2012/7/25 andrea crotti :
>
> I would also like to avoid this in general, but we have many
> subprocesses to launch and some of them might take weeks, so we need
> to have a process which is always running, because there is never a
> point in time where we can just say let's stop everything and start again..
>
> Anyway if there are better solutions I'm still glad to hear them, but
> I would also like to keep it simple..
>
> Another thing which now we need to figure out is how to communicate
> with the live process..  For example we might want to submit something
> manually, which should pass from the main process.
>
> The first idea is to have a separate process that opens a socket and
> listens for data on a local port, with a defined protocol.
>
> Then the main process can parse these commands and run them.
> Are there easier ways otherwise?


So I was trying to do this, removing the module from sys.modules and
starting a new process (after modifying the file), but it doesn't work
as I expected.
The last assertion fails, but how?

The pyc file is not generated, the module is actually not in
sys.modules, and the function in the subprocess doesn't fail
but still returns the old value.
Any idea?

old_a = "def ret(): return 0"
new_a = "def ret(): return 1"


def func_no_import(queue):
    queue.put(a_glob.ret())


class TestMultiProc(unittest.TestCase):

    def test_reloading_with_global_import(self):
        """In this case the import is done before the process are started,
        so we need to clean sys.modules to make sure we reload everything
        """
        queue = Queue()
        open(path.join(CUR_DIR, 'old_a.py'), 'w').write(old_a)

        p1 = Process(target=func_no_import, args=(queue, ))
        p1.start()
        p1.join()
        self.assertEqual(queue.get(), 0)

        open(path.join(CUR_DIR, 'old_a.py'), 'w').write(new_a)
        del sys.modules['auto_tester.tests.a_glob']

        p2 = Process(target=func_no_import, args=(queue, ))
        p2.start()
        p2.join()
        self.assertEqual(queue.get(), 1)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: py2c - an open source Python to C/C++ is looking for developers

2012-07-30 Thread andrea crotti
2012/7/30  :
> I created py2c ( http://code.google.com/p/py2c )- an open source Python to 
> C/C++ translator!
> py2c is looking for developers!
> To join create a posting in the py2c-discuss Google Group or email me!
> Thanks
> PS:I hope this is the appropiate group for this message.
> --
> http://mail.python.org/mailman/listinfo/python-list

It looks like a very, very hard task; is it really useful or just an
exercise?

The first few lines I've seen there have the dangerous * imports, and
LazyStrin looks like a typo..

from ast import *
import functools
from c_types import *
from lazystring import *
#constant data
empty = LazyStrin
ordertuple = ((Or,),(And
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-07-31 Thread andrea crotti
>
>
> def procs():
>     mp = MyProcess()
>     # with the join we are actually waiting for the end of the running time
>     mp.add([1,2,3])
>     mp.start()
>     mp.add([2,3,4])
>     mp.join()
>     print(mp)
>

I think I got it now: if I mix start() with another add(), the code
inside Process.run won't see the new data that has been
added after the start.

So this way is perfectly safe only until the process is launched; if
it's running I need to use some multiprocess-aware data structure, is
that correct?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-07-31 Thread andrea crotti
2012/7/31 Laszlo Nagy :
>> I think I got it now, if I already just mix the start before another add,
>> inside the Process.run it won't see the new data that has been added after
>> the start. So this way is perfectly safe only until the process is launched,
>> if it's running I need to use some multiprocess-aware data structure, is
>> that correct?
>
> Yes. Read this:
>
> http://docs.python.org/library/multiprocessing.html#exchanging-objects-between-processes
>
> You can use Queues and Pipes. Actually, these are basic elements of the
> multiprocessing module and they are well documented. I wonder if you read
> the documentation at all, before posting questions here.
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list


As I wrote, "I found many nice things (Pipe, Manager and so on), but
actually even this seems to work:"; yes, I did read the documentation.

I was just surprised that it worked better than I expected even
without Pipes and Queues, but now I understand why..

Anyway, now I would like to be able to detach subprocesses to avoid the
nasty code reloading that I was talking about in another thread, but
things get more tricky, because I can't use queues and pipes to
communicate with a running process that is not my child, correct?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy :
>> I was just surprised that it worked better than I expected even
>> without Pipes and Queues, but now I understand why..
>>
>> Anyway now I would like to be able to detach subprocesses to avoid the
>> nasty code reloading that I was talking about in another thread, but
>> things get more tricky, because I can't use queues and pipes to
>> communicate with a running process that it's noit my child, correct?
>>
> Yes, I think that is correct. Instead of detaching a child process, you can
> create independent processes and use other frameworks for IPC. For example,
> Pyro.  It is not as effective as multiprocessing.Queue, but in return, you
> will have the option to run your service across multiple servers.
>
> The most effective IPC is usually through shared memory. But there is no OS
> independent standard Python module that can communicate over shared memory.
> Except multiprocessing of course, but AFAIK it can only be used to
> communicate between fork()-ed processes.


Thanks, there is another thing which in theory is able to interact with
running processes:
https://github.com/lmacken/pyrasite

I don't know though if it's a good idea to use a similar approach for
production code; as far as I understood it uses gdb..  In theory
though I could set up every subprocess with all the data
it needs, so I might not even need to share data between them.

Anyway, now I had another idea to be able to stop the main
process without killing the subprocesses, using multiple forks.  Does
the following make sense?  I don't really need these subprocesses to
be daemons since they should quit when done, but is there anything
that can go wrong with this approach?

from os import fork
from time import sleep
from itertools import count
from sys import exit

from multiprocessing import Process, Queue


class LongProcess(Process):
    def __init__(self, idx, queue):
        Process.__init__(self)
        # self.daemon = True
        self.queue = queue
        self.idx = idx

    def run(self):
        for i in count():
            self.queue.put("%d: %d" % (self.idx, i))
            print("adding %d: %d" % (self.idx, i))
            sleep(2)


if __name__ == '__main__':
    qu = Queue()

    # how do I do a multiple fork?
    for i in range(5):
        pid = fork()
        # if I create here all the data structures I should still be
        # able to do things
        if pid == 0:
            lp = LongProcess(1, qu)
            lp.start()
            lp.join()
            exit(0)
        else:
            print("started subprocess with pid ", pid)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy :
> On thing is sure: os.fork() doesn't work under Microsoft Windows. Under
> Unix, I'm not sure if os.fork() can be mixed with
> multiprocessing.Process.start(). I could not find official documentation on
> that.  This must be tested on your actual platform. And don't forget to use
> Queue.get() in your test. :-)
>

Yes, I know, we don't care about Windows for this particular project..
I think mixing multiprocessing and fork should do no harm, but it is
probably unnecessary since I'm already in another process after the fork,
so I can just make it run what I want.

Otherwise, is there a way to do the same thing using only multiprocessing?
(running a process that is detachable from the process that created it)
-- 
http://mail.python.org/mailman/listinfo/python-list


CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
We're having some really obscure problems with gzip.
There is a program running with python2.7 on a 2.6.18-128.el5xen (red
hat I think) kernel.

Now this program does the following:

if filename == 'out2.txt':
    out2 = open('out2.txt')
elif filename == 'out2.txt.gz':
    out2 = open('out2.txt.gz')

text = out2.read()

out2.close()

very simple right? But sometimes we get a checksum error.
Reading the code I got the following:

 - CRC is at the end of the file and is computed against the whole
file (last 8 bytes)
 - after the CRC there is the \ marker for the EOF
 - readline() doesn't trigger the checksum generation in the
beginning, but only when the EOF is reached
 - until a file is flushed or closed you can't read the new content in it

But the problem is that we can't reproduce it: doing it
manually on the same files works perfectly,
and the same files sometimes work and sometimes don't.

The files are on a shared NFS drive, so I'm starting to think that it's a
network/fs problem, which might truncate the file,
adding an EOF before the end and thus making the checksum fail..
But is that possible?
Or what else could it be?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy :
> On 2012-08-01 12:39, andrea crotti wrote:
>>
>> We're having some really obscure problems with gzip.
>> There is a program running with python2.7 on a 2.6.18-128.el5xen (red
>> hat I think) kernel.
>>
>> Now this program does the following:
>> if filename == 'out2.txt':
>>   out2 = open('out2.txt')
>> elif filename == 'out2.txt.gz'
>>   out2 = open('out2.txt.gz')
>
> Gzip file is binary. You should open it in binary mode.
>
> out2 = open('out2.txt.gz',"b")
>
> Otherwise carriage return and newline characters will be converted
> (depending on the platform).
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list


Ah no, sorry, I just wrote that part of the code wrong; it was
out2 = gzip.open('out2.txt.gz'), because otherwise nothing would possibly work..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
Full traceback:

Exception in thread Thread-8:
Traceback (most recent call last):
  File "/user/sim/python/lib/python2.7/threading.py", line 530, in
__bootstrap_inner
self.run()
  File "/user/sim/tests/llif/AutoTester/src/AutoTester2.py", line 67, in run
self.processJobData(jobData, logger)
  File "/user/sim/tests/llif/AutoTester/src/AutoTester2.py", line 204,
in processJobData
self.run_simulator(area, jobData[1] ,log)
  File "/user/sim/tests/llif/AutoTester/src/AutoTester2.py", line 142,
in run_simulator
report_file, percentage, body_text = SimResults.copy_test_batch(log, area)
  File "/user/sim/tests/llif/AutoTester/src/SimResults.py", line 274,
in copy_test_batch
out2_lines = out2.read()
  File "/user/sim/python/lib/python2.7/gzip.py", line 245, in read
self._read(readsize)
  File "/user/sim/python/lib/python2.7/gzip.py", line 316, in _read
self._read_eof()
  File "/user/sim/python/lib/python2.7/gzip.py", line 338, in _read_eof
hex(self.crc)))
IOError: CRC check failed 0x4f675fba != 0xa9e45aL


- The file is written with the linux gzip program.
- No, I can't reproduce the error with the same exact file that failed,
  that's what is really puzzling; there seems to be no clear pattern, it
  just randomly fails. The file is also only opened for reading by this
  program, so in theory there is no way that it can be corrupted.

  I also checked with lsof whether there are processes that opened it, but
  nothing appears..

- Can't really try on the local disk, it might take ages unfortunately
  (we are rewriting this system from scratch anyway).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Roy Smith :
> In article ,
>  Laszlo Nagy  wrote:
>
>> Yes, I think that is correct. Instead of detaching a child process, you
>> can create independent processes and use other frameworks for IPC. For
>> example, Pyro.  It is not as effective as multiprocessing.Queue, but in
>> return, you will have the option to run your service across multiple
>> servers.
>
> You might want to look at beanstalk (http://kr.github.com/beanstalkd/).
> We've been using it in production for the better part of two years.  At
> a 30,000 foot level, it's an implementation of queues over named pipes
> over TCP, but it takes care of a zillion little details for you.
>
> Setup is trivial, and there's clients for all sorts of languages.  For a
> Python client, go with beanstalkc (pybeanstalk appears to be
> abandonware).
>>
>> The most effective IPC is usually through shared memory. But there is no
>> OS independent standard Python module that can communicate over shared
>> memory.
>
> It's true that shared memory is faster than serializing objects over a
> TCP connection.  On the other hand, it's hard to imagine anything
> written in Python where you would notice the difference.
> --
> http://mail.python.org/mailman/listinfo/python-list


That does look nice and I would like to have something like that..
But since I have to convince my boss of another external dependency, I
think it might be worth trying out zeromq instead, which can also do
similar things and looks more powerful, what do you think?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy :
>>there seems to be no clear pattern and just randmoly fails. The file
>> is also just open for read from this program,
>>so in theory no way that it can be corrupted.
>
> Yes, there is. Gzip stores CRC for compressed *blocks*. So if the file is
> not flushed to the disk, then you can only read a fragment of the block, and
> that changes the CRC.
>
>>
>>I also checked with lsof if there are processes that opened it but
>> nothing appears..
>
> lsof doesn't work very well over nfs. You can have other processes on
> different computers (!) writting the file. lsof only lists the processes on
> the system it is executed on.
>
>>
>> - can't really try on the local disk, might take ages unfortunately
>> (we are rewriting this system from scratch anyway)
>>
>


Thanks a lot, someone writing to the file while it is being read might be
an explanation; the problem is that everyone claims that they are only
reading the file.

Apparently this file is generated once and only read a long time later
by two different tools (in sequence), so this should not be possible
either, in theory.. I'll try to investigate more in this direction since
it's the only reasonable explanation I've found so far.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pass data to a subprocess

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy :
>
> So detaching the child process will not make IPC stop working. But exiting
> from the original parent process will. (And why else would you detach the
> child?)
>
> --
> http://mail.python.org/mailman/listinfo/python-list


Well, it makes perfect sense to me that it stops working, so either:
- I use zeromq or something similar to communicate, or
- I make every process independent, without the need to further
communicate with the parent..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Steven D'Aprano :
> On Wed, 01 Aug 2012 14:01:45 +0100, andrea crotti wrote:
>
>> Full traceback:
>>
>> Exception in thread Thread-8:
>
> "DANGER DANGER DANGER WILL ROBINSON!!!"
>
> Why didn't you say that there were threads involved? That puts a
> completely different perspective on the problem.
>
> I *was* going to write back and say that you probably had either file
> system corruption, or network errors. But now that I can see that you
> have threads, I will revise that and say that you probably have a bug in
> your thread handling code.
>
> I must say, Andrea, your initial post asking for help was EXTREMELY
> misleading. You over-simplified the problem to the point that it no
> longer has any connection to the reality of the code you are running.
> Please don't send us on wild goose chases after bugs in code that you
> aren't actually running.
>
>
>>   there seems to be no clear pattern and just randmoly fails.
>
> When you start using threads, you have to expect these sorts of
> intermittent bugs unless you are very careful.
>
> My guess is that you have a bug where two threads read from the same file
> at the same time. Since each read shares state (the position of the file
> pointer), you're going to get corruption. Because it depends on timing
> details of which threads do what at exactly which microsecond, the effect
> might as well be random.
>
> Example: suppose the file contains three blocks A B and C, and a
> checksum. Thread 8 starts reading the file, and gets block A and B. Then
> thread 2 starts reading it as well, and gets half of block C. Thread 8
> gets the rest of block C, calculates the checksum, and it doesn't match.
>
> I recommend that you run a file system check on the remote disk. If it
> passes, you can eliminate file system corruption. Also, run some network
> diagnostics, to eliminate corruption introduced in the network layer. But
> I expect that you won't find anything there, and the problem is a simple
> thread bug. Simple, but really, really hard to find.
>
> Good luck.
>

Thanks a lot, that makes a lot of sense..  I hadn't given this detail
before because I didn't write this code, and I completely forgot that there
were threads involved; I'm just trying to help fix this bug.

Your explanation makes a lot of sense, but it's still surprising that
even just reading files, without ever writing them, can cause trouble
when using threads :/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-01 Thread andrea crotti
2012/8/1 Laszlo Nagy :
>
>> Thanks a lot, that makes a lot of sense..  I haven't given this detail
>> before because I didn't write this code, and I forgot that there were
>> threads involved completely, I'm just trying to help to fix this bug.
>>
>> Your explanation makes a lot of sense, but it's still surprising that
>> even just reading files without ever writing them can cause troubles
>> using threads :/
>
> Make sure that file objects are not shared between threads. If that is
> possible. It will probably solve the problem (if that is related to
> threads).


Well, I just have to create a lock I guess, right?

with lock:
    # open file
    # read content
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-02 Thread andrea crotti
2012/8/1 Steven D'Aprano :
>
> When you start using threads, you have to expect these sorts of
> intermittent bugs unless you are very careful.
>
> My guess is that you have a bug where two threads read from the same file
> at the same time. Since each read shares state (the position of the file
> pointer), you're going to get corruption. Because it depends on timing
> details of which threads do what at exactly which microsecond, the effect
> might as well be random.
>
> Example: suppose the file contains three blocks A B and C, and a
> checksum. Thread 8 starts reading the file, and gets block A and B. Then
> thread 2 starts reading it as well, and gets half of block C. Thread 8
> gets the rest of block C, calculates the checksum, and it doesn't match.
>
> I recommend that you run a file system check on the remote disk. If it
> passes, you can eliminate file system corruption. Also, run some network
> diagnostics, to eliminate corruption introduced in the network layer. But
> I expect that you won't find anything there, and the problem is a simple
> thread bug. Simple, but really, really hard to find.
>
> Good luck.

One last thing I would like to do before I add this fix is to actually
be able to reproduce this behaviour, and I thought I could just do the
following:

import gzip
import threading


class OpenAndRead(threading.Thread):
    def run(self):
        fz = gzip.open('out2.txt.gz')
        fz.read()
        fz.close()


if __name__ == '__main__':
    for i in range(100):
        OpenAndRead().start()


But no matter how many threads I start, I can't reproduce the CRC
error; any idea how I can make it happen?

The code in run should be shared by all the threads since there are no
locks, right?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-02 Thread andrea crotti
2012/8/2 Laszlo Nagy :
>
> Your example did not share the file object between threads. Here an example
> that does that:
>
> class OpenAndRead(threading.Thread):
> def run(self):
> global fz
> fz.read(100)
>
> if __name__ == '__main__':
>
>fz = gzip.open('out2.txt.gz')
>for i in range(10):
> OpenAndRead().start()
>
> Try this with a huge file. And here is the one that should never throw CRC
> error, because the file object is protected by a lock:
>
> class OpenAndRead(threading.Thread):
> def run(self):
> global fz
> global fl
> with fl:
> fz.read(100)
>
> if __name__ == '__main__':
>
>fz = gzip.open('out2.txt.gz')
>fl = threading.Lock()
>for i in range(2):
> OpenAndRead().start()
>
>
>>
>> The code in run should be shared by all the threads since there are no
>> locks, right?
>
> The code is shared but the file object is not. In your example, a new file
> object is created, every time a thread is started.
>


Ok sure that makes sense, but then this explanation is maybe not right
anymore, because I'm quite sure that the file object is *not* shared
between threads, everything happens inside a thread..

I managed to get some errors doing this with a big file:

class OpenAndRead(threading.Thread):
    def run(self):
        global fz
        fz.read(100)

if __name__ == '__main__':
    fz = gzip.open('bigfile.avi.gz')
    for i in range(20):
        OpenAndRead().start()

and it doesn't fail without the *global*, but this is definitely not
what the code does, because every thread gets a new file object, it's
not shared..

Anyway we'll read once for all the threads or add the lock, and
hopefully it should solve the problem, even if I'm not convinced yet
that it was this.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CRC-checksum failed in gzip

2012-08-02 Thread andrea crotti
2012/8/2 andrea crotti :
>
> Ok sure that makes sense, but then this explanation is maybe not right
> anymore, because I'm quite sure that the file object is *not* shared
> between threads, everything happens inside a thread..
>
> I managed to get some errors doing this with a big file
> class OpenAndRead(threading.Thread):
>  def run(self):
>  global fz
>  fz.read(100)
>
> if __name__ == '__main__':
>
> fz = gzip.open('bigfile.avi.gz')
> for i in range(20):
>  OpenAndRead().start()
>
> and it doesn't fail without the *global*, but this is definitively not
> what the code does, because every thread gets a new file object, it's
> not shared..
>
> Anyway we'll read once for all the threads or add the lock, and
> hopefully it should solve the problem, even if I'm not convinced yet
> that it was this.


Just for completeness as suggested this also does not fail:

class OpenAndRead(threading.Thread):
    def __init__(self, lock):
        threading.Thread.__init__(self)
        self.lock = lock

    def run(self):
        global fz
        with self.lock:
            fz.read(100)

if __name__ == '__main__':
    lock = threading.Lock()
    fz = gzip.open('bigfile.avi.gz')
    for i in range(20):
        OpenAndRead(lock).start()
-- 
http://mail.python.org/mailman/listinfo/python-list


Sharing code between different projects?

2012-08-13 Thread andrea crotti
I am in the situation where I am working on different projects that
might potentially share a lot of code.

I started to work on project A, then switched completely to project B
and in the transiction I copied over a lot of code with the
corresponding tests, and I started to modify it.

Now it's time to work again on project A, but I don't want to copy
things over again.

I would like to design a simple and nice way to share between projects,
where the things I want to share are simple but useful things as for
example:

from os import getcwd, chdir
from tempfile import mkdtemp
from shutil import rmtree


class TempDirectory:
    """Create a temporary directory and cd to it on enter, cd back to
    the original position and remove it on exit
    """
    def __init__(self):
        self.oldcwd = getcwd()
        self.temp_dir = mkdtemp()

    def __enter__(self):
        logger.debug("create and move to temp directory %s" % self.temp_dir)
        chdir(self.temp_dir)
        return self.temp_dir

    def __exit__(self, type, value, traceback):
        # first move back out of the directory that is about to be removed
        chdir(self.oldcwd)
        logger.debug("removing the temporary directory and going back "
                     "to the original position %s" % self.temp_dir)
        rmtree(self.temp_dir)


The problem is that there are functions/classes from many domains, so it
would not make much sense to create a real project, and the only name I
could give might be "utils or utilities"..

In addition, the moment the code is shared I must take care of versioning
and of how to link the different pieces together (we use perforce by the way).

If someone other than me wants to use these functions then of course
I'll have to be extra careful, designing really good APIs and so on, so
I'm wondering where I should set the trade-off between the ability to
share and the burden of maintaining it..

Anyone has suggestions/real world experiences about this?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-14 Thread andrea crotti
2012/8/13 Rob Day :
> I'd just create a module - called shared_utils.py or similar - and import
> that in both projects. It might be a bit messy if there's no 'unifying
> theme' to the module - but surely it'd be a lot less messy than your
> TempDirectory class, and anyone else who knows Python will understand
> 'import shared_utils' much more easily.
>
> I realise you might not want to say, but if you could give some idea what
> sort of projects these are, and what sorts of code you're trying to share,
> it might make things a bit clearer.
>
> I'm not really sure what your concerns about 'versioning and how to link
> different pieces together' are - what d you think could go wrong here?
>

It's actually not so simple..

Because the two projects live in different parts of the repository
with different people allowed to work on them, and they have to run on
different machines..

Plus I'm using perforce, which doesn't have any svn:externals-like
thing as far as I know..  What I should probably do is set up a
workspace (which contains *absolute* paths of the machines) with the
right settings to make the module available in the right position.

Second problem is that one of the two projects has a quite insane
requirement, which is to be able to re-run itself on a specific
version depending on a value fetched from the database.

This becomes harder if I spread the code around, but in theory I can use the
changeset number, which is like an SVN revision, so this should be fine.

The third problem is that the moment it's not just me using these
things, how can I be sure that changing something will not break
someone else's code?

I have unit tests on both projects plus the tests for the utils, but
as soon as I separate them it becomes harder to test everything..

So well everything can have a solution probably, I just hope it's
worth the effort..

Another thing which would be quite cool might be a import hook which
fetches things from the repository when needed, with a simple
bootstrap script for every project to be able to use this feature, but
it only makes sense if I need this kind of feature in many projects.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-14 Thread andrea crotti
2012/8/14 Jean-Michel Pichavant :
>
> I can think of logilab-common (http://www.logilab.org/848/)
>
> Having a company-wide python module properly distributed is one to achieve
> your goal. Without distributing your module to the public, there's a way to
> have a pypi-like server runnning on your private network :
>
> http://pypi.python.org/pypi/pypiserver/
>
> JM
>
> Note : looks like pypi.python.org is having some trouble, the above link is
> broken. Search for recent announcement about pypiserver.
>


Thanks, yes we need something like this..
I'll copy the name probably, I prefer "common" to "utils/utilities"..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-15 Thread andrea crotti
2012/8/14 Cameron Simpson :
>
> Having just skimmed this thread, one thing I haven't quite seen suggested is
> this:
>
> Really do make a third "utilities" project, and treat "the project" and
> "deploy" as separate notions. So to actually run/deploy project A's code
> you'd have a short script that copied project A and the utilities project
> code into a tree and ran off that. Or even a simple process/script to
> update the copy of "utilities" in "project A"'s area.
>
> So you don't "share" code on an even handed basis but import the
> "utilities" library into each project as needed.
>
> I do this (one my own very small scale) in one of two ways:
>
>   - as needed, copy the desired revision of utilities into the project's
> library space and do perforce's equivalent of Mercurial's addremove
> on that library tree (comment "update utilities to revision X").
>
>   - keep a perforce work area for the utilities in your project A area,
> where your working project A can hook into it with a symlink or some
> deploy/copy procedure as suggested above.
> With this latter one you can push back into the utilities library
> from your "live" project, because you have a real checkout. So:
>
>   projectAdir
> projectA-perforce-checkout
> utilities-perforce-checkout
>   projectBdir
> projectB-perforce-checkout
> utilities-perforce-checkout
>

Thanks, that's more or less what I was going to do..  But I would not use
symlinks and similar things, because then every user would have to set
them up accordingly.

Potentially we could instead use the perforce API to change the
workspace mappings at run-time, and thus "force" perforce to checkout
the files in the right place..

There is still the problem that people would have to check out things from
two places all the time instead of one..

> Personally I become more and more resistent to cut/paste even for small
> things as soon as multiple people use it; you will never get to backport
> updates to even trivial code to all the copies.
>
> Cheers,


Well sure, but on the other hand as soon as multiple people use it you
can't change any of the public function signatures without being
afraid that you'll break something..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-15 Thread andrea crotti
Also looking at logilab-common I thought that it would be great if we
could actually make this "common" library open source, and use it
just like any of the many other external libraries.

Since Python code is definitely not the core business of this
company I might even convince them, but the problem is that then all
the internal people working on it would not be able to use the
standard tools that they use with everything else..

Did anyone manage to convince his company to do something similar?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-16 Thread andrea crotti
2012/8/16 Jean-Michel Pichavant :
>
> SVN allows to define external dependencies, where one repository will
> actually checkout another one at a specific version. If SVN does it, I guess
> any decent SCM also provide such feature.
>
> Assuming our project is named 'common', and you have 2 projects A and B :
>
> A
>- common@rev1
>
> B
>- common@rev2
>
> Project A references the lib as "A.common", B as "B.common". You need to be
> extra carefull to never reference common as 'common' in any place.
>
> JM
>


Unfortunately I think you guessed wrong:
http://forums.perforce.com/index.php?/topic/553-perforce-svnexternals-equivalent/
Anyway with views and similar things it's not that hard to implement the
same behaviour..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to call perl script from html using python

2012-08-16 Thread andrea crotti
2012/8/16 Pervez Mulla :
>
> Hey Steven ,
>
> Thank you for your response,
>
> I will in detail now about my project,
>
> Actually the project entire backend in PERL language , Am using Django 
> framework for my front end .

> I have written code for signup page in python , which is working perfectly .
>
> In HTml when user submit POST method, it calling Python code Instead of 
> this I wanna call perl script for sign up ..
>
> below in form for sign up page in python 

Good, that's finally an explanation: the question you could have asked Google
is "how do I call an external process from Python",
which has absolutely nothing to do with HTML, and is very easy to find
out (hint: subprocess).
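
To give a starting point, a minimal sketch (the script path and arguments
are made up, adjust them to your actual Perl script):

import subprocess

# hypothetical path and arguments for the Perl signup script
cmd = ['perl', '/path/to/signup.pl', '--user', 'someuser']
output = subprocess.check_output(cmd)   # raises CalledProcessError on non-zero exit
print(output)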
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sharing code between different projects?

2012-08-16 Thread andrea crotti
2012/8/16 andrea crotti :
>
>
> Unfortunately I think you guess wrong
> http://forums.perforce.com/index.php?/topic/553-perforce-svnexternals-equivalent/
> Anyway with views and similar things is not that hard to implement the
> same thing..


I'm very happy to say that I finally made it!

It took 3 hours to move / merge a few thousand lines around but
everything seems to work perfectly now..

At the moment I'm just using symlinks, I'll see later if something
smarter is necessary, thanks to everyone for the ideas.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why doesn't Python remember the initial directory?

2012-08-20 Thread andrea crotti
2012/8/20 kj :
> In  Roy Smith  
> writes> This means that no library code can ever count on, for example,
> being able to reliably find the path to the file that contains the
> definition of __main__.  That's a weakness, IMO.  One manifestation
> of this weakness is that os.chdir breaks inspect.getmodule, at
> least on Unix.  If you have some Unix system handy, you can try
> the following.  First change the argument to os.chdir below to some
> valid directory other than your working directory.  Then, run the
> script, making sure that you refer to it using a relative path.
> When I do this on my system (OS X + Python 2.7.3), the script bombs
> at the last print statement, because the second call to inspect.getmodule
> (though not the first one) returns None.
>
> import inspect
> import os
>
> frame = inspect.currentframe()
>
> print inspect.getmodule(frame).__name__
>
> os.chdir('/some/other/directory') # where '/some/other/directory' is
>   # different from the initial directory
>
> print inspect.getmodule(frame).__name__
>
> ...
>
> % python demo.py
> python demo.py
> __main__
> Traceback (most recent call last):
>   File "demo.py", line 11, in 
> print inspect.getmodule(frame).__name__
> AttributeError: 'NoneType' object has no attribute '__name__'
>
>..

As in many other cases the programming language can't possibly act
safely on all the possible stupid things that the programmer wants to
do, and not understanding how an operating system works doesn't help
either..

In this specific case there is absolutely no need for os.chdir, since you
can:
- use absolute paths
- use things like subprocess.Popen, which accept a cwd argument
- at worst, chdir back to the previous position right after running the
broken thing that requires a certain path
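
For example, for the second point (a sketch, 'sometool' is just a made-up
command that insists on writing into its working directory):

import subprocess

# run the tool with /some/other/directory as its working directory,
# without ever changing the cwd of the Python process itself
subprocess.check_call(['sometool'], cwd='/some/other/directory')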
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why doesn't Python remember the initial directory?

2012-08-20 Thread andrea crotti
2012/8/20 Roy Smith :
> In article ,
>  Walter Hurry  wrote:
>
>> It is difficult to think of a sensible use for os.chdir, IMHO.
>
> It is true that you can mostly avoid chdir() by building absolute
> pathnames, but it's often more convenient to just cd somewhere and use
> names relative to that.  Fabric (a very cool tool for writing remote
> sysadmin scripts), gives you a cd() command which is a context manager,
> making it extra convenient.
>
> Also, core files get created in the current directory.  Sometimes
> daemons will cd to some fixed location to make sure that if they dump
> core, it goes in the right place.
>
> On occasion, you run into (poorly designed, IMHO) utilities which insist
> of reading or writing a file in the current directory.  If you're
> invoking one of those, you may have no choice but to chdir() to the
> right place before running them.
> --
> http://mail.python.org/mailman/listinfo/python-list


I've done quite a lot of system programming as well, and changing
directory is generally just a source of potential trouble.

If I really have to for some reason, I do this:

from os import getcwd, chdir


class TempCd:
    """Temporarily change the current directory
    """
    def __init__(self, newcwd):
        self.newcwd = newcwd
        self.oldcwd = getcwd()

    def __enter__(self):
        chdir(self.newcwd)
        return self

    def __exit__(self, type, value, traceback):
        chdir(self.oldcwd)


with TempCd('/tmp'):
    pass  # now working in /tmp

# back in the original directory

So it's not that hard to avoid problems..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pyQT performance?

2012-09-10 Thread Andrea Crotti

On 09/10/2012 07:29 PM, jayden.s...@gmail.com wrote

Have you ever used py2exe? After converting the python codes to executable, 
does it save the time of interpreting the script language? Thank a lot!


Py2exe normally never speeds up anything, simply because it doesn't
compile your code to native machine code, it just packages the interpreter
and your scripts together; I haven't tried it in this particular case,
but it shouldn't make a difference..

--
http://mail.python.org/mailman/listinfo/python-list


main and dependent objects

2012-09-13 Thread andrea crotti
I am in a situation where I have a class Obj which contains many
attributes, and also contains logically another object of class
Dependent.

This dependent_object, however, also needs to access many fields of the
original class, so at the moment we did something like this:


class Dependent:
def __init__(self, orig):
self.orig = orig

def using_other_attributes(self):
print("Using attr1", self.orig.attr1)


class Obj:
def __init__(self):
self.attr1 = "attr1"
self.attr2 = "attr2"
self.attr3 = "attr3"

self.dependent_object = Dependent(self)


But I'm not so sure it's a good idea, it's a bit smelly..
Any other suggestion about how to get a similar result?

I could of course pass all the needed arguments to the constructor of
Dependent, but it's a bit tedious..


Thanks,
Andrea
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: main and dependent objects

2012-09-13 Thread andrea crotti
2012/9/13 Jean-Michel Pichavant :
>
> Nothing shocking right here imo. It looks like a classic parent-child 
> implementation.
> However it seems the relation between Obj and Dependent are 1-to-1. Since 
> Dependent need to access all Obj attributes, are you sure that Dependent and 
> Obj are not actually the same class ?
>
>
> JM

Yes, well, the main class is already big enough, and the relation is 1-1,
but the dependent class can also be considered separate, to split
things up more nicely..

So I think it will stay like this for now and we'll see how it goes.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-13 Thread andrea crotti
2012/9/13 William R. Wing (Bill Wing) :
>
> [byte]
>
> Speaking from experience as both a presenter and an audience member, please 
> be sure that anything you demo interactively you include in your slide deck 
> (even if only as an addendum).  I assume your audience will have access to 
> the deck after your talk (on-line or via hand-outs), and you want them to be 
> able to go home and try it out for themselves.
>
> Nothing is more frustrating than trying to duplicate something you saw a 
> speaker do, and fail because of some detail you didn't notice at the time of 
> the talk.  A good example is one that was discussed on the matplotlib-users 
> list several weeks ago:
>
> http://www.loria.fr/~rougier/teaching/matplotlib/
>
> -Bill


Yes that's a good point, thanks; in general everything is already in a
git repository, for now only in my dropbox, but later I will make it
public.

Even the code that I'm supposed to write live should already be written
anyway, and to make sure everything is available I could use the save
function of IPython and add it to the repository...

In general I think that explaining code on a slide (especially if it
involves new concepts) is better, but then showing what it does is
always a plus.

Saying that something will go 10x faster than the previous version is
not the same as showing that it actually does on your machine..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-13 Thread Andrea Crotti

On 09/13/2012 11:58 PM, Miki Tebeka wrote:

What do you think work best in general?

I find typing during class (other than small REPL examples) time consuming and 
error prone.

What works well for me is to create a slidy HTML presentation with asciidoc, 
then I can include code snippets that can be also run from the command line.
(Something like:

 [source,python,numbered]
 ---
 include::src/sin.py[]
 ---

Output example: http://i.imgur.com/Aw9oQ.png
)

Let me know if you're interested and I'll send you a example project.

HTH,
--
Miki


Yes please send me something and I'll have a look.
For my slides I'm using hieroglyph:
http://heiroglyph.readthedocs.org/en/latest/index.html

which works with sphinx, so in theory I might be able to run the code as
well..

But in general probably the best way is to copy and paste into an IPython
session, to show that what I just explained actually works as expected..
--
http://mail.python.org/mailman/listinfo/python-list


Re: Decorators not worth the effort

2012-09-14 Thread andrea crotti
I think one very nice and simple example of how decorators can be used is this:

def memoize(f, cache={}):
    def _memoize(*args, **kwargs):
        key = (args, str(kwargs))
        if key not in cache:
            cache[key] = f(*args, **kwargs)

        return cache[key]

    return _memoize

def fib(n):
if n <= 1:
return 1
return fib(n-1) + fib(n-2)

@memoize
def fib_memoized(n):
if n <= 1:
return 1
return fib_memoized(n-1) + fib_memoized(n-2)


The second fibonacci looks exactly the same, but while the first is
exponentially slow for larger n, the memoized one stays fast..

I might use this example for the presentation, before explaining what it is..
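
To show the difference during the talk, a quick timing sketch on top of
the definitions above should be enough (no numbers claimed, just run it):

import time

start = time.time()
fib(30)
print("plain fib(30):    %.2f seconds" % (time.time() - start))

start = time.time()
fib_memoized(30)
print("memoized fib(30): %.2f seconds" % (time.time() - start))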
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Decorators not worth the effort

2012-09-14 Thread andrea crotti
2012/9/14 Chris Angelico :
>
> Trouble is, you're starting with a pretty poor algorithm. It's easy to
> improve on what's poor. Memoization can still help, but I would start
> with a better algorithm, such as:
>
> def fib(n):
> if n<=1: return 1
> a,b=1,1
> for i in range(1,n,2):
> a+=b
> b+=a
> return b if n%2 else a
>
> def fib(n,cache=[1,1]):
> if n<=1: return 1
> while len(cache)<=n:
> cache.append(cache[-1] + cache[-2])
> return cache[n]
>
> Personally, I don't mind (ab)using default arguments for caching, but
> you could do the same sort of thing with a decorator if you prefer. I
> think the non-decorated non-recursive version is clear and efficient
> though.
>
> ChrisA
> --
> http://mail.python.org/mailman/listinfo/python-list


The poor algorithm is much closer to the mathematical definition
than the smarter iterative one..  And in your second version you
include some ugly caching logic inside it, so why not use a
decorator then?

I'm not saying that the memoized version is the "good" solution, just
that I think it's a very nice example of how to use a decorator, and
maybe a good example to start a talk on decorators with..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess call is not waiting.

2012-09-18 Thread andrea crotti
I have a similar problem, something which I've never quite understood
about subprocess...
Suppose I do this:

proc = subprocess.Popen(['ls', '-lR'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

now I created a process, which has a PID, but it's not running apparently...
It only seems to run when I actually do the wait.

I don't want to make it wait, so an easy solution is just to use a
thread, but is there a way with just subprocess?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess call is not waiting.

2012-09-19 Thread andrea crotti
2012/9/18 Dennis Lee Bieber :
>
> Unless you have a really massive result set from that "ls", that
> command probably ran so fast that it is blocked waiting for someone to
> read the PIPE.

I tried also with "ls -lR /" and that definitely takes a while to run,
when I do this:

proc = subprocess.Popen(['ls', '-lR', '/'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

nothing is running, only when I actually do
proc.communicate()

I see the process running in top..
Is it still an observation problem?

Anyway I also need to know when the process is over while waiting, so
probably a thread is the only way..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess call is not waiting.

2012-09-19 Thread andrea crotti
2012/9/19 Hans Mulder :
> Yes: using "top" is an observation problem.
>
> "Top", as the name suggests, shows only the most active processes.

Sure but "ls -lR /" is a very active process if you try to run it..
Anyway as written below I don't need this anymore.

>
> It's quite possible that your 'ls' process is not active, because
> it's waiting for your Python process to read some data from the pipe.
>
> Try using "ps" instead.  Look in thte man page for the correct
> options (they differ between platforms).  The default options do
> not show all processes, so they may not show the process you're
> looking for.
>
>> Anyway I also need to know when the process is over while waiting, so
>> probably a thread is the only way..
>
> This sounds confused.
>
> You don't need threads.  When 'ls' finishes, you'll read end-of-file
> on the proc.stdout pipe.  You should then call proc.wait() to reap
> its exit status (if you don't, you'll leave a zombie process).
> Since the process has already finished, the proc.wait() call will
> not actually do any waiting.
>
>
> Hope this helps,
>


Well, there is a process which has to do two things: periodically monitor
some external conditions (filesystem / db), and launch a subprocess
that can take a very long time.

So I can't put a wait anywhere, or I'll stop everything else.  But at
the same time I need to know when the process is finished, which I
could do, but without a wait it might get hacky.

So I'm quite sure I just need to run the subprocess in a separate thread,
unless I'm missing something obvious..
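
The alternative I can think of without threads is to poll the process from
the monitoring loop, roughly like this sketch (the command and the output
file name are just placeholders; output goes to a file so the child never
blocks on a full pipe):

import subprocess
import time

with open('ls_output.txt', 'w') as out:
    proc = subprocess.Popen(['ls', '-lR', '/'], stdout=out,
                            stderr=subprocess.STDOUT)

while proc.poll() is None:
    # ... here I can keep checking the filesystem / db conditions ...
    time.sleep(1)

print("subprocess finished with exit code %d" % proc.returncode)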
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-19 Thread andrea crotti
2012/9/19 Trent Nelson :
>
> FWIW, I gave a presentation on decorators to the New York Python
> User Group back in 2008.  Relevant blog post:
>
> http://blogs.onresolve.com/?p=48
>
> There's a link to the PowerPoint presentation I used in the first
> paragraph.  It's in .pptx format; let me know if you'd like it in
> some other form.
>
> Regards,
>
> Trent.


Ok thanks a lot, how long did it take for you to present that material?

The part about the learning process is interesting, I had a similar
experience, but I'll probably skip this since I only have 30 minutes.

Another thing which I would skip, or only explain how it works, are
parametrized decorators: in the triple-def form they just look too ugly
to be worth the effort (but they should at least be understood).
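
For reference, this is the triple-def form I mean (a toy sketch of a
parametrized decorator):

from functools import wraps

def repeat(times):                      # takes the decorator's parameters
    def decorator(func):                # takes the function to decorate
        @wraps(func)
        def wrapper(*args, **kwargs):   # replaces the original function
            result = None
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(3)
def greet(name):
    print("hello %s" % name)

greet("world")   # prints the greeting three times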
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 'str' object does not support item assignment

2012-09-23 Thread Andrea Crotti

On 09/23/2012 07:31 PM, jimbo1qaz wrote:

spots[y][x]=mark fails with a "'str' object does not support item assignment" 
error,even though:

a=[["a"]]
a[0][0]="b"

and:

a=[["a"]]
a[0][0]=100

both work.
Spots is a nested list created as a copy of another list.


But
a = "a"
a[0] = 'c'
fails for the same reason, which is that strings in Python are immutable..
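
If you really need to change single characters, the usual workaround is to
build a new string, for example via a list:

s = "spam"
chars = list(s)        # strings are immutable, lists are not
chars[0] = "S"
s = "".join(chars)     # "Spam" - a brand new string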
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python presentations

2012-09-24 Thread andrea crotti
For anyone interested, I already moved the slides on github
(https://github.com/AndreaCrotti/pyconuk2012_slides)
and for example the decorator slides will be generated from this:

https://raw.github.com/AndreaCrotti/pyconuk2012_slides/master/deco_context/deco.rst

Notice the literalinclude with :pyobject: which allows to include any
function or class automatically very nicely from external files ;)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PHP vs. Python

2012-09-25 Thread andrea crotti
2012/9/25  :
> On Thursday, 23 December 2004 03:33:36 UTC+5:30, (unknown)  wrote:
>> Anyone know which is faster?  I'm a PHP programmer but considering
>> getting into Python ... did searches on Google but didn't turn much up
>> on this.
>>
>> Thanks!
>> Stephen
>
>
> Here some helpful gudance.
>
> http://hentenaar.com/serendipity/index.php?/archives/27-Benchmark-PHP-vs.-Python-vs.-Perl-vs.-Ruby.html
> --
> http://mail.python.org/mailman/listinfo/python-list


Quite ancient versions of everything; it would be interesting to see if
things are different now..

Anyway you can switch to Python happily: it might not be faster, but
99% of the time that's not an issue..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generating C++ code

2012-10-09 Thread Andrea Crotti

On 10/09/2012 05:00 PM, Jean-Michel Pichavant wrote:

Greetings,

I'm trying to generate C++ code from an XML file. I'd like to use a template 
engine, which imo produce something readable and maintainable.
My google search about this subject has been quite unsuccessful, I've been 
redirected to template engine specific to html mostly.

Does anybody knows a python template engine for generating C++ code ?

Here's my flow:

XML file -> nice python app -> C++ code

 From what I know I could use Cheetah, a generic template engine. I never used 
it though, I'm not sure this is what I need.
I'm familiar with jinja2 but I'm not sure I could use it to generate C++ code, 
did anybody try ? (maybe that's a silly question)

Any advice would be appreciated.

JM


I think you can use anything to generate C++ code, but is it a good idea?
Are you going to produce this code only once and then maintain it
manually?

And are you sure that the design that you would get from the XML file
actually makes sense when translated into C++?
--
http://mail.python.org/mailman/listinfo/python-list


Re: Generating C++ code

2012-10-10 Thread andrea crotti
2012/10/10 Jean-Michel Pichavant :
> Well, the C++ code will end up running on a MIPS on a SOC, unfortunately, 
> python is not an option here.
> The xml to C++ makes a lot of sense, because only a small part of the code is 
> generated that way (everything related to log & fatal events). Everything 
> else is written directly in C++.
>
> To answer Andrea's question, the files are regenerated for every compilation 
> (well, unless the xml didn't change, but the xml is highly subject to 
> changes, that's actually its purpose)
>
> Currently we already have a python script that translate this xml file to 
> C++, but it's done in a way that is difficult to maintain. Basically, when 
> parsing the xml file, it writes the generated C++ code. Something like:
> if 'blabla' in xml:
>   h_file.write("#define blabla 55", append="top")
>   c_file.write("someglobal = blabla", append="bottom")
>
> This is working, but the python code is quite difficult to maintain, there's 
> a lot of escaping going on, it's almost impossible to see the structure of 
> the c files unless generating one and hopping it's successful. It's also 
> quite difficult to insert code exactly where you want, because you do not 
> know the order in which the xml trees are defined then parsed.
>
> I was just wondering if a template engine would help. Maybe not.
>
> JM
> --
> http://mail.python.org/mailman/listinfo/python-list


I think it depends on what you're generating from the XML: are you
producing just constants (like the #define) or also new classes, for
example?

If it's just constants, why don't you generate an ini file (or something
similar) from the XML and then parse it properly in the C++ code? That
would be very easy to do; see the sketch below.

You could also parse the XML directly in the C++ in the first place, but
that's probably harder given your requirements; I don't think an ini file
would be a problem though, or would it?
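
A minimal sketch of what I mean (assuming, just for illustration, that the
XML holds simple name/value constants; the file names and the element
structure are made up):

import xml.etree.ElementTree as ET
import ConfigParser   # configparser in Python 3

tree = ET.parse('events.xml')
config = ConfigParser.RawConfigParser()
config.add_section('constants')

# assume entries like <constant name="blabla" value="55"/>
for const in tree.getroot().iter('constant'):
    config.set('constants', const.get('name'), const.get('value'))

with open('constants.ini', 'w') as out:
    config.write(out)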
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: locking files on Linux

2012-10-18 Thread andrea crotti
2012/10/18 Grant Edwards :
> On 2012-10-18, andrea crotti  wrote:
>
>
> File locks under Unix have historically been "advisory".  That means
> that programs have to _choose_ to pay attention to them.  Most
> programs do not.
>
> Linux does support mandatory locking, but it's rarely used and must be
> manually enabled at the filesystem level. It's probably worth noting
> that in the Linux kernel docs, the document on mandatory file locking
> begins with a section titled "Why you should avoid mandatory locking".
>
> http://en.wikipedia.org/wiki/File_locking#In_Unix-like_systems
> http://kernel.org/doc/Documentation/filesystems/locks.txt
> http://kernel.org/doc/Documentation/filesystems/mandatory-locking.txt
> http://www.thegeekstuff.com/2012/04/linux-file-locking-types/
> http://www.hackinglinuxexposed.com/articles/20030623.html
>
> --
> Grant Edwards   grant.b.edwardsYow! Your CHEEKS sit like
>   at   twin NECTARINES above
>   gmail.coma MOUTH that knows no
>BOUNDS --
> --
> http://mail.python.org/mailman/listinfo/python-list


Uhh I see, thanks; I guess I'll use the good old .lock file (even if it
might have some problems too).

Anyway I'm only afraid that my own application could modify the
files, so maybe I can instruct it to check whether the file is locked.

Or maybe using sqlite would work, even when writing from different
processes?

I would prefer to keep something human-readable like the INI format though,
rather than a sqlite file..

Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: locking files on Linux

2012-10-18 Thread andrea crotti
2012/10/18 Oscar Benjamin :
>
> Why not come up with a test that actually shows you if it works? Here
> are two suggestions:
>
> 1) Use time.sleep() so that you know how long the lock is held for.
> 2) Write different data into the file from each process and see what
> you end up with.
>


Ok thanks, I will try, but I thought that what I did was the worst
possible case, because I'm opening and writing to the same file from
two different processes, locking the file with LOCK_EX.

It should not even be able to open it, as far as I understood...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Testing against multiple versions of Python

2012-10-19 Thread andrea crotti
2012/10/19 Michele Simionato :
> Yesterday I released a new version of the decorator module. It should run 
> under Python 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2, 3.3. I did not have the will 
> to install on my machine 8 different versions of Python, so I just tested it 
> with Python 2.7 and 3.3. But I do not feel happy with that. Is there any kind 
> of service where a package author can send a pre-release version of her 
> package and have its tests run againsts a set of different Python versions?
> I seem to remember somebody talking about a service like that years ago but I 
> don't remembers. I do not see anything on PyPI. Any advice is welcome!
>
>  Michele Simionato
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list


Travis CI on GitHub is maybe what you want?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: locking files on Linux

2012-10-19 Thread andrea crotti
2012/10/18 Oscar Benjamin :
>
> The lock is cooperative. It does not prevent the file from being
> opened or overwritten. It only prevents any other process from
> obtaining the lock. Here you open the file with mode 'w' which
> truncates the file instantly (without checking for the lock).
>
>
> Oscar


Very good, thanks, now I understand: my problem was actually the
assumption that it should fail when the lock is already taken, while by
default lockf just blocks until the lock is released.

It seems to work quite nicely so I'm going to use this..
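
For the record, if I ever want the failing behaviour instead, LOCK_NB seems
to be the way (a sketch, 'settings.ini' is just a made-up file name):

import fcntl

f = open('settings.ini', 'a')
try:
    # LOCK_NB makes it fail immediately instead of blocking
    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
except IOError:
    print("file is already locked by another process")
else:
    print("lock acquired")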
-- 
http://mail.python.org/mailman/listinfo/python-list


resume execution after catching with an excepthook?

2012-10-24 Thread andrea crotti
So I would like to be able to ask for confirmation when I receive a C-c,
and continue if the answer is "N/n".

I'm already using an exception handler set with sys.excepthook, but I
can't make it work with the confirm_exit, because it's going to quit in
any case..

A possible solution would be to do a global "try/except
KeyboardInterrupt", but since I already have an excepthook I wanted to
use this.  Any way to make it continue where it was running after the
exception is handled?


def confirm_exit():
while True:
q = raw_input("This will quit the program, are you sure? [y/N]")
if q in ('y', 'Y'):
sys.exit(0)
elif q in ('n', 'N'):
print("Continuing execution")
# just go back to normal execution, is it possible??
break


def _exception_handler(etype, value, tb):
if etype == KeyboardInterrupt:
confirm_exit()
else:
sys.exit(1)


def set_exception_handler():
sys.excepthook = _exception_handler
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: resume execution after catching with an excepthook?

2012-10-25 Thread andrea crotti
2012/10/25 Steven D'Aprano :
> On Wed, 24 Oct 2012 13:51:30 +0100, andrea crotti wrote:
>
>> So I would like to be able to ask for confirmation when I receive a C-c,
>> and continue if the answer is "N/n".
>
> I don't think there is any way to do this directly.
>
> Without a try...except block, execution will cease after an exception is
> caught, even when using sys.excepthook. I don't believe that there is any
> way to jump back to the line of code that just failed (and why would you,
> it will just fail again) or the next line (which will likely fail because
> the previous line failed).
>
> I think the only way you can do this is to write your own execution loop:
>
> while True:
> try:
> run(next_command())
> except KeyboardInterrupt:
> if confirm_quit():
> break
>
>
> Of course you need to make run() atomic, or use transactions that can be
> reverted or backed out of. How plausible this is depends on what you are
> trying to do -- Python's Ctrl-C is not really designed to be ignored.
>
> Perhaps a better approach would be to treat Ctrl-C as an unconditional
> exit, and periodically poll the keyboard for another key press to use as
> a conditional exit. Here's a snippet of platform-specific code to get a
> key press:
>
> http://code.activestate.com/recipes/577977
>
> Note however that it blocks if there is no key press waiting.
>
> I suspect that you may need a proper event loop, as provided by GUI
> frameworks, or curses.
>
>
>
> --
> Steven
> --
> http://mail.python.org/mailman/listinfo/python-list



Ok thanks, but here the point is not to resume something that is going
to fail again, just to avoid accidentally killing processes that take a
long time.  It's probably needed only by me in debugging mode, so I
can just do the simple try/except then, thanks..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 Jean-Michel Pichavant :
>
> "return NumWrapper(self.number + 1) "
>
> still returns a(nother) mutable object.
>
> So what's the point of all this ?
>
> JM
>

Well sure but it doesn't modify the first object, just creates a new
one.  There are in general good reasons to do that, for example I can
then compose things nicely:

num.increment().increment()

or I can parallelize operations safely not caring about the order of
operations.

But while I do this all the time in more functional languages, I
don't tend to do exactly the same in Python, because I have the
impression that it's not worth it, but maybe I'm wrong..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 andrea crotti :
>>
>
> Well sure but it doesn't modify the first object, just creates a new
> one.  There are in general good reasons to do that, for example I can
> then compose things nicely:
>
> num.increment().increment()
>
> or I can parallelize operations safely not caring about the order of
> operations.
>
> But while I do this all the time with more functional languages, I
> don't tend to do exactly the same in Python, because I have the
> impression that is not worth, but maybe I'm wrong..


By the way on this topic there is a great talk by the creator of
Clojure: http://www.infoq.com/presentations/Value-Values
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 Jean-Michel Pichavant :
>
>
> In an OOP language num.increment() is expected to modify the object in place.
> So I think you're right when you say that functional languages technics do 
> not necessarily apply to Python, because they don't.
>
> I would add that what you're trying to suggest in the first post was not 
> really about immutability, immutable objects in python are ... well 
> immutable, they can be used as a dict key for instance, your NumWrapper 
> object cannot.
>
>
> JM

Yes right immutable was not the right word, I meant that as a contract
with myself I'm never going to modify its state.

Also because, how do I make an immutable object in pure Python?

But the example with the dictionary is not correct though, because this:

In [145]: class C(object):
   .....:     def __hash__(self):
   .....:         return 42
   .....:

In [146]: d = {C(): 1}

works perfectly, but an object of class C can mutate as much as it
wants, just like my NumWrapper instance..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Immutability and Python

2012-10-29 Thread andrea crotti
2012/10/29 Chris Angelico :
> On Tue, Oct 30, 2012 at 2:55 AM, Paul Rubin  wrote:
>> andrea crotti  writes:
>>> and we want to change its state incrementing the number ...
>>>  the immutability purists would instead suggest to do this:
>>> def increment(self):
>>> return NumWrapper(self.number + 1)
>>
>> Immutability purists would say that numbers don't have "state" and if
>> you're trying to change a number's state by incrementing it, that's not
>> immutability.  You end up with a rather different programming style than
>> imperative programming, for example using tail recursion (maybe wrapped
>> in an itertools-like higher-order function) instead of indexed loops to
>> iterate over a structure.
>
> In that case, rename increment to next_integer and TYAOOYDAO. [1]
> You're not changing the state of this number, you're locating the
> number which has a particular relationship to this one (in the same
> way that GUI systems generally let you locate the next and previous
> siblings of any given object).
>
> ChrisA
> [1] "there you are, out of your difficulty at once" - cf WS Gilbert's 
> "Iolanthe"
> --
> http://mail.python.org/mailman/listinfo/python-list


Yes, the name should be changed, but the point is that they are both
ways to implement the same thing.

For example suppose I want to have 10 objects (for some silly reason)
that represent the next number; in the first case I would do:

numbers = [NumWrapper(orig.number) for _ in range(10)]
for num in numbers:
    num.increment()

while in the second it's as simple as:

numbers = [orig.next_number()] * 10

(and sharing the same instance is fine there, precisely because it never
changes).

Composing things becomes much easier, but as a downside it's not always
so easy and convenient to write code in this way, it probably depends
on the use case..
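
For context, the two variants I keep comparing look roughly like this
(just a sketch):

class NumWrapper(object):
    def __init__(self, number):
        self.number = number

    # mutable style: change the state in place
    def increment(self):
        self.number += 1

    # "immutable" style: leave self alone and return a new object
    def next_number(self):
        return NumWrapper(self.number + 1)

orig = NumWrapper(0)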
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Nice solution wanted: Hide internal interfaces

2012-10-29 Thread andrea crotti
2012/10/29 Johannes Bauer :
> Hi there,
>
> I'm currently looking for a good solution to the following problem: I
> have two classes A and B, which interact with each other and which
> interact with the user. Instances of B are always created by A.
>
> Now I want A to call some private methods of B and vice versa (i.e. what
> C++ "friends" are), but I want to make it hard for the user to call
> these private methods.
>
> Currently my ugly approach is this: I delare the internal methods
> private (hide from user). Then I have a function which gives me a
> dictionary of callbacks to the private functions of the other objects.
> This is in my opinion pretty ugly (but it works and does what I want).
>
> I'm pretty damn sure there's a nicer (prettier) solution out there, but
> I can't currently think of it. Do you have any hints?
>
> Best regards,
> Joe
>

And how are you declaring methods private?  Because there is no real
private attribute in Python, if you declare them with a starting "_"
they are still perfectly accessible..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Nice solution wanted: Hide internal interfaces

2012-10-30 Thread andrea crotti
2012/10/30 alex23 :
> On Oct 30, 2:33 am, Johannes Bauer  wrote:
>> I'm currently looking for a good solution to the following problem: I
>> have two classes A and B, which interact with each other and which
>> interact with the user. Instances of B are always created by A.
>>
>> Now I want A to call some private methods of B and vice versa (i.e. what
>> C++ "friends" are), but I want to make it hard for the user to call
>> these private methods.
>
> One approach could be to only have the public interface on B, and then
> create a wrapper for B that provides the private interface:
>
> class B:
> def public_method(self):
> pass
>
> class B_Private:
> def __init__(self, context):
> self.context = context
>
> def private_method(self):
> # manipulate self.context
>
> class A:
> def __init__(self):
> self.b = B()
> self.b_private = B_Private(self.b)
>
> def foo(self):
> # call public method
> self.b.public_method()
>
> # call private method
> self.b_private.private_method()
>
> It doesn't stop a user from accessing the private methods, but it does
> separate them so they have to *intentionally* choose to use them.
> --
> http://mail.python.org/mailman/listinfo/python-list



Partly unrelated, but you could also define a clear API and expose it
through your __init__.py.

For example:

package/a.py:
    class A: pass

package/b.py:
    class B: pass

package/__init__.py:
    from a import A

so now doing "from package import ..." will only show A.

This doesn't work at the method level, but it's useful to know and
commonly done in many projects..

In some projects they even use a file "api.py", so you have to
explicitly import

from package.api import ...

(which I think is overkill, since __init__.py does the same)
-- 
http://mail.python.org/mailman/listinfo/python-list


lazy properties?

2012-11-01 Thread Andrea Crotti
Seeing the wonderful "lazy val" in Scala I thought that I should try to 
get the following also in Python.

The problem is that I often have this pattern in my code:

class Sample:
    def __init__(self):
        self._var = None

    @property
    def var(self):
        if self._var is None:
            self._var = long_computation()
        return self._var


which is quite useful when you have some expensive attribute to compute
that is not going to change.
I was trying to generalize it into a @lazy_property decorator, but my
attempts so far have failed; any help on how I could do that?

What I would like to write is:

@lazy_property
def var_lazy(self):
    return long_computation()

and this should imply that long_computation is called only once..
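
To be a bit more concrete, the direction I was exploring is a non-data
descriptor that caches the computed value on the instance, something like
this sketch (assuming new-style classes; long_computation is stubbed out):

def long_computation():
    return 42   # stand-in for the expensive computation


class lazy_property(object):
    """Compute the value on first access, then cache it on the instance."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        value = self.func(obj)
        # being a non-data descriptor, the instance attribute set here
        # shadows the descriptor on every later access
        setattr(obj, self.func.__name__, value)
        return value


class Sample(object):              # needs to be a new-style class
    @lazy_property
    def var_lazy(self):
        return long_computation()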
--
http://mail.python.org/mailman/listinfo/python-list


Re: accepting file path or file object?

2012-11-05 Thread andrea crotti
2012/11/5 Peter Otten <__pete...@web.de>:
> I sometimes do something like this:
>
> $ cat xopen.py
> import re
> import sys
> from contextlib import contextmanager
>
> @contextmanager
> def xopen(file=None, mode="r"):
> if hasattr(file, "read"):
> yield file
> elif file == "-":
> if "w" in mode:
> yield sys.stdout
> else:
> yield sys.stdin
> else:
> with open(file, mode) as f:
> yield f
>
> def grep(stream, regex):
> search = re.compile(regex).search
> return any(search(line) for line in stream)
>
> if len(sys.argv) == 1:
> print grep(["alpha", "beta", "gamma"], "gamma")
> else:
> with xopen(sys.argv[1]) as f:
> print grep(f, sys.argv[2])
> $ python xopen.py
> True
> $ echo 'alpha beta gamma' | python xopen.py - gamma
> True
> $ echo 'alpha beta gamma' | python xopen.py - delta
> False
> $ python xopen.py xopen.py context
> True
> $ python xopen.py xopen.py gamma
> True
> $ python xopen.py xopen.py delta
> False
> $
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list

That's nice thanks, there is still the problem of closing the file
handle but that's maybe not so important if it gets closed at
termination anyway..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-07 Thread Andrea Crotti

On 11/07/2012 08:32 PM, Roy Smith wrote:

In article <509ab0fa$0$6636$9b4e6...@newsspool2.arcor-online.net>,
  Alexander Blinne  wrote:


I don't know the best way to find the current size, I only have a
general remark.
This solution is not so good if you have to impose a hard limit on the
resulting file size. You could end up having a tar file of size "limit +
size of biggest file - 1 + overhead" in the worst case if the tar is at
limit - 1 and the next file is the biggest file. Of course that may be
acceptable in many cases or it may be acceptable to do something about
it by adjusting the limit.

If you truly have a hard limit, one possible solution would be to use
tell() to checkpoint the growing archive after each addition.  If adding
a new file unexpectedly causes you exceed your hard limit, you can
seek() back to the previous spot and truncate the file there.

Whether this is worth the effort is an exercise left for the reader.


So I'm not sure if it's a hard limit or not, but I'll check tomorrow.
In general, for the size, I could also take the size of the files and
simply estimate the total, pushing as many as should fit into a tarfile.
With compression I might end up with a much smaller file, but it would be
much easier..

But the other problem is that at the moment the people that get our
chunks reassemble the file with a simple:

cat file1.tar.gz file2.tar.gz > file.tar.gz

which I suppose is not going to work if I create 2 different tar files,
since it would recreate the header in each of them, right?
So either I also provide a script to reassemble everything, or I have to
split in a more "brutal" way..

Maybe after all doing the final split was not too bad; I'll first check
whether it's actually more expensive for the filesystem (which is very,
very slow) or not a big deal...
--
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-08 Thread andrea crotti
2012/11/7 Oscar Benjamin :
>
> Correct. But if you read the rest of Alexander's post you'll find a
> suggestion that would work in this case and that can guarantee to give
> files of the desired size.
>
> You just need to define your own class that implements a write()
> method and then distributes any data it receives to separate files.
> You can then pass this as the fileobj argument to the tarfile.open
> function:
> http://docs.python.org/2/library/tarfile.html#tarfile.open
>
>
> Oscar



Yes yes I saw the answer, but now I was thinking that what I need is
simply this:
tar czpvf - /path/to/archive | split -d -b 100M - tardisk

since it should run only on Linux it's probably way easier; my script
will then only need to create the list of files to tar..

The only doubt is whether this is more or less reliable than doing it in
Python: when can this fail with some bad broken pipe?
(the filesystem is not very good as I said and it's mounted with NFS)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-08 Thread andrea crotti
2012/11/8 andrea crotti :
>
>
>
> Yes yes I saw the answer, but now I was thinking that what I need is
> simply this:
> tar czpvf - /path/to/archive | split -d -b 100M - tardisk
>
> since it should run only on Linux it's probably way easier, my script
> will then only need to create the list of files to tar..
>
> The only doubt is if this is more or less reliably then doing it in
> Python, when can this fail with some bad broken pipe?
> (the filesystem is not very good as I said and it's mounted with NFS)

In the meantime I tried a couple of things, and using the pipe on
Linux actually works very nicely; it's even faster than plain tar for
some reason..

[andrea@andreacrotti isos]$ time tar czpvf - file1.avi file2.avi |
split -d -b 1000M - inchunks
file1.avi
file2.avi

real1m39.242s
user1m14.415s
sys 0m7.140s

[andrea@andreacrotti isos]$ time tar czpvf total.tar.gz file1.avi file2.avi
file1.avi
file2.avi

real1m41.190s
user1m13.849s
sys 0m5.723s

[andrea@andreacrotti isos]$ time split -d -b 1000M total.tar.gz inchunks

real0m55.282s
user0m0.020s
sys 0m3.553s
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-14 Thread andrea crotti
2012/11/14 Kushal Kumaran :
>
> Well, well, I was wrong, clearly.  I wonder if this is fixable.
>
> --
> regards,
> kushal
> --
> http://mail.python.org/mailman/listinfo/python-list

But would it not be possible to use the pipe in memory in theory?
That would be way faster and since I have in theory enough RAM it
might be a great improvement..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-14 Thread andrea crotti
2012/11/14 Dave Angel :
> On 11/14/2012 10:56 AM, andrea crotti wrote:
>> Ok this is all very nice, but:
>>
>> [andrea@andreacrotti tar_baller]$ time python2 test_pipe.py > /dev/null
>>
>> real  0m21.215s
>> user  0m0.750s
>> sys   0m1.703s
>>
>> [andrea@andreacrotti tar_baller]$ time ls -lR /home/andrea | cat > /dev/null
>>
>> real  0m0.986s
>> user  0m0.413s
>> sys   0m0.600s
>>
>> 
>>
>>
>> So apparently it's way slower than using this system, is this normal?
>
> I'm not sure how this timing relates to the thread, but what it mainly
> shows is that starting up the Python interpreter takes quite a while,
> compared to not starting it up.
>
>
> --
>
> DaveA
>


Well it's related because my program has to be as fast as possible, so
in theory I thought that using Python pipes would be better because I
can get easily the PID of the first process.

But if it's so slow than it's not worth, and I don't think is the
Python interpreter because it's more or less constantly many times
slower even changing the size of the input..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: creating size-limited tar files

2012-11-14 Thread Andrea Crotti

On 11/14/2012 04:33 PM, Dave Angel wrote:

Well, as I said, I don't see how the particular timing has anything to
do with the rest of the thread.  If you want to do an ls within a Python
program, go ahead.  But if all you need can be done with ls itself, then
it'll be slower to launch python just to run it.

Your first timing runs python, which runs two new shells, ls, and cat.
Your second timing runs ls and cat.

So the difference is starting up python, plus starting the shell two
extra times.

I'd also be curious if you flushed the system buffers before each
timing, as the second test could be running entirely in system memory.
And no, I don't know offhand how to flush them in Linux, just that
without it, your timings are not at all repeatable.  Note the two
identical runs here.

davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null

real0m0.164s
user0m0.020s
sys 0m0.000s
davea@think:~/temppython$ time ls -lR ~ | cat > /dev/null

real0m0.018s
user0m0.000s
sys 0m0.010s

real time goes down by 90%, while user time drops to zero.
And on a 3rd and subsequent run, sys time goes to zero as well.



Right I didn't think about that..
Anyway the only thing I wanted to understand is if using the pipes in 
subprocess is exactly the same as doing

the Linux pipe, or not.

And any idea on how to run it in ram?
Maybe if I create a pipe in tmpfs it might already work, what do you think?
--
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
Yes I wanted to avoid to do something too complex, anyway I'll just
comment it well and add a link to the original code..

But this is now failing to me:

import os
import sys

def daemonize(stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
    # Perform first fork.
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)  # Exit first parent.
    except OSError as e:
        sys.stderr.write("fork #1 failed: (%d) %s\n" % (e.errno, e.strerror))
        sys.exit(1)

    # Decouple from parent environment.
    os.chdir("/")
    os.umask(0)
    os.setsid()

    # Perform second fork.
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)  # Exit second parent.
    except OSError as e:
        sys.stderr.write("fork #2 failed: (%d) %s\n" % (e.errno, e.strerror))
        sys.exit(1)

    # The process is now daemonized, redirect standard file descriptors.
    sys.stdout.flush()
    sys.stderr.flush()

    # file() is the Python 2 built-in open
    si = file(stdin, 'r')
    so = file(stdout, 'a+')
    se = file(stderr, 'a+', 0)
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())


if __name__ == '__main__':
    daemonize(stdout='sample_file', stderr='sample')
    print("hello world, now should be the child!")


[andrea@andreacrotti experiments]$ python2 daemon.py
Traceback (most recent call last):
  File "daemon.py", line 49, in 
daemonize(stdout='sample_file', stderr='sample')
  File "daemon.py", line 41, in daemonize
so = file(stdout, 'a+')
IOError: [Errno 13] Permission denied: 'sample_file'

The parent process can write to that file easily, but the child can't,
so why is it working for you and not for me?
(I'm running this on Linux as a non-root user.)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
Ah sure that makes sense!

But actually why do I need to move away from the current directory of
the parent process?
In my case it's actually useful to be in the same directory, so maybe
I can skip that part, or otherwise I need another chdir afterwards..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
2012/12/11 peter :
> On 12/11/2012 10:25 AM, andrea crotti wrote:
>>
>> Ah sure that makes sense!
>>
>> But actually why do I need to move away from the current directory of
>> the parent process?
>> In my case it's actually useful to be in the same directory, so maybe
>> I can skip that part,
>> or otherwise I need another chdir after..
>
> You don't need to move away from the current directory. You can use os to
> get the current working directory:
>
> stderrfile = '%s/error.log' % os.getcwd()
> stdoutfile = '%s/out.log' % os.getcwd()
>
> then call the daemon function like this.
>
> daemonize(stdout=stdoutfile, stderr=stderrfile)


But the nice thing now is that all the forked processes also log to
stdout/stderr, so if I launch the parent in the shell I get

DEBUG -> csim_flow.worker[18447]: Moving the log file
/user/sim/tests/batch_pdump_records/running/2012_12_11_13_4_10.log to
/user/sim/tests/batch_pdump_records/processed/2012_12_11_13_4_10.log

DEBUG -> csim_flow.area_manager[19410]: No new logs found in
/user/sim/tests/batch_pdump_records

where in [] I have the PID of the process.
With the suggested approach I would have to use separate files for
standard output and error, but for that I already have the logging
module, which logs to the right place..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: forking and avoiding zombies!

2012-12-11 Thread andrea crotti
2012/12/11 Jean-Michel Pichavant :
> - Original Message -
>> So I implemented a simple decorator to run a function in a forked
>> process, as below.
>>
>> It works well but the problem is that the childs end up as zombies on
>> one machine, while strangely
>> I can't reproduce the same on mine..
>>
>> I know that this is not the perfect method to spawn a daemon, but I
>> also wanted to keep the code
>> as simple as possible since other people will maintain it..
>>
>> What is the easiest solution to avoid the creation of zombies and
>> maintain this functionality?
>> thanks
>>
>>
>> def on_forked_process(func):
>>     from os import fork, _exit
>>     """Decorator that forks the process, runs the function and gives
>>     back control to the main process
>>     """
>>     def _on_forked_process(*args, **kwargs):
>>         pid = fork()
>>         if pid == 0:
>>             func(*args, **kwargs)
>>             _exit(0)
>>         else:
>>             return pid
>>
>>     return _on_forked_process
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>
>
> Ever thought about using the 'multiprocessing' module? It's a slightly higher-level
> API and I don't have issues with zombie processes.
> You can combine this with a multiprocess log listener so that all logs are 
> sent to the main process.
>
> See Vinay Sajip's code about multiprocessing and logging, 
> http://plumberjack.blogspot.fr/2010/09/using-logging-with-multiprocessing.html
>
> I still had to write some cleanup code before leaving the main process, but 
> once terminate is called on all remaining subprocesses, I'm not left with 
> zombie processes.
> Here's the cleaning:
>
> for proc in multiprocessing.active_children():
>     proc.terminate()
>
> JM


Yes I thought about that but I want to be able to kill the parent
without killing the children, because they can run for a long time..

Anyway I got something working now with this

import os
import sys

def daemonize(func):

    def _daemonize(*args, **kwargs):
        # Perform first fork.
        try:
            pid = os.fork()
            if pid > 0:
                sys.exit(0)  # Exit first parent.
        except OSError as e:
            sys.stderr.write("fork #1 failed: (%d) %s\n" % (e.errno, e.strerror))
            sys.exit(1)

        # Decouple from parent environment.
        # check if decoupling here makes sense in our case
        # os.chdir("/")
        # os.umask(0)
        # os.setsid()

        # Perform second fork.
        try:
            pid = os.fork()
            if pid > 0:
                return pid

        except OSError as e:
            sys.stderr.write("fork #2 failed: (%d) %s\n" % (e.errno, e.strerror))
            sys.exit(1)

        # The process is now daemonized, redirect standard file descriptors.
        sys.stdout.flush()
        sys.stderr.flush()
        func(*args, **kwargs)

    return _daemonize


from time import sleep

@daemonize
def long_smarter_process():
    while True:
        sleep(2)
        print("Hello how are you?")


And it works exactly as before, but more correctly..
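
Another option I looked at for the zombie problem, instead of the
double fork, would be to tell the kernel to reap the children
automatically; just a sketch, I haven't checked whether it fits our
case:

import os
import signal

# untested sketch: with SIGCHLD set to SIG_IGN the kernel reaps finished
# children automatically, so the original single-fork decorator should
# not leave zombies around (but then os.wait() can't be used on them)
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

def on_forked_process(func):
    def _on_forked_process(*args, **kwargs):
        pid = os.fork()
        if pid == 0:
            func(*args, **kwargs)
            os._exit(0)
        return pid
    return _on_forked_process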
-- 
http://mail.python.org/mailman/listinfo/python-list

