Re: ARM cross compile - one last problem
On Jun 19, 10:49 am, [EMAIL PROTECTED] wrote:
> Hello all,
>
> I've been trying to get Python to cross compile to linux running on an
> ARM. I've been fiddling with the cross compile patches here:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1597850&grou...
>
> and I've had some success. Python compiles and now all of the
> extensions do too, but when I try to import some of them (time,
> socket, etc.), they have trouble finding certain symbols.
> PyExc_IOError and _Py_NoneStruct are the two I remember seeing. It
> would appear that they are exported by libpython, which I believe is
> statically linked into the python executable? That's where I start to
> get confused. What part of python is breaking? Where should I be
> looking for problems?
>
> Thanks a lot!
>
> Justin

Alright, I looked into this a little more, and those symbols definitely
exist in my compiled python executable. How are extensions linked to
the python interpreter?

Justin
--
http://mail.python.org/mailman/listinfo/python-list
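For what it's worth, extension modules are dlopen()ed at import time and resolve `Py*` symbols from whatever is already in the process, so a statically linked interpreter must export them from its dynamic symbol table (typically via a linker flag like `-Xlinker -export-dynamic`). A rough sanity check, runnable from within Python itself, is to resolve a symbol in the running interpreter with `ctypes`:

```python
import ctypes

# ctypes.pythonapi is the running interpreter opened as a shared object.
# If a C-level symbol resolves here, a dlopen()ed extension module would
# find it too; if it doesn't, the executable was probably not linked so
# as to export its symbols dynamically.
api = ctypes.pythonapi
api.Py_GetVersion.restype = ctypes.c_char_p

version = api.Py_GetVersion()
print(version.decode())
```

This only demonstrates the linking model; the cross-compile fix itself lives in how the python executable is linked, not in Python code.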
Stackless Integration
Hi,

I've been looking at stackless python a little bit, and it's awesome.
My question is, why hasn't it been integrated into the upstream python
tree? Does it cause problems with the current C-extensions? It seems
like if something is fully compatible and better, then it would be
adopted. However, it hasn't been in what appears to be 7 years of
existence, so I assume there's a reason.

Justin
--
http://mail.python.org/mailman/listinfo/python-list
Re: Stackless Integration
On Aug 9, 8:57 am, "Terry Reedy" <[EMAIL PROTECTED]> wrote:
> First, which 'stackless'? The original continuation-stackless (of
> about 7 years ago)? Or the more current tasklet-stackless (which I
> think is much younger than that)?

The current iteration. I can certainly understand Guido's distaste for
continuations.

> overcome. It is just not part of the stdlib.

And I wish it were! It wouldn't be such a pain to get to my developers
then.

> And as far as I know or could find in the PEP index, C. Tismer has
> never submitted a PEP asking that it be made so. Doing so would mean
> a loss of control, so there is a downside as well as the obvious
> upside of distribution.

That's true. Though, hopefully, the powers that be would allow him to
maintain it while it's in the stdlib. Maybe we should file a PEP for
him... :)

Justin
--
http://mail.python.org/mailman/listinfo/python-list
Re: Stackless Integration
> It's not Pythonic.
>
> Jean-Paul

Ha! I wish there were a way to indicate sarcasm on the net. You almost
got people all riled up!
--
http://mail.python.org/mailman/listinfo/python-list
Re: Threaded Design Question
On Aug 9, 11:25 am, [EMAIL PROTECTED] wrote:
> Here's how I have it designed so far. The main thread starts a
> Watch(threading.Thread) class that loops and searches a directory for
> files. It has been passed a Queue.Queue() object (watch_queue), and
> as it finds new files in the watch folder, it adds the file name to
> the queue.
>
> The main thread then grabs an item off the watch_queue, and kicks off
> processing on that file using another class Worker(threading.thread).

Sounds good.

> I made definite progress by creating two queues...watch_queue and
> processing_queue, and then used lists within the classes to store the
> state of which files are processing/watched.

This sounds ugly. Synchronization is one of those evils of
multithreaded programming that should be avoided if possible. I see a
couple of dirt simple solutions:

1. Have the watch thread move the file into a "Processing" folder that
   it doesn't scan.
2. Have the watch thread copy the file into a python tempfile object
   and push that onto the queue, then delete the real file. This can
   be done efficiently (well, more efficiently than
   new.write(old.read())) with shutil.copyfileobj(old, new).

Both of those take very few lines of code, don't require
synchronization, and don't require extending standard classes.
--
http://mail.python.org/mailman/listinfo/python-list
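Option 1 above can be sketched in a few lines (folder and queue names here are illustrative, not from the original thread). The move itself is the hand-off: once a file leaves the watched folder, the scanner can never see it again, so no shared "currently processing" state needs locking:

```python
import os
import shutil

def scan_once(watch_dir, processing_dir, work_queue):
    """One pass of the watch loop: claim each new file by moving it
    into a folder the scanner never looks at, then queue it for a
    worker thread to pick up."""
    for name in os.listdir(watch_dir):
        src = os.path.join(watch_dir, name)
        dst = os.path.join(processing_dir, name)
        shutil.move(src, dst)   # the move is the hand-off; no locks needed
        work_queue.put(dst)     # workers consume finished paths from here
```

The watch thread would call this in a loop with a sleep between passes, exactly as in the original design, minus the bookkeeping lists.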
Re: Threaded Design Question
On Aug 9, 5:39 pm, MRAB <[EMAIL PROTECTED]> wrote:
> On Aug 9, 7:25 pm, [EMAIL PROTECTED] wrote:
>
> > Hi all! I'm implementing one of my first multithreaded apps, and
> > have gotten to a point where I think I'm going off track from a
> > standard idiom. Wondering if anyone can point me in the right
> > direction.
> >
> > The script will run as a daemon and watch a given directory for new
> > files. Once it determines that a file has finished moving into the
> > watch folder, it will kick off a process on one of the files.
> > Several of these could be running at any given time up to a max
> > number of threads.
> >
> > Here's how I have it designed so far. The main thread starts a
> > Watch(threading.Thread) class that loops and searches a directory
> > for files. It has been passed a Queue.Queue() object (watch_queue),
> > and as it finds new files in the watch folder, it adds the file
> > name to the queue.
> >
> > The main thread then grabs an item off the watch_queue, and kicks
> > off processing on that file using another class
> > Worker(threading.thread).
> >
> > My problem is with communicating between the threads as to which
> > files are currently processing, or are already present in the
> > watch_queue so that the Watch thread does not continuously add
> > unneeded files to the watch_queue to be processed. For
> > example...Watch() finds a file to be processed and adds it to the
> > queue. The main thread sees the file on the queue and pops it off
> > and begins processing. Now the file has been removed from the
> > watch_queue, and Watch() thread has no way of knowing that the
> > other Worker() thread is processing it, and shouldn't pick it up
> > again. So it will see the file as new and add it to the queue
> > again. PS.. The file is deleted from the watch folder after it has
> > finished processing, so that's how I'll know which files to process
> > in the long term.
>
> I would suggest something like the following in the watch thread:
>
>     seen_files = {}
>
>     while True:
>         # look for new files
>         for name in os.listdir(folder):
>             if name not in seen_files:
>                 process_queue.add(name)
>             seen_files[name] = True
>
>         # forget any missing files and mark the others as not seen,
>         # ready for next time
>         seen_files = dict((name, False) for name, seen in
>                           seen_files.items() if seen)
>
>         time.sleep(1)

Hmm, this wouldn't work. It's not thread safe and the last line before
you sleep doesn't make any sense.
--
http://mail.python.org/mailman/listinfo/python-list
Re: Threaded Design Question
> approach. That sounds the easiest, although I'm still interested in
> any idioms or other proven approaches for this sort of thing.
>
> ~Sean

Idioms certainly have their place, but in the end you want clear,
correct code. In the case of multi-threaded programming,
synchronization adds complexity, both in code and concepts, so
figuring out a clean design that uses message passing tends to be
clearer and more robust.

Most idioms are just patterns to which somebody found a simple, robust
solution, so if you try to think of a simple, robust solution, you're
probably doing it right. Especially in trivial cases like the one
above.

Justin
--
http://mail.python.org/mailman/listinfo/python-list
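The message-passing style recommended above can be sketched with nothing but the stdlib (the names and the `.upper()` stand-in are illustrative, not from the thread): the worker owns no shared state, everything arrives and leaves through queues, and a `None` sentinel shuts it down:

```python
import queue      # this module was called Queue at the time of this thread
import threading

def worker(jobs, results):
    # No shared state, no locks: all communication is via the two queues.
    while True:
        item = jobs.get()
        if item is None:            # sentinel: shut down cleanly
            break
        results.put(item.upper())   # stand-in for real file processing

jobs, results = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(jobs, results))
t.start()
for name in ["a.txt", "b.txt"]:
    jobs.put(name)
jobs.put(None)                      # tell the worker we're done
t.join()
```

Because only queue operations cross the thread boundary, correctness doesn't depend on where the scheduler preempts the worker.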
Re: The Future of Python Threading
On Aug 10, 3:57 am, Steve Holden <[EMAIL PROTECTED]> wrote:
> Justin T. wrote:
> > Hello,
> >
> > While I don't pretend to be an authority on the subject, a few days
> > of research has led me to believe that a discussion needs to be
> > started (or continued) on the state and direction of multi-threading
> > python.
> [...]
> > What these seemingly unrelated thoughts come down to is a perfect
> > opportunity to become THE next generation language. It is already
> > far more advanced than almost every other language out there. By
> > integrating stackless into an architecture where tasklets can be
> > divided over several parallelizable threads, it will be able to
> > capitalize on performance gains that will have people using python
> > just for its performance, rather than that being the excuse not to
> > use it.
>
> Aah, the path to world domination. You know you don't *have* to use
> Python for *everything*.

True, but Python seems to be the *best* place to tackle this problem,
at least to me. It has a large pool of developers, a large standard
library, it's evolving, and it's a language I like :). Languages that
seamlessly support multi-threaded programming are coming, as are
extensions that make it easier on every existing platform. Python has
the opportunity to lead that change.

> Be my guest, if it's so simple.

I knew somebody was going to say that! I'm pretty busy, but I'll see
if I can find some time to look into it.

> I doubt that a thread on c.l.py is going to change much. It's the
> python-dev and py3k lists where you'll need to take up the cudgels,
> because I can almost guarantee nobody is going to take the GIL out of
> 2.6 or 2.7.

I was hoping to get a constructive conversation on what the structure
of a multi-threaded python would look like. It would appear that this
was not the place for that.

> Is it even possible to run threads of the same process at different
> priority levels on all platforms?

No, it's not, and even fewer platforms allow the scheduler to change
the priority dynamically. Linux, however, is one that does.
--
http://mail.python.org/mailman/listinfo/python-list
Re: The Future of Python Threading
On Aug 10, 3:52 am, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> On Fri, 10 Aug 2007 10:01:51 -, "Justin T." <[EMAIL PROTECTED]>
> wrote:
> > Hello,
> >
> > While I don't pretend to be an authority on the subject, a few days
> > of research has led me to believe that a discussion needs to be
> > started (or continued) on the state and direction of multi-threading
> > python.
> >
> > [snip - threading in Python doesn't exploit hardware level
> > parallelism well, we should incorporate stackless and remove the GIL
> > to fix this]
>
> I think you have a misunderstanding of what greenlets are. Greenlets
> are essentially a non-preemptive user-space threading mechanism. They
> do not allow hardware level parallelism to be exploited.

I'm not an expert, but I understand that much. What greenlets do is
force the programmer to think about concurrent programming. It doesn't
force them to think about real threads, which is good, because a
computer should take care of that for you. Greenlets are nice because
they can run concurrently, but they don't have to. This means you can
safely divide them up among many threads. You could not safely do this
with just any old python program.

> > There has been much discussion on this in the past [2]. Those
> > discussions, I feel, were premature. Now that stackless is mature
> > (and continuation free!), Py3k is in full swing, and parallel
> > programming has been fully realized as THE next big problem for
> > computer science, the time is ripe for discussing how we will
> > approach multi-threading in the future.
>
> Many of the discussions rehash the same issues as previous ones. Many
> of them are started based on false assumptions or are discussions
> between people who don't have a firm grasp of the relevant issues.

That's true, but there are actually a lot of good ideas in there as
well.

> I don't intend to suggest that no improvements can be made in this
> area of Python interpreter development, but it is a complex issue and
> cheerleading will only advance the cause so far. At some point,
> someone needs to write some code. Stackless is great, but it's not
> the code that will solve this problem.

Why not? It doesn't solve it on its own, but it's a pretty good start
towards something that could.
--
http://mail.python.org/mailman/listinfo/python-list
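The cooperative half of the tasklet model being debated here can be illustrated with plain generators (this is only a sketch of the scheduling idea, not of stackless itself): each `yield` plays the role of a voluntary switch point, and a round-robin scheduler decides who runs next.

```python
from collections import deque

def scheduler(tasklets):
    """Round-robin over generator-based 'tasklets': every yield is a
    cooperative switch point, roughly analogous to a stackless
    schedule() call. Returns the interleaved execution trace."""
    ready = deque(tasklets)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            trace.append(next(task))  # run the tasklet to its next yield
            ready.append(task)        # still alive: rotate to the back
        except StopIteration:
            pass                      # tasklet finished
    return trace

def tasklet(name, steps):
    for i in range(steps):
        yield (name, i)

order = scheduler([tasklet("a", 2), tasklet("b", 2)])
print(order)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

Because switches happen only at `yield`, no locking is needed within a single scheduler thread, which is exactly the property under discussion; distributing such tasklets across real threads is where the hard problems start.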
The Future of Python Threading
Hello,

While I don't pretend to be an authority on the subject, a few days of
research has led me to believe that a discussion needs to be started
(or continued) on the state and direction of multi-threading python.

Python is not multi-threading friendly. Any code that deals with the
python interpreter must hold the global interpreter lock (GIL). This
has the effect of serializing (to a certain extent) all python-specific
operations. That is, any thread that is written purely in python will
not release the GIL except at particular (and possibly non-optimal)
times. Currently that's the rather arbitrary quantum of 100 bytecode
instructions. Since the ability of the OS to schedule python threads
is based on when it's possible to run that thread (according to the
lock), python threads do not benefit from a good scheduler in the same
manner that real OS threads do, even though python threads are
supposed to be a thin wrapper around real OS threads [1].

The detrimental effects of the GIL have been discussed several times
and nobody has ever done anything about it. This is because the GIL
isn't really that bad right now. The GIL isn't held that much, and
pthreads spawned by python-C interactions (i.e., those that reside in
extensions) can do all their processing concurrently as long as they
aren't dealing with python data. What this means is that python
multithreading isn't really broken as long as python is thought of as
a convenient way of manipulating C. After all, 100 bytecode
instructions go by pretty quickly, so the GIL isn't really THAT
invasive.

Python, however, is much better than a convenient method of
manipulating C. Python provides a simple language which can be
implemented in any way, so long as promised behaviors continue. We
should take advantage of that. The truth is that the future (and
present reality) of almost every form of computing is multi-core, and
there currently is no effective way of dealing with concurrency.
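The quantum described above is visible (and tunable) from Python. On interpreters of this era it is a bytecode count exposed via `sys.setcheckinterval()`, defaulting to 100; later interpreters replaced the count with a time slice, so this sketch probes for both:

```python
import sys

# How often the interpreter offers to release the GIL. Older
# interpreters count bytecode instructions (default 100); newer ones
# use a time slice in seconds instead.
if hasattr(sys, "getswitchinterval"):
    quantum = sys.getswitchinterval()   # seconds, on newer interpreters
else:
    quantum = sys.getcheckinterval()    # bytecodes, the 100 cited above
print(quantum)
```

Raising the check interval reduces lock churn at the cost of latency for the other threads, which is one small, existing knob on the problem described here.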
We still worry about setting up threads, synchronization of message
queues, synchronization of shared memory regions, dealing with
asynchronous behaviors, and most importantly, how threaded an
application should be. All of this is possible to do manually in C,
but it's hardly optimal. For instance, at compile time you have no
idea if your library is going to be running on a machine with 1
processor or 100. Knowing that makes a huge difference in
architecture, as 200 threads might run fine on the 100-core machine
where they might thrash the single processor to death. Thread pools
help, but they need to be set up and initialized, and there are very
few good thread pool implementations that are meant for generic use.

It is my feeling that there is no better way of dealing with dynamic
threading than to use a dynamic language. Stackless python has proven
that clever manipulation of the stack can dramatically improve
concurrent performance in a single thread. Stackless revolves around
tasklets, which are a nearly universal concept. For those who don't
follow experimental python implementations, stackless essentially
provides an integrated scheduler for "green threads" (tasklets), or
extremely lightweight snippets of code that can be run concurrently.
It even provides a nice way of messaging between tasklets.

When you think about it, lots of object oriented code can be organized
as tasklets. After all, encapsulation provides an environment where
side effects of running functions can be minimized, and is thus
somewhat easily parallelized (with respect to other objects).
Functional programming is, of course, ideal, but it's hardly the
trendy thing these days. Maybe that will change when people realize
how much easier it is to test and parallelize.

What these seemingly unrelated thoughts come down to is a perfect
opportunity to become THE next generation language. It is already far
more advanced than almost every other language out there.
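A generic pool of the kind lamented above is only a few dozen lines in python, which is part of the argument being made. A minimal sketch (the class name and sizing policy are illustrative): N workers pull callables off one shared queue, and the pool sizes itself to the machine it lands on, which is exactly the decision a C library has to freeze at compile time.

```python
import os
import queue
import threading

class ThreadPool:
    """N worker threads pulling callables off a single shared queue."""

    def __init__(self, size=None):
        # Size the pool at runtime to the machine we actually got.
        self.size = size or os.cpu_count() or 1
        self.jobs = queue.Queue()
        self.threads = [threading.Thread(target=self._run)
                        for _ in range(self.size)]
        for t in self.threads:
            t.start()

    def _run(self):
        while True:
            job = self.jobs.get()
            if job is None:       # sentinel: this worker is done
                return
            job()

    def submit(self, fn):
        self.jobs.put(fn)

    def close(self):
        # One sentinel per worker; FIFO order guarantees all submitted
        # jobs are consumed before any worker sees its sentinel.
        for _ in self.threads:
            self.jobs.put(None)
        for t in self.threads:
            t.join()
```

The point is not that this sketch is production quality, but that the sizing decision becomes a one-line runtime choice in a dynamic language.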
By integrating stackless into an architecture where tasklets can be
divided over several parallelizable threads, it will be able to
capitalize on performance gains that will have people using python
just for its performance, rather than that being the excuse not to use
it.

The nice thing is that this requires a fairly doable amount of work.
First, stackless should be integrated into the core. Then there should
be an effort to remove the reliance on the GIL for python threading.
After that, advanced features like moving tasklets amongst threads
should be explored. I can imagine a world where a single python web
application is able to redistribute its millions of requests amongst
thousands of threads without the developer ever being aware that the
application would eventually scale. An efficient and natively
multi-threaded implementation of python will be invaluable as cores
continue to multiply like rabbits.

There has been much discussion on this in the past [2]. Those
discussions, I feel, were premature. Now that stackless is mature (and
continuation free!), Py3k is in full swing, and parallel programming
has been fully realized as THE next big problem for computer science,
the time is ripe for discussing how we will approach multi-threading
in the future.

Justin
--
http://mail.python.org/mailman/listinfo/python-list
Re: The Future of Python Threading
On Aug 10, 2:02 pm, [EMAIL PROTECTED] (Luc Heinrich) wrote:
> Justin T. <[EMAIL PROTECTED]> wrote:
> > What these seemingly unrelated thoughts come down to is a perfect
> > opportunity to become THE next generation language.
>
> Too late: <http://www.erlang.org/>
>
> :)
>
> --
> Luc Heinrich

Uh oh, my ulterior motives have been discovered!

I'm aware of Erlang, but I don't think it's there yet. For one thing,
it's not pretty enough. It also doesn't have the community support
that a mainstream language needs. I'm not saying it'll never be
adequate, but I think that making python into an Erlang competitor
while maintaining backwards compatibility with the huge amount of
already written python software will make python a very formidable
choice as languages adopt more and more multi-core support. Python is
in a unique position, as it's actually a flexible enough language to
adapt to a multi-threaded environment without resorting to terrible
hacks.

Justin
--
http://mail.python.org/mailman/listinfo/python-list
Re: The Future of Python Threading
On Aug 10, 10:34 am, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> > I'm not an expert, but I understand that much. What greenlets do is
> > force the programmer to think about concurrent programming. It
> > doesn't force them to think about real threads, which is good,
> > because a computer should take care of that for you. Greenlets are
> > nice because they can run concurrently, but they don't have to.
> > This means you can safely divide them up among many threads. You
> > could not safely do this with just any old python program.
>
> There may be something to this. On the other hand, there's no
> _guarantee_ that code written with greenlets will work with
> pre-emptive threading instead of cooperative threading. There might
> be a tendency on the part of developers to try to write code which
> will work with pre-emptive threading, but it's just that - a mild
> pressure towards a particular behavior. That's not sufficient to
> successfully write correct software (where "correct" in this context
> means "works when used with pre-emptive threads", of course).

Agreed. Stackless does include a preemptive mode, but if you don't use
it, then you don't need to worry about locking at all. It would be
quite tricky to get around this, but I don't think it's impossible.
For instance, you could just automatically lock anything that was not
a local variable. Or, if you required all tasklets in one object to
run in one thread, then you would only have to auto-lock globals.

> One also needs to consider the tasks necessary to really get this
> integration done. It won't change very much if you just add greenlets
> to the standard library. For there to be real consequences for real
> programmers, you'd probably want to replace all of the modules which
> do I/O (and maybe some that do computationally intensive things) with
> versions implemented using greenlets. Otherwise you end up with a
> pretty hard barrier between greenlets and all existing software that
> will probably prevent most people from changing how they program.

If the framework exists to efficiently multi-thread python, I assume
that the module maintainers will slowly migrate over if there is a
performance benefit there.

> Then you have to worry about the other issues greenlets introduce,
> like invisible context switches, which can make your code which
> _doesn't_ use pre-emptive threading broken.

Not breaking standard python code would definitely be priority #1 in
an experiment like this. I think that by making the changes at the
core we could achieve it. A standard program, after all, is just 1
giant tasklet.

> All in all, it seems like a wash to me. There probably isn't
> sufficient evidence to answer the question definitively either way,
> though. And trying to make it work is certainly one way to come up
> with such evidence. :)

::Sigh:: I honestly don't see myself having time to really do anything
more than experiment with this. Perhaps I will try to do that though.
Sometimes I do grow bored of my other projects. :)

Justin
--
http://mail.python.org/mailman/listinfo/python-list