This is not supposed to be from 'Outlook', haha. Using my adjusted mail account so that it doesn't go to spam on the receiving end due to DMARC/DKIM issues...
My fault ;) -Josh

On Jul 10, 2014, at 12:51 PM, Outlook <harlo...@outlook.com> wrote:

> On Jul 10, 2014, at 3:48 AM, Yuriy Taraday <yorik....@gmail.com> wrote:
>
>> On Wed, Jul 9, 2014 at 7:39 PM, Clint Byrum <cl...@fewbar.com> wrote:
>> Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:
>> > On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow <harlo...@yahoo-inc.com> wrote:
>> >
>> > > I think Clint's response was likely better than what I can write
>> > > here, but I'll add on a few things.
>> > >
>> > > > How do you write such code using taskflow?
>> > > >
>> > > >     @asyncio.coroutine
>> > > >     def foo(self):
>> > > >         result = yield from some_async_op(...)
>> > > >         return do_stuff(result)
>> > >
>> > > The idea (at a very high level) is that users don't write this.
>> > >
>> > > What users do write is a workflow, maybe the following (pseudocode):
>> > >
>> > >     # Define the pieces of your workflow.
>> > >
>> > >     class TaskA(task.Task):
>> > >         def execute(self):
>> > >             # Do whatever some_async_op did here.
>> > >
>> > >         def revert(self, **kwargs):
>> > >             # If execute had any side-effects, undo them here.
>> > >
>> > >     class TaskFoo(task.Task):
>> > >         ...
>> > >
>> > >     # Compose them together.
>> > >
>> > >     flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
>> > >                                             TaskFoo("my-foo"))
>> >
>> > I wouldn't consider this composition very user-friendly.
>
> So just to make this understandable: the above is a declarative structure
> of the work to be done. I'm pretty sure it's generally agreed[1] in the
> programming world that when declarative structures can be used they should
> be (imho OpenStack should also follow the same pattern more than it
> currently does). The above is a declaration of the work to be done and the
> ordering constraints that must be followed. It's just one of X ways to do
> this (feel free to contribute other variations of these 'patterns' @
> https://github.com/openstack/taskflow/tree/master/taskflow/patterns).
>
> [1] http://latentflip.com/imperative-vs-declarative/ (and many many others).
>
>> I find it extremely user friendly when I consider that it gives you
>> clear lines of delineation between "the way it should work" and "what
>> to do when it breaks."
>>
>> So does plain Python. But with plain Python you don't have to explicitly
>> use graph terminology to describe the process.
>
> I'm not sure where in the above you saw graph terminology. All I see there
> is a declaration of a pattern that explicitly says to run things one after
> the other (linearly).
>
>> > > # Submit the workflow to an engine, and let the engine do the work
>> > > # to execute it (and transfer any state between tasks as needed).
>> > >
>> > > The idea here is that when things like this are declaratively
>> > > specified, the only thing that matters is that the engine respects
>> > > that declaration; not whether it uses asyncio, eventlet, pigeons,
>> > > threads, or remote workers[1]. It also adds some things that are not
>> > > (imho) possible with coroutines (in part since they are at such a low
>> > > level), like stopping the engine after 'my-task-a' runs, shutting off
>> > > the software, upgrading it, restarting it and then picking back up at
>> > > 'my-foo'.
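>
> To make that "submit to an engine" step concrete, here's a minimal sketch
> (untested; the task bodies above are still pseudocode) of handing the
> declared flow to an engine and letting it pick the execution strategy:
>
>     from taskflow import engines
>
>     # The engine, not the user, decides how the tasks actually run
>     # (serial, threads, workers, ...); the declaration stays the same.
>     engines.run(flow)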
>> > It's absolutely possible with coroutines and might provide an even
>> > clearer view of what's going on. Like this:
>> >
>> >     @asyncio.coroutine
>> >     def my_workflow(ctx, ...):
>> >         project = yield from ctx.run_task(create_project())
>> >         # Hey, we don't want to be linear. How about parallel tasks?
>> >         volume, network = yield from asyncio.gather(
>> >             ctx.run_task(create_volume(project)),
>> >             ctx.run_task(create_network(project)),
>> >         )
>> >         # We can put anything here - why not branch a bit?
>> >         if create_one_vm:
>> >             yield from ctx.run_task(create_vm(project, network))
>> >         else:
>> >             # Or even loops - why not?
>> >             for i in range(network.num_ips()):
>> >                 yield from ctx.run_task(create_vm(project, network))
>>
>> Sorry, but the code above is nothing like the code that Josh shared. When
>> create_network(project) fails, how do we revert its side effects? If we
>> want to resume this flow after a reboot, how does that work?
>>
>> I understand that there is a desire to write everything in beautiful
>> python yields, try's, finally's, and excepts. But the reality is that
>> python's stack is lost the moment the process segfaults, power goes out
>> on that PDU, or the admin rolls out a new kernel.
>>
>> We're not saying "asyncio vs. taskflow". I've seen that mistake twice
>> already in this thread. Josh and I are suggesting that if there is a
>> movement to think about coroutines, there should also be some time spent
>> thinking at a high level: "how do we resume tasks, revert side effects,
>> and control flow?"
>>
>> If we embed taskflow deep in the code, we get those things, and we can
>> treat tasks as coroutines and let taskflow's event loop be asyncio just
>> the same. If we embed asyncio deep into the code, we don't get any of
>> the high level functions and we get just as much code churn.
>
> +1, the declaration of what to do is not connected to how to do it. IMHO
> most projects (maybe outside of the underlying core ones; this could
> include oslo.messaging, since it's an RPC layer at the lowest level) don't
> care about anything but the workflow that they want to have execute
> successfully or not. To me that implies that those projects only care
> about said declaration (this seems to be a reasonable assumption, as
> pretty much all of OpenStack's APIs are just workflows around a driver API
> that abstracts away vendor/open-source solutions).
>
>> > There's no limit to coroutine usage. The only problem is the library
>> > that would bind everything together.
>> > In my example run_task will have to be really smart, keeping track of
>> > all started tasks, results of all finished ones, skipping all tasks
>> > that have already been done (and substituting already generated
>> > results).
>> > But all of this is doable. And I find this way of declaring workflows
>> > way more understandable than whatever it would look like with
>> > Flow.add's.
>>
>> The way the flow is declared is important, as it leads to more isolated
>> code. The single place where the flow is declared in Josh's example means
>> that the flow can be imported, the state deserialized and inspected,
>> and resumed by any piece of code: an API call, a daemon start up, an
>> admin command, etc.
>>
>> I may be wrong, but it appears to me that the context that you built in
>> your code example is hard, maybe impossible, to resume after a process
>> restart unless _every_ task is entirely idempotent and thus can just be
>> repeated over and over.
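>>
>> To illustrate what "resumed by any piece of code" buys you, something
>> along these lines should be possible (a sketch from memory, not tested;
>> it assumes a persistence backend was configured and that make_flow,
>> book_id and flow_id are yours):
>>
>>     import contextlib
>>
>>     from taskflow import engines
>>     from taskflow.persistence import backends
>>
>>     backend = backends.fetch({'connection': 'sqlite:///flows.db'})
>>     with contextlib.closing(backend.get_connection()) as conn:
>>         book = conn.get_logbook(book_id)
>>     flow_detail = book.find(flow_id)
>>     # Re-declare the same flow, attach the stored state, and resume.
>>     engine = engines.load(make_flow(), flow_detail=flow_detail,
>>                           book=book, backend=backend)
>>     engine.run()  # picks up where the dead process left off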
It should do smth like this (yes, I'm >> better in Python than English): >> >> @asyncio.coroutine >> def run_task(self, task): >> task_id = yield from self.register_task(task) >> res = yield from self.get_stored_result(task_id) >> if res is not None: >> return res >> try: >> res = yield from task >> except Exception as exc: >> yield from self.store_error(task_id, exc) >> raise exc >> else: >> yield from self.store_result(task_id, res) >> return res >> > > I think you just created > https://github.com/openstack/taskflow/blob/master/taskflow/engines/action_engine/runner.py#L55 > ;-) > > Except nothing in the above code says about about 'yield from', or other, it > just is a run loop that schedules things (currently using > concurrent.futures), waits for them to complete and then does things with > there results (aka the completion part). I even had a note about the 1 place > (potentially more) that asyncio would plug-in: > https://github.com/openstack/taskflow/blob/master/taskflow/engines/action_engine/runner.py#L95 > (since asyncio has a concurrent.futures equivalent, in fact exactly the > same, which imho is a peculiarity of the python stdlib right now, aka the > nearly identical code bases that are asyncio.futures and concurrent.futures). > >> So it won't run any task more then once (unless smth very bad happens >> between task run and DB update, but noone is safe from that) and it will >> remember all errors that happened before. >> On the outer level all tasks will get stored in the context and if some >> error occurs, they will just get reverted e.g. by calling their revert() >> method in reverse loop. >> >> how do we resume tasks, revert side effects, and control flow? >> >> This would allow to resume tasks, revert side effects on error and provide >> way better way of describing control flow (plain Python instead new .add() >> language). >> Declaration of control flow is also just a Python method, nothing else. So >> you can import it anywhere, start the same task on one node, stop it and >> continue running it on another one. >> >> I'm not suggesting that taskflow is useless and asyncio is better (apple vs >> oranges). I'm saying that using coroutines (asyncio) can improve ways we can >> use taskflow and provide clearer method of developing these flows. >> This was mostly response to the "this is impossible with coroutines". I say >> it is possible and it can even be better. >> -- > > Agreed, to me it's a comparison of an engine (asyncio) and a car (taskflow), > the car has a fuel-tank (the storage layer), a body & wheels (the workflow > structure), a transmission (the runner from above) and also an engine > (asyncio or other...). So the comparison to me isn't a comparison about > better or worse since you can't compare an engine to a car. I totally expect > the engine/s to be able to be asyncio based (as pointed out in clints > response); even if it's right now concurrent.futures based; I think we can > have more of a detailed specification here on how that would work but > hopefully the idea is more clear now. > > The following might help visualize this (for those that are more visual > types): https://wiki.openstack.org/wiki/File:Core.png > > Anyways other peculiarities that I believe coroutines would have, other > thoughts would be interesting to hear... > > 1. How would a coroutine structure handle resumption? 
>> "how do we resume tasks, revert side effects, and control flow?"
>>
>> This would allow us to resume tasks, revert side effects on error and
>> provide a far better way of describing control flow (plain Python instead
>> of a new .add() language).
>> The declaration of control flow is also just a Python method, nothing
>> else. So you can import it anywhere, start the same task on one node,
>> stop it and continue running it on another one.
>>
>> I'm not suggesting that taskflow is useless and asyncio is better (apples
>> vs. oranges). I'm saying that using coroutines (asyncio) can improve the
>> ways we can use taskflow and provide a clearer method of developing these
>> flows.
>> This was mostly a response to the "this is impossible with coroutines". I
>> say it is possible and it can even be better.
>
> Agreed; to me it's a comparison of an engine (asyncio) and a car
> (taskflow): the car has a fuel-tank (the storage layer), a body & wheels
> (the workflow structure), a transmission (the runner from above) and also
> an engine (asyncio or other...). So the comparison to me isn't a
> comparison about better or worse, since you can't compare an engine to a
> car. I totally expect the engine/s to be able to be asyncio based (as
> pointed out in Clint's response), even if it's right now
> concurrent.futures based; I think we can have more of a detailed
> specification here on how that would work, but hopefully the idea is more
> clear now.
>
> The following might help visualize this (for those that are more visual
> types): https://wiki.openstack.org/wiki/File:Core.png
>
> Anyways, other peculiarities that I believe coroutines would have; other
> thoughts would be interesting to hear...
>
> 1. How would a coroutine structure handle resumption? When the call-graph
> is the structure (aka, in taskflow terminology, a 'pattern') and a
> coroutine does a 'yield from' in the middle of its execution (to, say,
> start another section of the workflow) and then the machine running this
> work dies, upon restart how does the running coroutine (run_task, for
> example, could be the thing executing all this) get back into the middle
> of that coroutine? Since a coroutine can have multiple yield (exit)
> points (and can also store local state in variables and other objects),
> how would run_task get back into one of those with the prior state? This
> seems problematic if the workflow structure is the coroutine call-graph
> structure. If the solution is to have coroutines chained together with no
> multiple yield points allowed (or only 1 yield point allowed) then it
> seems you just recreated the existing taskflow pattern ;)
>
> a) This is imho an even bigger problem when you start running those
> coroutines on remote machines. Since taskflow has awareness of the
> declared workflow (not necessarily implemented by a graph, which is why
> the core.png has a 'cloud' around the compilation), it can request that
> remote workers[2] execute the components of that workflow (and receive
> their results and repeat...) in parallel. I don't understand how the
> call-graph-as-the-structure approach would actually achieve this (if the
> declared workflow is embedded in the call graph there is no visibility
> into it to actually do this). This feature is key to scale imho (and is
> pretty commonly done in other parallel frameworks, via message passing or
> under other names; note that [3, 4] and other libraries do similar
> things).
>
> 2. Introspection; I hope this one is more obvious. When the coroutine
> call-graph is the workflow, there is no easy way to examine it before it
> executes (and change parts of it, for example, before it executes). This
> is a nice feature imho when the workflow is declaratively and explicitly
> defined: you get the ability to do this. This part is key to handling the
> upgrades that typically happen (for example, the 5th task in a workflow
> was upgraded to a newer version; we need to stop the service, shut it
> off, do the code upgrade, restart the service and change the 5th task
> from v1 to v1.1).
>
> 3. Dataflow: tasks in taskflow can declare not just workflow dependencies
> but also dataflow dependencies (this is how tasks transfer things from
> one to another). I suppose the dataflow dependency would map to coroutine
> variables & arguments (except the variables/arguments would need to be
> persisted somewhere so that they can be passed back in on failure of the
> service running that coroutine). How is that possible without an
> abstraction over those variables/arguments (a coroutine can't store these
> things in local variables since those will be lost)? It would seem like
> this would need to recreate the persistence & storage layer[5] that
> taskflow already uses for this purpose.
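>
> For reference, a small sketch of how that dataflow declaration looks
> (create_project/create_network here are made-up helpers): the engine
> persists each task's result and injects it into any later task whose
> execute() signature asks for it by name.
>
>     from taskflow import task
>
>     class CreateProject(task.Task):
>         default_provides = 'project'
>
>         def execute(self):
>             return create_project()
>
>     class CreateNetwork(task.Task):
>         default_provides = 'network'
>
>         # 'project' is a dataflow dependency, satisfied from storage.
>         def execute(self, project):
>             return create_network(project)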
> [2] http://docs.openstack.org/developer/taskflow/workers.html
> [3] http://python-rq.org/
> [4] http://pythonhosted.org/pulsar/apps/tasks.html
> [5] http://docs.openstack.org/developer/taskflow/persistence.html
>
> Anyway, it would be neat to try to make a 'coro' pattern in taskflow,
> have the existing engine from above work with those types of coroutines
> seamlessly (I expect this to actually be pretty easy), and let users who
> want to go down this path try it (as long as they understand the
> drawbacks this imho brings); then this could be an avenue to enabling
> both approaches (either way has its own set of drawbacks and advantages).
> That way the work done for coroutines is in one place and not scattered
> all over, and this then gains all the projects a declarative workflow
> (where those projects feel this is applicable) that they can use to
> reliably (and scalably) execute things.
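>
> (A cheap approximation of that is already possible today; just a sketch,
> and FunctorTask won't drive a real asyncio loop for you, but it keeps the
> declared flow introspectable while the body does whatever it wants:)
>
>     from taskflow import task
>     from taskflow.patterns import linear_flow
>
>     def boot_vm(project, network):
>         # body could internally run coroutines on its own event loop
>         return 'vm-id'
>
>     wf = linear_flow.Flow("boot").add(
>         task.FunctorTask(boot_vm, provides='vm'))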
> -Josh
>
>> Kind regards, Yuriy.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev