Re: Things to consider for Upcoming releases

Kelven Yang Mon, 13 Aug 2012 18:35:05 -0700

Job heartbeat (progress reporting, etc.), job expiration, job cancellation,
and job throttling will be improved in the new architecture.


Kelven

On 8/13/12 4:46 AM, "Suresh Sadhu" <suresh.sa...@citrix.com> wrote:

>
>Including a few more points.
>
>Hi All,
>
>As I understand it, the upcoming releases involve major architecture
>changes. It would be good to consider the following items, since they
>would help QA, Support, and customers, and would also reduce the number
>of support calls.
>
>Please feel free to add any data points I have missed or any further
>improvements to the list below, and kindly correct me if my assumptions
>or views are wrong.
>
>-
>
>Job in waiting state
>*****************
>We don't set a time limit for job completion, because we don't know how
>long a particular job will take. But because of this design, if an early
>job goes into an infinite loop, the other jobs are queued and wait for
>that first job to finish.
>
>The only way out of this situation is to manually update the status
>field in the DB.
>
>Is there a better way to overcome this problem? Please share your views
>and thoughts.
>
>My thought:
>If we make job priority and job waiting period configurable parameters,
>the end user can set or update both based on his needs, so that even
>when one job is stuck in the waiting state, the other waiting jobs are
>triggered based on priority.
>
>In the current design, if a job is in the waiting state, the end user
>can't stop it. With configurable parameters, a job in the waiting (hung)
>state could be taken out once the configured duration has expired.
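To make the proposal above concrete, here is a minimal sketch of a single-worker queue with a configurable job timeout and a per-job priority. Everything in it is hypothetical: the `JobQueue` class, the `job_timeout_s` parameter, and the "lower number runs first" rule are assumptions for illustration, not CloudStack code or its planned design.

```python
import heapq
import itertools


class JobQueue:
    """Hypothetical single-worker queue with priority and a job timeout."""

    def __init__(self, job_timeout_s):
        self.job_timeout_s = job_timeout_s   # configurable expiry period
        self._heap = []                      # entries are (priority, seq, name)
        self._seq = itertools.count()        # keeps FIFO order within a priority
        self.running = None                  # (name, started_at) or None
        self.expired = []                    # names of jobs that timed out

    def submit(self, name, priority):
        # Lower number means higher priority (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def finish(self):
        # Called when the running job completes normally.
        self.running = None

    def tick(self, now):
        """Expire a stuck job, then start the next waiting job if idle."""
        if self.running is not None:
            name, started_at = self.running
            if now - started_at >= self.job_timeout_s:
                self.expired.append(name)    # give up instead of blocking the queue
                self.running = None
        if self.running is None and self._heap:
            _, _, name = heapq.heappop(self._heap)
            self.running = (name, now)
        return self.running[0] if self.running else None
```

Calling `tick(now)` periodically with the current time means a hung job blocks the queue for at most `job_timeout_s` seconds, after which the highest-priority waiting job starts without anyone editing the DB by hand.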
>
>
>
>Issue no# http://bugs.cloud.com/show_bug.cgi?id=12061
>
>Job failure/retry mechanism:
>********************
>If a job fails because of an exception, we don't retry it after some
>time.
>
>For example:
>[It's not an accurate example, but it gives some idea.]
>
>In the VMware case, you can't take snapshots of the root disk and the
>data disk of a VM at the same time. If you trigger snapshots on both
>disks at the same time, the first request succeeds and the second
>request fails with a proper limitation message.
>
>The end user then has to initiate the snapshot on the other disk (i.e.
>the data disk) again.
>
>My thought:
>It would be good to keep the failed job in the queue, so that once the
>first job completes, the job manager picks up the waiting (failed) job
>from the queue and processes it.
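The re-queue idea above could look roughly like the sketch below. It is an illustration only: `ResourceBusy`, `run_with_requeue`, and the `max_attempts` limit are hypothetical names introduced here, not existing CloudStack APIs. A job that fails because its resource is busy goes to the back of the queue and is tried again after the jobs ahead of it have run.

```python
from collections import deque


class ResourceBusy(Exception):
    """Raised when a job cannot run because the resource is in use."""


def run_with_requeue(jobs, max_attempts=3):
    """Run jobs in order; re-queue any job that fails with ResourceBusy.

    `jobs` is a list of (name, callable) pairs.
    Returns (done, failed) as two lists of job names.
    """
    queue = deque((name, fn, 1) for name, fn in jobs)
    done, failed = [], []
    while queue:
        name, fn, attempt = queue.popleft()
        try:
            fn()
            done.append(name)
        except ResourceBusy:
            if attempt < max_attempts:
                # Put the job back so it runs after the jobs ahead of it.
                queue.append((name, fn, attempt + 1))
            else:
                failed.append(name)          # give up after max_attempts
    return done, failed
```

Only `ResourceBusy` is retried here; any other exception propagates, on the assumption that retrying makes sense only for the "resource is temporarily in use" case described above. Capping the attempts keeps a permanently failing job from circulating forever.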
>
>Issue no# http://bugs.cloud.com/show_bug.cgi?id=11531
>
>
>Please feel free to add a few more data points here.
>
>Usability in terms of UI refresh:
>************************
>CS still has a caching issue unless you manually click the refresh
>button. Sometimes you still see the cached values.
>
>
>Issue no# http://bugs.cloudstack.org/browse/CS-14988
>
>
>Error and exception handling, and coordination between tasks on the same
>resource
>***************************************************************
>I don't have many data points. If anybody has some, please share your
>views.
>
>But I will give one example:
>
>Problem:
>
>Power on a stopped VM and at the same time take a snapshot of its root
>disk: this fails. (Deploy VM failed with a lock problem and a
>java.lang.Exception occurred, but the snapshot job completed
>successfully. I tried startVM again and this time it deployed
>successfully.) Please check the attached log and execution logs.
>
>Limitation:
>
>This is not a problem under the current architecture. We currently don't
>coordinate tasks; we throw runtime errors instead. While a snapshot is
>being taken, VM operations may be temporarily unavailable to the user,
>and the user needs to retry.
>
>
>Also, for HA: is the CloudStack HA/VMSync behavior (implementation)
>going to be the same for all hypervisors, or does the functionality stay
>as it is (no change to the existing functionality) in the upcoming
>release?
>
>
>
>Regards
>
>Sadhu
