Re: [Xen-devel] [PATCH] xen: enable per-VCPU parameter for RTDS
On Mon, Apr 4, 2016 at 9:07 PM, Chong Li wrote:
> Commit f7b87b0745b4 ("enable per-VCPU parameter for RTDS") introduced
> a bug: it made it possible, in Credit and Credit2, when doing domain
> or vcpu parameters' manipulation, to leave the hypervisor with a
> spinlock held and interrupts disabled.
>
> Fix it.
>
> Signed-off-by: Chong Li
> Acked-by: Dario Faggioli

I'm wondering if the title "xen: enable per-VCPU parameter for RTDS" is suitable for this patch, although I don't have a better title. The title in my mind is: "xen: fix incorrect lock for credit and credit2". I won't fight for this title, though. :-) Probably no need to resend...

Thanks,
Meng
Re: [Xen-devel] xenpm and scheduler
On Mon, Apr 11, 2016 at 10:16 AM, tutu sky wrote:
> hi,
> Does xenpm's 'cpufreq' or 'cpuidle' feature have any effect on scheduling
> decisions?

Please do not cross post. No effect on the RTDS scheduler. May I ask a question: why do you need to consider this?

Thanks,
Meng
Re: [Xen-devel] Xen 4.7 Headline Features (for PR)
On Fri, Apr 22, 2016 at 9:26 AM, Lars Kurth wrote:
> Folks,
>
> given that we are getting close to RCs, I would like to start to
> spec out the headline features for the press release. The big items I am
> aware of are COLO. I am a little confused about xSplice.
>
> Maybe we can use this thread to start collating a short-list.

How about the improved RTDS scheduler:
(1) Changed the RTDS scheduler from a quantum-driven model to an event-driven model;
(2) Support for getting/setting per-VCPU parameters through the RTDS toolstack.

Thanks,
Meng
[Xen-devel] Should we mark RTDS as supported feature from experimental feature?
Hi Dario and all,

When the RTDS scheduler is initialized, it prints out that the scheduler is an experimental feature, with the following lines:

    printk("Initializing RTDS scheduler\n"
           "WARNING: This is experimental software in development.\n"
           "Use at your own risk.\n");

On the RTDS wiki [1], it says the RTDS scheduler is an experimental feature. None of this information has been updated since Xen 4.5. However, inside the MAINTAINERS file, the status of the RTDS scheduler is marked as Supported (refer to commit 28041371 by Dario Faggioli on 2015-06-25).

In my opinion, the RTDS scheduler's functionality is finished and tested. So should I send a patch to change the message printed out when the scheduler is initialized? (A sketch of what such a patch could look like follows this message.) If I understand correctly, the status in the MAINTAINERS file should have the highest priority, and information from other sources should be kept consistent with what the MAINTAINERS file says? Please correct me if I'm wrong.

[1] http://wiki.xenproject.org/wiki/RTDS-Based-Scheduler

Thanks,
Meng
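[Editorial note: for reference, the change being discussed would be a one-hunk patch along these lines, assuming the message still lives in rt_init() in xen/common/sched_rt.c; this is only a sketch, and the exact remaining wording would be up to the maintainers:

    --- a/xen/common/sched_rt.c
    +++ b/xen/common/sched_rt.c
    @@ rt_init @@
    -    printk("Initializing RTDS scheduler\n"
    -           "WARNING: This is experimental software in development.\n"
    -           "Use at your own risk.\n");
    +    printk("Initializing RTDS scheduler\n");
]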
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
>> When the RTDS scheduler is initialized, it will print out that the
>> scheduler is an experimental feature with the following lines:
>>
>>     printk("Initializing RTDS scheduler\n"
>>            "WARNING: This is experimental software in development.\n"
>>            "Use at your own risk.\n");
>>
>> On the RTDS wiki [1], it says the RTDS scheduler is an experimental
>> feature.
>>
> Yes.
>
>> However, inside the MAINTAINERS file, the status of the RTDS scheduler is
>> marked as Supported (refer to commit 28041371 by Dario Faggioli
>> on 2015-06-25).
>>
> There's indeed a discrepancy between the way one can read that bit of
> MAINTAINERS, and what is generally considered Supported (e.g., subject
> to security support, etc).
>
> This is true in general, not only for RTDS (more about this below).

Ah-ha, I see.

>> In my opinion, the RTDS scheduler's functionality is finished and
>> tested. So should I send a patch to change the message printed out
>> when the scheduler is initialized?
>>
> So, yes, the scheduler is now feature complete (with the per-vcpu
> parameters) and adheres to a much more sensible and scalable design
> (event driven). Yet, these features have been merged very recently,
> therefore, when you say "tested", I'm not so sure I agree. In fact, we
> do test it on OSSTest, but only in a couple of tests. The combination
> of these two things make me think that we should allow for at least
> another development cycle, before considering switching.

I see. So should we mark it as Completed for Xen 4.7? Or should we wait until Xen 4.8 to mark it as Completed, if nothing bad happens to the scheduler?

> And speaking of OSSTest, there have been occasional failures, on ARM,
> which I haven't yet found the time to properly analyze. It may be just
> something related to the fact that the specific board was very slow,
> but I'm not sure yet.

Hmm, I see. I plan to have a look at Xen on ARM this summer. When I boot Xen on ARM, I could probably have a look at it as well.

> And even in that case, I wonder how we should handle such a
> situation... I was thinking of adding a work-conserving mode, what do
> you think?

Hmm, I can see why a work-conserving mode is necessary and useful. I'm thinking about the tradeoff between the scheduler's complexity and the benefit brought by introducing that complexity.

The work-conserving mode is useful. However, there are other real-time features of the scheduler that may also be useful. For example, I heard from some company that they want to run RT VMs alongside non-RT VMs, which is supported in the RT-Xen 2.1 version, but not supported in RTDS. There are other RT-related issues we may need to solve to make the scheduler more suitable for the real-time or embedded field, such as protocols to handle shared resources.

Since the scheduler aims at embedded and real-time applications, those RT-related features seem to me more important than the work-conserving feature. What do you think?

> You may have something similar in RT-Xen already but, even
> if you don't, there are a number of ways for achieving that without
> disrupting the real-time guarantees.

Actually, in RT-Xen, we don't have the work-conserving version yet. The work-conserving feature may not affect the real-time guarantees, but it does not bring any improved real-time guarantees in theory. When an embedded system designer wants to use the RTDS scheduler "with the work-conserving feature" (suppose we implement it), he cannot pack more workload into the system by leveraging the work-conserving feature. In practice, the system may run faster than he expects, but he won't know how much faster it will be unless we provide a theoretical guarantee.

> What do you think?

IMHO, handling the other real-time features related to the scheduler may be more important than the work-conserving feature, in order to make the scheduler more adoptable in the embedded world.

>> If I understand correctly, the status in the MAINTAINERS file should have
>> the highest priority and information from other sources should be kept
>> consistent with what the MAINTAINERS file says?
>>
>> Please correct me if I'm wrong.
>>
> This has been discussed before. Have a look at these threads/messages:
>
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg00972.html
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01775.html

I remembered this. Always keep an eye on ARINC653 as well. :-)

> And at this:
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01992.html

Yes. I read this before I asked. :-)

> The feature document template has been put
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
On Tue, Apr 26, 2016 at 4:56 AM, Andrew Cooper wrote:
>>> However, inside MAINTAINERS file, the status of RTDS scheduler is
>>> marked as Supported (refer to commit 28041371 by Dario Faggioli
>>> on 2015-06-25).
>>>
>> There's indeed a discrepancy between the way one can read that bit of
>> MAINTAINERS, and what is generally considered Supported (e.g., subject
>> to security support, etc).
>>
>> This is true in general, not only for RTDS (more about this below).
>
> The purpose of starting the feature docs (in docs/features/) was to
> identify the technical status of a feature, alongside some
> documentation pertinent to its use.
>
> I am tempted to suggest a requirement of "no security support without a
> feature doc" for new features, in an effort to resolve the current
> uncertainty as to what is supported and what is not.

I see. As I said in my reply to Dario, I will add a feature doc about the RTDS scheduler in the summer.

> As for the MAINTAINERS file, supported has a different meaning. From
> the file itself,

Right. I read this doc before asking. :-)

> Descriptions of section entries:
>
> M: Mail patches to: FullName
> L: Mailing list that is relevant to this area
> W: Web-page with status/info
> T: SCM tree type and location. Type is one of: git, hg, quilt, stgit.
> S: Status, one of the following:
>    Supported:  Someone is actually paid to look after this.
>    Maintained: Someone actually looks after it.
>    Odd Fixes:  It has a maintainer but they don't have time to do
>                much other than throw the odd patch in. See below..
>    Orphan:     No current maintainer [but maybe you could take the
>                role as you write your new code].
>    Obsolete:   Old code. Something tagged obsolete generally means
>                it has been replaced by a better system and you
>                should be using that.
>
> Nothing in the MAINTAINERS file constitutes a security statement.

I didn't realize this before. Thank you very much for the clarification!

Meng
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
>> The feature document template has been put together:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01929.html
>>
>> And there are feature documents in tree already.
>>
>> Actually, writing one for RTDS would be a rather interesting and useful
>> thing to do, IMO! :-)
>
> I think it would be helpful to try to spell out what we think are the
> criteria for marking RTDS non-experimental. Reading your e-mail, Dario,
> I might infer the following criteria:
>
> 1. New event-driven code spends most of a full release cycle in the tree
>    being tested
> 2. Better tests in osstest (which ones?)
> 3. A feature doc

I agree with the above three items.

> 4. A work-conserving mode

I think we need to consider item 4 carefully. A work-conserving mode is not a must for real-time schedulers, and it is not the main purpose/goal of the RTDS scheduler.

> #3 definitely sounds like a good idea. #1 is probably reasonable.
>
> I don't think #4 should be a blocker; we have plenty of work-conserving
> schedulers. :-)

Exactly. Actually, the work-conserving feature is not a top priority for real-time applications. The resource-sharing issues, which interact with the scheduler, are more important than the work-conserving "issue" for complex, non-independent real-time applications.

> Regarding #2, did you have specific tests in mind?

I've been thinking about how to confirm the correctness of (RTDS) schedulers. It is actually quite challenging to prove that a scheduler is correct. I'm wondering what the goal of the tests is; it will determine how the scheduler should be tested, IMHO. There are three possible goals, in increasing difficulty: (1) make sure the scheduler won't crash the system; (2) make sure the performance of the scheduler is correct; or (3) prove the scheduler is correct. Which one are we talking about here? (Maybe item 1?)

Thanks,
Meng
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
On Tue, Apr 26, 2016 at 6:49 PM, Dario Faggioli wrote:
> On Tue, 2016-04-26 at 14:38 -0400, Meng Xu wrote:
>> > So, yes, the scheduler is now feature complete (with the per-vcpu
>> > parameters) and adheres to a much more sensible and scalable design
>> > (event driven). Yet, these features have been merged very recently,
>> > therefore, when you say "tested", I'm not so sure I agree. In fact, we
>> > do test it on OSSTest, but only in a couple of tests. The combination
>> > of these two things make me think that we should allow for at least
>> > another development cycle, before considering switching.
>> I see. So should we mark it as Completed for Xen 4.7? or should we
>> wait until Xen 4.8 to mark it as Completed if nothing bad happens to
>> the scheduler?
>>
> We should define the criteria. :-)
>
> In any case, not earlier than 4.8, IMO.
>
>> > And even in that case, I wonder how we should handle such a
>> > situation... I was thinking of adding a work-conserving mode, what do
>> > you think?
>> Hmm, I can get why work-conserving mode is necessary and useful. I'm
>> thinking about the tradeoff between the scheduler's complexity and
>> the benefit brought by introducing complexity.
>>
>> The work-conserving mode is useful. However, there are other real time
>> features in terms of the scheduler that may be also useful. For
>> example, I heard from some company that they want to run RT VM with
>> non-RT VM, which is supported in RT-Xen 2.1 version, but not supported
>> in RTDS.
>>
> I remember that, but I'm not sure what "running a non-RT VM" inside
> RTDS would mean. According to what algorithm these non real-time VMs
> would be scheduled?

A non-RT VM is a VM whose priority is lower than that of any RT VM. The non-RT VMs won't get scheduled until all RT VMs have been scheduled. We can use the same gEDF scheduling policy to schedule the non-RT VMs. (See the sketch after this email for one way such a priority comparison could look.)

> Since you mentioned complexity, adding a work conserving mode should be
> easy enough, and if you allow a VM to be in work conserving mode, and
> have a very small (or even zero) budget, here you are a non real-time
> VM.

OK. I think it depends on what algorithm we want to use for the work-conserving mode. Do you have some algorithm in mind?

>> There are other RT-related issues we may need to solve to make it more
>> suitable for real-time or embedded field, such as protocols to handle
>> the shared resource.
>>
>> Since the scheduler aims for the embedded and real-time applications,
>> those RT-related features seems to me more important than the
>> work-conserving feature.
>>
>> What do you think?
>>
> There always will be new/other features... But that's not the point.

OK.

> What we need, here, is agree on what is the _minimum_ set of them that
> allows us to call the scheduler complete and usable. I think we're
> pretty close, with this work conserving mode I'm talking about the only
> candidate I can think of.

So the point you're raising is that the work-conserving mode is (probably) a must.

>> > You may have something similar in RT-Xen already but, even
>> > if you don't, there are a number of ways for achieving that without
>> > disrupting the real-time guarantees.
>> Actually, in RT-Xen, we don't have the work-conserving version yet.
>>
> Yeah, sorry, I probably was confusing it with the "RT / non-RT" flag.

I see. :-)

Best regards,
Meng
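[Editorial note: a minimal sketch of the RT/non-RT priority comparison described above might look like the following. This is hypothetical code, not taken from RT-Xen or the Xen tree; the `is_rt` field is an assumed addition, and the struct below only stands in for the fields the comparison needs.

    #include <stdbool.h>
    #include <stdint.h>

    typedef int64_t s_time_t;       /* stand-in for Xen's signed time type */

    /* Sketch only: in the real scheduler these fields would live inside
     * the existing struct rt_vcpu. */
    struct rt_vcpu {
        s_time_t cur_deadline;      /* current absolute deadline */
        bool is_rt;                 /* true for RT vCPUs, false otherwise */
    };

    /*
     * Return true if vCPU 'a' has higher scheduling priority than 'b'.
     * An RT vCPU always beats a non-RT one; within the same class,
     * plain gEDF (earlier deadline first) applies.
     */
    static bool rt_vcpu_higher_prio(const struct rt_vcpu *a,
                                    const struct rt_vcpu *b)
    {
        if ( a->is_rt != b->is_rt )
            return a->is_rt;
        return a->cur_deadline < b->cur_deadline;
    }
]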
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
> >> > 4. A work-conserving mode
>> I think we need to consider item 4 carefully. Work-conserving mode
>> is not a must for real-time schedulers and it is not the main
>> purpose/goal of the RTDS scheduler.
>>
> It's indeed not a must for real-time schedulers. In fact, it's only
> important if one wants the system to be overall usable, when using a
> real-time scheduler. :-P
>
> Also, I may be wrong but it should not be too hard to implement...
> I.e., a win-win. :-)

I'm thinking that if we want to implement a work-conserving policy in RTDS, how should we allocate the unused resource to domains? Should this allocation be proportional to the budget/period each domain is configured with? I guess the complexity totally depends on which work-conserving algorithm we want to encode into RTDS.

For example, we can have priority bands: when a VCPU depletes its budget, it goes to a lower priority band. A VCPU in a lower priority band will not be scheduled until all VCPUs in higher priority bands have been scheduled. This policy seems easy to incorporate into RTDS. (But I have to think harder to make sure there is no catch. :-) )

Best,
Meng
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
On Wed, Apr 27, 2016 at 8:27 AM, Dario Faggioli wrote:
> On Tue, 2016-04-26 at 21:16 -0400, Meng Xu wrote:
>> > It's indeed not a must for real-time schedulers. In fact, it's only
>> > important if one wants the system to be overall usable, when using a
>> > real-time scheduler. :-P
>> >
>> > Also, I may be wrong but it should not be too hard to implement...
>> > I.e., a win-win. :-)
>> I'm thinking if we want to implement work-conserving policy in RTDS,
>> how should we allocate the unused resource to domains. Should this
>> allocation be proportional to the budget/period each domain is
>> configured with?
>> I guess the complexity totally depends on which work-conserving
>> algorithm we want to encode into RTDS.
>>
> Indeed it does.
>
> Everything works for me, basically. As you say, it would not be a
> critical aspect of this scheduler, and the implementation details of
> the work conserving mode is not going to be the reason why people
> choose it anyway... It's just to avoid that people runs away from it
> (and from Xen) screaming! :-)

I see. Right! This is a good point.

> So, for instance, how do you manage non real-time VMs in RT-Xen?

RT-Xen is not work-conserving right now. The way we handle non-RT VMs in RT-Xen 2.1 (not the latest version) is that we use another bit in rt_vcpu to indicate whether a VCPU is RT or not. The non-RT VCPUs always have lower priority than the RT VCPUs.

> You say you still use EDF, how do you do that?

When the RT VCPUs have all depleted their budgets, the non-RT VCPUs are scheduled by the gEDF scheduling policy.

> When does one non real-time
> VM preempt another non real-time VM? (Ideally, I'd go and check the RT-
> Xen code that does this myself, but right now, I can't, sorry.)

A non-RT VCPU cannot be scheduled if any RT VCPU still has budget. Once non-RT VCPUs are scheduled, they are preempted/scheduled based on gEDF, since a non-RT VCPU also has budget and period parameters.

> We could go for this that you have already, and as soon as a VM
> exhausts its budget, we demote it to non real-time, until it receives
> the replenishment. Or something like that.

Right. To make it work-conserving, we will have to keep decreasing the priority whenever a VCPU runs out of budget at its current priority, until there is no idle resource in the system any more. (A sketch of what the demotion step could look like follows this email.)

> In this case, we basically get two features at the cost of one (support
> for non real-time VMs and work conserving mode for real-time VMs). Not
> to mention that you basically have the code already, and "only" need to
> upstream it! :-DD

Right. That is true... Let me think about it and send out a design later.

Meng
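[Editorial note: a rough sketch of the demotion step discussed above, in the spirit of what burn_budget() in sched_rt.c already does when the budget hits zero. This is hypothetical code, not a proposed patch; `band` and `RT_MAX_BAND` are assumed new additions, and replenishment would reset `band` back to 0.

    /* Inside burn_budget(), once svc->cur_budget has been decremented: */
    if ( svc->cur_budget <= 0 )
    {
        svc->cur_budget = 0;
        if ( svc->band < RT_MAX_BAND )
            svc->band++;    /* demote to a lower priority band ... */
        else
            __set_bit(__RTDS_depleted, &svc->flags); /* ... or park it */
    }

A vCPU with a larger `band` value would then lose every priority comparison against vCPUs in higher bands, with gEDF ordering applying within a band, so the real-time guarantees of vCPUs that still have budget are untouched.]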
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Wed, May 4, 2016 at 11:51 AM, George Dunlap wrote:
> On 03/05/16 22:46, Dario Faggioli wrote:
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>>
>> In fact, just like it is for Credit2, there is no real
>> need for implementing alloc_pdata and free_pdata.
>>
>> This also made it possible to improve the replenishment
>> timer handling logic, such that now the timer is always
>> kept on one of the pCPU of the scheduler it's servicing.
>> Before this commit, in fact, even if the pCPU where the
>> timer happened to be initialized at creation time was
>> moved to another cpupool, the timer stayed there,
>> potentially inferfearing with the new scheduler of the
>
> * interfering
>
>> pCPU itself.
>>
>> Signed-off-by: Dario Faggioli
>
> I don't know much about the logic, so I'll wait for Meng Xu to review it.

I will look at it this week... (I will try to do it asap...)

Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli wrote:
> The scheduling hooks API is now used properly, and no
> initialization or de-initialization happen in
> alloc/free_pdata any longer.
>
> In fact, just like it is for Credit2, there is no real
> need for implementing alloc_pdata and free_pdata.
>
> This also made it possible to improve the replenishment
> timer handling logic, such that now the timer is always
> kept on one of the pCPU of the scheduler it's servicing.
> Before this commit, in fact, even if the pCPU where the
> timer happened to be initialized at creation time was
> moved to another cpupool, the timer stayed there,
> potentially inferfearing with the new scheduler of the
> pCPU itself.
>
> Signed-off-by: Dario Faggioli
> --

Reviewed-and-Tested-by: Meng Xu

I do have a minor comment about the patch, although it is not important at all and it is not really about this patch...

> @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
> {
>     struct rt_private *prv = rt_priv(ops);
>
> -    kill_timer(prv->repl_timer);
> +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> +           prv->repl_timer->status == TIMER_STATUS_killed);

I found that in xen/timer.h, the comment after the definition of TIMER_STATUS_invalid is:

    #define TIMER_STATUS_invalid  0 /* Should never see this. */

This comment is a little contrary to how the status is used here. Actually, what exactly does "Should never see this" mean? This _invalid status is used in timer.h, and it is the status a timer has before it is initialized by init_timer(). So I'm thinking maybe this comment could be improved to avoid confusion?

Anyway, this is just a comment and should not be a blocker, IMO. I just want to raise it since I saw it... :-)

===About the testing I did===

---Below is how I tested it---
I booted up two vcpus, created one cpupool for each type of scheduler, and migrated them around. The scripts to run the test cases can be found at https://github.com/PennPanda/scripts/tree/master/xen/xen-test

---Below is the testing scenarios---
echo "start test case 1..."
xl cpupool-list
xl cpupool-destroy cpupool-credit
xl cpupool-destroy cpupool-credit2
xl cpupool-destroy cpupool-rtds
xl cpupool-create ${cpupool_credit_file}
xl cpupool-create ${cpupool_credit2_file}
xl cpupool-create ${cpupool_rtds_file}
# Add cpus to each cpupool
echo "Add CPUs to each cpupool"
for ((i=0;i<5; i+=1));do
    xl cpupool-cpu-remove Pool-0 ${i}
done
echo "xl cpupool-cpu-add cpupool-credit 0"
xl cpupool-cpu-add cpupool-credit 0
echo "xl cpupool-cpu-add cpupool-credit2 1,2"
xl cpupool-cpu-add cpupool-credit2 1
xl cpupool-cpu-add cpupool-credit2 2
echo "xl cpupool-cpu-add cpupool-rtds 3,4"
xl cpupool-cpu-add cpupool-rtds 3
xl cpupool-cpu-add cpupool-rtds 4
xl cpupool-list -c
xl cpupool-list
# Migrate vm1 among cpupools
echo "Migrate ${vm1_name} among cpupools"
xl cpupool-migrate ${vm1_name} cpupool-rtds
xl cpupool-migrate ${vm1_name} cpupool-credit2
xl cpupool-migrate ${vm1_name} cpupool-rtds
xl cpupool-migrate ${vm1_name} cpupool-credit
xl cpupool-migrate ${vm1_name} cpupool-rtds

Thank you very much and best regards,
Meng
Re: [Xen-devel] [PATCH for 4.7 0/4] Assorted scheduling fixes
Hi Wei,

On Wed, May 4, 2016 at 12:04 PM, Wei Liu wrote:
> On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
> > Hi,
> >
> > This small series contains some bugfixes for various schedulers. They're all
> > bugfixes, so I think all should be considered for 4.7. Here's some more
> > detailed analysis.
> >
> > Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
> > ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
> > is important, but at least does not make anything explode.
> >
> > Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
> > Julien reported here:
> >
> > https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html
> >
> > Julien, the patch is very very similar to the one attached to one of my reply
> > in that thread, but I had to change some small bits... Can you please re-test
> > it?
> >
> > Patch 4 makes the code of RTDS look consistent with what we state in patch 2,
> > so it's also important. Furthermore, it does fix a bug (although, again, not
> > one that would splat Xen) as, without it, we may have a timer used by the RTDS
> > scheduler bound to the pCPU of another cpupool with another scheduler. That
> > would introduce some unwanted and very difficult to recognize interference
> > between different schedulers in different pool, and should hence be avoided.
> >
> > So this was awesomeness; about risks:
> > - patch 1 is very small, super-self contained (zero impact outside of Credit2
> >   code) and it fixes an actual and 100% reproducible bug;
> > - patch 2 is also totally self-contained and it can't possibly cause problems
> >   to anything else than to what it is trying to fix (Credit2's load balancer).
> >   It doesn't cure any ASSERT or Oops, so it's less interesting, but given the
> >   low risk --also considering that Credit2 will still be considered
> >   experimental in 4.7-- I think it can go in;
> > - patch 3 is bigger, and a bit more complex. Note, however, that most of its
> >   content is code comments and ASSERT-s; it is self contained to scheduling
> >   (in the sense that it impacts all schedulers, but "just" them), and fixes
> >   a situation that, AFAIUI, is important for ARM;
>
> You meant patch 2 actually.
>
> For the first three patches:
>
> Release-acked-by: Wei Liu
>
> > - patch 4 may again look not that critical. But, the fact that someone wanting
> >   to experiment with RTDS in a cpupool would face the kind of interference
> >   between independent cpupools that the patch cures is, I think, something
> >   worthwhile trying to avoid.

Yes. It's better to avoid this type of interference.

> >   Besides, it is again quite self contained, as
> >   it's indeed only relevant for RTDS (which is also going to be called
> >   experimental for 4.7).

Yes. It should not affect other schedulers or other parts of the system. Actually, it does not affect the logic in RTDS either.

> I will wait for Meng to review this one.

I just reviewed and tested this patch on my computer.

Thank you very much!

Best regards,
Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Sat, May 7, 2016 at 5:19 PM, Meng Xu wrote:
> On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli wrote:
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>> [...]
>> Signed-off-by: Dario Faggioli
>
> Reviewed-and-Tested-by: Meng Xu
>
> ---Below is the testing scenarios---
> [... same cpupool creation/migration test script as in my previous email ...]

I forgot one thing in the previous email. When I tried to migrate Domain-0 from Pool-0 (with the rtds or credit scheduler) to another newly created pool, say cpupool-credit, it always fails. This happens even when I boot into the credit scheduler and try to migrate Domain-0 to another cpupool.

I'm wondering whether Domain-0 can be migrated among cpupools. From the Xen wiki (http://wiki.xenproject.org/wiki/Cpupools_Howto), it seems Domain-0 can be migrated.

Thanks,
Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
> > I do have a minor comment about the patch, although it is not
> > important at all and it is not really about this patch...
> >
> > > @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
> > > {
> > >     struct rt_private *prv = rt_priv(ops);
> > >
> > > -    kill_timer(prv->repl_timer);
> > > +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> > > +           prv->repl_timer->status == TIMER_STATUS_killed);
> > I found in xen/timer.h, the comment after the definition of
> > TIMER_STATUS_invalid is
> >
> >     #define TIMER_STATUS_invalid  0 /* Should never see this. */
> >
> > This comment is a little contrary to how the status is used here.
> > Actually, what exactly does "Should never see this" mean?
> >
> > This _invalid status is used in timer.h and it is the status a timer
> > has before it is initialized by init_timer().
> >
> As far as my understanding goes, this means that a timer, during its
> operations, should never be found in this state.
>
> In fact, this marks a situation where the timer has been allocated but
> never initialized, and there are ASSERT()s around to enforce that.
>
> However, if what one wants is _exactly_ to check whether the timer has
> been allocated but not initialized, I don't see why I can't use this.

You can use this. Actually, I agree with how you used it here; this is also how the existing init_timer() uses it.

>> So I'm thinking maybe this comment can be better improved to avoid
>> confusion?
>>
> I don't think things are confusing, neither right now, nor after this
> patch, but I'm open to others' opinion. :-)

Hmm, I won't get confused by the comment from now on, but I'm unsure whether someone else will. The tricky thing is that once I know it, it doesn't feel weird; however, when I first read it, it felt a little confusing without reading the other parts of the code related to this macro.

Anyway, I'm OK with either way: change the comment or not.

Best Regards,
Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Mon, May 9, 2016 at 10:52 AM, Dario Faggioli wrote:
> On Mon, 2016-05-09 at 10:08 -0400, Meng Xu wrote:
>> > I don't think things are confusing, neither right now, nor after this
>> > patch, but I'm open to others' opinion. :-)
>>
>> Hmm, I won't get confused with the comment from now on, but I'm unsure
>> if someone else will or not. The tricky thing is when I know it, I
>> won't feel weird. However, when I first read it, I feel a little
>> confusing if not reading the other parts of the code related to this
>> macro.
>>
> I don't feel the same, but I understand the concern.
>
> I think we have two options here:
> 1. we just do nothing;
> 2. you send a patch that, according to your best judgement, improves
>    things (as we all do all the time! :-P).
>
> :-D
>
>> Anyway, I'm ok with either way: change the comment or not.
>>
> Me too, and in fact, I'm not changing it, but I won't stop you trying to
> do so. :-)

OK. I can do it... But is just a one-line comment change too small to be a patch? ;-)

Thanks,
Meng
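[Editorial note: for concreteness, the whole patch would be nothing more than a comment tweak in xen/include/xen/timer.h, e.g. the following. The wording below is only a suggestion, not an actual submitted patch:

    -#define TIMER_STATUS_invalid  0 /* Should never see this.           */
    +#define TIMER_STATUS_invalid  0 /* Allocated but not yet initialized
    +                                    (init_timer() not called); never
    +                                    seen on an active timer.         */
]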
Re: [Xen-devel] Xen 4.7 Test Day Instructions for RC2+ : Call to action for people who added new features / functionality to Xen 4.7
On Mon, May 9, 2016 at 11:28 AM, Lars Kurth wrote:
> Hi all,
>
> I added the following sections based on git logs to man pages. Authors are on
> the CC list and should review and modify (or suggest edits by replying to
> this thread). I added/updated/added TODO's to:
>
> I do have some questions, to ...
> - Konrad/Ross: is there any documentation for xSplice which I have missed?
> - Julien: Any ARM specific stuff you want people to test?
> - Doug: are there any docs / tests for KCONFIG you want to push
> - George: are there any manual tests for credit 2 hard affinity, for hotplug
>   disk backends (drbd, iscsi, &c) and soft reset for HVM guests that should be
>   added?
>
> For the following sections there are some TODO's - please verify and modify
> and once OK, remove the TODO from the wiki pages.
>
> RTDS (Meng Xu, Tianyang, Chong Li)
> - Meng, you mention improvements to the RTDS scheduler in another thread
> - Are any specific test instructions needed in
>   http://wiki.xenproject.org/wiki/Xen_4.7_RC_test_instructions
> - http://wiki.xenproject.org/wiki/Xen_4.7_RC_test_instructions#RTDS_scheduler_improvements

I verified the text in the wiki and added one comment, "which will not invoke the scheduler unnecessarily", for the event-driven model. I removed the TODO in the RTDS section of the wiki. Please let me know if I need to do something else.

Thank you very much!

Best,
Meng
Re: [Xen-devel] [PATCH for-4.7] sched/rt: Fix memory leak in rt_init()
On Tue, May 10, 2016 at 9:38 AM, Andrew Cooper wrote:
> c/s 2656bc7b0 "xen: adopt .deinit_pdata and improve timer handling"
> introduced an error path into rt_init() which leaked prv if the
> allocation of prv->repl_timer failed.
>
> Introduce an error cleanup path.
>
> Spotted by Coverity.

I'm curious about this line. Does it mean that this was spotted by Coverity code review or by some automatic testing/checking?

> Signed-off-by: Andrew Cooper
> ---

I'm sorry that I should have spotted it when I reviewed the code. :-(

Reviewed-by: Meng Xu

Meng
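[Editorial note: for readers who haven't seen the patch, the shape of such a fix is the classic goto-based cleanup path. The sketch below illustrates the pattern under discussion; it is simplified from, and not identical to, the actual committed change:

    static int rt_init(struct scheduler *ops)
    {
        struct rt_private *prv = xzalloc(struct rt_private);

        if ( prv == NULL )
            goto err;

        prv->repl_timer = xzalloc(struct timer);
        if ( prv->repl_timer == NULL )
            goto err;   /* before the fix, prv leaked on this path */

        /* ... remaining initialization of prv ... */
        ops->sched_data = prv;
        return 0;

     err:
        xfree(prv);     /* xfree(NULL) is a no-op, so this is safe */
        return -ENOMEM;
    }

xfree() being NULL-safe is what lets a single error label clean up both failure points.]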
Re: [Xen-devel] Hypercall invoking
On Tue, May 10, 2016 at 6:12 AM, tutu sky wrote:
> From: Dario Faggioli
> Sent: Tuesday, May 10, 2016 7:32 AM
> To: tutu sky; Xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Hypercall invoking
>
> On Tue, 2016-05-10 at 06:49 +, tutu sky wrote:
> > Hi,
> > I added a new hypercall to xen successfully, but when i try to invoke
> > it in dom0 using privcmd, i am unable to invoke (using XC), I must cd
> > to /xen.x.y.z/tools/xcutils and then try to invoke hypercall by XC
> > interface which i created for it.
> > DO functions of hypercall is written in /xen/common/kernel.c.
> >
> > can you give me a clue?
> >
> That depends on what you are trying to achieve, and on what you have
> implemented and how you have implemented it.
>
> Actually, this is not the first time we tell you this: without you
> being specific, we can't possibly help.
>
> In this case, "being specific" would have meant specifying:
> - what is your final end goal

I think Dario meant: why do you want to add a hypercall? What is your "final" goal in adding the hypercall? It may simply be unnecessary to add a hypercall to achieve your final goal.

I'm just curious if you could give a self-introduction. ;-)

Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Dushyant,

On Tue, Mar 8, 2016 at 3:23 AM, Dushyant K Behl wrote:
> Hi All,
>
> I'm working on a research project with IBM, and I want to run Xen on the
> Nvidia Tegra Jetson-tk1 board.
> I looked at a post on this mailing list
> (http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg01122.html),
> and I am using this git tree -
>
> git://xenbits.xen.org/people/ianc/xen.git
> and branch - tegra-tk1-jetson-v1
>
> But when I try to boot Xen on the board I am not able to see any output (even
> with earlyprintk enabled).
> After jumping to Xen the board just resets without showing any output.
>
> I am using upstream u-boot with non secure mode enabled.

I just got the Jetson TK1 board and I'm trying to run Xen on it. May I know which u-boot repo and which branch you used to enable the non-secure mode? If you could also share your u-boot config file, that would be awesome!

The u-boot from NVIDIA didn't turn on the HYP mode. I tried git://git.denx.de/u-boot.git, tag v2016.03, but the board won't boot after I flashed that u-boot. No message at all... :-( If I use NVIDIA's u-boot, I can boot into the Linux kernel without problems.

Thank you very much for your help and time!

Best Regards,
Meng
Re: [Xen-devel] [PATCH] xen: sched: rtds: refactor code
Hi Tianyang,

On Wed, May 11, 2016 at 11:20 AM, Tianyang Chen wrote:
> No functional change:
> - Various coding style fixes
> - Added comments for UPDATE_LIMIT_SHIFT.
>
> Use non-atomic bit-ops:
> - Vcpu flags are checked and cleared atomically. Performance can be
>   improved with corresponding non-atomic versions since schedule.c
>   already has spin_locks in place.
>
> Suggested-by: Dario Faggioli

It's better to add a link to the thread that contains the suggestion.

> @@ -930,7 +936,7 @@ burn_budget(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now)
>     if ( svc->cur_budget <= 0 )
>     {
>         svc->cur_budget = 0;
> -        set_bit(__RTDS_depleted, &svc->flags);
> +        __set_bit(__RTDS_depleted, &svc->flags);
>     }
>
>     /* TRACE */
> @@ -955,7 +961,7 @@ burn_budget(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now)
>  * lock is grabbed before calling this function

The comment says "lock is grabbed before calling this function". IIRC, we use the __ prefix to indicate that we grab the lock before calling a function. This change then violates that convention.

>  */
> static struct rt_vcpu *
> -__runq_pick(const struct scheduler *ops, const cpumask_t *mask)
> +runq_pick(const struct scheduler *ops, const cpumask_t *mask)
> {
>     struct list_head *runq = rt_runq(ops);
>     struct list_head *iter;
> @@ -964,9 +970,9 @@ __runq_pick(const struct scheduler *ops, const cpumask_t *mask)
>     cpumask_t cpu_common;
>     cpumask_t *online;
>
> -    list_for_each(iter, runq)
> +    list_for_each ( iter, runq )
>     {
> -        iter_svc = __q_elem(iter);
> +        iter_svc = q_elem(iter);
>
>         /* mask cpu_hard_affinity & cpupool & mask */
>         online = cpupool_domain_cpumask(iter_svc->vcpu->domain);
> @@ -1028,7 +1034,7 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
>     }
>     else
>     {
> -        snext = __runq_pick(ops, cpumask_of(cpu));
> +        snext = runq_pick(ops, cpumask_of(cpu));
>         if ( snext == NULL )
>             snext = rt_vcpu(idle_vcpu[cpu]);

Meng
Re: [Xen-devel] [TESTDAY] Test report - xl sched-rtds
On Fri, May 13, 2016 at 5:31 AM, Wei Liu wrote:
> On Thu, May 12, 2016 at 02:00:06PM -0500, Chong Li wrote:
>> * Hardware:
>>   CPU: Intel Core2 Quad Q9400
>>   Total Memory: 2791088 kB
>>
>> * Software:
>>   Ubuntu 14.04
>>   Linux kernel: 3.13.0-68
>>
>> * Guest operating systems:
>>   Ubuntu 14.04 (PV)
>>
>> * Functionality tested:
>>   xl sched-rtds (for set/get per-VCPU parameters)
>>
>> * Comments:
>>   All examples about "xl sched-rtds" in the xl manual page
>>   <http://xenbits.xen.org/docs/unstable/man/xl.1.html#DOMAIN-SUBCOMMANDS>
>>   have been tested, and all run successfully.
>>
>>   If users type in wrong parameters (e.g., budget > period), the
>>   error/warning messages are returned correctly as well.
>>
> Good, so RTDS works as expected. Thanks for your report.

Hi Wei,

I'd like to share some of my experience with the improved RTDS scheduler. It is not a formal report, but hopefully it is some useful information.

I have been using the improved RTDS in staging for a while, and I haven't seen any weird issue. Because I also modified other parts of Xen a bit, my testing does not cover plain xen 4.7-rc2; that's why we had Chong test xen 4.7-rc2. Thank you very much, Chong, for your nice test report! :-)

The workload types I run are:
1) Compile the Linux or Xen kernels in parallel. The number of compile jobs is usually double the number of cores allocated to dom0.
2) Run CPU-intensive or memory-intensive tasks, which access a large array (a sketch of such a task follows this message).

Best,
Meng
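[Editorial note: for concreteness, the memory-intensive workload mentioned above is typically a loop walking an array larger than the last-level cache. The program below is an illustrative sketch under that assumption, not the actual test program used:

    #include <stddef.h>

    #define ARRAY_BYTES (256UL * 1024 * 1024)   /* assumed larger than the LLC */

    static volatile unsigned char big[ARRAY_BYTES];

    int main(void)
    {
        unsigned long sum = 0;

        /* Touch one byte per 64-byte cache line, repeatedly, to keep
         * the memory subsystem (rather than the ALUs) busy. */
        for ( unsigned iter = 0; iter < 100; iter++ )
            for ( size_t i = 0; i < ARRAY_BYTES; i += 64 )
                sum += big[i];

        return (int)(sum & 0xff);   /* keep the loop from being optimized away */
    }
]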
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Julien and Dushyant,

>>>> (XEN) DOM0: [0.00] irq: no irq domain found for /interrupt-controller !
>>>> (XEN) DOM0: [0.00] irq: no irq domain found for /interrupt-controller !
>>>> (XEN) DOM0: [0.00] irq: no irq domain found for /interrupt-controller !
>>>> (XEN) DOM0: [0.00] arch_timer: No interrupt available, giving up
>>>
>>> It looks like to me that Xen is not recreating the device-tree correctly. I
>>> would look into the kernel to find what is expected.
>>
>> This looks like a possible bug (or some missing feature) in Xen's
>> device tree creation which could take some time to handle, so if I
>> could be of any more help to you with this issue please let me know.
>
> There was a conversation on #xen-arm a few days ago about this problem.

Is there a way that we can see the conversation on #xen-arm? I hope to better understand the problem.

> Xen doesn't correctly recreate the GIC node, which results in a loop between
> the interrupt controllers. Can you try the below patch?
>
> http://dev.ktemkin.com/misc/xenarm-gic-parents.patch

It seems this link is invalid now... Has this patch been upstreamed?

Hi Dushyant,
Could you help repost this patch in this email if it's not that large? (Since we used the same repo, which is IanC's, it may be even better if you could kindly share the patch based on the tegra-tk1-jetson-v1 branch of Ian's repo?)

Hi Julien,
Do you have some suggestions on how we can debug and fix the issue related to the device tree? I saw that there may still be some issues with the NVIDIA devices, as Dushyant described, after he applied the patch. Right now, I have the exact same board as Dushyant, so I think I may encounter the exact same issue as he did. I'm wondering if there is some documentation/tutorial/notes we can learn from about how to debug such issues.

Thank you both very much for your help and time!

Best Regards,
Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Dushyant,

On Sat, May 14, 2016 at 1:36 PM, Dushyant Behl wrote:
> Hey Meng,
>
> On Sat, May 14, 2016 at 7:39 AM, Meng Xu wrote:
>>> http://dev.ktemkin.com/misc/xenarm-gic-parents.patch
>>
>> It seems this link is invalid now...
>> Has this patch been upstreamed?
>>
>> Hi Dushyant,
>> Could you help repost this patch in this email if it's not that large?
>> (Since we used the same repo, which is IanC's, it may be even better
>> if you could kindly share the patch based on the tegra-tk1-jetson-v1
>> branch of Ian's repo?)
>
> The patch is attached with the mail.

Thank you so much for your help! I applied the patch and the kernel makes further progress in booting. I'm replying to your last email about the issue I'm facing, which seems not the same as what you saw.

Best Regards,
Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Dushyant,

>>> On Thu, Mar 17, 2016 at 8:22 PM, Julien Grall wrote:
>>>> On 14/03/16 14:19, Dushyant Behl wrote:
>>>>>> Yes, I have enabled these configuration parameters when compiling linux -
>>>>
>>>> The list of options looks good to me. I guess Linux is crashing before setting
>>>> up the console. Can you apply the below to Linux and post the log here?
>>>
>>> I applied your patch to Linux but still there is no output from the kernel.
>>>
>>> But I have found the location of the problem. I have a debugger attached
>>> to the Jetson board, and using that I was able to find out that Linux is
>>> failing while initializing the Tegra timer.
>>>
>>> The call stack at the time of failing is -
>>>
>>> - prefetchw (inline)
>>>   arch_spin_lock (inline)
>>>   do_raw_spin_lock_flags (inline)
>>>   __raw_spin_lock_irqssave (inline)
>>>   raw_spin_lock_irq_save (lock = 0xC0B746F0)
>>> - of_get_parent (node = 0xA1D3)
>>> - of_get_address (dev = 0xDBBABC30, index = 0, size = 0xC0A83F30)
>>> - of_address_to_resource(dev = 0xDBBABC30, index = 0, r = 0xC0A83F50)
>>> - of_iomap (np = 0xDBBABC30, index = 0)
>>> - tegra20_init_timer (np = 0xDBBABC30)
>>> - clocksource_of_init()
>>> - start_kernel()
>>>
>>> After this Linux jumps to the floating point exception handler and then to
>>> undefined instruction and fails.
>>
>> I don't know why Linux is receiving a floating point exception. However,
>> DOM0 must not use the tegra timer as it doesn't support virtualization.
>>
>> You need to ensure that DOM0 will use the arch timer instead. Xen provides
>> some facilities to blacklist a device tree node (see blacklist dev in
>> arm/platforms/tegra.c).
>
> I have blacklisted the tegra20_timer

I guess you blacklisted "tegra20-timer" (which uses "-" instead of "_"), as shown in the following patch. Am I right?

diff --git a/xen/arch/arm/platforms/tegra.c b/xen/arch/arm/platforms/tegra.c
index 5ec9dda..8477ad1 100644
--- a/xen/arch/arm/platforms/tegra.c
+++ b/xen/arch/arm/platforms/tegra.c
@@ -431,6 +431,7 @@ static const struct dt_device_match tegra_blacklist_dev[] __initconst =
      * UART to dom0, so don't map any of them.
      */
     DT_MATCH_COMPATIBLE("nvidia,tegra20-uart"),
+    DT_MATCH_COMPATIBLE("nvidia,tegra20-timer"),
     { /* sentinel */ },
 };

Thanks and Best Regards,
Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
> a: Failed to get supply 'avdd': -517
> [8.833857] tegra-ahci 70027000.sata: Failed to get regulators
> [8.840805] input: gpio-keys as /devices/soc0/gpio-keys/input/input0
> [8.846820] hctosys: unable to open rtc device (rtc0)
> [8.847331] sdhci-tegra 700b0400.sdhci: Got CD GPIO
> [8.847369] sdhci-tegra 700b0400.sdhci: Got WP GPIO
> [8.847471] mmc1: Unknown controller version (3). You may experience problems.
> [8.851403] sdhci-tegra 700b0400.sdhci: No vmmc regulator found
> [8.852328] tegra-snd-rt5640 sound: ASoC: CODEC DAI rt5640-aif1 not registered
> [8.852340] tegra-snd-rt5640 sound: snd_soc_register_card failed (-517)
> [8.854009] tegra-pcie 1003000.pcie-controller: 2x1, 1x1 configuration
> [8.854062] tegra-pcie 1003000.pcie-controller: Failed to get supply 'avddio-pex': -517
> [8.855019] reg-fixed-voltage regulators:regulator@11: Failed to resolve vin-supply for +1.05V_RUN_AVDD_HDMI_PLL
> [8.855030] tegra-hdmi 5428.hdmi: failed to get PLL regulator
> [8.856050] tegra-ahci 70027000.sata: Failed to get supply 'avdd': -517
> [8.856059] tegra-ahci 70027000.sata: Failed to get regulators
> [8.951051] +12V_SATA: disabling
> [8.952391] +5V_SATA: disabling
> [8.955596] +5V_HDMI_CON: disabling
> [8.959166] +1.05V_RUN_AVDD_HDMI_PLL: disabling
> [8.963742] +USB0_VBUS_SW: disabling
> [8.967400] +3.3V_AVDD_HDMI_

The last several dom0 log messages are:

[5.398512] sdhci: Copyright(c) Pierre Ossman
<6>sdhci-pltfm: SDHCI platform and OF driver helper
[5.398574] sdhci-pltfm: SDHCI platform and OF driver helper
sdhci-tegra 700b0400.sdhci: Got CD GPIO
[5.399032] sdhci-tegra 700b0400.sdhci: Got CD GPIO
sdhci-tegra 700b0400.sdhci: Got WP GPIO
[5.399109] sdhci-tegra 700b0400.sdhci: Got WP GPIO
<3>mmc0: Unknown controller version (3). You may experience problems.
[5.399231] mmc0: Unknown controller version (3). You may experience problems.
sdhci-tegra 700b0400.sdhci: No vmmc regulator found
[5.399443] sdhci-tegra 700b0400.sdhci: No vmmc regulator found
<3>mmc0: Unknown controller version (3). You may experience problems.
[5.399731] mmc0: Unknown controller version (3). You may experience problems.
sdhci-tegra 700b0600.sdhci: No vmmc regulator found
[5.399868] sdhci-tegra 700b0600.sdhci: No vmmc regulator found
sdhci-tegra 700b0600.sdhci: No vqmmc regulator found
[5.399931] sdhci-tegra 700b0600.sdhci: No vqmmc regulator found
<4>mmc0: Invalid maximum block size, assuming 512 bytes
[5.33] mmc0: Invalid maximum block size, assuming 512 bytes
<6>mmc0: SDHCI controller on 700b0600.sdhci [700b0600.sdhci] using ADMA 64-bit
[5.446794] mmc0: SDHCI controller on 700b0600.sdhci [700b0600.sdhci] using ADMA 64-bit
<6>usbcore: registered new interface driver usbhid
[5.448020] usbcore: registered new interface driver usbhid
<6>usbhid: USB HID core driver
[5.448075] usbhid: USB HID core driver
<6>cfg80211: Calling CRDA to update world regulatory domain
[6.536872] cfg80211: Calling CRDA to update world regulatory domain
tegra-hda 7003.hda: azx_get_response timeout, switching to polling mode: last cmd=0x300f0001
[8.526885] tegra-hda 7003.hda: azx_get_response timeout, switching to polling mode: last cmd=0x300f0001
<6>input: tegra-hda HDMI/DP,pcm=3 as /devices/soc0/7003.hda/sound/card0/input0
[8.968688] input: tegra-hda HDMI/DP,pcm=3 as /devices/soc0/7003.hda/sound/card0/input0
<6>cfg80211: Calling CRDA to update world regulatory domain
[9.696855] cfg80211: Calling CRDA to update world regulatory domain
tegra-i2c 7000c000.i2c:

From Dushyant's log, I saw that the "tegra-i2c 7000c000.i2c" messages finally time out. However, in my case, I didn't see the timeout happen.

Thanks and Best Regards,
Meng
Re: [Xen-devel] Problem Reading from XenStore in DomU
On Sun, May 15, 2016 at 3:54 PM, Dagaen Golomb wrote:
>>> Hi All,
>>>
>>> I'm having an interesting issue. I am working on a project that
>>> requires me to share memory between dom0 and domUs. I have this
>>> successfully working using the grant table and the XenStore to
>>> communicate grefs.
>>>
>>> My issue is this. I have one domU running Ubuntu 12.04 with a default
>>> 3.8.x kernel that has no issue reading or writing from the XenStore.
>>> My work also requires some kernel modifications, and we have made
>>> these changes in the 4.1.0 kernel. In particular, we've only added a
>>> simple hypercall. This modified kernel is what dom0 is running, on top
>>> of Xen 4.7 rc1.
>>>
>>> The guest I mentioned before with the default kernel can still read
>>> and write the XenStore just fine when on Xen 4.7 rc1 and with dom0
>>> running our kernel.
>>>
>>> The issue I'm having is with another newly created guest (i.e., it is
>>> not a copy of the working one; this is because I needed more space and
>>> couldn't expand the original disk image). It is also running Ubuntu
>>> 12.04 but has been upgraded to our modified kernel. On this guest I
>>> can write to the XenStore, and see that the writes were indeed
>>> successful using xenstore-ls in dom0. However, when this same guest
>>> attempts to read from the XenStore, it doesn't return an error code
>>> but instead just blocks indefinitely. I've waited many minutes to
>>> make sure it's not just blocking for a while; it appears like it will
>>> block forever. The block is happening when I start the transaction.
>>> I've also tried not using a transaction, in which case it blocks on
>>> the read itself.
>>>
>>> I have an inkling this may be something as simple as a configuration
>>> issue, but I can't seem to find anything. Also, the fact that writes
>>> work fine but reads do not is perplexing me.
>>>
>>> Any help would be appreciated!
>>
>> Nothing should block like this. Without seeing your patch, I can't
>> comment as to whether you have accidentally broken things.
>
> I don't see any way the patch could be causing this. It simply adds
> another function and case clause to an already-existing hypercall, and
> when you call the hypercall with that option it returns the current
> budget of a passed-in vcpu. It doesn't even come close to touching
> grant mechanics, and doesn't modify any state - it simply returns a
> value that previously was hidden in the kernel.
>
>> Other avenues of investigation are to look at what the xenstored process
>> is doing in dom0 (is it idle? or is it spinning?), and to look in the
>> xenstored log file to see if anything suspicious occurs.
>
> I tried booting into older, stock kernels. They all work with the
> read. However, I do not see why the kernel modification would be the
> issue as described above. I also have the dom0 running this kernel and
> it reads and writes the XenStore just dandy. Are there any kernel
> config issues that could do this?

What if you use the .config of the kernel in the working domU to compile the kernel in the not-working domU? I assume you used the same kernel source code for both domUs.

Best Regards,
Meng
Re: [Xen-devel] Problem Reading from XenStore in DomU
On Sun, May 15, 2016 at 9:41 PM, Dagaen Golomb wrote: >> On 5/15/16 8:28 PM, Dagaen Golomb wrote: >>>> On 5/15/16 11:40 AM, Dagaen Golomb wrote: >>>>> Hi All, >>>>> >>>>> I'm having an interesting issue. I am working on a project that >>>>> requires me to share memory between dom0 and domUs. I have this >>>>> successfully working using the grant table and the XenStore to >>>>> communicate grefs. >>>>> >>>>> My issue is this. I have one domU running Ubuntu 12.04 with a default >>>>> 3.8.x kernel that has no issue reading or writing from the XenStore. >>>>> My work also requires some kernel modifications, and we have made >>>>> these changes in the 4.1.0 kernel. In particular, we've only added a >>>>> simple hypercall. This modified kernel is what dom0 is running, on top >>>>> of Xen 4.7 rc1. >>>> >>>> Without reading the rest of the thread but seeing the kernel versions. >>>> Can you check how you're communicating to xenstore? Is it via >>>> /dev/xen/xenbus or /proc/xen/xenbus? Anything after 3.14 will give you >>>> deadlocks if you try to use /proc/xen/xenbus. Xen 4.6 and newer should >>>> prefer /dev/xen/xenbus. Same thing can happen with privcmd but making >>>> that default didn't land until Xen 4.7. Since you're on the right >>>> versions I expect you're using /dev/xen/xenbus but you never know. >>> >>> How do I know which is being used? /dev/xen/xenbus is there and so is >>> /proc/xen/xenbus. Could this be a problem with header version >>> mismatches or something similar? I'm using the xen/xenstore.h header >>> file for all of my xenstore interactions. I'm running Xen 4.7 so it >>> should be in /dev/, and the old kernel is before 3.14 but the new one >>> is after, but I would presume the standard headers are updated to >>> account for this. Is there an easy way to check for this? Also, would >>> the same issue cause writes to fail? Because writes from the same >>> domain work fine, and appear to other domains using xenstore-ls. >>> >>> Regards, >>> Dagaen Golomb >>> >> >> Use strace on the process and see what gets opened. > > Ah, of course. It seems both the working and non-working domains are > using /proc/... Then according to Doug, "Anything after 3.14 will give you > deadlocks if you try to use /proc/xen/xenbus.". Maybe the non-working domU > uses a kernel version after 3.14? Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
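To see which device a xenstore client actually opens, something along these lines should do (illustrative only; xenstore-ls is used here simply because it already appears earlier in the thread):

$ strace -e trace=open,openat xenstore-ls 2>&1 | grep xenbus

If the output shows the legacy path, e.g. open("/proc/xen/xenbus", O_RDWR) = 3, the client is going through the /proc device that can deadlock on kernels after 3.14.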
Re: [Xen-devel] Problem Reading from XenStore in DomU
Hi Doug, Do you happen to know if Xen has an existing mechanism to make /dev/xen/xenbus the default device for xenstored? On Sun, May 15, 2016 at 11:30 PM, Dagaen Golomb wrote: >> >>>>> Hi All, >> >>>>> >> >>>>> I'm having an interesting issue. I am working on a project that >> >>>>> requires me to share memory between dom0 and domUs. I have this >> >>>>> successfully working using the grant table and the XenStore to >> >>>>> communicate grefs. >> >>>>> >> >>>>> My issue is this. I have one domU running Ubuntu 12.04 with a >> >>>>> default >> >>>>> 3.8.x kernel that has no issue reading or writing from the XenStore. >> >>>>> My work also requires some kernel modifications, and we have made >> >>>>> these changes in the 4.1.0 kernel. In particular, we've only added a >> >>>>> simple hypercall. This modified kernel is what dom0 is running, on >> >>>>> top >> >>>>> of Xen 4.7 rc1. >> >>>> >> >>>> Without reading the rest of the thread but seeing the kernel >> >>>> versions. >> >>>> Can you check how you're communicating to xenstore? Is it via >> >>>> /dev/xen/xenbus or /proc/xen/xenbus? Anything after 3.14 will give >> >>>> you >> >>>> deadlocks if you try to use /proc/xen/xenbus. Xen 4.6 and newer >> >>>> should >> >>>> prefer /dev/xen/xenbus. Same thing can happen with privcmd but making >> >>>> that default didn't land until Xen 4.7. Since you're on the right >> >>>> versions I expect you're using /dev/xen/xenbus but you never know. >> >>> >> >>> How do I know which is being used? /dev/xen/xenbus is there and so is >> >>> /proc/xen/xenbus. Could this be a problem with header version >> >>> mismatches or something similar? I'm using the xen/xenstore.h header >> >>> file for all of my xenstore interactions. I'm running Xen 4.7 so it >> >>> should be in /dev/, and the old kernel is before 3.14 but the new one >> >>> is after, but I would presume the standard headers are updated to >> >>> account for this. Is there an easy way to check for this? Also, would >> >>> the same issue cause writes to fail? Because writes from the same >> >>> domain work fine, and appear to other domains using xenstore-ls. >> >>> >> >>> Regards, >> >>> Dagaen Golomb >> >>> >> >> >> >> Use strace on the process and see what gets opened. >> > >> > Ah, of course. It seems both the working and non-working domains are >> > using /proc/... >> >> Then according to Doug, "Anything after 3.14 will give you >> > deadlocks if you try to use /proc/xen/xenbus.". Maybe the non-working >> > domU uses a kernel version after 3.14. > > It does, being the custom kernel on version 4.1.0. But Dom0 uses this same > exact kernel and reads/writes just fine! The only solution if this is indeed > the problem appears to be changing the kernel source we build on or some > hacky method such as symlinking /proc/.. to /dev/..; there has to be an > elegant, real solution to this... Hi Dagaen, Maybe we can try to create a symlink from /proc/xen/xenbus to /dev/xen/xenbus and see if it works. I'm not sure whether you can just set the environment variable XENSTORED_PATH to /dev/xen/xenbus to make it the default choice, but there's no harm in trying (a quick test is sketched below). BTW, this is a useful link to refer to: http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01679.html Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
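If libxenstore in the guest honours the XENSTORED_PATH environment variable (the thread linked above suggests it does), a quick experiment that avoids any symlink hackery could be as simple as (untested sketch):

$ XENSTORED_PATH=/dev/xen/xenbus xenstore-ls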
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1
On Mon, May 16, 2016 at 7:33 AM, Julien Grall wrote: > > On 15/05/16 20:35, Meng Xu wrote: >> >> Hi Julien and Ian, > > > Hello Meng, Hi Julien, > >> >> I'm trying to run Xen on the NVIDIA Jetson TK1 board. (Right now, Xen does >> not support the Jetson board officially. But I'm thinking it may be >> very interesting and useful to see it happen, since it has a GPU inside, >> which is quite popular in automotive.) >> >> Now I have encountered some problems booting dom0 in the Xen environment. I want >> to debug the issues and maybe fix them, but I'm not so sure how >> I should debug the issue efficiently. I'd really appreciate it if >> you could advise me a little bit on how to approach it. >> :-) >> >> ---Below are the details >> >> I noticed that Dushyant from IBM also tried to run Xen on the Jetson >> board. (http://www.gossamer-threads.com/lists/xen/devel/422519). I >> used the same Linux kernel (Jan Kiszka's development tree - >> http://git.kiszka.org/linux.git/, branch queues/assorted) and Ian's >> Xen repo. with the hack for the Jetson board. I can see the dom0 kernel >> boot to some extent and then "stall/spin" before it >> fully boots up. >> >> In order to figure out the possible issue, I booted the exact same Linux >> kernel natively on one CPU and collected the boot log >> information in [1]. I also booted the same Linux kernel as dom0 in the Xen >> environment and collected the boot log information in [2]. >> >> In the Xen environment, dom0 hangs after the following message >> [ 10.541010] NET: Registered protocol family 10 >> 6mip6: Mobile IPv6 >> [ 10.542510] mi >> >> In the native environment, the kernel has the following log after initializing >> NET. >> [2.934693] NET: Registered protocol family 10 >> [2.940611] mip6: Mobile IPv6 >> [2.943645] sit: IPv6 over IPv4 tunneling driver >> [2.951303] NET: Registered protocol family 17 >> [2.955800] NET: Registered protocol family 15 >> [2.960257] can: controller area network core (rev 20120528 abi 9) >> [2.966617] NET: Registered protocol family 29 >> [2.971098] can: raw protocol (rev 20120528) >> [2.975384] can: broadcast manager protocol (rev 20120528 t) >> [2.981088] can: netlink gateway (rev 20130117) max_hops=1 >> [2.986734] Bluetooth: RFCOMM socket layer initialized >> [2.991979] Bluetooth: RFCOMM ver 1.11 >> [2.995757] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 >> [3.001109] Bluetooth: BNEP socket layer initialized >> [3.006089] Bluetooth: HIDP (Human Interface Emulation) ver 1.2 >> [3.012052] Bluetooth: HIDP socket layer initialized >> [3.017894] Registering SWP/SWPB emulation handler >> [3.029675] tegra-pcie 1003000.pcie-controller: 2x1, 1x1 configuration >> [3.036586] +3.3V_SYS: supplied by +VDD_MUX >> [3.040857] +3.3V_LP0: supplied by +3.3V_SYS >> [3.045509] +1.35V_LP0(sd2): supplied by +5V_SYS >> [3.050201] +1.05V_RUN_AVDD: supplied by +1.35V_LP0(sd2) >> [3.057131] tegra-pcie 1003000.pcie-controller: probing port 0, using 2 >> lanes >> [3.066479] tegra-pcie 1003000.pcie-controller: Slot present pin >> change, signature: 0008 >> >> I'm suspecting that my dom0 kernel hangs when it tries to initialize >> "can: controller area network core ". However, from Dushyant's post at >> http://www.gossamer-threads.com/lists/xen/devel/422519, it seems >> Dushyant's dom0 kernel hangs when it tries to initialize pci_bus. (The >> linux config I used may be different from Dushyant's. That could be >> the reason.) >> >> Right now, the system just hangs and has no output indicating what the >> problem could be. 
Although there are a lot of error messages before >> the system hangs, I'm not that sure if I should start with solving all of >> those error messages. Maybe some errors can be ignored? >> >> My questions are: >> 1) Do you have any suggestions on how to see more information about the >> reason why the dom0 hangs? > > Have you tried to dump the registers using the Xen console (CTRL-x 3 times then > 0) and see where it gets stuck? I tried to type CTRL-x 3 times and then 0, but nothing happens... :-( Just to confirm: once the system got stuck, I directly typed Ctrl-x three times on the host's screen. Am I correct? Maybe the serial console is not correctly set up? The serial console configuration I used is as follows, could you have a quick look to see
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1
Hi Julien, On Mon, May 16, 2016 at 1:33 PM, Julien Grall wrote: > (CC Kyle who is also working on Tegra?) > > Hi Meng, > > Many people are working on Nvidia platform with different issues :/. I have > CCed another person which IIRC is also working on it. Sure. It's good to know others are also interested in this platform. It will be more useful to fix it... :-) > > On 16/05/16 17:33, Meng Xu wrote: >> >> On Mon, May 16, 2016 at 7:33 AM, Julien Grall >> wrote: >>> >>> >>> On 15/05/16 20:35, Meng Xu wrote: >>>> >>>> >>>> I'm trying to run Xen on NVIDIA Jetson TK1 board. (Right now, Xen does >>>> not support the Jetson board officially. But I'm thinking it may be >>>> very interesting and useful to see it happens, since it has GPU inside >>>> which is quite popular in automotive.) >>>> >>>> Now I encountered some problem to boot dom0 in Xen environment. I want >>>> to debug the issues and maybe fix the issues, but I'm not so sure how >>>> I should debug the issue more efficiently. I really appreciate it if >>>> you advise me a little bit about the method of how to fix the issue. >>>> :-) >>>> >>>> ---Below is the details >>>> >>>> I noticed the Dushyant from IBM also tried to run Xen on the Jetson >>>> board. (http://www.gossamer-threads.com/lists/xen/devel/422519). I >>>> used the same Linux kernel (Jan Kiszka's development tree - >>>> http://git.kiszka.org/linux.git/, branch queues/assorted) and Ian's >>>> Xen repo. with the hack for Jetson board. I can see the dom0 kernel >>>> can boot to some extend and then "stall/spin" before the dom0 kernel >>>> fully boot up. >>>> >>>> In order to figure out the possible issue, I boot the exact same Linux >>>> kernel in native environment on one CPU and collected the boot log >>>> information in [1]. I also boot the same Linux kernel as dom0 in Xen >>>> environment and collected the boot log information in [2]. >>>> >>>> In Xen environment, dom0 hangs after the following message >>>> [ 10.541010] NET: Registered protocol family 10 >>>> 6mip6: Mobile IPv6 >>>> [ 10.542510] mi >>>> >>>> In native environment, the kernel has the following log after >>>> initializing NET. 
>>>> [2.934693] NET: Registered protocol family 10 >>>> [2.940611] mip6: Mobile IPv6 >>>> [2.943645] sit: IPv6 over IPv4 tunneling driver >>>> [2.951303] NET: Registered protocol family 17 >>>> [2.955800] NET: Registered protocol family 15 >>>> [2.960257] can: controller area network core (rev 20120528 abi 9) >>>> [2.966617] NET: Registered protocol family 29 >>>> [2.971098] can: raw protocol (rev 20120528) >>>> [2.975384] can: broadcast manager protocol (rev 20120528 t) >>>> [2.981088] can: netlink gateway (rev 20130117) max_hops=1 >>>> [2.986734] Bluetooth: RFCOMM socket layer initialized >>>> [2.991979] Bluetooth: RFCOMM ver 1.11 >>>> [2.995757] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 >>>> [3.001109] Bluetooth: BNEP socket layer initialized >>>> [3.006089] Bluetooth: HIDP (Human Interface Emulation) ver 1.2 >>>> [3.012052] Bluetooth: HIDP socket layer initialized >>>> [3.017894] Registering SWP/SWPB emulation handler >>>> [3.029675] tegra-pcie 1003000.pcie-controller: 2x1, 1x1 >>>> configuration >>>> [3.036586] +3.3V_SYS: supplied by +VDD_MUX >>>> [3.040857] +3.3V_LP0: supplied by +3.3V_SYS >>>> [3.045509] +1.35V_LP0(sd2): supplied by +5V_SYS >>>> [3.050201] +1.05V_RUN_AVDD: supplied by +1.35V_LP0(sd2) >>>> [3.057131] tegra-pcie 1003000.pcie-controller: probing port 0, using >>>> 2 lanes >>>> [3.066479] tegra-pcie 1003000.pcie-controller: Slot present pin >>>> change, signature: 0008 >>>> >>>> I'm suspecting that my dom0 kernel hangs when it tries to initialize >>>> "can: controller area network core ". However, from Dushyant's post at >>>> http://www.gossamer-threads.com/lists/xen/devel/422519, it seems >>>> Dushyant's dom0 kernel hangs when it tries to initialize pci_bus. (The >>>> linux config I used may be different form Dushyant's. That cou
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1
On Mon, May 16, 2016 at 7:27 PM, Kyle Temkin wrote: > Hi, Meng: > Hi Kyle, > Julien is correct-- a coworker and I are working on support for Tegra > SoCs, and we've made pretty good progress; there's work yet to be > done, but we have dom0 and guests booting on the Jetson TK1, Jetson > TX1, and the Google Pixel C. We hope to get a patch set out soon-- > unfortunately, our employer has to take some time to verify that > everything's okay to be open-sourced, so I can't send out our > work-in-progress just yet. We'll have an RFC patchset out soon, I > hope! Looking forward to your RFC patchset... Could you please cc me when you send it out? I'd really love to have a look at (and maybe review) it. > > There are two main hardware differences that cause Tegra SoCs to have > trouble with Xen: > > - The primary interrupt controller for those systems isn't a single > GIC, as Xen expects. Instead, there's an NVIDIA Legacy Interrupt > Controller (LIC, or ICTLR) that gates all peripheral interrupts before > passing them to a standard GICv2. This interrupt controller has to be > programmed to ensure Xen can receive interrupts from the hardware > (e.g. serial), programmed to ensure that interrupts for pass-through > devices are correctly unmasked, and virtualized so dom0 can program > the "sections" related to interrupts not being routed to Xen or to a > domain for hardware passthrough. > > - The serial controller on the Tegra SoCs doesn't behave in the same > way as most NS16550-compatibles; it actually adheres to the NS16550 > spec a little more rigidly than most compatible controllers. A > coworker (Chris Patterson, cc'd) figured out what was going on; from > what I understand, most 16550s generate the "transmit ready" interrupt > once, when the device first can accept new FIFO entries. Both the > original 16550 and the Tegra implementation generate the "transmit > ready" interrupt /continuously/ when there's space available in the > FIFO, slewing the CPU with a stream of constant interrupts. I see. Thank you very much for explaining this so clearly! :-) > > What you're seeing is likely a symptom of the first difference. In > your logs, you see messages that indicate Xen is having trouble > correctly routing IRQs that are parented by the legacy interrupt > controller: > >> irq 0 not connected to primary controller.Connected to >> /interrupt-controller@60004000 Right. I see the root issue now. Thank you so much for pointing it out! > > The issue here is that Xen is currently explicitly opting not to route > legacy-interrupt-controller interrupts, as they don't belong to the > primary GIC. As a result, these interrupts never make it to dom0. The > logic that needs to be tweaked is here: > > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/domain_build.c;h=00dc07af637b67153d33408f34331700dff84f93;hb=HEAD#l1137 > > We re-write this logic in our forthcoming patch-set to be more > general. As an interim workaround, you might opt to rewrite that logic > so LIC interrupts (which have an interrupt-parent compatible with > "tegra124-ictlr", in your case) can be routed by Xen, as well. Off the > top of my head, a workaround might look like: > > /* > * Don't map IRQ that have no physical meaning > * ie: IRQ whose controller is not the GIC > */ > - if ( rirq.controller != dt_interrupt_controller ) > +if ( (rirq.controller != dt_interrupt_controller) && > (!dt_device_is_compatible(rirq.controller, "tegra124-ictlr") ) It should have "nvidia" before "tegra124-ictlr". 
;-) After changing it to !dt_device_is_compatible(rirq.controller, "nvidia,tegra124-ictlr"), dom0 boots up~~~ :-D > > Of course, that's off-the-cuff code I haven't tried, but hopefully it > should help to get you started. Sure! It does work and gets me started! I really appreciate your help and explanation! Looking forward to your RFC patch set. :-) Thank you again for your help and time on this issue! It helps a lot! Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
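Putting the two corrections together, the interim workaround would look roughly like this in xen/arch/arm/domain_build.c (a sketch of the temporary hack discussed above, not Kyle's final patch set):

/*
 * Don't map IRQs that have no physical meaning, i.e. IRQs whose
 * controller is not the GIC -- except those parented by the Tegra
 * legacy interrupt controller, which are let through as an interim hack.
 */
if ( rirq.controller != dt_interrupt_controller &&
     !dt_device_is_compatible(rirq.controller, "nvidia,tegra124-ictlr") )
    continue;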
Re: [Xen-devel] [PATCH 0/2] xen: sched: rtds refactor code
On Sun, May 15, 2016 at 7:54 PM, Tianyang Chen wrote: > The first part of this patch series aims at fixing coding style issues > for control structures. Because locks are grabbed in schedule.c before > hooks are called, underscores in front of function names are removed. > > The second part replaces atomic bit-ops with non-atomic ones since locks > are grabbed in schedule.c. > > Discussions: > http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg01528.html > http://www.gossamer-threads.com/lists/xen/devel/431251?do=post_view_threaded#431251 > > Tianyang Chen (2): > xen: sched: rtds refactor code > xen: sched: rtds: use non-atomic bit-ops > > xen/common/sched_rt.c | 122 > ++--- > 1 file changed, 64 insertions(+), 58 deletions(-) > Tianyang, Thanks for the patch! One comment for the future: please add the version number in the title so that we can easily tell it is a new patch. :-) Best, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen: sched: rtds refactor code
On Sun, May 15, 2016 at 7:54 PM, Tianyang Chen wrote: > No functional change: > -Various coding style fix > -Added comments for UPDATE_LIMIT_SHIFT. > > Signed-off-by: Tianyang Chen Reviewed-by: Meng Xu ------- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen: sched: rtds: use non-atomic bit-ops
On Sun, May 15, 2016 at 7:54 PM, Tianyang Chen wrote: > Vcpu flags are checked and cleared atomically. Performance can be > improved with corresponding non-atomic versions since schedule.c > already has spin_locks in place. > > Signed-off-by: Tianyang Chen Reviewed-by: Meng Xu Thanks, Meng ------- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Embedded-pv-devel] [PATCH RFC 00/18] System adjustment to customer needs.
Hi Andrii, On Wed, May 18, 2016 at 12:32 PM, Andrii Anisov wrote: > This series of RFC patches is from the currently ongoing production project. > This patch series presents changes needed to fit the system into > customer requirements as well as to work around limitations of the > Jacinto6 SoC. IMHO, it would be better, if possible, to describe the exact customer requirements this patch series tries to satisfy. I'm curious about what the requirements are and whether the requirements are general enough for many other customers. :-) Similarly, what are the limitations of the Jacinto6 SoC that need to be worked around? If the board is not supported by Xen, can we say Xen will support the board with the workaround? Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: sched: avoid races on time values read from NOW()
On Thu, May 19, 2016 at 4:11 AM, Dario Faggioli wrote: > or (even in cases where there is no race, e.g., outside > of Credit2) avoid using a time sample which may be rather > old, and hence stale. > > In fact, we should only sample NOW() from _inside_ > the critical region within which the value we read is > used. If we don't, in case we have to spin for a while > before entering the region, when actually using it: > > 1) we will use something that, at the very least, is > not really "now", because of the spinning, > > 2) if someone else sampled NOW() during a critical > region protected by the lock we are spinning on, > and if we compare the two samples when we get > inside our region, our one will be 'earlier', > even if we actually arrived later, which is a > race. > > In Credit2, we see an instance of 2), in runq_tickle(), > when it is called by csched2_context_saved() as it samples > NOW() before acquiring the runq lock. This makes things > look like the time went backwards, and it confuses the > algorithm (there's even a d2printk() about it, which would > trigger all the time, if enabled). > > In RTDS, something similar happens in repl_timer_handler(), > and there's another instance in schedule() (in generic code), > so fix these cases too. > > While there, improve csched2_vcpu_wake() and rt_vcpu_wake() > a little as well (removing a pointless initialization, and > moving the sampling a bit closer to its use). These two hunks > entail no further functional changes. > > Signed-off-by: Dario Faggioli > --- > Cc: George Dunlap > Cc: Meng Xu > Cc: Wei Liu > --- Reviewed-by: Meng Xu The bug will cause incorrect budget accounting for one VCPU when the race occurs. Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
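In code, the pattern the patch enforces is roughly the following (schematic only; prv->lock stands in for whichever lock each hunk actually takes):

/* Racy: 'now' may be stale by the time the lock is finally acquired,
 * and may even be earlier than a sample taken by whoever held the
 * lock while we were spinning on it. */
s_time_t now = NOW();
spin_lock_irqsave(&prv->lock, flags);
/* ... compare 'now' with deadlines, burn budget, ... */
spin_unlock_irqrestore(&prv->lock, flags);

/* Fixed: sample NOW() only inside the critical region. */
spin_lock_irqsave(&prv->lock, flags);
now = NOW();
/* ... */
spin_unlock_irqrestore(&prv->lock, flags);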
Re: [Xen-devel] [PATCH for 4.7] xen: sched: avoid races on time values read from NOW()
On Thu, May 19, 2016 at 4:11 AM, Dario Faggioli wrote: > Hey Wei, > > Again, I'm using an otherwise unnecessary cover letter for my analysis about > <>. :-) > > I'd say yes, because the patch fixes an actual bug, in the form of a rather > subtle race condition, which was all but trivial to spot. I must say, though, > that I've only found the bug guilty of being particularly nasty if we use > Credit2. Actually, I'm quite sure it has an effect on RTDS too (although I > did > not trace that), but since both Credit2 and RTDS are still marked as > experimental in 4.7, one may think it's not worthwhile putting in something > like this to fix experimental only code. > > Just FYI, this bug is what was causing the issue I briefly chatted about on > IRC > with George, yesterday, i.e., it is what led Credit2 to emit (rather > aggressively, actually) the debug printks showed here: > > http://pastebin.com/gzYrNST5 In addition to the race condition on bare metal, I actually saw this when I debugged/ran Xen in VirtualBox. The situation is: if we have nested virtualization, or if we have heterogeneous cores which run at different speeds, the RTDS scheduler (maybe credit2 as well?) will have a problem with budget accounting. The "CPUs" of Xen are scheduled by the underlying hypervisor; one "CPU" of Xen could be slower than another, so its time appears to fall behind. We explicitly said, when RTDS was upstreamed in Xen 4.5, that RTDS would have incorrect budget accounting in nested virtualization (a schematic of the failure mode follows below). Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
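Concretely, the failure mode described above shows up in budget accounting along these lines (a schematic sketch, not the actual budget-burning code in sched_rt.c):

/* If 'now' was sampled before taking the lock, or was read on a
 * (virtual) pcpu whose clock lags the one that wrote last_start,
 * delta can come out negative and the vcpu is charged nothing for
 * time it actually consumed. */
delta = now - svc->last_start;
if ( delta <= 0 )
    return;
svc->cur_budget -= delta;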
Re: [Xen-devel] [Embedded-pv-devel] [PATCH RFC 0/6] Set of PV drivers used by production project
Hi Iurii, On Thu, May 19, 2016 at 10:37 AM, Iurii Mykhalskyi wrote: > These patches introduce a set of pv driver interfaces. Thank you very much for these pv driver interfaces! They will be useful for automotive applications, IMO. However, I do have some questions: I'm wondering how general the pv driver interfaces are. On which types of ARM boards (I assume this is for ARM) can they be used? Which ARM boards have you tested them on? What are the production use cases we are talking about here? Are you or GlobalLogic going to contribute the PV drivers as well? I'm looking forward to the PV drivers as well. :-) Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Question about the best practice to install two versions of Xen toolstack on the same machine
Hi all, I'm trying to install two versions of Xen, say Xen 4.6 and Xen 4.7-unstable, onto the same machine. I want them to exist at the same time, instead of letting one override the other. I'm thinking about this because sometimes I want to try out someone else's code which uses an older or newer version. But I also want to keep my current version of the Xen toolstack so that I won't need to reinstall everything again later. If I just use the following commands, the new installation of the toolstack will obviously override the old version's toolstack: $ ./configure $ make dist $ sudo make install (Right now, I just have to recompile my code after I try out someone else's code that has a different version. I can keep two versions of the Xen hypervisor and choose between them via grub2 entries, but I have to reinstall the toolstack.) My quick question is: has anyone tried to install two versions of the Xen toolstack on the same machine? Is there any documentation on the best practice for installing two versions of Xen onto the same machine? --- I had a look at ./configure's help. There are several options, each of which can specify a specific install path. However, I'm not that sure if I should configure every option to make it work. For example, it has --prefix and --exec-prefix to change the PREFIX from /usr/local to a user-defined path. However, there are also --bindir and --sbindir; I assume I should change them too, shouldn't I? In addition, should I specify --libexecdir for the program executables? I found one very old link at [1], but I doubt it still applies, since the Xen toolstack has changed a lot since Xen 4.1: http://old-list-archives.xenproject.org/xen-users/2009-09/msg00263.html Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
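As a starting point, a side-by-side install might look like this (an untested sketch; in practice more directories, e.g. --libdir and --libexecdir, may need pinning, and only one instance of daemons such as xenstored can run at a time):

$ ./configure --prefix=/opt/xen-4.6
$ make dist && sudo make install
$ export PATH=/opt/xen-4.6/bin:/opt/xen-4.6/sbin:$PATH
$ export LD_LIBRARY_PATH=/opt/xen-4.6/lib:$LD_LIBRARY_PATH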
Re: [Xen-devel] [Embedded-pv-devel] [PATCH RFC 00/18] System adjustment to customer needs.
On Thu, May 19, 2016 at 5:53 PM, Andrii Anisov wrote: > Meng, > Hi Andrii, Thank you very much for your explanation about the use case in your previous email! >>> If the board is not supported by Xen, can we say Xen will support the >>> board with the workaround? > > I would not say boards are supported by XEN (except earlyprintk). > Rather, architectures are supported in general, and SoCs are supported > with architecture-implementation-defined deviations (i.e. SMMU absence). Yes. I searched around for the "Jacinto 6" automotive processor. [1] It uses a Cortex-A15 processor... However, I tried the Arndale Octa board two years ago (http://www.arndaleboard.org/wiki/index.php/Main_Page). From my previous experience, the board may not be supported by Xen even though the processor it uses has the virtualization extensions... :-( That's why I asked if the board itself can run Xen. If the board can run Xen, I would like to buy one and try it out. :-) [1] http://www.ti.com/lit/ds/symlink/dra746.pdf Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] RT-Xen on ARM
Hi Andrii, On Tue, Aug 1, 2017 at 4:02 AM, Andrii Anisov wrote: > Hello Meng Xu, > > I've got back to this stuff. Sorry for the late response. I'm not sure if you have already solved this. > > > On 03.07.17 17:58, Andrii Anisov wrote: >> >> That's why we are going to keep configuration (of guests and workloads) >> close to [1] for evaluation, but on our target SoC. >> I'm wondering if there are known issues or specifics for ARM. >> >> [1] https://www.cis.upenn.edu/~linhphan/papers/emsoft14-rt-xen.pdf > > Currently I have a setup with dom0 and domU's with Litmus-RT. Great! > Following the > document I need workload tasks. > Maybe you have mentioned workload tasks sources you can share, so that would > shorten my steps. Sure. The workload we used in the paper is mainly a CPU-intensive task. We first calibrate a busy-loop of multiplications that runs for 1ms (see the calibration sketch below). Then, for a task that executes for exe ms, we simply let the task execute the 1ms busy loop exe times. It is also good to run the same task several times to make sure the task's execution time is stable across different runs. Sections 4.1 and 4.2 in [1] explain the whole experiment procedure. If you have any questions or confusion about a specific step, please feel free to let me know. We may schedule a meeting to clarify all the questions or confusions you may have. [1] https://www.cis.upenn.edu/~linhphan/papers/emsoft14-rt-xen.pdf Best regards, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
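A minimal calibration sketch (the names, the fixed trial count and the timing method are illustrative, not the exact code used for the paper; on older glibc, link with -lrt for clock_gettime):

#include <stdio.h>
#include <time.h>

static volatile unsigned long result; /* volatile so the loop is not optimised away */

/* Estimate how many multiply-accumulate iterations take 1ms on this machine. */
static unsigned long calibrate_iters_per_ms(void)
{
    const unsigned long trial = 10000000UL;
    unsigned long j;
    struct timespec t0, t1;
    double ms;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (j = 0; j < trial; j++)
        result += j * j;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    return (unsigned long)(trial / ms);
}

int main(void)
{
    printf("iterations per 1ms: %lu\n", calibrate_iters_per_ms());
    return 0;
}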
Re: [Xen-devel] RT-Xen on ARM
On Mon, Aug 21, 2017 at 4:07 AM, Andrii Anisov wrote: > > Hello Meng Xu, > > > On 18.08.17 23:43, Meng Xu wrote: >> >> Sections 4.1 and 4.2 in [1] explain the whole experiment procedure. >> If you have any questions or confusion about a specific step, please feel >> free to let me know. > > From the document it is not really clear if you ran one guest RT domain or > several simultaneously for your experiments. > We ran 4 VMs simultaneously. Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] RT-Xen on ARM
On Mon, Aug 21, 2017 at 4:16 AM, Andrii Anisov wrote: > On 21.08.17 11:07, Andrii Anisov wrote: >> Hello Meng Xu, >> >> On 18.08.17 23:43, Meng Xu wrote: >>> Sections 4.1 and 4.2 in [1] explain the whole experiment procedure. >>> If you have any questions or confusion about a specific step, please feel >>> free to let me know. >> From the document it is not really clear if you ran one guest RT domain or >> several simultaneously for your experiments. > Also, the Xen RT scheduler setup, like the vcpus' period/budget > configuration for each guest domain, is not described. > It is not obvious if the configured set of vcpus in the experiment setup > utilized all of the pcpus' bandwidth. Given the set of tasks in each VM, we compute the VCPUs' periods and budgets using the CARTS tool [1]. Note that each task has a period and a worst-case execution time (wcet). The configured set of vcpus in the experiment setup may not use all of the pcpus' bandwidth. For example, if we have one task (period = 10ms, wcet = 2ms) on a VCPU, the task's VCPU will not be configured with 100% bandwidth. If that VCPU is the only VCPU on a pcpu, the pcpu's bandwidth won't be fully used, because there simply is not enough workload. [1] https://rtg.cis.upenn.edu/carts/ Best, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
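As a concrete (purely illustrative) instance of this: a VCPU hosting a single task with period = 10ms and wcet = 2ms needs a utilization of at least 2/10 = 20%, so CARTS will emit a reservation of that order, e.g. (period = 10ms, budget >= 2ms) plus some interface overhead, rather than budget = period; the remaining roughly 80% of that pcpu's bandwidth is simply left unreserved.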
Re: [Xen-devel] RT-Xen on ARM
On Mon, Aug 21, 2017 at 4:38 AM, Andrii Anisov wrote: > > On 18.08.17 23:43, Meng Xu wrote: >> >> Sure. The workload we used in the paper is mainly a CPU-intensive task. >> We first calibrate a busy-loop of multiplications that runs for 1ms. >> Then, for a task that executes for exe ms, we simply let the task >> execute the 1ms busy loop exe times. > > I'm a bit confused, why didn't you run the system with rtspin from > LITMUS-RT, any issues with it? The task we are using should do the same amount of calculation for the same assigned execution time. For example, suppose it takes 1ms to run the following piece of code: for (i = 0; i < 1000000; i++) sum += i; This piece of code can be viewed as the "payload" of a realistic workload. Suppose the task is scheduled to run at t0, preempted at t1, resumes at t2, and finishes at t3. We have (t1 - t0) + (t3 - t2) = 1ms and we are sure the task did the addition 1 million times. However, if we use rtspin, it will check whether (t2 - t0) > 1ms. If so, it will claim it has finished its workload even though it hasn't, i.e., it hasn't actually done the addition 1 million times. Since we want to compare whether tasks can finish their "workload" by their deadline under different scheduling algorithms, we should fix the "amount of workload" a task does under different scheduling policies. rtspin() does not achieve our purpose. That's why we don't use it. Note that rtspin() was initially designed to test the scheduling overhead of LITMUS. It does not perform the same amount of workload for the same assigned wcet. > BTW, I've found a set of experimental patches (scripts and functional changes) on > your github: https://github.com/PennPanda/liblitmus . > Are they related to the mentioned document [1]? Not really. The liblitmus repo under my github account is for another project. It is not for [1]'s purpose. The idea of creating the real-time task is similar, though. The real-time task is based on bin/base_task.c in liblitmus. It needs to fill out the job() function, roughly as follows (a fuller, self-contained sketch follows below):

static int job(int wcet)
{
    int i;
    for (i = 0; i < wcet; i++)
        loop_for_one_ms();
    return 0;
}

static void loop_for_one_ms(void)
{
    long j;
    /* 'iterations' is the per-machine calibrated count; it differs across machines */
    for (j = 0; j < iterations; j++)
        result = result + j * j;
}

> >> [1] https://www.cis.upenn.edu/~linhphan/papers/emsoft14-rt-xen.pdf > > -- Hope this helps clear up the confusion. Thanks, Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
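A self-contained version of the above skeleton might look like the following (a sketch: 'iterations' must first be set to the calibrated iterations-per-1ms value for the machine, and in the real setup job() is plugged into liblitmus' bin/base_task.c rather than called from main()):

#include <stdio.h>
#include <stdlib.h>

static volatile unsigned long result;   /* prevents the loop being optimised away */
static long iterations = 500000;        /* per-machine calibrated value (example) */

/* Busy-loop for approximately 1ms of pure CPU work. */
static void loop_for_one_ms(void)
{
    long j;
    for (j = 0; j < iterations; j++)
        result += j * j;
}

/* One job: 'wcet' milliseconds worth of work, no matter how often preempted. */
static int job(int wcet)
{
    int i;
    for (i = 0; i < wcet; i++)
        loop_for_one_ms();
    return 0;
}

int main(int argc, char **argv)
{
    int wcet = (argc > 1) ? atoi(argv[1]) : 1;
    job(wcet);
    printf("done: %d ms of work\n", wcet);
    return 0;
}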
Re: [Xen-devel] Xen 4.10 Development Update
On Mon, Aug 21, 2017 at 6:07 AM, Julien Grall wrote: > This email only tracks big items for the xen.git tree. Please reply for items you > would like to see in 4.10 so that people have an idea what is going on and > prioritise accordingly. > > You're welcome to provide a description and use cases of the feature you're > working on. > > = Timeline = > > We now adopt a fixed cut-off date scheme. We will release twice a > year. The upcoming 4.10 timeline is as follows: > > * Last posting date: September 15th, 2017 > * Hard code freeze: September 29th, 2017 > * RC1: TBD > * Release: December 2, 2017 > > Note that we don't have a freeze exception scheme anymore. All patches > that wish to go into 4.10 must be posted no later than the last posting > date. All patches posted after that date will be automatically queued > into the next release. > > RCs will be arranged immediately after the freeze. > > We recently introduced a jira instance to track all the tasks (not only big ones) > for the project. See: https://xenproject.atlassian.net/projects/XEN/issues. > > Most of the tasks tracked by this e-mail also have a corresponding jira task > referred to by XEN-N. > > I have started to include the version number of the series associated with each > feature. Can each owner send an update on the version number if the series > was posted upstream? > > = Projects = > > == Hypervisor == > > * Per-cpu tasklet > - XEN-28 > - Konrad Rzeszutek Wilk > > * Add support of rcu_idle_{enter,exit} > - XEN-27 > - Dario Faggioli I'm working on making the RTDS scheduler work-conserving. The first version of the patch series has been posted at https://www.mail-archive.com/xen-devel@lists.xen.org/msg117062.html, after we discussed the RFC patch. Thanks, Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH 0/5] Extend resources to support more vcpus in single VM
Hi Tianyu, On Thu, Aug 24, 2017 at 10:52 PM, Lan Tianyu wrote: > > This patchset extends some resources (i.e., event channel, > hap and so on) to support more vcpus for a single VM. > > > Chao Gao (1): > xl/libacpi: extend lapic_id() to uint32_t > > Lan Tianyu (4): > xen/hap: Increase hap size for more vcpus support > XL: Increase event channels to support more vcpus > Tool/ACPI: DSDT extension to support more vcpus > hvmload: Add x2apic entry support in the MADT build > > tools/firmware/hvmloader/util.c | 2 +- > tools/libacpi/acpi2_0.h | 10 +++ > tools/libacpi/build.c | 61 > + > tools/libacpi/libacpi.h | 2 +- > tools/libacpi/mk_dsdt.c | 11 > tools/libxl/libxl_create.c | 2 +- > tools/libxl/libxl_x86_acpi.c| 2 +- > xen/arch/x86/mm/hap/hap.c | 2 +- > 8 files changed, 63 insertions(+), 29 deletions(-) How many VCPUs for a single VM do you want to support with this patch set? Thanks, Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Change repl_budget event output for xentrace formats and xenalyze Signed-off-by: Meng Xu --- Changes from v1 Add this changes from v1 --- tools/xentrace/formats| 2 +- tools/xentrace/xenalyze.c | 8 +--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index f39182a..470ac5c 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -75,7 +75,7 @@ 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ cpu = %(1)d ] 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 39fc35f..6fb952c 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7944,12 +7944,14 @@ void sched_process(struct pcpu_info *p) if(opt.dump_all) { struct { unsigned int vcpuid:16, domid:16; +unsigned int priority_level; uint64_t cur_dl, cur_bg; } __attribute__((packed)) *r = (typeof(r))ri->d; -printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", " - "budget = %"PRIu64"\n", ri->dump_header, - r->domid, r->vcpuid, r->cur_dl, r->cur_bg); +printf(" %s rtds:repl_budget d%uv%u, priority_level = %u," + "deadline = %"PRIu64", budget = %"PRIu64"\n", + ri->dump_header, r->domid, r->vcpuid, + r->priority_level, r->cur_dl, r->cur_bg); } break; case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/ -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 1/5] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has extratime flag set, its priority_level will increase by 1 and its budget will be refilled; othewrise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_leve; or (ii) v1 has the same priority_level but has a smaller deadline Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPUs priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Signed-off-by: Meng Xu --- Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra as suggested by Dario Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 4 ++ 2 files changed, 80 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 0ac5816..fab6f49 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. */ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. 
*/ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return (svc->flags & RTDS_extratime) ? 1 : 0; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," " cur_b=%"PRI_stime" cur_d=%"PRI_stime" last_start=%"PRI_stime"\n" + " \t\t priority_level=%d has_extratime=%d\n" " \t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n", svc->vcpu
[Xen-devel] [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU extratime flag Signed-off-by: Meng Xu --- Changes from v1 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is supported 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Changes from RFC v1 Change work_conserving flag to extratime flag --- tools/libxl/libxl_sched.c | 12 1 file changed, 12 insertions(+) --- tools/libxl/libxl.h | 6 ++ tools/libxl/libxl_sched.c | 18 ++ 2 files changed, 24 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 1704525..ead300f 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -257,6 +257,12 @@ #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 /* + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler + * now supports per-vcpu extratime settings. + */ +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 + +/* * libxl_domain_build_info has the arm.gic_version field. */ #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index faa604e..b76a29a 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -558,6 +558,10 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +if (vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra) + scinfo->vcpus[i].extratime = 1; +else + scinfo->vcpus[i].extratime = 0; scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -607,6 +611,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +if (scinfo->vcpus[i].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -655,6 +663,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +if (scinfo->vcpus[0].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -705,6 +717,12 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t domid, sdom.period = scinfo->period; if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) sdom.budget = scinfo->budget; +if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT) { +if (scinfo->extratime) +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; +} if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) return ERROR_INVAL; -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 0/5] Towards work-conserving RTDS
This series of patches makes the RTDS scheduler work-conserving without breaking real-time guarantees. VCPUs with the extratime flag set can get extra time from the unreserved system resource. System administrators can decide which VCPUs have the extratime flag set. Example: Set the extratime bit of all VCPUs of domain 1: # xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 1 Each VCPU of domain 1 will be guaranteed to have 2000us every 10000us (if the system is schedulable). If there is a CPU having no work to do, domain 1's VCPUs will be scheduled onto the CPU, even though the VCPUs have already got their 2000us in the 10000us period. Clear the extratime bit of all VCPUs of domain 1: # xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 0 Set/Clear the extratime bit of one specific VCPU of domain 1: # xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 1 # xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 0 The original design of the work-conserving RTDS was discussed at https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html The first version was discussed at https://www.mail-archive.com/xen-devel@lists.xen.org/msg117361.html The series of patches can be found on github: https://github.com/PennPanda/RT-Xen under the branch: xenbits/rtds/work-conserving-v2 Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Revise xentrace, xenalyze, and docs Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA symbol in libxl.h Changes from RFC v1 Merge changes in sched_rt.c into one patch; Minor change in variable name and comments. Signed-off-by: Meng Xu [PATCH v2 1/5] xen:rtds: towards work conserving RTDS [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS [PATCH v2 3/5] xl: enable per-VCPU extratime flag for RTDS [PATCH v2 4/5] xentrace: enable per-VCPU extratime flag for RTDS [PATCH v2 5/5] docs: enable per-VCPU extratime flag for RTDS ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 3/5] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU extratime flag. Signed-off-by: Meng Xu --- Changes from v1 No change because we agree on using -e 0/1 option to set if a vcpu will get extra time or not Changes from RFC v1 Changes work_conserving flag to extratime flag --- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 56 ++ 2 files changed, 40 insertions(+), 19 deletions(-) --- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 56 ++ 2 files changed, 40 insertions(+), 19 deletions(-) diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index ba0159d..1b03d44 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = { { "sched-rtds", &main_sched_rtds, 0, 1, "Get/set rtds scheduler parameters", - "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]", + "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]] [-e[=EXTRATIME]]]", "-d DOMAIN, --domain=DOMAIN Domain to modify\n" "-v VCPUID/all, --vcpuid=VCPUID/allVCPU to modify or output;\n" " Using '-v all' to modify/output all vcpus\n" "-p PERIOD, --period=PERIOD Period (us)\n" "-b BUDGET, --budget=BUDGET Budget (us)\n" + "-e EXTRATIME, --extratime=EXTRATIME EXTRATIME (1=yes, 0=no)\n" }, { "domid", &main_domid, 0, 0, diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c index 85722fe..5138012 100644 --- a/tools/xl/xl_sched.c +++ b/tools/xl/xl_sched.c @@ -251,7 +251,7 @@ static int sched_rtds_domain_output( libxl_domain_sched_params scinfo; if (domid < 0) { -printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget"); +printf("%-33s %4s %9s %9s %10s\n", "Name", "ID", "Period", "Budget", "Extra time"); return 0; } @@ -262,11 +262,12 @@ static int sched_rtds_domain_output( } domname = libxl_domid_to_name(ctx, domid); -printf("%-33s %4d %9d %9d\n", +printf("%-33s %4d %9d %9d %10s\n", domname, domid, scinfo.period, -scinfo.budget); +scinfo.budget, +scinfo.extratime ? "yes" : "no"); free(domname); libxl_domain_sched_params_dispose(&scinfo); return 0; @@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Extra time"); return 0; } @@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budget, + scinfo->vcpus[i].extratime ? "yes" : "no"); } free(domname); return 0; @@ -309,8 +311,8 @@ static int sched_rtds_vcpu_output_all(int domid, int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Extra time"); return 0; } @@ -321,12 +323,13 @@ static int sched_rtds_vcpu_output_all(int domid, domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budg
[Xen-devel] [PATCH v2 5/5] docs: enable per-VCPU extratime flag for RTDS
Revise the xl tool use case by adding the -e option Remove work-conserving from the TODO list Signed-off-by: Meng Xu --- Changes from v1 Revise rtds docs --- docs/features/sched_rtds.pandoc | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/features/sched_rtds.pandoc b/docs/features/sched_rtds.pandoc index 354097b..d51b499 100644 --- a/docs/features/sched_rtds.pandoc +++ b/docs/features/sched_rtds.pandoc @@ -40,7 +40,7 @@ as follows: It is possible, for a multiple vCPUs VM, to change the parameters of each vCPU individually: -* `xl sched-rtds -d vm-rt -v 0 -p 20000 -b 10000 -v 1 -p 45000 -b 12000` +* `xl sched-rtds -d vm-rt -v 0 -p 20000 -b 10000 -e 1 -v 1 -p 45000 -b 12000 -e 0` # Technical details @@ -53,7 +53,8 @@ the presence of the LIBXL\_HAVE\_SCHED\_RTDS symbol. The ability of specifying different scheduling parameters for each vcpu has been introduced later, and is available if the following symbols are defined: * `LIBXL\_HAVE\_VCPU\_SCHED\_PARAMS`, -* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`. +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`, +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_EXTRA`. # Limitations @@ -95,7 +96,6 @@ at a macroscopic level), the following should be done: # Areas for improvement -* Work-conserving mode to be added; * performance assessment, especially focusing on what level of real-time behavior the scheduler enables. @@ -118,4 +118,5 @@ Date Revision Version Notes -- --- 2016-10-14 1 Xen 4.8 Document written +2017-08-31 2 Xen 4.10 Revise for work conserving feature -- --- -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS
Dario, I didn't include your Reviewed-by tag because I made one small change. On Fri, Sep 1, 2017 at 11:58 AM, Meng Xu wrote: > > Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set > functions to support per-VCPU extratime flag > > Signed-off-by: Meng Xu > > --- > Changes from v1 > 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is > supported > 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to > XEN_DOMCTL_SCHEDRT_extra > > Changes from RFC v1 > Change work_conserving flag to extratime flag > --- > tools/libxl/libxl_sched.c | 12 > 1 file changed, 12 insertions(+) > --- > tools/libxl/libxl.h | 6 ++ > tools/libxl/libxl_sched.c | 18 ++ > 2 files changed, 24 insertions(+) > > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h > index 1704525..ead300f 100644 > --- a/tools/libxl/libxl.h > +++ b/tools/libxl/libxl.h > @@ -257,6 +257,12 @@ > #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 > > /* > + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler > + * now supports per-vcpu extratime settings. > + */ > +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 > + > +/* > * libxl_domain_build_info has the arm.gic_version field. > */ > #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 > diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c > index faa604e..b76a29a 100644 > --- a/tools/libxl/libxl_sched.c > +++ b/tools/libxl/libxl_sched.c > @@ -558,6 +558,10 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, > uint32_t domid, > for (i = 0; i < num_vcpus; i++) { > scinfo->vcpus[i].period = vcpus[i].u.rtds.period; > scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; > +if (vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra) > + scinfo->vcpus[i].extratime = 1; > +else > + scinfo->vcpus[i].extratime = 0; > scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; > } > rc = 0; > @@ -607,6 +611,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t > domid, > vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; > vcpus[i].u.rtds.period = scinfo->vcpus[i].period; > vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; > +if (scinfo->vcpus[i].extratime) > +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; > +else > +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; > } > > r = xc_sched_rtds_vcpu_set(CTX->xch, domid, > @@ -655,6 +663,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, > uint32_t domid, > vcpus[i].vcpuid = i; > vcpus[i].u.rtds.period = scinfo->vcpus[0].period; > vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; > +if (scinfo->vcpus[0].extratime) > +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; > +else > +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; > } > > r = xc_sched_rtds_vcpu_set(CTX->xch, domid, > @@ -705,6 +717,12 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t > domid, > sdom.period = scinfo->period; > if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) > sdom.budget = scinfo->budget; > +if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT) { > +if (scinfo->extratime) > +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; > +else > +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; > +} > if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) > return ERROR_INVAL; As you mentioned in the comment to the xl patch v1, I used LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT for extratime flag as what we did for period and budget. But the way we handle flags is exactly the same with the way we handle period and budget. 
I'm OK with what is in this patch, although I feel that we can kill the check

    if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT)

because LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT is -1: once the check is gone, the default value is still non-zero, so the flag gets set anyway.

What do you think?

Thanks,

Meng

-- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] MAINTAINERS: update entries to new email address.
On Thu, Oct 5, 2017 at 10:28 AM, Dario Faggioli wrote: > Replace, in the 'M:' fields of the components I co-maintain > ('CPU POOLS', 'SCHEDULING' and 'RTDS SCHEDULER'), the Citrix > email, to which I don't have access any longer, with my > personal email. > > Signed-off-by: Dario Faggioli > --- > Cc: Andrew Cooper > Cc: George Dunlap > Cc: Ian Jackson > Cc: Jan Beulich > Cc: Konrad Rzeszutek Wilk > Cc: Stefano Stabellini > Cc: Tim Deegan > Cc: Wei Liu > Cc: Juergen Gross > Cc: Meng Xu > Acked-by: Meng Xu Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] Changing my email address
Hi Dario,

On Thu, Oct 5, 2017 at 10:28 AM, Dario Faggioli wrote:
>
> Hello,
>
> Soon I won't have access to dario.faggi...@citrix.com email address.

It's sad to hear this. :(

>
> Therefore, replace it, in my entries in MAINTAINERS, with an email address
> that I actually can, and will actually read.
>
> One thing about RTDS. Meng, which one of the following two sentences better
> describes your situation?
>
> a) Supported: Someone is actually paid to look after this.
> b) Maintained: Someone actually looks after it.
>
> If it's a) (you're currently paid to look after RTDS), then we're fine.

I'm paid to look after RTDS at least until I graduate. :)

Best regards,

Meng

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] RT-Xen on ARM
Hi Andrii,

I'm sorry for replying to this thread late. I was busy with a paper deadline
until last Saturday morning.

I saw Dario's thorough answer, which explains the high-level idea of the
real-time analysis that is the theoretical foundation of the analysis tool,
e.g., CARTS. Hopefully, he answered your question. If not, please feel free
to ask. I just added some very quick comments about your questions/comments
below.

On Thu, Sep 28, 2017 at 5:18 AM, Andrii Anisov wrote:
> Hello,
>
> On 27.09.17 22:57, Meng Xu wrote:
>>
>> Note that:
>> When you use gEDF scheduler in VM or VMM (i.e., Xen), you should use
>> MPR2 model
>
> I guess you mean DMPR in CARTS terms.
>
>> to compute the resource interface (i.e., VCPU parameters).
>> When you use pEDF scheduler, you should use PRM model to compute.
>>>
>>> - Could you please provide an example input xml for CARTS describing a
>>> system with 2 RT domains with 2 VCPUs each, running on 2 PCPUs, with gEDF
>>> scheduling at VMM level (for XEN based setup).
>>
>> Hmm, if you use the gEDF scheduling algorithm, this may not be
>> possible. Let me explain why.
>> In the MPR2 model, it computes the interface with the minimum number
>> of cores. To get 2 VCPUs for a VM, the total utilization (i.e., budget
>> / period) of these two VCPUs must be larger than 1.0. Since you ask
>> for 2 domains, the total utilization of these 4 VCPUs will be larger
>> than 2.0, which is definitely not schedulable on two cores.
>
> Well, if we are speaking about test-cases similar to described in [1], where
> the whole real time tasks set utilization is taken from 1.1...(PCPU*1)-0.1,
> there is no problem with having VCPU number greater than PCPUs. For sure if
> we take number of domains more than 1.

The number of VCPUs can be larger than the number of PCPUs.

>
>> If you are considering VCPUs with very low utilization, you may use
>> PRM model to compute each VCPU's parameters; after that, you can treat
>> these VCPUs as tasks, create another xml file, and ask CARTS to
>> compute the resource interface for these VCPUs.
>
> Sounds terrible for getting it scripted :(

If you use Python to parse the XML file, it should not be very difficult.
Python has APIs to parse XML. :)

>>
>> (Unfortunately, the current CARTS implementation does not support
>> mixing MPR model in one XML file, although it is supported in theory.
>> This can be worked around by using the above approach.)
>>
>>> For pEDF at both VMM and
>>> domain level, my understanding is that the os_scheduler represents XEN,
>>> and
>>> VCPUs are represented by components with tasks running on them.
>>
>> Yes, if you analyze the entire system that uses one type of scheduler
>> with only one type of model (i.e., PRM or MPR2).
>>
>> If you mix the scheduling algorithm or the interface model, you can
>> compute each VM or VCPU's parameters first. Then you treat VCPUs as
>> tasks and create another XML which will be used to compute the number
>> of cores to schedule all these VCPUs.
>>
>>> - I did not get a concept of min_period/max_period for a
>>> component/os_scheduler in CARTS description files. If I have them
>>> different,
>>> CARTS gives calculation for all periods in between, but did not provide
>>> the
>>> best period to get system schedulable.
>>
>> You should set them to the same value.
>
> Ok, how to choose the value for some taskset in a component?

Tasks' periods and execution times depend on the tasks' requirements.
As Dario mentioned, if a sensor needs to process data every 100ms, the sensor task's period is 100ms. Its execution time is the worst-case execution time of the sensor task.

As to the component's (or VM's) period, it's better for it to be smaller than its tasks' periods. Usually, I would set it to a value that evenly divides its tasks' periods. You may try different values for the component's period, because the VCPU's bandwidth (budget/period) will be different for different component periods. You can choose the component period that produces the smallest VCPU bandwidth, which may help make the VCPUs easier to schedule on the PCPUs.

Best,

Meng

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
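To make the "try different component periods and compare the resulting bandwidth" advice concrete, here is a toy C sketch in the spirit of the periodic resource model that CARTS builds on: for each candidate VCPU period it searches for the smallest budget whose supply bound function covers the task set's EDF demand bound function, and prints the resulting bandwidth. It is a simplified stand-in for CARTS's actual analysis (integer time units, implicit deadlines, demand checked only up to the hyperperiod, no overheads), and all names and task values below are made up:

    #include <stdio.h>
    #include <stdint.h>

    struct task { uint64_t p, e; };               /* period, WCET */

    /* EDF demand bound function for implicit-deadline periodic tasks. */
    static uint64_t dbf(const struct task *ts, int n, uint64_t t)
    {
        uint64_t d = 0;
        for (int i = 0; i < n; i++)
            d += (t / ts[i].p) * ts[i].e;
        return d;
    }

    /* Supply bound function of a periodic resource (period PI, budget TH). */
    static uint64_t sbf(uint64_t PI, uint64_t TH, uint64_t t)
    {
        if (t < PI - TH)
            return 0;
        uint64_t k = (t - (PI - TH)) / PI;
        uint64_t ramp = 2 * (PI - TH) + k * PI;   /* worst-case blackout */
        return k * TH + (t > ramp ? t - ramp : 0);
    }

    /* Smallest budget such that supply covers demand at every instant
     * up to the hyperperiod H (a simplification of the real test). */
    static uint64_t min_budget(const struct task *ts, int n,
                               uint64_t PI, uint64_t H)
    {
        for (uint64_t TH = 1; TH <= PI; TH++) {
            uint64_t t;
            for (t = 1; t <= H; t++)
                if (dbf(ts, n, t) > sbf(PI, TH, t))
                    break;
            if (t > H)
                return TH;
        }
        return 0;                                 /* infeasible */
    }

    int main(void)
    {
        struct task ts[] = { { 100, 20 }, { 200, 30 } };   /* e.g. in ms */
        uint64_t H = 200;                                  /* hyperperiod */

        for (uint64_t PI = 10; PI <= 100; PI += 10) {
            uint64_t TH = min_budget(ts, 2, PI, H);
            if (TH)
                printf("period %3llu -> budget %3llu (bandwidth %.2f)\n",
                       (unsigned long long)PI, (unsigned long long)TH,
                       (double)TH / (double)PI);
        }
        return 0;
    }

With these made-up tasks, small periods that evenly divide 100 and 200 come out with a noticeably lower bandwidth than a period of 100, which is exactly the effect described above.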
Re: [Xen-devel] [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS
On Tue, Sep 19, 2017 at 5:23 AM, Dario Faggioli wrote:
>
> On Fri, 2017-09-15 at 12:01 -0400, Meng Xu wrote:
> > On Wed, Sep 13, 2017 at 8:16 PM, Dario Faggioli wrote:
> > >
> > > > I'm OK with what is in this patch, although I feel that we can
> > > > kill the check
> > > > if (scinfo->extratime !=
> > > > LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT)
> > > > because LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT is -1.
> > >
> > > No, sorry, I don't understand what you mean here...
> >
> > I was thinking about the following code:
> >
> >     if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT) {
> >         if (scinfo->extratime)
> >             sdom.flags |= XEN_DOMCTL_SCHEDRT_extra;
> >         else
> >             sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra;
> >     }
> >
> > This code can be changed to
> >
> >     if (scinfo->extratime)
> >         sdom.flags |= XEN_DOMCTL_SCHEDRT_extra;
> >     else
> >         sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra;
> >
> > If extratime uses the default value (-1), we still set the extratime
> > flag.
> >
> > That's why I feel we may kill the
> > if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT)
>
> Mmm... Ok, I see it now. Well, this is of course all up to the tools'
> maintainers.
>
> What I think it would be valuable to ask ourselves here is: can, at this
> point, scinfo->extratime be equal to
> LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT?
>
> And if it is, what does it mean, and what do we want to do?
>
> I mean, if extratime is -1, it means that we've been called without it
> being touched by xl (although, remember that, as a library, libxl can
> be linked to and called by other programs too, e.g., libvirt).
>
> If you think that this is a serious programming bug, you can use
> LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT to check that, and raise an
> assert.
>
> If you think it's an API misuse, you can use it to check for that, and
> return an error.
>
> If you think it's just fine, you can do whatever you want to do as
> default (which, AFAIUI, is to set the flag). In this case, it's probably
> fine to ignore LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT in actual code.
> Although, I'd still put a reference to it in a comment, to explain
> what's going on, and why we're doing things differently from budget and
> period (since _their_ *_DEFAULT are checked).

I think it should be fine for an API user to call the function without
setting the extratime parameter. We set extratime by default. I will go
with the following code in the next version:

>     if (scinfo->extratime)
>         sdom.flags |= XEN_DOMCTL_SCHEDRT_extra;
>     else
>         sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra;

Thank you very much!

Best,

Meng

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/5] xl: enable per-VCPU extratime flag for RTDS
On Wed, Sep 13, 2017 at 8:51 PM, Dario Faggioli wrote:
> On Fri, 2017-09-01 at 11:58 -0400, Meng Xu wrote:
> > diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
> > index ba0159d..1b03d44 100644
> > --- a/tools/xl/xl_cmdtable.c
> > +++ b/tools/xl/xl_cmdtable.c
> > @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = {
> >     { "sched-rtds",
> >       &main_sched_rtds, 0, 1,
> >       "Get/set rtds scheduler parameters",
> > -     "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]",
> > +     "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]] [-e[=EXTRATIME]]]",
> >       "-d DOMAIN, --domain=DOMAIN     Domain to modify\n"
> >       "-v VCPUID/all, --vcpuid=VCPUID/all    VCPU to modify or output;\n"
> >       "    Using '-v all' to modify/output all vcpus\n"
> >       "-p PERIOD, --period=PERIOD     Period (us)\n"
> >       "-b BUDGET, --budget=BUDGET     Budget (us)\n"
> > +     "-e EXTRATIME, --extratime=EXTRATIME EXTRATIME (1=yes, 0=no)\n"
>
> Extratime?

We need to provide the option to configure the extratime flag for each vcpu, right?

> > },
> > { "domid",
> >    &main_domid, 0, 0,
> > diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c
> > index 85722fe..5138012 100644
> > --- a/tools/xl/xl_sched.c
> > +++ b/tools/xl/xl_sched.c
> > @@ -251,7 +251,7 @@ static int sched_rtds_domain_output(
> >     libxl_domain_sched_params scinfo;
> >
> >     if (domid < 0) {
> > -        printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget");
> > +        printf("%-33s %4s %9s %9s %10s\n", "Name", "ID", "Period", "Budget", "Extra time");
> >         return 0;
> >     }
>
> Can you paste the output of:

Sure

> xl sched-rtds

Cpupool Pool-0: sched=RTDS
Name                                  ID    Period    Budget Extra time
Domain-0                               0     10000      4000        yes

> xl sched-rtds -d 0

Name                                  ID    Period    Budget Extra time
Domain-0                               0     10000      4000        yes

> xl sched-rtds -d 0 -v 1

Name                                  ID VCPU    Period    Budget Extra time
Domain-0                               0    1     10000      4000        yes

> xl sched-rtds -d 0 -v all

Name                                  ID VCPU    Period    Budget Extra time
Domain-0                               0    0     10000      4000        yes
Domain-0                               0    1     10000      4000        yes
Domain-0                               0    2     10000      4000        yes
Domain-0                               0    3     10000      4000        yes
Domain-0                               0    4     10000      4000        yes
Domain-0                               0    5     10000      4000        yes
Domain-0                               0    6     10000      4000        yes
Domain-0                               0    7     10000      4000        yes
Domain-0                               0    8     10000      4000        yes
Domain-0                               0    9     10000      4000        yes
Domain-0                               0   10     10000      4000        yes
Domain-0                               0   11     10000      4000        yes

> with the series applied?

> > @@ -785,8 +801,9 @@ int main_sched_rtds(int argc, char **argv)
> >         goto out;
> >     }
> >     if (((v_index > b_index) && opt_b) || ((v_index > p_index) && opt_p)
> > -        || p_index != b_index) {
> > -        fprintf(stderr, "Incorrect number of period and budget\n");
> > +        || ((v_index > e_index) && opt_e) || p_index != b_index
> > +        || p_index != e_index || b_index != e_index ) {
>
> I don't think you need the `b_index != e_index` part. If p==b and p==e,
> it's automatically true that b==e.

Right.

> > @@ -820,7 +837,7 @@ int main_sched_rtds(int argc, char **argv)
> >             r = EXIT_FAILURE;
> >             goto out;
> >         }
> > -    } else if (!opt_p && !opt_b) {
> > +    } else if (!opt_p && !opt_b && !opt_e) {
> >         /* get per-vcpu rtds scheduling parameters */
> >         libxl_vcpu_sched_params scinfo;
> >         libx
[Xen-devel] [PATCH v3 2/5] libxl: enable per-VCPU extratime flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU extratime flag Signed-off-by: Meng Xu --- Changes from v2 1) Move extratime out of the section that is marked as depreciated in libxl_domain_sched_params. 2) Set vcpu extratime in sched_rtds_vcpu_get function function; This fix a bug in previous version when run command "xl sched-rtds -d 0 -v 1" which outputs vcpu extratime value incorrectly. Changes from v1 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is supported 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Changes from RFC v1 Change work_conserving flag to extratime flag --- tools/libxl/libxl.h | 6 ++ tools/libxl/libxl_sched.c | 17 + tools/libxl/libxl_types.idl | 8 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index f82b91e..5e9aed7 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -257,6 +257,12 @@ #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 /* + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler + * now supports per-vcpu extratime settings. + */ +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 + +/* * libxl_domain_build_info has the arm.gic_version field. */ #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index 7d144d0..512788f 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -532,6 +532,8 @@ static int sched_rtds_vcpu_get(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -579,6 +581,8 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -628,6 +632,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +if (scinfo->vcpus[i].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -676,6 +684,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +if (scinfo->vcpus[0].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -726,6 +738,11 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t domid, sdom.period = scinfo->period; if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) sdom.budget = scinfo->budget; +/* Set extratime by default */ +if (scinfo->extratime) +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) return ERROR_INVAL; diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 2d0bb8a..dd7d364 100644 --- 
a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -421,14 +421,14 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("cap", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("extratime",integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), -# The following three parameters ('slice', 'latency' and 'extratime') are deprecated, +# The following three parameters ('slice' and 'latency') are deprecated, # and will have no effect if used, since the S
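As a usage illustration of the interface above (not part of the series): a minimal sketch of how a libxl client might set the new per-VCPU flag. The helper name and the parameter values are made up, error handling and libxl_ctx setup are elided, and the calloc result is assumed good:

    #include <stdlib.h>
    #include <libxl.h>

    /* Hypothetical helper: give one VCPU a 4ms budget every 10ms, plus
     * the right to run on otherwise-idle time (extratime). */
    static int set_vcpu_extratime(libxl_ctx *ctx, uint32_t domid, int vcpuid)
    {
        libxl_vcpu_sched_params sp;
        int rc;

        libxl_vcpu_sched_params_init(&sp);
        sp.sched = LIBXL_SCHEDULER_RTDS;
        sp.num_vcpus = 1;
        sp.vcpus = calloc(1, sizeof(*sp.vcpus));
        libxl_sched_params_init(&sp.vcpus[0]);
        sp.vcpus[0].vcpuid = vcpuid;
        sp.vcpus[0].period = 10000;     /* us */
        sp.vcpus[0].budget = 4000;      /* us */
        sp.vcpus[0].extratime = 1;      /* the flag added by this patch */

        rc = libxl_vcpu_sched_params_set(ctx, domid, &sp);
        libxl_vcpu_sched_params_dispose(&sp);   /* also frees sp.vcpus */
        return rc;
    }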
[Xen-devel] [PATCH v3 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Change repl_budget event output for xentrace formats and xenalyze Signed-off-by: Meng Xu --- No changes from v2 Changes from v1 Add this changes from v1 --- tools/xentrace/formats| 2 +- tools/xentrace/xenalyze.c | 8 +--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index d6e7e3f..7d3a209 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -75,7 +75,7 @@ 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ cpu = %(1)d ] 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 79bdba7..2783204 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7946,12 +7946,14 @@ void sched_process(struct pcpu_info *p) if(opt.dump_all) { struct { unsigned int vcpuid:16, domid:16; +unsigned int priority_level; uint64_t cur_dl, cur_bg; } __attribute__((packed)) *r = (typeof(r))ri->d; -printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", " - "budget = %"PRIu64"\n", ri->dump_header, - r->domid, r->vcpuid, r->cur_dl, r->cur_bg); +printf(" %s rtds:repl_budget d%uv%u, priority_level = %u," + "deadline = %"PRIu64", budget = %"PRIu64"\n", + ri->dump_header, r->domid, r->vcpuid, + r->priority_level, r->cur_dl, r->cur_bg); } break; case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/ -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 5/5] docs: enable per-VCPU extratime flag for RTDS
Revise xl tool use case by adding -e option Remove work-conserving from TODO list Signed-off-by: Meng Xu --- No change from v2 Changes from v1 Revise rtds docs --- docs/features/sched_rtds.pandoc | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/features/sched_rtds.pandoc b/docs/features/sched_rtds.pandoc index 354097b..d51b499 100644 --- a/docs/features/sched_rtds.pandoc +++ b/docs/features/sched_rtds.pandoc @@ -40,7 +40,7 @@ as follows: It is possible, for a multiple vCPUs VM, to change the parameters of each vCPU individually: -* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -v 1 -p 45000 -b 12000` +* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -e 1 -v 1 -p 45000 -b 12000 -e 0` # Technical details @@ -53,7 +53,8 @@ the presence of the LIBXL\_HAVE\_SCHED\_RTDS symbol. The ability of specifying different scheduling parameters for each vcpu has been introduced later, and is available if the following symbols are defined: * `LIBXL\_HAVE\_VCPU\_SCHED\_PARAMS`, -* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`. +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`, +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_EXTRA`. # Limitations @@ -95,7 +96,6 @@ at a macroscopic level), the following should be done: # Areas for improvement -* Work-conserving mode to be added; * performance assessment, especially focusing on what level of real-time behavior the scheduler enables. @@ -118,4 +118,5 @@ at a macroscopic level), the following should be done: Date Revision Version Notes -- --- 2016-10-14 1Xen 4.8 Document written +2017-08-31 2Xen 4.10 Revise for work conserving feature -- --- -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 0/5] Towards work-conserving RTDS
This series of patches makes the RTDS scheduler work-conserving without breaking real-time guarantees. VCPUs with the extratime flag set can get extra time from the unreserved system resource. System administrators can decide which VCPUs have the extratime flag set.

Example:

Set the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 1

Each VCPU of domain 1 will be guaranteed to have 2000us every 10000us (if the system is schedulable). If there is a CPU having no work to do, domain 1's VCPUs will be scheduled onto that CPU, even though the VCPUs have already got their 2000us in 10000us.

Clear the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 0

Set/Clear the extratime bit of one specific VCPU of domain 1:
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 1
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 0

The original design of the work-conserving RTDS was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html
The first version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg117361.html
The second version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg120618.html

The series of patches can be found on GitHub:
https://github.com/PennPanda/RT-Xen
under the branch: xenbits/rtds/work-conserving-v3.1

Changes from v2
Sanity check the input of the -e option, which can only be 0 or 1
Set -e to 1 by default if a 3rd party library does not set the -e option
Set vcpu extratime in the sched_rtds_vcpu_get function, which fixes a bug in the previous version
Change EXTRATIME to Extratime in the xl output

Changes from v1
Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra
Revise xentrace, xenalyze, and docs
Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA symbol in libxl.h

Changes from RFC v1
Merge changes in sched_rt.c into one patch;
Minor change in variable name and comments.

Signed-off-by: Meng Xu

[PATCH v3 1/5] xen:rtds: towards work conserving RTDS
[PATCH v3 2/5] libxl: enable per-VCPU extratime flag for RTDS
[PATCH v3 3/5] xl: enable per-VCPU extratime flag for RTDS
[PATCH v3 4/5] xentrace: enable per-VCPU extratime flag for RTDS
[PATCH v3 5/5] docs: enable per-VCPU extratime flag for RTDS

___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 3/5] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU extratime flag. Signed-off-by: Meng Xu --- Changes from v2 Validate the -e option input that can only be 0 or 1 Update docs/man/xl.pod.1.in Change EXTRATIME to Extratime Changes from v1 No change because we agree on using -e 0/1 option to set if a vcpu will get extra time or not Changes from RFC v1 Changes work_conserving flag to extratime flag --- docs/man/xl.pod.1.in | 59 +-- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 62 +++--- 3 files changed, 78 insertions(+), 46 deletions(-) diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in index cd8bb1c..486a24f 100644 --- a/docs/man/xl.pod.1.in +++ b/docs/man/xl.pod.1.in @@ -1117,11 +1117,11 @@ as B<--ratelimit_us> in B Set or get rtds (Real Time Deferrable Server) scheduler parameters. This rt scheduler applies Preemptive Global Earliest Deadline First real-time scheduling algorithm to schedule VCPUs in the system. -Each VCPU has a dedicated period and budget. -VCPUs in the same domain have the same period and budget. +Each VCPU has a dedicated period, budget and extratime. While scheduled, a VCPU burns its budget. A VCPU has its budget replenished at the beginning of each period; Unused budget is discarded at the end of each period. +A VCPU with extratime set gets extra time from the unreserved system resource. B @@ -1145,6 +1145,11 @@ Period of time, in microseconds, over which to replenish the budget. Amount of time, in microseconds, that the VCPU will be allowed to run every period. +=item B<-e Extratime>, B<--extratime=Extratime> + +Binary flag to decide if the VCPU will be allowed to get extra time from +the unreserved system resource. + =item B<-c CPUPOOL>, B<--cpupool=CPUPOOL> Restrict output to domains in the specified cpupool. @@ -1160,57 +1165,57 @@ all the domains: xl sched-rtds -v all Cpupool Pool-0: sched=RTDS -NameID VCPUPeriodBudget -Domain-0 00 1 4000 -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 -vm2 20 1 4000 -vm2 21 1 4000 +NameID VCPUPeriodBudget Extratime +Domain-0 00 1 4000yes +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes +vm2 40 1 4000yes +vm2 41 1 4000yes Without any arguments, it will output the default scheduling parameters for each domain: xl sched-rtds Cpupool Pool-0: sched=RTDS -NameIDPeriodBudget -Domain-0 0 1 4000 -vm1 1 1 4000 -vm2 2 1 4000 +NameIDPeriodBudget Extratime +Domain-0 0 1 4000yes +vm1 2 1 4000yes +vm2 4 1 4000yes -2) Use, for instancei, B<-d vm1, -v all> to see the budget and +2) Use, for instance, B<-d vm1, -v all> to see the budget and period of all VCPUs of a specific domain (B): xl sched-rtds -d vm1 -v all -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 +NameID VCPUPeriodBudget Extratime +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes To see the parameters of a subset of the VCPUs of a domain, use: xl sched-rtds -d vm1 -v 0 -v 3 -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 13 1000 500 +NameID VCPUPeriodBudget Ext
[Xen-devel] [PATCH v3 1/5] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has extratime flag set, its priority_level will increase by 1 and its budget will be refilled; othewrise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_leve; or (ii) v1 has the same priority_level but has a smaller deadline Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPUs priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Distribution of spare bandwidth Spare bandwidth is distributed among all VCPUs with extratime flag set, proportional to these VCPUs utilizations Signed-off-by: Meng Xu --- Changes from v2 Explain how to distribute spare bandwidth in commit log Minor change in has_extratime function without functionality change. Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra as suggested by Dario Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 4 ++ 2 files changed, 80 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 5c51cd9..b770287 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. 
*/ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return svc->flags & RTDS_extratime; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," "
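The commit message's claim that spare bandwidth is distributed "proportional to these VCPUs utilizations" can be made concrete with a little arithmetic. The numbers below are made up, and this is plain arithmetic rather than scheduler code (the scheduler produces the effect implicitly through the priority_level mechanism; it never computes shares explicitly):

    #include <stdio.h>

    int main(void)
    {
        double util[] = { 0.2, 0.4 };          /* two extratime VCPUs */
        double spare = 1.0 - (0.2 + 0.4);      /* unreserved bandwidth of 1 PCPU */
        double total = util[0] + util[1];

        for (int i = 0; i < 2; i++)
            printf("vcpu%d: reserved %.2f + extra %.2f = %.2f\n",
                   i, util[i], spare * util[i] / total,
                   util[i] + spare * util[i] / total);
        return 0;
    }

Both VCPUs keep their reserved utilization, and together they absorb the whole 0.4 of spare bandwidth (0.13 and 0.27 respectively), so the PCPU ends up fully used; that is the work-conserving property.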
Re: [Xen-devel] [PATCH v3 4/5] xentrace: enable per-VCPU extratime flag for RTDS
On Wed, Oct 11, 2017 at 6:57 AM, Dario Faggioli wrote: > On Tue, 2017-10-10 at 19:17 -0400, Meng Xu wrote: >> --- a/tools/xentrace/formats >> +++ b/tools/xentrace/formats >> @@ -75,7 +75,7 @@ >> 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ >> cpu = %(1)d ] >> 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ >> dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = >> 0x%(5)08x%(4)08x ] >> 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ >> dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] >> -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ >> dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = >> 0x%(5)08x%(4)08x ] >> +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ >> dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = >> 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] >> 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet >> 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ >> cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] >> > But, both in case of this file and below in xenalyze.c, you update 1 > record (the one of REPL_BUDGET). However, in patch 1, you added the > priority_level field to two records: REPL_BUDGET and BURN_BUDGET. > > Or am I missing something? OMG, my fault. I forgot to check this. I will add this and double check it by running some tests. Best, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Change repl_budget event output for xentrace formats and xenalyze Signed-off-by: Meng Xu --- Changes from v3 Handle burn_budget event No changes from v2 Changes from v1 Add this changes from v1 --- tools/xentrace/formats| 4 ++-- tools/xentrace/xenalyze.c | 16 +++- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index d6e7e3f..8b286c3 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -74,8 +74,8 @@ 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ cpu = %(1)d ] 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] -0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] +0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d, priority_level = %(5)d, has_extratime = %(6)x ] +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 79bdba7..19e050f 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7935,23 +7935,29 @@ void sched_process(struct pcpu_info *p) unsigned int vcpuid:16, domid:16; uint64_t cur_bg; int delta; +unsigned priority_level; +unsigned has_extratime; } __attribute__((packed)) *r = (typeof(r))ri->d; printf(" %s rtds:burn_budget d%uv%u, budget = %"PRIu64", " - "delta = %d\n", ri->dump_header, r->domid, - r->vcpuid, r->cur_bg, r->delta); + "delta = %d, priority_level = %d, has_extratime = %d\n", + ri->dump_header, r->domid, + r->vcpuid, r->cur_bg, r->delta, + r->priority_level, !!r->has_extratime); } break; case TRC_SCHED_CLASS_EVT(RTDS, 4): /* BUDGET_REPLENISH */ if(opt.dump_all) { struct { unsigned int vcpuid:16, domid:16; +unsigned int priority_level; uint64_t cur_dl, cur_bg; } __attribute__((packed)) *r = (typeof(r))ri->d; -printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", " - "budget = %"PRIu64"\n", ri->dump_header, - r->domid, r->vcpuid, r->cur_dl, r->cur_bg); +printf(" %s rtds:repl_budget d%uv%u, priority_level = %u," + "deadline = %"PRIu64", budget = %"PRIu64"\n", + ri->dump_header, r->domid, r->vcpuid, + r->priority_level, r->cur_dl, r->cur_bg); } break; case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/ -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
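The packed structs that this patch fixes have to mirror, field for field, what sched_rt.c packs into the trace buffer; the v3-to-v4 bug was exactly a missing field. Below is a standalone sanity check of the layout assumed by the new burn_budget parser. The struct is copied from the patch; the 28-byte limit reflects my reading of Xen's trace ABI (at most seven extra 32-bit words per record), so treat it as an assumption:

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    struct burn_budget_rec {
        unsigned int vcpuid:16, domid:16;
        uint64_t cur_bg;
        int delta;
        unsigned int priority_level;
        unsigned int has_extratime;
    } __attribute__((packed));

    int main(void)
    {
        /* 4 + 8 + 4 + 4 + 4 = 24 bytes, i.e. six 32-bit words. */
        printf("size = %zu (must be <= 28)\n", sizeof(struct burn_budget_rec));
        printf("cur_bg offset = %zu\n",
               offsetof(struct burn_budget_rec, cur_bg));
        return 0;
    }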
[Xen-devel] [PATCH v4 1/5] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has extratime flag set, its priority_level will increase by 1 and its budget will be refilled; othewrise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_leve; or (ii) v1 has the same priority_level but has a smaller deadline Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPUs priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Distribution of spare bandwidth Spare bandwidth is distributed among all VCPUs with extratime flag set, proportional to these VCPUs utilizations Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli --- Changes from v2 Explain how to distribute spare bandwidth in commit log Minor change in has_extratime function without functionality change. Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra as suggested by Dario Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 4 ++ 2 files changed, 80 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 5c51cd9..b770287 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. 
*/ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return svc->flags & RTDS_extratime; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_
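To see the two-level ordering above in action, here is a standalone sketch that lifts compare_vcpu_priority() out of the patch and runs it on sample values. The struct is trimmed to the two fields the comparison reads, and main() is of course not part of the patch:

    #include <stdio.h>
    #include <stdint.h>

    typedef int64_t s_time_t;

    struct rt_vcpu {
        unsigned int priority_level;   /* 0 = within reservation */
        s_time_t cur_deadline;
    };

    /* Same convention as the patch: result > 0 iff v1 has higher priority. */
    static s_time_t
    compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2)
    {
        int prio = v2->priority_level - v1->priority_level;

        if ( prio == 0 )
            return v2->cur_deadline - v1->cur_deadline;

        return prio;
    }

    int main(void)
    {
        /* a: still within its reservation (level 0), deadline 30
         * b: depleted, running on extra time (level 1), deadline 10 */
        struct rt_vcpu a = { 0, 30 }, b = { 1, 10 };

        /* a wins despite the later deadline: reserved budget always
         * beats extra-time execution, which is how the real-time
         * guarantees survive work-conserving mode. */
        printf("a %s b\n",
               compare_vcpu_priority(&a, &b) > 0 ? "beats" : "loses to");
        return 0;
    }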
[Xen-devel] [PATCH v4 5/5] docs: enable per-VCPU extratime flag for RTDS
Revise xl tool use case by adding -e option Remove work-conserving from TODO list Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli Acked-by: Wei Liu --- No change from v2 Changes from v1 Revise rtds docs --- docs/features/sched_rtds.pandoc | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/features/sched_rtds.pandoc b/docs/features/sched_rtds.pandoc index 354097b..d51b499 100644 --- a/docs/features/sched_rtds.pandoc +++ b/docs/features/sched_rtds.pandoc @@ -40,7 +40,7 @@ as follows: It is possible, for a multiple vCPUs VM, to change the parameters of each vCPU individually: -* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -v 1 -p 45000 -b 12000` +* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -e 1 -v 1 -p 45000 -b 12000 -e 0` # Technical details @@ -53,7 +53,8 @@ the presence of the LIBXL\_HAVE\_SCHED\_RTDS symbol. The ability of specifying different scheduling parameters for each vcpu has been introduced later, and is available if the following symbols are defined: * `LIBXL\_HAVE\_VCPU\_SCHED\_PARAMS`, -* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`. +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`, +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_EXTRA`. # Limitations @@ -95,7 +96,6 @@ at a macroscopic level), the following should be done: # Areas for improvement -* Work-conserving mode to be added; * performance assessment, especially focusing on what level of real-time behavior the scheduler enables. @@ -118,4 +118,5 @@ at a macroscopic level), the following should be done: Date Revision Version Notes -- --- 2016-10-14 1Xen 4.8 Document written +2017-08-31 2Xen 4.10 Revise for work conserving feature -- --- -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 2/5] libxl: enable per-VCPU extratime flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU extratime flag Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli Acked-by: Wei Liu --- Changes from v2 1) Move extratime out of the section that is marked as depreciated in libxl_domain_sched_params. 2) Set vcpu extratime in sched_rtds_vcpu_get function function; This fix a bug in previous version when run command "xl sched-rtds -d 0 -v 1" which outputs vcpu extratime value incorrectly. Changes from v1 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is supported 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Changes from RFC v1 Change work_conserving flag to extratime flag --- tools/libxl/libxl.h | 6 ++ tools/libxl/libxl_sched.c | 17 + tools/libxl/libxl_types.idl | 8 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index f82b91e..5e9aed7 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -257,6 +257,12 @@ #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 /* + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler + * now supports per-vcpu extratime settings. + */ +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 + +/* * libxl_domain_build_info has the arm.gic_version field. */ #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index 7d144d0..512788f 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -532,6 +532,8 @@ static int sched_rtds_vcpu_get(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -579,6 +581,8 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -628,6 +632,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +if (scinfo->vcpus[i].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -676,6 +684,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +if (scinfo->vcpus[0].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -726,6 +738,11 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t domid, sdom.period = scinfo->period; if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) sdom.budget = scinfo->budget; +/* Set extratime by default */ +if (scinfo->extratime) +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) return ERROR_INVAL; diff --git a/tools/libxl/libxl_types.idl 
b/tools/libxl/libxl_types.idl index 2d0bb8a..dd7d364 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -421,14 +421,14 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("cap", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("extratime",integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), -# The following three parameters ('slice', 'latency' and 'extratime') are deprecated, +# The following three parameters ('slice' and 'latency') are deprecated,
[Xen-devel] [PATCH v4 0/5] Towards work-conserving RTDS
This series of patches makes the RTDS scheduler work-conserving without breaking real-time guarantees. VCPUs with the extratime flag set can get extra time from the unreserved system resource. System administrators can decide which VCPUs have the extratime flag set.

Example:

Set the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 1

Each VCPU of domain 1 will be guaranteed to have 2000us every 10000us (if the system is schedulable). If there is a CPU having no work to do, domain 1's VCPUs will be scheduled onto that CPU, even though the VCPUs have already got their 2000us in 10000us.

Clear the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 0

Set/Clear the extratime bit of one specific VCPU of domain 1:
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 1
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 0

The original design of the work-conserving RTDS was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html
The first version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg117361.html
The second version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg120618.html
The third version has been mostly reviewed by Dario Faggioli and acked by Wei Liu, except
[PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS

The series of patches can be found on GitHub:
https://github.com/PennPanda/RT-Xen
under the branch: xenbits/rtds/work-conserving-v4

Changes from v3
Handle burn_budget event in xentrace and xenalyze. Tested the change with three VMs

Changes from v2
Sanity check the input of the -e option, which can only be 0 or 1
Set -e to 1 by default if a 3rd party library does not set the -e option
Set vcpu extratime in the sched_rtds_vcpu_get function, which fixes a bug in the previous version
Change EXTRATIME to Extratime in the xl output

Changes from v1
Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra
Revise xentrace, xenalyze, and docs
Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA symbol in libxl.h

Changes from RFC v1
Merge changes in sched_rt.c into one patch;
Minor change in variable name and comments.

Signed-off-by: Meng Xu

[PATCH v4 1/5] xen:rtds: towards work conserving RTDS
[PATCH v4 2/5] libxl: enable per-VCPU extratime flag for RTDS
[PATCH v4 3/5] xl: enable per-VCPU extratime flag for RTDS
[PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
[PATCH v4 5/5] docs: enable per-VCPU extratime flag for RTDS

___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 3/5] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU extratime flag. Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli Acked-by: Wei Liu --- Changes from v2 Validate the -e option input that can only be 0 or 1 Update docs/man/xl.pod.1.in Change EXTRATIME to Extratime Changes from v1 No change because we agree on using -e 0/1 option to set if a vcpu will get extra time or not Changes from RFC v1 Changes work_conserving flag to extratime flag --- docs/man/xl.pod.1.in | 59 +-- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 62 +++--- 3 files changed, 78 insertions(+), 46 deletions(-) diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in index cd8bb1c..486a24f 100644 --- a/docs/man/xl.pod.1.in +++ b/docs/man/xl.pod.1.in @@ -1117,11 +1117,11 @@ as B<--ratelimit_us> in B Set or get rtds (Real Time Deferrable Server) scheduler parameters. This rt scheduler applies Preemptive Global Earliest Deadline First real-time scheduling algorithm to schedule VCPUs in the system. -Each VCPU has a dedicated period and budget. -VCPUs in the same domain have the same period and budget. +Each VCPU has a dedicated period, budget and extratime. While scheduled, a VCPU burns its budget. A VCPU has its budget replenished at the beginning of each period; Unused budget is discarded at the end of each period. +A VCPU with extratime set gets extra time from the unreserved system resource. B @@ -1145,6 +1145,11 @@ Period of time, in microseconds, over which to replenish the budget. Amount of time, in microseconds, that the VCPU will be allowed to run every period. +=item B<-e Extratime>, B<--extratime=Extratime> + +Binary flag to decide if the VCPU will be allowed to get extra time from +the unreserved system resource. + =item B<-c CPUPOOL>, B<--cpupool=CPUPOOL> Restrict output to domains in the specified cpupool. @@ -1160,57 +1165,57 @@ all the domains: xl sched-rtds -v all Cpupool Pool-0: sched=RTDS -NameID VCPUPeriodBudget -Domain-0 00 1 4000 -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 -vm2 20 1 4000 -vm2 21 1 4000 +NameID VCPUPeriodBudget Extratime +Domain-0 00 1 4000yes +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes +vm2 40 1 4000yes +vm2 41 1 4000yes Without any arguments, it will output the default scheduling parameters for each domain: xl sched-rtds Cpupool Pool-0: sched=RTDS -NameIDPeriodBudget -Domain-0 0 1 4000 -vm1 1 1 4000 -vm2 2 1 4000 +NameIDPeriodBudget Extratime +Domain-0 0 1 4000yes +vm1 2 1 4000yes +vm2 4 1 4000yes -2) Use, for instancei, B<-d vm1, -v all> to see the budget and +2) Use, for instance, B<-d vm1, -v all> to see the budget and period of all VCPUs of a specific domain (B): xl sched-rtds -d vm1 -v all -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 +NameID VCPUPeriodBudget Extratime +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes To see the parameters of a subset of the VCPUs of a domain, use: xl sched-rtds -d vm1 -v 0 -v 3 -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 13 1000 500 +Name
Re: [Xen-devel] [PATCH v4 0/5] Towards work-conserving RTDS
On Thu, Oct 12, 2017 at 5:02 AM, Wei Liu wrote: > > FYI all patches except the xentrace one were committed yesterday. Thank you very much, Wei! Best, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 0/5] Towards work-conserving RTDS
On Tue, Oct 17, 2017 at 3:29 AM, Dario Faggioli wrote:
> On Tue, 2017-10-17 at 09:26 +0200, Dario Faggioli wrote:
> > On Thu, 2017-10-12 at 10:34 -0400, Meng Xu wrote:
> > > On Thu, Oct 12, 2017 at 5:02 AM, Wei Liu wrote:
> > > >
> > > > FYI all patches except the xentrace one were committed yesterday.
> > >
> > > Thank you very much, Wei!
> >
> > Hey Meng,
> >
> > Any update on that missing patch, though?
>
> No, wait... Posted on Wednesday, mmmhh... Ah, so "this" is you posting
> the missing patch!

Yes. :) I didn't repost the patch. I made the changes and tested it once I got the feedback.

> Ok, my bad, sorry. I was fooled by the fact that you resent the whole
> series, and that I did not get a copy of it (extra-list, I mean) as
> you're still using my old email address.
>
> Lemme have a look...

Ah, I neglected the email address. I was also wondering whether you were busy with something else, so I didn't send a reminder.

Thanks!

Best Regards,

Meng

> Regards,
> Dario
> --
> <> (Raistlin Majere)
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: >> Change repl_budget event output for xentrace formats and xenalyze >> >> Signed-off-by: Meng Xu >> > I'd say: > > Reviewed-by: Dario Faggioli > > However... > >> diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c >> index 79bdba7..19e050f 100644 >> --- a/tools/xentrace/xenalyze.c >> +++ b/tools/xentrace/xenalyze.c >> @@ -7935,23 +7935,29 @@ void sched_process(struct pcpu_info *p) >> unsigned int vcpuid:16, domid:16; >> uint64_t cur_bg; >> int delta; >> +unsigned priority_level; >> +unsigned has_extratime; >> > ...this last field is 'bool' in Xen. > > I appreciate that xenalyze does not build if you just make this bool as > well. But it does build for me, if you do that, and also include > stdbool.h, which I think is a fine thing to do. Right. I'm not sure about this. If including the stdbool.h is preferred, I can resend this one with that change. > > Anyway, I'll leave this to George and tools' maintainers. Sure! Thanks, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
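For readers following the bool discussion above, the change under debate is small. Below is a minimal stand-alone sketch, using a hypothetical cut-down stand-in for the xenalyze record (the real struct in tools/xentrace/xenalyze.c carries more fields), of what the field looks like once it is declared bool with stdbool.h included:

#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical, cut-down version of the record fields xenalyze
 * decodes; the real struct in tools/xentrace/xenalyze.c also carries
 * vcpuid/domid bitfields and the budget data.
 */
struct repl_budget_rec {
    unsigned int priority_level;
    bool has_extratime;   /* 'bool' with stdbool.h, matching Xen */
};

int main(void)
{
    struct repl_budget_rec r = { .priority_level = 1,
                                 .has_extratime = true };

    printf("priority_level=%u has_extratime=%d\n",
           r.priority_level, r.has_extratime);
    return 0;
}

With stdbool.h, a bool member assigns and prints like an int, so the surrounding decoding logic should not need further changes.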
Re: [Xen-devel] VPMU interrupt unreliability
On Thu, Oct 19, 2017 at 11:40 AM, Andrew Cooper wrote: > > On 19/10/17 16:09, Kyle Huey wrote: > > On Wed, Oct 11, 2017 at 7:09 AM, Boris Ostrovsky > > wrote: > >> On 10/10/2017 12:54 PM, Kyle Huey wrote: > >>> On Mon, Jul 24, 2017 at 9:54 AM, Kyle Huey wrote: > >>>> On Mon, Jul 24, 2017 at 8:07 AM, Boris Ostrovsky > >>>> wrote: > >>>>>>> One thing I noticed is that the workaround doesn't appear to be > >>>>>>> complete: it is only checking PMC0 status and not other counters > >>>>>>> (fixed > >>>>>>> or architectural). Of course, without knowing what the actual problem > >>>>>>> was it's hard to say whether this was intentional. > >>>>>> handle_pmc_quirk appears to loop through all the counters ... > >>>>> Right, I didn't notice that it is shifting MSR_CORE_PERF_GLOBAL_STATUS > >>>>> value one by one and so it is looking at all bits. > >>>>> > >>>>>>>> 2. Intercepting MSR loads for counters that have the workaround > >>>>>>>> applied and giving the guest the correct counter value. > >>>>>>> We'd have to keep track of whether the counter has been reset (by the > >>>>>>> quirk) since the last MSR write. > >>>>>> Yes. > >>>>>> > >>>>>>>> 3. Or perhaps even changing the workaround to disable the PMI on that > >>>>>>>> counter until the guest acks via GLOBAL_OVF_CTRL, assuming that works > >>>>>>>> on the relevant hardware. > >>>>>>> MSR_CORE_PERF_GLOBAL_OVF_CTRL is written immediately after the quirk > >>>>>>> runs (in core2_vpmu_do_interrupt()) so we already do this, don't we? > >>>>>> I'm suggesting waiting until the *guest* writes to the (virtualized) > >>>>>> GLOBAL_OVF_CTRL. > >>>>> Wouldn't it be better to wait until the counter is reloaded? > >>>> Maybe! I haven't thought through it a lot. It's still not clear to > >>>> me whether MSR_CORE_PERF_GLOBAL_OVF_CTRL actually controls the > >>>> interrupt in any way or whether it just resets the bits in > >>>> MSR_CORE_PERF_GLOBAL_STATUS and acking the interrupt on the APIC is > >>>> all that's required to reenable it. > >>>> > >>>> - Kyle > >>> I wonder if it would be reasonable to just remove the workaround > >>> entirely at some point. The set of people using 1) several year old > >>> hardware, 2) an up to date Xen, and 3) the off-by-default performance > >>> counters is probably rather small. > >> We'd probably want to only enable this for affected processors, not > >> remove it outright. But the problem is that we still don't know for sure > >> whether this issue affects NHM only, do we? > >> > >> (https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg02242.html > >> is the original message) > > Yes, the basic problem is that we don't know where to draw the line. > > vPMU is disabled by default for security reasons, Is there any document about the possible attacks via the vPMU? The documents I found (such as [1] and XSA-163) just briefly say that the vPMU should be disabled due to security concerns. [1] https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html > > and also broken, in a > way which demonstrates that vPMU isn't getting much real-world use. I also noticed that AWS seems to support part of the vPMU functionality, which was used by Netflix to optimize their applications' performance, according to http://www.brendangregg.com/blog/2017-05-04/the-pmcs-of-ec2.html . I guess the security issue has been solved by AWS? However, without knowing how the attack could be conducted, I'm not sure how AWS addresses the attack concern for the vPMU.
> > As far as I'm concerned, all options (including rm -rf and start from > scratch) are acceptable, especially if this ends up giving us a better > overall subsystem. > > Do we know how other hypervisors work around this issue? Maybe AWS's solution is an option? I'm not sure; I'm just thinking aloud. :) Thanks, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
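As background for the quirk discussed in this thread: the overflow status register is a bitmask with one bit per counter, and the workaround walks it bit by bit. The following is a rough, self-contained sketch of that kind of scan; the counter count and handle_overflow() are made-up stand-ins, not Xen's actual handle_pmc_quirk():

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: pretend there are 7 counters in total. */
#define NUM_COUNTERS 7

static void handle_overflow(unsigned int counter)
{
    /* The real quirk resets the overflowed counter here. */
    printf("counter %u overflowed\n", counter);
}

/*
 * Walk an overflow-status bitmask one bit at a time, in the spirit
 * of the handle_pmc_quirk() loop described above.
 */
static void scan_overflow_status(uint64_t status)
{
    unsigned int i;

    for ( i = 0; i < NUM_COUNTERS; i++, status >>= 1 )
        if ( status & 1 )
            handle_overflow(i);
}

int main(void)
{
    scan_overflow_status(0x5);  /* counters 0 and 2 have overflowed */
    return 0;
}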
Re: [Xen-devel] VPMU interrupt unreliability
On Fri, Oct 20, 2017 at 3:07 AM, Jan Beulich wrote: > > >>> On 19.10.17 at 20:20, wrote: > > Is there any document about the possible attack via the vPMU? The > > document I found (such as [1] and XSA-163) just briefly say that the > > vPMU should be disabled due to security concern. > > Besides the other responses you've already got, I also recall there > being at least some CPU models that would live lock upon the > debug store being placed into virtual space not mapped by present > pages. Thank you very much for your explanation! :) Best Regards, Meng --- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: >> Change repl_budget event output for xentrace formats and xenalyze >> >> Signed-off-by: Meng Xu >> > I'd say: > > Reviewed-by: Dario Faggioli Hi guys, Just a reminder, we may need this patch for the work-conserving RTDS scheduler in Xen 4.10. I saw that Julien sent out rc2 today, which does not include this patch. Thanks and best regards, Meng --- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Hi George, On Wed, Oct 25, 2017 at 10:31 AM, Wei Liu wrote: > > On Mon, Oct 23, 2017 at 02:50:31PM -0400, Meng Xu wrote: > > On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > > > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: > > >> Change repl_budget event output for xentrace formats and xenalyze > > >> > > >> Signed-off-by: Meng Xu > > >> > > > I'd say: > > > > > > Reviewed-by: Dario Faggioli > > > > Hi guys, > > > > Just a reminder, we may need this patch for the work-conserving RTDS > > scheduler in Xen 4.10. > > > > I saw that Julien sent out rc2 today, which does not include this patch. > > > > Thanks and best regards, > > > > I'm waiting for George's ack. Just a friendly reminder: Do you have any comments on this patch? Thanks, Meng ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Hi all, On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: > > Change repl_budget event output for xentrace formats and xenalyze > > > > Signed-off-by: Meng Xu > > > I'd say: > > Reviewed-by: Dario Faggioli > Just a friendly reminder: This patch has not been pushed into either the staging or master branch of xen.git. This is an essential patch for the new version of the RTDS scheduler which Dario and I are maintaining. This patch won't affect other features. It has been a while, and we have not heard any complaints from the tools maintainers. Is it OK to push it? > > However... > > > diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c > > index 79bdba7..19e050f 100644 > > --- a/tools/xentrace/xenalyze.c > > +++ b/tools/xentrace/xenalyze.c > > @@ -7935,23 +7935,29 @@ void sched_process(struct pcpu_info *p) > > unsigned int vcpuid:16, domid:16; > > uint64_t cur_bg; > > int delta; > > +unsigned priority_level; > > +unsigned has_extratime; > > > ...this last field is 'bool' in Xen. > > I appreciate that xenalyze does not build if you just make this bool as > well. But it does build for me, if you do that, and also include > stdbool.h, which I think is a fine thing to do. > > Anyway, I'll leave this to George and tools' maintainers. If it turns out bool is preferred, I can change it and send out a new one. But please just let me know so that we can have a complete toolstack for the new version of the RTDS scheduler. Thanks, Meng ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] xen: RTDS: rearrange members of control structures
On Fri, Jun 23, 2017 at 6:55 AM, Dario Faggioli wrote: > > Nothing changed in `pahole` output, in terms of holes > and padding, but some fields have been moved, to put > related members in same cache line. > > Signed-off-by: Dario Faggioli > --- > Cc: Meng Xu > Cc: George Dunlap > --- > xen/common/sched_rt.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c > index 1b30014..39f6bee 100644 > --- a/xen/common/sched_rt.c > +++ b/xen/common/sched_rt.c > @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data); > struct rt_private { > spinlock_t lock;/* the global coarse-grained lock */ > struct list_head sdom; /* list of availalbe domains, used for dump > */ > + > struct list_head runq; /* ordered list of runnable vcpus */ > struct list_head depletedq; /* unordered list of depleted vcpus */ > + > +struct timer *repl_timer; /* replenishment timer */ > struct list_head replq; /* ordered list of vcpus that need > replenishment */ > + > cpumask_t tickled; /* cpus been tickled */ > -struct timer *repl_timer; /* replenishment timer */ > }; > > /* > @@ -185,10 +188,6 @@ struct rt_vcpu { > struct list_head q_elem; /* on the runq/depletedq list */ > struct list_head replq_elem; /* on the replenishment events list */ > > -/* Up-pointers */ > -struct rt_dom *sdom; > -struct vcpu *vcpu; > - > /* VCPU parameters, in nanoseconds */ > s_time_t period; > s_time_t budget; > @@ -198,6 +197,10 @@ struct rt_vcpu { > s_time_t last_start; /* last start time */ > s_time_t cur_deadline; /* current deadline for EDF */ > > +/* Up-pointers */ > +struct rt_dom *sdom; > +struct vcpu *vcpu; > + > unsigned flags; /* mark __RTDS_scheduled, etc.. */ > }; > Reviewed-by: Meng Xu BTW, Dario, I'm wondering whether you used any tool to give hints about how to arrange the fields in a structure, or whether you just did it manually? Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
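On the tooling question: pahole (from the dwarves package) prints each member's offset, size, and any padding holes, which is usually enough to group hot members by hand. As a stand-alone illustration, member offsets can also be checked programmatically; the struct below is a simplified stand-in, not Xen's real rt_private:

#include <stddef.h>
#include <stdio.h>

/*
 * Stand-in for a scheduler control structure; the member names echo
 * rt_private, but the types are simplified for illustration.
 */
struct demo_priv {
    long lock;                  /* global lock */
    void *sdom;                 /* dump-only list, rarely touched */
    void *runq;                 /* hot: scheduling queues together */
    void *depletedq;
    void *repl_timer;           /* hot: replenishment members together */
    void *replq;
    unsigned long tickled;
};

int main(void)
{
    /* offsetof() shows which members can share a 64-byte cache line. */
    printf("runq @ %zu, depletedq @ %zu, repl_timer @ %zu, replq @ %zu\n",
           offsetof(struct demo_priv, runq),
           offsetof(struct demo_priv, depletedq),
           offsetof(struct demo_priv, repl_timer),
           offsetof(struct demo_priv, replq));
    return 0;
}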
[Xen-devel] [PATCH RFC v1] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving to utilize the idle resource, without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have a work conserving flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has the work conserving flag set, its priority_level will increase by 1 and its budget will be refilled; otherwise, the VCPU will be moved to the depletedq. Scheduling policy: modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_level; or (ii) v1 has the same priority_level but has a smaller deadline. Signed-off-by: Meng Xu --- xen/common/sched_rt.c | 71 ++- 1 file changed, 59 insertions(+), 12 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..740a712 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,16 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and is_work_conserving flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * A work conserving VCPU has is_work_conserving flag set to true; + * When a VCPU runs out of budget in a period, if it is work conserving, + * it increases its priority_level by 1 and refill its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +66,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -191,6 +195,7 @@ struct rt_vcpu { /* VCPU parameters, in nanoseconds */ s_time_t period; s_time_t budget; +bool_t is_work_conserving; /* is vcpu work conserving */ /* VCPU current infomation in nanosecond */ s_time_t cur_budget; /* current budget */ @@ -201,6 +206,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +252,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool_t is_work_conserving(const struct rt_vcpu *svc) +{ +return svc->is_work_conserving; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue.
@@ -273,6 +285,20 @@ vcpu_on_replq(const struct rt_vcpu *svc) return !list_empty(&svc->replq_elem); } +/* If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static int +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +if ( v1->priority_level < v2->priority_level || + ( v1->priority_level == v2->priority_level && + v1->cur_deadline <= v2->cur_deadline ) ) +return 1; +else +return -1; +} + /* * Debug related code, dump vcpu/cpu information */ @@ -303,6 +329,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," " cur_b=%"PRI_stime" cur_d=%"PRI_stime" last_start=%"PRI_stime"\n" + " \t\t priority_level=%d work_conserving=%d\n" " \t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n", svc->vcpu->domain->domain_id, svc->vcpu->vcpu_id, @@ -312,6 +339,8 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) svc->cur_budget, svc->cur_deadline, svc->last_start, +svc->priority_level, +is_work_conserving(svc), vcpu_on_q(svc), vcpu_runnable(svc->vcpu), svc->flags, @@ -423,15 +452,18 @@ rt_update_deadline(s_time_t now, struct rt_vcpu *svc) */ svc->last_start = now;
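To make the VCPU model in the commit message above concrete, here is a hedged user-space simulation of the depletion rule; the names mirror the patch, but the types and units are simplified stand-ins:

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for struct rt_vcpu; times are abstract units. */
struct vcpu_sim {
    long budget;                /* full budget per period */
    long cur_budget;            /* remaining budget */
    unsigned int priority_level;
    bool work_conserving;       /* the flag the patch adds */
    bool on_depletedq;
};

/*
 * Mirrors the depletion rule described above: a work-conserving vCPU
 * moves down one priority level (priority_level++) and is refilled;
 * a reservation-only vCPU is parked on the depleted queue.
 */
static void on_budget_depleted(struct vcpu_sim *v)
{
    if ( v->work_conserving )
    {
        v->priority_level++;
        v->cur_budget = v->budget;
    }
    else
    {
        v->cur_budget = 0;
        v->on_depletedq = true;
    }
}

int main(void)
{
    struct vcpu_sim v = { .budget = 10, .cur_budget = 0,
                          .work_conserving = true };

    on_budget_depleted(&v);
    printf("level=%u cur_budget=%ld parked=%d\n",
           v.priority_level, v.cur_budget, v.on_depletedq);
    return 0;
}

A reservation-only vCPU keeps the old RTDS behavior, while a work-conserving one trades a lower priority level for a refilled budget, which is what preserves the real-time guarantees of the level-0 vCPUs.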
[Xen-devel] [PATCH RFC v1 2/3] libxl: enable per-VCPU work conserving flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU work conserving flag Signed-off-by: Meng Xu --- tools/libxl/libxl.h | 1 + tools/libxl/libxl_sched.c | 3 +++ tools/libxl/libxl_types.idl | 2 ++ 3 files changed, 6 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 7cf0f31..dd9c926 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -2058,6 +2058,7 @@ int libxl_sched_credit2_params_set(libxl_ctx *ctx, uint32_t poolid, #define LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT -1 #define LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT -1 #define LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT-1 +#define LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT-1 /* Per-VCPU parameters */ #define LIBXL_SCHED_PARAM_VCPU_INDEX_DEFAULT -1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index faa604e..fe92747 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -558,6 +558,7 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].is_work_conserving = vcpus[i].u.rtds.is_work_conserving; scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -607,6 +608,7 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +vcpus[i].u.rtds.is_work_conserving = scinfo->vcpus[i].is_work_conserving; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -655,6 +657,7 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +vcpus[i].u.rtds.is_work_conserving = scinfo->vcpus[0].is_work_conserving; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 8a9849c..f6c3ead 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -401,6 +401,7 @@ libxl_sched_params = Struct("sched_params",[ ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("extratime",integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("is_work_conserving", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT'}), ]) libxl_vcpu_sched_params = Struct("vcpu_sched_params",[ @@ -414,6 +415,7 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("cap", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("is_work_conserving", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT'}), # The following three parameters ('slice', 'latency' and 'extratime') are deprecated, # and will have no effect if used, since the SEDF scheduler has been removed. -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH RFC v1 1/3] xen:rtds: enable XL to set and get vcpu work conserving flag
Extend the hypercalls(XEN_DOMCTL_SCHEDOP_getvcpuinfo/putvcpuinfo) to get/set a domain's per-VCPU work conserving parameters. Signed-off-by: Meng Xu --- xen/common/sched_rt.c | 2 ++ xen/include/public/domctl.h | 1 + 2 files changed, 3 insertions(+) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 740a712..76ed4cb 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1442,6 +1442,7 @@ rt_dom_cntl( svc = rt_vcpu(d->vcpu[local_sched.vcpuid]); local_sched.u.rtds.budget = svc->budget / MICROSECS(1); local_sched.u.rtds.period = svc->period / MICROSECS(1); +local_sched.u.rtds.is_work_conserving = svc->is_work_conserving; spin_unlock_irqrestore(&prv->lock, flags); if ( copy_to_guest_offset(op->u.v.vcpus, index, @@ -1466,6 +1467,7 @@ rt_dom_cntl( svc = rt_vcpu(d->vcpu[local_sched.vcpuid]); svc->period = period; svc->budget = budget; +svc->is_work_conserving = local_sched.u.rtds.is_work_conserving; spin_unlock_irqrestore(&prv->lock, flags); } /* Process a most 64 vCPUs without checking for preemptions. */ diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h index ff39762..e67cd9e 100644 --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -360,6 +360,7 @@ typedef struct xen_domctl_sched_credit2 { typedef struct xen_domctl_sched_rtds { uint32_t period; uint32_t budget; +bool is_work_conserving; } xen_domctl_sched_rtds_t; typedef struct xen_domctl_schedparam_vcpu { -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH RFC v1 3/3] xl: enable per-VCPU work conserving flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU work conserving flag. Signed-off-by: Meng Xu --- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 56 ++ 2 files changed, 40 insertions(+), 19 deletions(-) diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index 30eb93c..95997e1 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = { { "sched-rtds", &main_sched_rtds, 0, 1, "Get/set rtds scheduler parameters", - "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]", + "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]] [-w[=WORKCONSERVING]]", "-d DOMAIN, --domain=DOMAIN Domain to modify\n" "-v VCPUID/all, --vcpuid=VCPUID/allVCPU to modify or output;\n" " Using '-v all' to modify/output all vcpus\n" "-p PERIOD, --period=PERIOD Period (us)\n" "-b BUDGET, --budget=BUDGET Budget (us)\n" + "-w WORKCONSERVING, --workconserving=WORKCONSERVINGWORKCONSERVING (1=yes,0=no)\n" }, { "domid", &main_domid, 0, 0, diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c index 85722fe..35a64e1 100644 --- a/tools/xl/xl_sched.c +++ b/tools/xl/xl_sched.c @@ -251,7 +251,7 @@ static int sched_rtds_domain_output( libxl_domain_sched_params scinfo; if (domid < 0) { -printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget"); +printf("%-33s %4s %9s %9s %15s\n", "Name", "ID", "Period", "Budget", "Work conserving"); return 0; } @@ -262,11 +262,12 @@ static int sched_rtds_domain_output( } domname = libxl_domid_to_name(ctx, domid); -printf("%-33s %4d %9d %9d\n", +printf("%-33s %4d %9d %9d %15d\n", domname, domid, scinfo.period, -scinfo.budget); +scinfo.budget, +scinfo.is_work_conserving); free(domname); libxl_domain_sched_params_dispose(&scinfo); return 0; @@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %15s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Work conserving"); return 0; } @@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %15d\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budget, + scinfo->vcpus[i].is_work_conserving ); } free(domname); return 0; @@ -309,8 +311,8 @@ static int sched_rtds_vcpu_output_all(int domid, int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %15s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Work conserving"); return 0; } @@ -321,12 +323,13 @@ static int sched_rtds_vcpu_output_all(int domid, domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %15d\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budget, + scinfo->vcpus[i].is_work_conserving); } free(domname); return 0; @@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv) int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs that change */ int *periods = (int *)xmalloc(sizeof(int)); /* period is in microsecond */ int *budgets = (int *)xmalloc(si
[Xen-devel] [PATCH RFC v1 0/3] Enable XL to set and get per-VCPU work conserving flag for RTDS scheduler
This series of patches enable the toolstack to set and get per-VCPU work-conserving flag. With the toolstack, system administrators can decide which VCPUs will be made work-conserving. The design of the work-conserving RTDS was discussed in https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html We plan to perform two steps in making RTDS scheduler work-conserving: (1) First make all VCPUs work-conserving by default, which was sent as a separate patch. This work aims for Xen 4.10 release. (2) After that, we enable the XL to set and get per-VCPU work-conserving flag, which is this series of patches. Signed-off-by: Meng Xu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 0/3] Enable XL to set and get per-VCPU work conserving flag for RTDS scheduler
On Tue, Aug 1, 2017 at 2:33 PM, Meng Xu wrote: > > This series of patches enable the toolstack to > set and get per-VCPU work-conserving flag. > With the toolstack, system administrators can decide > which VCPUs will be made work-conserving. > > The design of the work-conserving RTDS was discussed in > https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html > > We plan to perform two steps in making RTDS scheduler work-conserving: > (1) First make all VCPUs work-conserving by default, > which was sent as a separate patch. This work aims for Xen 4.10 release. > (2) After that, we enable the XL to set and get per-VCPU work-conserving flag, > which is this series of patches. The series of patches that have both steps done can be found at the following repo: https://github.com/PennPanda/RT-Xen under the branch xenbits/rtds/work-conserving-RFCv1. Thanks, Meng Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4] xen: rtds: only tickle non-already tickled CPUs
When more than one idle VCPUs that have the same PCPU as their previous running core invoke runq_tickle(), they will tickle the same PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the highest-priority one, to execute. The other VCPUs will not be scheduled for a period, even when there is an idle core, making these VCPUs unnecessarily starve for one period. Therefore, always make sure that we only tickle PCPUs that have not been tickled already. Signed-off-by: Haoran Li Signed-off-by: Meng Xu --- The initial discussion of this patch can be found at https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02857.html Changes in v4: 1) Take Dario's suggestions: Search the new->cpu first for the cpu to tickle. This get rid of the if statement in previous versions. 2) Reword the comments and commit messages. 3) Rebased on staging branch. Issues in v2 and v3: Did not rebase on the latest staging branch. Did not solve the comments/issues in v1. Please ignore the v2 and v3. --- xen/common/sched_rt.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..5fec95f 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1147,9 +1147,9 @@ rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc) * Called by wake() and context_saved() * We have a running candidate here, the kick logic is: * Among all the cpus that are within the cpu affinity - * 1) if the new->cpu is idle, kick it. This could benefit cache hit - * 2) if there are any idle vcpu, kick it. - * 3) now all pcpus are busy; + * 1) if there are any idle vcpu, kick it. + For cache benefit,we first search new->cpu. + * 2) now all pcpus are busy; *among all the running vcpus, pick lowest priority one *if snext has higher priority, kick it. * @@ -1177,17 +1177,13 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) cpumask_and(¬_tickled, online, new->vcpu->cpu_hard_affinity); cpumask_andnot(¬_tickled, ¬_tickled, &prv->tickled); -/* 1) if new's previous cpu is idle, kick it for cache benefit */ -if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) ) -{ -SCHED_STAT_CRANK(tickled_idle_cpu); -cpu_to_tickle = new->vcpu->processor; -goto out; -} - -/* 2) if there are any idle pcpu, kick it */ -/* The same loop also find the one with lowest priority */ -for_each_cpu(cpu, ¬_tickled) +/* + * 1) If there are any idle vcpu, kick it. + *For cache benefit,we first search new->cpu. + *The same loop also find the one with lowest priority. + */ +cpu = cpumask_test_or_cycle(new->vcpu->processor, ¬_tickled); +while ( cpu!= nr_cpu_ids ) { iter_vc = curr_on_cpu(cpu); if ( is_idle_vcpu(iter_vc) ) @@ -1200,9 +1196,12 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) if ( latest_deadline_vcpu == NULL || iter_svc->cur_deadline > latest_deadline_vcpu->cur_deadline ) latest_deadline_vcpu = iter_svc; + +cpumask_clear_cpu(cpu, ¬_tickled); +cpu = cpumask_cycle(cpu, ¬_tickled); } -/* 3) candicate has higher priority, kick out lowest priority vcpu */ +/* 2) candicate has higher priority, kick out lowest priority vcpu */ if ( latest_deadline_vcpu != NULL && new->cur_deadline < latest_deadline_vcpu->cur_deadline ) { -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v5] xen: rtds: only tickle non-already tickled CPUs
When more than one idle VCPUs that have the same PCPU as their previous running core invoke runq_tickle(), they will tickle the same PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the highest-priority one, to execute. The other VCPUs will not be scheduled for a period, even when there is an idle core, making these VCPUs unnecessarily starve for one period. Therefore, always make sure that we only tickle PCPUs that have not been tickled already. Signed-off-by: Haoran Li Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli --- The initial discussion of this patch can be found at https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02857.html Changes in v5: Revise comments as Dario suggested Changes in v4: 1) Take Dario's suggestions: Search the new->cpu first for the cpu to tickle. This get rid of the if statement in previous versions. 2) Reword the comments and commit messages. 3) Rebased on staging branch. Issues in v2 and v3: Did not rebase on the latest staging branch. Did not solve the comments/issues in v1. Please ignore the v2 and v3. --- xen/common/sched_rt.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..0ac5816 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1147,9 +1147,9 @@ rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc) * Called by wake() and context_saved() * We have a running candidate here, the kick logic is: * Among all the cpus that are within the cpu affinity - * 1) if the new->cpu is idle, kick it. This could benefit cache hit - * 2) if there are any idle vcpu, kick it. - * 3) now all pcpus are busy; + * 1) if there are any idle CPUs, kick one. + For cache benefit, we check new->cpu as first + * 2) now all pcpus are busy; *among all the running vcpus, pick lowest priority one *if snext has higher priority, kick it. * @@ -1177,17 +1177,13 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) cpumask_and(¬_tickled, online, new->vcpu->cpu_hard_affinity); cpumask_andnot(¬_tickled, ¬_tickled, &prv->tickled); -/* 1) if new's previous cpu is idle, kick it for cache benefit */ -if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) ) -{ -SCHED_STAT_CRANK(tickled_idle_cpu); -cpu_to_tickle = new->vcpu->processor; -goto out; -} - -/* 2) if there are any idle pcpu, kick it */ -/* The same loop also find the one with lowest priority */ -for_each_cpu(cpu, ¬_tickled) +/* + * 1) If there are any idle CPUs, kick one. + *For cache benefit,we first search new->cpu. + *The same loop also find the one with lowest priority. + */ +cpu = cpumask_test_or_cycle(new->vcpu->processor, ¬_tickled); +while ( cpu!= nr_cpu_ids ) { iter_vc = curr_on_cpu(cpu); if ( is_idle_vcpu(iter_vc) ) @@ -1200,9 +1196,12 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) if ( latest_deadline_vcpu == NULL || iter_svc->cur_deadline > latest_deadline_vcpu->cur_deadline ) latest_deadline_vcpu = iter_svc; + +cpumask_clear_cpu(cpu, ¬_tickled); +cpu = cpumask_cycle(cpu, ¬_tickled); } -/* 3) candicate has higher priority, kick out lowest priority vcpu */ +/* 2) candicate has higher priority, kick out lowest priority vcpu */ if ( latest_deadline_vcpu != NULL && new->cur_deadline < latest_deadline_vcpu->cur_deadline ) { -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
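The loop above leans on Xen's cpumask helpers. As a rough plain-C illustration of the same search pattern (next_cpu() is a simplified stand-in for cpumask_test_or_cycle()/cpumask_cycle(), and the mask width is made up), the scan starts at the vCPU's previous CPU for cache warmth and then cycles through the remaining non-tickled CPUs:

#include <stdio.h>

#define NR_CPUS 8

/*
 * Rough stand-in for cpumask_test_or_cycle()/cpumask_cycle(): return
 * 'start' if it is set in 'mask', otherwise the next set bit after it
 * (wrapping around), or NR_CPUS if none is left.
 */
static int next_cpu(unsigned int mask, int start)
{
    int i;

    if ( mask & (1u << start) )
        return start;
    for ( i = 1; i < NR_CPUS; i++ )
    {
        int cpu = (start + i) % NR_CPUS;

        if ( mask & (1u << cpu) )
            return cpu;
    }
    return NR_CPUS;
}

int main(void)
{
    unsigned int not_tickled = 0xA4;  /* CPUs 2, 5 and 7 not tickled yet */
    int prev = 3;                     /* new->vcpu->processor */
    int cpu;

    /* Same shape as the patch's loop: start at the vCPU's previous
     * CPU, then clear each visited CPU and cycle to the next one. */
    for ( cpu = next_cpu(not_tickled, prev); cpu != NR_CPUS;
          not_tickled &= ~(1u << cpu), cpu = next_cpu(not_tickled, cpu) )
        printf("considering cpu %d\n", cpu);
    return 0;
}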
Re: [Xen-devel] [PATCH RFC v1] xen:rtds: towards work conserving RTDS
On Wed, Aug 2, 2017 at 1:46 PM, Dario Faggioli wrote: > Hey, Meng! > > It's really cool to see progress on this... There was quite a bit of > interest in scheduling in general at the Summit in Budapest, and one > important thing for making sure RTDS will be really useful, is for it > to have a work conserving mode! :-) Glad to hear that. :-) > > On Tue, 2017-08-01 at 14:13 -0400, Meng Xu wrote: >> Make RTDS scheduler work conserving to utilize the idle resource, >> without breaking the real-time guarantees. > > Just kill the "to utilize the idle resource". We can expect that people > that are interested in this commit, also know what 'work conserving' > means. :-) Got it. Will do. > >> VCPU model: >> Each real-time VCPU is extended to have a work conserving flag >> and a priority_level field. >> When a VCPU's budget is depleted in the current period, >> if it has work conserving flag set, >> its priority_level will increase by 1 and its budget will be >> refilled; >> otherwise, the VCPU will be moved to the depletedq. >> > Mmm... Ok. But is the budget burned, while the vCPU executes at > priority_level 1? If yes, doesn't this mean we risk having less budget > when we get back to priority_level 0? > > Oh, wait, maybe it's the case that, when we get back to priority_level > 0, we also get another replenishment, is that the case? If yes, I > actually think it's fine... It's the latter case: the vcpu will get another replenishment when it gets back to priority_level 0. > >> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c >> index 39f6bee..740a712 100644 >> --- a/xen/common/sched_rt.c >> +++ b/xen/common/sched_rt.c >> @@ -191,6 +195,7 @@ struct rt_vcpu { >> /* VCPU parameters, in nanoseconds */ >> s_time_t period; >> s_time_t budget; >> +bool_t is_work_conserving; /* is vcpu work conserving */ >> >> /* VCPU current infomation in nanosecond */ >> s_time_t cur_budget; /* current budget */ >> @@ -201,6 +206,8 @@ struct rt_vcpu { >> struct rt_dom *sdom; >> struct vcpu *vcpu; >> >> +unsigned priority_level; >> + >> unsigned flags; /* mark __RTDS_scheduled, etc.. */ >> > So, since we've got a 'flags' field already, can the flag be one of its > bit, instead of adding a new bool in the struct: > > /* > * RTDS_work_conserving: Can the vcpu run in the time that is > * not part of any real-time reservation, and would therefore > * be otherwise left idle? > */ > __RTDS_work_conserving 4 > #define RTDS_work_conserving (1<<__RTDS_work_conserving) Thank you very much for the suggestion! I will modify the patch accordingly. Actually, I was not very comfortable with the is_work_conserving field either. It makes the structure verbose and messes up the struct's cache-line alignment. > >> @@ -245,6 +252,11 @@ static inline struct list_head *rt_replq(const >> struct scheduler *ops) >> return &rt_priv(ops)->replq; >> } >> >> +static inline bool_t is_work_conserving(const struct rt_vcpu *svc) >> +{ >> > Use bool. OK. > >> @@ -273,6 +285,20 @@ vcpu_on_replq(const struct rt_vcpu *svc) >> return !list_empty(&svc->replq_elem); >> } >> >> +/* If v1 priority >= v2 priority, return value > 0 >> + * Otherwise, return value < 0 >> + */ >> > Comment style. Got it. Will make it as: /* * If v1 priority >= v2 priority, return value > 0 * Otherwise, return value < 0 */ > > Apart from that, do you want this to return >0 if v1 should have > priority over v2, and <0 if vice-versa, right? If yes... Yes.
> >> +static int >> +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu >> *v2) >> +{ >> +if ( v1->priority_level < v2->priority_level || >> + ( v1->priority_level == v2->priority_level && >> + v1->cur_deadline <= v2->cur_deadline ) ) >> +return 1; >> +else >> +return -1; >> > int prio = v2->priority_level - v1->priority_level; > > if ( prio == 0 ) > return v2->cur_deadline - v1->cur_deadline; > > return prio; > > Return type has to become s_time_t, and there's a chance that it'll > return 0, if they are at the same level, and have the same absolute > deadline. But I think you can deal with this in the caller. OK. Will do. > >> @@ -966,8 +1001,16 @@ burn_budget(const struct scheduler *
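A small stand-alone harness for the comparator shape suggested in the review, assuming s_time_t is a signed 64-bit type as it is in Xen (the struct is a stand-in for rt_vcpu); it exercises the sign convention, including the tie case the caller has to break:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t s_time_t;   /* assumption: signed time type, as in Xen */

/* Simplified stand-in for the scheduling-relevant part of rt_vcpu. */
struct vc {
    unsigned int priority_level;
    s_time_t cur_deadline;
};

/*
 * The suggested comparator shape: > 0 means v1 has higher priority,
 * < 0 means v2 does, and 0 is a tie the caller must break. The casts
 * avoid unsigned wrap-around in the level difference.
 */
static s_time_t compare_vcpu_priority(const struct vc *v1, const struct vc *v2)
{
    int prio = (int)v2->priority_level - (int)v1->priority_level;

    if ( prio == 0 )
        return v2->cur_deadline - v1->cur_deadline;

    return prio;
}

int main(void)
{
    struct vc a = { .priority_level = 0, .cur_deadline = 100 };
    struct vc b = { .priority_level = 1, .cur_deadline = 50 };
    struct vc c = { .priority_level = 0, .cur_deadline = 100 };

    assert(compare_vcpu_priority(&a, &b) > 0);  /* lower level wins */
    assert(compare_vcpu_priority(&b, &a) < 0);
    assert(compare_vcpu_priority(&a, &c) == 0); /* tie is possible */
    printf("comparator behaves as described\n");
    return 0;
}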
Re: [Xen-devel] [PATCH RFC v1 0/3] Enable XL to set and get per-VCPU work conserving flag for RTDS scheduler
On Wed, Aug 2, 2017 at 1:49 PM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> This series of patches enable the toolstack to >> set and get per-VCPU work-conserving flag. >> With the toolstack, system administrators can decide >> which VCPUs will be made work-conserving. >> > Thanks for this series as well, Meng. I'll look at it in the next > couple of days. >> >> We plan to perform two steps in making RTDS scheduler work- >> conserving: >> (1) First make all VCPUs work-conserving by default, >> which was sent as a separate patch. This work aims for Xen 4.10 >> release. >> (2) After that, we enable the XL to set and get per-VCPU work- >> conserving flag, >> which is this series of patches. >> > I think it's better if you merge the "xen:rtds: towards work conserving > RTDS" as patch 1 of this series. > > In fact, sending them as separate series, you make people think that > they're independent, while they're not (as this series is pretty > useless, without that patch :-P). Sure. I can do that. :) Thanks, Meng Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 1/3] xen:rtds: enable XL to set and get vcpu work conserving flag
On Thu, Aug 3, 2017 at 11:47 AM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> --- a/xen/include/public/domctl.h >> +++ b/xen/include/public/domctl.h >> @@ -360,6 +360,7 @@ typedef struct xen_domctl_sched_credit2 { >> typedef struct xen_domctl_sched_rtds { >> uint32_t period; >> uint32_t budget; >> +bool is_work_conserving; >> > I wonder whether it wouldn't be better (e.g., more future proof) to > have a 'uint32_T flags' field here too. > > That way, if/when, in future, we want to introduce some other way of > tweaking the scheduler's behavior for this vCPU, we already have space > for specifying it... > uint32_t flag sounds reasonable to me. I can do it in the next version. Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
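A brief sketch of what the flags-based variant could look like; the bit name follows the one the later v1 series uses, while the bit position and the surrounding struct are illustrative assumptions:

#include <stdint.h>
#include <stdio.h>

/* Sketch only: the bit position here is an assumption. */
#define XEN_DOMCTL_SCHED_RTDS_extratime (1u << 0)

/* Simplified stand-in for xen_domctl_sched_rtds. */
struct rtds_params {
    uint32_t period;
    uint32_t budget;
    uint32_t flags;   /* leaves room for future per-vCPU knobs */
};

int main(void)
{
    struct rtds_params p = { .period = 10000, .budget = 4000, .flags = 0 };

    p.flags |= XEN_DOMCTL_SCHED_RTDS_extratime;            /* set */
    printf("extratime=%d\n",
           !!(p.flags & XEN_DOMCTL_SCHED_RTDS_extratime)); /* test */
    p.flags &= ~XEN_DOMCTL_SCHED_RTDS_extratime;           /* clear */
    printf("extratime=%d\n",
           !!(p.flags & XEN_DOMCTL_SCHED_RTDS_extratime));
    return 0;
}

Compared with a bool member, a flags word keeps the domctl structure's layout stable when further per-vCPU knobs are added later.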
Re: [Xen-devel] [PATCH RFC v1 2/3] libxl: enable per-VCPU work conserving flag for RTDS
On Thu, Aug 3, 2017 at 11:53 AM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> diff --git a/tools/libxl/libxl_types.idl >> b/tools/libxl/libxl_types.idl >> index 8a9849c..f6c3ead 100644 >> --- a/tools/libxl/libxl_types.idl >> +++ b/tools/libxl/libxl_types.idl >> @@ -401,6 +401,7 @@ libxl_sched_params = Struct("sched_params",[ >> ("period", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), >> ("extratime",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), >> ("budget", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), >> +("is_work_conserving", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT'}), >> ]) >> > How about, here at libxl level, we use the "extratime" field that we > have as a leftover from SEDF (and which had, in that scheduler, a > similar meaning)? > > If we don't want to use that one, and we want a new field, I suggest > thinking to a shorter name. How about 'LIBXL_DOMAIN_SCHED_PARAM_FLAG'? We use a bit in the flag field in the sched_rt.c to indicate if a VCPU is work-conserving. The flag field is also extensible for adding other VCPU properties in the future, if necessary. Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 3/3] xl: enable per-VCPU work conserving flag for RTDS
On Thu, Aug 3, 2017 at 12:03 PM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> --- a/tools/xl/xl_cmdtable.c >> +++ b/tools/xl/xl_cmdtable.c >> @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = { >> { "sched-rtds", >>&main_sched_rtds, 0, 1, >>"Get/set rtds scheduler parameters", >> - "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]", >> + "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]] >> [-w[=WORKCONSERVING]]", >>"-d DOMAIN, --domain=DOMAIN Domain to modify\n" >>"-v VCPUID/all, --vcpuid=VCPUID/allVCPU to modify or >> output;\n" >>" Using '-v all' to modify/output all vcpus\n" >>"-p PERIOD, --period=PERIOD Period (us)\n" >>"-b BUDGET, --budget=BUDGET Budget (us)\n" >> + "-w WORKCONSERVING, -- >> workconserving=WORKCONSERVINGWORKCONSERVING (1=yes,0=no)\n" >> > Does this really need to accept a 1 or 0 parameter? Can't it be that, > if -w is provided, the vCPU is marked as work-conserving, if it's not, > it's considered reservation only. > >> --- a/tools/xl/xl_sched.c >> +++ b/tools/xl/xl_sched.c >> >> @@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, >> libxl_vcpu_sched_params *scinfo) >> int i; >> >> if (domid < 0) { >> -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", >> - "VCPU", "Period", "Budget"); >> +printf("%-33s %4s %4s %9s %9s %15s\n", "Name", "ID", >> + "VCPU", "Period", "Budget", "Work conserving"); >> return 0; >> } >> >> @@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, >> libxl_vcpu_sched_params *scinfo) >> >> domname = libxl_domid_to_name(ctx, domid); >> for ( i = 0; i < scinfo->num_vcpus; i++ ) { >> -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", >> +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %15d\n", >> > As far as printing it goes, OTOH, I would indeed print a string, i.e., > "yes", if the field is found to be 1 (true), or "no", if the field is > found to be 0 (false). > >> @@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv) >> int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs that >> change */ >> int *periods = (int *)xmalloc(sizeof(int)); /* period is in >> microsecond */ >> int *budgets = (int *)xmalloc(sizeof(int)); /* budget is in >> microsecond */ >> +int *workconservings = (int *)xmalloc(sizeof(int)); /* budget is >> in microsecond */ >> > Yeah, budget is in microseconds. But this is not budget! :-P Ah, my bad... > > In fact (jokes apart), it can be just a bool, can't it? Yes, bool is enough. Is "workconserving" too long here? I thought about alternative names, such as "wc", "workc", and "extratime". None of them is good enough. The ideal one should be much shorter and easy to link to "work conserving". :( If we use "extratime", it may cause confusion with the "extratime" in the deprecated SEDF. (That is my concern about reusing the EXTRATIME in libxl_types.idl.) Maybe "workc" is better than "workconserving"? Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 3/3] xl: enable per-VCPU work conserving flag for RTDS
On Fri, Aug 4, 2017 at 5:01 AM, Dario Faggioli wrote: > On Thu, 2017-08-03 at 18:02 -0400, Meng Xu wrote: >> On Thu, Aug 3, 2017 at 12:03 PM, Dario Faggioli >> wrote: >> > >> > > @@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv) >> > > int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs >> > > that >> > > change */ >> > > int *periods = (int *)xmalloc(sizeof(int)); /* period is in >> > > microsecond */ >> > > int *budgets = (int *)xmalloc(sizeof(int)); /* budget is in >> > > microsecond */ >> > > +int *workconservings = (int *)xmalloc(sizeof(int)); /* >> > > budget is >> > > in microsecond */ >> > > >> > >> > Yeah, budget is in microseconds. But this is not budget! :-P >> >> Ah, my bad.. >> >> > >> > In fact (jokes apart), it can be just a bool, can't it? >> >> Yes, bool is enough. >> Is "workconserving" too long here? >> > So, I don't want to turn this into a discussion about what colour we > should paint the infamous bikeshed... but, yeah, I don't especially > like this name! :-P > > An I mean, not only here, but everywhere you've used it (changelogs, > other patches, etc.). > > There are two reasons for that: > - it's indeed very long; > - being work conserving is (or at least, I've always heard it used >and used it myself) a characteristic of a scheduling algorithm (or >of its implementation), *not* of a task/vcpu/schedulable entity. Fair enough. I agree work conserving is not a good name. > >It is the scheduler that is work conserving, iff it never let CPUs >sit idle, when there is work to do. In our case here, the scheduler >is work conserving if all the vCPUs has this flag set. It's not, >if even just one has it clear. > >And by putting workconserving-ness at the vCPU level, it looks to >me that we're doing something terminologically wrong, and >potentially confusing. > > I didn't bring this up before, because I'm a bit afraid that it's just > be being picky... but since you mentioned this yourself. > >> I thought about alternative names, such as "wc", "workc", and >> "extratime". None of them is good enough. >> > Yep, I agree that contractions like 'wc' or 'workc' are pretty bad. > 'extratime', I'd actually like it better, TBH. > >> The ideal one should be much >> shorter and easy to link to "work conserving". :( >> If we use "extratime", it may cause confusion with the "extratime" in >> the depreciated SEDF. (That is my concern of reusing the EXTRATIME in >> the libxl_type.idl.) >> > Well, but SEDF being gone (and since quite a few time), and the fact > that RTDS and SEDF have not really never been there together, does > leave very few room for confusion, I think. > > While in academia (e.g., in the GRUB == Gready Reclaming of Unused > Bandwidth papers), what you're trying to achieved, I've heard it called > 'reclaiming' (as I'm sure you have as well :-)), and my friends that > are still working on Linux, are actually using it in there: > > https://lkml.org/lkml/2017/5/18/1128 > https://lkml.org/lkml/2017/5/18/1137 <-- SCHED_FLAG_RECLAIM > > I'm not so sure about it... As I'm not sure the meaning would appear > obvious, to people not into RT scheduling research. > > And even from this point of view, 'extratime' seems a lot better to me. > And if it were me doing this, I'd probably use it, both in the > internals and in the interface. > I'm deciding between reclaim and extratime. I will use extratime, since extratime is already in libxl. extratime means the VCPU will get extra time; it is up to the scheduler to determine how much extra time it will get.
Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 2/3] libxl: enable per-VCPU work conserving flag for RTDS
On Fri, Aug 4, 2017 at 10:34 AM, Wei Liu wrote: > On Fri, Aug 04, 2017 at 02:53:51PM +0200, Dario Faggioli wrote: >> On Fri, 2017-08-04 at 13:10 +0100, Wei Liu wrote: >> > On Fri, Aug 04, 2017 at 10:13:18AM +0200, Dario Faggioli wrote: >> > > On Thu, 2017-08-03 at 17:39 -0400, Meng Xu wrote: >> > > > >> > > *HOWEVER*, in this case, we do have that 'extratime' field already, >> > > as >> > > a leftover from SEDF, which is there taking space and cluttering >> > > the >> > > interface, so why don't make good use of it. Especially considering >> > > it >> > > was used for _exactly_ the same thing, and with _exactly_ the same >> > > meaning, and even for a very similar (i.e., SEDF was also real- >> > > time) >> > > kind of scheduler. >> > >> > Correct me if I'm wrong: >> > >> > 1. extratime is ever only used in SEDF >> > 2. SEDF is removed >> > >> > That means we do have extratime to use in all other schedulers. >> > >> I'm not sure what you mean with this last line. >> >> IAC, this is how our the related data structures looks like, right now: >> >> libxl_sched_params = Struct("sched_params",[ >> ("vcpuid", integer, {'init_val': >> 'LIBXL_SCHED_PARAM_VCPU_INDEX_DEFAULT'}), >> ("weight", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}), >> ("cap", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), >> ("period", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), >> ("extratime",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), >> ("budget", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), >> ]) >> >> The extratime field is there. Any scheduler can use it, if it wants >> (and in the way it wants). Currently, no one of them does that. > > Right, that's what I wanted to know. > >> >> libxl_domain_sched_params = Struct("domain_sched_params",[ >> ("sched",libxl_scheduler), >> ("weight", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}), >> ("cap", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), >> ("period", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), >> ("budget", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), >> >> # The following three parameters ('slice', 'latency' and 'extratime') >> are deprecated, >> # and will have no effect if used, since the SEDF scheduler has been >> removed. >> # Note that 'period' was an SDF parameter too, but it is still effective >> as it is >> # now used (together with 'budget') by the RTDS scheduler. >> ("slice",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}), >> ("latency", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}), >> ("extratime",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), >> ]) >> >> Same here. 'slice', 'latency' and 'extratime' are there because we >> deprecate, but don't remove stuff. They're not used in any way. [*] >> >> If, at some point, I'd decide to develop a feature for, say Credit2, >> that controll the latency (whatever that would mean, it's just an >> example! :-D) of domains, I think I'll use this 'latency' field, for >> its interface, instead of adding some other stuff. >> >> > However, please consider the possibility of reintroducing SEDF in the >> > future. Suppose that would happen, does extratime still has the same >> > semantics? >> > >> Well, I guess yes. But how does this matter? Each scheduler can, if it >> wants, use all these parameters in the way it actuallly prefers. 
So, >> the fact that RTDS will be using 'extratime' for letting vCPUs execute >> past their own real-time reservation, does not prevent the reintroduced >> SEDF --nor any other already existing or new scheduler-- to also use >> it, for similar (or maybe even not so similar) purposes. >> >> Or am I missing something? > > If extratime means different things to different schedulers, it's going > to be confusing. As a layperson I can't tell what extratime is or how it > is supposed to be used. I would like to have the field to have only one > meaning. Right now, extratime is not used by any scheduler. It was used in SEDF only. Since RTDS is the first scheduler to use extratime after SEDF was deprecated, if we use it, it has only one meaning: if extratime is non-zero, it indicates the VCPU will get extra time. I guess I lean toward using extratime in RTDS now. Best, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1] xen:rtds: towards work conserving RTDS
> >> @@ -966,8 +1001,16 @@ burn_budget(const struct scheduler *ops, struct >> rt_vcpu *svc, s_time_t now) >> >> if ( svc->cur_budget <= 0 ) >> { >> -svc->cur_budget = 0; >> -__set_bit(__RTDS_depleted, &svc->flags); >> +if ( is_work_conserving(svc) ) >> +{ >> +svc->priority_level++; >> >ASSERT(svc->priority_level <= 1); I'm sorry I didn't see this suggestion in the previous email. I don't think this assert makes sense. A vcpu that has extratime can have priority_level > 1. For example, a VCPU (period = 100ms, budget = 10ms) runs alone on a core. The VCPU may get its budget replenished 9 times in a period, so the vcpu's priority_level may be 9. The priority_level here also indicates how many times the VCPU gets extra budget in the current period. > >> +svc->cur_budget = svc->budget; >> +} >> +else >> + { >> +svc->cur_budget = 0; >> +__set_bit(__RTDS_depleted, &svc->flags); >> +} >> } Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
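The bound in the counter-example follows directly from the arithmetic; a trivial check with the numbers used above:

#include <stdio.h>

int main(void)
{
    /*
     * The counter-example above: a lone vCPU with period 100ms and
     * budget 10ms fits period/budget = 10 budget chunks in one period,
     * i.e. up to 9 refills after the initial replenishment, so
     * priority_level can legitimately reach 9.
     */
    long period = 100, budget = 10;          /* in ms */
    long max_level = period / budget - 1;

    printf("max priority_level in one period: %ld\n", max_level);
    return 0;
}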
[Xen-devel] [PATCH v1 1/3] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has the extratime flag set, its priority_level will increase by 1 and its budget will be refilled; otherwise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_level; or (ii) v1 has the same priority_level but has a smaller deadline. Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPU priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Signed-off-by: Meng Xu --- Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 3 ++ 2 files changed, 79 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..4e048b9 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. */ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return (svc->flags & RTDS_extratime) ?
1 : 0; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," " cur_b=%"PRI_stime" cur_d=%"PRI_stime" last_start=%"PRI_stime"\n" + " \t\t priority_level=%d has_extratime=%d\n" " \t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n", svc->vcpu->domain->domain_id, svc->vcpu->vcpu_id, @@ -312,6 +346,8 @@ rt_dump_vcpu(const str
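To make the modified policy concrete, here is a minimal, self-contained
user-space sketch of the two mechanisms the patch adds: the
priority_level bump on budget depletion, and the level-then-deadline
comparison. The type and names are illustrative stand-ins, not the
hypervisor's code:

    /*
     * Toy model of the work-conserving RTDS behaviour:
     *  - on budget depletion, a vCPU with the extratime flag bumps its
     *    priority_level and gets a fresh budget instead of being parked;
     *  - the runqueue comparison orders by priority_level first, then EDF.
     */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define DEMO_EXTRATIME (1u << 4)    /* stand-in for RTDS_extratime */

    struct demo_vcpu {
        unsigned priority_level;  /* 0 while inside the reservation */
        unsigned flags;
        int64_t  cur_budget;      /* remaining budget, us */
        int64_t  budget;          /* per-period budget, us */
        int64_t  cur_deadline;    /* absolute deadline, us */
    };

    /* > 0 when v1 should run before v2: lower level wins, then earlier deadline. */
    static int64_t compare_priority(const struct demo_vcpu *v1,
                                    const struct demo_vcpu *v2)
    {
        int prio = (int)v2->priority_level - (int)v1->priority_level;
        return prio ? prio : v2->cur_deadline - v1->cur_deadline;
    }

    /* Returns true if the vCPU stays runnable after exhausting its budget. */
    static bool on_budget_depleted(struct demo_vcpu *v)
    {
        if (!(v->flags & DEMO_EXTRATIME))
            return false;             /* would move to the depleted queue */
        v->priority_level++;          /* now below every in-reservation vCPU */
        v->cur_budget = v->budget;    /* refill and keep running */
        return true;
    }

    int main(void)
    {
        struct demo_vcpu a = { .priority_level = 0, .flags = DEMO_EXTRATIME,
                               .cur_budget = 0, .budget = 4000, .cur_deadline = 100 };
        struct demo_vcpu b = { .priority_level = 0, .flags = 0,
                               .cur_budget = 2000, .budget = 2000, .cur_deadline = 900 };

        on_budget_depleted(&a);  /* a keeps running, but at level 1 */

        /* b, still inside its reservation, beats a despite a's earlier deadline. */
        printf("%s runs first\n", compare_priority(&b, &a) > 0 ? "b" : "a");
        return 0;
    }

This is why the real-time guarantees are preserved: extra time is only
handed out at a priority level strictly below every vCPU that still has
reservation budget left.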
[Xen-devel] [PATCH v1 2/3] libxl: enable per-VCPU extratime flag for RTDS
Modify the libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set
functions to support the per-VCPU extratime flag.

Signed-off-by: Meng Xu

---
Changes from RFC v1:
Change work_conserving flag to extratime flag
---
 tools/libxl/libxl_sched.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
index faa604e..4ebed96 100644
--- a/tools/libxl/libxl_sched.c
+++ b/tools/libxl/libxl_sched.c
@@ -558,6 +558,10 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid,
     for (i = 0; i < num_vcpus; i++) {
         scinfo->vcpus[i].period = vcpus[i].u.rtds.period;
         scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget;
+        if ( vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHED_RTDS_extratime )
+            scinfo->vcpus[i].extratime = 1;
+        else
+            scinfo->vcpus[i].extratime = 0;
         scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid;
     }
     rc = 0;
@@ -607,6 +611,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid,
         vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid;
         vcpus[i].u.rtds.period = scinfo->vcpus[i].period;
         vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget;
+        if ( scinfo->vcpus[i].extratime )
+            vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHED_RTDS_extratime;
+        else
+            vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHED_RTDS_extratime;
     }
 
     r = xc_sched_rtds_vcpu_set(CTX->xch, domid,
@@ -655,6 +663,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid,
         vcpus[i].vcpuid = i;
         vcpus[i].u.rtds.period = scinfo->vcpus[0].period;
         vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget;
+        if ( scinfo->vcpus[0].extratime )
+            vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHED_RTDS_extratime;
+        else
+            vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHED_RTDS_extratime;
     }
 
     r = xc_sched_rtds_vcpu_set(CTX->xch, domid,
-- 
1.9.1

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
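For context, a hedged sketch of how a program linking libxl might drive
the new field through this interface. libxl_vcpu_sched_params_set() and
the IDL-generated init helpers are the existing per-VCPU interface this
patch extends, but the exact setup/teardown discipline and error
handling below are simplified assumptions, not a verified client:

    /*
     * Sketch only: give vCPU 0 of a domain a 10ms/4ms reservation and
     * allow it to consume otherwise-idle time (extratime). Assumes the
     * libxl_vcpu_sched_params_set() call that sched_rtds_vcpu_set()
     * above sits behind; ctx creation and cleanup are elided.
     */
    #include <libxl.h>

    int give_vcpu0_extratime(libxl_ctx *ctx, uint32_t domid)
    {
        libxl_vcpu_sched_params scinfo;
        libxl_sched_params vcpu;

        libxl_vcpu_sched_params_init(&scinfo);
        libxl_sched_params_init(&vcpu);

        vcpu.vcpuid = 0;
        vcpu.period = 10000;   /* us */
        vcpu.budget = 4000;    /* us */
        vcpu.extratime = 1;    /* mapped to XEN_DOMCTL_SCHED_RTDS_extratime above */

        scinfo.sched = LIBXL_SCHEDULER_RTDS;
        scinfo.num_vcpus = 1;
        scinfo.vcpus = &vcpu;  /* stack storage, so no dispose here */

        return libxl_vcpu_sched_params_set(ctx, domid, &scinfo);
    }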
[Xen-devel] [PATCH v1 3/3] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support the
per-VCPU extratime flag.

Signed-off-by: Meng Xu

---
Changes from RFC v1:
Change work_conserving flag to extratime flag
---
 tools/xl/xl_cmdtable.c |  3 ++-
 tools/xl/xl_sched.c    | 56 ++
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 2c71a9f..88933a4 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = {
     { "sched-rtds",
       &main_sched_rtds, 0, 1,
       "Get/set rtds scheduler parameters",
-      "[-d <Domain> [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]",
+      "[-d <Domain> [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]] [-e[=EXTRATIME]]]",
       "-d DOMAIN, --domain=DOMAIN     Domain to modify\n"
       "-v VCPUID/all, --vcpuid=VCPUID/all    VCPU to modify or output;\n"
       "               Using '-v all' to modify/output all vcpus\n"
       "-p PERIOD, --period=PERIOD     Period (us)\n"
       "-b BUDGET, --budget=BUDGET     Budget (us)\n"
+      "-e EXTRATIME, --extratime=EXTRATIME    EXTRATIME (1=yes, 0=no)\n"
     },
     { "domid",
       &main_domid, 0, 0,
diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c
index 85722fe..5138012 100644
--- a/tools/xl/xl_sched.c
+++ b/tools/xl/xl_sched.c
@@ -251,7 +251,7 @@ static int sched_rtds_domain_output(
     libxl_domain_sched_params scinfo;
 
     if (domid < 0) {
-        printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget");
+        printf("%-33s %4s %9s %9s %10s\n", "Name", "ID", "Period", "Budget", "Extra time");
         return 0;
     }
 
@@ -262,11 +262,12 @@ static int sched_rtds_domain_output(
     }
 
     domname = libxl_domid_to_name(ctx, domid);
-    printf("%-33s %4d %9d %9d\n",
+    printf("%-33s %4d %9d %9d %10s\n",
         domname,
         domid,
         scinfo.period,
-        scinfo.budget);
+        scinfo.budget,
+        scinfo.extratime ? "yes" : "no");
     free(domname);
     libxl_domain_sched_params_dispose(&scinfo);
     return 0;
@@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo)
     int i;
 
     if (domid < 0) {
-        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID",
-               "VCPU", "Period", "Budget");
+        printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID",
+               "VCPU", "Period", "Budget", "Extra time");
         return 0;
     }
 
@@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo)
     domname = libxl_domid_to_name(ctx, domid);
     for ( i = 0; i < scinfo->num_vcpus; i++ ) {
-        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n",
+        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n",
                domname,
                domid,
                scinfo->vcpus[i].vcpuid,
                scinfo->vcpus[i].period,
-               scinfo->vcpus[i].budget);
+               scinfo->vcpus[i].budget,
+               scinfo->vcpus[i].extratime ? "yes" : "no");
     }
     free(domname);
     return 0;
@@ -309,8 +311,8 @@ static int sched_rtds_vcpu_output_all(int domid,
     int i;
 
     if (domid < 0) {
-        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID",
-               "VCPU", "Period", "Budget");
+        printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID",
+               "VCPU", "Period", "Budget", "Extra time");
         return 0;
     }
 
@@ -321,12 +323,13 @@ static int sched_rtds_vcpu_output_all(int domid,
     domname = libxl_domid_to_name(ctx, domid);
     for ( i = 0; i < scinfo->num_vcpus; i++ ) {
-        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n",
+        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n",
                domname,
                domid,
                scinfo->vcpus[i].vcpuid,
                scinfo->vcpus[i].period,
-               scinfo->vcpus[i].budget);
+               scinfo->vcpus[i].budget,
+               scinfo->vcpus[i].extratime ? "yes" : "no");
     }
     free(domname);
     return 0;
@@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv)
     int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs that change */
     int *perio
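With the series applied, command-line usage would look like the
following, per the option table above (the domain name and parameter
values are illustrative; sample output is omitted rather than guessed):

    xl sched-rtds -d vm1 -v all -p 10000 -b 4000 -e 1   # 10ms period, 4ms budget, extratime on, all vCPUs
    xl sched-rtds -d vm1 -v 1 -p 20000 -b 5000 -e 0     # strict reservation for vCPU 1 only
    xl sched-rtds -d vm1 -v all                         # list per-vCPU Period/Budget/Extra time

Note that -e takes 1 or 0 rather than acting as a boolean switch, which
keeps it symmetric with reading the flag back in the "Extra time"
column.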