Re: [Xen-devel] [PATCH] xen: enable per-VCPU parameter for RTDS
On Mon, Apr 4, 2016 at 9:07 PM, Chong Li wrote:
> Commit f7b87b0745b4 ("enable per-VCPU parameter for RTDS") introduced
> a bug: it made it possible, in Credit and Credit2, when doing domain
> or vcpu parameters' manipulation, to leave the hypervisor with a
> spinlock held and interrupts disabled.
>
> Fix it.
>
> Signed-off-by: Chong Li
> Acked-by: Dario Faggioli

I'm wondering if the title "xen: enable per-VCPU parameter for RTDS" is suitable for this patch, although I don't have a better title. The title in my mind is: "xen: fix incorrect lock for credit and credit2". I won't fight for this title, though. :-) Probably no need to resend...

Thanks,
Meng
Re: [Xen-devel] xenpm and scheduler
On Mon, Apr 11, 2016 at 10:16 AM, tutu sky wrote:
> hi,
> Does xenpm's 'cpufreq' or 'cpuidle' feature have any effect on scheduling
> decisions?

Please do not cross post. No effect on the RTDS scheduler. May I ask a question: why do you need to consider this?

Thanks,
Meng
Re: [Xen-devel] Xen 4.7 Headline Features (for PR)
On Fri, Apr 22, 2016 at 9:26 AM, Lars Kurth wrote:
> Folks,
>
> given that we are getting close to RCs, I would like to start to
> spec out the headline features for the press release. The big items I am
> aware of are COLO. I am a little confused about xSplice.
>
> Maybe we can use this thread to start collating a short-list.

How about the improved RTDS scheduler:
(1) Changed the RTDS scheduler from a quantum-driven model to an event-driven model;
(2) Support for getting/setting per-VCPU parameters through the RTDS toolstack.

Thanks,
Meng
[Xen-devel] Should we mark RTDS as supported feature from experimental feature?
Hi Dario and all,

When the RTDS scheduler is initialized, it prints out that the scheduler is an experimental feature, with the following lines:

    printk("Initializing RTDS scheduler\n"
           "WARNING: This is experimental software in development.\n"
           "Use at your own risk.\n");

On the RTDS wiki [1], it says the RTDS scheduler is an experimental feature. None of this information has been updated since Xen 4.5. However, inside the MAINTAINERS file, the status of the RTDS scheduler is marked as Supported (refer to commit 28041371 by Dario Faggioli on 2015-06-25).

In my opinion, the RTDS scheduler's functionality is finished and tested. So should I send a patch to change the message printed out when the scheduler is initialized? (A sketch of what such a patch could look like follows this message.) If I understand correctly, the status in the MAINTAINERS file should have the highest priority, and information from other sources should be kept consistent with what the MAINTAINERS file says? Please correct me if I'm wrong.

[1] http://wiki.xenproject.org/wiki/RTDS-Based-Scheduler

Thanks,
Meng
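[Editorial note: for reference, the change being discussed would be a one-hunk patch along these lines, assuming the message still lives in rt_init() in xen/common/sched_rt.c; this is only a sketch, and the exact remaining wording would be up to the maintainers:

    --- a/xen/common/sched_rt.c
    +++ b/xen/common/sched_rt.c
    @@ rt_init @@
    -    printk("Initializing RTDS scheduler\n"
    -           "WARNING: This is experimental software in development.\n"
    -           "Use at your own risk.\n");
    +    printk("Initializing RTDS scheduler\n");
]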
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
>> When the RTDS scheduler is initialized, it will print out that the
>> scheduler is an experimental feature with the following lines:
>>
>>     printk("Initializing RTDS scheduler\n"
>>            "WARNING: This is experimental software in development.\n"
>>            "Use at your own risk.\n");
>>
>> On the RTDS wiki [1], it says the RTDS scheduler is an experimental
>> feature.
>>
> Yes.
>
>> However, inside the MAINTAINERS file, the status of the RTDS scheduler is
>> marked as Supported (refer to commit 28041371 by Dario Faggioli
>> on 2015-06-25).
>>
> There's indeed a discrepancy between the way one can read that bit of
> MAINTAINERS, and what is generally considered Supported (e.g., subject
> to security support, etc).
>
> This is true in general, not only for RTDS (more about this below).

Ah-ha, I see.

>> In my opinion, the RTDS scheduler's functionality is finished and
>> tested. So should I send a patch to change the message printed out
>> when the scheduler is initialized?
>>
> So, yes, the scheduler is now feature complete (with the per-vcpu
> parameters) and adheres to a much more sensible and scalable design
> (event driven). Yet, these features have been merged very recently,
> therefore, when you say "tested", I'm not so sure I agree. In fact, we
> do test it on OSSTest, but only in a couple of tests. The combination
> of these two things make me think that we should allow for at least
> another development cycle, before considering switching.

I see. So should we mark it as Completed for Xen 4.7? Or should we wait until Xen 4.8 to mark it as Completed, if nothing bad happens to the scheduler?

> And speaking of OSSTest, there have been occasional failures, on ARM,
> which I haven't yet found the time to properly analyze. It may be just
> something related to the fact that the specific board was very slow,
> but I'm not sure yet.

Hmm, I see. I plan to have a look at Xen on ARM this summer. When I boot Xen on ARM, I could probably have a look at it as well.

> And even in that case, I wonder how we should handle such a
> situation... I was thinking of adding a work-conserving mode, what do
> you think?

Hmm, I can see why a work-conserving mode is necessary and useful. I'm thinking about the tradeoff between the scheduler's complexity and the benefit brought by introducing that complexity.

The work-conserving mode is useful. However, there are other real-time features of the scheduler that may also be useful. For example, I heard from some company that they want to run RT VMs alongside non-RT VMs, which is supported in the RT-Xen 2.1 version, but not supported in RTDS. There are other RT-related issues we may need to solve to make the scheduler more suitable for the real-time or embedded field, such as protocols to handle shared resources.

Since the scheduler aims at embedded and real-time applications, those RT-related features seem to me more important than the work-conserving feature. What do you think?

> You may have something similar in RT-Xen already but, even
> if you don't, there are a number of ways for achieving that without
> disrupting the real-time guarantees.

Actually, in RT-Xen, we don't have the work-conserving version yet. The work-conserving feature may not affect the real-time guarantees, but it does not bring any improved real-time guarantees in theory. When an embedded system designer wants to use the RTDS scheduler "with the work-conserving feature" (suppose we implement it), he cannot pack more workload into the system by leveraging the work-conserving feature. In practice, the system may run faster than he expects, but he won't know how much faster it will be unless we provide a theoretical guarantee.

> What do you think?

IMHO, handling the other real-time features related to the scheduler may be more important than the work-conserving feature, in order to make the scheduler more adoptable in the embedded world.

>> If I understand correctly, the status in the MAINTAINERS file should have
>> the highest priority and information from other sources should be kept
>> consistent with what the MAINTAINERS file says?
>>
>> Please correct me if I'm wrong.
>>
> This has been discussed before. Have a look at these threads/messages:
>
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg00972.html
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01775.html

I remembered this. Always keep an eye on ARINC653 as well. :-)

> And at this:
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01992.html

Yes. I read this before I asked. :-)

> The feature document template has been put
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
On Tue, Apr 26, 2016 at 4:56 AM, Andrew Cooper wrote:
>>> However, inside MAINTAINERS file, the status of RTDS scheduler is
>>> marked as Supported (refer to commit 28041371 by Dario Faggioli
>>> on 2015-06-25).
>>>
>> There's indeed a discrepancy between the way one can read that bit of
>> MAINTAINERS, and what is generally considered Supported (e.g., subject
>> to security support, etc).
>>
>> This is true in general, not only for RTDS (more about this below).
>
> The purpose of starting the feature docs (in docs/features/) was to
> identify the technical status of a feature, alongside some
> documentation pertinent to its use.
>
> I am tempted to suggest a requirement of "no security support without a
> feature doc" for new features, in an effort to resolve the current
> uncertainty as to what is supported and what is not.

I see. As I said in my reply to Dario, I will add a feature doc about the RTDS scheduler in the summer.

> As for the MAINTAINERS file, supported has a different meaning. From
> the file itself,

Right. I read this doc before asking. :-)

> Descriptions of section entries:
>
> M: Mail patches to: FullName
> L: Mailing list that is relevant to this area
> W: Web-page with status/info
> T: SCM tree type and location. Type is one of: git, hg, quilt, stgit.
> S: Status, one of the following:
>    Supported:  Someone is actually paid to look after this.
>    Maintained: Someone actually looks after it.
>    Odd Fixes:  It has a maintainer but they don't have time to do
>                much other than throw the odd patch in. See below..
>    Orphan:     No current maintainer [but maybe you could take the
>                role as you write your new code].
>    Obsolete:   Old code. Something tagged obsolete generally means
>                it has been replaced by a better system and you
>                should be using that.
>
> Nothing in the MAINTAINERS file constitutes a security statement.

I didn't realize this before. Thank you very much for the clarification!

Meng
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
>> The feature document template has been put together:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01929.html
>>
>> And there are feature documents in tree already.
>>
>> Actually, writing one for RTDS would be a rather interesting and useful
>> thing to do, IMO! :-)
>
> I think it would be helpful to try to spell out what we think are the
> criteria for marking RTDS non-experimental. Reading your e-mail, Dario,
> I might infer the following criteria:
>
> 1. New event-driven code spends most of a full release cycle in the tree
>    being tested
> 2. Better tests in osstest (which ones?)
> 3. A feature doc

I agree with the above three items.

> 4. A work-conserving mode

I think we need to consider item 4 carefully. A work-conserving mode is not a must for real-time schedulers, and it is not the main purpose/goal of the RTDS scheduler.

> #3 definitely sounds like a good idea. #1 is probably reasonable.
>
> I don't think #4 should be a blocker; we have plenty of work-conserving
> schedulers. :-)

Exactly. Actually, the work-conserving feature is not a top priority for real-time applications. The resource-sharing issues, which interact with the scheduler, are more important than the work-conserving "issue" for complex, non-independent real-time applications.

> Regarding #2, did you have specific tests in mind?

I've been thinking about how to confirm the correctness of (RTDS) schedulers. It is actually quite challenging to prove that a scheduler is correct. I'm wondering what the goal of the tests is; it will determine how the scheduler should be tested, IMHO. There are three possible goals, in increasing difficulty: (1) make sure the scheduler won't crash the system; (2) make sure the performance of the scheduler is correct; or (3) prove the scheduler is correct. Which one are we talking about here? (Maybe item 1?)

Thanks,
Meng
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
On Tue, Apr 26, 2016 at 6:49 PM, Dario Faggioli wrote:
> On Tue, 2016-04-26 at 14:38 -0400, Meng Xu wrote:
>> > So, yes, the scheduler is now feature complete (with the per-vcpu
>> > parameters) and adheres to a much more sensible and scalable design
>> > (event driven). Yet, these features have been merged very recently,
>> > therefore, when you say "tested", I'm not so sure I agree. In fact, we
>> > do test it on OSSTest, but only in a couple of tests. The combination
>> > of these two things make me think that we should allow for at least
>> > another development cycle, before considering switching.
>> I see. So should we mark it as Completed for Xen 4.7? or should we
>> wait until Xen 4.8 to mark it as Completed if nothing bad happens to
>> the scheduler?
>>
> We should define the criteria. :-)
>
> In any case, not earlier than 4.8, IMO.
>
>> > And even in that case, I wonder how we should handle such a
>> > situation... I was thinking of adding a work-conserving mode, what do
>> > you think?
>> Hmm, I can get why work-conserving mode is necessary and useful. I'm
>> thinking about the tradeoff between the scheduler's complexity and
>> the benefit brought by introducing complexity.
>>
>> The work-conserving mode is useful. However, there are other real time
>> features in terms of the scheduler that may be also useful. For
>> example, I heard from some company that they want to run RT VM with
>> non-RT VM, which is supported in RT-Xen 2.1 version, but not supported
>> in RTDS.
>>
> I remember that, but I'm not sure what "running a non-RT VM" inside
> RTDS would mean. According to what algorithm these non real-time VMs
> would be scheduled?

A non-RT VM is a VM whose priority is lower than that of any RT VM. The non-RT VMs won't get scheduled until all RT VMs have been scheduled. We can use the same gEDF scheduling policy to schedule the non-RT VMs. (See the sketch after this email for one way such a priority comparison could look.)

> Since you mentioned complexity, adding a work conserving mode should be
> easy enough, and if you allow a VM to be in work conserving mode, and
> have a very small (or even zero) budget, here you are a non real-time
> VM.

OK. I think it depends on what algorithm we want to use for the work-conserving mode. Do you have some algorithm in mind?

>> There are other RT-related issues we may need to solve to make it more
>> suitable for real-time or embedded field, such as protocols to handle
>> the shared resource.
>>
>> Since the scheduler aims for the embedded and real-time applications,
>> those RT-related features seems to me more important than the
>> work-conserving feature.
>>
>> What do you think?
>>
> There always will be new/other features... But that's not the point.

OK.

> What we need, here, is agree on what is the _minimum_ set of them that
> allows us to call the scheduler complete and usable. I think we're
> pretty close, with this work conserving mode I'm talking about the only
> candidate I can think of.

So the point you're raising is that the work-conserving mode is (probably) a must.

>> > You may have something similar in RT-Xen already but, even
>> > if you don't, there are a number of ways for achieving that without
>> > disrupting the real-time guarantees.
>> Actually, in RT-Xen, we don't have the work-conserving version yet.
>>
> Yeah, sorry, I probably was confusing it with the "RT / non-RT" flag.

I see. :-)

Best regards,
Meng
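[Editorial note: a minimal sketch of the RT/non-RT priority comparison described above might look like the following. This is hypothetical code, not taken from RT-Xen or the Xen tree; the `is_rt` field is an assumed addition, and the struct below only stands in for the fields the comparison needs.

    #include <stdbool.h>
    #include <stdint.h>

    typedef int64_t s_time_t;       /* stand-in for Xen's signed time type */

    /* Sketch only: in the real scheduler these fields would live inside
     * the existing struct rt_vcpu. */
    struct rt_vcpu {
        s_time_t cur_deadline;      /* current absolute deadline */
        bool is_rt;                 /* true for RT vCPUs, false otherwise */
    };

    /*
     * Return true if vCPU 'a' has higher scheduling priority than 'b'.
     * An RT vCPU always beats a non-RT one; within the same class,
     * plain gEDF (earlier deadline first) applies.
     */
    static bool rt_vcpu_higher_prio(const struct rt_vcpu *a,
                                    const struct rt_vcpu *b)
    {
        if ( a->is_rt != b->is_rt )
            return a->is_rt;
        return a->cur_deadline < b->cur_deadline;
    }
]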
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
> >> > 4. A work-conserving mode
>> I think we need to consider item 4 carefully. Work-conserving mode
>> is not a must for real-time schedulers and it is not the main
>> purpose/goal of the RTDS scheduler.
>>
> It's indeed not a must for real-time schedulers. In fact, it's only
> important if one wants the system to be overall usable, when using a
> real-time scheduler. :-P
>
> Also, I may be wrong but it should not be too hard to implement...
> I.e., a win-win. :-)

I'm thinking that if we want to implement a work-conserving policy in RTDS, how should we allocate the unused resource to domains? Should this allocation be proportional to the budget/period each domain is configured with? I guess the complexity totally depends on which work-conserving algorithm we want to encode into RTDS.

For example, we can have priority bands: when a VCPU depletes its budget, it goes to a lower priority band. A VCPU in a lower priority band will not be scheduled until all VCPUs in higher priority bands have been scheduled. This policy seems easy to incorporate into RTDS. (But I have to think harder to make sure there is no catch. :-) )

Best,
Meng
Re: [Xen-devel] Should we mark RTDS as supported feature from experimental feature?
On Wed, Apr 27, 2016 at 8:27 AM, Dario Faggioli wrote:
> On Tue, 2016-04-26 at 21:16 -0400, Meng Xu wrote:
>> > It's indeed not a must for real-time schedulers. In fact, it's only
>> > important if one wants the system to be overall usable, when using a
>> > real-time scheduler. :-P
>> >
>> > Also, I may be wrong but it should not be too hard to implement...
>> > I.e., a win-win. :-)
>> I'm thinking if we want to implement work-conserving policy in RTDS,
>> how should we allocate the unused resource to domains. Should this
>> allocation be proportional to the budget/period each domain is
>> configured with?
>> I guess the complexity totally depends on which work-conserving
>> algorithm we want to encode into RTDS.
>>
> Indeed it does.
>
> Everything works for me, basically. As you say, it would not be a
> critical aspect of this scheduler, and the implementation details of
> the work conserving mode is not going to be the reason why people
> choose it anyway... It's just to avoid that people runs away from it
> (and from Xen) screaming! :-)

I see. Right! This is a good point.

> So, for instance, how do you manage non real-time VMs in RT-Xen?

RT-Xen is not work-conserving right now. The way we handle non-RT VMs in RT-Xen 2.1 (not the latest version) is that we use another bit in rt_vcpu to indicate whether a VCPU is RT or not. The non-RT VCPUs always have lower priority than the RT VCPUs.

> You say you still use EDF, how do you do that?

When the RT VCPUs have all depleted their budgets, the non-RT VCPUs are scheduled by the gEDF scheduling policy.

> When does one non real-time
> VM preempt another non real-time VM? (Ideally, I'd go and check the RT-
> Xen code that does this myself, but right now, I can't, sorry.)

A non-RT VCPU cannot be scheduled if any RT VCPU still has budget. Once non-RT VCPUs are scheduled, they are preempted/scheduled based on gEDF, since a non-RT VCPU also has budget and period parameters.

> We could go for this that you have already, and as soon as a VM
> exhausts its budget, we demote it to non real-time, until it receives
> the replenishment. Or something like that.

Right. To make it work-conserving, we will have to keep decreasing the priority whenever a VCPU runs out of budget at its current priority, until there is no idle resource in the system any more. (A sketch of what the demotion step could look like follows this email.)

> In this case, we basically get two features at the cost of one (support
> for non real-time VMs and work conserving mode for real-time VMs). Not
> to mention that you basically have the code already, and "only" need to
> upstream it! :-DD

Right. That is true... Let me think about it and send out a design later.

Meng
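[Editorial note: a rough sketch of the demotion step discussed above, in the spirit of what burn_budget() in sched_rt.c already does when the budget hits zero. This is hypothetical code, not a proposed patch; `band` and `RT_MAX_BAND` are assumed new additions, and replenishment would reset `band` back to 0.

    /* Inside burn_budget(), once svc->cur_budget has been decremented: */
    if ( svc->cur_budget <= 0 )
    {
        svc->cur_budget = 0;
        if ( svc->band < RT_MAX_BAND )
            svc->band++;    /* demote to a lower priority band ... */
        else
            __set_bit(__RTDS_depleted, &svc->flags); /* ... or park it */
    }

A vCPU with a larger `band` value would then lose every priority comparison against vCPUs in higher bands, with gEDF ordering applying within a band, so the real-time guarantees of vCPUs that still have budget are untouched.]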
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Wed, May 4, 2016 at 11:51 AM, George Dunlap wrote:
> On 03/05/16 22:46, Dario Faggioli wrote:
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>>
>> In fact, just like it is for Credit2, there is no real
>> need for implementing alloc_pdata and free_pdata.
>>
>> This also made it possible to improve the replenishment
>> timer handling logic, such that now the timer is always
>> kept on one of the pCPU of the scheduler it's servicing.
>> Before this commit, in fact, even if the pCPU where the
>> timer happened to be initialized at creation time was
>> moved to another cpupool, the timer stayed there,
>> potentially inferfearing with the new scheduler of the
>
> * interfering
>
>> pCPU itself.
>>
>> Signed-off-by: Dario Faggioli
>
> I don't know much about the logic, so I'll wait for Meng Xu to review it.

I will look at it this week... (I will try to do it asap...)

Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli wrote:
> The scheduling hooks API is now used properly, and no
> initialization or de-initialization happen in
> alloc/free_pdata any longer.
>
> In fact, just like it is for Credit2, there is no real
> need for implementing alloc_pdata and free_pdata.
>
> This also made it possible to improve the replenishment
> timer handling logic, such that now the timer is always
> kept on one of the pCPU of the scheduler it's servicing.
> Before this commit, in fact, even if the pCPU where the
> timer happened to be initialized at creation time was
> moved to another cpupool, the timer stayed there,
> potentially inferfearing with the new scheduler of the
> pCPU itself.
>
> Signed-off-by: Dario Faggioli
> --

Reviewed-and-Tested-by: Meng Xu

I do have a minor comment about the patch, although it is not important at all and it is not really about this patch...

> @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
> {
>     struct rt_private *prv = rt_priv(ops);
>
> -    kill_timer(prv->repl_timer);
> +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> +           prv->repl_timer->status == TIMER_STATUS_killed);

I found that in xen/timer.h, the comment after the definition of TIMER_STATUS_invalid is:

    #define TIMER_STATUS_invalid  0 /* Should never see this. */

This comment is a little contrary to how the status is used here. Actually, what exactly does "Should never see this" mean? This _invalid status is used in timer.h, and it is the status a timer has before it is initialized by init_timer(). So I'm thinking maybe this comment could be improved to avoid confusion?

Anyway, this is just a comment and should not be a blocker, IMO. I just want to raise it since I saw it... :-)

===About the testing I did===

---Below is how I tested it---
I booted up two vcpus, created one cpupool for each type of scheduler, and migrated them around. The scripts to run the test cases can be found at https://github.com/PennPanda/scripts/tree/master/xen/xen-test

---Below is the testing scenarios---
echo "start test case 1..."
xl cpupool-list
xl cpupool-destroy cpupool-credit
xl cpupool-destroy cpupool-credit2
xl cpupool-destroy cpupool-rtds
xl cpupool-create ${cpupool_credit_file}
xl cpupool-create ${cpupool_credit2_file}
xl cpupool-create ${cpupool_rtds_file}
# Add cpus to each cpupool
echo "Add CPUs to each cpupool"
for ((i=0;i<5; i+=1));do
    xl cpupool-cpu-remove Pool-0 ${i}
done
echo "xl cpupool-cpu-add cpupool-credit 0"
xl cpupool-cpu-add cpupool-credit 0
echo "xl cpupool-cpu-add cpupool-credit2 1,2"
xl cpupool-cpu-add cpupool-credit2 1
xl cpupool-cpu-add cpupool-credit2 2
echo "xl cpupool-cpu-add cpupool-rtds 3,4"
xl cpupool-cpu-add cpupool-rtds 3
xl cpupool-cpu-add cpupool-rtds 4
xl cpupool-list -c
xl cpupool-list
# Migrate vm1 among cpupools
echo "Migrate ${vm1_name} among cpupools"
xl cpupool-migrate ${vm1_name} cpupool-rtds
xl cpupool-migrate ${vm1_name} cpupool-credit2
xl cpupool-migrate ${vm1_name} cpupool-rtds
xl cpupool-migrate ${vm1_name} cpupool-credit
xl cpupool-migrate ${vm1_name} cpupool-rtds

Thank you very much and best regards,
Meng
Re: [Xen-devel] [PATCH for 4.7 0/4] Assorted scheduling fixes
Hi Wei,

On Wed, May 4, 2016 at 12:04 PM, Wei Liu wrote:
> On Tue, May 03, 2016 at 11:46:27PM +0200, Dario Faggioli wrote:
> > Hi,
> >
> > This small series contains some bugfixes for various schedulers. They're all
> > bugfixes, so I think all should be considered for 4.7. Here's some more
> > detailed analysis.
> >
> > Patch 1 and 3 are for Credit2. Patch 1 is a lot more important, as we have an
> > ASSERT triggering without it. Patch 2 is behavioral fixing, which I believe it
> > is important, but at least does not make anything explode.
> >
> > Patch 2 fixes another ASSERT, in case a pCPU fails to come up. This is what
> > Julien reported here:
> >
> > https://www.mail-archive.com/xen-devel@lists.xen.org/msg65918.html
> >
> > Julien, the patch is very very similar to the one attached to one of my reply
> > in that thread, but I had to change some small bits... Can you please re-test
> > it?
> >
> > Patch 4 makes the code of RTDS look consistent with what we state in patch 2,
> > so it's also important. Furthermore, it does fix a bug (although, again, not
> > one that would splat Xen) as, without it, we may have a timer used by the RTDS
> > scheduler bound to the pCPU of another cpupool with another scheduler. That
> > would introduce some unwanted and very difficult to recognize interference
> > between different schedulers in different pool, and should hence be avoided.
> >
> > So this was awesomeness; about risks:
> > - patch 1 is very small, super-self contained (zero impact outside of Credit2
> >   code) and it fixes an actual and 100% reproducible bug;
> > - patch 2 is also totally self-contained and it can't possibly cause problems
> >   to anything else than to what it is trying to fix (Credit2's load balancer).
> >   It doesn't cure any ASSERT or Oops, so it's less interesting, but given the
> >   low risk --also considering that Credit2 will still be considered
> >   experimental in 4.7-- I think it can go in;
> > - patch 3 is bigger, and a bit more complex. Note, however, that most of its
> >   content is code comments and ASSERT-s; it is self contained to scheduling
> >   (in the sense that it impacts all schedulers, but "just" them), and fixes
> >   a situation that, AFAIUI, is important for ARM;
>
> You meant patch 2 actually.
>
> For the first three patches:
>
> Release-acked-by: Wei Liu
>
> > - patch 4 may again look not that critical. But, the fact that someone wanting
> >   to experiment with RTDS in a cpupool would face the kind of interference
> >   between independent cpupools that the patch cures is, I think, something
> >   worthwhile trying to avoid.

Yes. It's better to avoid this type of interference.

> >   Besides, it is again quite self contained, as
> >   it's indeed only relevant for RTDS (which is also going to be called
> >   experimental for 4.7).

Yes. It should not affect other schedulers or other parts of the system. Actually, it does not affect the logic in RTDS either.

> I will wait for Meng to review this one.

I just reviewed and tested this patch on my computer.

Thank you very much!

Best regards,
Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Sat, May 7, 2016 at 5:19 PM, Meng Xu wrote:
> On Tue, May 3, 2016 at 5:46 PM, Dario Faggioli wrote:
>> The scheduling hooks API is now used properly, and no
>> initialization or de-initialization happen in
>> alloc/free_pdata any longer.
>> [...]
>> Signed-off-by: Dario Faggioli
>
> Reviewed-and-Tested-by: Meng Xu
>
> ---Below is the testing scenarios---
> [... same cpupool creation/migration test script as in my previous email ...]

I forgot one thing in the previous email. When I tried to migrate Domain-0 from Pool-0 (with the rtds or credit scheduler) to another newly created pool, say cpupool-credit, it always fails. This happens even when I boot into the credit scheduler and try to migrate Domain-0 to another cpupool.

I'm wondering whether Domain-0 can be migrated among cpupools. From the Xen wiki (http://wiki.xenproject.org/wiki/Cpupools_Howto), it seems Domain-0 can be migrated.

Thanks,
Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
> > I do have a minor comment about the patch, although it is not
> > important at all and it is not really about this patch...
> >
> > > @@ -614,7 +612,8 @@ rt_deinit(struct scheduler *ops)
> > > {
> > >     struct rt_private *prv = rt_priv(ops);
> > >
> > > -    kill_timer(prv->repl_timer);
> > > +    ASSERT(prv->repl_timer->status == TIMER_STATUS_invalid ||
> > > +           prv->repl_timer->status == TIMER_STATUS_killed);
> > I found in xen/timer.h, the comment after the definition of
> > TIMER_STATUS_invalid is
> >
> >     #define TIMER_STATUS_invalid  0 /* Should never see this. */
> >
> > This comment is a little contrary to how the status is used here.
> > Actually, what exactly does "Should never see this" mean?
> >
> > This _invalid status is used in timer.h and it is the status a timer
> > has before it is initialized by init_timer().
> >
> As far as my understanding goes, this means that a timer, during its
> operations, should never be found in this state.
>
> In fact, this marks a situation where the timer has been allocated but
> never initialized, and there are ASSERT()s around to enforce that.
>
> However, if what one wants is _exactly_ to check whether the timer has
> been allocated but not initialized, I don't see why I can't use this.

You can use this. Actually, I agree with how you used it here; this is also how the existing init_timer() uses it.

>> So I'm thinking maybe this comment can be better improved to avoid
>> confusion?
>>
> I don't think things are confusing, neither right now, nor after this
> patch, but I'm open to others' opinion. :-)

Hmm, I won't get confused by the comment from now on, but I'm unsure whether someone else will. The tricky thing is that once I know it, it doesn't feel weird; however, when I first read it, it felt a little confusing without reading the other parts of the code related to this macro.

Anyway, I'm OK with either way: change the comment or not.

Best Regards,
Meng
Re: [Xen-devel] [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling
On Mon, May 9, 2016 at 10:52 AM, Dario Faggioli wrote:
> On Mon, 2016-05-09 at 10:08 -0400, Meng Xu wrote:
>> > I don't think things are confusing, neither right now, nor after this
>> > patch, but I'm open to others' opinion. :-)
>>
>> Hmm, I won't get confused with the comment from now on, but I'm unsure
>> if someone else will or not. The tricky thing is when I know it, I
>> won't feel weird. However, when I first read it, I feel a little
>> confusing if not reading the other parts of the code related to this
>> macro.
>>
> I don't feel the same, but I understand the concern.
>
> I think we have two options here:
> 1. we just do nothing;
> 2. you send a patch that, according to your best judgement, improves
>    things (as we all do all the time! :-P).
>
> :-D
>
>> Anyway, I'm ok with either way: change the comment or not.
>>
> Me too, and in fact, I'm not changing it, but I won't stop you trying to
> do so. :-)

OK. I can do it... But is just a one-line comment change too small to be a patch? ;-)

Thanks,
Meng
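[Editorial note: for concreteness, the whole patch would be nothing more than a comment tweak in xen/include/xen/timer.h, e.g. the following. The wording below is only a suggestion, not an actual submitted patch:

    -#define TIMER_STATUS_invalid  0 /* Should never see this.           */
    +#define TIMER_STATUS_invalid  0 /* Allocated but not yet initialized
    +                                    (init_timer() not called); never
    +                                    seen on an active timer.         */
]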
Re: [Xen-devel] Xen 4.7 Test Day Instructions for RC2+ : Call to action for people who added new features / functionality to Xen 4.7
On Mon, May 9, 2016 at 11:28 AM, Lars Kurth wrote:
> Hi all,
>
> I added the following sections based on git logs to man pages. Authors are on
> the CC list and should review and modify (or suggest edits by replying to
> this thread). I added/updated/added TODO's to:
>
> I do have some questions, to ...
> - Konrad/Ross: is there any documentation for xSplice which I have missed?
> - Julien: Any ARM specific stuff you want people to test?
> - Doug: are there any docs / tests for KCONFIG you want to push
> - George: are there any manual tests for credit 2 hard affinity, for hotplug
>   disk backends (drbd, iscsi, &c) and soft reset for HVM guests that should be
>   added?
>
> For the following sections there are some TODO's - please verify and modify
> and once OK, remove the TODO from the wiki pages.
>
> RTDS (Meng Xu, Tianyang, Chong Li)
> - Meng, you mention improvements to the RTDS scheduler in another thread
> - Are any specific test instructions needed in
>   http://wiki.xenproject.org/wiki/Xen_4.7_RC_test_instructions
> - http://wiki.xenproject.org/wiki/Xen_4.7_RC_test_instructions#RTDS_scheduler_improvements

I verified the text in the wiki and added one comment, "which will not invoke the scheduler unnecessarily", for the event-driven model. I removed the TODO in the RTDS section of the wiki. Please let me know if I need to do something else.

Thank you very much!

Best,
Meng
Re: [Xen-devel] [PATCH for-4.7] sched/rt: Fix memory leak in rt_init()
On Tue, May 10, 2016 at 9:38 AM, Andrew Cooper wrote:
> c/s 2656bc7b0 "xen: adopt .deinit_pdata and improve timer handling"
> introduced an error path into rt_init() which leaked prv if the
> allocation of prv->repl_timer failed.
>
> Introduce an error cleanup path.
>
> Spotted by Coverity.

I'm curious about this line. Does it mean that this was spotted by Coverity code review or by some automatic testing/checking?

> Signed-off-by: Andrew Cooper
> ---

I'm sorry that I should have spotted it when I reviewed the code. :-(

Reviewed-by: Meng Xu

Meng
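[Editorial note: for readers who haven't seen the patch, the shape of such a fix is the classic goto-based cleanup path. The sketch below illustrates the pattern under discussion; it is simplified from, and not identical to, the actual committed change:

    static int rt_init(struct scheduler *ops)
    {
        struct rt_private *prv = xzalloc(struct rt_private);

        if ( prv == NULL )
            goto err;

        prv->repl_timer = xzalloc(struct timer);
        if ( prv->repl_timer == NULL )
            goto err;   /* before the fix, prv leaked on this path */

        /* ... remaining initialization of prv ... */
        ops->sched_data = prv;
        return 0;

     err:
        xfree(prv);     /* xfree(NULL) is a no-op, so this is safe */
        return -ENOMEM;
    }

xfree() being NULL-safe is what lets a single error label clean up both failure points.]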
Re: [Xen-devel] Hypercall invoking
On Tue, May 10, 2016 at 6:12 AM, tutu sky wrote:
> From: Dario Faggioli
> Sent: Tuesday, May 10, 2016 7:32 AM
> To: tutu sky; Xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Hypercall invoking
>
> On Tue, 2016-05-10 at 06:49 +, tutu sky wrote:
> > Hi,
> > I added a new hypercall to xen successfully, but when i try to invoke
> > it in dom0 using privcmd, i am unable to invoke (using XC), I must cd
> > to /xen.x.y.z/tools/xcutils and then try to invoke hypercall by XC
> > interface which i created for it.
> > DO functions of hypercall is written in /xen/common/kernel.c.
> >
> > can you give me a clue?
> >
> That depends on what you are trying to achieve, and on what you have
> implemented and how you have implemented it.
>
> Actually, this is not the first time we tell you this: without you
> being specific, we can't possibly help.
>
> In this case, "being specific" would have meant specifying:
> - what is your final end goal

I think Dario meant: why do you want to add a hypercall? What is your "final" goal in adding the hypercall? It may simply be unnecessary to add a hypercall to achieve your final goal.

I'm just curious if you could give a self-introduction. ;-)

Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Dushyant,

On Tue, Mar 8, 2016 at 3:23 AM, Dushyant K Behl wrote:
> Hi All,
>
> I'm working on a research project with IBM, and I want to run Xen on the
> Nvidia Tegra Jetson-tk1 board.
> I looked at a post on this mailing list
> (http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg01122.html),
> and I am using this git tree -
>
> git://xenbits.xen.org/people/ianc/xen.git
> and branch - tegra-tk1-jetson-v1
>
> But when I try to boot Xen on the board I am not able to see any output (even
> with earlyprintk enabled).
> After jumping to Xen the board just resets without showing any output.
>
> I am using upstream u-boot with non secure mode enabled.

I just got the Jetson TK1 board and I'm trying to run Xen on it. May I know which u-boot repo and which branch you used to enable the non-secure mode? If you could also share your u-boot config file, that would be awesome!

The u-boot from NVIDIA didn't turn on the HYP mode. I tried git://git.denx.de/u-boot.git, tag v2016.03, but the board won't boot after I flashed that u-boot. No message at all... :-( If I use NVIDIA's u-boot, I can boot into the Linux kernel without problems.

Thank you very much for your help and time!

Best Regards,
Meng
Re: [Xen-devel] [PATCH] xen: sched: rtds: refactor code
Hi Tianyang,

On Wed, May 11, 2016 at 11:20 AM, Tianyang Chen wrote:
> No functional change:
> - Various coding style fixes
> - Added comments for UPDATE_LIMIT_SHIFT.
>
> Use non-atomic bit-ops:
> - Vcpu flags are checked and cleared atomically. Performance can be
>   improved with corresponding non-atomic versions since schedule.c
>   already has spin_locks in place.
>
> Suggested-by: Dario Faggioli

It's better to add a link to the thread that contains the suggestion.

> @@ -930,7 +936,7 @@ burn_budget(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now)
>     if ( svc->cur_budget <= 0 )
>     {
>         svc->cur_budget = 0;
> -        set_bit(__RTDS_depleted, &svc->flags);
> +        __set_bit(__RTDS_depleted, &svc->flags);
>     }
>
>     /* TRACE */
> @@ -955,7 +961,7 @@ burn_budget(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now)
>  * lock is grabbed before calling this function

The comment says "lock is grabbed before calling this function". IIRC, we use the __ prefix to indicate that we grab the lock before calling a function. This change then violates that convention.

>  */
> static struct rt_vcpu *
> -__runq_pick(const struct scheduler *ops, const cpumask_t *mask)
> +runq_pick(const struct scheduler *ops, const cpumask_t *mask)
> {
>     struct list_head *runq = rt_runq(ops);
>     struct list_head *iter;
> @@ -964,9 +970,9 @@ __runq_pick(const struct scheduler *ops, const cpumask_t *mask)
>     cpumask_t cpu_common;
>     cpumask_t *online;
>
> -    list_for_each(iter, runq)
> +    list_for_each ( iter, runq )
>     {
> -        iter_svc = __q_elem(iter);
> +        iter_svc = q_elem(iter);
>
>         /* mask cpu_hard_affinity & cpupool & mask */
>         online = cpupool_domain_cpumask(iter_svc->vcpu->domain);
> @@ -1028,7 +1034,7 @@ rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
>     }
>     else
>     {
> -        snext = __runq_pick(ops, cpumask_of(cpu));
> +        snext = runq_pick(ops, cpumask_of(cpu));
>         if ( snext == NULL )
>             snext = rt_vcpu(idle_vcpu[cpu]);

Meng
Re: [Xen-devel] [TESTDAY] Test report - xl sched-rtds
On Fri, May 13, 2016 at 5:31 AM, Wei Liu wrote:
> On Thu, May 12, 2016 at 02:00:06PM -0500, Chong Li wrote:
>> * Hardware:
>>   CPU: Intel Core2 Quad Q9400
>>   Total Memory: 2791088 kB
>>
>> * Software:
>>   Ubuntu 14.04
>>   Linux kernel: 3.13.0-68
>>
>> * Guest operating systems:
>>   Ubuntu 14.04 (PV)
>>
>> * Functionality tested:
>>   xl sched-rtds (for set/get per-VCPU parameters)
>>
>> * Comments:
>>   All examples about "xl sched-rtds" in the xl manual page
>>   <http://xenbits.xen.org/docs/unstable/man/xl.1.html#DOMAIN-SUBCOMMANDS>
>>   have been tested, and all run successfully.
>>
>>   If users type in wrong parameters (e.g., budget > period), the
>>   error/warning messages are returned correctly as well.
>>
> Good, so RTDS works as expected. Thanks for your report.

Hi Wei,

I'd like to share some of my experience with the improved RTDS scheduler. It is not a formal report, but hopefully it is some useful information.

I have been using the improved RTDS in staging for a while, and I haven't seen any weird issue. Because I also modified other parts of Xen a bit, my testing does not cover plain xen 4.7-rc2; that's why we had Chong test xen 4.7-rc2. Thank you very much, Chong, for your nice test report! :-)

The workload types I run are:
1) Compile the Linux or Xen kernels in parallel. The number of compile jobs is usually double the number of cores allocated to dom0.
2) Run CPU-intensive or memory-intensive tasks, which access a large array (a sketch of such a task follows this message).

Best,
Meng
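[Editorial note: for concreteness, the memory-intensive workload mentioned above is typically a loop walking an array larger than the last-level cache. The program below is an illustrative sketch under that assumption, not the actual test program used:

    #include <stddef.h>

    #define ARRAY_BYTES (256UL * 1024 * 1024)   /* assumed larger than the LLC */

    static volatile unsigned char big[ARRAY_BYTES];

    int main(void)
    {
        unsigned long sum = 0;

        /* Touch one byte per 64-byte cache line, repeatedly, to keep
         * the memory subsystem (rather than the ALUs) busy. */
        for ( unsigned iter = 0; iter < 100; iter++ )
            for ( size_t i = 0; i < ARRAY_BYTES; i += 64 )
                sum += big[i];

        return (int)(sum & 0xff);   /* keep the loop from being optimized away */
    }
]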
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Julien and Dushyant,

>>>> (XEN) DOM0: [0.00] irq: no irq domain found for /interrupt-controller !
>>>> (XEN) DOM0: [0.00] irq: no irq domain found for /interrupt-controller !
>>>> (XEN) DOM0: [0.00] irq: no irq domain found for /interrupt-controller !
>>>> (XEN) DOM0: [0.00] arch_timer: No interrupt available, giving up
>>>
>>> It looks like to me that Xen is not recreating the device-tree correctly. I
>>> would look into the kernel to find what is expected.
>>
>> This looks like a possible bug (or some missing feature) in Xen's
>> device tree creation which could take some time to handle, so if I
>> could be of any more help to you with this issue please let me know.
>
> There was a conversation on #xen-arm a few days ago about this problem.

Is there a way that we can see the conversation on #xen-arm? I hope to better understand the problem.

> Xen doesn't correctly recreate the GIC node, which results in a loop between
> the interrupt controllers. Can you try the below patch?
>
> http://dev.ktemkin.com/misc/xenarm-gic-parents.patch

It seems this link is invalid now... Has this patch been upstreamed?

Hi Dushyant,
Could you help repost this patch in this email if it's not that large? (Since we used the same repo, which is IanC's, it may be even better if you could kindly share the patch based on the tegra-tk1-jetson-v1 branch of Ian's repo?)

Hi Julien,
Do you have some suggestions on how we can debug and fix the issue related to the device tree? I saw that there may still be some issues with the NVIDIA devices, as Dushyant described, after he applied the patch. Right now, I have the exact same board as Dushyant, so I think I may encounter the exact same issue as he did. I'm wondering if there is some documentation/tutorial/notes we can learn from about how to debug such issues.

Thank you both very much for your help and time!

Best Regards,
Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Dushyant,

On Sat, May 14, 2016 at 1:36 PM, Dushyant Behl wrote:
> Hey Meng,
>
> On Sat, May 14, 2016 at 7:39 AM, Meng Xu wrote:
>>> http://dev.ktemkin.com/misc/xenarm-gic-parents.patch
>>
>> It seems this link is invalid now...
>> Has this patch been upstreamed?
>>
>> Hi Dushyant,
>> Could you help repost this patch in this email if it's not that large?
>> (Since we used the same repo, which is IanC's, it may be even better
>> if you could kindly share the patch based on the tegra-tk1-jetson-v1
>> branch of Ian's repo?)
>
> The patch is attached with the mail.

Thank you so much for your help! I applied the patch and the kernel makes further progress in booting. I'm replying to your last email about the issue I'm facing, which seems not the same as what you saw.

Best Regards,
Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
Hi Dushyant,

>>> On Thu, Mar 17, 2016 at 8:22 PM, Julien Grall wrote:
>>>> On 14/03/16 14:19, Dushyant Behl wrote:
>>>>>> Yes, I have enabled these configuration parameters when compiling linux -
>>>>
>>>> The list of options looks good to me. I guess Linux is crashing before setting
>>>> up the console. Can you apply the below to Linux and post the log here?
>>>
>>> I applied your patch to Linux but still there is no output from the kernel.
>>>
>>> But I have found the location of the problem. I have a debugger attached
>>> to the Jetson board, and using that I was able to find out that Linux is
>>> failing while initializing the Tegra timer.
>>>
>>> The call stack at the time of failing is -
>>>
>>> - prefetchw (inline)
>>>   arch_spin_lock (inline)
>>>   do_raw_spin_lock_flags (inline)
>>>   __raw_spin_lock_irqssave (inline)
>>>   raw_spin_lock_irq_save (lock = 0xC0B746F0)
>>> - of_get_parent (node = 0xA1D3)
>>> - of_get_address (dev = 0xDBBABC30, index = 0, size = 0xC0A83F30)
>>> - of_address_to_resource(dev = 0xDBBABC30, index = 0, r = 0xC0A83F50)
>>> - of_iomap (np = 0xDBBABC30, index = 0)
>>> - tegra20_init_timer (np = 0xDBBABC30)
>>> - clocksource_of_init()
>>> - start_kernel()
>>>
>>> After this Linux jumps to the floating point exception handler and then to
>>> undefined instruction and fails.
>>
>> I don't know why Linux is receiving a floating point exception. However,
>> DOM0 must not use the tegra timer as it doesn't support virtualization.
>>
>> You need to ensure that DOM0 will use the arch timer instead. Xen provides
>> some facilities to blacklist a device tree node (see blacklist dev in
>> arm/platforms/tegra.c).
>
> I have blacklisted the tegra20_timer

I guess you blacklisted "tegra20-timer" (which uses "-" instead of "_"), as shown in the following patch. Am I right?

diff --git a/xen/arch/arm/platforms/tegra.c b/xen/arch/arm/platforms/tegra.c
index 5ec9dda..8477ad1 100644
--- a/xen/arch/arm/platforms/tegra.c
+++ b/xen/arch/arm/platforms/tegra.c
@@ -431,6 +431,7 @@ static const struct dt_device_match tegra_blacklist_dev[] __initconst =
      * UART to dom0, so don't map any of them.
      */
     DT_MATCH_COMPATIBLE("nvidia,tegra20-uart"),
+    DT_MATCH_COMPATIBLE("nvidia,tegra20-timer"),
     { /* sentinel */ },
 };

Thanks and Best Regards,
Meng
Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1
> a: Failed to get supply 'avdd': -517
> [8.833857] tegra-ahci 70027000.sata: Failed to get regulators
> [8.840805] input: gpio-keys as /devices/soc0/gpio-keys/input/input0
> [8.846820] hctosys: unable to open rtc device (rtc0)
> [8.847331] sdhci-tegra 700b0400.sdhci: Got CD GPIO
> [8.847369] sdhci-tegra 700b0400.sdhci: Got WP GPIO
> [8.847471] mmc1: Unknown controller version (3). You may experience problems.
> [8.851403] sdhci-tegra 700b0400.sdhci: No vmmc regulator found
> [8.852328] tegra-snd-rt5640 sound: ASoC: CODEC DAI rt5640-aif1 not registered
> [8.852340] tegra-snd-rt5640 sound: snd_soc_register_card failed (-517)
> [8.854009] tegra-pcie 1003000.pcie-controller: 2x1, 1x1 configuration
> [8.854062] tegra-pcie 1003000.pcie-controller: Failed to get supply 'avddio-pex': -517
> [8.855019] reg-fixed-voltage regulators:regulator@11: Failed to resolve vin-supply for +1.05V_RUN_AVDD_HDMI_PLL
> [8.855030] tegra-hdmi 5428.hdmi: failed to get PLL regulator
> [8.856050] tegra-ahci 70027000.sata: Failed to get supply 'avdd': -517
> [8.856059] tegra-ahci 70027000.sata: Failed to get regulators
> [8.951051] +12V_SATA: disabling
> [8.952391] +5V_SATA: disabling
> [8.955596] +5V_HDMI_CON: disabling
> [8.959166] +1.05V_RUN_AVDD_HDMI_PLL: disabling
> [8.963742] +USB0_VBUS_SW: disabling
> [8.967400] +3.3V_AVDD_HDMI_

The last several dom0 log messages are:

[5.398512] sdhci: Copyright(c) Pierre Ossman
<6>sdhci-pltfm: SDHCI platform and OF driver helper
[5.398574] sdhci-pltfm: SDHCI platform and OF driver helper
sdhci-tegra 700b0400.sdhci: Got CD GPIO
[5.399032] sdhci-tegra 700b0400.sdhci: Got CD GPIO
sdhci-tegra 700b0400.sdhci: Got WP GPIO
[5.399109] sdhci-tegra 700b0400.sdhci: Got WP GPIO
<3>mmc0: Unknown controller version (3). You may experience problems.
[5.399231] mmc0: Unknown controller version (3). You may experience problems.
sdhci-tegra 700b0400.sdhci: No vmmc regulator found
[5.399443] sdhci-tegra 700b0400.sdhci: No vmmc regulator found
<3>mmc0: Unknown controller version (3). You may experience problems.
[5.399731] mmc0: Unknown controller version (3). You may experience problems.
sdhci-tegra 700b0600.sdhci: No vmmc regulator found
[5.399868] sdhci-tegra 700b0600.sdhci: No vmmc regulator found
sdhci-tegra 700b0600.sdhci: No vqmmc regulator found
[5.399931] sdhci-tegra 700b0600.sdhci: No vqmmc regulator found
<4>mmc0: Invalid maximum block size, assuming 512 bytes
[5.33] mmc0: Invalid maximum block size, assuming 512 bytes
<6>mmc0: SDHCI controller on 700b0600.sdhci [700b0600.sdhci] using ADMA 64-bit
[5.446794] mmc0: SDHCI controller on 700b0600.sdhci [700b0600.sdhci] using ADMA 64-bit
<6>usbcore: registered new interface driver usbhid
[5.448020] usbcore: registered new interface driver usbhid
<6>usbhid: USB HID core driver
[5.448075] usbhid: USB HID core driver
<6>cfg80211: Calling CRDA to update world regulatory domain
[6.536872] cfg80211: Calling CRDA to update world regulatory domain
tegra-hda 7003.hda: azx_get_response timeout, switching to polling mode: last cmd=0x300f0001
[8.526885] tegra-hda 7003.hda: azx_get_response timeout, switching to polling mode: last cmd=0x300f0001
<6>input: tegra-hda HDMI/DP,pcm=3 as /devices/soc0/7003.hda/sound/card0/input0
[8.968688] input: tegra-hda HDMI/DP,pcm=3 as /devices/soc0/7003.hda/sound/card0/input0
<6>cfg80211: Calling CRDA to update world regulatory domain
[9.696855] cfg80211: Calling CRDA to update world regulatory domain
tegra-i2c 7000c000.i2c:

From Dushyant's log, I saw that the "tegra-i2c 7000c000.i2c" messages finally time out. However, in my case, I didn't see the timeout happen.

Thanks and Best Regards,
Meng
Re: [Xen-devel] Problem Reading from XenStore in DomU
On Sun, May 15, 2016 at 3:54 PM, Dagaen Golomb wrote:
>>> Hi All,
>>>
>>> I'm having an interesting issue. I am working on a project that
>>> requires me to share memory between dom0 and domUs. I have this
>>> successfully working using the grant table and the XenStore to
>>> communicate grefs.
>>>
>>> My issue is this. I have one domU running Ubuntu 12.04 with a default
>>> 3.8.x kernel that has no issue reading or writing from the XenStore.
>>> My work also requires some kernel modifications, and we have made
>>> these changes in the 4.1.0 kernel. In particular, we've only added a
>>> simple hypercall. This modified kernel is what dom0 is running, on top
>>> of Xen 4.7 rc1.
>>>
>>> The guest I mentioned before with the default kernel can still read
>>> and write the XenStore just fine when on Xen 4.7 rc1 and with dom0
>>> running our kernel.
>>>
>>> The issue I'm having is with another newly created guest (i.e., it is
>>> not a copy of the working one; this is because I needed more space and
>>> couldn't expand the original disk image). It is also running Ubuntu
>>> 12.04 but has been upgraded to our modified kernel. On this guest I
>>> can write to the XenStore, and see that the writes were indeed
>>> successful using xenstore-ls in dom0. However, when this same guest
>>> attempts to read from the XenStore, it doesn't return an error code
>>> but instead just blocks indefinitely. I've waited many minutes to
>>> make sure it's not just blocking for a while; it appears like it will
>>> block forever. The block is happening when I start the transaction.
>>> I've also tried not using a transaction, in which case it blocks on
>>> the read itself.
>>>
>>> I have an inkling this may be something as simple as a configuration
>>> issue, but I can't seem to find anything. Also, the fact that writes
>>> work fine but reads do not is perplexing me.
>>>
>>> Any help would be appreciated!
>>
>> Nothing should block like this. Without seeing your patch, I can't
>> comment as to whether you have accidentally broken things.
>
> I don't see any way the patch could be causing this. It simply adds
> another function and case clause to an already-existing hypercall, and
> when you call the hypercall with that option it returns the current
> budget of a passed-in vcpu. It doesn't even come close to touching
> grant mechanics, and doesn't modify any state - it simply returns a
> value that previously was hidden in the kernel.
>
>> Other avenues of investigation are to look at what the xenstored process
>> is doing in dom0 (is it idle? or is it spinning?), and to look in the
>> xenstored log file to see if anything suspicious occurs.
>
> I tried booting into older, stock kernels. They all work with the
> read. However, I do not see why the kernel modification would be the
> issue as described above. I also have the dom0 running this kernel and
> it reads and writes the XenStore just dandy. Are there any kernel
> config issues that could do this?

What if you use the .config of the kernel in the working domU to compile the kernel in the not-working domU? I assume you used the same kernel source code for both domUs.

Best Regards,
Meng
Re: [Xen-devel] Problem Reading from XenStore in DomU
On Sun, May 15, 2016 at 9:41 PM, Dagaen Golomb wrote: >> On 5/15/16 8:28 PM, Dagaen Golomb wrote: >>>> On 5/15/16 11:40 AM, Dagaen Golomb wrote: >>>>> Hi All, >>>>> >>>>> I'm having an interesting issue. I am working on a project that >>>>> requires me to share memory between dom0 and domUs. I have this >>>>> successfully working using the grant table and the XenStore to >>>>> communicate grefs. >>>>> >>>>> My issue is this. I have one domU running Ubuntu 12.04 with a default >>>>> 3.8.x kernel that has no issue reading or writing from the XenStore. >>>>> My work also requires some kernel modifications, and we have made >>>>> these changes in the 4.1.0 kernel. In particular, we've only added a >>>>> simple hypercall. This modified kernel is what dom0 is running, on top >>>>> of Xen 4.7 rc1. >>>> >>>> Without reading the rest of the thread but seeing the kernel versions. >>>> Can you check how you're communicating to xenstore? Is it via >>>> /dev/xen/xenbus or /proc/xen/xenbus? Anything after 3.14 will give you >>>> deadlocks if you try to use /proc/xen/xenbus. Xen 4.6 and newer should >>>> prefer /dev/xen/xenbus. Same thing can happen with privcmd but making >>>> that default didn't land until Xen 4.7. Since you're on the right >>>> versions I expect you're using /dev/xen/xenbus but you never know. >>> >>> How do I know which is being used? /dev/xen/xenbus is there and so is >>> /proc/xen/xenbus. Could this be a problem with header version >>> mismatches or something similar? I'm using the xen/xenstore.h header >>> file for all of my xenstore interactions. I'm running Xen 4.7 so it >>> should be in /dev/, and the old kernel is before 3.14 but the new one >>> is after, but I would presume the standard headers are updated to >>> account for this. Is there an easy way to check for this? Also, would >>> the same issue cause writes to fail? Because writes from the same >>> domain work fine, and appear to other domains using xenstore-ls. >>> >>> Regards, >>> Dagaen Golomb >>> >> >> Use strace on the process and see what gets opened. > > Ah, of course. It seems both the working and non-working domains are > using /proc/... Then according to Doug, "Anything after 3.14 will give you > deadlocks if you try to use /proc/xen/xenbus.". Maybe the non-working domU > uses a kernel version after 3.14? Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
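To see which device a xenstore client actually opens, something along these lines should do (illustrative only; xenstore-ls is used here simply because it already appears earlier in the thread):

$ strace -e trace=open,openat xenstore-ls 2>&1 | grep xenbus

If the output shows the legacy path, e.g. open("/proc/xen/xenbus", O_RDWR) = 3, the client is going through the /proc device that can deadlock on kernels after 3.14.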
Re: [Xen-devel] Problem Reading from XenStore in DomU
Hi Doug, Do you happen to know if Xen has an existing mechanism to make /dev/xen/xenbus the default device for xenstored? On Sun, May 15, 2016 at 11:30 PM, Dagaen Golomb wrote: >> >>>>> Hi All, >> >>>>> >> >>>>> I'm having an interesting issue. I am working on a project that >> >>>>> requires me to share memory between dom0 and domUs. I have this >> >>>>> successfully working using the grant table and the XenStore to >> >>>>> communicate grefs. >> >>>>> >> >>>>> My issue is this. I have one domU running Ubuntu 12.04 with a >> >>>>> default >> >>>>> 3.8.x kernel that has no issue reading or writing from the XenStore. >> >>>>> My work also requires some kernel modifications, and we have made >> >>>>> these changes in the 4.1.0 kernel. In particular, we've only added a >> >>>>> simple hypercall. This modified kernel is what dom0 is running, on >> >>>>> top >> >>>>> of Xen 4.7 rc1. >> >>>> >> >>>> Without reading the rest of the thread but seeing the kernel >> >>>> versions. >> >>>> Can you check how you're communicating to xenstore? Is it via >> >>>> /dev/xen/xenbus or /proc/xen/xenbus? Anything after 3.14 will give >> >>>> you >> >>>> deadlocks if you try to use /proc/xen/xenbus. Xen 4.6 and newer >> >>>> should >> >>>> prefer /dev/xen/xenbus. Same thing can happen with privcmd but making >> >>>> that default didn't land until Xen 4.7. Since you're on the right >> >>>> versions I expect you're using /dev/xen/xenbus but you never know. >> >>> >> >>> How do I know which is being used? /dev/xen/xenbus is there and so is >> >>> /proc/xen/xenbus. Could this be a problem with header version >> >>> mismatches or something similar? I'm using the xen/xenstore.h header >> >>> file for all of my xenstore interactions. I'm running Xen 4.7 so it >> >>> should be in /dev/, and the old kernel is before 3.14 but the new one >> >>> is after, but I would presume the standard headers are updated to >> >>> account for this. Is there an easy way to check for this? Also, would >> >>> the same issue cause writes to fail? Because writes from the same >> >>> domain work fine, and appear to other domains using xenstore-ls. >> >>> >> >>> Regards, >> >>> Dagaen Golomb >> >>> >> >> >> >> Use strace on the process and see what gets opened. >> > >> > Ah, of course. It seems both the working and non-working domains are >> > using /proc/... >> >> Then according to Doug, "Anything after 3.14 will give you >> > deadlocks if you try to use /proc/xen/xenbus.". Maybe the non-working >> > domU uses a kernel version after 3.14. > > It does, being the custom kernel on version 4.1.0. But Dom0 uses this same > exact kernel and reads/writes just fine! The only solution if this is indeed > the problem appears to be changing the kernel source we build on or some > hacky method such as symlinking /proc/.. to /dev/..; there has to be an > elegant, real solution to this... Hi Dagaen, Maybe we can try to create a symlink from /proc/xen/xenbus to /dev/xen/xenbus and see if it works. I'm not sure whether you can just set the environment variable XENSTORED_PATH to /dev/xen/xenbus to make it the default choice, but there's no harm in trying (a quick test is sketched below). BTW, this is a useful link to refer to: http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg01679.html Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
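If libxenstore in the guest honours the XENSTORED_PATH environment variable (the thread linked above suggests it does), a quick experiment that avoids any symlink hackery could be as simple as (untested sketch):

$ XENSTORED_PATH=/dev/xen/xenbus xenstore-ls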
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1
On Mon, May 16, 2016 at 7:33 AM, Julien Grall wrote: > > On 15/05/16 20:35, Meng Xu wrote: >> >> Hi Julien and Ian, > > > Hello Meng, Hi Julien, > >> >> I'm trying to run Xen on the NVIDIA Jetson TK1 board. (Right now, Xen does >> not support the Jetson board officially. But I'm thinking it may be >> very interesting and useful to see it happen, since it has a GPU inside, >> which is quite popular in automotive.) >> >> Now I have encountered some problems booting dom0 in the Xen environment. I want >> to debug the issues and maybe fix them, but I'm not so sure how >> I should debug the issue efficiently. I'd really appreciate it if >> you could advise me a little bit on how to approach it. >> :-) >> >> ---Below are the details >> >> I noticed that Dushyant from IBM also tried to run Xen on the Jetson >> board. (http://www.gossamer-threads.com/lists/xen/devel/422519). I >> used the same Linux kernel (Jan Kiszka's development tree - >> http://git.kiszka.org/linux.git/, branch queues/assorted) and Ian's >> Xen repo. with the hack for the Jetson board. I can see the dom0 kernel >> boot to some extent and then "stall/spin" before it >> fully boots up. >> >> In order to figure out the possible issue, I booted the exact same Linux >> kernel natively on one CPU and collected the boot log >> information in [1]. I also booted the same Linux kernel as dom0 in the Xen >> environment and collected the boot log information in [2]. >> >> In the Xen environment, dom0 hangs after the following message >> [ 10.541010] NET: Registered protocol family 10 >> 6mip6: Mobile IPv6 >> [ 10.542510] mi >> >> In the native environment, the kernel has the following log after initializing >> NET. >> [2.934693] NET: Registered protocol family 10 >> [2.940611] mip6: Mobile IPv6 >> [2.943645] sit: IPv6 over IPv4 tunneling driver >> [2.951303] NET: Registered protocol family 17 >> [2.955800] NET: Registered protocol family 15 >> [2.960257] can: controller area network core (rev 20120528 abi 9) >> [2.966617] NET: Registered protocol family 29 >> [2.971098] can: raw protocol (rev 20120528) >> [2.975384] can: broadcast manager protocol (rev 20120528 t) >> [2.981088] can: netlink gateway (rev 20130117) max_hops=1 >> [2.986734] Bluetooth: RFCOMM socket layer initialized >> [2.991979] Bluetooth: RFCOMM ver 1.11 >> [2.995757] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 >> [3.001109] Bluetooth: BNEP socket layer initialized >> [3.006089] Bluetooth: HIDP (Human Interface Emulation) ver 1.2 >> [3.012052] Bluetooth: HIDP socket layer initialized >> [3.017894] Registering SWP/SWPB emulation handler >> [3.029675] tegra-pcie 1003000.pcie-controller: 2x1, 1x1 configuration >> [3.036586] +3.3V_SYS: supplied by +VDD_MUX >> [3.040857] +3.3V_LP0: supplied by +3.3V_SYS >> [3.045509] +1.35V_LP0(sd2): supplied by +5V_SYS >> [3.050201] +1.05V_RUN_AVDD: supplied by +1.35V_LP0(sd2) >> [3.057131] tegra-pcie 1003000.pcie-controller: probing port 0, using 2 >> lanes >> [3.066479] tegra-pcie 1003000.pcie-controller: Slot present pin >> change, signature: 0008 >> >> I'm suspecting that my dom0 kernel hangs when it tries to initialize >> "can: controller area network core ". However, from Dushyant's post at >> http://www.gossamer-threads.com/lists/xen/devel/422519, it seems >> Dushyant's dom0 kernel hangs when it tries to initialize pci_bus. (The >> linux config I used may be different from Dushyant's. That could be >> the reason.) >> >> Right now, the system just hangs and has no output indicating what the >> problem could be. 
Although there are a lot of error messages before >> the system hangs, I'm not that sure if I should start with solving all of >> those error messages. Maybe some errors can be ignored? >> >> My questions are: >> 1) Do you have any suggestions on how to see more information about the >> reason why the dom0 hangs? > > Have you tried to dump the registers using the Xen console (CTRL-x 3 times then > 0) and see where it gets stuck? I tried to type CTRL-x 3 times and then 0, but nothing happens... :-( Just to confirm: once the system got stuck, I directly typed Ctrl-x three times on the host's screen. Am I correct? Maybe the serial console is not correctly set up? The serial console configuration I used is as follows, could you have a quick look to see
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1
Hi Julien, On Mon, May 16, 2016 at 1:33 PM, Julien Grall wrote: > (CC Kyle who is also working on Tegra?) > > Hi Meng, > > Many people are working on Nvidia platform with different issues :/. I have > CCed another person which IIRC is also working on it. Sure. It's good to know others are also interested in this platform. It will be more useful to fix it... :-) > > On 16/05/16 17:33, Meng Xu wrote: >> >> On Mon, May 16, 2016 at 7:33 AM, Julien Grall >> wrote: >>> >>> >>> On 15/05/16 20:35, Meng Xu wrote: >>>> >>>> >>>> I'm trying to run Xen on NVIDIA Jetson TK1 board. (Right now, Xen does >>>> not support the Jetson board officially. But I'm thinking it may be >>>> very interesting and useful to see it happens, since it has GPU inside >>>> which is quite popular in automotive.) >>>> >>>> Now I encountered some problem to boot dom0 in Xen environment. I want >>>> to debug the issues and maybe fix the issues, but I'm not so sure how >>>> I should debug the issue more efficiently. I really appreciate it if >>>> you advise me a little bit about the method of how to fix the issue. >>>> :-) >>>> >>>> ---Below is the details >>>> >>>> I noticed the Dushyant from IBM also tried to run Xen on the Jetson >>>> board. (http://www.gossamer-threads.com/lists/xen/devel/422519). I >>>> used the same Linux kernel (Jan Kiszka's development tree - >>>> http://git.kiszka.org/linux.git/, branch queues/assorted) and Ian's >>>> Xen repo. with the hack for Jetson board. I can see the dom0 kernel >>>> can boot to some extend and then "stall/spin" before the dom0 kernel >>>> fully boot up. >>>> >>>> In order to figure out the possible issue, I boot the exact same Linux >>>> kernel in native environment on one CPU and collected the boot log >>>> information in [1]. I also boot the same Linux kernel as dom0 in Xen >>>> environment and collected the boot log information in [2]. >>>> >>>> In Xen environment, dom0 hangs after the following message >>>> [ 10.541010] NET: Registered protocol family 10 >>>> 6mip6: Mobile IPv6 >>>> [ 10.542510] mi >>>> >>>> In native environment, the kernel has the following log after >>>> initializing NET. 
>>>> [2.934693] NET: Registered protocol family 10 >>>> [2.940611] mip6: Mobile IPv6 >>>> [2.943645] sit: IPv6 over IPv4 tunneling driver >>>> [2.951303] NET: Registered protocol family 17 >>>> [2.955800] NET: Registered protocol family 15 >>>> [2.960257] can: controller area network core (rev 20120528 abi 9) >>>> [2.966617] NET: Registered protocol family 29 >>>> [2.971098] can: raw protocol (rev 20120528) >>>> [2.975384] can: broadcast manager protocol (rev 20120528 t) >>>> [2.981088] can: netlink gateway (rev 20130117) max_hops=1 >>>> [2.986734] Bluetooth: RFCOMM socket layer initialized >>>> [2.991979] Bluetooth: RFCOMM ver 1.11 >>>> [2.995757] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 >>>> [3.001109] Bluetooth: BNEP socket layer initialized >>>> [3.006089] Bluetooth: HIDP (Human Interface Emulation) ver 1.2 >>>> [3.012052] Bluetooth: HIDP socket layer initialized >>>> [3.017894] Registering SWP/SWPB emulation handler >>>> [3.029675] tegra-pcie 1003000.pcie-controller: 2x1, 1x1 >>>> configuration >>>> [3.036586] +3.3V_SYS: supplied by +VDD_MUX >>>> [3.040857] +3.3V_LP0: supplied by +3.3V_SYS >>>> [3.045509] +1.35V_LP0(sd2): supplied by +5V_SYS >>>> [3.050201] +1.05V_RUN_AVDD: supplied by +1.35V_LP0(sd2) >>>> [3.057131] tegra-pcie 1003000.pcie-controller: probing port 0, using >>>> 2 lanes >>>> [3.066479] tegra-pcie 1003000.pcie-controller: Slot present pin >>>> change, signature: 0008 >>>> >>>> I'm suspecting that my dom0 kernel hangs when it tries to initialize >>>> "can: controller area network core ". However, from Dushyant's post at >>>> http://www.gossamer-threads.com/lists/xen/devel/422519, it seems >>>> Dushyant's dom0 kernel hangs when it tries to initialize pci_bus. (The >>>> linux config I used may be different form Dushyant's. That cou
Re: [Xen-devel] Question about running Xen on NVIDIA Jetson-TK1
On Mon, May 16, 2016 at 7:27 PM, Kyle Temkin wrote: > Hi, Meng: > Hi Kyle, > Julien is correct-- a coworker and I are working on support for Tegra > SoCs, and we've made pretty good progress; there's work yet to be > done, but we have dom0 and guests booting on the Jetson TK1, Jetson > TX1, and the Google Pixel C. We hope to get a patch set out soon-- > unfortunately, our employer has to take some time to verify that > everything's okay to be open-sourced, so I can't send out our > work-in-progress just yet. We'll have an RFC patchset out soon, I > hope! Looking forward to your RFC patchset... Could you please cc me when you send it out? I'd really love to have a look at (and maybe review) it. > > There are two main hardware differences that cause Tegra SoCs to have > trouble with Xen: > > - The primary interrupt controller for those systems isn't a single > GIC, as Xen expects. Instead, there's an NVIDIA Legacy Interrupt > Controller (LIC, or ICTLR) that gates all peripheral interrupts before > passing them to a standard GICv2. This interrupt controller has to be > programmed to ensure Xen can receive interrupts from the hardware > (e.g. serial), programmed to ensure that interrupts for pass-through > devices are correctly unmasked, and virtualized so dom0 can program > the "sections" related to interrupts not being routed to Xen or to a > domain for hardware passthrough. > > - The serial controller on the Tegra SoCs doesn't behave in the same > way as most NS16550-compatibles; it actually adheres to the NS16550 > spec a little more rigidly than most compatible controllers. A > coworker (Chris Patterson, cc'd) figured out what was going on; from > what I understand, most 16550s generate the "transmit ready" interrupt > once, when the device first can accept new FIFO entries. Both the > original 16550 and the Tegra implementation generate the "transmit > ready" interrupt /continuously/ when there's space available in the > FIFO, slewing the CPU with a stream of constant interrupts. I see. Thank you very much for explaining this so clearly! :-) > > What you're seeing is likely a symptom of the first difference. In > your logs, you see messages that indicate Xen is having trouble > correctly routing IRQs that are parented by the legacy interrupt > controller: > >> irq 0 not connected to primary controller.Connected to >> /interrupt-controller@60004000 Right. I see the root issue now. Thank you so much for pointing it out! > > The issue here is that Xen is currently explicitly opting not to route > legacy-interrupt-controller interrupts, as they don't belong to the > primary GIC. As a result, these interrupts never make it to dom0. The > logic that needs to be tweaked is here: > > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/domain_build.c;h=00dc07af637b67153d33408f34331700dff84f93;hb=HEAD#l1137 > > We re-write this logic in our forthcoming patch-set to be more > general. As an interim workaround, you might opt to rewrite that logic > so LIC interrupts (which have an interrupt-parent compatible with > "tegra124-ictlr", in your case) can be routed by Xen, as well. Off the > top of my head, a workaround might look like: > > /* > * Don't map IRQ that have no physical meaning > * ie: IRQ whose controller is not the GIC > */ > - if ( rirq.controller != dt_interrupt_controller ) > +if ( (rirq.controller != dt_interrupt_controller) && > (!dt_device_is_compatible(rirq.controller, "tegra124-ictlr") ) It should have "nvidia" before "tegra124-ictlr". 
;-) After changing it to !dt_device_is_compatible(rirq.controller, "nvidia,tegra124-ictlr"), dom0 boots up~~~ :-D > > Of course, that's off-the-cuff code I haven't tried, but hopefully it > should help to get you started. Sure! It does work and gets me started! I really appreciate your help and explanation! Looking forward to your RFC patch set. :-) Thank you again for your help and time on this issue! It helps a lot! Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
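Putting the two corrections together, the interim workaround would look roughly like this in xen/arch/arm/domain_build.c (a sketch of the temporary hack discussed above, not Kyle's final patch set):

/*
 * Don't map IRQs that have no physical meaning, i.e. IRQs whose
 * controller is not the GIC -- except those parented by the Tegra
 * legacy interrupt controller, which are let through as an interim hack.
 */
if ( rirq.controller != dt_interrupt_controller &&
     !dt_device_is_compatible(rirq.controller, "nvidia,tegra124-ictlr") )
    continue;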
Re: [Xen-devel] [PATCH 0/2] xen: sched: rtds refactor code
On Sun, May 15, 2016 at 7:54 PM, Tianyang Chen wrote: > The first part of this patch series aims at fixing coding style issues > for control structures. Because locks are grabbed in schedule.c before > hooks are called, underscores in front of function names are removed. > > The second part replaces atomic bit-ops with non-atomic ones since locks > are grabbed in schedule.c. > > Discussions: > http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg01528.html > http://www.gossamer-threads.com/lists/xen/devel/431251?do=post_view_threaded#431251 > > Tianyang Chen (2): > xen: sched: rtds refactor code > xen: sched: rtds: use non-atomic bit-ops > > xen/common/sched_rt.c | 122 > ++--- > 1 file changed, 64 insertions(+), 58 deletions(-) > Tianyang, Thanks for the patch! One comment for the future: please add the version number in the title so that we can easily tell it is a new patch. :-) Best, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen: sched: rtds refactor code
On Sun, May 15, 2016 at 7:54 PM, Tianyang Chen wrote: > No functional change: > -Various coding style fix > -Added comments for UPDATE_LIMIT_SHIFT. > > Signed-off-by: Tianyang Chen Reviewed-by: Meng Xu ------- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen: sched: rtds: use non-atomic bit-ops
On Sun, May 15, 2016 at 7:54 PM, Tianyang Chen wrote: > Vcpu flags are checked and cleared atomically. Performance can be > improved with corresponding non-atomic versions since schedule.c > already has spin_locks in place. > > Signed-off-by: Tianyang Chen Reviewed-by: Meng Xu Thanks, Meng ------- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Embedded-pv-devel] [PATCH RFC 00/18] System adjustment to customer needs.
Hi Andrii, On Wed, May 18, 2016 at 12:32 PM, Andrii Anisov wrote: > This series of RFC patches is from the currently ongoing production project. > This patch series presents changes needed to fit the system into > customer requirements as well as to work around limitations of the > Jacinto6 SoC. IMHO, it would be better, if possible, to describe the exact customer requirements this patch series tries to satisfy. I'm curious about what the requirements are and whether the requirements are general enough for many other customers. :-) Similarly, what are the limitations of the Jacinto6 SoC that need to be worked around? If the board is not supported by Xen, can we say Xen will support the board with the workaround? Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: sched: avoid races on time values read from NOW()
On Thu, May 19, 2016 at 4:11 AM, Dario Faggioli wrote: > or (even in cases where there is no race, e.g., outside > of Credit2) avoid using a time sample which may be rather > old, and hence stale. > > In fact, we should only sample NOW() from _inside_ > the critical region within which the value we read is > used. If we don't, in case we have to spin for a while > before entering the region, when actually using it: > > 1) we will use something that, at the very least, is > not really "now", because of the spinning, > > 2) if someone else sampled NOW() during a critical > region protected by the lock we are spinning on, > and if we compare the two samples when we get > inside our region, our one will be 'earlier', > even if we actually arrived later, which is a > race. > > In Credit2, we see an instance of 2), in runq_tickle(), > when it is called by csched2_context_saved() as it samples > NOW() before acquiring the runq lock. This makes things > look like the time went backwards, and it confuses the > algorithm (there's even a d2printk() about it, which would > trigger all the time, if enabled). > > In RTDS, something similar happens in repl_timer_handler(), > and there's another instance in schedule() (in generic code), > so fix these cases too. > > While there, improve csched2_vcpu_wake() and rt_vcpu_wake() > a little as well (removing a pointless initialization, and > moving the sampling a bit closer to its use). These two hunks > entail no further functional changes. > > Signed-off-by: Dario Faggioli > --- > Cc: George Dunlap > Cc: Meng Xu > Cc: Wei Liu > --- Reviewed-by: Meng Xu The bug will cause incorrect budget accounting for one VCPU when the race occurs. Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
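In code, the pattern the patch enforces is roughly the following (schematic only; prv->lock stands in for whichever lock each hunk actually takes):

/* Racy: 'now' may be stale by the time the lock is finally acquired,
 * and may even be earlier than a sample taken by whoever held the
 * lock while we were spinning on it. */
s_time_t now = NOW();
spin_lock_irqsave(&prv->lock, flags);
/* ... compare 'now' with deadlines, burn budget, ... */
spin_unlock_irqrestore(&prv->lock, flags);

/* Fixed: sample NOW() only inside the critical region. */
spin_lock_irqsave(&prv->lock, flags);
now = NOW();
/* ... */
spin_unlock_irqrestore(&prv->lock, flags);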
Re: [Xen-devel] [PATCH for 4.7] xen: sched: avoid races on time values read from NOW()
On Thu, May 19, 2016 at 4:11 AM, Dario Faggioli wrote: > Hey Wei, > > Again, I'm using an otherwise unnecessary cover letter for my analysis about > <>. :-) > > I'd say yes, because the patch fixes an actual bug, in the form of a rather > subtle race condition, which was all but trivial to spot. I must say, though, > that I've only found the bug guilty of being particularly nasty if we use > Credit2. Actually, I'm quite sure it has an effect on RTDS too (although I > did > not trace that), but since both Credit2 and RTDS are still marked as > experimental in 4.7, one may think it's not worthwhile putting in something > like this to fix experimental only code. > > Just FYI, this bug is what was causing the issue I briefly chatted about on > IRC > with George, yesterday, i.e., it is what led Credit2 to emit (rather > aggressively, actually) the debug printks showed here: > > http://pastebin.com/gzYrNST5 In addition to the race condition on bare metal, I actually saw this when I debugged/ran Xen in VirtualBox. The situation is: if we have nested virtualization, or if we have heterogeneous cores which run at different speeds, the RTDS scheduler (maybe credit2 as well?) will have a problem with budget accounting. The "CPUs" of Xen are scheduled by the underlying hypervisor; one "CPU" of Xen could be slower than another, so its time appears to fall behind. We explicitly said, when RTDS was upstreamed in Xen 4.5, that RTDS would have incorrect budget accounting in nested virtualization (a schematic of the failure mode follows below). Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
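Concretely, the failure mode described above shows up in budget accounting along these lines (a schematic sketch, not the actual budget-burning code in sched_rt.c):

/* If 'now' was sampled before taking the lock, or was read on a
 * (virtual) pcpu whose clock lags the one that wrote last_start,
 * delta can come out negative and the vcpu is charged nothing for
 * time it actually consumed. */
delta = now - svc->last_start;
if ( delta <= 0 )
    return;
svc->cur_budget -= delta;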
Re: [Xen-devel] [Embedded-pv-devel] [PATCH RFC 0/6] Set of PV drivers used by production project
Hi Iurii, On Thu, May 19, 2016 at 10:37 AM, Iurii Mykhalskyi wrote: > These patches introduce a set of pv driver interfaces. Thank you very much for these pv driver interfaces! They will be useful for automotive applications, IMO. However, I do have some questions: I'm wondering how general the pv driver interfaces are. On which types of ARM boards (I assume this is for ARM) can they be used? Which ARM boards have you tested them on? What are the production use cases we are talking about here? Are you or GlobalLogic going to contribute the PV drivers as well? I'm looking forward to the PV drivers as well. :-) Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Question about the best practice to install two versions of Xen toolstack on the same machine
Hi all, I'm trying to install two versions of Xen, say Xen 4.6 and Xen 4.7-unstable, onto the same machine. I want them to exist at the same time, instead of letting one override the other. I'm thinking about this because sometimes I want to try out someone else's code which uses an older or newer version. But I also want to keep my current version of the Xen toolstack so that I won't need to reinstall everything again later. If I just use the following commands, the new installation of the toolstack will obviously override the old version's toolstack: $ ./configure $ make dist $ sudo make install (Right now, I just have to recompile my code after I try out someone else's code that has a different version. I can keep two versions of the Xen hypervisor and choose between them via grub2 entries, but I have to reinstall the toolstack.) My quick question is: has anyone tried to install two versions of the Xen toolstack on the same machine? Is there any documentation on the best practice for installing two versions of Xen onto the same machine? --- I had a look at ./configure's help. There are several options, each of which can specify a specific install path. However, I'm not that sure if I should configure every option to make it work. For example, it has --prefix and --exec-prefix to change the PREFIX from /usr/local to a user-defined path. However, there are also --bindir and --sbindir; I assume I should change them too, shouldn't I? In addition, should I specify --libexecdir for the program executables? I found one very old link at [1], but I doubt it still applies, since the Xen toolstack has changed a lot since Xen 4.1: http://old-list-archives.xenproject.org/xen-users/2009-09/msg00263.html Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
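As a starting point, a side-by-side install might look like this (an untested sketch; in practice more directories, e.g. --libdir and --libexecdir, may need pinning, and only one instance of daemons such as xenstored can run at a time):

$ ./configure --prefix=/opt/xen-4.6
$ make dist && sudo make install
$ export PATH=/opt/xen-4.6/bin:/opt/xen-4.6/sbin:$PATH
$ export LD_LIBRARY_PATH=/opt/xen-4.6/lib:$LD_LIBRARY_PATH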
Re: [Xen-devel] [Embedded-pv-devel] [PATCH RFC 00/18] System adjustment to customer needs.
On Thu, May 19, 2016 at 5:53 PM, Andrii Anisov wrote: > Meng, > Hi Andrii, Thank you very much for your explanation about the use case in your previous email! >>> If the board is not supported by Xen, can we say Xen will support the >>> board with the workaround? > > I would not say boards are supported by XEN (except earlyprintk). > Rather, architectures are supported in general, and SoCs are supported > with architecture-implementation-defined deviations (i.e. SMMU absence). Yes. I searched around for the "Jacinto 6" automotive processor. [1] It uses a Cortex-A15 processor... However, I tried the Arndale Octa board two years ago (http://www.arndaleboard.org/wiki/index.php/Main_Page). From my previous experience, the board may not be supported by Xen even though the processor it uses has the virtualization extensions... :-( That's why I asked if the board itself can run Xen. If the board can run Xen, I would like to buy one and try it out. :-) [1] http://www.ti.com/lit/ds/symlink/dra746.pdf Thanks and Best Regards, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] RT-Xen on ARM
Hi Andrii, On Tue, Aug 1, 2017 at 4:02 AM, Andrii Anisov wrote: > Hello Meng Xu, > > I've got back to this stuff. Sorry for the late response. I'm not sure if you have already solved this. > > > On 03.07.17 17:58, Andrii Anisov wrote: >> >> That's why we are going to keep configuration (of guests and workloads) >> close to [1] for evaluation, but on our target SoC. >> I'm wondering if there are known issues or specifics for ARM. >> >> [1] https://www.cis.upenn.edu/~linhphan/papers/emsoft14-rt-xen.pdf > > Currently I have a setup with dom0 and domU's with Litmus-RT. Great! > Following the > document I need workload tasks. > Maybe you have mentioned workload tasks sources you can share, so that would > shorten my steps. Sure. The workload we used in the paper is mainly a CPU-intensive task. We first calibrate a busy-loop of multiplications that runs for 1ms (see the calibration sketch below). Then, for a task that executes for exe ms, we simply let the task execute the 1ms busy loop exe times. It is also good to run the same task several times to make sure the task's execution time is stable across different runs. Sections 4.1 and 4.2 in [1] explain the whole experiment procedure. If you have any questions or confusion about a specific step, please feel free to let me know. We may schedule a meeting to clarify all the questions or confusions you may have. [1] https://www.cis.upenn.edu/~linhphan/papers/emsoft14-rt-xen.pdf Best regards, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
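A minimal calibration sketch (the names, the fixed trial count and the timing method are illustrative, not the exact code used for the paper; on older glibc, link with -lrt for clock_gettime):

#include <stdio.h>
#include <time.h>

static volatile unsigned long result; /* volatile so the loop is not optimised away */

/* Estimate how many multiply-accumulate iterations take 1ms on this machine. */
static unsigned long calibrate_iters_per_ms(void)
{
    const unsigned long trial = 10000000UL;
    unsigned long j;
    struct timespec t0, t1;
    double ms;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (j = 0; j < trial; j++)
        result += j * j;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    return (unsigned long)(trial / ms);
}

int main(void)
{
    printf("iterations per 1ms: %lu\n", calibrate_iters_per_ms());
    return 0;
}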
Re: [Xen-devel] RT-Xen on ARM
On Mon, Aug 21, 2017 at 4:07 AM, Andrii Anisov wrote: > > Hello Meng Xu, > > > On 18.08.17 23:43, Meng Xu wrote: >> >> Sections 4.1 and 4.2 in [1] explain the whole experiment procedure. >> If you have any questions or confusion about a specific step, please feel >> free to let me know. > > From the document it is not really clear if you ran one guest RT domain or > several simultaneously for your experiments. > We ran 4 VMs simultaneously. Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] RT-Xen on ARM
On Mon, Aug 21, 2017 at 4:16 AM, Andrii Anisov wrote: > On 21.08.17 11:07, Andrii Anisov wrote: >> Hello Meng Xu, >> >> On 18.08.17 23:43, Meng Xu wrote: >>> Sections 4.1 and 4.2 in [1] explain the whole experiment procedure. >>> If you have any questions or confusion about a specific step, please feel >>> free to let me know. >> From the document it is not really clear if you ran one guest RT domain or >> several simultaneously for your experiments. > Also, the Xen RT scheduler setup, like the vcpus' period/budget > configuration for each guest domain, is not described. > It is not obvious if the configured set of vcpus in the experiment setup > utilized all of the pcpus' bandwidth. Given the set of tasks in each VM, we compute the VCPUs' periods and budgets using the CARTS tool [1]. Note that each task has a period and a worst-case execution time (wcet). The configured set of vcpus in the experiment setup may not use all of the pcpus' bandwidth. For example, if we have one task (period = 10ms, wcet = 2ms) on a VCPU, the task's VCPU will not be configured with 100% bandwidth. If that VCPU is the only VCPU on a pcpu, the pcpu's bandwidth won't be fully used, because there simply is not enough workload. [1] https://rtg.cis.upenn.edu/carts/ Best, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
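As a concrete (purely illustrative) instance of this: a VCPU hosting a single task with period = 10ms and wcet = 2ms needs a utilization of at least 2/10 = 20%, so CARTS will emit a reservation of that order, e.g. (period = 10ms, budget >= 2ms) plus some interface overhead, rather than budget = period; the remaining roughly 80% of that pcpu's bandwidth is simply left unreserved.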
Re: [Xen-devel] RT-Xen on ARM
On Mon, Aug 21, 2017 at 4:38 AM, Andrii Anisov wrote: > > On 18.08.17 23:43, Meng Xu wrote: >> >> Sure. The workload we used in the paper is mainly a CPU-intensive task. >> We first calibrate a busy-loop of multiplications that runs for 1ms. >> Then, for a task that executes for exe ms, we simply let the task >> execute the 1ms busy loop exe times. > > I'm a bit confused, why didn't you run the system with rtspin from > LITMUS-RT, any issues with it? The task we are using should do the same amount of calculation for the same assigned execution time. For example, suppose it takes 1ms to run the following piece of code: for (i = 0; i < 1000000; i++) sum += i; This piece of code can be viewed as the "payload" of a realistic workload. Suppose the task is scheduled to run at t0, preempted at t1, resumes at t2, and finishes at t3. We have (t1 - t0) + (t3 - t2) = 1ms and we are sure the task did the addition 1 million times. However, if we use rtspin, it will check whether (t2 - t0) > 1ms. If so, it will claim it has finished its workload even though it hasn't, i.e., it hasn't actually done the addition 1 million times. Since we want to compare whether tasks can finish their "workload" by their deadline under different scheduling algorithms, we should fix the "amount of workload" a task does under different scheduling policies. rtspin() does not achieve our purpose. That's why we don't use it. Note that rtspin() was initially designed to test the scheduling overhead of LITMUS. It does not perform the same amount of workload for the same assigned wcet. > BTW, I've found a set of experimental patches (scripts and functional changes) on > your github: https://github.com/PennPanda/liblitmus . > Are they related to the mentioned document [1]? Not really. The liblitmus repo under my github account is for another project. It is not for [1]'s purpose. The idea of creating the real-time task is similar, though. The real-time task is based on bin/base_task.c in liblitmus. It needs to fill out the job() function, roughly as follows (a fuller, self-contained sketch follows below):

static int job(int wcet)
{
    int i;
    for (i = 0; i < wcet; i++)
        loop_for_one_ms();
    return 0;
}

static void loop_for_one_ms(void)
{
    long j;
    /* 'iterations' is the per-machine calibrated count; it differs across machines */
    for (j = 0; j < iterations; j++)
        result = result + j * j;
}

> >> [1] https://www.cis.upenn.edu/~linhphan/papers/emsoft14-rt-xen.pdf > > -- Hope this helps clear up the confusion. Thanks, Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
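A self-contained version of the above skeleton might look like the following (a sketch: 'iterations' must first be set to the calibrated iterations-per-1ms value for the machine, and in the real setup job() is plugged into liblitmus' bin/base_task.c rather than called from main()):

#include <stdio.h>
#include <stdlib.h>

static volatile unsigned long result;   /* prevents the loop being optimised away */
static long iterations = 500000;        /* per-machine calibrated value (example) */

/* Busy-loop for approximately 1ms of pure CPU work. */
static void loop_for_one_ms(void)
{
    long j;
    for (j = 0; j < iterations; j++)
        result += j * j;
}

/* One job: 'wcet' milliseconds worth of work, no matter how often preempted. */
static int job(int wcet)
{
    int i;
    for (i = 0; i < wcet; i++)
        loop_for_one_ms();
    return 0;
}

int main(int argc, char **argv)
{
    int wcet = (argc > 1) ? atoi(argv[1]) : 1;
    job(wcet);
    printf("done: %d ms of work\n", wcet);
    return 0;
}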
Re: [Xen-devel] Xen 4.10 Development Update
On Mon, Aug 21, 2017 at 6:07 AM, Julien Grall wrote: > This email only tracks big items for the xen.git tree. Please reply for items you > would like to see in 4.10 so that people have an idea what is going on and > prioritise accordingly. > > You're welcome to provide a description and use cases of the feature you're > working on. > > = Timeline = > > We now adopt a fixed cut-off date scheme. We will release twice a > year. The upcoming 4.10 timeline is as follows: > > * Last posting date: September 15th, 2017 > * Hard code freeze: September 29th, 2017 > * RC1: TBD > * Release: December 2, 2017 > > Note that we don't have a freeze exception scheme anymore. All patches > that wish to go into 4.10 must be posted no later than the last posting > date. All patches posted after that date will be automatically queued > into the next release. > > RCs will be arranged immediately after the freeze. > > We recently introduced a jira instance to track all the tasks (not only big ones) > for the project. See: https://xenproject.atlassian.net/projects/XEN/issues. > > Most of the tasks tracked by this e-mail also have a corresponding jira task > referred to by XEN-N. > > I have started to include the version number of the series associated with each > feature. Can each owner send an update on the version number if the series > was posted upstream? > > = Projects = > > == Hypervisor == > > * Per-cpu tasklet > - XEN-28 > - Konrad Rzeszutek Wilk > > * Add support of rcu_idle_{enter,exit} > - XEN-27 > - Dario Faggioli I'm working on making the RTDS scheduler work-conserving. The first version of the patch series has been posted at https://www.mail-archive.com/xen-devel@lists.xen.org/msg117062.html, after we discussed the RFC patch. Thanks, Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH 0/5] Extend resources to support more vcpus in single VM
Hi Tianyu, On Thu, Aug 24, 2017 at 10:52 PM, Lan Tianyu wrote: > > This patchset extends some resources (i.e., event channel, > hap and so on) to support more vcpus for a single VM. > > > Chao Gao (1): > xl/libacpi: extend lapic_id() to uint32_t > > Lan Tianyu (4): > xen/hap: Increase hap size for more vcpus support > XL: Increase event channels to support more vcpus > Tool/ACPI: DSDT extension to support more vcpus > hvmload: Add x2apic entry support in the MADT build > > tools/firmware/hvmloader/util.c | 2 +- > tools/libacpi/acpi2_0.h | 10 +++ > tools/libacpi/build.c | 61 > + > tools/libacpi/libacpi.h | 2 +- > tools/libacpi/mk_dsdt.c | 11 > tools/libxl/libxl_create.c | 2 +- > tools/libxl/libxl_x86_acpi.c| 2 +- > xen/arch/x86/mm/hap/hap.c | 2 +- > 8 files changed, 63 insertions(+), 29 deletions(-) How many VCPUs for a single VM do you want to support with this patch set? Thanks, Meng -- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Change repl_budget event output for xentrace formats and xenalyze Signed-off-by: Meng Xu --- Changes from v1 Add this changes from v1 --- tools/xentrace/formats| 2 +- tools/xentrace/xenalyze.c | 8 +--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index f39182a..470ac5c 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -75,7 +75,7 @@ 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ cpu = %(1)d ] 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 39fc35f..6fb952c 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7944,12 +7944,14 @@ void sched_process(struct pcpu_info *p) if(opt.dump_all) { struct { unsigned int vcpuid:16, domid:16; +unsigned int priority_level; uint64_t cur_dl, cur_bg; } __attribute__((packed)) *r = (typeof(r))ri->d; -printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", " - "budget = %"PRIu64"\n", ri->dump_header, - r->domid, r->vcpuid, r->cur_dl, r->cur_bg); +printf(" %s rtds:repl_budget d%uv%u, priority_level = %u," + "deadline = %"PRIu64", budget = %"PRIu64"\n", + ri->dump_header, r->domid, r->vcpuid, + r->priority_level, r->cur_dl, r->cur_bg); } break; case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/ -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 1/5] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has extratime flag set, its priority_level will increase by 1 and its budget will be refilled; othewrise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_leve; or (ii) v1 has the same priority_level but has a smaller deadline Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPUs priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Signed-off-by: Meng Xu --- Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra as suggested by Dario Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 4 ++ 2 files changed, 80 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 0ac5816..fab6f49 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. */ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. 
*/ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return (svc->flags & RTDS_extratime) ? 1 : 0; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," " cur_b=%"PRI_stime" cur_d=%"PRI_stime" last_start=%"PRI_stime"\n" + " \t\t priority_level=%d has_extratime=%d\n" " \t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n", svc->vcpu
[Xen-devel] [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU extratime flag Signed-off-by: Meng Xu --- Changes from v1 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is supported 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Changes from RFC v1 Change work_conserving flag to extratime flag --- tools/libxl/libxl_sched.c | 12 1 file changed, 12 insertions(+) --- tools/libxl/libxl.h | 6 ++ tools/libxl/libxl_sched.c | 18 ++ 2 files changed, 24 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 1704525..ead300f 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -257,6 +257,12 @@ #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 /* + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler + * now supports per-vcpu extratime settings. + */ +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 + +/* * libxl_domain_build_info has the arm.gic_version field. */ #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index faa604e..b76a29a 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -558,6 +558,10 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +if (vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra) + scinfo->vcpus[i].extratime = 1; +else + scinfo->vcpus[i].extratime = 0; scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -607,6 +611,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +if (scinfo->vcpus[i].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -655,6 +663,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +if (scinfo->vcpus[0].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -705,6 +717,12 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t domid, sdom.period = scinfo->period; if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) sdom.budget = scinfo->budget; +if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT) { +if (scinfo->extratime) +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; +} if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) return ERROR_INVAL; -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 0/5] Towards work-conserving RTDS
This series of patches makes the RTDS scheduler work-conserving without breaking real-time guarantees. VCPUs with the extratime flag set can get extra time from the unreserved system resource. System administrators can decide which VCPUs have the extratime flag set. Example: Set the extratime bit of all VCPUs of domain 1: # xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 1 Each VCPU of domain 1 will be guaranteed to have 2000us every 10000us (if the system is schedulable). If there is a CPU having no work to do, domain 1's VCPUs will be scheduled onto the CPU, even though the VCPUs have already got their 2000us in the 10000us period. Clear the extratime bit of all VCPUs of domain 1: # xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 0 Set/Clear the extratime bit of one specific VCPU of domain 1: # xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 1 # xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 0 The original design of the work-conserving RTDS was discussed at https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html The first version was discussed at https://www.mail-archive.com/xen-devel@lists.xen.org/msg117361.html The series of patches can be found on github: https://github.com/PennPanda/RT-Xen under the branch: xenbits/rtds/work-conserving-v2 Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Revise xentrace, xenalyze, and docs Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA symbol in libxl.h Changes from RFC v1 Merge changes in sched_rt.c into one patch; Minor change in variable name and comments. Signed-off-by: Meng Xu [PATCH v2 1/5] xen:rtds: towards work conserving RTDS [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS [PATCH v2 3/5] xl: enable per-VCPU extratime flag for RTDS [PATCH v2 4/5] xentrace: enable per-VCPU extratime flag for RTDS [PATCH v2 5/5] docs: enable per-VCPU extratime flag for RTDS ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 3/5] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU extratime flag. Signed-off-by: Meng Xu --- Changes from v1 No change because we agree on using -e 0/1 option to set if a vcpu will get extra time or not Changes from RFC v1 Changes work_conserving flag to extratime flag --- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 56 ++ 2 files changed, 40 insertions(+), 19 deletions(-) --- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 56 ++ 2 files changed, 40 insertions(+), 19 deletions(-) diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index ba0159d..1b03d44 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = { { "sched-rtds", &main_sched_rtds, 0, 1, "Get/set rtds scheduler parameters", - "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]", + "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]] [-e[=EXTRATIME]]]", "-d DOMAIN, --domain=DOMAIN Domain to modify\n" "-v VCPUID/all, --vcpuid=VCPUID/allVCPU to modify or output;\n" " Using '-v all' to modify/output all vcpus\n" "-p PERIOD, --period=PERIOD Period (us)\n" "-b BUDGET, --budget=BUDGET Budget (us)\n" + "-e EXTRATIME, --extratime=EXTRATIME EXTRATIME (1=yes, 0=no)\n" }, { "domid", &main_domid, 0, 0, diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c index 85722fe..5138012 100644 --- a/tools/xl/xl_sched.c +++ b/tools/xl/xl_sched.c @@ -251,7 +251,7 @@ static int sched_rtds_domain_output( libxl_domain_sched_params scinfo; if (domid < 0) { -printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget"); +printf("%-33s %4s %9s %9s %10s\n", "Name", "ID", "Period", "Budget", "Extra time"); return 0; } @@ -262,11 +262,12 @@ static int sched_rtds_domain_output( } domname = libxl_domid_to_name(ctx, domid); -printf("%-33s %4d %9d %9d\n", +printf("%-33s %4d %9d %9d %10s\n", domname, domid, scinfo.period, -scinfo.budget); +scinfo.budget, +scinfo.extratime ? "yes" : "no"); free(domname); libxl_domain_sched_params_dispose(&scinfo); return 0; @@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Extra time"); return 0; } @@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budget, + scinfo->vcpus[i].extratime ? "yes" : "no"); } free(domname); return 0; @@ -309,8 +311,8 @@ static int sched_rtds_vcpu_output_all(int domid, int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Extra time"); return 0; } @@ -321,12 +323,13 @@ static int sched_rtds_vcpu_output_all(int domid, domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budg
[Xen-devel] [PATCH v2 5/5] docs: enable per-VCPU extratime flag for RTDS
Revise the xl tool use case by adding the -e option Remove work-conserving from the TODO list Signed-off-by: Meng Xu --- Changes from v1 Revise rtds docs --- docs/features/sched_rtds.pandoc | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/features/sched_rtds.pandoc b/docs/features/sched_rtds.pandoc index 354097b..d51b499 100644 --- a/docs/features/sched_rtds.pandoc +++ b/docs/features/sched_rtds.pandoc @@ -40,7 +40,7 @@ as follows: It is possible, for a multiple vCPUs VM, to change the parameters of each vCPU individually: -* `xl sched-rtds -d vm-rt -v 0 -p 20000 -b 10000 -v 1 -p 45000 -b 12000` +* `xl sched-rtds -d vm-rt -v 0 -p 20000 -b 10000 -e 1 -v 1 -p 45000 -b 12000 -e 0` # Technical details @@ -53,7 +53,8 @@ the presence of the LIBXL\_HAVE\_SCHED\_RTDS symbol. The ability of specifying different scheduling parameters for each vcpu has been introduced later, and is available if the following symbols are defined: * `LIBXL\_HAVE\_VCPU\_SCHED\_PARAMS`, -* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`. +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`, +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_EXTRA`. # Limitations @@ -95,7 +96,6 @@ at a macroscopic level), the following should be done: # Areas for improvement -* Work-conserving mode to be added; * performance assessment, especially focusing on what level of real-time behavior the scheduler enables. @@ -118,4 +118,5 @@ Date Revision Version Notes -- --- 2016-10-14 1 Xen 4.8 Document written +2017-08-31 2 Xen 4.10 Revise for work conserving feature -- --- -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS
Dario, I didn't include your Reviewed-by tag because I made one small change. On Fri, Sep 1, 2017 at 11:58 AM, Meng Xu wrote: > > Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set > functions to support per-VCPU extratime flag > > Signed-off-by: Meng Xu > > --- > Changes from v1 > 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is > supported > 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to > XEN_DOMCTL_SCHEDRT_extra > > Changes from RFC v1 > Change work_conserving flag to extratime flag > --- > tools/libxl/libxl_sched.c | 12 > 1 file changed, 12 insertions(+) > --- > tools/libxl/libxl.h | 6 ++ > tools/libxl/libxl_sched.c | 18 ++ > 2 files changed, 24 insertions(+) > > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h > index 1704525..ead300f 100644 > --- a/tools/libxl/libxl.h > +++ b/tools/libxl/libxl.h > @@ -257,6 +257,12 @@ > #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 > > /* > + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler > + * now supports per-vcpu extratime settings. > + */ > +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 > + > +/* > * libxl_domain_build_info has the arm.gic_version field. > */ > #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 > diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c > index faa604e..b76a29a 100644 > --- a/tools/libxl/libxl_sched.c > +++ b/tools/libxl/libxl_sched.c > @@ -558,6 +558,10 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, > uint32_t domid, > for (i = 0; i < num_vcpus; i++) { > scinfo->vcpus[i].period = vcpus[i].u.rtds.period; > scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; > +if (vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra) > + scinfo->vcpus[i].extratime = 1; > +else > + scinfo->vcpus[i].extratime = 0; > scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; > } > rc = 0; > @@ -607,6 +611,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t > domid, > vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; > vcpus[i].u.rtds.period = scinfo->vcpus[i].period; > vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; > +if (scinfo->vcpus[i].extratime) > +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; > +else > +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; > } > > r = xc_sched_rtds_vcpu_set(CTX->xch, domid, > @@ -655,6 +663,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, > uint32_t domid, > vcpus[i].vcpuid = i; > vcpus[i].u.rtds.period = scinfo->vcpus[0].period; > vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; > +if (scinfo->vcpus[0].extratime) > +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; > +else > +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; > } > > r = xc_sched_rtds_vcpu_set(CTX->xch, domid, > @@ -705,6 +717,12 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t > domid, > sdom.period = scinfo->period; > if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) > sdom.budget = scinfo->budget; > +if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT) { > +if (scinfo->extratime) > +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; > +else > +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; > +} > if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) > return ERROR_INVAL; As you mentioned in the comment to the xl patch v1, I used LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT for extratime flag as what we did for period and budget. But the way we handle flags is exactly the same with the way we handle period and budget. 
I'm OK with what is in this patch, although I feel that we can kill the check

    if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT)

because LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT is -1: once the check is gone, the default value is still non-zero, so the flag gets set anyway.

What do you think?

Thanks,

Meng

-- --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] MAINTAINERS: update entries to new email address.
On Thu, Oct 5, 2017 at 10:28 AM, Dario Faggioli wrote: > Replace, in the 'M:' fields of the components I co-maintain > ('CPU POOLS', 'SCHEDULING' and 'RTDS SCHEDULER'), the Citrix > email, to which I don't have access any longer, with my > personal email. > > Signed-off-by: Dario Faggioli > --- > Cc: Andrew Cooper > Cc: George Dunlap > Cc: Ian Jackson > Cc: Jan Beulich > Cc: Konrad Rzeszutek Wilk > Cc: Stefano Stabellini > Cc: Tim Deegan > Cc: Wei Liu > Cc: Juergen Gross > Cc: Meng Xu > Acked-by: Meng Xu Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] Changing my email address
Hi Dario,

On Thu, Oct 5, 2017 at 10:28 AM, Dario Faggioli wrote:
>
> Hello,
>
> Soon I won't have access to dario.faggi...@citrix.com email address.

It's sad to hear this. :(

>
> Therefore, replace it, in my entries in MAINTAINERS, with an email address
> that I actually can, and will actually read.
>
> One thing about RTDS. Meng, which one of the following two sentences better
> describes your situation?
>
> a) Supported: Someone is actually paid to look after this.
> b) Maintained: Someone actually looks after it.
>
> If it's a) (you're currently paid to look after RTDS), then we're fine.

I'm paid to look after RTDS at least until I graduate. :)

Best regards,

Meng

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] RT-Xen on ARM
Hi Andrii,

I'm sorry for replying to this thread late. I was busy with a paper deadline
until last Saturday morning.

I saw Dario's thorough answer, which explains the high-level idea of the
real-time analysis that is the theoretical foundation of the analysis tool,
e.g., CARTS. Hopefully, he answered your question. If not, please feel free
to ask. I just added some very quick comments about your questions/comments
below.

On Thu, Sep 28, 2017 at 5:18 AM, Andrii Anisov wrote:
> Hello,
>
> On 27.09.17 22:57, Meng Xu wrote:
>>
>> Note that:
>> When you use gEDF scheduler in VM or VMM (i.e., Xen), you should use
>> MPR2 model
>
> I guess you mean DMPR in CARTS terms.
>
>> to compute the resource interface (i.e., VCPU parameters).
>> When you use pEDF scheduler, you should use PRM model to compute.
>>>
>>> - Could you please provide an example input xml for CARTS describing a
>>> system with 2 RT domains with 2 VCPUs each, running on 2 PCPUs, with gEDF
>>> scheduling at VMM level (for XEN based setup).
>>
>> Hmm, if you use the gEDF scheduling algorithm, this may not be
>> possible. Let me explain why.
>> In the MPR2 model, it computes the interface with the minimum number
>> of cores. To get 2 VCPUs for a VM, the total utilization (i.e., budget
>> / period) of these two VCPUs must be larger than 1.0. Since you ask
>> for 2 domains, the total utilization of these 4 VCPUs will be larger
>> than 2.0, which is definitely not schedulable on two cores.
>
> Well, if we are speaking about test-cases similar to described in [1], where
> the whole real time tasks set utilization is taken from 1.1...(PCPU*1)-0.1,
> there is no problem with having VCPU number greater than PCPUs. For sure if
> we take number of domains more than 1.

The number of VCPUs can be larger than the number of PCPUs.

>
>> If you are considering VCPUs with very low utilization, you may use
>> PRM model to compute each VCPU's parameters; after that, you can treat
>> these VCPUs as tasks, create another xml file, and ask CARTS to
>> compute the resource interface for these VCPUs.
>
> Sounds terrible for getting it scripted :(

If you use Python to parse the XML file, it should not be very difficult.
Python has APIs to parse XML. :)

>>
>> (Unfortunately, the current CARTS implementation does not support
>> mixing MPR model in one XML file, although it is supported in theory.
>> This can be worked around by using the above approach.)
>>
>>> For pEDF at both VMM and
>>> domain level, my understanding is that the os_scheduler represents XEN,
>>> and
>>> VCPUs are represented by components with tasks running on them.
>>
>> Yes, if you analyze the entire system that uses one type of scheduler
>> with only one type of model (i.e., PRM or MPR2).
>>
>> If you mix the scheduling algorithm or the interface model, you can
>> compute each VM or VCPU's parameters first. Then you treat VCPUs as
>> tasks and create another XML which will be used to compute the number
>> of cores to schedule all these VCPUs.
>>
>>> - I did not get a concept of min_period/max_period for a
>>> component/os_scheduler in CARTS description files. If I have them
>>> different,
>>> CARTS gives calculation for all periods in between, but did not provide
>>> the
>>> best period to get system schedulable.
>>
>> You should set them to the same value.
>
> Ok, how to choose the value for some taskset in a component?

Tasks' periods and execution times depend on the tasks' requirements.
As Dario mentioned, if a sensor needs to process data every 100ms, the sensor task's period is 100ms. Its execution time is the worst-case execution time of the sensor task.

As to the component's (or VM's) period, it's better for it to be smaller than its tasks' periods. Usually, I would set it to a value that evenly divides its tasks' periods. You may try different values for the component's period, because the VCPU's bandwidth (budget/period) will be different for different component periods. You can choose the component period that produces the smallest VCPU bandwidth, which may help make the VCPUs easier to schedule on the PCPUs.

Best,

Meng

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
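To make the "try different component periods and compare the resulting bandwidth" advice concrete, here is a toy C sketch in the spirit of the periodic resource model that CARTS builds on: for each candidate VCPU period it searches for the smallest budget whose supply bound function covers the task set's EDF demand bound function, and prints the resulting bandwidth. It is a simplified stand-in for CARTS's actual analysis (integer time units, implicit deadlines, demand checked only up to the hyperperiod, no overheads), and all names and task values below are made up:

    #include <stdio.h>
    #include <stdint.h>

    struct task { uint64_t p, e; };               /* period, WCET */

    /* EDF demand bound function for implicit-deadline periodic tasks. */
    static uint64_t dbf(const struct task *ts, int n, uint64_t t)
    {
        uint64_t d = 0;
        for (int i = 0; i < n; i++)
            d += (t / ts[i].p) * ts[i].e;
        return d;
    }

    /* Supply bound function of a periodic resource (period PI, budget TH). */
    static uint64_t sbf(uint64_t PI, uint64_t TH, uint64_t t)
    {
        if (t < PI - TH)
            return 0;
        uint64_t k = (t - (PI - TH)) / PI;
        uint64_t ramp = 2 * (PI - TH) + k * PI;   /* worst-case blackout */
        return k * TH + (t > ramp ? t - ramp : 0);
    }

    /* Smallest budget such that supply covers demand at every instant
     * up to the hyperperiod H (a simplification of the real test). */
    static uint64_t min_budget(const struct task *ts, int n,
                               uint64_t PI, uint64_t H)
    {
        for (uint64_t TH = 1; TH <= PI; TH++) {
            uint64_t t;
            for (t = 1; t <= H; t++)
                if (dbf(ts, n, t) > sbf(PI, TH, t))
                    break;
            if (t > H)
                return TH;
        }
        return 0;                                 /* infeasible */
    }

    int main(void)
    {
        struct task ts[] = { { 100, 20 }, { 200, 30 } };   /* e.g. in ms */
        uint64_t H = 200;                                  /* hyperperiod */

        for (uint64_t PI = 10; PI <= 100; PI += 10) {
            uint64_t TH = min_budget(ts, 2, PI, H);
            if (TH)
                printf("period %3llu -> budget %3llu (bandwidth %.2f)\n",
                       (unsigned long long)PI, (unsigned long long)TH,
                       (double)TH / (double)PI);
        }
        return 0;
    }

With these made-up tasks, small periods that evenly divide 100 and 200 come out with a noticeably lower bandwidth than a period of 100, which is exactly the effect described above.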
Re: [Xen-devel] [PATCH v2 2/5] libxl: enable per-VCPU extratime flag for RTDS
On Tue, Sep 19, 2017 at 5:23 AM, Dario Faggioli wrote:
>
> On Fri, 2017-09-15 at 12:01 -0400, Meng Xu wrote:
> > On Wed, Sep 13, 2017 at 8:16 PM, Dario Faggioli wrote:
> > >
> > > > I'm OK with what is in this patch, although I feel that we can
> > > > kill the check
> > > > if (scinfo->extratime !=
> > > > LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT)
> > > > because LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT is -1.
> > >
> > > No, sorry, I don't understand what you mean here...
> >
> > I was thinking about the following code:
> >
> >     if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT) {
> >         if (scinfo->extratime)
> >             sdom.flags |= XEN_DOMCTL_SCHEDRT_extra;
> >         else
> >             sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra;
> >     }
> >
> > This code can be changed to
> >
> >     if (scinfo->extratime)
> >         sdom.flags |= XEN_DOMCTL_SCHEDRT_extra;
> >     else
> >         sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra;
> >
> > If extratime uses the default value (-1), we still set the extratime
> > flag.
> >
> > That's why I feel we may kill the
> > if (scinfo->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT)
>
> Mmm... Ok, I see it now. Well, this is of course all up to the tools'
> maintainers.
>
> What I think it would be valuable to ask ourselves here is: can, at this
> point, scinfo->extratime be equal to
> LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT?
>
> And if it is, what does it mean, and what do we want to do?
>
> I mean, if extratime is -1, it means that we've been called without it
> being touched by xl (although, remember that, as a library, libxl can
> be linked to and called by other programs too, e.g., libvirt).
>
> If you think that this is a serious programming bug, you can use
> LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT to check that, and raise an
> assert.
>
> If you think it's an API misuse, you can use it to check for that, and
> return an error.
>
> If you think it's just fine, you can do whatever you want to do as
> default (which, AFAIUI, is to set the flag). In this case, it's probably
> fine to ignore LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT in actual code.
> Although, I'd still put a reference to it in a comment, to explain
> what's going on, and why we're doing things differently from budget and
> period (since _their_ *_DEFAULT are checked).

I think it should be fine for an API user to call the function without
setting the extratime parameter. We set extratime by default. I will go
with the following code in the next version:

>     if (scinfo->extratime)
>         sdom.flags |= XEN_DOMCTL_SCHEDRT_extra;
>     else
>         sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra;

Thank you very much!

Best,

Meng

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/5] xl: enable per-VCPU extratime flag for RTDS
On Wed, Sep 13, 2017 at 8:51 PM, Dario Faggioli wrote:
> On Fri, 2017-09-01 at 11:58 -0400, Meng Xu wrote:
> > diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
> > index ba0159d..1b03d44 100644
> > --- a/tools/xl/xl_cmdtable.c
> > +++ b/tools/xl/xl_cmdtable.c
> > @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = {
> >     { "sched-rtds",
> >       &main_sched_rtds, 0, 1,
> >       "Get/set rtds scheduler parameters",
> > -     "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]",
> > +     "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]] [-e[=EXTRATIME]]]",
> >       "-d DOMAIN, --domain=DOMAIN     Domain to modify\n"
> >       "-v VCPUID/all, --vcpuid=VCPUID/all    VCPU to modify or output;\n"
> >       "    Using '-v all' to modify/output all vcpus\n"
> >       "-p PERIOD, --period=PERIOD     Period (us)\n"
> >       "-b BUDGET, --budget=BUDGET     Budget (us)\n"
> > +     "-e EXTRATIME, --extratime=EXTRATIME EXTRATIME (1=yes, 0=no)\n"
>
> Extratime?

We need to provide the option to configure the extratime flag for each vcpu, right?

> > },
> > { "domid",
> >    &main_domid, 0, 0,
> > diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c
> > index 85722fe..5138012 100644
> > --- a/tools/xl/xl_sched.c
> > +++ b/tools/xl/xl_sched.c
> > @@ -251,7 +251,7 @@ static int sched_rtds_domain_output(
> >     libxl_domain_sched_params scinfo;
> >
> >     if (domid < 0) {
> > -        printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget");
> > +        printf("%-33s %4s %9s %9s %10s\n", "Name", "ID", "Period", "Budget", "Extra time");
> >         return 0;
> >     }
>
> Can you paste the output of:

Sure

> xl sched-rtds

Cpupool Pool-0: sched=RTDS
Name                                  ID    Period    Budget Extra time
Domain-0                               0     10000      4000        yes

> xl sched-rtds -d 0

Name                                  ID    Period    Budget Extra time
Domain-0                               0     10000      4000        yes

> xl sched-rtds -d 0 -v 1

Name                                  ID VCPU    Period    Budget Extra time
Domain-0                               0    1     10000      4000        yes

> xl sched-rtds -d 0 -v all

Name                                  ID VCPU    Period    Budget Extra time
Domain-0                               0    0     10000      4000        yes
Domain-0                               0    1     10000      4000        yes
Domain-0                               0    2     10000      4000        yes
Domain-0                               0    3     10000      4000        yes
Domain-0                               0    4     10000      4000        yes
Domain-0                               0    5     10000      4000        yes
Domain-0                               0    6     10000      4000        yes
Domain-0                               0    7     10000      4000        yes
Domain-0                               0    8     10000      4000        yes
Domain-0                               0    9     10000      4000        yes
Domain-0                               0   10     10000      4000        yes
Domain-0                               0   11     10000      4000        yes

> with the series applied?

> > @@ -785,8 +801,9 @@ int main_sched_rtds(int argc, char **argv)
> >         goto out;
> >     }
> >     if (((v_index > b_index) && opt_b) || ((v_index > p_index) && opt_p)
> > -        || p_index != b_index) {
> > -        fprintf(stderr, "Incorrect number of period and budget\n");
> > +        || ((v_index > e_index) && opt_e) || p_index != b_index
> > +        || p_index != e_index || b_index != e_index ) {
>
> I don't think you need the `b_index != e_index` part. If p==b and p==e,
> it's automatically true that b==e.

Right.

> > @@ -820,7 +837,7 @@ int main_sched_rtds(int argc, char **argv)
> >             r = EXIT_FAILURE;
> >             goto out;
> >         }
> > -    } else if (!opt_p && !opt_b) {
> > +    } else if (!opt_p && !opt_b && !opt_e) {
> >         /* get per-vcpu rtds scheduling parameters */
> >         libxl_vcpu_sched_params scinfo;
> >         libx
[Xen-devel] [PATCH v3 2/5] libxl: enable per-VCPU extratime flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU extratime flag Signed-off-by: Meng Xu --- Changes from v2 1) Move extratime out of the section that is marked as depreciated in libxl_domain_sched_params. 2) Set vcpu extratime in sched_rtds_vcpu_get function function; This fix a bug in previous version when run command "xl sched-rtds -d 0 -v 1" which outputs vcpu extratime value incorrectly. Changes from v1 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is supported 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Changes from RFC v1 Change work_conserving flag to extratime flag --- tools/libxl/libxl.h | 6 ++ tools/libxl/libxl_sched.c | 17 + tools/libxl/libxl_types.idl | 8 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index f82b91e..5e9aed7 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -257,6 +257,12 @@ #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 /* + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler + * now supports per-vcpu extratime settings. + */ +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 + +/* * libxl_domain_build_info has the arm.gic_version field. */ #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index 7d144d0..512788f 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -532,6 +532,8 @@ static int sched_rtds_vcpu_get(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -579,6 +581,8 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -628,6 +632,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +if (scinfo->vcpus[i].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -676,6 +684,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +if (scinfo->vcpus[0].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -726,6 +738,11 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t domid, sdom.period = scinfo->period; if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) sdom.budget = scinfo->budget; +/* Set extratime by default */ +if (scinfo->extratime) +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) return ERROR_INVAL; diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 2d0bb8a..dd7d364 100644 --- 
a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -421,14 +421,14 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("cap", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("extratime",integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), -# The following three parameters ('slice', 'latency' and 'extratime') are deprecated, +# The following three parameters ('slice' and 'latency') are deprecated, # and will have no effect if used, since the S
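As a usage illustration of the interface above (not part of the series): a minimal sketch of how a libxl client might set the new per-VCPU flag. The helper name and the parameter values are made up, error handling and libxl_ctx setup are elided, and the calloc result is assumed good:

    #include <stdlib.h>
    #include <libxl.h>

    /* Hypothetical helper: give one VCPU a 4ms budget every 10ms, plus
     * the right to run on otherwise-idle time (extratime). */
    static int set_vcpu_extratime(libxl_ctx *ctx, uint32_t domid, int vcpuid)
    {
        libxl_vcpu_sched_params sp;
        int rc;

        libxl_vcpu_sched_params_init(&sp);
        sp.sched = LIBXL_SCHEDULER_RTDS;
        sp.num_vcpus = 1;
        sp.vcpus = calloc(1, sizeof(*sp.vcpus));
        libxl_sched_params_init(&sp.vcpus[0]);
        sp.vcpus[0].vcpuid = vcpuid;
        sp.vcpus[0].period = 10000;     /* us */
        sp.vcpus[0].budget = 4000;      /* us */
        sp.vcpus[0].extratime = 1;      /* the flag added by this patch */

        rc = libxl_vcpu_sched_params_set(ctx, domid, &sp);
        libxl_vcpu_sched_params_dispose(&sp);   /* also frees sp.vcpus */
        return rc;
    }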
[Xen-devel] [PATCH v3 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Change repl_budget event output for xentrace formats and xenalyze Signed-off-by: Meng Xu --- No changes from v2 Changes from v1 Add this changes from v1 --- tools/xentrace/formats| 2 +- tools/xentrace/xenalyze.c | 8 +--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index d6e7e3f..7d3a209 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -75,7 +75,7 @@ 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ cpu = %(1)d ] 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 79bdba7..2783204 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7946,12 +7946,14 @@ void sched_process(struct pcpu_info *p) if(opt.dump_all) { struct { unsigned int vcpuid:16, domid:16; +unsigned int priority_level; uint64_t cur_dl, cur_bg; } __attribute__((packed)) *r = (typeof(r))ri->d; -printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", " - "budget = %"PRIu64"\n", ri->dump_header, - r->domid, r->vcpuid, r->cur_dl, r->cur_bg); +printf(" %s rtds:repl_budget d%uv%u, priority_level = %u," + "deadline = %"PRIu64", budget = %"PRIu64"\n", + ri->dump_header, r->domid, r->vcpuid, + r->priority_level, r->cur_dl, r->cur_bg); } break; case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/ -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 5/5] docs: enable per-VCPU extratime flag for RTDS
Revise xl tool use case by adding -e option Remove work-conserving from TODO list Signed-off-by: Meng Xu --- No change from v2 Changes from v1 Revise rtds docs --- docs/features/sched_rtds.pandoc | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/features/sched_rtds.pandoc b/docs/features/sched_rtds.pandoc index 354097b..d51b499 100644 --- a/docs/features/sched_rtds.pandoc +++ b/docs/features/sched_rtds.pandoc @@ -40,7 +40,7 @@ as follows: It is possible, for a multiple vCPUs VM, to change the parameters of each vCPU individually: -* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -v 1 -p 45000 -b 12000` +* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -e 1 -v 1 -p 45000 -b 12000 -e 0` # Technical details @@ -53,7 +53,8 @@ the presence of the LIBXL\_HAVE\_SCHED\_RTDS symbol. The ability of specifying different scheduling parameters for each vcpu has been introduced later, and is available if the following symbols are defined: * `LIBXL\_HAVE\_VCPU\_SCHED\_PARAMS`, -* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`. +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`, +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_EXTRA`. # Limitations @@ -95,7 +96,6 @@ at a macroscopic level), the following should be done: # Areas for improvement -* Work-conserving mode to be added; * performance assessment, especially focusing on what level of real-time behavior the scheduler enables. @@ -118,4 +118,5 @@ at a macroscopic level), the following should be done: Date Revision Version Notes -- --- 2016-10-14 1Xen 4.8 Document written +2017-08-31 2Xen 4.10 Revise for work conserving feature -- --- -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 0/5] Towards work-conserving RTDS
This series of patches makes the RTDS scheduler work-conserving without breaking real-time guarantees. VCPUs with the extratime flag set can get extra time from the unreserved system resource. System administrators can decide which VCPUs have the extratime flag set.

Example:

Set the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 1

Each VCPU of domain 1 will be guaranteed to have 2000us every 10000us (if the system is schedulable). If there is a CPU having no work to do, domain 1's VCPUs will be scheduled onto that CPU, even though the VCPUs have already got their 2000us in 10000us.

Clear the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 0

Set/Clear the extratime bit of one specific VCPU of domain 1:
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 1
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 0

The original design of the work-conserving RTDS was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html
The first version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg117361.html
The second version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg120618.html

The series of patches can be found on GitHub:
https://github.com/PennPanda/RT-Xen
under the branch: xenbits/rtds/work-conserving-v3.1

Changes from v2
Sanity check the input of the -e option, which can only be 0 or 1
Set -e to 1 by default if a 3rd party library does not set the -e option
Set vcpu extratime in the sched_rtds_vcpu_get function, which fixes a bug in the previous version
Change EXTRATIME to Extratime in the xl output

Changes from v1
Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra
Revise xentrace, xenalyze, and docs
Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA symbol in libxl.h

Changes from RFC v1
Merge changes in sched_rt.c into one patch;
Minor change in variable name and comments.

Signed-off-by: Meng Xu

[PATCH v3 1/5] xen:rtds: towards work conserving RTDS
[PATCH v3 2/5] libxl: enable per-VCPU extratime flag for RTDS
[PATCH v3 3/5] xl: enable per-VCPU extratime flag for RTDS
[PATCH v3 4/5] xentrace: enable per-VCPU extratime flag for RTDS
[PATCH v3 5/5] docs: enable per-VCPU extratime flag for RTDS

___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 3/5] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU extratime flag. Signed-off-by: Meng Xu --- Changes from v2 Validate the -e option input that can only be 0 or 1 Update docs/man/xl.pod.1.in Change EXTRATIME to Extratime Changes from v1 No change because we agree on using -e 0/1 option to set if a vcpu will get extra time or not Changes from RFC v1 Changes work_conserving flag to extratime flag --- docs/man/xl.pod.1.in | 59 +-- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 62 +++--- 3 files changed, 78 insertions(+), 46 deletions(-) diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in index cd8bb1c..486a24f 100644 --- a/docs/man/xl.pod.1.in +++ b/docs/man/xl.pod.1.in @@ -1117,11 +1117,11 @@ as B<--ratelimit_us> in B Set or get rtds (Real Time Deferrable Server) scheduler parameters. This rt scheduler applies Preemptive Global Earliest Deadline First real-time scheduling algorithm to schedule VCPUs in the system. -Each VCPU has a dedicated period and budget. -VCPUs in the same domain have the same period and budget. +Each VCPU has a dedicated period, budget and extratime. While scheduled, a VCPU burns its budget. A VCPU has its budget replenished at the beginning of each period; Unused budget is discarded at the end of each period. +A VCPU with extratime set gets extra time from the unreserved system resource. B @@ -1145,6 +1145,11 @@ Period of time, in microseconds, over which to replenish the budget. Amount of time, in microseconds, that the VCPU will be allowed to run every period. +=item B<-e Extratime>, B<--extratime=Extratime> + +Binary flag to decide if the VCPU will be allowed to get extra time from +the unreserved system resource. + =item B<-c CPUPOOL>, B<--cpupool=CPUPOOL> Restrict output to domains in the specified cpupool. @@ -1160,57 +1165,57 @@ all the domains: xl sched-rtds -v all Cpupool Pool-0: sched=RTDS -NameID VCPUPeriodBudget -Domain-0 00 1 4000 -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 -vm2 20 1 4000 -vm2 21 1 4000 +NameID VCPUPeriodBudget Extratime +Domain-0 00 1 4000yes +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes +vm2 40 1 4000yes +vm2 41 1 4000yes Without any arguments, it will output the default scheduling parameters for each domain: xl sched-rtds Cpupool Pool-0: sched=RTDS -NameIDPeriodBudget -Domain-0 0 1 4000 -vm1 1 1 4000 -vm2 2 1 4000 +NameIDPeriodBudget Extratime +Domain-0 0 1 4000yes +vm1 2 1 4000yes +vm2 4 1 4000yes -2) Use, for instancei, B<-d vm1, -v all> to see the budget and +2) Use, for instance, B<-d vm1, -v all> to see the budget and period of all VCPUs of a specific domain (B): xl sched-rtds -d vm1 -v all -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 +NameID VCPUPeriodBudget Extratime +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes To see the parameters of a subset of the VCPUs of a domain, use: xl sched-rtds -d vm1 -v 0 -v 3 -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 13 1000 500 +NameID VCPUPeriodBudget Ext
[Xen-devel] [PATCH v3 1/5] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has extratime flag set, its priority_level will increase by 1 and its budget will be refilled; othewrise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_leve; or (ii) v1 has the same priority_level but has a smaller deadline Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPUs priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Distribution of spare bandwidth Spare bandwidth is distributed among all VCPUs with extratime flag set, proportional to these VCPUs utilizations Signed-off-by: Meng Xu --- Changes from v2 Explain how to distribute spare bandwidth in commit log Minor change in has_extratime function without functionality change. Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra as suggested by Dario Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 4 ++ 2 files changed, 80 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 5c51cd9..b770287 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. 
*/ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return svc->flags & RTDS_extratime; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," "
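The commit message's claim that spare bandwidth is distributed "proportional to these VCPUs utilizations" can be made concrete with a little arithmetic. The numbers below are made up, and this is plain arithmetic rather than scheduler code (the scheduler produces the effect implicitly through the priority_level mechanism; it never computes shares explicitly):

    #include <stdio.h>

    int main(void)
    {
        double util[] = { 0.2, 0.4 };          /* two extratime VCPUs */
        double spare = 1.0 - (0.2 + 0.4);      /* unreserved bandwidth of 1 PCPU */
        double total = util[0] + util[1];

        for (int i = 0; i < 2; i++)
            printf("vcpu%d: reserved %.2f + extra %.2f = %.2f\n",
                   i, util[i], spare * util[i] / total,
                   util[i] + spare * util[i] / total);
        return 0;
    }

Both VCPUs keep their reserved utilization, and together they absorb the whole 0.4 of spare bandwidth (0.13 and 0.27 respectively), so the PCPU ends up fully used; that is the work-conserving property.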
Re: [Xen-devel] [PATCH v3 4/5] xentrace: enable per-VCPU extratime flag for RTDS
On Wed, Oct 11, 2017 at 6:57 AM, Dario Faggioli wrote: > On Tue, 2017-10-10 at 19:17 -0400, Meng Xu wrote: >> --- a/tools/xentrace/formats >> +++ b/tools/xentrace/formats >> @@ -75,7 +75,7 @@ >> 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ >> cpu = %(1)d ] >> 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ >> dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = >> 0x%(5)08x%(4)08x ] >> 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ >> dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] >> -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ >> dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = >> 0x%(5)08x%(4)08x ] >> +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ >> dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = >> 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] >> 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet >> 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ >> cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] >> > But, both in case of this file and below in xenalyze.c, you update 1 > record (the one of REPL_BUDGET). However, in patch 1, you added the > priority_level field to two records: REPL_BUDGET and BURN_BUDGET. > > Or am I missing something? OMG, my fault. I forgot to check this. I will add this and double check it by running some tests. Best, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Change repl_budget event output for xentrace formats and xenalyze Signed-off-by: Meng Xu --- Changes from v3 Handle burn_budget event No changes from v2 Changes from v1 Add this changes from v1 --- tools/xentrace/formats| 4 ++-- tools/xentrace/xenalyze.c | 16 +++- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index d6e7e3f..8b286c3 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -74,8 +74,8 @@ 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle[ cpu = %(1)d ] 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] -0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ] -0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ] +0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d, priority_level = %(5)d, has_extratime = %(6)x ] +0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, priority_level = 0x%(2)08d cur_deadline = 0x%(4)08x%(3)08x, cur_budget = 0x%(6)08x%(5)08x ] 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet 0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 79bdba7..19e050f 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7935,23 +7935,29 @@ void sched_process(struct pcpu_info *p) unsigned int vcpuid:16, domid:16; uint64_t cur_bg; int delta; +unsigned priority_level; +unsigned has_extratime; } __attribute__((packed)) *r = (typeof(r))ri->d; printf(" %s rtds:burn_budget d%uv%u, budget = %"PRIu64", " - "delta = %d\n", ri->dump_header, r->domid, - r->vcpuid, r->cur_bg, r->delta); + "delta = %d, priority_level = %d, has_extratime = %d\n", + ri->dump_header, r->domid, + r->vcpuid, r->cur_bg, r->delta, + r->priority_level, !!r->has_extratime); } break; case TRC_SCHED_CLASS_EVT(RTDS, 4): /* BUDGET_REPLENISH */ if(opt.dump_all) { struct { unsigned int vcpuid:16, domid:16; +unsigned int priority_level; uint64_t cur_dl, cur_bg; } __attribute__((packed)) *r = (typeof(r))ri->d; -printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", " - "budget = %"PRIu64"\n", ri->dump_header, - r->domid, r->vcpuid, r->cur_dl, r->cur_bg); +printf(" %s rtds:repl_budget d%uv%u, priority_level = %u," + "deadline = %"PRIu64", budget = %"PRIu64"\n", + ri->dump_header, r->domid, r->vcpuid, + r->priority_level, r->cur_dl, r->cur_bg); } break; case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/ -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
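The packed structs that this patch fixes have to mirror, field for field, what sched_rt.c packs into the trace buffer; the v3-to-v4 bug was exactly a missing field. Below is a standalone sanity check of the layout assumed by the new burn_budget parser. The struct is copied from the patch; the 28-byte limit reflects my reading of Xen's trace ABI (at most seven extra 32-bit words per record), so treat it as an assumption:

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    struct burn_budget_rec {
        unsigned int vcpuid:16, domid:16;
        uint64_t cur_bg;
        int delta;
        unsigned int priority_level;
        unsigned int has_extratime;
    } __attribute__((packed));

    int main(void)
    {
        /* 4 + 8 + 4 + 4 + 4 = 24 bytes, i.e. six 32-bit words. */
        printf("size = %zu (must be <= 28)\n", sizeof(struct burn_budget_rec));
        printf("cur_bg offset = %zu\n",
               offsetof(struct burn_budget_rec, cur_bg));
        return 0;
    }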
[Xen-devel] [PATCH v4 1/5] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has extratime flag set, its priority_level will increase by 1 and its budget will be refilled; othewrise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_leve; or (ii) v1 has the same priority_level but has a smaller deadline Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPUs priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Distribution of spare bandwidth Spare bandwidth is distributed among all VCPUs with extratime flag set, proportional to these VCPUs utilizations Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli --- Changes from v2 Explain how to distribute spare bandwidth in commit log Minor change in has_extratime function without functionality change. Changes from v1 Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra as suggested by Dario Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 4 ++ 2 files changed, 80 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 5c51cd9..b770287 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. 
*/ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return svc->flags & RTDS_extratime; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_
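To see the two-level ordering above in action, here is a standalone sketch that lifts compare_vcpu_priority() out of the patch and runs it on sample values. The struct is trimmed to the two fields the comparison reads, and main() is of course not part of the patch:

    #include <stdio.h>
    #include <stdint.h>

    typedef int64_t s_time_t;

    struct rt_vcpu {
        unsigned int priority_level;   /* 0 = within reservation */
        s_time_t cur_deadline;
    };

    /* Same convention as the patch: result > 0 iff v1 has higher priority. */
    static s_time_t
    compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2)
    {
        int prio = v2->priority_level - v1->priority_level;

        if ( prio == 0 )
            return v2->cur_deadline - v1->cur_deadline;

        return prio;
    }

    int main(void)
    {
        /* a: still within its reservation (level 0), deadline 30
         * b: depleted, running on extra time (level 1), deadline 10 */
        struct rt_vcpu a = { 0, 30 }, b = { 1, 10 };

        /* a wins despite the later deadline: reserved budget always
         * beats extra-time execution, which is how the real-time
         * guarantees survive work-conserving mode. */
        printf("a %s b\n",
               compare_vcpu_priority(&a, &b) > 0 ? "beats" : "loses to");
        return 0;
    }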
[Xen-devel] [PATCH v4 5/5] docs: enable per-VCPU extratime flag for RTDS
Revise xl tool use case by adding -e option Remove work-conserving from TODO list Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli Acked-by: Wei Liu --- No change from v2 Changes from v1 Revise rtds docs --- docs/features/sched_rtds.pandoc | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/features/sched_rtds.pandoc b/docs/features/sched_rtds.pandoc index 354097b..d51b499 100644 --- a/docs/features/sched_rtds.pandoc +++ b/docs/features/sched_rtds.pandoc @@ -40,7 +40,7 @@ as follows: It is possible, for a multiple vCPUs VM, to change the parameters of each vCPU individually: -* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -v 1 -p 45000 -b 12000` +* `xl sched-rtds -d vm-rt -v 0 -p 2 -b 1 -e 1 -v 1 -p 45000 -b 12000 -e 0` # Technical details @@ -53,7 +53,8 @@ the presence of the LIBXL\_HAVE\_SCHED\_RTDS symbol. The ability of specifying different scheduling parameters for each vcpu has been introduced later, and is available if the following symbols are defined: * `LIBXL\_HAVE\_VCPU\_SCHED\_PARAMS`, -* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`. +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_PARAMS`, +* `LIBXL\_HAVE\_SCHED\_RTDS\_VCPU\_EXTRA`. # Limitations @@ -95,7 +96,6 @@ at a macroscopic level), the following should be done: # Areas for improvement -* Work-conserving mode to be added; * performance assessment, especially focusing on what level of real-time behavior the scheduler enables. @@ -118,4 +118,5 @@ at a macroscopic level), the following should be done: Date Revision Version Notes -- --- 2016-10-14 1Xen 4.8 Document written +2017-08-31 2Xen 4.10 Revise for work conserving feature -- --- -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 2/5] libxl: enable per-VCPU extratime flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU extratime flag Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli Acked-by: Wei Liu --- Changes from v2 1) Move extratime out of the section that is marked as depreciated in libxl_domain_sched_params. 2) Set vcpu extratime in sched_rtds_vcpu_get function function; This fix a bug in previous version when run command "xl sched-rtds -d 0 -v 1" which outputs vcpu extratime value incorrectly. Changes from v1 1) Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA to indicate if extratime flag is supported 2) Change flag name in domctl.h from XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra Changes from RFC v1 Change work_conserving flag to extratime flag --- tools/libxl/libxl.h | 6 ++ tools/libxl/libxl_sched.c | 17 + tools/libxl/libxl_types.idl | 8 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index f82b91e..5e9aed7 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -257,6 +257,12 @@ #define LIBXL_HAVE_SCHED_RTDS_VCPU_PARAMS 1 /* + * LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA indicates RTDS scheduler + * now supports per-vcpu extratime settings. + */ +#define LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA 1 + +/* * libxl_domain_build_info has the arm.gic_version field. */ #define LIBXL_HAVE_BUILDINFO_ARM_GIC_VERSION 1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index 7d144d0..512788f 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -532,6 +532,8 @@ static int sched_rtds_vcpu_get(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -579,6 +581,8 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].extratime = +!!(vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHEDRT_extra); scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -628,6 +632,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +if (scinfo->vcpus[i].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -676,6 +684,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +if (scinfo->vcpus[0].extratime) +vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHEDRT_extra; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -726,6 +738,11 @@ static int sched_rtds_domain_set(libxl__gc *gc, uint32_t domid, sdom.period = scinfo->period; if (scinfo->budget != LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT) sdom.budget = scinfo->budget; +/* Set extratime by default */ +if (scinfo->extratime) +sdom.flags |= XEN_DOMCTL_SCHEDRT_extra; +else +sdom.flags &= ~XEN_DOMCTL_SCHEDRT_extra; if (sched_rtds_validate_params(gc, sdom.period, sdom.budget)) return ERROR_INVAL; diff --git a/tools/libxl/libxl_types.idl 
b/tools/libxl/libxl_types.idl index 2d0bb8a..dd7d364 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -421,14 +421,14 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("cap", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("extratime",integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), -# The following three parameters ('slice', 'latency' and 'extratime') are deprecated, +# The following three parameters ('slice' and 'latency') are deprecated,
[Xen-devel] [PATCH v4 0/5] Towards work-conserving RTDS
This series of patches makes the RTDS scheduler work-conserving without breaking real-time guarantees. VCPUs with the extratime flag set can get extra time from the unreserved system resource. System administrators can decide which VCPUs have the extratime flag set.

Example:

Set the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 1

Each VCPU of domain 1 will be guaranteed to have 2000us every 10000us (if the system is schedulable). If there is a CPU having no work to do, domain 1's VCPUs will be scheduled onto that CPU, even though the VCPUs have already got their 2000us in 10000us.

Clear the extratime bit of all VCPUs of domain 1:
# xl sched-rtds -d 1 -v all -p 10000 -b 2000 -e 0

Set/Clear the extratime bit of one specific VCPU of domain 1:
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 1
# xl sched-rtds -d 1 -v 1 -p 10000 -b 2000 -e 0

The original design of the work-conserving RTDS was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html
The first version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg117361.html
The second version was discussed at
https://www.mail-archive.com/xen-devel@lists.xen.org/msg120618.html
The third version has been mostly reviewed by Dario Faggioli and acked by Wei Liu, except
[PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS

The series of patches can be found on GitHub:
https://github.com/PennPanda/RT-Xen
under the branch: xenbits/rtds/work-conserving-v4

Changes from v3
Handle burn_budget event in xentrace and xenalyze. Tested the change with three VMs

Changes from v2
Sanity check the input of the -e option, which can only be 0 or 1
Set -e to 1 by default if a 3rd party library does not set the -e option
Set vcpu extratime in the sched_rtds_vcpu_get function, which fixes a bug in the previous version
Change EXTRATIME to Extratime in the xl output

Changes from v1
Change XEN_DOMCTL_SCHED_RTDS_extratime to XEN_DOMCTL_SCHEDRT_extra
Revise xentrace, xenalyze, and docs
Add LIBXL_HAVE_SCHED_RTDS_VCPU_EXTRA symbol in libxl.h

Changes from RFC v1
Merge changes in sched_rt.c into one patch;
Minor change in variable name and comments.

Signed-off-by: Meng Xu

[PATCH v4 1/5] xen:rtds: towards work conserving RTDS
[PATCH v4 2/5] libxl: enable per-VCPU extratime flag for RTDS
[PATCH v4 3/5] xl: enable per-VCPU extratime flag for RTDS
[PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
[PATCH v4 5/5] docs: enable per-VCPU extratime flag for RTDS

___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4 3/5] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU extratime flag. Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli Acked-by: Wei Liu --- Changes from v2 Validate the -e option input that can only be 0 or 1 Update docs/man/xl.pod.1.in Change EXTRATIME to Extratime Changes from v1 No change because we agree on using -e 0/1 option to set if a vcpu will get extra time or not Changes from RFC v1 Changes work_conserving flag to extratime flag --- docs/man/xl.pod.1.in | 59 +-- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 62 +++--- 3 files changed, 78 insertions(+), 46 deletions(-) diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in index cd8bb1c..486a24f 100644 --- a/docs/man/xl.pod.1.in +++ b/docs/man/xl.pod.1.in @@ -1117,11 +1117,11 @@ as B<--ratelimit_us> in B Set or get rtds (Real Time Deferrable Server) scheduler parameters. This rt scheduler applies Preemptive Global Earliest Deadline First real-time scheduling algorithm to schedule VCPUs in the system. -Each VCPU has a dedicated period and budget. -VCPUs in the same domain have the same period and budget. +Each VCPU has a dedicated period, budget and extratime. While scheduled, a VCPU burns its budget. A VCPU has its budget replenished at the beginning of each period; Unused budget is discarded at the end of each period. +A VCPU with extratime set gets extra time from the unreserved system resource. B @@ -1145,6 +1145,11 @@ Period of time, in microseconds, over which to replenish the budget. Amount of time, in microseconds, that the VCPU will be allowed to run every period. +=item B<-e Extratime>, B<--extratime=Extratime> + +Binary flag to decide if the VCPU will be allowed to get extra time from +the unreserved system resource. + =item B<-c CPUPOOL>, B<--cpupool=CPUPOOL> Restrict output to domains in the specified cpupool. @@ -1160,57 +1165,57 @@ all the domains: xl sched-rtds -v all Cpupool Pool-0: sched=RTDS -NameID VCPUPeriodBudget -Domain-0 00 1 4000 -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 -vm2 20 1 4000 -vm2 21 1 4000 +NameID VCPUPeriodBudget Extratime +Domain-0 00 1 4000yes +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes +vm2 40 1 4000yes +vm2 41 1 4000yes Without any arguments, it will output the default scheduling parameters for each domain: xl sched-rtds Cpupool Pool-0: sched=RTDS -NameIDPeriodBudget -Domain-0 0 1 4000 -vm1 1 1 4000 -vm2 2 1 4000 +NameIDPeriodBudget Extratime +Domain-0 0 1 4000yes +vm1 2 1 4000yes +vm2 4 1 4000yes -2) Use, for instancei, B<-d vm1, -v all> to see the budget and +2) Use, for instance, B<-d vm1, -v all> to see the budget and period of all VCPUs of a specific domain (B): xl sched-rtds -d vm1 -v all -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 11 400 200 -vm1 12 1 4000 -vm1 13 1000 500 +NameID VCPUPeriodBudget Extratime +vm1 20 300 150yes +vm1 21 400 200yes +vm1 22 1 4000yes +vm1 23 1000 500yes To see the parameters of a subset of the VCPUs of a domain, use: xl sched-rtds -d vm1 -v 0 -v 3 -NameID VCPUPeriodBudget -vm1 10 300 150 -vm1 13 1000 500 +Name
Re: [Xen-devel] [PATCH v4 0/5] Towards work-conserving RTDS
On Thu, Oct 12, 2017 at 5:02 AM, Wei Liu wrote: > > FYI all patches except the xentrace one were committed yesterday. Thank you very much, Wei! Best, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 0/5] Towards work-conserving RTDS
On Tue, Oct 17, 2017 at 3:29 AM, Dario Faggioli wrote:
> On Tue, 2017-10-17 at 09:26 +0200, Dario Faggioli wrote:
> > On Thu, 2017-10-12 at 10:34 -0400, Meng Xu wrote:
> > > On Thu, Oct 12, 2017 at 5:02 AM, Wei Liu wrote:
> > > >
> > > > FYI all patches except the xentrace one were committed yesterday.
> > >
> > > Thank you very much, Wei!
> >
> > Hey Meng,
> >
> > Any update on that missing patch, though?
>
> No, wait... Posted on Wednesday, mmmhh... Ah, so "this" is you posting
> the missing patch!

Yes. :) I didn't repost the patch. I made the changes and tested it once I got the feedback.

> Ok, my bad, sorry. I was fooled by the fact that you resent the whole
> series, and that I did not get a copy of it (extra-list, I mean) as
> you're still using my old email address.
>
> Lemme have a look...

Ah, I neglected the email address. I was also wondering whether you were busy with something else, so I didn't send a reminder.

Thanks!

Best Regards,

Meng

> Regards,
> Dario
> --
> <> (Raistlin Majere)
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli

-- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: >> Change repl_budget event output for xentrace formats and xenalyze >> >> Signed-off-by: Meng Xu >> > I'd say: > > Reviewed-by: Dario Faggioli > > However... > >> diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c >> index 79bdba7..19e050f 100644 >> --- a/tools/xentrace/xenalyze.c >> +++ b/tools/xentrace/xenalyze.c >> @@ -7935,23 +7935,29 @@ void sched_process(struct pcpu_info *p) >> unsigned int vcpuid:16, domid:16; >> uint64_t cur_bg; >> int delta; >> +unsigned priority_level; >> +unsigned has_extratime; >> > ...this last field is 'bool' in Xen. > > I appreciate that xenalyze does not build if you just make this bool as > well. But it does build for me, if you do that, and also include > stdbool.h, which I think is a fine thing to do. Right. I'm not sure about this. If including the stdbool.h is preferred, I can resend this one with that change. > > Anyway, I'll leave this to George and tools' maintainers. Sure! Thanks, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
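For readers following the bool discussion above, the change under debate is small. Below is a minimal stand-alone sketch, using a hypothetical cut-down stand-in for the xenalyze record (the real struct in tools/xentrace/xenalyze.c carries more fields), of what the field looks like once it is declared bool with stdbool.h included:

#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical, cut-down version of the record fields xenalyze
 * decodes; the real struct in tools/xentrace/xenalyze.c also carries
 * vcpuid/domid bitfields and the budget data.
 */
struct repl_budget_rec {
    unsigned int priority_level;
    bool has_extratime;   /* 'bool' with stdbool.h, matching Xen */
};

int main(void)
{
    struct repl_budget_rec r = { .priority_level = 1,
                                 .has_extratime = true };

    printf("priority_level=%u has_extratime=%d\n",
           r.priority_level, r.has_extratime);
    return 0;
}

With stdbool.h, a bool member assigns and prints like an int, so the surrounding decoding logic should not need further changes.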
Re: [Xen-devel] VPMU interrupt unreliability
On Thu, Oct 19, 2017 at 11:40 AM, Andrew Cooper wrote: > > On 19/10/17 16:09, Kyle Huey wrote: > > On Wed, Oct 11, 2017 at 7:09 AM, Boris Ostrovsky > > wrote: > >> On 10/10/2017 12:54 PM, Kyle Huey wrote: > >>> On Mon, Jul 24, 2017 at 9:54 AM, Kyle Huey wrote: > >>>> On Mon, Jul 24, 2017 at 8:07 AM, Boris Ostrovsky > >>>> wrote: > >>>>>>> One thing I noticed is that the workaround doesn't appear to be > >>>>>>> complete: it is only checking PMC0 status and not other counters > >>>>>>> (fixed > >>>>>>> or architectural). Of course, without knowing what the actual problem > >>>>>>> was it's hard to say whether this was intentional. > >>>>>> handle_pmc_quirk appears to loop through all the counters ... > >>>>> Right, I didn't notice that it is shifting MSR_CORE_PERF_GLOBAL_STATUS > >>>>> value one by one and so it is looking at all bits. > >>>>> > >>>>>>>> 2. Intercepting MSR loads for counters that have the workaround > >>>>>>>> applied and giving the guest the correct counter value. > >>>>>>> We'd have to keep track of whether the counter has been reset (by the > >>>>>>> quirk) since the last MSR write. > >>>>>> Yes. > >>>>>> > >>>>>>>> 3. Or perhaps even changing the workaround to disable the PMI on that > >>>>>>>> counter until the guest acks via GLOBAL_OVF_CTRL, assuming that works > >>>>>>>> on the relevant hardware. > >>>>>>> MSR_CORE_PERF_GLOBAL_OVF_CTRL is written immediately after the quirk > >>>>>>> runs (in core2_vpmu_do_interrupt()) so we already do this, don't we? > >>>>>> I'm suggesting waiting until the *guest* writes to the (virtualized) > >>>>>> GLOBAL_OVF_CTRL. > >>>>> Wouldn't it be better to wait until the counter is reloaded? > >>>> Maybe! I haven't thought through it a lot. It's still not clear to > >>>> me whether MSR_CORE_PERF_GLOBAL_OVF_CTRL actually controls the > >>>> interrupt in any way or whether it just resets the bits in > >>>> MSR_CORE_PERF_GLOBAL_STATUS and acking the interrupt on the APIC is > >>>> all that's required to reenable it. > >>>> > >>>> - Kyle > >>> I wonder if it would be reasonable to just remove the workaround > >>> entirely at some point. The set of people using 1) several year old > >>> hardware, 2) an up to date Xen, and 3) the off-by-default performance > >>> counters is probably rather small. > >> We'd probably want to only enable this for affected processors, not > >> remove it outright. But the problem is that we still don't know for sure > >> whether this issue affects NHM only, do we? > >> > >> (https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg02242.html > >> is the original message) > > Yes, the basic problem is that we don't know where to draw the line. > > vPMU is disabled by default for security reasons, Is there any document about the possible attacks via the vPMU? The documents I found (such as [1] and XSA-163) just briefly say that the vPMU should be disabled due to security concerns. [1] https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html > > and also broken, in a > way which demonstrates that vPMU isn't getting much real-world use. I also noticed that AWS seems to support part of the vPMU functionality, which was used by Netflix to optimize their applications' performance, according to http://www.brendangregg.com/blog/2017-05-04/the-pmcs-of-ec2.html . I guess the security issue has been solved by AWS? However, without knowing how the attack could be conducted, I'm not sure how AWS addresses the attack concern for the vPMU.
> > As far as I'm concerned, all options (including rm -rf and start from > scratch) are acceptable, especially if this ends up giving us a better > overall subsystem. > > Do we know how other hypervisors work around this issue? Maybe AWS's solution is an option? I'm not sure; I'm just thinking aloud. :) Thanks, Meng -- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
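As background for the quirk discussed in this thread: the overflow status register is a bitmask with one bit per counter, and the workaround walks it bit by bit. The following is a rough, self-contained sketch of that kind of scan; the counter count and handle_overflow() are made-up stand-ins, not Xen's actual handle_pmc_quirk():

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: pretend there are 7 counters in total. */
#define NUM_COUNTERS 7

static void handle_overflow(unsigned int counter)
{
    /* The real quirk resets the overflowed counter here. */
    printf("counter %u overflowed\n", counter);
}

/*
 * Walk an overflow-status bitmask one bit at a time, in the spirit
 * of the handle_pmc_quirk() loop described above.
 */
static void scan_overflow_status(uint64_t status)
{
    unsigned int i;

    for ( i = 0; i < NUM_COUNTERS; i++, status >>= 1 )
        if ( status & 1 )
            handle_overflow(i);
}

int main(void)
{
    scan_overflow_status(0x5);  /* counters 0 and 2 have overflowed */
    return 0;
}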
Re: [Xen-devel] VPMU interrupt unreliability
On Fri, Oct 20, 2017 at 3:07 AM, Jan Beulich wrote: > > >>> On 19.10.17 at 20:20, wrote: > > Is there any document about the possible attack via the vPMU? The > > document I found (such as [1] and XSA-163) just briefly say that the > > vPMU should be disabled due to security concern. > > Besides the other responses you've already got, I also recall there > being at least some CPU models that would live lock upon the > debug store being placed into virtual space not mapped by present > pages. Thank you very much for your explanation! :) Best Regards, Meng --- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: >> Change repl_budget event output for xentrace formats and xenalyze >> >> Signed-off-by: Meng Xu >> > I'd say: > > Reviewed-by: Dario Faggioli Hi guys, Just a reminder, we may need this patch for the work-conserving RTDS scheduler in Xen 4.10. I saw that Julien sent out rc2 today, which does not include this patch. Thanks and best regards, Meng --- Meng Xu Ph.D. Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Hi George, On Wed, Oct 25, 2017 at 10:31 AM, Wei Liu wrote: > > On Mon, Oct 23, 2017 at 02:50:31PM -0400, Meng Xu wrote: > > On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > > > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: > > >> Change repl_budget event output for xentrace formats and xenalyze > > >> > > >> Signed-off-by: Meng Xu > > >> > > > I'd say: > > > > > > Reviewed-by: Dario Faggioli > > > > Hi guys, > > > > Just a reminder, we may need this patch for the work-conserving RTDS > > scheduler in Xen 4.10. > > > > I saw that Julien sent out rc2 today, which does not include this patch. > > > > Thanks and best regards, > > > > I'm waiting for George's ack. Just a friendly reminder: Do you have any comments on this patch? Thanks, Meng ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 4/5] xentrace: enable per-VCPU extratime flag for RTDS
Hi all, On Tue, Oct 17, 2017 at 4:10 AM, Dario Faggioli wrote: > > On Wed, 2017-10-11 at 14:02 -0400, Meng Xu wrote: > > Change repl_budget event output for xentrace formats and xenalyze > > > > Signed-off-by: Meng Xu > > > I'd say: > > Reviewed-by: Dario Faggioli > Just a friendly reminder: This patch has not been pushed into either the staging or master branch of xen.git. This is an essential patch for the new version of the RTDS scheduler which Dario and I are maintaining. This patch won't affect other features. It has been a while, and we have not heard any complaints from the tools maintainers. Is it OK to push it? > > However... > > > diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c > > index 79bdba7..19e050f 100644 > > --- a/tools/xentrace/xenalyze.c > > +++ b/tools/xentrace/xenalyze.c > > @@ -7935,23 +7935,29 @@ void sched_process(struct pcpu_info *p) > > unsigned int vcpuid:16, domid:16; > > uint64_t cur_bg; > > int delta; > > +unsigned priority_level; > > +unsigned has_extratime; > > > ...this last field is 'bool' in Xen. > > I appreciate that xenalyze does not build if you just make this bool as > well. But it does build for me, if you do that, and also include > stdbool.h, which I think is a fine thing to do. > > Anyway, I'll leave this to George and tools' maintainers. If it turns out bool is preferred, I can change it and send out a new one. But please just let me know so that we can have a complete toolstack for the new version of the RTDS scheduler. Thanks, Meng ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] xen: RTDS: rearrange members of control structures
On Fri, Jun 23, 2017 at 6:55 AM, Dario Faggioli wrote: > > Nothing changed in `pahole` output, in terms of holes > and padding, but some fields have been moved, to put > related members in same cache line. > > Signed-off-by: Dario Faggioli > --- > Cc: Meng Xu > Cc: George Dunlap > --- > xen/common/sched_rt.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c > index 1b30014..39f6bee 100644 > --- a/xen/common/sched_rt.c > +++ b/xen/common/sched_rt.c > @@ -171,11 +171,14 @@ static void repl_timer_handler(void *data); > struct rt_private { > spinlock_t lock;/* the global coarse-grained lock */ > struct list_head sdom; /* list of availalbe domains, used for dump > */ > + > struct list_head runq; /* ordered list of runnable vcpus */ > struct list_head depletedq; /* unordered list of depleted vcpus */ > + > +struct timer *repl_timer; /* replenishment timer */ > struct list_head replq; /* ordered list of vcpus that need > replenishment */ > + > cpumask_t tickled; /* cpus been tickled */ > -struct timer *repl_timer; /* replenishment timer */ > }; > > /* > @@ -185,10 +188,6 @@ struct rt_vcpu { > struct list_head q_elem; /* on the runq/depletedq list */ > struct list_head replq_elem; /* on the replenishment events list */ > > -/* Up-pointers */ > -struct rt_dom *sdom; > -struct vcpu *vcpu; > - > /* VCPU parameters, in nanoseconds */ > s_time_t period; > s_time_t budget; > @@ -198,6 +197,10 @@ struct rt_vcpu { > s_time_t last_start; /* last start time */ > s_time_t cur_deadline; /* current deadline for EDF */ > > +/* Up-pointers */ > +struct rt_dom *sdom; > +struct vcpu *vcpu; > + > unsigned flags; /* mark __RTDS_scheduled, etc.. */ > }; > Reviewed-by: Meng Xu BTW, Dario, I'm wondering whether you used any tool to give hints about how to arrange the fields in a structure, or whether you just did it manually? Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
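On the tooling question: pahole (from the dwarves package) prints each member's offset, size, and any padding holes, which is usually enough to group hot members by hand. As a stand-alone illustration, member offsets can also be checked programmatically; the struct below is a simplified stand-in, not Xen's real rt_private:

#include <stddef.h>
#include <stdio.h>

/*
 * Stand-in for a scheduler control structure; the member names echo
 * rt_private, but the types are simplified for illustration.
 */
struct demo_priv {
    long lock;                  /* global lock */
    void *sdom;                 /* dump-only list, rarely touched */
    void *runq;                 /* hot: scheduling queues together */
    void *depletedq;
    void *repl_timer;           /* hot: replenishment members together */
    void *replq;
    unsigned long tickled;
};

int main(void)
{
    /* offsetof() shows which members can share a 64-byte cache line. */
    printf("runq @ %zu, depletedq @ %zu, repl_timer @ %zu, replq @ %zu\n",
           offsetof(struct demo_priv, runq),
           offsetof(struct demo_priv, depletedq),
           offsetof(struct demo_priv, repl_timer),
           offsetof(struct demo_priv, replq));
    return 0;
}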
[Xen-devel] [PATCH RFC v1] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving to utilize the idle resource, without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have a work conserving flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has the work conserving flag set, its priority_level will increase by 1 and its budget will be refilled; otherwise, the VCPU will be moved to the depletedq. Scheduling policy: modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_level; or (ii) v1 has the same priority_level but has a smaller deadline. Signed-off-by: Meng Xu --- xen/common/sched_rt.c | 71 ++- 1 file changed, 59 insertions(+), 12 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..740a712 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,16 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and is_work_conserving flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * A work conserving VCPU has is_work_conserving flag set to true; + * When a VCPU runs out of budget in a period, if it is work conserving, + * it increases its priority_level by 1 and refill its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +66,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -191,6 +195,7 @@ struct rt_vcpu { /* VCPU parameters, in nanoseconds */ s_time_t period; s_time_t budget; +bool_t is_work_conserving; /* is vcpu work conserving */ /* VCPU current infomation in nanosecond */ s_time_t cur_budget; /* current budget */ @@ -201,6 +206,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +252,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool_t is_work_conserving(const struct rt_vcpu *svc) +{ +return svc->is_work_conserving; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue.
@@ -273,6 +285,20 @@ vcpu_on_replq(const struct rt_vcpu *svc) return !list_empty(&svc->replq_elem); } +/* If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static int +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +if ( v1->priority_level < v2->priority_level || + ( v1->priority_level == v2->priority_level && + v1->cur_deadline <= v2->cur_deadline ) ) +return 1; +else +return -1; +} + /* * Debug related code, dump vcpu/cpu information */ @@ -303,6 +329,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," " cur_b=%"PRI_stime" cur_d=%"PRI_stime" last_start=%"PRI_stime"\n" + " \t\t priority_level=%d work_conserving=%d\n" " \t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n", svc->vcpu->domain->domain_id, svc->vcpu->vcpu_id, @@ -312,6 +339,8 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) svc->cur_budget, svc->cur_deadline, svc->last_start, +svc->priority_level, +is_work_conserving(svc), vcpu_on_q(svc), vcpu_runnable(svc->vcpu), svc->flags, @@ -423,15 +452,18 @@ rt_update_deadline(s_time_t now, struct rt_vcpu *svc) */ svc->last_start = now;
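To make the VCPU model in the commit message above concrete, here is a hedged user-space simulation of the depletion rule; the names mirror the patch, but the types and units are simplified stand-ins:

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for struct rt_vcpu; times are abstract units. */
struct vcpu_sim {
    long budget;                /* full budget per period */
    long cur_budget;            /* remaining budget */
    unsigned int priority_level;
    bool work_conserving;       /* the flag the patch adds */
    bool on_depletedq;
};

/*
 * Mirrors the depletion rule described above: a work-conserving vCPU
 * moves down one priority level (priority_level++) and is refilled;
 * a reservation-only vCPU is parked on the depleted queue.
 */
static void on_budget_depleted(struct vcpu_sim *v)
{
    if ( v->work_conserving )
    {
        v->priority_level++;
        v->cur_budget = v->budget;
    }
    else
    {
        v->cur_budget = 0;
        v->on_depletedq = true;
    }
}

int main(void)
{
    struct vcpu_sim v = { .budget = 10, .cur_budget = 0,
                          .work_conserving = true };

    on_budget_depleted(&v);
    printf("level=%u cur_budget=%ld parked=%d\n",
           v.priority_level, v.cur_budget, v.on_depletedq);
    return 0;
}

A reservation-only vCPU keeps the old RTDS behavior, while a work-conserving one trades a lower priority level for a refilled budget, which is what preserves the real-time guarantees of the level-0 vCPUs.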
[Xen-devel] [PATCH RFC v1 2/3] libxl: enable per-VCPU work conserving flag for RTDS
Modify libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set functions to support per-VCPU work conserving flag Signed-off-by: Meng Xu --- tools/libxl/libxl.h | 1 + tools/libxl/libxl_sched.c | 3 +++ tools/libxl/libxl_types.idl | 2 ++ 3 files changed, 6 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 7cf0f31..dd9c926 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -2058,6 +2058,7 @@ int libxl_sched_credit2_params_set(libxl_ctx *ctx, uint32_t poolid, #define LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT -1 #define LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT -1 #define LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT-1 +#define LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT-1 /* Per-VCPU parameters */ #define LIBXL_SCHED_PARAM_VCPU_INDEX_DEFAULT -1 diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c index faa604e..fe92747 100644 --- a/tools/libxl/libxl_sched.c +++ b/tools/libxl/libxl_sched.c @@ -558,6 +558,7 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid, for (i = 0; i < num_vcpus; i++) { scinfo->vcpus[i].period = vcpus[i].u.rtds.period; scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget; +scinfo->vcpus[i].is_work_conserving = vcpus[i].u.rtds.is_work_conserving; scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid; } rc = 0; @@ -607,6 +608,7 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid; vcpus[i].u.rtds.period = scinfo->vcpus[i].period; vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget; +vcpus[i].u.rtds.is_work_conserving = scinfo->vcpus[i].is_work_conserving; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, @@ -655,6 +657,7 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid, vcpus[i].vcpuid = i; vcpus[i].u.rtds.period = scinfo->vcpus[0].period; vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget; +vcpus[i].u.rtds.is_work_conserving = scinfo->vcpus[0].is_work_conserving; } r = xc_sched_rtds_vcpu_set(CTX->xch, domid, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 8a9849c..f6c3ead 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -401,6 +401,7 @@ libxl_sched_params = Struct("sched_params",[ ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("extratime",integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("is_work_conserving", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT'}), ]) libxl_vcpu_sched_params = Struct("vcpu_sched_params",[ @@ -414,6 +415,7 @@ libxl_domain_sched_params = Struct("domain_sched_params",[ ("cap", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), ("period", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), ("budget", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), +("is_work_conserving", integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT'}), # The following three parameters ('slice', 'latency' and 'extratime') are deprecated, # and will have no effect if used, since the SEDF scheduler has been removed. -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH RFC v1 1/3] xen:rtds: enable XL to set and get vcpu work conserving flag
Extend the hypercalls(XEN_DOMCTL_SCHEDOP_getvcpuinfo/putvcpuinfo) to get/set a domain's per-VCPU work conserving parameters. Signed-off-by: Meng Xu --- xen/common/sched_rt.c | 2 ++ xen/include/public/domctl.h | 1 + 2 files changed, 3 insertions(+) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 740a712..76ed4cb 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1442,6 +1442,7 @@ rt_dom_cntl( svc = rt_vcpu(d->vcpu[local_sched.vcpuid]); local_sched.u.rtds.budget = svc->budget / MICROSECS(1); local_sched.u.rtds.period = svc->period / MICROSECS(1); +local_sched.u.rtds.is_work_conserving = svc->is_work_conserving; spin_unlock_irqrestore(&prv->lock, flags); if ( copy_to_guest_offset(op->u.v.vcpus, index, @@ -1466,6 +1467,7 @@ rt_dom_cntl( svc = rt_vcpu(d->vcpu[local_sched.vcpuid]); svc->period = period; svc->budget = budget; +svc->is_work_conserving = local_sched.u.rtds.is_work_conserving; spin_unlock_irqrestore(&prv->lock, flags); } /* Process a most 64 vCPUs without checking for preemptions. */ diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h index ff39762..e67cd9e 100644 --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -360,6 +360,7 @@ typedef struct xen_domctl_sched_credit2 { typedef struct xen_domctl_sched_rtds { uint32_t period; uint32_t budget; +bool is_work_conserving; } xen_domctl_sched_rtds_t; typedef struct xen_domctl_schedparam_vcpu { -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH RFC v1 3/3] xl: enable per-VCPU work conserving flag for RTDS
Change main_sched_rtds and related output functions to support per-VCPU work conserving flag. Signed-off-by: Meng Xu --- tools/xl/xl_cmdtable.c | 3 ++- tools/xl/xl_sched.c| 56 ++ 2 files changed, 40 insertions(+), 19 deletions(-) diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index 30eb93c..95997e1 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = { { "sched-rtds", &main_sched_rtds, 0, 1, "Get/set rtds scheduler parameters", - "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]", + "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]] [-w[=WORKCONSERVING]]", "-d DOMAIN, --domain=DOMAIN Domain to modify\n" "-v VCPUID/all, --vcpuid=VCPUID/allVCPU to modify or output;\n" " Using '-v all' to modify/output all vcpus\n" "-p PERIOD, --period=PERIOD Period (us)\n" "-b BUDGET, --budget=BUDGET Budget (us)\n" + "-w WORKCONSERVING, --workconserving=WORKCONSERVINGWORKCONSERVING (1=yes,0=no)\n" }, { "domid", &main_domid, 0, 0, diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c index 85722fe..35a64e1 100644 --- a/tools/xl/xl_sched.c +++ b/tools/xl/xl_sched.c @@ -251,7 +251,7 @@ static int sched_rtds_domain_output( libxl_domain_sched_params scinfo; if (domid < 0) { -printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget"); +printf("%-33s %4s %9s %9s %15s\n", "Name", "ID", "Period", "Budget", "Work conserving"); return 0; } @@ -262,11 +262,12 @@ static int sched_rtds_domain_output( } domname = libxl_domid_to_name(ctx, domid); -printf("%-33s %4d %9d %9d\n", +printf("%-33s %4d %9d %9d %15d\n", domname, domid, scinfo.period, -scinfo.budget); +scinfo.budget, +scinfo.is_work_conserving); free(domname); libxl_domain_sched_params_dispose(&scinfo); return 0; @@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %15s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Work conserving"); return 0; } @@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo) domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %15d\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budget, + scinfo->vcpus[i].is_work_conserving ); } free(domname); return 0; @@ -309,8 +311,8 @@ static int sched_rtds_vcpu_output_all(int domid, int i; if (domid < 0) { -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", - "VCPU", "Period", "Budget"); +printf("%-33s %4s %4s %9s %9s %15s\n", "Name", "ID", + "VCPU", "Period", "Budget", "Work conserving"); return 0; } @@ -321,12 +323,13 @@ static int sched_rtds_vcpu_output_all(int domid, domname = libxl_domid_to_name(ctx, domid); for ( i = 0; i < scinfo->num_vcpus; i++ ) { -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %15d\n", domname, domid, scinfo->vcpus[i].vcpuid, scinfo->vcpus[i].period, - scinfo->vcpus[i].budget); + scinfo->vcpus[i].budget, + scinfo->vcpus[i].is_work_conserving); } free(domname); return 0; @@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv) int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs that change */ int *periods = (int *)xmalloc(sizeof(int)); /* period is in microsecond */ int *budgets = (int *)xmalloc(si
[Xen-devel] [PATCH RFC v1 0/3] Enable XL to set and get per-VCPU work conserving flag for RTDS scheduler
This series of patches enable the toolstack to set and get per-VCPU work-conserving flag. With the toolstack, system administrators can decide which VCPUs will be made work-conserving. The design of the work-conserving RTDS was discussed in https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html We plan to perform two steps in making RTDS scheduler work-conserving: (1) First make all VCPUs work-conserving by default, which was sent as a separate patch. This work aims for Xen 4.10 release. (2) After that, we enable the XL to set and get per-VCPU work-conserving flag, which is this series of patches. Signed-off-by: Meng Xu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 0/3] Enable XL to set and get per-VCPU work conserving flag for RTDS scheduler
On Tue, Aug 1, 2017 at 2:33 PM, Meng Xu wrote: > > This series of patches enable the toolstack to > set and get per-VCPU work-conserving flag. > With the toolstack, system administrators can decide > which VCPUs will be made work-conserving. > > The design of the work-conserving RTDS was discussed in > https://www.mail-archive.com/xen-devel@lists.xen.org/msg77150.html > > We plan to perform two steps in making RTDS scheduler work-conserving: > (1) First make all VCPUs work-conserving by default, > which was sent as a separate patch. This work aims for Xen 4.10 release. > (2) After that, we enable the XL to set and get per-VCPU work-conserving flag, > which is this series of patches. The series of patches that have both steps done can be found at the following repo: https://github.com/PennPanda/RT-Xen under the branch xenbits/rtds/work-conserving-RFCv1. Thanks, Meng Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v4] xen: rtds: only tickle non-already tickled CPUs
When more than one idle VCPUs that have the same PCPU as their previous running core invoke runq_tickle(), they will tickle the same PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the highest-priority one, to execute. The other VCPUs will not be scheduled for a period, even when there is an idle core, making these VCPUs unnecessarily starve for one period. Therefore, always make sure that we only tickle PCPUs that have not been tickled already. Signed-off-by: Haoran Li Signed-off-by: Meng Xu --- The initial discussion of this patch can be found at https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02857.html Changes in v4: 1) Take Dario's suggestions: Search the new->cpu first for the cpu to tickle. This get rid of the if statement in previous versions. 2) Reword the comments and commit messages. 3) Rebased on staging branch. Issues in v2 and v3: Did not rebase on the latest staging branch. Did not solve the comments/issues in v1. Please ignore the v2 and v3. --- xen/common/sched_rt.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..5fec95f 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1147,9 +1147,9 @@ rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc) * Called by wake() and context_saved() * We have a running candidate here, the kick logic is: * Among all the cpus that are within the cpu affinity - * 1) if the new->cpu is idle, kick it. This could benefit cache hit - * 2) if there are any idle vcpu, kick it. - * 3) now all pcpus are busy; + * 1) if there are any idle vcpu, kick it. + For cache benefit,we first search new->cpu. + * 2) now all pcpus are busy; *among all the running vcpus, pick lowest priority one *if snext has higher priority, kick it. * @@ -1177,17 +1177,13 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) cpumask_and(¬_tickled, online, new->vcpu->cpu_hard_affinity); cpumask_andnot(¬_tickled, ¬_tickled, &prv->tickled); -/* 1) if new's previous cpu is idle, kick it for cache benefit */ -if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) ) -{ -SCHED_STAT_CRANK(tickled_idle_cpu); -cpu_to_tickle = new->vcpu->processor; -goto out; -} - -/* 2) if there are any idle pcpu, kick it */ -/* The same loop also find the one with lowest priority */ -for_each_cpu(cpu, ¬_tickled) +/* + * 1) If there are any idle vcpu, kick it. + *For cache benefit,we first search new->cpu. + *The same loop also find the one with lowest priority. + */ +cpu = cpumask_test_or_cycle(new->vcpu->processor, ¬_tickled); +while ( cpu!= nr_cpu_ids ) { iter_vc = curr_on_cpu(cpu); if ( is_idle_vcpu(iter_vc) ) @@ -1200,9 +1196,12 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) if ( latest_deadline_vcpu == NULL || iter_svc->cur_deadline > latest_deadline_vcpu->cur_deadline ) latest_deadline_vcpu = iter_svc; + +cpumask_clear_cpu(cpu, ¬_tickled); +cpu = cpumask_cycle(cpu, ¬_tickled); } -/* 3) candicate has higher priority, kick out lowest priority vcpu */ +/* 2) candicate has higher priority, kick out lowest priority vcpu */ if ( latest_deadline_vcpu != NULL && new->cur_deadline < latest_deadline_vcpu->cur_deadline ) { -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v5] xen: rtds: only tickle non-already tickled CPUs
When more than one idle VCPUs that have the same PCPU as their previous running core invoke runq_tickle(), they will tickle the same PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the highest-priority one, to execute. The other VCPUs will not be scheduled for a period, even when there is an idle core, making these VCPUs unnecessarily starve for one period. Therefore, always make sure that we only tickle PCPUs that have not been tickled already. Signed-off-by: Haoran Li Signed-off-by: Meng Xu Reviewed-by: Dario Faggioli --- The initial discussion of this patch can be found at https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02857.html Changes in v5: Revise comments as Dario suggested Changes in v4: 1) Take Dario's suggestions: Search the new->cpu first for the cpu to tickle. This get rid of the if statement in previous versions. 2) Reword the comments and commit messages. 3) Rebased on staging branch. Issues in v2 and v3: Did not rebase on the latest staging branch. Did not solve the comments/issues in v1. Please ignore the v2 and v3. --- xen/common/sched_rt.c | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..0ac5816 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1147,9 +1147,9 @@ rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc) * Called by wake() and context_saved() * We have a running candidate here, the kick logic is: * Among all the cpus that are within the cpu affinity - * 1) if the new->cpu is idle, kick it. This could benefit cache hit - * 2) if there are any idle vcpu, kick it. - * 3) now all pcpus are busy; + * 1) if there are any idle CPUs, kick one. + For cache benefit, we check new->cpu as first + * 2) now all pcpus are busy; *among all the running vcpus, pick lowest priority one *if snext has higher priority, kick it. * @@ -1177,17 +1177,13 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) cpumask_and(¬_tickled, online, new->vcpu->cpu_hard_affinity); cpumask_andnot(¬_tickled, ¬_tickled, &prv->tickled); -/* 1) if new's previous cpu is idle, kick it for cache benefit */ -if ( is_idle_vcpu(curr_on_cpu(new->vcpu->processor)) ) -{ -SCHED_STAT_CRANK(tickled_idle_cpu); -cpu_to_tickle = new->vcpu->processor; -goto out; -} - -/* 2) if there are any idle pcpu, kick it */ -/* The same loop also find the one with lowest priority */ -for_each_cpu(cpu, ¬_tickled) +/* + * 1) If there are any idle CPUs, kick one. + *For cache benefit,we first search new->cpu. + *The same loop also find the one with lowest priority. + */ +cpu = cpumask_test_or_cycle(new->vcpu->processor, ¬_tickled); +while ( cpu!= nr_cpu_ids ) { iter_vc = curr_on_cpu(cpu); if ( is_idle_vcpu(iter_vc) ) @@ -1200,9 +1196,12 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new) if ( latest_deadline_vcpu == NULL || iter_svc->cur_deadline > latest_deadline_vcpu->cur_deadline ) latest_deadline_vcpu = iter_svc; + +cpumask_clear_cpu(cpu, ¬_tickled); +cpu = cpumask_cycle(cpu, ¬_tickled); } -/* 3) candicate has higher priority, kick out lowest priority vcpu */ +/* 2) candicate has higher priority, kick out lowest priority vcpu */ if ( latest_deadline_vcpu != NULL && new->cur_deadline < latest_deadline_vcpu->cur_deadline ) { -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
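The loop above leans on Xen's cpumask helpers. As a rough plain-C illustration of the same search pattern (next_cpu() is a simplified stand-in for cpumask_test_or_cycle()/cpumask_cycle(), and the mask width is made up), the scan starts at the vCPU's previous CPU for cache warmth and then cycles through the remaining non-tickled CPUs:

#include <stdio.h>

#define NR_CPUS 8

/*
 * Rough stand-in for cpumask_test_or_cycle()/cpumask_cycle(): return
 * 'start' if it is set in 'mask', otherwise the next set bit after it
 * (wrapping around), or NR_CPUS if none is left.
 */
static int next_cpu(unsigned int mask, int start)
{
    int i;

    if ( mask & (1u << start) )
        return start;
    for ( i = 1; i < NR_CPUS; i++ )
    {
        int cpu = (start + i) % NR_CPUS;

        if ( mask & (1u << cpu) )
            return cpu;
    }
    return NR_CPUS;
}

int main(void)
{
    unsigned int not_tickled = 0xA4;  /* CPUs 2, 5 and 7 not tickled yet */
    int prev = 3;                     /* new->vcpu->processor */
    int cpu;

    /* Same shape as the patch's loop: start at the vCPU's previous
     * CPU, then clear each visited CPU and cycle to the next one. */
    for ( cpu = next_cpu(not_tickled, prev); cpu != NR_CPUS;
          not_tickled &= ~(1u << cpu), cpu = next_cpu(not_tickled, cpu) )
        printf("considering cpu %d\n", cpu);
    return 0;
}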
Re: [Xen-devel] [PATCH RFC v1] xen:rtds: towards work conserving RTDS
On Wed, Aug 2, 2017 at 1:46 PM, Dario Faggioli wrote: > Hey, Meng! > > It's really cool to see progress on this... There was quite a bit of > interest in scheduling in general at the Summit in Budapest, and one > important thing for making sure RTDS will be really useful, is for it > to have a work conserving mode! :-) Glad to hear that. :-) > > On Tue, 2017-08-01 at 14:13 -0400, Meng Xu wrote: >> Make RTDS scheduler work conserving to utilize the idle resource, >> without breaking the real-time guarantees. > > Just kill the "to utilize the idle resource". We can expect that people > that are interested in this commit, also know what 'work conserving' > means. :-) Got it. Will do. > >> VCPU model: >> Each real-time VCPU is extended to have a work conserving flag >> and a priority_level field. >> When a VCPU's budget is depleted in the current period, >> if it has work conserving flag set, >> its priority_level will increase by 1 and its budget will be >> refilled; >> otherwise, the VCPU will be moved to the depletedq. >> > Mmm... Ok. But is the budget burned, while the vCPU executes at > priority_level 1? If yes, doesn't this mean we risk having less budget > when we get back to priority_level 0? > > Oh, wait, maybe it's the case that, when we get back to priority_level > 0, we also get another replenishment, is that the case? If yes, I > actually think it's fine... It's the latter case: the vcpu will get another replenishment when it gets back to priority_level 0. > >> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c >> index 39f6bee..740a712 100644 >> --- a/xen/common/sched_rt.c >> +++ b/xen/common/sched_rt.c >> @@ -191,6 +195,7 @@ struct rt_vcpu { >> /* VCPU parameters, in nanoseconds */ >> s_time_t period; >> s_time_t budget; >> +bool_t is_work_conserving; /* is vcpu work conserving */ >> >> /* VCPU current infomation in nanosecond */ >> s_time_t cur_budget; /* current budget */ >> @@ -201,6 +206,8 @@ struct rt_vcpu { >> struct rt_dom *sdom; >> struct vcpu *vcpu; >> >> +unsigned priority_level; >> + >> unsigned flags; /* mark __RTDS_scheduled, etc.. */ >> > So, since we've got a 'flags' field already, can the flag be one of its > bit, instead of adding a new bool in the struct: > > /* > * RTDS_work_conserving: Can the vcpu run in the time that is > * not part of any real-time reservation, and would therefore > * be otherwise left idle? > */ > __RTDS_work_conserving 4 > #define RTDS_work_conserving (1<<__RTDS_work_conserving) Thank you very much for the suggestion! I will modify the patch accordingly. Actually, I was not very comfortable with the is_work_conserving field either. It makes the structure verbose and messes up the struct's cache-line alignment. > >> @@ -245,6 +252,11 @@ static inline struct list_head *rt_replq(const >> struct scheduler *ops) >> return &rt_priv(ops)->replq; >> } >> >> +static inline bool_t is_work_conserving(const struct rt_vcpu *svc) >> +{ >> > Use bool. OK. > >> @@ -273,6 +285,20 @@ vcpu_on_replq(const struct rt_vcpu *svc) >> return !list_empty(&svc->replq_elem); >> } >> >> +/* If v1 priority >= v2 priority, return value > 0 >> + * Otherwise, return value < 0 >> + */ >> > Comment style. Got it. Will make it as: /* * If v1 priority >= v2 priority, return value > 0 * Otherwise, return value < 0 */ > > Apart from that, do you want this to return >0 if v1 should have > priority over v2, and <0 if vice-versa, right? If yes... Yes.
> >> +static int >> +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu >> *v2) >> +{ >> +if ( v1->priority_level < v2->priority_level || >> + ( v1->priority_level == v2->priority_level && >> + v1->cur_deadline <= v2->cur_deadline ) ) >> +return 1; >> +else >> +return -1; >> > int prio = v2->priority_level - v1->priority_level; > > if ( prio == 0 ) > return v2->cur_deadline - v1->cur_deadline; > > return prio; > > Return type has to become s_time_t, and there's a chance that it'll > return 0, if they are at the same level, and have the same absolute > deadline. But I think you can deal with this in the caller. OK. Will do. > >> @@ -966,8 +1001,16 @@ burn_budget(const struct scheduler *
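A small stand-alone harness for the comparator shape suggested in the review, assuming s_time_t is a signed 64-bit type as it is in Xen (the struct is a stand-in for rt_vcpu); it exercises the sign convention, including the tie case the caller has to break:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t s_time_t;   /* assumption: signed time type, as in Xen */

/* Simplified stand-in for the scheduling-relevant part of rt_vcpu. */
struct vc {
    unsigned int priority_level;
    s_time_t cur_deadline;
};

/*
 * The suggested comparator shape: > 0 means v1 has higher priority,
 * < 0 means v2 does, and 0 is a tie the caller must break. The casts
 * avoid unsigned wrap-around in the level difference.
 */
static s_time_t compare_vcpu_priority(const struct vc *v1, const struct vc *v2)
{
    int prio = (int)v2->priority_level - (int)v1->priority_level;

    if ( prio == 0 )
        return v2->cur_deadline - v1->cur_deadline;

    return prio;
}

int main(void)
{
    struct vc a = { .priority_level = 0, .cur_deadline = 100 };
    struct vc b = { .priority_level = 1, .cur_deadline = 50 };
    struct vc c = { .priority_level = 0, .cur_deadline = 100 };

    assert(compare_vcpu_priority(&a, &b) > 0);  /* lower level wins */
    assert(compare_vcpu_priority(&b, &a) < 0);
    assert(compare_vcpu_priority(&a, &c) == 0); /* tie is possible */
    printf("comparator behaves as described\n");
    return 0;
}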
Re: [Xen-devel] [PATCH RFC v1 0/3] Enable XL to set and get per-VCPU work conserving flag for RTDS scheduler
On Wed, Aug 2, 2017 at 1:49 PM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> This series of patches enable the toolstack to >> set and get per-VCPU work-conserving flag. >> With the toolstack, system administrators can decide >> which VCPUs will be made work-conserving. >> > Thanks for this series as well, Meng. I'll look at it in the next > couple of days. >> >> We plan to perform two steps in making RTDS scheduler work- >> conserving: >> (1) First make all VCPUs work-conserving by default, >> which was sent as a separate patch. This work aims for Xen 4.10 >> release. >> (2) After that, we enable the XL to set and get per-VCPU work- >> conserving flag, >> which is this series of patches. >> > I think it's better if you merge the "xen:rtds: towards work conserving > RTDS" as patch 1 of this series. > > In fact, sending them as separate series, you make people think that > they're independent, while they're not (as this series is pretty > useless, without that patch :-P). Sure. I can do that. :) Thanks, Meng Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 1/3] xen:rtds: enable XL to set and get vcpu work conserving flag
On Thu, Aug 3, 2017 at 11:47 AM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> --- a/xen/include/public/domctl.h >> +++ b/xen/include/public/domctl.h >> @@ -360,6 +360,7 @@ typedef struct xen_domctl_sched_credit2 { >> typedef struct xen_domctl_sched_rtds { >> uint32_t period; >> uint32_t budget; >> +bool is_work_conserving; >> > I wonder whether it wouldn't be better (e.g., more future proof) to > have a 'uint32_T flags' field here too. > > That way, if/when, in future, we want to introduce some other way of > tweaking the scheduler's behavior for this vCPU, we already have space > for specifying it... > uint32_t flag sounds reasonable to me. I can do it in the next version. Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
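A brief sketch of what the flags-based variant could look like; the bit name follows the one the later v1 series uses, while the bit position and the surrounding struct are illustrative assumptions:

#include <stdint.h>
#include <stdio.h>

/* Sketch only: the bit position here is an assumption. */
#define XEN_DOMCTL_SCHED_RTDS_extratime (1u << 0)

/* Simplified stand-in for xen_domctl_sched_rtds. */
struct rtds_params {
    uint32_t period;
    uint32_t budget;
    uint32_t flags;   /* leaves room for future per-vCPU knobs */
};

int main(void)
{
    struct rtds_params p = { .period = 10000, .budget = 4000, .flags = 0 };

    p.flags |= XEN_DOMCTL_SCHED_RTDS_extratime;            /* set */
    printf("extratime=%d\n",
           !!(p.flags & XEN_DOMCTL_SCHED_RTDS_extratime)); /* test */
    p.flags &= ~XEN_DOMCTL_SCHED_RTDS_extratime;           /* clear */
    printf("extratime=%d\n",
           !!(p.flags & XEN_DOMCTL_SCHED_RTDS_extratime));
    return 0;
}

Compared with a bool member, a flags word keeps the domctl structure's layout stable when further per-vCPU knobs are added later.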
Re: [Xen-devel] [PATCH RFC v1 2/3] libxl: enable per-VCPU work conserving flag for RTDS
On Thu, Aug 3, 2017 at 11:53 AM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> diff --git a/tools/libxl/libxl_types.idl >> b/tools/libxl/libxl_types.idl >> index 8a9849c..f6c3ead 100644 >> --- a/tools/libxl/libxl_types.idl >> +++ b/tools/libxl/libxl_types.idl >> @@ -401,6 +401,7 @@ libxl_sched_params = Struct("sched_params",[ >> ("period", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), >> ("extratime",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), >> ("budget", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), >> +("is_work_conserving", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_IS_WORK_CONSERVING_DEFAULT'}), >> ]) >> > How about, here at libxl level, we use the "extratime" field that we > have as a leftover from SEDF (and which had, in that scheduler, a > similar meaning)? > > If we don't want to use that one, and we want a new field, I suggest > thinking to a shorter name. How about 'LIBXL_DOMAIN_SCHED_PARAM_FLAG'? We use a bit in the flag field in the sched_rt.c to indicate if a VCPU is work-conserving. The flag field is also extensible for adding other VCPU properties in the future, if necessary. Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 3/3] xl: enable per-VCPU work conserving flag for RTDS
On Thu, Aug 3, 2017 at 12:03 PM, Dario Faggioli wrote: > On Tue, 2017-08-01 at 14:33 -0400, Meng Xu wrote: >> --- a/tools/xl/xl_cmdtable.c >> +++ b/tools/xl/xl_cmdtable.c >> @@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = { >> { "sched-rtds", >>&main_sched_rtds, 0, 1, >>"Get/set rtds scheduler parameters", >> - "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]", >> + "[-d [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]] >> [-w[=WORKCONSERVING]]", >>"-d DOMAIN, --domain=DOMAIN Domain to modify\n" >>"-v VCPUID/all, --vcpuid=VCPUID/allVCPU to modify or >> output;\n" >>" Using '-v all' to modify/output all vcpus\n" >>"-p PERIOD, --period=PERIOD Period (us)\n" >>"-b BUDGET, --budget=BUDGET Budget (us)\n" >> + "-w WORKCONSERVING, -- >> workconserving=WORKCONSERVINGWORKCONSERVING (1=yes,0=no)\n" >> > Does this really need to accept a 1 or 0 parameter? Can't it be that, > if -w is provided, the vCPU is marked as work-conserving, if it's not, > it's considered reservation only. > >> --- a/tools/xl/xl_sched.c >> +++ b/tools/xl/xl_sched.c >> >> @@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, >> libxl_vcpu_sched_params *scinfo) >> int i; >> >> if (domid < 0) { >> -printf("%-33s %4s %4s %9s %9s\n", "Name", "ID", >> - "VCPU", "Period", "Budget"); >> +printf("%-33s %4s %4s %9s %9s %15s\n", "Name", "ID", >> + "VCPU", "Period", "Budget", "Work conserving"); >> return 0; >> } >> >> @@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, >> libxl_vcpu_sched_params *scinfo) >> >> domname = libxl_domid_to_name(ctx, domid); >> for ( i = 0; i < scinfo->num_vcpus; i++ ) { >> -printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n", >> +printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %15d\n", >> > As far as printing it goes, OTOH, I would indeed print a string, i.e., > "yes", if the field is found to be 1 (true), or "no", if the field is > found to be 0 (false). > >> @@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv) >> int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs that >> change */ >> int *periods = (int *)xmalloc(sizeof(int)); /* period is in >> microsecond */ >> int *budgets = (int *)xmalloc(sizeof(int)); /* budget is in >> microsecond */ >> +int *workconservings = (int *)xmalloc(sizeof(int)); /* budget is >> in microsecond */ >> > Yeah, budget is in microseconds. But this is not budget! :-P Ah, my bad... > > In fact (jokes apart), it can be just a bool, can't it? Yes, bool is enough. Is "workconserving" too long here? I thought about alternative names, such as "wc", "workc", and "extratime". None of them is good enough. The ideal one should be much shorter and easy to link to "work conserving". :( If we use "extratime", it may cause confusion with the "extratime" in the deprecated SEDF. (That is my concern about reusing the EXTRATIME in libxl_types.idl.) Maybe "workc" is better than "workconserving"? Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 3/3] xl: enable per-VCPU work conserving flag for RTDS
On Fri, Aug 4, 2017 at 5:01 AM, Dario Faggioli wrote: > On Thu, 2017-08-03 at 18:02 -0400, Meng Xu wrote: >> On Thu, Aug 3, 2017 at 12:03 PM, Dario Faggioli >> wrote: >> > >> > > @@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv) >> > > int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs >> > > that >> > > change */ >> > > int *periods = (int *)xmalloc(sizeof(int)); /* period is in >> > > microsecond */ >> > > int *budgets = (int *)xmalloc(sizeof(int)); /* budget is in >> > > microsecond */ >> > > +int *workconservings = (int *)xmalloc(sizeof(int)); /* >> > > budget is >> > > in microsecond */ >> > > >> > >> > Yeah, budget is in microseconds. But this is not budget! :-P >> >> Ah, my bad.. >> >> > >> > In fact (jokes apart), it can be just a bool, can't it? >> >> Yes, bool is enough. >> Is "workconserving" too long here? >> > So, I don't want to turn this into a discussion about what colour we > should paint the infamous bikeshed... but, yeah, I don't especially > like this name! :-P > > An I mean, not only here, but everywhere you've used it (changelogs, > other patches, etc.). > > There are two reasons for that: > - it's indeed very long; > - being work conserving is (or at least, I've always heard it used >and used it myself) a characteristic of a scheduling algorithm (or >of its implementation), *not* of a task/vcpu/schedulable entity. Fair enough. I agree work conserving is not a good name. > >It is the scheduler that is work conserving, iff it never let CPUs >sit idle, when there is work to do. In our case here, the scheduler >is work conserving if all the vCPUs has this flag set. It's not, >if even just one has it clear. > >And by putting workconserving-ness at the vCPU level, it looks to >me that we're doing something terminologically wrong, and >potentially confusing. > > I didn't bring this up before, because I'm a bit afraid that it's just > be being picky... but since you mentioned this yourself. > >> I thought about alternative names, such as "wc", "workc", and >> "extratime". None of them is good enough. >> > Yep, I agree that contractions like 'wc' or 'workc' are pretty bad. > 'extratime', I'd actually like it better, TBH. > >> The ideal one should be much >> shorter and easy to link to "work conserving". :( >> If we use "extratime", it may cause confusion with the "extratime" in >> the depreciated SEDF. (That is my concern of reusing the EXTRATIME in >> the libxl_type.idl.) >> > Well, but SEDF being gone (and since quite a few time), and the fact > that RTDS and SEDF have not really never been there together, does > leave very few room for confusion, I think. > > While in academia (e.g., in the GRUB == Gready Reclaming of Unused > Bandwidth papers), what you're trying to achieved, I've heard it called > 'reclaiming' (as I'm sure you have as well :-)), and my friends that > are still working on Linux, are actually using it in there: > > https://lkml.org/lkml/2017/5/18/1128 > https://lkml.org/lkml/2017/5/18/1137 <-- SCHED_FLAG_RECLAIM > > I'm not so sure about it... As I'm not sure the meaning would appear > obvious, to people not into RT scheduling research. > > And even from this point of view, 'extratime' seems a lot better to me. > And if it were me doing this, I'd probably use it, both in the > internals and in the interface. > I'm deciding between reclaim and extratime. I will use extratime, since extratime is already in libxl. extratime means the VCPU will get extra time; it is up to the scheduler to determine how much extra time it will get.
Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1 2/3] libxl: enable per-VCPU work conserving flag for RTDS
On Fri, Aug 4, 2017 at 10:34 AM, Wei Liu wrote: > On Fri, Aug 04, 2017 at 02:53:51PM +0200, Dario Faggioli wrote: >> On Fri, 2017-08-04 at 13:10 +0100, Wei Liu wrote: >> > On Fri, Aug 04, 2017 at 10:13:18AM +0200, Dario Faggioli wrote: >> > > On Thu, 2017-08-03 at 17:39 -0400, Meng Xu wrote: >> > > > >> > > *HOWEVER*, in this case, we do have that 'extratime' field already, >> > > as >> > > a leftover from SEDF, which is there taking space and cluttering >> > > the >> > > interface, so why don't make good use of it. Especially considering >> > > it >> > > was used for _exactly_ the same thing, and with _exactly_ the same >> > > meaning, and even for a very similar (i.e., SEDF was also real- >> > > time) >> > > kind of scheduler. >> > >> > Correct me if I'm wrong: >> > >> > 1. extratime is ever only used in SEDF >> > 2. SEDF is removed >> > >> > That means we do have extratime to use in all other schedulers. >> > >> I'm not sure what you mean with this last line. >> >> IAC, this is how our the related data structures looks like, right now: >> >> libxl_sched_params = Struct("sched_params",[ >> ("vcpuid", integer, {'init_val': >> 'LIBXL_SCHED_PARAM_VCPU_INDEX_DEFAULT'}), >> ("weight", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}), >> ("cap", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), >> ("period", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), >> ("extratime",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), >> ("budget", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), >> ]) >> >> The extratime field is there. Any scheduler can use it, if it wants >> (and in the way it wants). Currently, no one of them does that. > > Right, that's what I wanted to know. > >> >> libxl_domain_sched_params = Struct("domain_sched_params",[ >> ("sched",libxl_scheduler), >> ("weight", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT'}), >> ("cap", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_CAP_DEFAULT'}), >> ("period", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT'}), >> ("budget", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}), >> >> # The following three parameters ('slice', 'latency' and 'extratime') >> are deprecated, >> # and will have no effect if used, since the SEDF scheduler has been >> removed. >> # Note that 'period' was an SDF parameter too, but it is still effective >> as it is >> # now used (together with 'budget') by the RTDS scheduler. >> ("slice",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT'}), >> ("latency", integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_LATENCY_DEFAULT'}), >> ("extratime",integer, {'init_val': >> 'LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT'}), >> ]) >> >> Same here. 'slice', 'latency' and 'extratime' are there because we >> deprecate, but don't remove stuff. They're not used in any way. [*] >> >> If, at some point, I'd decide to develop a feature for, say Credit2, >> that controll the latency (whatever that would mean, it's just an >> example! :-D) of domains, I think I'll use this 'latency' field, for >> its interface, instead of adding some other stuff. >> >> > However, please consider the possibility of reintroducing SEDF in the >> > future. Suppose that would happen, does extratime still has the same >> > semantics? >> > >> Well, I guess yes. But how does this matter? Each scheduler can, if it >> wants, use all these parameters in the way it actuallly prefers. 
So, >> the fact that RTDS will be using 'extratime' for letting vCPUs execute >> past their own real-time reservation, does not prevent the reintroduced >> SEDF --nor any other already existing or new scheduler-- to also use >> it, for similar (or maybe even not so similar) purposes. >> >> Or am I missing something? > > If extratime means different things to different schedulers, it's going > to be confusing. As a layperson I can't tell what extratime is or how it > is supposed to be used. I would like to have the field to have only one > meaning. Right now, extratime is not used by any scheduler. It was used in SEDF only. Since RTDS is the first scheduler to use extratime after SEDF was deprecated, if we use it, it has only one meaning: if extratime is non-zero, it indicates the VCPU will get extra time. I guess I lean toward using extratime in RTDS now. Best, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1] xen:rtds: towards work conserving RTDS
> >> @@ -966,8 +1001,16 @@ burn_budget(const struct scheduler *ops, struct >> rt_vcpu *svc, s_time_t now) >> >> if ( svc->cur_budget <= 0 ) >> { >> -svc->cur_budget = 0; >> -__set_bit(__RTDS_depleted, &svc->flags); >> +if ( is_work_conserving(svc) ) >> +{ >> +svc->priority_level++; >> >ASSERT(svc->priority_level <= 1); I'm sorry I didn't see this suggestion in the previous email. I don't think this assert makes sense. A vcpu that has extratime can have priority_level > 1. For example, a VCPU (period = 100ms, budget = 10ms) runs alone on a core. The VCPU may get its budget replenished 9 times in a period, so the vcpu's priority_level may be 9. The priority_level here also indicates how many times the VCPU gets extra budget in the current period. > >> +svc->cur_budget = svc->budget; >> +} >> +else >> + { >> +svc->cur_budget = 0; >> +__set_bit(__RTDS_depleted, &svc->flags); >> +} >> } Thanks, Meng --- Meng Xu PhD Candidate in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
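The bound in the counter-example follows directly from the arithmetic; a trivial check with the numbers used above:

#include <stdio.h>

int main(void)
{
    /*
     * The counter-example above: a lone vCPU with period 100ms and
     * budget 10ms fits period/budget = 10 budget chunks in one period,
     * i.e. up to 9 refills after the initial replenishment, so
     * priority_level can legitimately reach 9.
     */
    long period = 100, budget = 10;          /* in ms */
    long max_level = period / budget - 1;

    printf("max priority_level in one period: %ld\n", max_level);
    return 0;
}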
[Xen-devel] [PATCH v1 1/3] xen:rtds: towards work conserving RTDS
Make RTDS scheduler work conserving without breaking the real-time guarantees. VCPU model: Each real-time VCPU is extended to have an extratime flag and a priority_level field. When a VCPU's budget is depleted in the current period, if it has the extratime flag set, its priority_level will increase by 1 and its budget will be refilled; otherwise, the VCPU will be moved to the depletedq. Scheduling policy is modified global EDF: A VCPU v1 has higher priority than another VCPU v2 if (i) v1 has smaller priority_level; or (ii) v1 has the same priority_level but has a smaller deadline. Queue management: Run queue holds VCPUs with extratime flag set and VCPUs with remaining budget. Run queue is sorted in increasing order of VCPU priorities. Depleted queue holds VCPUs which have extratime flag cleared and depleted budget. Replenished queue is not modified. Signed-off-by: Meng Xu --- Changes from RFC v1 Rewording comments and commit message Remove is_work_conserving field from rt_vcpu structure Use one bit in VCPU's flag to indicate if a VCPU will have extra time Correct comments style --- xen/common/sched_rt.c | 90 ++--- xen/include/public/domctl.h | 3 ++ 2 files changed, 79 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 39f6bee..4e048b9 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -49,13 +49,15 @@ * A PCPU is feasible if the VCPU can run on this PCPU and (the PCPU is idle or * has a lower-priority VCPU running on it.) * - * Each VCPU has a dedicated period and budget. + * Each VCPU has a dedicated period, budget and a extratime flag * The deadline of a VCPU is at the end of each period; * A VCPU has its budget replenished at the beginning of each period; * While scheduled, a VCPU burns its budget. * The VCPU needs to finish its budget before its deadline in each period; * The VCPU discards its unused budget at the end of each period. - * If a VCPU runs out of budget in a period, it has to wait until next period. + * When a VCPU runs out of budget in a period, if its extratime flag is set, + * the VCPU increases its priority_level by 1 and refills its budget; otherwise, + * it has to wait until next period. * * Each VCPU is implemented as a deferable server. * When a VCPU has a task running on it, its budget is continuously burned; @@ -63,7 +65,8 @@ * * Queue scheme: * A global runqueue and a global depletedqueue for each CPU pool. - * The runqueue holds all runnable VCPUs with budget, sorted by deadline; + * The runqueue holds all runnable VCPUs with budget, + * sorted by priority_level and deadline; * The depletedqueue holds all VCPUs without budget, unsorted; * * Note: cpumask and cpupool is supported. @@ -151,6 +154,14 @@ #define RTDS_depleted (1<<__RTDS_depleted) /* + * RTDS_extratime: Can the vcpu run in the time that is + * not part of any real-time reservation, and would therefore + * be otherwise left idle? + */ +#define __RTDS_extratime4 +#define RTDS_extratime (1<<__RTDS_extratime) + +/* * rt tracing events ("only" 512 available!). Check * include/public/trace.h for more details. */ @@ -201,6 +212,8 @@ struct rt_vcpu { struct rt_dom *sdom; struct vcpu *vcpu; +unsigned priority_level; + unsigned flags; /* mark __RTDS_scheduled, etc.. */ }; @@ -245,6 +258,11 @@ static inline struct list_head *rt_replq(const struct scheduler *ops) return &rt_priv(ops)->replq; } +static inline bool has_extratime(const struct rt_vcpu *svc) +{ +return (svc->flags & RTDS_extratime) ?
1 : 0; +} + /* * Helper functions for manipulating the runqueue, the depleted queue, * and the replenishment events queue. @@ -274,6 +292,21 @@ vcpu_on_replq(const struct rt_vcpu *svc) } /* + * If v1 priority >= v2 priority, return value > 0 + * Otherwise, return value < 0 + */ +static s_time_t +compare_vcpu_priority(const struct rt_vcpu *v1, const struct rt_vcpu *v2) +{ +int prio = v2->priority_level - v1->priority_level; + +if ( prio == 0 ) +return v2->cur_deadline - v1->cur_deadline; + +return prio; +} + +/* * Debug related code, dump vcpu/cpu information */ static void @@ -303,6 +336,7 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk("[%5d.%-2u] cpu %u, (%"PRI_stime", %"PRI_stime")," " cur_b=%"PRI_stime" cur_d=%"PRI_stime" last_start=%"PRI_stime"\n" + " \t\t priority_level=%d has_extratime=%d\n" " \t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n", svc->vcpu->domain->domain_id, svc->vcpu->vcpu_id, @@ -312,6 +346,8 @@ rt_dump_vcpu(const str
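To make the modified policy concrete, here is a minimal, self-contained
user-space sketch of the two mechanisms the patch adds: the
priority_level bump on budget depletion, and the level-then-deadline
comparison. The type and names are illustrative stand-ins, not the
hypervisor's code:

    /*
     * Toy model of the work-conserving RTDS behaviour:
     *  - on budget depletion, a vCPU with the extratime flag bumps its
     *    priority_level and gets a fresh budget instead of being parked;
     *  - the runqueue comparison orders by priority_level first, then EDF.
     */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define DEMO_EXTRATIME (1u << 4)    /* stand-in for RTDS_extratime */

    struct demo_vcpu {
        unsigned priority_level;  /* 0 while inside the reservation */
        unsigned flags;
        int64_t  cur_budget;      /* remaining budget, us */
        int64_t  budget;          /* per-period budget, us */
        int64_t  cur_deadline;    /* absolute deadline, us */
    };

    /* > 0 when v1 should run before v2: lower level wins, then earlier deadline. */
    static int64_t compare_priority(const struct demo_vcpu *v1,
                                    const struct demo_vcpu *v2)
    {
        int prio = (int)v2->priority_level - (int)v1->priority_level;
        return prio ? prio : v2->cur_deadline - v1->cur_deadline;
    }

    /* Returns true if the vCPU stays runnable after exhausting its budget. */
    static bool on_budget_depleted(struct demo_vcpu *v)
    {
        if (!(v->flags & DEMO_EXTRATIME))
            return false;             /* would move to the depleted queue */
        v->priority_level++;          /* now below every in-reservation vCPU */
        v->cur_budget = v->budget;    /* refill and keep running */
        return true;
    }

    int main(void)
    {
        struct demo_vcpu a = { .priority_level = 0, .flags = DEMO_EXTRATIME,
                               .cur_budget = 0, .budget = 4000, .cur_deadline = 100 };
        struct demo_vcpu b = { .priority_level = 0, .flags = 0,
                               .cur_budget = 2000, .budget = 2000, .cur_deadline = 900 };

        on_budget_depleted(&a);  /* a keeps running, but at level 1 */

        /* b, still inside its reservation, beats a despite a's earlier deadline. */
        printf("%s runs first\n", compare_priority(&b, &a) > 0 ? "b" : "a");
        return 0;
    }

This is why the real-time guarantees are preserved: extra time is only
handed out at a priority level strictly below every vCPU that still has
reservation budget left.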
[Xen-devel] [PATCH v1 2/3] libxl: enable per-VCPU extratime flag for RTDS
Modify the libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set
functions to support the per-VCPU extratime flag.

Signed-off-by: Meng Xu

---
Changes from RFC v1:
Change work_conserving flag to extratime flag
---
 tools/libxl/libxl_sched.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/libxl/libxl_sched.c b/tools/libxl/libxl_sched.c
index faa604e..4ebed96 100644
--- a/tools/libxl/libxl_sched.c
+++ b/tools/libxl/libxl_sched.c
@@ -558,6 +558,10 @@ static int sched_rtds_vcpu_get_all(libxl__gc *gc, uint32_t domid,
     for (i = 0; i < num_vcpus; i++) {
         scinfo->vcpus[i].period = vcpus[i].u.rtds.period;
         scinfo->vcpus[i].budget = vcpus[i].u.rtds.budget;
+        if ( vcpus[i].u.rtds.flags & XEN_DOMCTL_SCHED_RTDS_extratime )
+            scinfo->vcpus[i].extratime = 1;
+        else
+            scinfo->vcpus[i].extratime = 0;
         scinfo->vcpus[i].vcpuid = vcpus[i].vcpuid;
     }
     rc = 0;
@@ -607,6 +611,10 @@ static int sched_rtds_vcpu_set(libxl__gc *gc, uint32_t domid,
         vcpus[i].vcpuid = scinfo->vcpus[i].vcpuid;
         vcpus[i].u.rtds.period = scinfo->vcpus[i].period;
         vcpus[i].u.rtds.budget = scinfo->vcpus[i].budget;
+        if ( scinfo->vcpus[i].extratime )
+            vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHED_RTDS_extratime;
+        else
+            vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHED_RTDS_extratime;
     }
 
     r = xc_sched_rtds_vcpu_set(CTX->xch, domid,
@@ -655,6 +663,10 @@ static int sched_rtds_vcpu_set_all(libxl__gc *gc, uint32_t domid,
         vcpus[i].vcpuid = i;
         vcpus[i].u.rtds.period = scinfo->vcpus[0].period;
         vcpus[i].u.rtds.budget = scinfo->vcpus[0].budget;
+        if ( scinfo->vcpus[0].extratime )
+            vcpus[i].u.rtds.flags |= XEN_DOMCTL_SCHED_RTDS_extratime;
+        else
+            vcpus[i].u.rtds.flags &= ~XEN_DOMCTL_SCHED_RTDS_extratime;
     }
 
     r = xc_sched_rtds_vcpu_set(CTX->xch, domid,
-- 
1.9.1

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
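For context, a hedged sketch of how a program linking libxl might drive
the new field through this interface. libxl_vcpu_sched_params_set() and
the IDL-generated init helpers are the existing per-VCPU interface this
patch extends, but the exact setup/teardown discipline and error
handling below are simplified assumptions, not a verified client:

    /*
     * Sketch only: give vCPU 0 of a domain a 10ms/4ms reservation and
     * allow it to consume otherwise-idle time (extratime). Assumes the
     * libxl_vcpu_sched_params_set() call that sched_rtds_vcpu_set()
     * above sits behind; ctx creation and cleanup are elided.
     */
    #include <libxl.h>

    int give_vcpu0_extratime(libxl_ctx *ctx, uint32_t domid)
    {
        libxl_vcpu_sched_params scinfo;
        libxl_sched_params vcpu;

        libxl_vcpu_sched_params_init(&scinfo);
        libxl_sched_params_init(&vcpu);

        vcpu.vcpuid = 0;
        vcpu.period = 10000;   /* us */
        vcpu.budget = 4000;    /* us */
        vcpu.extratime = 1;    /* mapped to XEN_DOMCTL_SCHED_RTDS_extratime above */

        scinfo.sched = LIBXL_SCHEDULER_RTDS;
        scinfo.num_vcpus = 1;
        scinfo.vcpus = &vcpu;  /* stack storage, so no dispose here */

        return libxl_vcpu_sched_params_set(ctx, domid, &scinfo);
    }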
[Xen-devel] [PATCH v1 3/3] xl: enable per-VCPU extratime flag for RTDS
Change main_sched_rtds and related output functions to support the
per-VCPU extratime flag.

Signed-off-by: Meng Xu

---
Changes from RFC v1:
Change work_conserving flag to extratime flag
---
 tools/xl/xl_cmdtable.c |  3 ++-
 tools/xl/xl_sched.c    | 56 ++
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 2c71a9f..88933a4 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -272,12 +272,13 @@ struct cmd_spec cmd_table[] = {
     { "sched-rtds",
       &main_sched_rtds, 0, 1,
       "Get/set rtds scheduler parameters",
-      "[-d <Domain> [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]]]",
+      "[-d <Domain> [-v[=VCPUID/all]] [-p[=PERIOD]] [-b[=BUDGET]] [-e[=EXTRATIME]]]",
       "-d DOMAIN, --domain=DOMAIN     Domain to modify\n"
       "-v VCPUID/all, --vcpuid=VCPUID/all    VCPU to modify or output;\n"
       "               Using '-v all' to modify/output all vcpus\n"
       "-p PERIOD, --period=PERIOD     Period (us)\n"
       "-b BUDGET, --budget=BUDGET     Budget (us)\n"
+      "-e EXTRATIME, --extratime=EXTRATIME    EXTRATIME (1=yes, 0=no)\n"
     },
     { "domid",
       &main_domid, 0, 0,
diff --git a/tools/xl/xl_sched.c b/tools/xl/xl_sched.c
index 85722fe..5138012 100644
--- a/tools/xl/xl_sched.c
+++ b/tools/xl/xl_sched.c
@@ -251,7 +251,7 @@ static int sched_rtds_domain_output(
     libxl_domain_sched_params scinfo;
 
     if (domid < 0) {
-        printf("%-33s %4s %9s %9s\n", "Name", "ID", "Period", "Budget");
+        printf("%-33s %4s %9s %9s %10s\n", "Name", "ID", "Period", "Budget", "Extra time");
         return 0;
     }
 
@@ -262,11 +262,12 @@ static int sched_rtds_domain_output(
     }
 
     domname = libxl_domid_to_name(ctx, domid);
-    printf("%-33s %4d %9d %9d\n",
+    printf("%-33s %4d %9d %9d %10s\n",
         domname,
         domid,
         scinfo.period,
-        scinfo.budget);
+        scinfo.budget,
+        scinfo.extratime ? "yes" : "no");
     free(domname);
     libxl_domain_sched_params_dispose(&scinfo);
     return 0;
@@ -279,8 +280,8 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo)
     int i;
 
     if (domid < 0) {
-        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID",
-               "VCPU", "Period", "Budget");
+        printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID",
+               "VCPU", "Period", "Budget", "Extra time");
         return 0;
     }
 
@@ -290,12 +291,13 @@ static int sched_rtds_vcpu_output(int domid, libxl_vcpu_sched_params *scinfo)
     domname = libxl_domid_to_name(ctx, domid);
     for ( i = 0; i < scinfo->num_vcpus; i++ ) {
-        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n",
+        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n",
                domname,
                domid,
                scinfo->vcpus[i].vcpuid,
                scinfo->vcpus[i].period,
-               scinfo->vcpus[i].budget);
+               scinfo->vcpus[i].budget,
+               scinfo->vcpus[i].extratime ? "yes" : "no");
     }
     free(domname);
     return 0;
@@ -309,8 +311,8 @@ static int sched_rtds_vcpu_output_all(int domid,
     int i;
 
     if (domid < 0) {
-        printf("%-33s %4s %4s %9s %9s\n", "Name", "ID",
-               "VCPU", "Period", "Budget");
+        printf("%-33s %4s %4s %9s %9s %10s\n", "Name", "ID",
+               "VCPU", "Period", "Budget", "Extra time");
         return 0;
     }
 
@@ -321,12 +323,13 @@ static int sched_rtds_vcpu_output_all(int domid,
     domname = libxl_domid_to_name(ctx, domid);
     for ( i = 0; i < scinfo->num_vcpus; i++ ) {
-        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32"\n",
+        printf("%-33s %4d %4d %9"PRIu32" %9"PRIu32" %10s\n",
                domname,
                domid,
                scinfo->vcpus[i].vcpuid,
                scinfo->vcpus[i].period,
-               scinfo->vcpus[i].budget);
+               scinfo->vcpus[i].budget,
+               scinfo->vcpus[i].extratime ? "yes" : "no");
     }
     free(domname);
     return 0;
@@ -702,14 +705,18 @@ int main_sched_rtds(int argc, char **argv)
     int *vcpus = (int *)xmalloc(sizeof(int)); /* IDs of VCPUs that change */
     int *perio
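With the series applied, command-line usage would look like the
following, per the option table above (the domain name and parameter
values are illustrative; sample output is omitted rather than guessed):

    xl sched-rtds -d vm1 -v all -p 10000 -b 4000 -e 1   # 10ms period, 4ms budget, extratime on, all vCPUs
    xl sched-rtds -d vm1 -v 1 -p 20000 -b 5000 -e 0     # strict reservation for vCPU 1 only
    xl sched-rtds -d vm1 -v all                         # list per-vCPU Period/Budget/Extra time

Note that -e takes 1 or 0 rather than acting as a boolean switch, which
keeps it symmetric with reading the flag back in the "Extra time"
column.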