From: jamal <[EMAIL PROTECTED]>
Date: Fri, 06 Jul 2007 10:39:15 -0400
> If the issue is usability of listing 1024 netdevices, i can think of
> many ways to resolve it.
I would agree with this if there were a reason for it; it's a totally
unnecessary complication as far as I can see.
These virtual
On Fri, 2007-07-06 at 10:39 -0400, jamal wrote:
> The first thing that crossed my mind was "if you want to select a
> destination port based on a destination MAC you are talking about a
> switch/bridge". You bring up the issue of "a huge number of virtual NICs
> if you wanted arbitrary guests" whic
jamal wrote:
If the issue is usability of listing 1024 netdevices, i can think of
many ways to resolve it.
One way we can resolve the listing is with a simple tag to the netdev
struct i could say "list netdevices for guest 0-10" etc etc.
This would be a useful feature, not only for virtualizati
On Fri, 2007-06-07 at 17:32 +1000, Rusty Russell wrote:
[..some good stuff deleted here ..]
> Hope that adds something,
It does - thanks.
I think i was letting my experience pollute my thinking earlier when
Dave posted. The copy-avoidance requirement is clear to me[1].
I had another issue wh
On Tue, 2007-07-03 at 22:20 -0400, jamal wrote:
> On Tue, 2007-03-07 at 14:24 -0700, David Miller wrote:
> [.. some useful stuff here deleted ..]
>
> > That's why you have to copy into a purpose-built set of memory
> > that is composed of pages that _ONLY_ contain TX packet buffers
> > and nothing
On Tue, 2007-03-07 at 14:24 -0700, David Miller wrote:
[.. some useful stuff here deleted ..]
> That's why you have to copy into a purpose-built set of memory
> that is composed of pages that _ONLY_ contain TX packet buffers
> and nothing else.
>
> The cost of going through the switch is too high
From: jamal <[EMAIL PROTECTED]>
Date: Tue, 03 Jul 2007 08:42:33 -0400
> (likely not in the case of hypervisor based virtualization like Xen)
> just have their skbs cloned when crossing domains, is that not the
> case?[1]
> Assuming they copy, the balance that needs to be struck now is
> between:
On Sat, 2007-30-06 at 13:33 -0700, David Miller wrote:
> It's like twice as fast, since the switch doesn't have to copy
> the packet in, switch it, then the destination guest copies it
> into its address space.
>
> There is approximately one copy for each hop you go over through these
> virtual
From: jamal <[EMAIL PROTECTED]>
Date: Sat, 30 Jun 2007 10:52:44 -0400
> On Fri, 2007-29-06 at 21:35 -0700, David Miller wrote:
>
> > Awesome, but let's concentrate on the client since I can actually
> > implement and test anything we come up with :-)
>
> Ok, you need to clear one premise for me
On Fri, 2007-29-06 at 21:35 -0700, David Miller wrote:
> Awesome, but let's concentrate on the client since I can actually
> implement and test anything we come up with :-)
Ok, you need to clear one premise for me then ;->
You said the model is for the guest/client to have a port to the
host
> It would be great if we could finally get a working e1000
> multiqueue patch so work in this area can actually be tested.
I'm actively working on this right now. I'm on vacation next week, but
hopefully I can get something working before I leave OLS and post it.
-PJ
David Miller wrote:
> Now I get to pose a problem for everyone, prove to me how useful
> this new code is by showing me how it can be used to solve a
> recurring problem in virtualized network drivers of which I've
> had to code one up recently, see my most recent blog entry at:
>
> http://
> "DM" == David Miller <[EMAIL PROTECTED]> writes:
DM> And some people still use hubs, believe it or not.
Hubs are 100Mbps at most. You could of course make a flooding Gbps
switch, but it would be rather silly. If you care about multicast
performance, you get a switch with IGMP snooping.
/B
From: jamal <[EMAIL PROTECTED]>
Date: Fri, 29 Jun 2007 21:30:53 -0400
> On Fri, 2007-29-06 at 14:31 -0700, David Miller wrote:
> > Maybe for the control node switch, yes, but not for the guest network
> > devices.
>
> And that is precisely what i was talking about - and i am sure that's how
> the
On Fri, 2007-29-06 at 14:31 -0700, David Miller wrote:
> This conversation begins to go into a pointless direction already, as
> I feared it would.
>
> Nobody is going to configure bridges, classification, tc, and all of
> this other crap just for a simple virtualized guest networking device.
>
>
From: Ben Greear <[EMAIL PROTECTED]>
Date: Fri, 29 Jun 2007 08:33:06 -0700
> Patrick McHardy wrote:
> > Right, but the current bridging code always uses promiscuous mode
> > and it's nice to avoid that if possible. Looking at the code, it
> > should be easy to avoid though by disabling learning (and
This conversation begins to go into a pointless direction already, as
I feared it would.
Nobody is going to configure bridges, classification, tc, and all of
this other crap just for a simple virtualized guest networking device.
It's a confined and well defined case that doesn't need any of that
Patrick McHardy wrote:
Ben Greear wrote:
Could someone give a quick example of when I am wrong and promisc mode
would allow
a NIC to receive a significant number of packets not really destined for
it?
In a switched environment it won't have a big effect, I agree.
It might help avoid r
Ben Greear wrote:
> Patrick McHardy wrote:
>
>> Right, but the current bridging code always uses promiscuous mode
>> and it's nice to avoid that if possible. Looking at the code, it
>> should be easy to avoid though by disabling learning (and thus
>> promiscuous mode) and adding unicast filters for al
Patrick McHardy wrote:
Right, but the current bridging code always uses promiscuous mode
and it's nice to avoid that if possible. Looking at the code, it
should be easy to avoid though by disabling learning (and thus
promiscuous mode) and adding unicast filters for all static fdb entries.
I am cur
On Fri, 2007-29-06 at 15:08 +0200, Patrick McHardy wrote:
> jamal wrote:
> > On Fri, 2007-29-06 at 13:59 +0200, Patrick McHardy wrote:
> Right, but the current bridging code always uses promiscuous mode
> and it's nice to avoid that if possible.
> Looking at the code, it
> should be easy to avoid t
jamal wrote:
> On Fri, 2007-29-06 at 13:59 +0200, Patrick McHardy wrote:
>
>
>>The difference to a real bridge is that
>>all addresses are completely known in advance, so it doesn't need
>>promiscuous mode for learning.
>
>
> You mean the per-virtual MAC addresses are known in advance, right
On Fri, 2007-29-06 at 13:59 +0200, Patrick McHardy wrote:
> I'm guessing that that wouldn't allow unicast filtering for
> the guests on the real device without hacking the bridge code for
> this special case.
For ingress (i guess you could say for egress as well): we can do it as
well toda
jamal wrote:
> On Thu, 2007-28-06 at 21:20 -0700, David Miller wrote:
>
>>Each guest gets a unique MAC address. There is a queue per-port
>>that can fill up.
>>
>>What all the drivers like this do right now is stop the queue if
>>any of the per-port queues fill up, and that's why my sunvnet
>>dri
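The per-port behaviour described in the quote above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct vdev`, `struct port`, `vdev_init`, `vdev_xmit`, `vdev_tx_complete`, the ring size and the MAC addresses are all hypothetical names and values chosen for illustration, not code from sunvnet or any posted patch. It shows one possible policy in which a full per-port queue stops only that port, so traffic to other guests keeps flowing.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_PORTS 4   /* hypothetical number of guest ports */
#define RING_SIZE 2   /* hypothetical per-port queue depth */
#define ETH_ALEN  6

/* One transmit queue per guest port, keyed by the guest's MAC address. */
struct port {
    unsigned char mac[ETH_ALEN];
    int pending;      /* packets queued but not yet completed */
    bool stopped;     /* true while this port's queue is full */
};

struct vdev {
    struct port ports[NUM_PORTS];
};

/* Give port i the locally administered address 02:00:00:00:00:i. */
static void vdev_init(struct vdev *dev)
{
    memset(dev, 0, sizeof(*dev));
    for (int i = 0; i < NUM_PORTS; i++) {
        dev->ports[i].mac[0] = 0x02;
        dev->ports[i].mac[ETH_ALEN - 1] = (unsigned char)i;
    }
}

/* Pick the port whose MAC matches the destination, or -1 if none does. */
static int select_port(const struct vdev *dev, const unsigned char *dst)
{
    for (int i = 0; i < NUM_PORTS; i++)
        if (memcmp(dev->ports[i].mac, dst, ETH_ALEN) == 0)
            return i;
    return -1;
}

/*
 * Queue one packet. Returns 0 on success, -1 for an unknown destination,
 * -2 if the destination port is stopped. Only the full port is stopped.
 */
static int vdev_xmit(struct vdev *dev, const unsigned char *dst)
{
    int p = select_port(dev, dst);

    if (p < 0)
        return -1;
    if (dev->ports[p].stopped)
        return -2;
    dev->ports[p].pending++;
    if (dev->ports[p].pending == RING_SIZE)
        dev->ports[p].stopped = true;
    return 0;
}

/* The peer consumed one packet from port p: free a slot and wake the port. */
static void vdev_tx_complete(struct vdev *dev, int p)
{
    if (dev->ports[p].pending > 0)
        dev->ports[p].pending--;
    dev->ports[p].stopped = false;
}
```

With a single device-wide stop flag instead of the per-port `stopped` field, one slow guest would block transmission to every other guest, which is the problem the quoted message describes.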
I've changed the topic for you, friend - otherwise most people won't follow
(as you've said a few times yourself ;->).
On Thu, 2007-28-06 at 21:20 -0700, David Miller wrote:
> Now I get to pose a problem for everyone, prove to me how useful
> this new code is by showing me how it can be used to solv
> Ok everything is checked into net-2.6.23, thanks everyone.
Dave, thank you for your patience and feedback on this whole process.
Patrick and everyone else, thank you for your feedback and assistance.
I am looking at your posed virtualization question, but I need sleep
since I just remembered I'
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Thu, 28 Jun 2007 21:24:37 +0200
> Waskiewicz Jr, Peter P wrote:
> >>[...]
> >>The only reasonable thing it can do is not care about
> >>multiqueue and just dequeue as usual. In fact I think it
> >>should be an error to configure multiqueue on a non
> Waskiewicz Jr, Peter P wrote:
> >>[...]
> >>The only reasonable thing it can do is not care about
> multiqueue and
> >>just dequeue as usual. In fact I think it should be an error to
> >>configure multiqueue on a non-root qdisc.
> >
> >
> > Ack. This is a thought process that trips me up fr
Waskiewicz Jr, Peter P wrote:
>>[...]
>>The only reasonable thing it can do is not care about
>>multiqueue and just dequeue as usual. In fact I think it
>>should be an error to configure multiqueue on a non-root qdisc.
>
>
> Ack. This is a thought process that trips me up from time to time...I
> Absolutely not. First of all, it's perfectly valid to use
> non-multiqueue qdiscs on multiqueue devices. Secondly, it's
> only the root qdisc that has to know about multiqueue since
> that one controls the child qdiscs.
>
> Think about it, it makes absolutely no sense to have the
> child qdisc
Waskiewicz Jr, Peter P wrote:
>>PJ Waskiewicz wrote:
>>
>>>+#ifdef CONFIG_NET_SCH_MULTIQUEUE
>>>+if (q->mq)
>>>+skb->queue_mapping =
>>>+
>>
>>q->prio2band[band&TC_PRIO_MAX];
>>
>>>+else
> PJ Waskiewicz wrote:
> > +#ifdef CONFIG_NET_SCH_MULTIQUEUE
> > + if (q->mq)
> > + skb->queue_mapping =
> > +
> q->prio2band[band&TC_PRIO_MAX];
> > + else
> > + skb->
PJ Waskiewicz wrote:
> +#ifdef CONFIG_NET_SCH_MULTIQUEUE
> + if (q->mq)
> + skb->queue_mapping =
> + q->prio2band[band&TC_PRIO_MAX];
> + else
> + skb->queue_m
Waskiewicz Jr, Peter P wrote:
> Thanks for fixing; however, the current sch_prio doesn't unregister the
> qdisc if register_qdisc() on prio fails, or does that happen implicitly
> because the module will probably unload?
It failed, there's nothing to unregister. But when you register two
qdiscs a
> It's not error handling. You do:
>
> err = register qdisc 1
> if (err)
>         return err;
> err = register qdisc 2
> if (err)
>         unregister qdisc 1
> return err
>
> anyways, I already fixed that and cleaned up prio_classify
> the way I suggested. Will send shortly.
Thanks for fixing; ho
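The two-step registration being discussed can be modelled outside the kernel as follows. This is a minimal sketch, not the posted patch: `struct ops`, `register_ops`, `unregister_ops` and `init_both` are hypothetical stand-ins for the qdisc ops structure, `register_qdisc()`, `unregister_qdisc()` and the module init function, and the `fail` flag exists only to simulate a registration error. The point it illustrates is that when the second registration fails, the first one has to be rolled back before the error is returned.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a qdisc ops structure. */
struct ops {
    const char *id;
    bool registered;
    bool fail;        /* simulate a failing registration */
};

/* Stand-in for register_qdisc(): 0 on success, negative on error. */
static int register_ops(struct ops *o)
{
    if (o->fail)
        return -1;
    o->registered = true;
    return 0;
}

/* Stand-in for unregister_qdisc(). */
static void unregister_ops(struct ops *o)
{
    o->registered = false;
}

/* Register both; if the second fails, undo the first so nothing leaks. */
static int init_both(struct ops *first, struct ops *second)
{
    int err;

    err = register_ops(first);
    if (err)
        return err;
    err = register_ops(second);
    if (err)
        unregister_ops(first);
    return err;
}
```

A failed registration leaves nothing behind for the failing entry itself, so only the entry that did succeed needs to be unregistered.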
Patrick McHardy wrote:
> PJ Waskiewicz wrote:
>
>> +
>> static int __init prio_module_init(void)
>> {
>> -return register_qdisc(&prio_qdisc_ops);
>> +int err;
>> +err = register_qdisc(&prio_qdisc_ops);
>> +if (!err)
>> +err = register_qdisc(&rr_qdisc_ops);
>> +return
Waskiewicz Jr, Peter P wrote:
>>PJ Waskiewicz wrote:
>>
>>
>>>+
>>> static int __init prio_module_init(void) {
>>>-return register_qdisc(&prio_qdisc_ops);
>>>+int err;
>>>+err = register_qdisc(&prio_qdisc_ops);
>>>+if (!err)
>>>+err = register_qdisc(&rr_qdisc_ops);
>>>+
> PJ Waskiewicz wrote:
>
> > +
> > static int __init prio_module_init(void) {
> > - return register_qdisc(&prio_qdisc_ops);
> > + int err;
> > + err = register_qdisc(&prio_qdisc_ops);
> > + if (!err)
> > + err = register_qdisc(&rr_qdisc_ops);
> > + return err;
> > }
> >
>
PJ Waskiewicz wrote:
+
static int __init prio_module_init(void)
{
- return register_qdisc(&prio_qdisc_ops);
+ int err;
+ err = register_qdisc(&prio_qdisc_ops);
+ if (!err)
+ err = register_qdisc(&rr_qdisc_ops);
+ return err;
}
That's still broken
> That's not necessary. I just thought you could add one exit point:
>
>
> ...
> out:
> skb->queue_mapping = q->mq ? band : 0;
> return q->queues[band];
> }
>
> But if that doesn't work don't bother ..
Unfortunately it won't, given how band might be used like this to select
the queue:
re
Waskiewicz Jr, Peter P wrote:
@@ -70,14 +72,28 @@ prio_classify(struct sk_buff *skb, struct Qdisc
*sch, int *qerr) #endif
if (TC_H_MAJ(band))
band = 0;
+ if (q->mq)
+skb->queue_mapping =
+
q->prio2b
> > @@ -70,14 +72,28 @@ prio_classify(struct sk_buff *skb, struct Qdisc
> > *sch, int *qerr) #endif
> > if (TC_H_MAJ(band))
> > band = 0;
> > + if (q->mq)
> > + skb->queue_mapping =
> > +
> PJ Waskiewicz wrote:
> > + /* If we're multiqueue, make sure the number of incoming bands
> > +* matches the number of queues on the device we're
> associating with.
> > +*/
> > + if (tb[TCA_PRIO_MQ - 1])
> > + q->mq = *(unsigned char *)RTA_DATA(tb[TCA_PRIO_MQ - 1]);
> > +
Waskiewicz Jr, Peter P wrote:
And RTA_PUT_FLAG. Now that I think of it, does it even make
sense to have a prio private flag for this instead of a qdisc
global one?
There currently aren't any other qdiscs that are natural fits for
multiqueue that I can see. I can see the benefit though
> > enum
> > {
> > - TCA_PRIO_UNPSEC,
> > - TCA_PRIO_TEST,
>
>
> You misunderstood me. You can work on top of my compat
> attribute patches, but the example code should not have to go
> in to apply your patch.
Ok. I'll fix my patches.
> > diff --git a/net/sched/Kconfig b/net/sched/Kcon
PJ Waskiewicz wrote:
> + /* If we're multiqueue, make sure the number of incoming bands
> + * matches the number of queues on the device we're associating with.
> + */
> + if (tb[TCA_PRIO_MQ - 1])
> + q->mq = *(unsigned char *)RTA_DATA(tb[TCA_PRIO_MQ - 1]);
> +
> +
PJ Waskiewicz wrote:
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 09808b7..ec3a9a5 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -103,8 +103,8 @@ struct tc_prio_qopt
>
> enum
> {
> - TCA_PRIO_UNPSEC,
> - TCA_PRIO_TEST,
> Patrick McHardy wrote:
> > Waskiewicz Jr, Peter P wrote:
> >
> >>Thought about this more last night and this morning. As
> far as I can
> >>tell, I still need this. If the qdisc gets loaded with multiqueue
> >>turned on, I can just use the value of band to assign
> >>skb->queue_mapping. Bu
Patrick McHardy wrote:
> Waskiewicz Jr, Peter P wrote:
>
>>Thought about this more last night and this morning. As far as I can
>>tell, I still need this. If the qdisc gets loaded with multiqueue
>>turned on, I can just use the value of band to assign
>>skb->queue_mapping. But if the qdisc is l
Patrick McHardy wrote:
> void skb_set_queue_mapping(struct sk_buff *skb, unsigned int queue)
> {
> #ifdef CONFIG_NET_SCH_MULTIQUEUE
> skb->queue_mapping = queue;
> #else
> skb->queue_mapping = 0;
> #endif
Maybe even use it everywhere and guard skb->queue_mapping by
an #ifdef, on 32 bi
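The accessor idea quoted above, together with the suggestion to guard the field itself, might look roughly like this in a standalone form. It is a sketch under assumptions: `DEMO_MULTIQUEUE` stands in for CONFIG_NET_SCH_MULTIQUEUE, and `struct demo_skb`, `demo_set_queue_mapping` and `demo_get_queue_mapping` are hypothetical names, not kernel definitions. When the option is off, the field does not exist at all and callers always see queue 0.

```c
#include <stddef.h>

/* Stand-in for CONFIG_NET_SCH_MULTIQUEUE; remove to build the
 * single-queue variant. */
#define DEMO_MULTIQUEUE

/* Hypothetical cut-down packet descriptor. */
struct demo_skb {
    size_t len;
#ifdef DEMO_MULTIQUEUE
    unsigned short queue_mapping;  /* only present with multiqueue */
#endif
};

/* All writers go through this helper, so they need no #ifdef of their own. */
static inline void demo_set_queue_mapping(struct demo_skb *skb,
                                          unsigned short queue)
{
#ifdef DEMO_MULTIQUEUE
    skb->queue_mapping = queue;
#else
    (void)skb;
    (void)queue;                   /* single queue: nothing to record */
#endif
}

/* Readers likewise; without multiqueue every packet maps to queue 0. */
static inline unsigned short demo_get_queue_mapping(const struct demo_skb *skb)
{
#ifdef DEMO_MULTIQUEUE
    return skb->queue_mapping;
#else
    (void)skb;
    return 0;
#endif
}
```

Keeping the conditional inside the two helpers means the rest of the code compiles unchanged in both configurations.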
Waskiewicz Jr, Peter P wrote:
>>> #include
>>>@@ -40,9 +42,13 @@
>>> struct prio_sched_data
>>> {
>>> int bands;
>>>+#ifdef CONFIG_NET_SCH_RR
>>>+int curband; /* for round-robin */
>>>+#endif
>>> struct tcf_proto *filter_list;
>>> u8 prio2band[TC_PRIO_MAX+1];
>>> struct Qdisc
> > #include
> > @@ -40,9 +42,13 @@
> > struct prio_sched_data
> > {
> > int bands;
> > +#ifdef CONFIG_NET_SCH_RR
> > + int curband; /* for round-robin */
> > +#endif
> > struct tcf_proto *filter_list;
> > u8 prio2band[TC_PRIO_MAX+1];
> > struct Qdisc *queues[TCQ_PRIO_BANDS];
Waskiewicz Jr, Peter P wrote:
The dependencies seem to be very confused. SCHED_PRIO does
not depend on anything new, SCH_RR also doesn't depend on
anything. SCH_PRIO_MQ and SCH_RR_MQ (which is missing) depend
on SCH_PRIO/SCH_RR. A single NET_SCH_MULTIQUEUE option seems
better than adding one p
> The dependencies seem to be very confused. SCHED_PRIO does
> not depend on anything new, SCH_RR also doesn't depend on
> anything. SCH_PRIO_MQ and SCH_RR_MQ (which is missing) depend
> on SCH_PRIO/SCH_RR. A single NET_SCH_MULTIQUEUE option seems
> better than adding one per scheduler though.
PJ Waskiewicz wrote:
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 475df84..ca0b352 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -102,8 +102,16 @@ config NET_SCH_ATM
To compile this code as a module, choose M here: the
module will be called sch_atm.
+