jamal wrote:
> On Tue, 2006-20-06 at 16:45 +0200, Patrick McHardy wrote:
> 
>>Actually in the PPPoE case Linux doesn't know about ethernet
>>headers either, since shaping is usually done on the PPP device.
>>But that doesn't really matter since the ethernet link is not
>>the bottleneck - although it does add some delay for packetization.
> 
> 
> good point. But one could argue that is within linux (local) as opposed
> to something downstream at the ISP i.e. i have knowledge of it and i
> could do clever things. The other is: I have to know that the ISP is
> using pigeons as the link layer downstream and compensate for it.
> 
> The issue really is whether Linux should be interested in the
> throughput it is told about or the goodput (also known as effective
> throughput) the service provider offers. Two different issues by
> definition. 


In the case of PPPoE, non-work-conserving qdiscs are already used
to manage a link that is non-local, with knowledge of its bandwidth,
unlike a local link, which would be best managed in work-conserving
mode. And I think that for better accuracy it is necessary to manage
effective throughput, especially if you're interested in guaranteed
delays.

>>>Yes, Linux can't tell if your service provider is lying to you.
>>
>>I wouldn't call it lying as long as they don't say "1.5mbps IP
>>layer throughput". 
> 
> 
> It is a scam for sure.
> By definition of what throughput is - you are telling the truth; just
> not the whole truth. Most users think in terms of goodput and not
> throughput. 
> i.e. you are not telling the whole truth by not saying "it is 1.5Mbps ATM
> throughput". Typically not an issue until somebody finds that by leaving
> out "ATM" you meant throughput and not goodput.


I think that point can be used to argue in favour of Linux being able
to manage effective throughput :)

>>Ethernet doesn't provide 100mbit IP layer
>>throughput either, and with minimum sized IP packets it's actually
>>well below that.
>
> 
> OTOH, nobody has ethernet MTUs of 64 bytes.


Sure, but I might not want my HFSC class with a guaranteed delay of 140us
to be disturbed by someone sending small packets that need more time
on the wire than HFSC assumes.
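
To put a number on it: with the simplified Ethernet model used further
below (frames padded to the 60 byte minimum plus a 4 byte FCS; preamble
and inter-frame gap ignored), a 40 byte packet occupies a 100 Mbit/s
link for 5.12us instead of the 3.2us its IP length suggests. A rough
stand-alone illustration, not meant as kernel code:

#include <stdio.h>

/* Simplified Ethernet wire length: pad to the 60 byte minimum and
 * add the 4 byte FCS; preamble and inter-frame gap are ignored. */
static unsigned int eth_wire_len(unsigned int len)
{
        if (len < 60)
                len = 60;
        return len + 4;
}

int main(void)
{
        unsigned int len = 40; /* e.g. a bare TCP ACK */

        printf("assumed: %.2f us, on the wire: %.2f us\n",
               len * 8 / 100.0,                 /* bits / (100 bits/us) */
               eth_wire_len(len) * 8 / 100.0);
        return 0;
}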

> To be academic and pedantic: The schedulers should be focusing on
> throughput and not goodput.
> Look at it from another angle related to the nature of the link layer
> used:
> If I buy a 1.5 Mbps 802.11JHS link (such a link layer technology doesn't
> exist, but assume for the sake of argument it does) from a wireless
> service provider, ethernet headers etc - but in this case the link is so
> bad (because of the link layer technology) that I have to retransmit so
> much that 0.5 Mbps is wasted on retransmits, the question becomes: 
> 1) Do I fix the scheduler to compensate for this link layer retransmit?
> or
> 2) Do I find some other creative way to tell the scheduler, without
> making any changes to it, that my ftp (despite the retransmits)
> should only chew 100Kbps?
> 
> I am saying that #2 is the choice to go with, hence my assertion earlier
> that it should be fine to tell the scheduler all it has is 1Mbps and
> nobody gets hurt. #1 only if I could do it with minimal intrusion and
> still get to use it when I have 802.11g.
> 
> Not sure I made sense.

HFSC is actually capable of handling this quite well. If you use it
in work-conserving mode (and the card doesn't do (much) internal
queueing) it will get clocked by successful transmissions. Using
link-sharing classes you can define proportions for use of available
bandwidth, possibly with upper limits. No hacks required :)

Anyway, this again goes more in the direction of handling link speed
changes.

>>A non-intrusive way is preferred of course, but I can't really see
>>one if you want more than just a special-case solution that only
>>covers qdiscs using rate-tables and even ignores inner qdiscs.
>>HFSC and SFQ for example both need to calculate the wire length
>>at runtime.
>>
> 
> Agreed. That would be equivalent to #1 above.
> 
> 
>>Handling all qdiscs would mean adding a pointer to a mapping table
>>to struct net_device and using something like "skb_wire_len(skb, dev)"
>>instead of skb->len in the queueing layer. 
> 
> 
> That does seem sensible and simpler. I would suspect then that you will
> do this one time with something like
> ip dev add compensate_header 100 bytes

Something like that, but it's a bit more complicated.
For ATM we need some mapping:
[0-48]  -> 53
[49-96] -> 106
...

For Ethernet we need:
[0-60]  -> 64
[61-n]  -> n + 4

We could do something like this (feel free to imagine nicer names):

ATM:
table = {
        .step = 53,
        .map = {
                [0..48] = 53,
                [49..96] = 106,
                ...
        }
};

Requiring a table of size 32 for typical MTUs.

Ethernet:

table = {
        .step = 60,
        .map = {
                [0..60] = 60,
                [...] = 0,
        },
        .fixed_overhead = 4,
};

static inline unsigned int
skb_wire_len(struct sk_buff *skb, struct net_device *dev)
{
        unsigned int idx, len;

        /* No length table attached: fall back to the IP-layer length. */
        if (dev->lengthtable == NULL)
                return skb->len;

        /* Look up the map entry covering this packet's size range. */
        idx = skb->len / dev->lengthtable->step;
        len = dev->lengthtable->map[idx];

        /* A zero entry means "no padding": use the real length. */
        if (len == 0)
                len = skb->len;
        return len + dev->lengthtable->fixed_overhead;
}
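
To make the intended use concrete, a purely illustrative caller (not an
existing call site, names made up) could look like:

/* Wherever the queueing layer currently charges skb->len against a
 * class, a rate table or a byte counter, it would charge the wire
 * length instead. */
static inline void example_charge_skb(unsigned long long *bytes,
                                      struct sk_buff *skb,
                                      struct net_device *dev)
{
        *bytes += skb_wire_len(skb, dev);       /* was: skb->len */
}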

Unfortunately I can't think of a way to handle the ATM case without
a division .. or iteration.
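
Computing it directly instead of via the table needs the same division,
e.g. (using the same simplification as the mapping above, i.e. plain
cell padding with the AAL5 trailer and encapsulation overhead ignored):

/* One full 53 byte cell per started 48 bytes of payload. */
static inline unsigned int atm_wire_len(unsigned int len)
{
        return (len + 47) / 48 * 53;    /* runtime division per packet */
}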

>>That of course doesn't
>>mean that we can't still provide pre-adjusted ratetables for qdiscs
>>that use them.
>>
> 
> 
> But what would the point be then if you can compensate as you did above?

It doesn't need runtime divisions :)
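
Roughly, the idea is that a rate table is only built once at
configuration time, so the division can be folded in there and the fast
path keeps its plain table lookup. A sketch with made-up helper names
(not the actual tc code, table layout simplified):

#include <stdint.h>

#define RTAB_SIZE 256

/* Same ATM simplification as before: one 53 byte cell per started
 * 48 bytes of payload. */
static unsigned int atm_wire_len(unsigned int len)
{
        return (len + 47) / 48 * 53;
}

/* Time to transmit 'len' bytes at 'rate' bytes per second, in usec. */
static uint32_t xmit_time_usec(unsigned int rate, unsigned int len)
{
        return (uint32_t)(1000000ULL * len / rate);
}

/* Build a rate table whose entries already account for ATM cell
 * padding: the division happens once per entry here, never per
 * packet. */
static void build_atm_rtab(uint32_t *rtab, unsigned int rate,
                           unsigned int cell_log)
{
        unsigned int i;

        for (i = 0; i < RTAB_SIZE; i++) {
                unsigned int size = (i + 1) << cell_log;

                rtab[i] = xmit_time_usec(rate, atm_wire_len(size));
        }
}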