Re: question about softirqs
Hi. On Tue, May 12, 2009 at 11:12:58AM +0200, Peter Zijlstra (a.p.zijls...@chello.nl) wrote: > Wouldn't the even better solution be to get rid of softirqs > all-together? And move tasklets into some thread context? Only if we are ready to fix 7 times rescheduling regressions compared to kernel threads (work queue actually). At least that's how tasklet behaved compared to work queue 1.5 years ago in the simplest and quite naive test where tasklet/work rescheduled iself number of times: http://marc.info/?l=linux-crypto-vger&m=119462472517405&w=2 -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: RFC: issues concerning the next NAPI interface
On Tue, Aug 28, 2007 at 01:48:20PM +0200, Jan-Bernd Themann ([EMAIL PROTECTED]) wrote: > I'm not sure if I understand your approach correctly. > This approach may reduce the number of interrupts, but it does so > by blocking the CPU for up to 1 jiffy (that can be quite some time > on some platforms). So no other application / tasklet / softIRQ type > can do anything in between. The CPU utilization does not drop at all, > and I thought that is one reason why we try to reduce the number of > interrupts. Only NICs interrupts are suposed to be stopped, system will continue to work as usual, since all others are alive. Having hrtimer to reshcedule NIC procesing can work only if number of timer's interrupts are much less than NICs and if rate of the timer's starts/changes (presumbly in NICs interrupt) is small too, otherwise having too many NIC interrupts will not gain anything (actually it is what is supposed to be dropped noticebly). -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
Hi. On Thu, May 29, 2008 at 02:12:50PM -0500, Kim Phillips ([EMAIL PROTECTED]) wrote: > +static irqreturn_t talitos_interrupt(int irq, void *data) > +{ > + struct device *dev = data; > + struct talitos_private *priv = dev_get_drvdata(dev); > + > + priv->status = in_be32(priv->reg + TALITOS_ISR); > + priv->status_lo = in_be32(priv->reg + TALITOS_ISR_LO); > + > + if (unlikely(priv->status & ~TALITOS_ISR_CHDONE)) { > + talitos_error((unsigned long)data); > + /* ack */ > + out_be32(priv->reg + TALITOS_ICR, priv->status); > + out_be32(priv->reg + TALITOS_ICR_LO, priv->status_lo); > + } > + else > + { > + /* ack */ > + out_be32(priv->reg + TALITOS_ICR, priv->status); > + out_be32(priv->reg + TALITOS_ICR_LO, priv->status_lo); > + > + if (likely(priv->status & TALITOS_ISR_CHDONE)) > + tasklet_schedule(&priv->done_task); > + } > + > + return (priv->status || priv->status_lo) ? IRQ_HANDLED : IRQ_NONE; > +} Don't you want to protect against simultaneous access to register space from different CPUs? Or it is single processor board only? -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
On Fri, May 30, 2008 at 02:41:17PM -0500, Scott Wood ([EMAIL PROTECTED]) wrote: > >>Don't you want to protect against simultaneous access to register space > >>from different CPUs? Or it is single processor board only? > > > >Doesn't linux mask the IRQ line for the interrupt currently being > >serviced, and on all processors? > > Yes. Could there be interference from non-interrupt driver code on > another cpu (or interrupted code), though? Yes, that register space can be assigned from non-interrupt path on different cpu. I saw spin_lock_irqsave() is used in some other places, but not in interrupt handler itself. -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
On Fri, May 30, 2008 at 03:48:20PM -0500, Kim Phillips ([EMAIL PROTECTED]) wrote: > sorry, by ISR I meant interrupt status registers. but I can't tell > where the suspected simultaneous accesses are. Evgeniy, can you point > out the register accesses you're talking about? priv->status is accessed from tasklets, although readonly, but that rises a red flag... Also callback invocation tasklet drops the lock around callback, during that time cached status and priv itself (and tail like in two simultaneous flushes) could change (or not?) -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
Hi. On Fri, May 30, 2008 at 05:19:30PM -0500, Kim Phillips ([EMAIL PROTECTED]) wrote: > ok, I see what you are saying now; if a channel gets done during > talitos_done processing, it'll trigger an interrupt and reset > priv->status, leaving the tasklet in the dark as to which channel has > done status, depending on how many channel dones it has already > processed. I think the only solution here is to call flush_channel on > each channel, regardless of the bits in the interrupt status - I'll > send out a patch shortly. Out of curiosity, what is number of channels? I had simialar issue with HIFN crypto driver and limited number of descriptor to 80 iirc, since with that number HIFN traversal did not show perfromance degradataion on Ghz x86. > > callback, during that time cached status and priv itself (and tail like > > in two simultaneous flushes) could change (or not?) > > I think you're talking about a different 'status' here; flush_channel's > local 'status' doesn't resemble priv->status bits in any way, it looks > at the descriptor header writeback bits for done status, on a per > descriptor basis. It forwards this descriptor done vs. error status to > the callback. > > priv itself won't change; it's uniquely associated to the device. I meant descriptor hdr value accessed via it - can it be checked in tasklet under the lock and in submit path without? Can they correlate somehow? -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
On Mon, Jun 02, 2008 at 09:27:01AM -0500, Kim Phillips ([EMAIL PROTECTED]) wrote: > > I meant descriptor hdr value accessed via it - can it be checked in > > tasklet under the lock and in submit path without? Can they correlate > > somehow? > > I believe the check for a non-null request->desc (under lock) before > the hdr value is accessed ensures this doesn't happen. But can it be changed? You write to it without lock, but read under the one (different for each channel though), so it attracted attention. -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
On Mon, Jun 02, 2008 at 11:50:21AM -0500, Kim Phillips ([EMAIL PROTECTED]) wrote: > > But can it be changed? You write to it without lock, but read under the > > one (different for each channel though), so it attracted attention. > > can you point where in the code your concern is? talitos_submit() writes to hdr (initially I think?) without locks. It is read in flush_channel() under tail lock, but then it is dropped, so I rised a question, if it can be modified during that time, since if it can, status value, calculated from it, can be different and thus error check result can be false. Or this is not an issue? -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] talitos: Freescale integrated security engine (SEC) driver
Hi. On Mon, Jun 02, 2008 at 02:06:03PM -0500, Kim Phillips ([EMAIL PROTECTED]) wrote: > it would be an issue if flush_cannel didn't save off the data required > to call the callback with in saved_req. flush_channel does this on > purpose to be able to call the callback outside of lock (as is > commented). Note desc gets assigned NULL prior to releasing the lock, > after copying its contents to saved_req. Excellent! It looks like there are no more issues and second version fixed found problem place. -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH 2/2] eHEA: Receive SKB Aggregation, generic LRO helper functions
On Thu, Jul 05, 2007 at 04:24:46PM +0200, Jan-Bernd Themann ([EMAIL PROTECTED]) wrote: > > I've couple of comments on the driver, but mainly the fact of decreased > > CPU usage itself - what was the magnitude of the win with this driver, > > it looks like because of per-packet receive code path invocation is the > > place of the latency... > > Your implementation looks generic enough to be used by any driver, don't > > you want to push it separately from eHEA driver? > > > > We can try to come up with a generic file with these helperfunctions. > What do you think about putting them into /net/ipv4/inet_lro.c and > /include/linux/inet_lro.h ? The more I think, the more it looks as appropriate to be used by all hardware drivers with proper API. As far as I recall this is third implementation in the linux drivers :) -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [RFC 1/3] lro: Generic LRO for TCP traffic
Hi, Jan-Bernd. I have couple of comments over implementation besides one you saw previous time. On Wed, Jul 11, 2007 at 04:21:34PM +0200, Jan-Bernd Themann ([EMAIL PROTECTED]) wrote: > +static int lro_tcp_ip_check(struct sk_buff *skb, struct iphdr *iph, > + struct tcphdr *tcph, struct net_lro_desc *lro_desc) > +{ > +/* check ip header: packet length */ > +if (ntohs(iph->tot_len) > skb->len) > + return -1; > + > + if (TCP_PAYLOAD_LENGTH(iph, tcph) == 0) > + return -1; > + > + if (iph->ihl != IPH_LEN_WO_OPTIONS) > + return -1; > + > + if (tcph->cwr || tcph->ece || tcph->urg || !tcph->ack || tcph->psh > + || tcph->rst || tcph->syn || tcph->fin) > + return -1; I think you do not want to break lro frame because of push flag - it is pretty common flag, which does not brak processing (and I'm not sure if it has any special meaning this days). > + if (INET_ECN_is_ce(ipv4_get_dsfield(iph))) > + return -1; > + > + if (tcph->doff != TCPH_LEN_WO_OPTIONS > + && tcph->doff != TCPH_LEN_W_TIMESTAMP) > + return -1; > + > + /* check tcp options (only timestamp allowed) */ > + if (tcph->doff == TCPH_LEN_W_TIMESTAMP) { > + u32 *topt = (u32 *)(tcph + 1); > + > + if (*topt != htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) > +| (TCPOPT_TIMESTAMP << 8) > +| TCPOLEN_TIMESTAMP)) > + return -1; > + > + /* timestamp should be in right order */ > + topt++; > + if (lro_desc && (ntohl(lro_desc->tcp_rcv_tsval) > ntohl(*topt))) > + return -1; This still does not handle wrapping over 32 bits. What about if (lro_desc && after(ntohl(lro_desc->tcp_rcv_tsval), ntohl(*topt))) return -1; > + /* timestamp reply should not be zero */ > + topt++; > + if (*topt == 0) > + return -1; > + } > + > + return 0; > +} > +static struct net_lro_desc *lro_get_desc(struct net_lro_mgr *mgr, > + struct net_lro_desc *lro_arr, > + struct iphdr *iph, > + struct tcphdr *tcph) > +{ > + struct net_lro_desc *lro_desc = NULL; > + struct net_lro_desc *tmp; > + int max_desc = mgr->max_desc; > + int i; > + > + for (i = 0; i < max_desc; i++) { > + tmp = &lro_arr[i]; > + if (tmp->active) > + if (!lro_check_tcp_conn(tmp, iph, tcph)) { > + lro_desc = tmp; > + goto out; > + } > + } Ugh... What about tree structure or hash here? > + for (i = 0; i < max_desc; i++) { > + if(!lro_arr[i].active) { > + lro_desc = &lro_arr[i]; > + goto out; > + } > + } > + > +out: > + return lro_desc; > +} > +int __lro_proc_skb(struct net_lro_mgr *lro_mgr, struct sk_buff *skb, > +struct vlan_group *vgrp, u16 vlan_tag, void *priv) > +{ > + struct net_lro_desc *lro_desc; > +struct iphdr *iph; > +struct tcphdr *tcph; > + > + if (!lro_mgr->get_ip_tcp_hdr > + || lro_mgr->get_ip_tcp_hdr(skb, &iph, &tcph, priv)) > + goto out; > + > + lro_desc = lro_get_desc(lro_mgr, lro_mgr->lro_arr, iph, tcph); > + if (!lro_desc) > + goto out; There is no protection of the descriptor array from accessing from different CPUs. Is it forbidden to share net_lro_mgr structure? -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [RFC 1/1] lro: Generic Large Receive Offload for TCP traffic
Hi Jan-Bernd. On Fri, Jul 20, 2007 at 05:41:48PM +0200, Jan-Bernd Themann ([EMAIL PROTECTED]) wrote: > Generic LRO patch > > Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]> Besides couple trivial codyng/formatting nits I did not found any problematic places after review. Details below. Thanks. > +#define TCP_PAYLOAD_LENGTH(iph, tcph) \ > +(ntohs(iph->tot_len) - IP_HDR_LEN(iph) - TCP_HDR_LEN(tcph)) A tab? > +static void lro_add_common(struct net_lro_desc *lro_desc, struct iphdr *iph, > +struct tcphdr *tcph, int tcp_data_len) > +{ > + struct sk_buff *parent = lro_desc->parent; > + u32 *topt; > + > + lro_desc->pkt_aggr_cnt++; > + lro_desc->ip_tot_len += tcp_data_len; > + lro_desc->tcp_next_seq += tcp_data_len; > + lro_desc->tcp_window = tcph->window; > + lro_desc->tcp_ack = tcph->ack_seq; > + > + /* don't update tcp_rcv_tsval, would not work with PAWS */ > + if (lro_desc->tcp_saw_tstamp) { > + topt = (u32 *) (tcph + 1); > + lro_desc->tcp_rcv_tsecr = *(topt + 2); > + } > + > + parent->len += tcp_data_len; > + parent->data_len += tcp_data_len; > + > + lro_desc->data_csum = csum_add(lro_desc->data_csum, > +lro_tcp_data_csum(iph, tcph, > + tcp_data_len)); > + return; > +} return from void? And in other places too. > +int __lro_proc_skb(struct net_lro_mgr *lro_mgr, struct sk_buff *skb, > +struct vlan_group *vgrp, u16 vlan_tag, void *priv) > +{ > + struct net_lro_desc *lro_desc; > +struct iphdr *iph; > +struct tcphdr *tcph; > + u64 flags; Broken tab and spaces. > +struct sk_buff *lro_gen_skb(struct net_lro_mgr *lro_mgr, > + struct skb_frag_struct *frags, > + int len, int true_size, > + void *mac_hdr, > + int hlen) > +{ > + struct sk_buff *skb; > +struct skb_frag_struct *skb_frags; > + int data_len = len; The same. > + skb = netdev_alloc_skb(lro_mgr->dev, hlen); > + if (!skb) > + return NULL; > + > +skb->len = len; > + skb->data_len = len - hlen; Here too. There is number of such places, ommitted others. -- Evgeniy Polyakov ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev