svn commit: r344099 - head/sys/net
Author: rrs
Date: Wed Feb 13 14:57:59 2019
New Revision: 344099
URL: https://svnweb.freebsd.org/changeset/base/344099

Log:
  This commit adds the missing release mechanism for the
  ratelimiting code. The two modules (lagg and vlan) did have
  allocation routines, and even though they are indirect (and
  vector down to the underlying interfaces) they both need to
  have a free routine (that also vectors down to the actual interface).

  Sponsored by: Netflix Inc.
  Differential Revision: https://reviews.freebsd.org/D19032

Modified:
  head/sys/net/if_lagg.c
  head/sys/net/if_vlan.c

Modified: head/sys/net/if_lagg.c
==
--- head/sys/net/if_lagg.c	Wed Feb 13 14:39:16 2019	(r344098)
+++ head/sys/net/if_lagg.c	Wed Feb 13 14:57:59 2019	(r344099)
@@ -133,6 +133,7 @@ static int lagg_ioctl(struct ifnet *, u_long, caddr_t)
 static int	lagg_snd_tag_alloc(struct ifnet *,
 		    union if_snd_tag_alloc_params *,
 		    struct m_snd_tag **);
+static void	lagg_snd_tag_free(struct m_snd_tag *);
 #endif
 static int	lagg_setmulti(struct lagg_port *);
 static int	lagg_clrmulti(struct lagg_port *);
@@ -514,6 +515,7 @@ lagg_clone_create(struct if_clone *ifc, int unit, cadd
 	ifp->if_flags = IFF_SIMPLEX | IFF_BROADCAST | IFF_MULTICAST;
 #ifdef RATELIMIT
 	ifp->if_snd_tag_alloc = lagg_snd_tag_alloc;
+	ifp->if_snd_tag_free = lagg_snd_tag_free;
 #endif
 	ifp->if_capenable = ifp->if_capabilities = IFCAP_HWSTATS;
@@ -1568,6 +1570,13 @@ lagg_snd_tag_alloc(struct ifnet *ifp,
 	/* forward allocation request */
 	return (ifp->if_snd_tag_alloc(ifp, params, ppmt));
 }
+
+static void
+lagg_snd_tag_free(struct m_snd_tag *tag)
+{
+	tag->ifp->if_snd_tag_free(tag);
+}
+
 #endif
 
 static int

Modified: head/sys/net/if_vlan.c
==
--- head/sys/net/if_vlan.c	Wed Feb 13 14:39:16 2019	(r344098)
+++ head/sys/net/if_vlan.c	Wed Feb 13 14:57:59 2019	(r344099)
@@ -267,6 +267,7 @@ static int vlan_ioctl(struct ifnet *ifp, u_long cmd, c
 #ifdef RATELIMIT
 static int	vlan_snd_tag_alloc(struct ifnet *,
     union if_snd_tag_alloc_params *, struct m_snd_tag **);
+static void	vlan_snd_tag_free(struct m_snd_tag *);
 #endif
 static void	vlan_qflush(struct ifnet *ifp);
 static int	vlan_setflag(struct ifnet *ifp, int flag, int status,
@@ -1047,6 +1048,7 @@ vlan_clone_create(struct if_clone *ifc, char *name, si
 	ifp->if_ioctl = vlan_ioctl;
 #ifdef RATELIMIT
 	ifp->if_snd_tag_alloc = vlan_snd_tag_alloc;
+	ifp->if_snd_tag_free = vlan_snd_tag_free;
 #endif
 	ifp->if_flags = VLAN_IFFLAGS;
 	ether_ifattach(ifp, eaddr);
@@ -1933,5 +1935,11 @@ vlan_snd_tag_alloc(struct ifnet *ifp,
 		return (EOPNOTSUPP);
 	/* forward allocation request */
 	return (ifp->if_snd_tag_alloc(ifp, params, ppmt));
+}
+
+static void
+vlan_snd_tag_free(struct m_snd_tag *tag)
+{
+	tag->ifp->if_snd_tag_free(tag);
 }
 #endif
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
svn commit: r344103 - head/sys/netinet
Author: ae
Date: Wed Feb 13 15:46:05 2019
New Revision: 344103
URL: https://svnweb.freebsd.org/changeset/base/344103

Log:
  In r335015 PCB destruction was made deferred using epoch_call().
  But ipsec_delete_pcbpolicy() uses some VNET-virtualized variables,
  and thus it needs a VNET context, which is missing during gtaskqueue
  execution. Use the inp_vnet context to set curvnet in
  in_pcbfree_deferred().

  PR: 235684
  MFC after: 1 week

Modified:
  head/sys/netinet/in_pcb.c

Modified: head/sys/netinet/in_pcb.c
==
--- head/sys/netinet/in_pcb.c	Wed Feb 13 15:30:06 2019	(r344102)
+++ head/sys/netinet/in_pcb.c	Wed Feb 13 15:46:05 2019	(r344103)
@@ -1565,6 +1565,7 @@ in_pcbfree_deferred(epoch_context_t ctx)
 	inp = __containerof(ctx, struct inpcb, inp_epoch_ctx);
 	INP_WLOCK(inp);
+	CURVNET_SET(inp->inp_vnet);
 #ifdef INET
 	struct ip_moptions *imo = inp->inp_moptions;
 	inp->inp_moptions = NULL;
@@ -1597,6 +1598,7 @@ in_pcbfree_deferred(epoch_context_t ctx)
 #ifdef INET
 	inp_freemoptions(imo);
 #endif
+	CURVNET_RESTORE();
 }
 
 /*
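The set/restore pattern this commit applies can be illustrated in userland. What follows is only a minimal sketch, not the kernel code: struct vnet_sketch, curvnet_sketch and the two functions are hypothetical stand-ins for the vnet, curvnet, ipsec_delete_pcbpolicy() and in_pcbfree_deferred(), and a plain static variable replaces the kernel's per-thread curvnet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-vnet state; the real struct vnet looks nothing like this. */
struct vnet_sketch {
	int policy_count;
};

/* Stand-in for curvnet (per-thread in the kernel, a plain static here). */
static struct vnet_sketch *curvnet_sketch;

/* Stand-in for a PCB that remembers its vnet, like inp->inp_vnet. */
struct pcb_sketch {
	struct vnet_sketch *pcb_vnet;
};

/* Touches "virtualized" state, so it needs a current vnet to be set. */
static void
delete_policy_sketch(void)
{
	assert(curvnet_sketch != NULL);
	curvnet_sketch->policy_count--;
}

/* Deferred destructor: the task queue thread calling it has no vnet set. */
static void
pcb_free_deferred_sketch(struct pcb_sketch *pcb)
{
	struct vnet_sketch *saved = curvnet_sketch;	/* CURVNET_SET analogue */

	curvnet_sketch = pcb->pcb_vnet;
	delete_policy_sketch();
	curvnet_sketch = saved;				/* CURVNET_RESTORE analogue */
}
```

Without the two assignments around delete_policy_sketch(), the assertion would fire when the destructor runs from a context that never set a vnet, which is the userland analogue of the problem described in the log.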
Re: svn commit: r344099 - head/sys/net
On 2/13/19 6:57 AM, Randall Stewart wrote:
> Author: rrs
> Date: Wed Feb 13 14:57:59 2019
> New Revision: 344099
> URL: https://svnweb.freebsd.org/changeset/base/344099
>
> Log:
>   This commit adds the missing release mechanism for the
>   ratelimiting code. The two modules (lagg and vlan) did have
>   allocation routines, and even though they are indirect (and
>   vector down to the underlying interfaces) they both need to
>   have a free routine (that also vectors down to the actual interface).
>
>   Sponsored by: Netflix Inc.
>   Differential Revision: https://reviews.freebsd.org/D19032

Hmm, I don't understand why you'd ever invoke if_snd_tag_free on anything
but 'tag->ifp'. What if the route for a connection moves so that a tag
allocated on cc0 is now on a route that goes over em0? You can't expect
em0 to have an if_snd_tag_free routine that will know to go invoke
cxgbe's snd_tag_free. I think you should always be using
'tag->ifp->if_snd_tag_free' to free tags and never using any other ifp.

That is, I think this should be reverted and that instead you need to fix
the code invoking if_snd_tag_free to invoke it on the tag's ifp instead of
some random other ifp.

--
John Baldwin
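The argument in this reply can be made concrete with a small userland model. This is a sketch under stated assumptions, not the kernel's ifnet or m_snd_tag: every type and function name ending in _sketch, plus hw_tag_alloc(), hw_tag_free() and stacked_tag_alloc(), is hypothetical and exists only to show that a tag handed out through a stacked interface already records the interface that really allocated it.

```c
#include <assert.h>
#include <stdlib.h>

struct ifnet_sketch;

/* A send tag remembers the interface that really allocated it. */
struct snd_tag_sketch {
	struct ifnet_sketch *ifp;
};

struct ifnet_sketch {
	const char *name;
	struct ifnet_sketch *lower;	/* child port for lagg/vlan, else NULL */
	int (*snd_tag_alloc)(struct ifnet_sketch *, struct snd_tag_sketch **);
	void (*snd_tag_free)(struct snd_tag_sketch *);
	int live_tags;			/* bookkeeping for the example only */
};

/* A "hardware" driver: actually allocates and frees tags. */
static int
hw_tag_alloc(struct ifnet_sketch *ifp, struct snd_tag_sketch **tagp)
{
	struct snd_tag_sketch *tag = malloc(sizeof(*tag));

	if (tag == NULL)
		return (-1);
	tag->ifp = ifp;
	ifp->live_tags++;
	*tagp = tag;
	return (0);
}

static void
hw_tag_free(struct snd_tag_sketch *tag)
{
	tag->ifp->live_tags--;
	free(tag);
}

/* lagg/vlan style: allocates nothing, only forwards to the child. */
static int
stacked_tag_alloc(struct ifnet_sketch *ifp, struct snd_tag_sketch **tagp)
{
	assert(ifp->lower != NULL);
	return (ifp->lower->snd_tag_alloc(ifp->lower, tagp));
}

/* The rule argued for above: always free through the tag's own ifp. */
static void
snd_tag_release_sketch(struct snd_tag_sketch *tag)
{
	tag->ifp->snd_tag_free(tag);
}
```

In this model the stacked interface needs no free routine at all: a caller that releases through tag->ifp reaches the allocating driver directly, whichever interface the connection's route currently points at.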
svn commit: r344106 - in head/sys: riscv/include riscv/riscv vm
Author: markj
Date: Wed Feb 13 17:19:37 2019
New Revision: 344106
URL: https://svnweb.freebsd.org/changeset/base/344106

Log:
  Implement transparent 2MB superpage promotion for RISC-V.

  This includes support for pmap_enter(..., psind=1) as described in the
  commit log message for r321378.

  The changes are largely modelled after amd64. arm64 has more stringent
  requirements around superpage creation to avoid the possibility of TLB
  conflict aborts, and these requirements do not apply to RISC-V, which
  like amd64 permits simultaneous caching of 4KB and 2MB translations for
  a given page. RISC-V's PTE format includes only two software bits, and
  as these are already consumed we do not have an analogue for amd64's
  PG_PROMOTED. Instead, pmap_remove_l2() always invalidates the entire
  2MB address range.

  pmap_ts_referenced() is modified to clear PTE_A, now that we support
  both hardware- and software-managed reference and dirty bits. Also fix
  pmap_fault_fixup() so that it does not set PTE_A or PTE_D on kernel
  mappings.

Reviewed by: kib (earlier version) Discussed with: jhb Sponsored by: The FreeBSD Foundation Differential Revision:https://reviews.freebsd.org/D18863 Differential Revision:https://reviews.freebsd.org/D18864 Differential Revision:https://reviews.freebsd.org/D18865 Differential Revision:https://reviews.freebsd.org/D18866 Differential Revision:https://reviews.freebsd.org/D18867 Differential Revision:https://reviews.freebsd.org/D18868 Modified: head/sys/riscv/include/param.h head/sys/riscv/include/pmap.h head/sys/riscv/include/pte.h head/sys/riscv/include/vmparam.h head/sys/riscv/riscv/pmap.c head/sys/vm/vm_fault.c Modified: head/sys/riscv/include/param.h == --- head/sys/riscv/include/param.h Wed Feb 13 16:02:55 2019 (r344105) +++ head/sys/riscv/include/param.h Wed Feb 13 17:19:37 2019 (r344106) @@ -82,7 +82,7 @@ #definePAGE_SIZE (1 << PAGE_SHIFT) /* Page size */ #definePAGE_MASK (PAGE_SIZE - 1) -#defineMAXPAGESIZES1 /* maximum number of supported page sizes */ +#defineMAXPAGESIZES3 /* maximum number of supported page sizes */ #ifndef KSTACK_PAGES #defineKSTACK_PAGES4 /* pages of kernel stack (with pcb) */ Modified: head/sys/riscv/include/pmap.h == --- head/sys/riscv/include/pmap.h Wed Feb 13 16:02:55 2019 (r344105) +++ head/sys/riscv/include/pmap.h Wed Feb 13 17:19:37 2019 (r344106) @@ -44,6 +44,8 @@ #include #include +#include + #ifdef _KERNEL #definevtophys(va) pmap_kextract((vm_offset_t)(va)) @@ -80,6 +82,7 @@ struct pmap { pd_entry_t *pm_l1; TAILQ_HEAD(,pv_chunk) pm_pvchunk; /* list of mappings in pmap */ LIST_ENTRY(pmap)pm_list;/* List of all pmaps */ + struct vm_radix pm_root; }; typedef struct pv_entry { @@ -139,6 +142,7 @@ voidpmap_kenter_device(vm_offset_t, vm_size_t, vm_pad vm_paddr_t pmap_kextract(vm_offset_t va); void pmap_kremove(vm_offset_t); void pmap_kremove_device(vm_offset_t, vm_size_t); +bool pmap_ps_enabled(pmap_t); void *pmap_mapdev(vm_offset_t, vm_size_t); void *pmap_mapbios(vm_paddr_t, vm_size_t); Modified: head/sys/riscv/include/pte.h == --- 
head/sys/riscv/include/pte.hWed Feb 13 16:02:55 2019 (r344105) +++ head/sys/riscv/include/pte.hWed Feb 13 17:19:37 2019 (r344106) @@ -62,7 +62,8 @@ typedef uint64_tpn_t; /* page number */ #defineL3_SIZE (1 << L3_SHIFT) #defineL3_OFFSET (L3_SIZE - 1) -#defineLn_ENTRIES (1 << 9) +#defineLn_ENTRIES_SHIFT 9 +#defineLn_ENTRIES (1 << Ln_ENTRIES_SHIFT) #defineLn_ADDR_MASK(Ln_ENTRIES - 1) /* Bits 9:8 are reserved for software */ @@ -79,6 +80,8 @@ typedef uint64_tpn_t; /* page number */ #definePTE_RWX (PTE_R | PTE_W | PTE_X) #definePTE_RX (PTE_R | PTE_X) #definePTE_KERN(PTE_V | PTE_R | PTE_W | PTE_A | PTE_D) +#definePTE_PROMOTE (PTE_V | PTE_RWX | PTE_D | PTE_A | PTE_G | PTE_U | \ +PTE_SW_MANAGED | PTE_SW_WIRED) #definePTE_PPN0_S 10 #definePTE_PPN1_S 19 Modified: head/sys/riscv/include/vmparam.h == --- head/sys/riscv/include/vmparam.hWed Feb 13 16:02:55 2019 (r344105) +++ head/sys/riscv/include/vmparam.hWed Feb 13 17:19:37 2019 (r3
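As a rough illustration of what the PTE_PROMOTE mask in the pte.h hunk is for, the check below decides whether a table of 512 L3 entries could be replaced by one 2MB mapping. It is a simplified sketch, not the pmap_promote_l2() from this commit: the function name is hypothetical, the hardware bit positions follow the RISC-V privileged specification, the assignment of the two software bits (8 and 9) to WIRED and MANAGED is an assumption, and the real code has to deal with locking, PV entries and other cases this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware PTE bits as laid out by the RISC-V privileged specification. */
#define PTE_V		(UINT64_C(1) << 0)
#define PTE_R		(UINT64_C(1) << 1)
#define PTE_W		(UINT64_C(1) << 2)
#define PTE_X		(UINT64_C(1) << 3)
#define PTE_U		(UINT64_C(1) << 4)
#define PTE_G		(UINT64_C(1) << 5)
#define PTE_A		(UINT64_C(1) << 6)
#define PTE_D		(UINT64_C(1) << 7)
/* Bits 9:8 are reserved for software; which bit is which is assumed here. */
#define PTE_SW_WIRED	(UINT64_C(1) << 8)
#define PTE_SW_MANAGED	(UINT64_C(1) << 9)
#define PTE_RWX		(PTE_R | PTE_W | PTE_X)
#define PTE_PROMOTE	(PTE_V | PTE_RWX | PTE_D | PTE_A | PTE_G | PTE_U | \
			    PTE_SW_MANAGED | PTE_SW_WIRED)

#define PTE_PPN0_S	10
#define LN_ENTRIES_SHIFT 9
#define LN_ENTRIES	(1 << LN_ENTRIES_SHIFT)

/*
 * Return true if the 512 entries map one physically contiguous,
 * 2MB-aligned run of 4KB pages with identical attributes.
 */
static bool
l3_table_promotable_sketch(const uint64_t *l3)
{
	uint64_t first, ppn0;
	int i;

	assert(l3 != NULL);
	first = l3[0];
	ppn0 = first >> PTE_PPN0_S;
	if ((first & PTE_V) == 0)
		return (false);
	/* A 2MB frame needs the low 9 bits of the first page number clear. */
	if ((ppn0 & (LN_ENTRIES - 1)) != 0)
		return (false);
	for (i = 1; i < LN_ENTRIES; i++) {
		if ((l3[i] >> PTE_PPN0_S) != ppn0 + (uint64_t)i)
			return (false);
		if ((l3[i] & PTE_PROMOTE) != (first & PTE_PROMOTE))
			return (false);
	}
	return (true);
}
```

The sketch treats the upper reserved PTE bits as zero. A single entry with a different permission, accessed/dirty state or software bit is enough to refuse promotion, which is why the mask lists every attribute that must agree.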
svn commit: r344107 - head/sys/riscv/riscv
Author: markj
Date: Wed Feb 13 17:38:47 2019
New Revision: 344107
URL: https://svnweb.freebsd.org/changeset/base/344107

Log:
  Implement pmap_clear_modify() for RISC-V.

  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D18875

Modified:
  head/sys/riscv/riscv/pmap.c

Modified: head/sys/riscv/riscv/pmap.c
==
--- head/sys/riscv/riscv/pmap.c	Wed Feb 13 17:19:37 2019	(r344106)
+++ head/sys/riscv/riscv/pmap.c	Wed Feb 13 17:38:47 2019	(r344107)
@@ -4074,6 +4074,14 @@ pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t
 void
 pmap_clear_modify(vm_page_t m)
 {
+	struct md_page *pvh;
+	struct rwlock *lock;
+	pmap_t pmap;
+	pv_entry_t next_pv, pv;
+	pd_entry_t *l2, oldl2;
+	pt_entry_t *l3, oldl3;
+	vm_offset_t va;
+	int md_gen, pvh_gen;
 
 	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
 	    ("pmap_clear_modify: page %p is not managed", m));
@@ -4088,8 +4096,78 @@ pmap_clear_modify(vm_page_t m)
 	 */
 	if ((m->aflags & PGA_WRITEABLE) == 0)
 		return;
-
-	/* RISCVTODO: We lack support for tracking if a page is modified */
+	pvh = (m->flags & PG_FICTITIOUS) != 0 ? &pv_dummy :
+	    pa_to_pvh(VM_PAGE_TO_PHYS(m));
+	lock = VM_PAGE_TO_PV_LIST_LOCK(m);
+	rw_rlock(&pvh_global_lock);
+	rw_wlock(lock);
+restart:
+	TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) {
+		pmap = PV_PMAP(pv);
+		if (!PMAP_TRYLOCK(pmap)) {
+			pvh_gen = pvh->pv_gen;
+			rw_wunlock(lock);
+			PMAP_LOCK(pmap);
+			rw_wlock(lock);
+			if (pvh_gen != pvh->pv_gen) {
+				PMAP_UNLOCK(pmap);
+				goto restart;
+			}
+		}
+		va = pv->pv_va;
+		l2 = pmap_l2(pmap, va);
+		oldl2 = pmap_load(l2);
+		if ((oldl2 & PTE_W) != 0) {
+			if (pmap_demote_l2_locked(pmap, l2, va, &lock)) {
+				if ((oldl2 & PTE_SW_WIRED) == 0) {
+					/*
+					 * Write protect the mapping to a
+					 * single page so that a subsequent
+					 * write access may repromote.
+					 */
+					va += VM_PAGE_TO_PHYS(m) -
+					    PTE_TO_PHYS(oldl2);
+					l3 = pmap_l2_to_l3(l2, va);
+					oldl3 = pmap_load(l3);
+					if ((oldl3 & PTE_V) != 0) {
+						while (!atomic_fcmpset_long(l3,
+						    &oldl3, oldl3 & ~(PTE_D |
+						    PTE_W)))
+							cpu_spinwait();
+						vm_page_dirty(m);
+						pmap_invalidate_page(pmap, va);
+					}
+				}
+			}
+		}
+		PMAP_UNLOCK(pmap);
+	}
+	TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) {
+		pmap = PV_PMAP(pv);
+		if (!PMAP_TRYLOCK(pmap)) {
+			md_gen = m->md.pv_gen;
+			pvh_gen = pvh->pv_gen;
+			rw_wunlock(lock);
+			PMAP_LOCK(pmap);
+			rw_wlock(lock);
+			if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) {
+				PMAP_UNLOCK(pmap);
+				goto restart;
+			}
+		}
+		l2 = pmap_l2(pmap, pv->pv_va);
+		KASSERT((pmap_load(l2) & PTE_RWX) == 0,
+		    ("pmap_clear_modify: found a 2mpage in page %p's pv list",
+		    m));
+		l3 = pmap_l2_to_l3(l2, pv->pv_va);
+		if ((pmap_load(l3) & (PTE_D | PTE_W)) == (PTE_D | PTE_W)) {
+			pmap_clear_bits(l3, PTE_D);
+			pmap_invalidate_page(pmap, pv->pv_va);
+		}
+		PMAP_UNLOCK(pmap);
+	}
+	rw_wunlock(lock);
+	rw_runlock(&pvh_global_lock);
 }
 
 void *
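The atomic_fcmpset_long() loop in the diff above follows a common compare-and-swap pattern: retry until the entry can be replaced by a copy with some bits cleared, while learning the value that was actually replaced. A userland approximation using C11 atomics is sketched here. The helper name is hypothetical, and this is not the kernel primitive, although atomic_compare_exchange_weak() shares the relevant property that a failed attempt refreshes the expected value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_W	(UINT64_C(1) << 2)
#define PTE_D	(UINT64_C(1) << 7)

/*
 * Atomically clear `bits` in *pte and return the value that was replaced,
 * so the caller can tell whether the dirty bit was set beforehand.
 */
static uint64_t
pte_clear_bits_sketch(_Atomic uint64_t *pte, uint64_t bits)
{
	uint64_t old;

	assert(pte != NULL);
	old = atomic_load(pte);
	/* On failure `old` is reloaded with the current value; just retry. */
	while (!atomic_compare_exchange_weak(pte, &old, old & ~bits))
		;
	return (old);
}
```

Returning the replaced value matters because another CPU may set PTE_D between the initial load and the exchange; with a plain load followed by a store, that update could be lost.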
svn commit: r344108 - in head/sys/riscv: include riscv
Author: markj Date: Wed Feb 13 17:50:01 2019 New Revision: 344108 URL: https://svnweb.freebsd.org/changeset/base/344108 Log: Implement per-CPU pmap activation tracking for RISC-V. This reduces the overhead of TLB invalidations by ensuring that we only interrupt CPUs which are using the given pmap. Tracking is performed in pmap_activate(), which gets called during context switches: from cpu_throw(), if a thread is exiting or an AP is starting, or cpu_switch() for a regular context switch. For now, pmap_sync_icache() still must interrupt all CPUs. Reviewed by: kib (earlier version), jhb Sponsored by: The FreeBSD Foundation Differential Revision:https://reviews.freebsd.org/D18874 Modified: head/sys/riscv/include/pcb.h head/sys/riscv/include/pcpu.h head/sys/riscv/include/pmap.h head/sys/riscv/riscv/genassym.c head/sys/riscv/riscv/machdep.c head/sys/riscv/riscv/mp_machdep.c head/sys/riscv/riscv/pmap.c head/sys/riscv/riscv/swtch.S head/sys/riscv/riscv/vm_machdep.c Modified: head/sys/riscv/include/pcb.h == --- head/sys/riscv/include/pcb.hWed Feb 13 17:38:47 2019 (r344107) +++ head/sys/riscv/include/pcb.hWed Feb 13 17:50:01 2019 (r344108) @@ -55,7 +55,6 @@ struct pcb { #definePCB_FP_STARTED 0x1 #definePCB_FP_USERMASK 0x1 uint64_tpcb_sepc; /* Supervisor exception pc */ - vm_offset_t pcb_l1addr; /* L1 page tables base address */ vm_offset_t pcb_onfault;/* Copyinout fault handler */ }; Modified: head/sys/riscv/include/pcpu.h == --- head/sys/riscv/include/pcpu.h Wed Feb 13 17:38:47 2019 (r344107) +++ head/sys/riscv/include/pcpu.h Wed Feb 13 17:50:01 2019 (r344108) @@ -45,6 +45,7 @@ #defineALT_STACK_SIZE 128 #definePCPU_MD_FIELDS \ + struct pmap *pc_curpmap;/* Currently active pmap */ \ uint32_t pc_pending_ipis; /* IPIs pending to this CPU */ \ char __pad[61] Modified: head/sys/riscv/include/pmap.h == --- head/sys/riscv/include/pmap.h Wed Feb 13 17:38:47 2019 (r344107) +++ head/sys/riscv/include/pmap.h Wed Feb 13 17:50:01 2019 (r344108) @@ -41,6 +41,7 @@ #ifndef LOCORE #include 
+#include #include #include @@ -80,6 +81,8 @@ struct pmap { struct mtx pm_mtx; struct pmap_statistics pm_stats; /* pmap statictics */ pd_entry_t *pm_l1; + u_long pm_satp;/* value for SATP register */ + cpuset_tpm_active; /* active on cpus */ TAILQ_HEAD(,pv_chunk) pm_pvchunk; /* list of mappings in pmap */ LIST_ENTRY(pmap)pm_list;/* List of all pmaps */ struct vm_radix pm_root; @@ -137,6 +140,10 @@ extern vm_offset_t virtual_end; #defineL1_MAPPABLE_P(va, pa, size) \ va) | (pa)) & L1_OFFSET) == 0 && (size) >= L1_SIZE) +struct thread; + +void pmap_activate_boot(pmap_t); +void pmap_activate_sw(struct thread *); void pmap_bootstrap(vm_offset_t, vm_paddr_t, vm_size_t); void pmap_kenter_device(vm_offset_t, vm_size_t, vm_paddr_t); vm_paddr_t pmap_kextract(vm_offset_t va); Modified: head/sys/riscv/riscv/genassym.c == --- head/sys/riscv/riscv/genassym.c Wed Feb 13 17:38:47 2019 (r344107) +++ head/sys/riscv/riscv/genassym.c Wed Feb 13 17:50:01 2019 (r344108) @@ -63,7 +63,6 @@ ASSYM(TDF_ASTPENDING, TDF_ASTPENDING); ASSYM(TDF_NEEDRESCHED, TDF_NEEDRESCHED); ASSYM(PCB_ONFAULT, offsetof(struct pcb, pcb_onfault)); -ASSYM(PCB_L1ADDR, offsetof(struct pcb, pcb_l1addr)); ASSYM(PCB_SIZE, sizeof(struct pcb)); ASSYM(PCB_RA, offsetof(struct pcb, pcb_ra)); ASSYM(PCB_SP, offsetof(struct pcb, pcb_sp)); Modified: head/sys/riscv/riscv/machdep.c == --- head/sys/riscv/riscv/machdep.c Wed Feb 13 17:38:47 2019 (r344107) +++ head/sys/riscv/riscv/machdep.c Wed Feb 13 17:50:01 2019 (r344108) @@ -871,10 +871,6 @@ initriscv(struct riscv_bootparams *rvbp) init_proc0(rvbp->kern_stack); - /* set page table base register for thread0 */ - thread0.td_pcb->pcb_l1addr = \ - (rvbp->kern_l1pt - KERNBASE + rvbp->kern_phys); - msgbufinit(msgbufp, msgbufsize); mutex_init(); init_param2(physmem); Modified: head/sys/riscv/riscv/mp_machdep.c == --- head/sys/
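The bookkeeping described in the r344108 log can be pictured with a bitmask standing in for pm_active. This is a single-threaded sketch with hypothetical names (anything ending in _sketch); the kernel uses a cpuset_t with atomic updates and per-CPU data rather than a global array, and the 64-CPU limit here is an assumption of the example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXCPU_SKETCH	64

struct pmap_sketch {
	uint64_t pm_active;	/* bit n set: CPU n is running on this pmap */
};

/* Stand-in for the per-CPU pc_curpmap field. */
static struct pmap_sketch *curpmap_sketch[MAXCPU_SKETCH];

/* Context switch: move this CPU's bit from the old pmap to the new one. */
static void
pmap_activate_sketch(unsigned cpu, struct pmap_sketch *newpmap)
{
	struct pmap_sketch *old;

	assert(cpu < MAXCPU_SKETCH && newpmap != NULL);
	old = curpmap_sketch[cpu];
	if (old == newpmap)
		return;
	if (old != NULL)
		old->pm_active &= ~(UINT64_C(1) << cpu);
	newpmap->pm_active |= UINT64_C(1) << cpu;
	curpmap_sketch[cpu] = newpmap;
}

/* CPUs that must be interrupted for a TLB invalidation, minus the caller. */
static uint64_t
pmap_invalidate_targets_sketch(const struct pmap_sketch *pmap, unsigned self)
{
	return (pmap->pm_active & ~(UINT64_C(1) << self));
}
```

With this bookkeeping, an invalidation on a pmap that is active only on the calling CPU yields an empty target set, so no interrupts are sent at all.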
Re: svn commit: r344099 - head/sys/net
I disagree. If you define an alloc, it is only
reciprocal that you should define a free.

The code in question that hit this was changed (it's in a version
of rack that has the rate-limit and TLS code), but I think these
things *should* be balanced: if you provide an Allocate, you
should also provide a Free.

R

> On Feb 13, 2019, at 12:09 PM, John Baldwin wrote:
>
> On 2/13/19 6:57 AM, Randall Stewart wrote:
>> Author: rrs
>> Date: Wed Feb 13 14:57:59 2019
>> New Revision: 344099
>> URL: https://svnweb.freebsd.org/changeset/base/344099
>>
>> Log:
>>   This commit adds the missing release mechanism for the
>>   ratelimiting code. The two modules (lagg and vlan) did have
>>   allocation routines, and even though they are indirect (and
>>   vector down to the underlying interfaces) they both need to
>>   have a free routine (that also vectors down to the actual interface).
>>
>>   Sponsored by: Netflix Inc.
>>   Differential Revision: https://reviews.freebsd.org/D19032
>
> Hmm, I don't understand why you'd ever invoke if_snd_tag_free from anything
> but 'tag->ifp' rather than some other ifp. What if the route for a connection
> moves so that a tag allocated on cc0 is now on a route that goes over em0?
> You can't expect em0 to have an if_snd_tag_free routine that will know to
> go invoke cxgbe's snd_tag_free. I think you should always be using
> 'tag->ifp->if_snd_tag_free' to free tags and never using any other ifp.
>
> That is, I think this should be reverted and that instead you need to fix
> the code invoking if_snd_tag_free to invoke it on the tag's ifp instead of
> some random other ifp.
>
> --
> John Baldwin

--
Randall Stewart
r...@netflix.com
Re: svn commit: r344099 - head/sys/net
Oh, and one other thing:

It was *not* a random IFP; it was the IFP of the lagg.

I.e. an alloc() was done on the lagg, and the free was
done back on the same IFP (the one that provided the allocate).

R

> On Feb 13, 2019, at 1:02 PM, Randall Stewart wrote:
>
> I disagree. If you define an alloc it is only
> reciprocal that you should define a free.
>
> The code in question that hit this was changed (its in a version
> of rack that has the rate-limit and TLS code).. but I think these
> things *should* be balanced.. if you provide an Allocate, you
> should also provide a Free…
>
> R
>
>> On Feb 13, 2019, at 12:09 PM, John Baldwin wrote:
>>
>> On 2/13/19 6:57 AM, Randall Stewart wrote:
>>> Author: rrs
>>> Date: Wed Feb 13 14:57:59 2019
>>> New Revision: 344099
>>> URL: https://svnweb.freebsd.org/changeset/base/344099
>>>
>>> Log:
>>>   This commit adds the missing release mechanism for the
>>>   ratelimiting code. The two modules (lagg and vlan) did have
>>>   allocation routines, and even though they are indirect (and
>>>   vector down to the underlying interfaces) they both need to
>>>   have a free routine (that also vectors down to the actual interface).
>>>
>>>   Sponsored by: Netflix Inc.
>>>   Differential Revision: https://reviews.freebsd.org/D19032
>>
>> Hmm, I don't understand why you'd ever invoke if_snd_tag_free from anything
>> but 'tag->ifp' rather than some other ifp. What if the route for a
>> connection moves so that a tag allocated on cc0 is now on a route that
>> goes over em0? You can't expect em0 to have an if_snd_tag_free routine
>> that will know to go invoke cxgbe's snd_tag_free. I think you should
>> always be using 'tag->ifp->if_snd_tag_free' to free tags and never using
>> any other ifp.
>>
>> That is, I think this should be reverted and that instead you need to fix
>> the code invoking if_snd_tag_free to invoke it on the tag's ifp instead of
>> some random other ifp.
>>
>> --
>> John Baldwin
>
> --
> Randall Stewart
> r...@netflix.com

--
Randall Stewart
r...@netflix.com
Re: svn commit: r344099 - head/sys/net
On 2/13/19 10:03 AM, Randall Stewart wrote:
> oh and one other thing..
>
> It was *not* a random IFP.. it was the IFP to the lagg.
>
> I.e. an alloc() was done to the lagg.. and the free was
> done back to the same IFP (that provided the allocate).

Yes, that's wrong. Suppose the route changes so that my traffic is now
over em0 instead of lagg0 (where em0 isn't a member of the lagg). How do
you expect if_lagg_free to invoke em0's free routine? In your case it
does, but only by accident. It doesn't work in the other case I described,
which is if you have non-lagg interfaces and a route moves from cc0 to em0.
In that case your existing code that is using the wrong ifp will just panic.

These aren't real alloc routines, as the lagg and vlan ones don't allocate
anything; they pass along the request to the child and the child allocates
the tag. Only ifnets that actually allocate tags should need to free them,
and you should be using tag->ifp as the ifp whose if_snd_tag_free works.

> R
>
>> On Feb 13, 2019, at 1:02 PM, Randall Stewart wrote:
>>
>> I disagree. If you define an alloc it is only
>> reciprocal that you should define a free.
>>
>> The code in question that hit this was changed (its in a version
>> of rack that has the rate-limit and TLS code).. but I think these
>> things *should* be balanced.. if you provide an Allocate, you
>> should also provide a Free…
>>
>> R
>>
>>> On Feb 13, 2019, at 12:09 PM, John Baldwin wrote:
>>>
>>> On 2/13/19 6:57 AM, Randall Stewart wrote:
>>>> Author: rrs
>>>> Date: Wed Feb 13 14:57:59 2019
>>>> New Revision: 344099
>>>> URL: https://svnweb.freebsd.org/changeset/base/344099
>>>>
>>>> Log:
>>>>   This commit adds the missing release mechanism for the
>>>>   ratelimiting code. The two modules (lagg and vlan) did have
>>>>   allocation routines, and even though they are indirect (and
>>>>   vector down to the underlying interfaces) they both need to
>>>>   have a free routine (that also vectors down to the actual interface).
>>>>
>>>>   Sponsored by: Netflix Inc.
>>>>   Differential Revision: https://reviews.freebsd.org/D19032
>>>
>>> Hmm, I don't understand why you'd ever invoke if_snd_tag_free from anything
>>> but 'tag->ifp' rather than some other ifp. What if the route for a
>>> connection moves so that a tag allocated on cc0 is now on a route that
>>> goes over em0? You can't expect em0 to have an if_snd_tag_free routine
>>> that will know to go invoke cxgbe's snd_tag_free. I think you should
>>> always be using 'tag->ifp->if_snd_tag_free' to free tags and never using
>>> any other ifp.
>>>
>>> That is, I think this should be reverted and that instead you need to fix
>>> the code invoking if_snd_tag_free to invoke it on the tag's ifp instead of
>>> some random other ifp.
>>>
>>> --
>>> John Baldwin
>>
>> --
>> Randall Stewart
>> r...@netflix.com
>
> --
> Randall Stewart
> r...@netflix.com

--
John Baldwin
svn commit: r344109 - head/lib/libthr/arch/powerpc/include
Author: luporl
Date: Wed Feb 13 18:28:53 2019
New Revision: 344109
URL: https://svnweb.freebsd.org/changeset/base/344109

Log:
  silence cast-align warnings from clang on powerpc64

  silence the following warning when compiling libthr with clang 8
  for powerpc64 architecture:

  usr/src/lib/libthr/arch/powerpc/include/pthread_md.h:82:10: error: cast from
  'uint8_t *' (aka 'unsigned char *') to 'struct tcb *' increases required
  alignment from 1 to 8 [-Werror,-Wcast-align]
  82: return ((struct tcb *)(_tp - TP_OFFSET));

  Submitted by: alfredo.junior_eldorado.org.br
  Reviewed by: git_bdragon.rtk0.net, emaste, kib, jhibbits, luporl
  Differential Revision: https://reviews.freebsd.org/D18807

Modified:
  head/lib/libthr/arch/powerpc/include/pthread_md.h

Modified: head/lib/libthr/arch/powerpc/include/pthread_md.h
==
--- head/lib/libthr/arch/powerpc/include/pthread_md.h	Wed Feb 13 17:50:01 2019	(r344108)
+++ head/lib/libthr/arch/powerpc/include/pthread_md.h	Wed Feb 13 18:28:53 2019	(r344109)
@@ -72,14 +72,15 @@ _tcb_set(struct tcb *tcb)
 static __inline struct tcb *
 _tcb_get(void)
 {
-	register uint8_t *_tp;
+	register struct tcb *tcb;
+
 #ifdef __powerpc64__
-	__asm __volatile("mr %0,13" : "=r"(_tp));
+	__asm __volatile("addi %0,13,%1" : "=r"(tcb) : "i"(-TP_OFFSET));
 #else
-	__asm __volatile("mr %0,2" : "=r"(_tp));
+	__asm __volatile("addi %0,2,%1" : "=r"(tcb) : "i"(-TP_OFFSET));
 #endif
 
-	return ((struct tcb *)(_tp - TP_OFFSET));
+	return (tcb);
 }
 
 static __inline struct pthread *
svn commit: r344112 - head/contrib/llvm/lib/MC
Author: dim
Date: Wed Feb 13 20:13:40 2019
New Revision: 344112
URL: https://svnweb.freebsd.org/changeset/base/344112

Log:
  Pull in r353907 from upstream llvm trunk (by Reid Kleckner):

    [MC] Make symbol version errors non-fatal

    We still don't have a source location, which is pretty lame, but at
    least we won't tell the user to file a clang bug report anymore.

    Fixes PR40712

  This will make errors for symbols with @@ versions that are not defined
  non-fatal. For example:

    void f(void)
    {
      __asm__(".symver foo,bar@@baz");
    }

  will now result in:

    error: versioned symbol bar@@baz must be defined

  instead of clang crashing with a diagnostic report.

  PR: 234671
  Upstream PR: https://bugs.llvm.org/show_bug.cgi?id=40712
  MFC after: 3 days

Modified:
  head/contrib/llvm/lib/MC/ELFObjectWriter.cpp

Modified: head/contrib/llvm/lib/MC/ELFObjectWriter.cpp
==
--- head/contrib/llvm/lib/MC/ELFObjectWriter.cpp	Wed Feb 13 19:00:06 2019	(r344111)
+++ head/contrib/llvm/lib/MC/ELFObjectWriter.cpp	Wed Feb 13 20:13:40 2019	(r344112)
@@ -1258,14 +1258,20 @@ void ELFObjectWriter::executePostLayoutBinding(MCAssem
     if (!Symbol.isUndefined() && !Rest.startswith("@@@"))
       continue;
 
-    // FIXME: produce a better error message.
+    // FIXME: Get source locations for these errors or diagnose them earlier.
     if (Symbol.isUndefined() && Rest.startswith("@@") &&
-        !Rest.startswith("@@@"))
-      report_fatal_error("A @@ version cannot be undefined");
+        !Rest.startswith("@@@")) {
+      Asm.getContext().reportError(SMLoc(), "versioned symbol " + AliasName +
+                                                " must be defined");
+      continue;
+    }
 
-    if (Renames.count(&Symbol) && Renames[&Symbol] != Alias)
-      report_fatal_error(llvm::Twine("Multiple symbol versions defined for ") +
-                         Symbol.getName());
+    if (Renames.count(&Symbol) && Renames[&Symbol] != Alias) {
+      Asm.getContext().reportError(
+          SMLoc(), llvm::Twine("multiple symbol versions defined for ") +
+                       Symbol.getName());
+      continue;
+    }
 
     Renames.insert(std::make_pair(&Symbol, Alias));
   }
Re: svn commit: r343030 - in head/sys: cam conf dev/md dev/nvme fs/fuse fs/nfsclient fs/smbfs kern sys ufs/ffs vm
On Tue, 15 Jan 2019 01:02:17 +0000 (UTC) Gleb Smirnoff wrote:
> Author: glebius
> Date: Tue Jan 15 01:02:16 2019
> New Revision: 343030
> URL: https://svnweb.freebsd.org/changeset/base/343030
>
> Log:
>   Allocate pager bufs from UMA instead of 80-ish mutex protected
>   linked list.
>
>   o In vm_pager_bufferinit() create pbuf_zone and start accounting on
>     how many pbufs are we going to have set.
>     In various subsystems that are going to utilize pbufs create
>     private zones via call to pbuf_zsecond_create(). The latter calls
>     uma_zsecond_create(), and sets a limit on created zone. After startup
>     preallocate pbufs according to requirements of all pbuf zones.
>
>     Subsystems that used to have a private limit with old allocator
>     now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS
>     cluster, FFS, swap, vnode pager.
>
>     The following subsystems use shared pbuf zone: cam(4), nvme(4),
>     physio(9), aio(4). They should have their private limits, but
>     changing that is out of scope of this commit.
>
>   o Fetch tunable value of kern.nswbuf from init_param2() and while
>     here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that
>     was holding only this option.
>     Default values aren't touched by this commit, but they probably
>     should be reviewed wrt to modern hardware.
>
>   This change removes a tight bottleneck from sendfile(2) operation,
>   that uses pbufs in vnode pager. Other pagers also would benefit from
>   faster allocation.
>
>   Together with: gallatin
>   Tested by: pho
>
> Modified:
>   head/sys/cam/cam_periph.c
>   head/sys/conf/options
>   head/sys/dev/md/md.c
>   head/sys/dev/nvme/nvme_ctrlr.c
>   head/sys/fs/fuse/fuse_main.c
>   head/sys/fs/fuse/fuse_vnops.c
>   head/sys/fs/nfsclient/nfs_clbio.c
>   head/sys/fs/nfsclient/nfs_clport.c
>   head/sys/fs/smbfs/smbfs_io.c
>   head/sys/fs/smbfs/smbfs_vfsops.c
>   head/sys/kern/kern_physio.c
>   head/sys/kern/subr_param.c
>   head/sys/kern/vfs_aio.c
>   head/sys/kern/vfs_bio.c
>   head/sys/kern/vfs_cluster.c
>   head/sys/sys/buf.h
>   head/sys/ufs/ffs/ffs_rawread.c
>   head/sys/vm/swap_pager.c
>   head/sys/vm/vm_pager.c
>   head/sys/vm/vnode_pager.c
>

Hi Gleb,

This seems to break 32-bit platforms, or at least 32-bit book-e
powerpc, which has a limited KVA space (~500MB). I've seen it
preallocate over 2500 pbufs, at 128kB each, eating up over 300MB of KVA
and leaving very little for the rest of runtime.

I spent a couple of hours earlier today debugging with Mark Johnston, and
his conclusion is that the vnode_pbuf_zone is too big on 32-bit platforms.
Unfortunately I know very little about this area, so I can't provide much
extra insight, but I can readily reproduce the issues triggered by this
change, so I am willing to help where I can.

- Justin
Re: svn commit: r343030 - in head/sys: cam conf dev/md dev/nvme fs/fuse fs/nfsclient fs/smbfs kern sys ufs/ffs vm
On Wed, 13 Feb 2019, Justin Hibbits wrote:

> On Tue, 15 Jan 2019 01:02:17 +0000 (UTC) Gleb Smirnoff wrote:
>
>> Author: glebius
>> Date: Tue Jan 15 01:02:16 2019
>> New Revision: 343030
>> URL: https://svnweb.freebsd.org/changeset/base/343030
>>
>> Log:
>>   Allocate pager bufs from UMA instead of 80-ish mutex protected
>>   linked list.
>> ...
>
> This seems to break 32-bit platforms, or at least 32-bit book-e
> powerpc, which has a limited KVA space (~500MB). It preallocates I've
> seen over 2500 pbufs, at 128kB each, eating up over 300MB KVA,
> leaving very little left for the rest of runtime.

Hrmph. I complained about other things in this commit when it was
committed, but not about this largest bug: preallocation was broken then,
so I thought that it wasn't done and that the problems would be smaller
unless the excessive limits were actually reached.

Now i386 does it:

XX ITEM         SIZE  LIMIT   USED   FREE    REQ  FAIL SLEEP
XX
XX swrbuf:       336,   128,     0,     0,     0,    0,    0
XX swwbuf:       336,    64,     0,     0,     0,    0,    0
XX nfspbuf:      336,   128,     0,     0,     0,    0,    0
XX mdpbuf:       336,    25,     0,     0,     0,    0,    0
XX clpbuf:       336,   128,     0,     5,     4,    0,    0
XX vnpbuf:       336,  2048,     0,     0,     0,    0,    0
XX pbuf:         336,    16,     0,  2535,     0,    0,    0

but i386 now has 4GB of KVA, with almost 3GB to waste, so the bug is not
noticed there.

The preallocation wasn't there in my last mail to the author about nearby
bugs, on 24 Jan 2019:

YY vnpbuf:       568,  2048,     0,     0,     0,    0,    0
YY clpbuf:       568,   128,     0,   128,  8750,    0,    1
YY pbuf:         568,    16,     0,     4,     0,    0,    0

This output is from amd64, where the SIZE is larger and everything else was
the same as on i386. Now amd64 shows the large preallocation too.

There seems to be another bug: the especially small LIMIT of 16 turns into
a preallocation of 2535 without causing an immediate reduction to the
limit.

I happen to have kernels from 24 and 25 Jan handy. The first one is amd64
r343346M built on Jan 23, and it doesn't do the large preallocation. The
second one is i386 r343388:343418M built on Jan 25, and it does the large
preallocation.
Both call uma_prealloc() to ask for nswbuf_max = 0x9e9 buffers, but the
old version only allocates 4 buffers while the later version allocates
0x9e9 buffers.

The only relevant commit between the good and bad versions seems to be
r343453. This fixes uma_prealloc() to actually work. But it is a feature
for it to not work when its caller asks for too much.

0x9e9 is the sum of the LIMITs of all pbuf pools. The main bug in r343030
is that it expands nswbuf, which is supposed to give the combined limit,
from its normal value of 256 to 0x9e9. (r343030 actually used nswbuf
before it was properly initialized, so it used its maximum value of 256
even on small systems with nswbuf = 16. Only this has been fixed.)

On i386, nbuf is excessively limited so as to give a maxbufspace of about
100MB, so as to fit in 1GB of kva even with infinite RAM and -current's
actual 4GB of kva. nbuf is correctly limited to give a much smaller
maxbufspace when RAM is small (kva scaling for this is not done so well).
nswbuf is restricted if nbuf is restricted, but not enough (except in my
version). It is normally 256, so the pbuf allocation used to be 32MB, and
this is already a bit large compared with 100MB for maxbufspace. Expanding
pbufs by a factor of 0x9e9/0x100 gives the silly combination of 100MB for
maxbufspace and 317MB for pbufs.

If kva is only 512MB instead of 1GB, then maxbufspace should be only 50MB
and nswbuf should be smaller too. Similarly for PAE on i386 back when it
was configured with 1GB kva by default. Only about 512MB are left after
allocating space for page table metadata.

I have fixes that scale most of this better. Large subsystems starting
with kmem get a hard-coded fraction of the usable kva. E.g., kmem gets
about 60% of usable kva instead of about 40% of nominal kva. Most other
large subsystems including the buffer cache get about 1/8 of the remaining
40% of usable kva. Scaling for other subsystems is mostly worse than for
kmem. pbufs are part of the buffer cache allocation.
The expansion factor of 0x9e9/0x100 breaks this.

I don't understand how pbuf_preallocate() allocates for the other pbuf
pools. When I debugged this for clpbufs, the preallocation was not used.
pbuf types other than clpbufs seem to be unused in my configurations. I
thought that pbufs were used during initialization, since they end up with
a nonzero FREE count, but their only use seems to be to preallocate them.

Bruce
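The figures quoted in this message can be checked with a few lines of arithmetic. The sketch below is only that: the per-zone limits are copied from the LIMIT column of the i386 table above, the 128kB of KVA per pbuf is the figure given earlier in the thread, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* KVA reserved per pbuf, using the 128kB figure quoted in the thread. */
#define PBUF_KVA_BYTES	(UINT64_C(128) * 1024)

/* LIMIT column: swrbuf, swwbuf, nfspbuf, mdpbuf, clpbuf, vnpbuf, pbuf. */
static const unsigned pbuf_zone_limits[] = { 128, 64, 128, 25, 128, 2048, 16 };

/* Sum of the per-zone limits, i.e. what the preallocation asks for. */
static unsigned
pbuf_limit_sum(const unsigned *limits, size_t n)
{
	unsigned sum = 0;
	size_t i;

	assert(limits != NULL);
	for (i = 0; i < n; i++)
		sum += limits[i];
	return (sum);
}

/* KVA in whole MB (rounded down) consumed by npbufs preallocated pbufs. */
static uint64_t
pbuf_kva_mb(unsigned npbufs)
{
	return (npbufs * PBUF_KVA_BYTES / (UINT64_C(1024) * 1024));
}
```

Summing the limits gives 2537, which is 0x9e9; at 128kB each that comes to 317MB of KVA, against 32MB for the traditional nswbuf of 256. On a platform with roughly 500MB of KVA, that difference accounts for most of the address space.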