svn commit: r344099 - head/sys/net

2019-02-13 Thread Randall Stewart
Author: rrs
Date: Wed Feb 13 14:57:59 2019
New Revision: 344099
URL: https://svnweb.freebsd.org/changeset/base/344099

Log:
  This commit adds the missing release mechanism for the
  ratelimiting code. The two modules (lagg and vlan) did have
  allocation routines, and even though they are indirect (and
  vector down to the underlying interfaces) they both need to
  have a free routine (that also vectors down to the actual interface).
  
  Sponsored by: Netflix Inc.
  Differential Revision:https://reviews.freebsd.org/D19032

Modified:
  head/sys/net/if_lagg.c
  head/sys/net/if_vlan.c

Modified: head/sys/net/if_lagg.c
==
--- head/sys/net/if_lagg.c  Wed Feb 13 14:39:16 2019(r344098)
+++ head/sys/net/if_lagg.c  Wed Feb 13 14:57:59 2019(r344099)
@@ -133,6 +133,7 @@ static int  lagg_ioctl(struct ifnet *, u_long, caddr_t)
 static int	lagg_snd_tag_alloc(struct ifnet *,
 		    union if_snd_tag_alloc_params *,
 		    struct m_snd_tag **);
+static void	lagg_snd_tag_free(struct m_snd_tag *);
 #endif
 static int	lagg_setmulti(struct lagg_port *);
 static int	lagg_clrmulti(struct lagg_port *);
@@ -514,6 +515,7 @@ lagg_clone_create(struct if_clone *ifc, int unit, cadd
 	ifp->if_flags = IFF_SIMPLEX | IFF_BROADCAST | IFF_MULTICAST;
 #ifdef RATELIMIT
 	ifp->if_snd_tag_alloc = lagg_snd_tag_alloc;
+	ifp->if_snd_tag_free = lagg_snd_tag_free;
 #endif
 	ifp->if_capenable = ifp->if_capabilities = IFCAP_HWSTATS;
 
@@ -1568,6 +1570,13 @@ lagg_snd_tag_alloc(struct ifnet *ifp,
 	/* forward allocation request */
 	return (ifp->if_snd_tag_alloc(ifp, params, ppmt));
 }
+
+static void
+lagg_snd_tag_free(struct m_snd_tag *tag)
+{
+	tag->ifp->if_snd_tag_free(tag);
+}
+
 #endif
 
 static int

Modified: head/sys/net/if_vlan.c
==
--- head/sys/net/if_vlan.c  Wed Feb 13 14:39:16 2019(r344098)
+++ head/sys/net/if_vlan.c  Wed Feb 13 14:57:59 2019(r344099)
@@ -267,6 +267,7 @@ static  int vlan_ioctl(struct ifnet *ifp, u_long cmd, c
 #ifdef RATELIMIT
 static int vlan_snd_tag_alloc(struct ifnet *,
 union if_snd_tag_alloc_params *, struct m_snd_tag **);
+static void vlan_snd_tag_free(struct m_snd_tag *);
 #endif
 static void vlan_qflush(struct ifnet *ifp);
 static int vlan_setflag(struct ifnet *ifp, int flag, int status,
@@ -1047,6 +1048,7 @@ vlan_clone_create(struct if_clone *ifc, char *name, si
 	ifp->if_ioctl = vlan_ioctl;
 #ifdef RATELIMIT
 	ifp->if_snd_tag_alloc = vlan_snd_tag_alloc;
+	ifp->if_snd_tag_free = vlan_snd_tag_free;
 #endif
 	ifp->if_flags = VLAN_IFFLAGS;
 	ether_ifattach(ifp, eaddr);
@@ -1933,5 +1935,11 @@ vlan_snd_tag_alloc(struct ifnet *ifp,
 		return (EOPNOTSUPP);
 	/* forward allocation request */
 	return (ifp->if_snd_tag_alloc(ifp, params, ppmt));
+}
+
+static void
+vlan_snd_tag_free(struct m_snd_tag *tag)
+{
+	tag->ifp->if_snd_tag_free(tag);
 }
 #endif
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"


svn commit: r344103 - head/sys/netinet

2019-02-13 Thread Andrey V. Elsukov
Author: ae
Date: Wed Feb 13 15:46:05 2019
New Revision: 344103
URL: https://svnweb.freebsd.org/changeset/base/344103

Log:
  In r335015, PCB destruction was deferred using epoch_call().
  
  But ipsec_delete_pcbpolicy() uses some VNET-virtualized variables,
  and thus needs a VNET context, which is missing while the gtaskqueue
  is executing. Use the inp_vnet context to set curvnet in
  in_pcbfree_deferred().
  
  PR:   235684
  MFC after:1 week
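
The save/set/restore pattern described in this log can be sketched in plain
userland C. This is only a model under stated assumptions: `struct vnet_ctx`,
`struct pcb_sketch`, `cur_ctx`, `delete_policy()` and `pcb_free_deferred()`
are hypothetical stand-ins, not the kernel's CURVNET_SET()/CURVNET_RESTORE()
implementation. The point it shows is that a deferred callback runs where no
"current" context is set, so it has to install the context recorded in the
object it is destroying and put the previous value back afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for per-vnet state; not the FreeBSD definition. */
struct vnet_ctx {
	int policy_count;
};

/* Hypothetical object that remembers which context owns it. */
struct pcb_sketch {
	struct vnet_ctx *owner;
};

/* "Current context" of the running thread; NULL in a worker thread. */
static _Thread_local struct vnet_ctx *cur_ctx;

/* Touches "virtualized" state through the current context. */
static int
delete_policy(void)
{
	if (cur_ctx == NULL)
		return (-1);	/* in a kernel this would be a bad dereference */
	cur_ctx->policy_count--;
	return (0);
}

/* Deferred destructor: it is called later, with no context set. */
static int
pcb_free_deferred(struct pcb_sketch *p)
{
	struct vnet_ctx *saved = cur_ctx;	/* analogous to CURVNET_SET() */
	int error;

	cur_ctx = p->owner;
	error = delete_policy();
	cur_ctx = saved;			/* analogous to CURVNET_RESTORE() */
	return (error);
}
```

Calling `delete_policy()` directly from a thread with no context fails in this
model, while going through `pcb_free_deferred()` succeeds and leaves the
thread's context as it found it.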

Modified:
  head/sys/netinet/in_pcb.c

Modified: head/sys/netinet/in_pcb.c
==
--- head/sys/netinet/in_pcb.c   Wed Feb 13 15:30:06 2019(r344102)
+++ head/sys/netinet/in_pcb.c   Wed Feb 13 15:46:05 2019(r344103)
@@ -1565,6 +1565,7 @@ in_pcbfree_deferred(epoch_context_t ctx)
 	inp = __containerof(ctx, struct inpcb, inp_epoch_ctx);
 
 	INP_WLOCK(inp);
+	CURVNET_SET(inp->inp_vnet);
 #ifdef INET
 	struct ip_moptions *imo = inp->inp_moptions;
 	inp->inp_moptions = NULL;
@@ -1597,6 +1598,7 @@ in_pcbfree_deferred(epoch_context_t ctx)
 #ifdef INET
 	inp_freemoptions(imo);
 #endif 
+	CURVNET_RESTORE();
 }
 
 /*


Re: svn commit: r344099 - head/sys/net

2019-02-13 Thread John Baldwin
On 2/13/19 6:57 AM, Randall Stewart wrote:
> Author: rrs
> Date: Wed Feb 13 14:57:59 2019
> New Revision: 344099
> URL: https://svnweb.freebsd.org/changeset/base/344099
> 
> Log:
>   This commit adds the missing release mechanism for the
>   ratelimiting code. The two modules (lagg and vlan) did have
>   allocation routines, and even though they are indirect (and
>   vector down to the underlying interfaces) they both need to
>   have a free routine (that also vectors down to the actual interface).
>   
>   Sponsored by:   Netflix Inc.
>   Differential Revision:  https://reviews.freebsd.org/D19032

Hmm, I don't understand why you'd ever invoke if_snd_tag_free from anything
but 'tag->ifp' rather than some other ifp.  What if the route for a connection
moves so that a tag allocated on cc0 is now on a route that goes over em0?
You can't expect em0 to have an if_snd_tag_free routine that will know to
go invoke cxgbe's snd_tag_free.  I think you should always be using
'tag->ifp->if_snd_tag_free' to free tags and never using any other ifp.

That is, I think this should be reverted and that instead you need to fix
the code invoking if_snd_tag_free to invoke it on the tag's ifp instead of
some random other ifp.
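
The routing scenario above can be modelled in a few lines of userland C. This
is a minimal sketch, not kernel code: `struct ifnet_sketch`, `struct snd_tag`,
`hw_tag_free()` and `tag_release()` are hypothetical names, and the `freed`
counter exists only so the effect is observable. It assumes, as in the diff,
that each tag records the interface that allocated it.

```c
#include <stddef.h>

struct snd_tag;

/* Hypothetical, heavily reduced interface structure. */
struct ifnet_sketch {
	const char *name;
	void (*snd_tag_free)(struct snd_tag *);	/* NULL if unsupported */
	int freed;				/* counts released tags */
};

struct snd_tag {
	struct ifnet_sketch *ifp;	/* interface that allocated the tag */
};

/* Free routine of the driver that really owns the tag. */
static void
hw_tag_free(struct snd_tag *tag)
{
	tag->ifp->freed++;
}

/*
 * Release a tag through the interface stored in the tag itself, so the
 * interface the route currently points at plays no part in the decision.
 */
static int
tag_release(struct snd_tag *tag)
{
	if (tag->ifp->snd_tag_free == NULL)
		return (-1);
	tag->ifp->snd_tag_free(tag);
	return (0);
}
```

With a tag allocated on a model "cc0" and a route that has since moved to a
model "em0" whose `snd_tag_free` is NULL, `tag_release()` still reaches cc0's
routine, whereas calling through the route's interface would dereference a
NULL function pointer.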

-- 
John Baldwin




svn commit: r344106 - in head/sys: riscv/include riscv/riscv vm

2019-02-13 Thread Mark Johnston
Author: markj
Date: Wed Feb 13 17:19:37 2019
New Revision: 344106
URL: https://svnweb.freebsd.org/changeset/base/344106

Log:
  Implement transparent 2MB superpage promotion for RISC-V.
  
  This includes support for pmap_enter(..., psind=1) as described in the
  commit log message for r321378.
  
  The changes are largely modelled after amd64.  arm64 has more stringent
  requirements around superpage creation to avoid the possibility of TLB
  conflict aborts, and these requirements do not apply to RISC-V, which
  like amd64 permits simultaneous caching of 4KB and 2MB translations for
  a given page.  RISC-V's PTE format includes only two software bits, and
  as these are already consumed we do not have an analogue for amd64's
  PG_PROMOTED.  Instead, pmap_remove_l2() always invalidates the entire
  2MB address range.
  
  pmap_ts_referenced() is modified to clear PTE_A, now that we support
  both hardware- and software-managed reference and dirty bits.  Also
  fix pmap_fault_fixup() so that it does not set PTE_A or PTE_D on kernel
  mappings.
  
  Reviewed by:  kib (earlier version)
  Discussed with:   jhb
  Sponsored by: The FreeBSD Foundation
  Differential Revision:https://reviews.freebsd.org/D18863
  Differential Revision:https://reviews.freebsd.org/D18864
  Differential Revision:https://reviews.freebsd.org/D18865
  Differential Revision:https://reviews.freebsd.org/D18866
  Differential Revision:https://reviews.freebsd.org/D18867
  Differential Revision:https://reviews.freebsd.org/D18868
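
To make the sizes in this log concrete: with 4KB base pages and 512 entries
per page-table level, one level-2 entry covers 4KB * 512 = 2MB. The sketch
below is a simplified userland model, not the pmap code. `pte_sketch_t` packs
"physical address | attribute bits" into one word, which is NOT the real
RISC-V PTE layout, and `can_promote()` is a hypothetical helper. It assumes
the usual promotion condition in amd64-style pmaps: all 512 entries map
physically contiguous frames, starting on a 2MB boundary, with identical
attributes.

```c
#include <stdbool.h>
#include <stdint.h>

#define	PAGE_SHIFT	12
#define	PAGE_SIZE	(1ULL << PAGE_SHIFT)		/* 4KB */
#define	NENTRIES_SHIFT	9
#define	NENTRIES	(1 << NENTRIES_SHIFT)		/* 512 PTEs per table */
#define	SUPERPAGE_SIZE	(PAGE_SIZE << NENTRIES_SHIFT)	/* 2MB */
#define	ATTR_MASK	(PAGE_SIZE - 1)

/* Simplified PTE for illustration only: physical address | attributes. */
typedef uint64_t pte_sketch_t;

/* Return true if the 512 entries could be replaced by one 2MB mapping. */
static bool
can_promote(const pte_sketch_t table[NENTRIES])
{
	uint64_t pa = table[0] & ~ATTR_MASK;
	uint64_t attrs = table[0] & ATTR_MASK;

	if ((pa & (SUPERPAGE_SIZE - 1)) != 0)
		return (false);		/* first frame is not 2MB-aligned */
	for (int i = 1; i < NENTRIES; i++) {
		if (table[i] != ((pa + (uint64_t)i * PAGE_SIZE) | attrs))
			return (false);	/* gap or attribute mismatch */
	}
	return (true);
}
```

A single entry with a different attribute bit is enough to make
`can_promote()` return false in this model.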

Modified:
  head/sys/riscv/include/param.h
  head/sys/riscv/include/pmap.h
  head/sys/riscv/include/pte.h
  head/sys/riscv/include/vmparam.h
  head/sys/riscv/riscv/pmap.c
  head/sys/vm/vm_fault.c

Modified: head/sys/riscv/include/param.h
==
--- head/sys/riscv/include/param.h	Wed Feb 13 16:02:55 2019	(r344105)
+++ head/sys/riscv/include/param.h	Wed Feb 13 17:19:37 2019	(r344106)
@@ -82,7 +82,7 @@
 #define	PAGE_SIZE	(1 << PAGE_SHIFT)	/* Page size */
 #define	PAGE_MASK	(PAGE_SIZE - 1)
 
-#define	MAXPAGESIZES	1	/* maximum number of supported page sizes */
+#define	MAXPAGESIZES	3	/* maximum number of supported page sizes */
 
 #ifndef KSTACK_PAGES
 #define	KSTACK_PAGES	4	/* pages of kernel stack (with pcb) */

Modified: head/sys/riscv/include/pmap.h
==
--- head/sys/riscv/include/pmap.h	Wed Feb 13 16:02:55 2019	(r344105)
+++ head/sys/riscv/include/pmap.h	Wed Feb 13 17:19:37 2019	(r344106)
@@ -44,6 +44,8 @@
 #include 
 #include 
 
+#include 
+
 #ifdef _KERNEL
 
 #define	vtophys(va)	pmap_kextract((vm_offset_t)(va))
@@ -80,6 +82,7 @@ struct pmap {
 	pd_entry_t		*pm_l1;
 	TAILQ_HEAD(,pv_chunk)	pm_pvchunk;	/* list of mappings in pmap */
 	LIST_ENTRY(pmap)	pm_list;	/* List of all pmaps */
+	struct vm_radix		pm_root;
 };
 
 typedef struct pv_entry {
@@ -139,6 +142,7 @@ void	pmap_kenter_device(vm_offset_t, vm_size_t, vm_pad
 vm_paddr_t pmap_kextract(vm_offset_t va);
 void	pmap_kremove(vm_offset_t);
 void	pmap_kremove_device(vm_offset_t, vm_size_t);
+bool	pmap_ps_enabled(pmap_t);
 
 void   *pmap_mapdev(vm_offset_t, vm_size_t);
 void   *pmap_mapbios(vm_paddr_t, vm_size_t);

Modified: head/sys/riscv/include/pte.h
==
--- head/sys/riscv/include/pte.h	Wed Feb 13 16:02:55 2019	(r344105)
+++ head/sys/riscv/include/pte.h	Wed Feb 13 17:19:37 2019	(r344106)
@@ -62,7 +62,8 @@ typedef	uint64_t	pn_t;	/* page number */
 #define	L3_SIZE		(1 << L3_SHIFT)
 #define	L3_OFFSET	(L3_SIZE - 1)
 
-#define	Ln_ENTRIES	(1 << 9)
+#define	Ln_ENTRIES_SHIFT 9
+#define	Ln_ENTRIES	(1 << Ln_ENTRIES_SHIFT)
 #define	Ln_ADDR_MASK	(Ln_ENTRIES - 1)
 
 /* Bits 9:8 are reserved for software */
@@ -79,6 +80,8 @@ typedef	uint64_t	pn_t;	/* page number */
 #define	PTE_RWX		(PTE_R | PTE_W | PTE_X)
 #define	PTE_RX		(PTE_R | PTE_X)
 #define	PTE_KERN	(PTE_V | PTE_R | PTE_W | PTE_A | PTE_D)
+#define	PTE_PROMOTE	(PTE_V | PTE_RWX | PTE_D | PTE_A | PTE_G | PTE_U | \
+			 PTE_SW_MANAGED | PTE_SW_WIRED)
 
 #define	PTE_PPN0_S	10
 #define	PTE_PPN1_S	19

Modified: head/sys/riscv/include/vmparam.h
==
--- head/sys/riscv/include/vmparam.hWed Feb 13 16:02:55 2019
(r344105)
+++ head/sys/riscv/include/vmparam.hWed Feb 13 17:19:37 2019
(r3

svn commit: r344107 - head/sys/riscv/riscv

2019-02-13 Thread Mark Johnston
Author: markj
Date: Wed Feb 13 17:38:47 2019
New Revision: 344107
URL: https://svnweb.freebsd.org/changeset/base/344107

Log:
  Implement pmap_clear_modify() for RISC-V.
  
  Reviewed by:  kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision:https://reviews.freebsd.org/D18875

Modified:
  head/sys/riscv/riscv/pmap.c

Modified: head/sys/riscv/riscv/pmap.c
==
--- head/sys/riscv/riscv/pmap.c Wed Feb 13 17:19:37 2019(r344106)
+++ head/sys/riscv/riscv/pmap.c Wed Feb 13 17:38:47 2019(r344107)
@@ -4074,6 +4074,14 @@ pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t 
 void
 pmap_clear_modify(vm_page_t m)
 {
+	struct md_page *pvh;
+	struct rwlock *lock;
+	pmap_t pmap;
+	pv_entry_t next_pv, pv;
+	pd_entry_t *l2, oldl2;
+	pt_entry_t *l3, oldl3;
+	vm_offset_t va;
+	int md_gen, pvh_gen;
 
 	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
 	    ("pmap_clear_modify: page %p is not managed", m));
@@ -4088,8 +4096,78 @@ pmap_clear_modify(vm_page_t m)
 	 */
 	if ((m->aflags & PGA_WRITEABLE) == 0)
 		return;
-
-	/* RISCVTODO: We lack support for tracking if a page is modified */
+	pvh = (m->flags & PG_FICTITIOUS) != 0 ? &pv_dummy :
+	    pa_to_pvh(VM_PAGE_TO_PHYS(m));
+	lock = VM_PAGE_TO_PV_LIST_LOCK(m);
+	rw_rlock(&pvh_global_lock);
+	rw_wlock(lock);
+restart:
+	TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) {
+		pmap = PV_PMAP(pv);
+		if (!PMAP_TRYLOCK(pmap)) {
+			pvh_gen = pvh->pv_gen;
+			rw_wunlock(lock);
+			PMAP_LOCK(pmap);
+			rw_wlock(lock);
+			if (pvh_gen != pvh->pv_gen) {
+				PMAP_UNLOCK(pmap);
+				goto restart;
+			}
+		}
+		va = pv->pv_va;
+		l2 = pmap_l2(pmap, va);
+		oldl2 = pmap_load(l2);
+		if ((oldl2 & PTE_W) != 0) {
+			if (pmap_demote_l2_locked(pmap, l2, va, &lock)) {
+				if ((oldl2 & PTE_SW_WIRED) == 0) {
+					/*
+					 * Write protect the mapping to a
+					 * single page so that a subsequent
+					 * write access may repromote.
+					 */
+					va += VM_PAGE_TO_PHYS(m) -
+					    PTE_TO_PHYS(oldl2);
+					l3 = pmap_l2_to_l3(l2, va);
+					oldl3 = pmap_load(l3);
+					if ((oldl3 & PTE_V) != 0) {
+						while (!atomic_fcmpset_long(l3,
+						    &oldl3, oldl3 & ~(PTE_D |
+						    PTE_W)))
+							cpu_spinwait();
+						vm_page_dirty(m);
+						pmap_invalidate_page(pmap, va);
+					}
+				}
+			}
+		}
+		PMAP_UNLOCK(pmap);
+	}
+	TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) {
+		pmap = PV_PMAP(pv);
+		if (!PMAP_TRYLOCK(pmap)) {
+			md_gen = m->md.pv_gen;
+			pvh_gen = pvh->pv_gen;
+			rw_wunlock(lock);
+			PMAP_LOCK(pmap);
+			rw_wlock(lock);
+			if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) {
+				PMAP_UNLOCK(pmap);
+				goto restart;
+			}
+		}
+		l2 = pmap_l2(pmap, pv->pv_va);
+		KASSERT((pmap_load(l2) & PTE_RWX) == 0,
+		    ("pmap_clear_modify: found a 2mpage in page %p's pv list",
+		    m));
+		l3 = pmap_l2_to_l3(l2, pv->pv_va);
+		if ((pmap_load(l3) & (PTE_D | PTE_W)) == (PTE_D | PTE_W)) {
+			pmap_clear_bits(l3, PTE_D);
+			pmap_invalidate_page(pmap, pv->pv_va);
+		}
+		PMAP_UNLOCK(pmap);
+	}
+	rw_wunlock(lock);
+	rw_runlock(&pvh_global_lock);
 }
 
 void *
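
The `atomic_fcmpset_long()` loop in the diff above clears PTE_D and PTE_W
without losing concurrent updates to other bits of the entry. A userland
analogue can be written with C11 atomics. This is a sketch only:
`pte_clear_bits()` and the `SK_PTE_*` values are illustrative names chosen
here, and `atomic_compare_exchange_weak()` is used as the closest standard
equivalent because, like fcmpset, it writes the observed value back into
`old` when the exchange fails.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative bit values for the sketch. */
#define	SK_PTE_V	0x01u
#define	SK_PTE_W	0x04u
#define	SK_PTE_D	0x80u

/*
 * Atomically clear 'bits' in *pte and return the value that was replaced.
 * On failure the compare-exchange refreshes 'old' with the current
 * contents, so the new value is recomputed from fresh data on each retry.
 */
static uint64_t
pte_clear_bits(_Atomic uint64_t *pte, uint64_t bits)
{
	uint64_t old = atomic_load(pte);

	while (!atomic_compare_exchange_weak(pte, &old, old & ~bits))
		;	/* retry with the refreshed 'old' */
	return (old);
}
```

The returned old value is what lets a caller decide, as the diff does, whether
the page had been dirtied before the bits were cleared.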


svn commit: r344108 - in head/sys/riscv: include riscv

2019-02-13 Thread Mark Johnston
Author: markj
Date: Wed Feb 13 17:50:01 2019
New Revision: 344108
URL: https://svnweb.freebsd.org/changeset/base/344108

Log:
  Implement per-CPU pmap activation tracking for RISC-V.
  
  This reduces the overhead of TLB invalidations by ensuring that we
  only interrupt CPUs which are using the given pmap.  Tracking is
  performed in pmap_activate(), which gets called during context switches:
  from cpu_throw(), if a thread is exiting or an AP is starting, or
  cpu_switch() for a regular context switch.
  
  For now, pmap_sync_icache() still must interrupt all CPUs.
  
  Reviewed by:  kib (earlier version), jhb
  Sponsored by: The FreeBSD Foundation
  Differential Revision:https://reviews.freebsd.org/D18874
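
The idea of activation tracking can be illustrated with a plain bitmask. The
sketch below is a single-threaded userland model and makes several
assumptions: `struct pmap_sketch`, `curpmap[]`, `activate()` and
`shootdown_targets()` are hypothetical names, the CPU set is a 64-bit word
rather than a cpuset_t, and no atomics are used, whereas a kernel would need
atomic updates here.

```c
#include <stddef.h>
#include <stdint.h>

#define	SK_MAXCPU	64

/* Hypothetical reduced pmap: bit n set means CPU n is running it. */
struct pmap_sketch {
	uint64_t active;
};

/* Model of the per-CPU "currently active pmap" slot. */
static struct pmap_sketch *curpmap[SK_MAXCPU];

/* Called on context switch: move 'cpu' from its old pmap to 'pm'. */
static void
activate(int cpu, struct pmap_sketch *pm)
{
	struct pmap_sketch *old = curpmap[cpu];

	if (old == pm)
		return;
	if (old != NULL)
		old->active &= ~(UINT64_C(1) << cpu);
	pm->active |= UINT64_C(1) << cpu;
	curpmap[cpu] = pm;
}

/* CPUs other than 'self' that need an interrupt for a TLB invalidation. */
static uint64_t
shootdown_targets(const struct pmap_sketch *pm, int self)
{
	return (pm->active & ~(UINT64_C(1) << self));
}
```

When only the calling CPU is using the pmap, `shootdown_targets()` returns 0
and no other CPU has to be interrupted, which is the saving the log describes.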

Modified:
  head/sys/riscv/include/pcb.h
  head/sys/riscv/include/pcpu.h
  head/sys/riscv/include/pmap.h
  head/sys/riscv/riscv/genassym.c
  head/sys/riscv/riscv/machdep.c
  head/sys/riscv/riscv/mp_machdep.c
  head/sys/riscv/riscv/pmap.c
  head/sys/riscv/riscv/swtch.S
  head/sys/riscv/riscv/vm_machdep.c

Modified: head/sys/riscv/include/pcb.h
==
--- head/sys/riscv/include/pcb.h	Wed Feb 13 17:38:47 2019	(r344107)
+++ head/sys/riscv/include/pcb.h	Wed Feb 13 17:50:01 2019	(r344108)
@@ -55,7 +55,6 @@ struct pcb {
 #define	PCB_FP_STARTED	0x1
 #define	PCB_FP_USERMASK	0x1
 	uint64_t	pcb_sepc;	/* Supervisor exception pc */
-	vm_offset_t	pcb_l1addr;	/* L1 page tables base address */
 	vm_offset_t	pcb_onfault;	/* Copyinout fault handler */
 };
 

Modified: head/sys/riscv/include/pcpu.h
==
--- head/sys/riscv/include/pcpu.h	Wed Feb 13 17:38:47 2019	(r344107)
+++ head/sys/riscv/include/pcpu.h	Wed Feb 13 17:50:01 2019	(r344108)
@@ -45,6 +45,7 @@
 #define	ALT_STACK_SIZE	128
 
 #define	PCPU_MD_FIELDS							\
+	struct pmap *pc_curpmap;	/* Currently active pmap */	\
 	uint32_t pc_pending_ipis;	/* IPIs pending to this CPU */	\
 	char __pad[61]
 

Modified: head/sys/riscv/include/pmap.h
==
--- head/sys/riscv/include/pmap.h	Wed Feb 13 17:38:47 2019	(r344107)
+++ head/sys/riscv/include/pmap.h	Wed Feb 13 17:50:01 2019	(r344108)
@@ -41,6 +41,7 @@
 #ifndef LOCORE
 
 #include 
+#include 
 #include 
 #include 
 
@@ -80,6 +81,8 @@ struct pmap {
 	struct mtx		pm_mtx;
 	struct pmap_statistics	pm_stats;	/* pmap statictics */
 	pd_entry_t		*pm_l1;
+	u_long			pm_satp;	/* value for SATP register */
+	cpuset_t		pm_active;	/* active on cpus */
 	TAILQ_HEAD(,pv_chunk)	pm_pvchunk;	/* list of mappings in pmap */
 	LIST_ENTRY(pmap)	pm_list;	/* List of all pmaps */
 	struct vm_radix		pm_root;
@@ -137,6 +140,10 @@ extern vm_offset_t virtual_end;
 #define	L1_MAPPABLE_P(va, pa, size)					\
 	((((va) | (pa)) & L1_OFFSET) == 0 && (size) >= L1_SIZE)
 
+struct thread;
+
+void	pmap_activate_boot(pmap_t);
+void	pmap_activate_sw(struct thread *);
 void	pmap_bootstrap(vm_offset_t, vm_paddr_t, vm_size_t);
 void	pmap_kenter_device(vm_offset_t, vm_size_t, vm_paddr_t);
 vm_paddr_t pmap_kextract(vm_offset_t va);

Modified: head/sys/riscv/riscv/genassym.c
==
--- head/sys/riscv/riscv/genassym.c	Wed Feb 13 17:38:47 2019	(r344107)
+++ head/sys/riscv/riscv/genassym.c	Wed Feb 13 17:50:01 2019	(r344108)
@@ -63,7 +63,6 @@ ASSYM(TDF_ASTPENDING, TDF_ASTPENDING);
 ASSYM(TDF_NEEDRESCHED, TDF_NEEDRESCHED);
 
 ASSYM(PCB_ONFAULT, offsetof(struct pcb, pcb_onfault));
-ASSYM(PCB_L1ADDR, offsetof(struct pcb, pcb_l1addr));
 ASSYM(PCB_SIZE, sizeof(struct pcb));
 ASSYM(PCB_RA, offsetof(struct pcb, pcb_ra));
 ASSYM(PCB_SP, offsetof(struct pcb, pcb_sp));

Modified: head/sys/riscv/riscv/machdep.c
==
--- head/sys/riscv/riscv/machdep.c	Wed Feb 13 17:38:47 2019	(r344107)
+++ head/sys/riscv/riscv/machdep.c	Wed Feb 13 17:50:01 2019	(r344108)
@@ -871,10 +871,6 @@ initriscv(struct riscv_bootparams *rvbp)
 
 	init_proc0(rvbp->kern_stack);
 
-	/* set page table base register for thread0 */
-	thread0.td_pcb->pcb_l1addr = \
-	    (rvbp->kern_l1pt - KERNBASE + rvbp->kern_phys);
-
 	msgbufinit(msgbufp, msgbufsize);
 	mutex_init();
 	init_param2(physmem);

Modified: head/sys/riscv/riscv/mp_machdep.c
==
--- head/sys/

Re: svn commit: r344099 - head/sys/net

2019-02-13 Thread Randall Stewart via svn-src-head
I disagree. If you define an alloc it is only
reciprocal that you should define a free.

The code in question that hit this was changed (it's in a version
of rack that has the rate-limit and TLS code), but I think these
things *should* be balanced: if you provide an Allocate, you
should also provide a Free.

R


> On Feb 13, 2019, at 12:09 PM, John Baldwin  wrote:
> 
> On 2/13/19 6:57 AM, Randall Stewart wrote:
>> Author: rrs
>> Date: Wed Feb 13 14:57:59 2019
>> New Revision: 344099
>> URL: https://svnweb.freebsd.org/changeset/base/344099
>> 
>> Log:
>>  This commit adds the missing release mechanism for the
>>  ratelimiting code. The two modules (lagg and vlan) did have
>>  allocation routines, and even though they are indirect (and
>>  vector down to the underlying interfaces) they both need to
>>  have a free routine (that also vectors down to the actual interface).
>> 
>>  Sponsored by:   Netflix Inc.
>>  Differential Revision:  https://reviews.freebsd.org/D19032
> 
> Hmm, I don't understand why you'd ever invoke if_snd_tag_free from anything
> but 'tag->ifp' rather than some other ifp.  What if the route for a connection
> moves so that a tag allocated on cc0 is now on a route that goes over em0?
> You can't expect em0 to have an if_snd_tag_free routine that will know to
> go invoke cxgbe's snd_tag_free.  I think you should always be using
> 'tag->ifp->if_snd_tag_free' to free tags and never using any other ifp.
> 
> That is, I think this should be reverted and that instead you need to fix
> the code invoking if_snd_tag_free to invoke it on the tag's ifp instead of
> some random other ifp.
> 
> -- 
> John Baldwin
> 
> 

--
Randall Stewart
r...@netflix.com





Re: svn commit: r344099 - head/sys/net

2019-02-13 Thread Randall Stewart via svn-src-head
oh and one other thing..

It was *not* a random IFP.. it was the IFP to the lagg.

I.e. an alloc() was done to the lagg.. and the free was
done back to the same IFP (that provided the allocate).

R

> On Feb 13, 2019, at 1:02 PM, Randall Stewart  wrote:
> 
> I disagree. If you define an alloc it is only
> reciprocal that you should define a free.
> 
> The code in question that hit this was changed (its in a version
> of rack that has the rate-limit and TLS code).. but I think these
> things *should* be balanced.. if you provide an Allocate, you
> should also provide a Free… 
> 
> R
> 
> 

--
Randall Stewart
r...@netflix.com





Re: svn commit: r344099 - head/sys/net

2019-02-13 Thread John Baldwin
On 2/13/19 10:03 AM, Randall Stewart wrote:
> oh and one other thing..
> 
> It was *not* a random IFP.. it was the IFP to the lagg.
> 
> I.e. an alloc() was done to the lagg.. and the free was
> done back to the same IFP (that provided the allocate).

Yes, that's wrong.  Suppose the route changes so that my traffic is now over
em0 instead of lagg0 (where em0 isn't a member of the lagg), how do you
expect if_lagg_free to invoke em0's free routine?  In your case it does,
but only by accident.  It doesn't work in the other case I described which
is if you have non-lagg interfaces and a route moves from cc0 to em0.  In
that case your existing code that is using the wrong ifp will just panic.

These aren't real alloc routines, as the lagg and vlan ones don't allocate
anything; they pass the request along to the child, and the child allocates
the tag.  Only ifnets that actually allocate tags should need to free them,
and you should be using tag->ifp as the ifp whose if_snd_tag_free works.



-- 
John Baldwin




svn commit: r344109 - head/lib/libthr/arch/powerpc/include

2019-02-13 Thread Leandro Lupori
Author: luporl
Date: Wed Feb 13 18:28:53 2019
New Revision: 344109
URL: https://svnweb.freebsd.org/changeset/base/344109

Log:
  silence cast-align warnings from clang on powerpc64
  
  silence the following warning when compiling libthr with clang 8
  for powerpc64 architecture:
  
  usr/src/lib/libthr/arch/powerpc/include/pthread_md.h:82:10: error:
  cast from 'uint8_t *' (aka 'unsigned char *') to 'struct tcb *'
  increases required alignment from 1 to 8 [-Werror,-Wcast-align]
  82:  return ((struct tcb *)(_tp - TP_OFFSET));
  
  Submitted by: alfredo.junior_eldorado.org.br
  Reviewed by:  git_bdragon.rtk0.net, emaste, kib, jhibbits, luporl
  Differential Revision:https://reviews.freebsd.org/D18807

Modified:
  head/lib/libthr/arch/powerpc/include/pthread_md.h

Modified: head/lib/libthr/arch/powerpc/include/pthread_md.h
==
--- head/lib/libthr/arch/powerpc/include/pthread_md.h	Wed Feb 13 17:50:01 2019	(r344108)
+++ head/lib/libthr/arch/powerpc/include/pthread_md.h	Wed Feb 13 18:28:53 2019	(r344109)
@@ -72,14 +72,15 @@ _tcb_set(struct tcb *tcb)
 static __inline struct tcb *
 _tcb_get(void)
 {
-	register uint8_t *_tp;
+	register struct tcb *tcb;
+
 #ifdef __powerpc64__
-	__asm __volatile("mr %0,13" : "=r"(_tp));
+	__asm __volatile("addi %0,13,%1" : "=r"(tcb) : "i"(-TP_OFFSET));
 #else
-	__asm __volatile("mr %0,2" : "=r"(_tp));
+	__asm __volatile("addi %0,2,%1" : "=r"(tcb) : "i"(-TP_OFFSET));
 #endif
 
-	return ((struct tcb *)(_tp - TP_OFFSET));
+	return (tcb);
 }
 
 static __inline struct pthread *


svn commit: r344112 - head/contrib/llvm/lib/MC

2019-02-13 Thread Dimitry Andric
Author: dim
Date: Wed Feb 13 20:13:40 2019
New Revision: 344112
URL: https://svnweb.freebsd.org/changeset/base/344112

Log:
  Pull in r353907 from upstream llvm trunk (by Reid Kleckner):
  
[MC] Make symbol version errors non-fatal
  
We still don't have a source location, which is pretty lame, but at
least we won't tell the user to file a clang bug report anymore.
  
Fixes PR40712
  
  This will make errors for symbols with @@ versions that are not defined
  non-fatal.  For example:
  
void f(void)
{
  __asm__(".symver foo,bar@@baz");
}
  
  will now result in:
  
error: versioned symbol bar@@baz must be defined
  
  instead of clang crashing with a diagnostic report.
  
  PR:   234671
  Upstream PR:  https://bugs.llvm.org/show_bug.cgi?id=40712
  MFC after:3 days

Modified:
  head/contrib/llvm/lib/MC/ELFObjectWriter.cpp

Modified: head/contrib/llvm/lib/MC/ELFObjectWriter.cpp
==
--- head/contrib/llvm/lib/MC/ELFObjectWriter.cpp	Wed Feb 13 19:00:06 2019	(r344111)
+++ head/contrib/llvm/lib/MC/ELFObjectWriter.cpp	Wed Feb 13 20:13:40 2019	(r344112)
@@ -1258,14 +1258,20 @@ void ELFObjectWriter::executePostLayoutBinding(MCAssem
     if (!Symbol.isUndefined() && !Rest.startswith("@@@"))
       continue;
 
-    // FIXME: produce a better error message.
+    // FIXME: Get source locations for these errors or diagnose them earlier.
     if (Symbol.isUndefined() && Rest.startswith("@@") &&
-        !Rest.startswith("@@@"))
-      report_fatal_error("A @@ version cannot be undefined");
+        !Rest.startswith("@@@")) {
+      Asm.getContext().reportError(SMLoc(), "versioned symbol " + AliasName +
+                                                " must be defined");
+      continue;
+    }
 
-    if (Renames.count(&Symbol) && Renames[&Symbol] != Alias)
-      report_fatal_error(llvm::Twine("Multiple symbol versions defined for ") +
-                         Symbol.getName());
+    if (Renames.count(&Symbol) && Renames[&Symbol] != Alias) {
+      Asm.getContext().reportError(
+          SMLoc(), llvm::Twine("multiple symbol versions defined for ") +
+                       Symbol.getName());
+      continue;
+    }
 
     Renames.insert(std::make_pair(&Symbol, Alias));
   }


Re: svn commit: r343030 - in head/sys: cam conf dev/md dev/nvme fs/fuse fs/nfsclient fs/smbfs kern sys ufs/ffs vm

2019-02-13 Thread Justin Hibbits
On Tue, 15 Jan 2019 01:02:17 + (UTC)
Gleb Smirnoff  wrote:

> Author: glebius
> Date: Tue Jan 15 01:02:16 2019
> New Revision: 343030
> URL: https://svnweb.freebsd.org/changeset/base/343030
> 
> Log:
>   Allocate pager bufs from UMA instead of 80-ish mutex protected
> linked list. 
>   o In vm_pager_bufferinit() create pbuf_zone and start accounting on
> how many pbufs are we going to have set.
> In various subsystems that are going to utilize pbufs create
> private zones via call to pbuf_zsecond_create(). The latter calls
> uma_zsecond_create(), and sets a limit on created zone. After startup
> preallocate pbufs according to requirements of all pbuf zones.
>   
> Subsystems that used to have a private limit with old allocator
> now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS
> cluster, FFS, swap, vnode pager.
>   
> The following subsystems use shared pbuf zone: cam(4), nvme(4),
> physio(9), aio(4). They should have their private limits, but
> changing that is out of scope of this commit.
>   
>   o Fetch tunable value of kern.nswbuf from init_param2() and while
> here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that
> was holding only this option.
> Default values aren't touched by this commit, but they probably
> should be reviewed wrt to modern hardware.
>   
>   This change removes a tight bottleneck from sendfile(2) operation,
> that uses pbufs in vnode pager. Other pagers also would benefit from
> faster allocation.
>   
>   Together with:  gallatin
>   Tested by:  pho
> 
> Modified:
>   head/sys/cam/cam_periph.c
>   head/sys/conf/options
>   head/sys/dev/md/md.c
>   head/sys/dev/nvme/nvme_ctrlr.c
>   head/sys/fs/fuse/fuse_main.c
>   head/sys/fs/fuse/fuse_vnops.c
>   head/sys/fs/nfsclient/nfs_clbio.c
>   head/sys/fs/nfsclient/nfs_clport.c
>   head/sys/fs/smbfs/smbfs_io.c
>   head/sys/fs/smbfs/smbfs_vfsops.c
>   head/sys/kern/kern_physio.c
>   head/sys/kern/subr_param.c
>   head/sys/kern/vfs_aio.c
>   head/sys/kern/vfs_bio.c
>   head/sys/kern/vfs_cluster.c
>   head/sys/sys/buf.h
>   head/sys/ufs/ffs/ffs_rawread.c
>   head/sys/vm/swap_pager.c
>   head/sys/vm/vm_pager.c
>   head/sys/vm/vnode_pager.c
> 

Hi Gleb,

This seems to break 32-bit platforms, or at least 32-bit book-e
powerpc, which has a limited KVA space (~500MB).  I've seen it
preallocate over 2500 pbufs, at 128kB each, eating up over 300MB of KVA
and leaving very little for the rest of runtime.
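
The figures in the paragraph above can be checked with a short calculation.
The helpers below, `pbuf_kva_bytes()` and `percent_of()`, are hypothetical
names used only for this arithmetic. The inputs are the round numbers quoted
in this message (2500 pbufs, 128kB each, roughly 500MB of KVA), with kB and
MB taken as powers of two, which is an assumption.

```c
#include <stdint.h>

/* KVA consumed by 'npbufs' preallocated buffers of 'kva_per_pbuf' bytes. */
static uint64_t
pbuf_kva_bytes(uint64_t npbufs, uint64_t kva_per_pbuf)
{
	return (npbufs * kva_per_pbuf);
}

/* Integer percentage of 'total' taken by 'used', rounded down. */
static uint64_t
percent_of(uint64_t used, uint64_t total)
{
	return (used * 100 / total);
}
```

With these inputs, 2500 * 128kB is 327,680,000 bytes, or 312.5MB, which is a
little over 62% of a 500MB KVA and consistent with the "over 300MB" reported.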

I spent a couple of hours earlier today debugging with Mark Johnston, and
his conclusion is that the vnode_pbuf_zone is too big on 32-bit
platforms.  Unfortunately I know very little about this area, so can't
provide much extra insight, but can readily reproduce the issues I see
triggered by this change, so am willing to help where I can.

- Justin


Re: svn commit: r343030 - in head/sys: cam conf dev/md dev/nvme fs/fuse fs/nfsclient fs/smbfs kern sys ufs/ffs vm

2019-02-13 Thread Bruce Evans

On Wed, 13 Feb 2019, Justin Hibbits wrote:


> On Tue, 15 Jan 2019 01:02:17 +0000 (UTC)
> Gleb Smirnoff wrote:
>
> > Author: glebius
> > Date: Tue Jan 15 01:02:16 2019
> > New Revision: 343030
> > URL: https://svnweb.freebsd.org/changeset/base/343030
> >
> > Log:
> >   Allocate pager bufs from UMA instead of 80-ish mutex protected
> > linked list.

...

> This seems to break 32-bit platforms, or at least 32-bit book-e
> powerpc, which has a limited KVA space (~500MB).  I've seen it
> preallocate over 2500 pbufs, at 128kB each, eating up over 300MB of
> KVA and leaving very little for the rest of runtime.


Hrmph.  I complained about other things in this commit when it was
committed, but not about this largest bug: preallocation was broken
then, so I thought that it wasn't done, and that the problems would be
smaller unless the excessive limits were actually reached.

Now i386 does it:

XX ITEM       SIZE  LIMIT   USED  FREE   REQ  FAIL  SLEEP
XX
XX swrbuf:     336,   128,    0,    0,    0,    0,     0
XX swwbuf:     336,    64,    0,    0,    0,    0,     0
XX nfspbuf:    336,   128,    0,    0,    0,    0,     0
XX mdpbuf:     336,    25,    0,    0,    0,    0,     0
XX clpbuf:     336,   128,    0,    5,    4,    0,     0
XX vnpbuf:     336,  2048,    0,    0,    0,    0,     0
XX pbuf:       336,    16,    0, 2535,    0,    0,     0

but i386 now has 4GB of KVA, with almost 3GB to waste, so the bug is not
noticed there.

The preallocation wasn't there in my last mail to the author about nearby
bugs, on 24 Jan 2019:

YY vnpbuf:     568,  2048,    0,    0,    0,    0,     0
YY clpbuf:     568,   128,    0,  128, 8750,    0,     1
YY pbuf:       568,    16,    0,    4,    0,    0,     0

This output is on amd64 where the SIZE is larger and everything else was
the same as on i386.  Now amd64 shows the large preallocation too.

There seems to be another bug: the especially small LIMIT of 16 turns
into a preallocation of 2535, and the excess is not immediately reduced
to the limit.

I happen to have kernels from 24 and 25 Jan handy.  The first one is
amd64 r343346M built on Jan 23, and it doesn't do the large
preallocation.  The second one is i386 r343388:343418M built on Jan
25, and it does the large preallocation.  Both call uma_prealloc() to
ask for nswbuf_max = 0x9e9 buffers, but the old version only allocates
4 buffers while the later version allocates 0x9e9 buffers.

The only relevant commit between the good and bad versions seems to be
r343453.  This fixes uma_prealloc() to actually work.  But it is a feature
for it to not work when its caller asks for too much.

0x9e9 is the sum of the LIMITs of all pbuf pools.  The main bug in
r343030 is that it expands nswbuf, which is supposed to give the
combined limit, from its normal value of 256 to 0x9e9.  (r343030
actually used nswbuf before it was properly initialized, so used its
maximum value of 256 even on small systems with nswbuf = 16.  Only
this has been fixed.)

On i386, nbuf is excessively limited to give a maxbufspace of about
100MB, so that it fits in 1GB of kva even with infinite RAM and
despite -current's actual 4GB of kva.  nbuf is correctly limited to
give a much smaller maxbufspace when RAM is small (kva scaling for
this is not done so well).  nswbuf is restricted if nbuf is
restricted, but not enough (except in my version).  It is normally
256, so the pbuf allocation used to be 32MB, and this is already a bit
large compared with 100MB for maxbufspace.  Expanding pbufs by a
factor of 0x9e9/0x100 (about 10) gives the silly combination of 100MB
for maxbufspace and 317MB for pbufs.

If kva is only 512MB instead of 1GB, then maxbufspace should be only
50MB and nswbuf should be smaller too.  Similarly for PAE on i386 back
when it was configured with 1GB kva by default.  Only about 512MB are
left after allocating space for page table metadata.  I have fixes
that scale most of this better.  Large subsystems starting with kmem
get a hard-coded fraction of the usable kva.  E.g., kmem gets about
60% of usable kva instead of about 40% of nominal kva.  Most other
large subsystems including the buffer cache get about 1/8 of the
remaining 40% of usable kva.  Scaling for other subsystems is mostly
worse than for kmem.  pbufs are part of the buffer cache allocation.
The expansion factor of 0x9e9/0x100 breaks this.

I don't understand how pbuf_preallocate() allocates for the other
pbuf pools.  When I debugged this for clpbufs, the preallocation was
not used.  pbuf types other than clpbufs seem to be unused in my
configurations.  I thought that pbufs were used during initialization,
since they end up with a nonzero FREE count, but their only use seems
to be to preallocate them.

Bruce