Re: svn commit: r238755 - head/sys/x86/x86
on 25/07/2012 01:10 Jim Harris said the following:
> Author: jimharris
> Date: Tue Jul 24 22:10:11 2012
> New Revision: 238755
> URL: http://svn.freebsd.org/changeset/base/238755
>
> Log:
>   Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures.
>
>   Intel Architecture Manual specifies that rdtsc instruction is not serialized,
>   so without this change, TSC synchronization test would periodically fail,
>   resulting in use of HPET timecounter instead of TSC-low.  This caused
>   severe performance degradation (40-50%) when running high IO/s workloads due to
>   HPET MMIO reads and GEOM stat collection.
>
>   Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
>   fail approximately 20% of the time.

Should rather the synchronization test be fixed if it's the culprit?
Or is this change universally good for the real uses of TSC?

>   Sponsored by: Intel
>   Reviewed by: kib
>   MFC after: 3 days
>
> Modified:
>   head/sys/x86/x86/tsc.c
>
> Modified: head/sys/x86/x86/tsc.c
> ==
> --- head/sys/x86/x86/tsc.c	Tue Jul 24 20:15:41 2012	(r238754)
> +++ head/sys/x86/x86/tsc.c	Tue Jul 24 22:10:11 2012	(r238755)
> @@ -328,6 +328,7 @@ init_TSC(void)
>
>  #ifdef SMP
>
> +/* rmb is required here because rdtsc is not a serializing instruction. */
>  #define TSC_READ(x) \
>  static void \
>  tsc_read_##x(void *arg) \
> @@ -335,6 +336,7 @@ tsc_read_##x(void *arg) \
>  	uint32_t *tsc = arg;\
>  	u_int cpu = PCPU_GET(cpuid);\
>  \
> +	rmb(); \
>  	tsc[cpu * 3 + x] = rdtsc32(); \
>  }
>  TSC_READ(0)
>

--
Andriy Gapon
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
Re: svn commit: r238722 - in head/lib/msun: . ld128 ld80 man src
On 2012-Jul-24 13:57:12 -0400, David Schultz wrote:
>On Tue, Jul 24, 2012, Steve Kargl wrote:
>> On Tue, Jul 24, 2012 at 08:43:35AM +, Alexey Dokuchaev wrote:
>> > On Mon, Jul 23, 2012 at 07:13:56PM +, Steve Kargl wrote:
>> > > Compute the exponential of x for Intel 80-bit format and IEEE 128-bit
>> > > format.  These implementations are based on
>> > I believe some ports could benefit from OSVERSION bump for this one.
...
>against.  In this case, it would help any ports that have
>workarounds for the lack of expl() to compile both before and
>after this change.  But it's also important not to bump the
>version gratuitously if there's no reason to believe the change
>might introduce incompatibilities.

Hopefully, this is just the first of a series of similar commits over
the next 4-5 months so if we bump OSVERSION for this, we are probably
looking at another half-dozen or so bumps.

Do any ports actually have a hard-wired decision for expl() (or other
libm functions)?  I would hope most ports that are interested in
complex and/or long double functions have some sort of configure-time
test that will automatically detect their presence or absence.

--
Peter Jeremy
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote:
> on 25/07/2012 01:10 Jim Harris said the following:
> > Author: jimharris
> > Date: Tue Jul 24 22:10:11 2012
> > New Revision: 238755
> > URL: http://svn.freebsd.org/changeset/base/238755
> >
> > Log:
> >   Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures.
> >
> >   Intel Architecture Manual specifies that rdtsc instruction is not serialized,
> >   so without this change, TSC synchronization test would periodically fail,
> >   resulting in use of HPET timecounter instead of TSC-low.  This caused
> >   severe performance degradation (40-50%) when running high IO/s workloads due to
> >   HPET MMIO reads and GEOM stat collection.
> >
> >   Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
> >   fail approximately 20% of the time.
>
> Should rather the synchronization test be fixed if it's the culprit?
Synchronization test for what?

> Or is this change universally good for the real uses of TSC?

What I understood from the Intel SDM, and also from additional experiments
which Jim kindly made despite me being annoying as usual, is that 'read
memory barrier' AKA LFENCE there is used for its secondary implementation
effects, not for load/load barrier as you might assume.

According to SDM, LFENCE fully drains execution pipeline (but comparing
with MFENCE, does not drain write buffers). The result is that RDTSC is
not started before previous instructions are finished.

For tsc test, this means that after the change RDTSC executions are not
reordered on the single core among themselves. As I understand, CPU has
no dependency noted between two reads of tsc by RDTSC, which allows
later read to give lower value of counter. This is fixed by Intel by
introduction of RDTSCP instruction, which is defined to be serialization
point, and use of which (instead of LFENCE; RDTSC sequence) also fixes
test, as confirmed by Jim.

In fact, I now think that we should also apply the following patch.
Otherwise, consecutive calls to e.g. binuptime(9) could return decreased
time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the
tsc reads, which was done not for this reason, but apparently needed for
the reason too.

diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
index 085c339..229b351 100644
--- a/sys/x86/x86/tsc.c
+++ b/sys/x86/x86/tsc.c
@@ -594,6 +594,7 @@ static u_int
 tsc_get_timecount(struct timecounter *tc __unused)
 {
 
+	rmb();
 	return (rdtsc32());
 }
 
@@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc)
 {
 	uint32_t rv;
 
+	rmb();
 	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
-	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
+	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
 	return (rv);
 }
 
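As an illustration of the "LFENCE; RDTSC" sequence described in the message above, here is a minimal userland sketch in C. It is not the FreeBSD code: the name rdtsc_ordered() is hypothetical, GCC/Clang inline assembly is assumed, and the non-x86 branch exists only so that the sketch builds elsewhere. Whether LFENCE is the right fence depends on the CPU vendor; later messages in this thread suggest MFENCE on AMD.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not the FreeBSD function): read the TSC only after
 * the preceding instructions have completed, using the LFENCE; RDTSC
 * sequence discussed in the thread.
 */
static inline uint64_t
rdtsc_ordered(void)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t lo, hi;

	/* The "memory" clobber also keeps the compiler from reordering. */
	__asm__ __volatile__("lfence; rdtsc"
	    : "=a" (lo), "=d" (hi) : : "memory");
	return (((uint64_t)hi << 32) | lo);
#else
	/* Assumption: a non-decreasing stand-in so the sketch still builds. */
	return ((uint64_t)clock());
#endif
}
```

Two back-to-back calls on the same core should then not go backwards, which is the property the synchronization test relies on.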
svn commit: r238765 - head/sys/dev/e1000
Author: luigi
Date: Wed Jul 25 11:28:15 2012
New Revision: 238765
URL: http://svn.freebsd.org/changeset/base/238765

Log:
  Use legacy interrupts as a default. This gives up to 10% speedup
  when used in qemu (and this driver is for non-PCIe cards,
  so probably its largest use is in virtualized environments).

  Approved by: Jack Vogel
  MFC after: 3 days

Modified:
  head/sys/dev/e1000/if_lem.c

Modified: head/sys/dev/e1000/if_lem.c
==
--- head/sys/dev/e1000/if_lem.c	Wed Jul 25 10:55:14 2012	(r238764)
+++ head/sys/dev/e1000/if_lem.c	Wed Jul 25 11:28:15 2012	(r238765)
@@ -239,6 +239,7 @@ static void lem_enable_wakeup(device
 static int lem_enable_phy_wakeup(struct adapter *);
 static void lem_led_func(void *, int);
 
+#define EM_LEGACY_IRQ /* slightly faster, at least in qemu */
 #ifdef EM_LEGACY_IRQ
 static void lem_intr(void *);
 #else /* FAST IRQ */
@@ -1549,6 +1550,13 @@ lem_xmit(struct adapter *adapter, struct
 	u32 txd_upper, txd_lower, txd_used, txd_saved;
 	int error, nsegs, i, j, first, last = 0;
 
+extern int netmap_drop;
+	if (netmap_drop == 95) {
+dropme:
+		m_freem(*m_headp);
+		*m_headp = NULL;
+		return (ENOBUFS);
+	}
 	m_head = *m_headp;
 	txd_upper = txd_lower = txd_used = txd_saved = 0;
 
@@ -1688,6 +1696,9 @@ lem_xmit(struct adapter *adapter, struct
 		}
 	}
 
+	if (netmap_drop == 96)
+		goto dropme;
+
 	adapter->next_avail_tx_desc = i;
 
 	if (adapter->pcix_82544)
@@ -1715,6 +1726,16 @@ lem_xmit(struct adapter *adapter, struct
 	 */
 	ctxd->lower.data |=
 	    htole32(E1000_TXD_CMD_EOP | E1000_TXD_CMD_RS);
+
+if (netmap_drop == 97) {
+	static int count=0;
+	if (count++ & 63 != 0)
+		ctxd->lower.data &=
+		    ~htole32(E1000_TXD_CMD_RS);
+	else
+		D("preserve RS");
+
+}
 	/*
 	 * Keep track in the first buffer which
 	 * descriptor will be written back
@@ -1733,6 +1754,12 @@ lem_xmit(struct adapter *adapter, struct
 	    adapter->link_duplex == HALF_DUPLEX)
 		lem_82547_move_tail(adapter);
 	else {
+extern int netmap_repeat;
+	if (netmap_repeat) {
+		int x;
+		for (x = 0; x < netmap_repeat; x++)
+			E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), i);
+	}
 		E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), i);
 		if (adapter->hw.mac.type == e1000_82547)
 			lem_82547_update_fifo_head(adapter,
@@ -2986,6 +3013,13 @@ lem_txeof(struct adapter *adapter)
 		return;
 	}
 #endif /* DEV_NETMAP */
+{
+	static int drops = 0;
+	if (netmap_copy && drops++ < netmap_copy)
+		return;
+	drops = 0;
+}
+
 	if (adapter->num_tx_desc_avail == adapter->num_tx_desc)
 		return;
 
svn commit: r238766 - in head/sys/dev/usb: . serial
Author: gavin
Date: Wed Jul 25 11:33:43 2012
New Revision: 238766
URL: http://svn.freebsd.org/changeset/base/238766

Log:
  Update the list of devices supported by uplcom.  Although this only adds
  one device (support for Motorola cables), this synchronises us with:

  OpenBSD src/sys/dev/usb/uplcom.c 1.56
  NetBSD src/sys/dev/usb/uplcom.c 1.73
  Linux kernel.org HEAD

  MFC after: 1 week

Modified:
  head/sys/dev/usb/serial/uplcom.c
  head/sys/dev/usb/usbdevs

Modified: head/sys/dev/usb/serial/uplcom.c
==
--- head/sys/dev/usb/serial/uplcom.c	Wed Jul 25 11:28:15 2012	(r238765)
+++ head/sys/dev/usb/serial/uplcom.c	Wed Jul 25 11:33:43 2012	(r238766)
@@ -279,6 +279,7 @@ static const STRUCT_USB_HOST_ID uplcom_d
 	UPLCOM_DEV(PROLIFIC, DCU11),		/* DCU-11 Phone Cable */
 	UPLCOM_DEV(PROLIFIC, HCR331),		/* HCR331 Card Reader */
 	UPLCOM_DEV(PROLIFIC, MICROMAX_610U),	/* Micromax 610U modem */
+	UPLCOM_DEV(PROLIFIC, MOTOROLA),		/* Motorola cable */
 	UPLCOM_DEV(PROLIFIC, PHAROS),		/* Prolific Pharos */
 	UPLCOM_DEV(PROLIFIC, PL2303),		/* Generic adapter */
 	UPLCOM_DEV(PROLIFIC, RSAQ2),		/* I/O DATA USB-RSAQ2 */

Modified: head/sys/dev/usb/usbdevs
==
--- head/sys/dev/usb/usbdevs	Wed Jul 25 11:28:15 2012	(r238765)
+++ head/sys/dev/usb/usbdevs	Wed Jul 25 11:33:43 2012	(r238766)
@@ -2667,6 +2667,7 @@ product PRIMAX HP_RH304AA 0x4d17 HP RH30
 /* Prolific products */
 product PROLIFIC PL2301 0x PL2301 Host-Host interface
 product PROLIFIC PL2302 0x0001 PL2302 Host-Host interface
+product PROLIFIC MOTOROLA 0x0307 Motorola Cable
 product PROLIFIC RSAQ2 0x04bb PL2303 Serial (IODATA USB-RSAQ2)
 product PROLIFIC ALLTRONIX_GPRS 0x0609 Alltronix ACM003U00 modem
 product PROLIFIC ALDIGA_AL11U 0x0611 AlDiga AL-11U modem
svn commit: r238769 - head/sys/netinet
Author: bz
Date: Wed Jul 25 12:14:39 2012
New Revision: 238769
URL: http://svn.freebsd.org/changeset/base/238769

Log:
  Fix a problem when CARP is enabled on the interface for IPv4 but not
  for IPv6.  The current checks in nd6_nbr.c along with the old version
  will result in ifa being NULL and subsequently the packet will be dropped.
  This prevented NS/NA from working, and with that IPv6.

  Now return the ifa from the carp lookup function in two cases:
  1) if the address matches, is a carp address, and we are MASTER (as before),
  2) if the address matches but it is not a carp address at all (new).

  Reported by: Peter Wemm (new Y! FreeBSD cluster, eating our own dogfood)
  Tested on: New Y! FreeBSD cluster machines
  Reviewed by: glebius

Modified:
  head/sys/netinet/ip_carp.c

Modified: head/sys/netinet/ip_carp.c
==
--- head/sys/netinet/ip_carp.c	Wed Jul 25 12:06:52 2012	(r238768)
+++ head/sys/netinet/ip_carp.c	Wed Jul 25 12:14:39 2012	(r238769)
@@ -1027,23 +1027,31 @@ carp_send_na(struct carp_softc *sc)
 	}
 }
 
+/*
+ * Returns ifa in case it's a carp address and it is MASTER, or if the address
+ * matches and is not a carp address.  Returns NULL otherwise.
+ */
 struct ifaddr *
 carp_iamatch6(struct ifnet *ifp, struct in6_addr *taddr)
 {
 	struct ifaddr *ifa;
 
+	ifa = NULL;
 	IF_ADDR_RLOCK(ifp);
-	IFNET_FOREACH_IFA(ifp, ifa)
-		if (ifa->ifa_addr->sa_family == AF_INET6 &&
-		    ifa->ifa_carp->sc_state == MASTER &&
-		    IN6_ARE_ADDR_EQUAL(taddr, IFA_IN6(ifa))) {
+	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+		if (ifa->ifa_addr->sa_family != AF_INET6)
+			continue;
+		if (!IN6_ARE_ADDR_EQUAL(taddr, IFA_IN6(ifa)))
+			continue;
+		if (ifa->ifa_carp && ifa->ifa_carp->sc_state != MASTER)
+			ifa = NULL;
+		else
 			ifa_ref(ifa);
-			IF_ADDR_RUNLOCK(ifp);
-			return (ifa);
-		}
+		break;
+	}
 	IF_ADDR_RUNLOCK(ifp);
 
-	return (NULL);
+	return (ifa);
 }
 
 caddr_t
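The two cases listed in the log above can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: struct addr_entry, enum carp_state and iamatch6_sketch() are hypothetical stand-ins for the kernel's ifaddr list and carp_iamatch6(), and locking and reference counting are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum carp_state { CARP_INIT, CARP_BACKUP, CARP_MASTER };

struct addr_entry {
	unsigned char	addr[16];	/* IPv6 address */
	int		is_carp;	/* non-zero if CARP manages it */
	enum carp_state	state;		/* meaningful only if is_carp */
};

/*
 * Return the first entry whose address matches, provided it is either a
 * non-CARP address or a CARP address in MASTER state.  Return NULL when
 * nothing matches or the match is a CARP address that is not MASTER.
 */
static const struct addr_entry *
iamatch6_sketch(const struct addr_entry *tab, size_t n,
    const unsigned char target[16])
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (memcmp(tab[i].addr, target, 16) != 0)
			continue;
		/* The first address match decides, as in the commit. */
		if (tab[i].is_carp && tab[i].state != CARP_MASTER)
			return (NULL);
		return (&tab[i]);
	}
	return (NULL);
}
```

With the old check, a plain (non-CARP) IPv6 address could never be returned, which is the case the commit adds.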
Re: svn commit: r238755 - head/sys/x86/x86
on 25/07/2012 13:21 Konstantin Belousov said the following:
> On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote:
>> on 25/07/2012 01:10 Jim Harris said the following:
>>> Author: jimharris
>>> Date: Tue Jul 24 22:10:11 2012
>>> New Revision: 238755
>>> URL: http://svn.freebsd.org/changeset/base/238755
>>>
>>> Log:
>>>   Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures.
>>>
>>>   Intel Architecture Manual specifies that rdtsc instruction is not serialized,
>>>   so without this change, TSC synchronization test would periodically fail,
>>>   resulting in use of HPET timecounter instead of TSC-low.  This caused
>>>   severe performance degradation (40-50%) when running high IO/s workloads due to
>>>   HPET MMIO reads and GEOM stat collection.
>>>
>>>   Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
>>>   fail approximately 20% of the time.
>>
>> Should rather the synchronization test be fixed if it's the culprit?
> Synchronization test for what?

The synchronization test mentioned above.
So, oops, very sorry - I missed the fact that the change was precisely in the
test.  I confused it for another place where tsc is used.  Thank you for
pointing this out.

>> Or is this change universally good for the real uses of TSC?
>
> What I understood from the Intel SDM, and also from additional experiments
> which Jim kindly made despite me being annoying as usual, is that 'read
> memory barrier' AKA LFENCE there is used for its secondary implementation
> effects, not for load/load barrier as you might assume.
>
> According to SDM, LFENCE fully drains execution pipeline (but comparing
> with MFENCE, does not drain write buffers). The result is that RDTSC is
> not started before previous instructions are finished.

Yes, I am fully aware of this.

> For tsc test, this means that after the change RDTSC executions are not
> reordered on the single core among themselves. As I understand, CPU has
> no dependency noted between two reads of tsc by RDTSC, which allows
> later read to give lower value of counter. This is fixed by Intel by
> introduction of RDTSCP instruction, which is defined to be serialization
> point, and use of which (instead of LFENCE; RDTSC sequence) also fixes
> test, as confirmed by Jim.

Yes.  I think that previously Intel recommended to precede rdtsc with cpuid for
all the same reasons.  Not sure if there is any difference performance-wise
compared to lfence.
Unfortunately, rdtscp is not available on all CPUs, so using it would require
extra work.

> In fact, I now think that we should also apply the following patch.
> Otherwise, consecutive calls to e.g. binuptime(9) could return decreased
> time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the
> tsc reads, which was done not for this reason, but apparently needed for
> the reason too.
>
> diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
> index 085c339..229b351 100644
> --- a/sys/x86/x86/tsc.c
> +++ b/sys/x86/x86/tsc.c
> @@ -594,6 +594,7 @@ static u_int
>  tsc_get_timecount(struct timecounter *tc __unused)
>  {
>
> +	rmb();
>  	return (rdtsc32());
>  }
>

This makes sense to me.  We probably want correctness over performance here.
[BTW, I originally thought that the change was here; brain malfunction]

> @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc)
>  {
>  	uint32_t rv;
>
> +	rmb();
>  	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
> -	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> +	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
>  	return (rv);
>  }
>

It would be correct here too, but I am not sure if it would make any difference
given that some lower bits are discarded anyway.  Probably depends on exact CPU.

And, oh hmm, I read AMD Software Optimization Guide for AMD Family 10h
Processors and they suggest using cpuid (with a note that it may be
intercepted in virtualized environments) or _mfence_ in the discussed
role (Appendix F of the document).
Googling for 'rdtsc mfence lfence' yields some interesting results.

--
Andriy Gapon
svn commit: r238770 - head/sys/dev/e1000
Author: luigi
Date: Wed Jul 25 12:51:33 2012
New Revision: 238770
URL: http://svn.freebsd.org/changeset/base/238770

Log:
  remove some extra testing code that slipped into the previous commit

  Reported-by: Alexander Motin

Modified:
  head/sys/dev/e1000/if_lem.c

Modified: head/sys/dev/e1000/if_lem.c
==
--- head/sys/dev/e1000/if_lem.c	Wed Jul 25 12:14:39 2012	(r238769)
+++ head/sys/dev/e1000/if_lem.c	Wed Jul 25 12:51:33 2012	(r238770)
@@ -1550,13 +1550,6 @@ lem_xmit(struct adapter *adapter, struct
 	u32 txd_upper, txd_lower, txd_used, txd_saved;
 	int error, nsegs, i, j, first, last = 0;
 
-extern int netmap_drop;
-	if (netmap_drop == 95) {
-dropme:
-		m_freem(*m_headp);
-		*m_headp = NULL;
-		return (ENOBUFS);
-	}
 	m_head = *m_headp;
 	txd_upper = txd_lower = txd_used = txd_saved = 0;
 
@@ -1696,9 +1689,6 @@ dropme:
 		}
 	}
 
-	if (netmap_drop == 96)
-		goto dropme;
-
 	adapter->next_avail_tx_desc = i;
 
 	if (adapter->pcix_82544)
@@ -1726,16 +1716,6 @@ dropme:
 	 */
 	ctxd->lower.data |=
 	    htole32(E1000_TXD_CMD_EOP | E1000_TXD_CMD_RS);
-
-if (netmap_drop == 97) {
-	static int count=0;
-	if (count++ & 63 != 0)
-		ctxd->lower.data &=
-		    ~htole32(E1000_TXD_CMD_RS);
-	else
-		D("preserve RS");
-
-}
 	/*
 	 * Keep track in the first buffer which
 	 * descriptor will be written back
@@ -1754,12 +1734,6 @@ if (netmap_drop == 97) {
 	    adapter->link_duplex == HALF_DUPLEX)
 		lem_82547_move_tail(adapter);
 	else {
-extern int netmap_repeat;
-	if (netmap_repeat) {
-		int x;
-		for (x = 0; x < netmap_repeat; x++)
-			E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), i);
-	}
 		E1000_WRITE_REG(&adapter->hw, E1000_TDT(0), i);
 		if (adapter->hw.mac.type == e1000_82547)
 			lem_82547_update_fifo_head(adapter,
@@ -3013,13 +2987,6 @@ lem_txeof(struct adapter *adapter)
 		return;
 	}
 #endif /* DEV_NETMAP */
-{
-	static int drops = 0;
-	if (netmap_copy && drops++ < netmap_copy)
-		return;
-	drops = 0;
-}
-
 	if (adapter->num_tx_desc_avail == adapter->num_tx_desc)
 		return;
 
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 03:29:34PM +0300, Andriy Gapon wrote: > on 25/07/2012 13:21 Konstantin Belousov said the following: > > On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote: > >> on 25/07/2012 01:10 Jim Harris said the following: > >>> Author: jimharris > >>> Date: Tue Jul 24 22:10:11 2012 > >>> New Revision: 238755 > >>> URL: http://svn.freebsd.org/changeset/base/238755 > >>> > >>> Log: > >>> Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures. > >>> > >>> Intel Architecture Manual specifies that rdtsc instruction is not > >>> serialized, > >>> so without this change, TSC synchronization test would periodically > >>> fail, > >>> resulting in use of HPET timecounter instead of TSC-low. This caused > >>> severe performance degradation (40-50%) when running high IO/s > >>> workloads due to > >>> HPET MMIO reads and GEOM stat collection. > >>> > >>> Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC > >>> synchronization > >>> fail approximately 20% of the time. > >> > >> Should rather the synchronization test be fixed if it's the culprit? > > Synchronization test for what ? > > The synchronization test mentioned above. > So, oops, very sorry - I missed the fact that the change was precisely in the > test. I confused it for another place where tsc is used. Thank you for > pointing > this out. > > >> Or is this change universally good for the real uses of TSC? > > > > What I understood from the Intel SDM, and also from additional experiments > > which Jim kindly made despite me being annoying as usual, is that 'read > > memory barrier' AKA LFENCE there is used for its secondary implementation > > effects, not for load/load barrier as you might assume. > > > > According to SDM, LFENCE fully drains execution pipeline (but comparing > > with MFENCE, does not drain write buffers). The result is that RDTSC is > > not started before previous instructions are finished. > > Yes, I am fully aware of this. 
> > > For tsc test, this means that after the change RDTSC executions are not > > reordered on the single core among themself. As I understand, CPU has > > no dependency noted between two reads of tsc by RDTSC, which allows > > later read to give lower value of counter. This is fixed by Intel by > > introduction of RDTSCP instruction, which is defined to be serialization > > point, and use of which (instead of LFENCE; RDTSC sequence) also fixes > > test, as confirmed by Jim. > > Yes. I think that previously Intel recommended to precede rdtsc with cpuid > for > all the same reasons. Not sure if there is any difference performance-wise > comparing to lfence. > Unfortunately, rdtscp is not available on all CPUs, so using it would require > extra work. > > > In fact, I now think that we should also apply the following patch. > > Otherwise, consequtive calls to e.g. binuptime(9) could return decreased > > time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the > > tsc reads, which was done not for this reason, but apparently needed for > > the reason too. > > > > diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c > > index 085c339..229b351 100644 > > --- a/sys/x86/x86/tsc.c > > +++ b/sys/x86/x86/tsc.c > > @@ -594,6 +594,7 @@ static u_int > > tsc_get_timecount(struct timecounter *tc __unused) > > { > > > > + rmb(); > > return (rdtsc32()); > > } > > > > This makes sense to me. We probably want correctness over performance here. > [BTW, I originally thought that the change was here; brain malfunction] > > > @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc) > > { > > uint32_t rv; > > > > + rmb(); > > __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" > > - : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); > > + : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); > > return (rv); > > } > > > > It would correct here too, but not sure if it would make any difference given > that > some lower bits are discarded anyway. 
Probably depends on exact CPU. > > > And, oh hmm, I read AMD Software Optimization Guide for AMD Family 10h > Processors and they suggest using cpuid (with a note that it may be > intercepted in virtualized environments) or _mfence_ in the discussed > role (Appendix F of the document). Googling for 'rdtsc mfence lfence' > yields some interesting results. Yes, MFENCE for AMD. Since I was infected with these Google results anyway, I looked at the Linux code. Apparently, they use MFENCE on amd, and LFENCE on Intel. They also use LFENCE on VIA, it seems. Intel documentation claims that MFENCE does not serialize instruction execution, which is contrary to used LFENCE behaviour. So we definitely want to add some barrier right before rdtsc. And we do want LFENCE for Intels. Patch below ends with the following code: Dump of assembler code for function tsc_get_timecount_lfence: 0x805563a0 <+0>: push %rbp 0x805563a1 <+1>
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, 25 Jul 2012, Konstantin Belousov wrote:

> On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote:
>> on 25/07/2012 01:10 Jim Harris said the following:
>>> Author: jimharris
>>> Date: Tue Jul 24 22:10:11 2012
>>> New Revision: 238755
>>> URL: http://svn.freebsd.org/changeset/base/238755
>>>
>>> Log:
>>>   Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures.
>>>
>>>   Intel Architecture Manual specifies that rdtsc instruction is not serialized,
>>>   so without this change, TSC synchronization test would periodically fail,
>>>   resulting in use of HPET timecounter instead of TSC-low.  This caused
>>>   severe performance degradation (40-50%) when running high IO/s workloads due to
>>>   HPET MMIO reads and GEOM stat collection.
>>>
>>>   Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
>>>   fail approximately 20% of the time.
>>
>> Should rather the synchronization test be fixed if it's the culprit?
> Synchronization test for what?
>
>> Or is this change universally good for the real uses of TSC?

It's too slow for real uses.  But synchronization code, and some uses
that require serialization may need it for, er, synchronization and
serialization.

It's hard to think of many uses that need serialization.  I often use it
for timing instructions.  For timing a large number of instructions,
serialization doesn't matter since errors of a few tens in a few billion
don't matter.  For timing a small number of instructions, I don't want
serialization, since the serialization invalidates the timing.

Most uses in FreeBSD are for timecounters.  Timecounters deliver the
current time.  This is unrelated to whatever instructions haven't
completed when the TSC is read.  Except possibly when the time needs
to be synchronized across CPUs, and when the uncompleted instruction
is a TSC read.

> For tsc test, this means that after the change RDTSC executions are not
> reordered on the single core among themselves. As I understand, CPU has
> no dependency noted between two reads of tsc by RDTSC, which allows
> later read to give lower value of counter.

Gak.  Even when they are in the same instruction sequence?  Even though
the TSC read uses fixed registers and some other instructions in the
sequence between the TSC reads use these registers?  The CPU would have
to do significant register renaming to break this.

> This is fixed by Intel by
> introduction of RDTSCP instruction, which is defined to be serialization
> point, and use of which (instead of LFENCE; RDTSC sequence) also fixes
> test, as confirmed by Jim.

This is not a fix if it is full serialization.  It just gives slowness
using a single instruction instead of a couple.

> In fact, I now think that we should also apply the following patch.
> Otherwise, consecutive calls to e.g. binuptime(9) could return decreased
> time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the
> tsc reads, which was done not for this reason, but apparently needed for
> the reason too.
>
> diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
> index 085c339..229b351 100644
> --- a/sys/x86/x86/tsc.c
> +++ b/sys/x86/x86/tsc.c
> @@ -594,6 +594,7 @@ static u_int
>  tsc_get_timecount(struct timecounter *tc __unused)
>  {
>
> +	rmb();
>  	return (rdtsc32());
>  }

Please don't pessimize this further.  The time for rdtsc went from 6.5
cycles on AthlonXP to 65 cycles on core2 (mainly for P-state-invariance
hardware synchronization I think).  Pretty soon it will be as slow as an
HPET and heading towards an i8254.  Adding rmb() only makes it 12 cycles
slower on core2, but 16 cycles (almost 3 times) slower on AthlonXP.

> @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc)
>  {
>  	uint32_t rv;
>
> +	rmb();
>  	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
> -	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
> +	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
>  	return (rv);
>  }

The previous TSC-low/shrd pessimization adds only 2 cycles on AthlonXP
and core2.  I think it only "works" by backing off the TSC's resolution
so much that it usually can't see its own, or at least other TSC's, lack
of serialness.  The shift count is usually 7 or 8, so the resolution is
reduced from 1 cycle to 128 or 256.  Out of order times that fall in the
same block of 128 or 256 cycles would appear to be the same, but out of
order times like 129 and 127 would appear to be even more out of order
after a shift of 7 turns them into 128 and 0.

Bruce
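The 129/127 example above can be checked with a few lines of C. This is a sketch only: tsc_low() and tsc_low_to_cycles() are hypothetical helpers that emulate the effect of the shrd in tsc_get_timecount_low() on plain integers; they are not kernel functions.

```c
#include <stdint.h>

/* Hypothetical helper: emulate TSC-low by discarding the low `shift' bits. */
static inline uint32_t
tsc_low(uint64_t tsc, unsigned shift)
{
	return ((uint32_t)(tsc >> shift));
}

/* Scale a TSC-low value back to cycles for comparison with the raw count. */
static inline uint64_t
tsc_low_to_cycles(uint32_t low, unsigned shift)
{
	return ((uint64_t)low << shift);
}
```

With a shift of 7, raw readings of 129 and 127 (2 cycles apart) become 1 and 0, that is 128 and 0 cycles, while readings such as 200 and 255 land in the same block and compare equal.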
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, 25 Jul 2012, Andriy Gapon wrote:

> on 25/07/2012 13:21 Konstantin Belousov said the following:
>> ...
>> diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
>> index 085c339..229b351 100644
>> --- a/sys/x86/x86/tsc.c
>> +++ b/sys/x86/x86/tsc.c
>> @@ -594,6 +594,7 @@ static u_int
>>  tsc_get_timecount(struct timecounter *tc __unused)
>>  {
>>
>> +	rmb();
>>  	return (rdtsc32());
>>  }
>
> This makes sense to me.  We probably want correctness over performance here.
> [BTW, I originally thought that the change was here; brain malfunction]

And I liked the original change because it wasn't here :-).

>> @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc)
>>  {
>>  	uint32_t rv;
>>
>> +	rmb();
>>  	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
>> -	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
>> +	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
>>  	return (rv);
>>  }
>
> It would correct here too, but not sure if it would make any difference given
> that some lower bits are discarded anyway.  Probably depends on exact CPU.

It is needed to pessimize this too. :-)

As I have complained before, the loss of resolution from the shift is
easy to see by reading the time from userland, even with syscall overhead
taking 10-20 times longer than the read.  On core2 with TSC-low, a
clock-checking utility gives:

% min 481, max 12031, mean 530.589452, std 51.633626
% 1th: 550 (1296487 observations)
% 2th: 481 (448425 observations)
% 3th: 482 (142650 observations)
% 4th: 549 (61945 observations)
% 5th: 551 (47619 observations)

The numbers are differences in nanoseconds measured by clock_gettime().
The jump from 481 to 549 is 68.  From this I can tell that the clock
frequency is 1.86 GHz and the shift is 128, or the clock frequency is
3.72 GHz and the shift is 256.

On AthlonXP with TSC:

% min 273, max 29075, mean 274.412811, std 80.425963
% 1th: 273 (853962 observations)
% 2th: 274 (745606 observations)
% 3th: 275 (400212 observations)
% 4th: 276 (20 observations)
% 5th: 280 (10 observations)

Now the numbers cluster about the mean.  Although syscalls take much
longer than the loss of resolution with TSC-low, and even the core2 TSC
takes almost as long to read as the loss, it is still possible to see
things happening at the limits of the resolution (~0.5 nsec).

> And, oh hmm, I read AMD Software Optimization Guide for AMD Family 10h
> Processors and they suggest using cpuid (with a note that it may be
> intercepted in virtualized environments) or _mfence_ in the discussed
> role (Appendix F of the document).
> Googling for 'rdtsc mfence lfence' yields some interesting results.

The second hit was for the shrd pessimization/loss of resolution and
a memory access hack in lkml in 2011.  I now seem to remember jkim
mentioning the memory access hack.  rmb() on i386 has a related memory
access hack, but now with a lock prefix that defeats the point of the
2011 hack (it wanted to save 5 nsec by removing fences).  rmb() on
amd64 uses lfence.

Some of the other hits are a bit old.  The 8th one was by me in the
thread about kib@ implementing gettimeofday() in userland.

Bruce
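The inference from the 68 ns jump to "1.86 GHz with a divisor of 128, or 3.72 GHz with a divisor of 256" is plain arithmetic: one TSC-low step lasts 2^shift cycles, so its duration is 2^shift / f. A small sketch, with the hypothetical helper name tsc_low_step_ns() and the frequencies taken from the message above:

```c
/*
 * Hypothetical helper: duration in nanoseconds of one TSC-low step, given
 * the TSC frequency in Hz and the shift count (7 divides by 128, 8 by 256).
 */
static double
tsc_low_step_ns(double tsc_hz, unsigned shift)
{
	return ((double)(1ULL << shift) * 1e9 / tsc_hz);
}
```

Both tsc_low_step_ns(1.86e9, 7) and tsc_low_step_ns(3.72e9, 8) come out at roughly 68.8 ns, which matches the observed jump from 481 to 549; the measurement alone cannot tell the two combinations apart.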
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 6:37 AM, Konstantin Belousov wrote: > On Wed, Jul 25, 2012 at 03:29:34PM +0300, Andriy Gapon wrote: >> on 25/07/2012 13:21 Konstantin Belousov said the following: >> > On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote: >> >> on 25/07/2012 01:10 Jim Harris said the following: >> >>> Author: jimharris >> >>> Date: Tue Jul 24 22:10:11 2012 >> >>> New Revision: 238755 >> >>> URL: http://svn.freebsd.org/changeset/base/238755 >> >>> >> >>> Log: >> >>> Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures. >> >>> >> >>> Intel Architecture Manual specifies that rdtsc instruction is not >> >>> serialized, >> >>> so without this change, TSC synchronization test would periodically >> >>> fail, >> >>> resulting in use of HPET timecounter instead of TSC-low. This caused >> >>> severe performance degradation (40-50%) when running high IO/s >> >>> workloads due to >> >>> HPET MMIO reads and GEOM stat collection. >> >>> >> >>> Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC >> >>> synchronization >> >>> fail approximately 20% of the time. >> >> >> >> Should rather the synchronization test be fixed if it's the culprit? >> > Synchronization test for what ? >> >> The synchronization test mentioned above. >> So, oops, very sorry - I missed the fact that the change was precisely in the >> test. I confused it for another place where tsc is used. Thank you for >> pointing >> this out. >> >> >> Or is this change universally good for the real uses of TSC? >> > >> > What I understood from the Intel SDM, and also from additional experiments >> > which Jim kindly made despite me being annoying as usual, is that 'read >> > memory barrier' AKA LFENCE there is used for its secondary implementation >> > effects, not for load/load barrier as you might assume. >> > >> > According to SDM, LFENCE fully drains execution pipeline (but comparing >> > with MFENCE, does not drain write buffers). 
The result is that RDTSC is >> > not started before previous instructions are finished. >> >> Yes, I am fully aware of this. >> >> > For tsc test, this means that after the change RDTSC executions are not >> > reordered on the single core among themself. As I understand, CPU has >> > no dependency noted between two reads of tsc by RDTSC, which allows >> > later read to give lower value of counter. This is fixed by Intel by >> > introduction of RDTSCP instruction, which is defined to be serialization >> > point, and use of which (instead of LFENCE; RDTSC sequence) also fixes >> > test, as confirmed by Jim. >> >> Yes. I think that previously Intel recommended to precede rdtsc with cpuid >> for >> all the same reasons. Not sure if there is any difference performance-wise >> comparing to lfence. >> Unfortunately, rdtscp is not available on all CPUs, so using it would require >> extra work. >> >> > In fact, I now think that we should also apply the following patch. >> > Otherwise, consequtive calls to e.g. binuptime(9) could return decreased >> > time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the >> > tsc reads, which was done not for this reason, but apparently needed for >> > the reason too. >> > >> > diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c >> > index 085c339..229b351 100644 >> > --- a/sys/x86/x86/tsc.c >> > +++ b/sys/x86/x86/tsc.c >> > @@ -594,6 +594,7 @@ static u_int >> > tsc_get_timecount(struct timecounter *tc __unused) >> > { >> > >> > + rmb(); >> > return (rdtsc32()); >> > } >> > >> >> This makes sense to me. We probably want correctness over performance here. 
>> [BTW, I originally thought that the change was here; brain malfunction] >> >> > @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc) >> > { >> > uint32_t rv; >> > >> > + rmb(); >> > __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" >> > - : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); >> > + : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); >> > return (rv); >> > } >> > >> >> It would correct here too, but not sure if it would make any difference >> given that >> some lower bits are discarded anyway. Probably depends on exact CPU. >> >> >> And, oh hmm, I read AMD Software Optimization Guide for AMD Family 10h >> Processors and they suggest using cpuid (with a note that it may be >> intercepted in virtualized environments) or _mfence_ in the discussed >> role (Appendix F of the document). Googling for 'rdtsc mfence lfence' >> yields some interesting results. > Yes, MFENCE for AMD. > > Since I was infected with these Google results anyway, I looked at the > Linux code. Apparently, they use MFENCE on amd, and LFENCE on Intel. > They also use LFENCE on VIA, it seems. Intel documentation claims that > MFENCE does not serialize instruction execution, which is contrary to > used LFENCE behaviour. > > So we definitely want to add some barrier right before rdtsc. And we do > want LFENCE for Intels. Patch below ends
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 08:29:57AM -0700, Jim Harris wrote: > On Wed, Jul 25, 2012 at 6:37 AM, Konstantin Belousov > wrote: > > -/* rmb is required here because rdtsc is not a serializing instruction. */ > > +/* > > + * RDTSC is not a serializing instruction, so we need to drain > > + * instruction stream before executing it. It could be fixed by use of > > + * RDTSCP, except the instruction is not available everywhere. > > + * > > + * Insert both MFENCE for AMD CPUs, and LFENCE for others (Intel and > > + * VIA), and assume that SMP test is only performed on CPUs that have > > + * SSE2 anyway. > > + */ > > #define TSC_READ(x) \ > > static void \ > > tsc_read_##x(void *arg) \ > > @@ -337,6 +361,7 @@ tsc_read_##x(void *arg) \ > > u_int cpu = PCPU_GET(cpuid);\ > > \ > > rmb(); \ > > + mb(); \ > > tsc[cpu * 3 + x] = rdtsc32(); \ > > I've seen bde@'s comments, so perhaps this patch will not move > forward, but I'm wondering if it would make sense here to just call > the new tsc_get_timecount_mfence() function rather than explicitly > call mb() and then rdtsc32(). I think that this should in fact call cpuid() instead of rmb()/mb(). Genuine Pentium, Pentium Pro and Pentium II/III CPUs can be used in SMP configurations but definitely lack LFENCE. Regarding the patch, either it or some close relative of it should be implemented, since otherwise we are simply incorrect, as you demonstrated.
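The choice of barrier discussed in this message can be sketched as a small selection function. This is only an illustration of the reasoning in the thread (MFENCE on AMD, LFENCE on Intel and VIA, CPUID where SSE2 is missing), not code from tsc.c; the enum, the function name pick_tsc_barrier() and the idea of passing the CPUID vendor string are all assumptions made for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names; nothing here exists in sys/x86/x86/tsc.c. */
enum tsc_barrier {
	TSC_BARRIER_CPUID,	/* pre-SSE2 CPUs: no LFENCE/MFENCE available */
	TSC_BARRIER_LFENCE,	/* Intel and VIA, per the thread */
	TSC_BARRIER_MFENCE	/* AMD, per the thread */
};

/*
 * Pick the instruction to issue right before RDTSC.
 * "vendor" is the 12-character CPUID vendor string.
 */
static enum tsc_barrier
pick_tsc_barrier(const char *vendor, bool has_sse2)
{

	if (!has_sse2)
		return (TSC_BARRIER_CPUID);
	if (strcmp(vendor, "AuthenticAMD") == 0)
		return (TSC_BARRIER_MFENCE);
	return (TSC_BARRIER_LFENCE);
}
```

With this shape, a Pentium III in an SMP box would fall back to CPUID, which is the case raised in the reply above.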
svn commit: r238774 - head/share/man/man4
Author: gavin Date: Wed Jul 25 17:25:44 2012 New Revision: 238774 URL: http://svn.freebsd.org/changeset/base/238774 Log: Update supported hardware list after r238766. MFC after: 1 week Modified: head/share/man/man4/uplcom.4 Modified: head/share/man/man4/uplcom.4 == --- head/share/man/man4/uplcom.4 Wed Jul 25 17:15:52 2012 (r238773) +++ head/share/man/man4/uplcom.4 Wed Jul 25 17:25:44 2012 (r238774) @@ -29,7 +29,7 @@ .\" .\" $FreeBSD$ .\" -.Dd November 20, 2011 +.Dd July 25, 2012 .Dt UPLCOM 4 .Os .Sh NAME @@ -118,6 +118,8 @@ Microsoft Palm 700WX .It Mobile Action MA-620 Infrared Adapter .It +Motorola Cables +.It Nokia CA-42 Cable .It OTI DKU-5 cable
Re: svn commit: r238755 - head/sys/x86/x86
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 2012-07-25 10:44:04 -0400, Bruce Evans wrote: > On Wed, 25 Jul 2012, Andriy Gapon wrote: > >> on 25/07/2012 13:21 Konstantin Belousov said the following: >>> ... diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c index >>> 085c339..229b351 100644 --- a/sys/x86/x86/tsc.c +++ >>> b/sys/x86/x86/tsc.c @@ -594,6 +594,7 @@ static u_int >>> tsc_get_timecount(struct timecounter *tc __unused) { >>> >>> +rmb(); return (rdtsc32()); } >> >> This makes sense to me. We probably want correctness over >> performance here. [BTW, I originally thought that the change was >> here; brain malfunction] > > And I liked the original change because it wasn't here :-). > >>> @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter >>> *tc) { uint32_t rv; >>> >>> +rmb(); __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" - >>> : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); + >>> : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); return >>> (rv); } >>> >> >> It would correct here too, but not sure if it would make any >> difference given that some lower bits are discarded anyway. >> Probably depends on exact CPU. > > It is needed to pessimize this too. :-) > > As I have complained before, the loss of resolution from the shift > is easy to see by reading the time from userland, even with syscall > overhead taking 10-20 times longer than the read. On core2 with > TSC-low, a clock- checking utility gives: > > % min 481, max 12031, mean 530.589452, std 51.633626 % 1th: 550 > (1296487 observations) % 2th: 481 (448425 observations) % 3th: 482 > (142650 observations) % 4th: 549 (61945 observations) % 5th: 551 > (47619 observations) > > The numbers are diffences in nanoseconds measured by > clock_gettime(). The jump from 481 to 549 is 68. From this I can > tell that the clock frequency is 1.86 Ghz and the shift is 128, or > the clock frequency is 3.72 Ghz and the shift is 256. 
> > On AthlonXP with TSC: > > % min 273, max 29075, mean 274.412811, std 80.425963 % 1th: 273 > (853962 observations) % 2th: 274 (745606 observations) % 3th: 275 > (400212 observations) % 4th: 276 (20 observations) % 5th: 280 (10 > observations) > > Now the numbers cluster about the mean. Although syscalls take > much longer than the loss of resolution with TSC-low, and even the > core2 TSC takes almost as long to read as the loss, it is still > possible to see things happening at the limits of the resolution > (~0.5 nsec). > >> And, oh hmm, I read AMD Software Optimization Guide for AMD >> Family 10h Processors and they suggest using cpuid (with a note >> that it may be intercepted in virtualized environments) or >> _mfence_ in the discussed role (Appendix F of the document). >> Googling for 'rdtsc mfence lfence' yields some interesting >> results. > > The second hit was for the shrd pessimization/loss of resolution > and a memory access hack in lkml in 2011. I now seem to remember > jkim mentioning the memory access hack. rmb() on i386 has a > related memory access hack, but now with a lock prefix that defeats > the point of the 2011 hack (it wanted to save 5 nsec by removing > fences). rmb() on amd64 uses lfence. I believe I mentioned this thread at the time: https://patchwork.kernel.org/patch/691712/ FYI, r238755 is essentially this commit for Linux: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=93ce99e849433ede4ce8b410b749dc0cad1100b2 > Some of the other hits are a bit old. The 8th one was by me in > the thread about kib@ implementing gettimeofday() in userland. Since we have gettimeofday() in userland, the above Linux thread is more relevant now, I guess. 
Jung-uk Kim -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlAQLJQACgkQmlay1b9qnVMR8ACglzKrNWGeYJeqRhHQmna5stQQ qM4AoKn4xdey8nglvdVm7UiQ1NZRr81E =15v+ -END PGP SIGNATURE-
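The resolution argument quoted in this message (a 68 nsec jump implying 1.86 GHz with a divide by 128, or 3.72 GHz with a divide by 256) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are made up, "shift" here is the bit count (7 for a divide by 128), and tsc_low() merely models what the "rdtsc; shrd %cl, %edx, %eax" sequence leaves in %eax for shifts below 32.

```c
#include <stdint.h>

/*
 * Model of the TSC-low read: the low 32 bits of the 64-bit TSC
 * after shifting right by "shift" bits (hypothetical helper).
 */
static uint32_t
tsc_low(uint64_t tsc, unsigned shift)
{

	return ((uint32_t)(tsc >> shift));
}

/* Nanoseconds per tick of the shifted counter at a given TSC frequency. */
static double
tsc_low_step_ns(double tsc_hz, unsigned shift)
{

	return ((double)(1ULL << shift) / tsc_hz * 1e9);
}
```

Both candidate configurations give a step of about 68.8 nsec, which is why the measurement alone cannot distinguish them.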
Re: svn commit: r238755 - head/sys/x86/x86
On Thu, Jul 26, 2012 at 12:15:54AM +1000, Bruce Evans wrote: > On Wed, 25 Jul 2012, Konstantin Belousov wrote: > > >On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote: > >>on 25/07/2012 01:10 Jim Harris said the following: > >>>Author: jimharris > >>>Date: Tue Jul 24 22:10:11 2012 > >>>New Revision: 238755 > >>>URL: http://svn.freebsd.org/changeset/base/238755 > >>> > >>>Log: > >>> Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures. > >>> > >>> Intel Architecture Manual specifies that rdtsc instruction is not > >>> serialized, > >>> so without this change, TSC synchronization test would periodically > >>> fail, > >>> resulting in use of HPET timecounter instead of TSC-low. This caused > >>> severe performance degradation (40-50%) when running high IO/s > >>> workloads due to > >>> HPET MMIO reads and GEOM stat collection. > >>> > >>> Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC > >>> synchronization > >>> fail approximately 20% of the time. > >> > >>Should rather the synchronization test be fixed if it's the culprit? > >Synchronization test for what ? > > > >>Or is this change universally good for the real uses of TSC? > > It's too slow for real uses. But synchronization code, and some uses > that requires serialization may need it for, er, synchronization and > serialization. > > It's hard to think of many uses that need serialization. I often use > it for timing instructions. For timng a large number of instructions, > serialization doesn't matter since errors of a few tens in a few billion > done matter. For timing a small number of instructions, I don't want > serialization, since the serialization invalidates the timing. > > Most uses in FreeBSD are for timecounters. Timecounters deliver the > current time. This is unrelated to whatever instructions haven't > completed when the TSC is read. Except possibly when the time needs > to be synchronized across CPUs, and when the uncompleted instruction > is a TSC read. 
> > >For tsc test, this means that after the change RDTSC executions are not > >reordered on the single core among themself. As I understand, CPU has > >no dependency noted between two reads of tsc by RDTSC, which allows > >later read to give lower value of counter. > > Gak. Even when they are in the same instruction sequence? Even though > the TSC reads fixed registers and some other instructions in the sequence > between the TSC use these registers? The CPU would have to do significant > register renaming to break this. I can only speculate, but I believe that any modern CPU executes RDTSC as at least two separate steps: one reads the internal counter, and the second updates the registers. It seems that the first kind of action is not serialized. I have no other explanation for Jim's findings. I also asked Jim to test whether the cause of the TSC sync test failure is a lack of synchronization between gathering the data and testing it, but it appeared that the reason is a genuine timecounter value going backward. So the bug seems real, and I cannot imagine we will live with a known defect in timecounters that can step back. > > >This is fixed by Intel by > >introduction of RDTSCP instruction, which is defined to be serialization > >point, and use of which (instead of LFENCE; RDTSC sequence) also fixes > >test, as confirmed by Jim. > > This is not a fix if it is full serialization. It just gives slowness > using a single instruction instead of a couple. > > >In fact, I now think that we should also apply the following patch. > >Otherwise, consequtive calls to e.g. binuptime(9) could return decreased > >time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the > >tsc reads, which was done not for this reason, but apparently needed for > >the reason too. 
> > > >diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c > >index 085c339..229b351 100644 > >--- a/sys/x86/x86/tsc.c > >+++ b/sys/x86/x86/tsc.c > >@@ -594,6 +594,7 @@ static u_int > >tsc_get_timecount(struct timecounter *tc __unused) > >{ > > > >+rmb(); > > return (rdtsc32()); > >} > > Please don't pessimize this further. The time for rdtsc went from 6.5 > cycles on AthlonXP to 65 cycles on core2 (mainly for for > P-state-invariance hardware synchronization I think). Pretty soon it > will be as slow as an HPET and heading towards an i8254. Adding rmb() > only makes it 12 cycles slower on core2, but 16 cycles (almost 3 times) > slower on AthlonXP. AthlonXP does not look like an interesting target for optimizations. From what I can find, this is a PIII-era CPU. > > >@@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc) > >{ > > uint32_t rv; > > > >+rmb(); > > __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" > >-: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); > >+: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); > > return (rv); > >} > > The previous TSC-low/shrd pessimization adds only 2
svn commit: r238776 - head/cddl/contrib/opensolaris/cmd/dtrace
Author: gnn Date: Wed Jul 25 17:49:01 2012 New Revision: 238776 URL: http://svn.freebsd.org/changeset/base/238776 Log: Revert previous commit. The bug was actually caused by an issue in pre 1.8.5 versions of sudo which were sending too many SIGINTs to processes when the user hit Ctrl-C. Pointed out by: avg@, rpaulo@, sbruno@ Modified: head/cddl/contrib/opensolaris/cmd/dtrace/dtrace.c Modified: head/cddl/contrib/opensolaris/cmd/dtrace/dtrace.c == --- head/cddl/contrib/opensolaris/cmd/dtrace/dtrace.c Wed Jul 25 17:42:57 2012 (r238775) +++ head/cddl/contrib/opensolaris/cmd/dtrace/dtrace.c Wed Jul 25 17:49:01 2012 (r238776) @@ -70,8 +70,6 @@ typedef struct dtrace_cmd { #define E_ERROR 1 #define E_USAGE 2 -#define IMPATIENT_LIMIT 2 - static const char DTRACE_OPTSTR[] = "3:6:aAb:Bc:CD:ef:FGhHi:I:lL:m:n:o:p:P:qs:SU:vVwx:X:Z"; @@ -1204,7 +1202,7 @@ intr(int signo) if (!g_intr) g_newline = 1; - if (g_intr++ > IMPATIENT_LIMIT) + if (g_intr++) g_impatient = 1; }
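The behaviour restored by this revert can be modelled in isolation. The sketch below copies only the two statements visible in the diff; the variable types and the standalone layout are assumptions, since the rest of dtrace.c is not shown here.

```c
#include <signal.h>

/* Assumed declarations; the diff shows only the body of intr(). */
static volatile sig_atomic_t g_intr, g_newline, g_impatient;

/* SIGINT handler logic as it reads after the revert. */
static void
intr(int signo)
{

	(void)signo;
	if (!g_intr)
		g_newline = 1;	/* first interrupt: request a clean newline */
	if (g_intr++)
		g_impatient = 1;	/* any later interrupt: user is impatient */
}
```

With this logic the second SIGINT marks the user as impatient. Under the reverted "g_intr++ > IMPATIENT_LIMIT" check with a limit of 2, that would instead have happened on the fourth signal.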
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 10:32 AM, Konstantin Belousov wrote: > On Thu, Jul 26, 2012 at 12:15:54AM +1000, Bruce Evans wrote: >> On Wed, 25 Jul 2012, Konstantin Belousov wrote: >> >> >On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote: >> >>on 25/07/2012 01:10 Jim Harris said the following: >> >>>Author: jimharris >> >>>Date: Tue Jul 24 22:10:11 2012 >> >>>New Revision: 238755 >> >>>URL: http://svn.freebsd.org/changeset/base/238755 >> >>> >> >>>Log: >> >>> Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures. >> >>> >> >>> Intel Architecture Manual specifies that rdtsc instruction is not >> >>> serialized, >> >>> so without this change, TSC synchronization test would periodically >> >>> fail, >> >>> resulting in use of HPET timecounter instead of TSC-low. This caused >> >>> severe performance degradation (40-50%) when running high IO/s >> >>> workloads due to >> >>> HPET MMIO reads and GEOM stat collection. >> >>> >> >>> Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC >> >>> synchronization >> >>> fail approximately 20% of the time. >> >> >> >>Should rather the synchronization test be fixed if it's the culprit? >> >Synchronization test for what ? >> > >> >>Or is this change universally good for the real uses of TSC? >> >> It's too slow for real uses. But synchronization code, and some uses >> that requires serialization may need it for, er, synchronization and >> serialization. >> >> It's hard to think of many uses that need serialization. I often use >> it for timing instructions. For timng a large number of instructions, >> serialization doesn't matter since errors of a few tens in a few billion >> done matter. For timing a small number of instructions, I don't want >> serialization, since the serialization invalidates the timing. >> >> Most uses in FreeBSD are for timecounters. Timecounters deliver the >> current time. This is unrelated to whatever instructions haven't >> completed when the TSC is read. 
Except possibly when the time needs >> to be synchronized across CPUs, and when the uncompleted instruction >> is a TSC read. >> >> >For tsc test, this means that after the change RDTSC executions are not >> >reordered on the single core among themself. As I understand, CPU has >> >no dependency noted between two reads of tsc by RDTSC, which allows >> >later read to give lower value of counter. >> >> Gak. Even when they are in the same instruction sequence? Even though >> the TSC reads fixed registers and some other instructions in the sequence >> between the TSC use these registers? The CPU would have to do significant >> register renaming to break this. > As I could only speculate, I believe that any modern CPU executes RDTSC > as at least two separate steps, one is read from internal counter, and > second is the registers update. It seems that the first kind of action > is not serialized. I have no other explanation for the Jim findings. > > I also asked Jim to test whether the cause the TSC sync test failure > is the lack of synchronization between gathering data and tasting it, > but ut appeared that the reason is genuine timecounter value going > backward. I wonder if instead of timecounter going backward, that TSC test fails because CPU speculatively performs rdtsc instruction in relation to waiter checks in smp_rendezvous_action. Or maybe we are saying the same thing. > > Sp the bug seems real, and I cannot imagine we will live with the known > defect in timecounters which can step back. >> >> >This is fixed by Intel by >> >introduction of RDTSCP instruction, which is defined to be serialization >> >point, and use of which (instead of LFENCE; RDTSC sequence) also fixes >> >test, as confirmed by Jim. >> >> This is not a fix if it is full serialization. It just gives slowness >> using a single instruction instead of a couple. >> >> >In fact, I now think that we should also apply the following patch. >> >Otherwise, consequtive calls to e.g. 
binuptime(9) could return decreased >> >time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the >> >tsc reads, which was done not for this reason, but apparently needed for >> >the reason too. >> > >> >diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c >> >index 085c339..229b351 100644 >> >--- a/sys/x86/x86/tsc.c >> >+++ b/sys/x86/x86/tsc.c >> >@@ -594,6 +594,7 @@ static u_int >> >tsc_get_timecount(struct timecounter *tc __unused) >> >{ >> > >> >+rmb(); >> > return (rdtsc32()); >> >} >> >> Please don't pessimize this further. The time for rdtsc went from 6.5 >> cycles on AthlonXP to 65 cycles on core2 (mainly for for >> P-state-invariance hardware synchronization I think). Pretty soon it >> will be as slow as an HPET and heading towards an i8254. Adding rmb() >> only makes it 12 cycles slower on core2, but 16 cycles (almost 3 times) >> slower on AthlonXP. > AthlonXP does not look as interesting target for optimizations. Fom what I > can find this is PIII-era CPU. > >>
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 01:27:48PM -0400, Jung-uk Kim wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 2012-07-25 10:44:04 -0400, Bruce Evans wrote: > > On Wed, 25 Jul 2012, Andriy Gapon wrote: > > > >> on 25/07/2012 13:21 Konstantin Belousov said the following: > >>> ... diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c index > >>> 085c339..229b351 100644 --- a/sys/x86/x86/tsc.c +++ > >>> b/sys/x86/x86/tsc.c @@ -594,6 +594,7 @@ static u_int > >>> tsc_get_timecount(struct timecounter *tc __unused) { > >>> > >>> +rmb(); return (rdtsc32()); } > >> > >> This makes sense to me. We probably want correctness over > >> performance here. [BTW, I originally thought that the change was > >> here; brain malfunction] > > > > And I liked the original change because it wasn't here :-). > > > >>> @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter > >>> *tc) { uint32_t rv; > >>> > >>> +rmb(); __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" - > >>> : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); + > >>> : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); return > >>> (rv); } > >>> > >> > >> It would correct here too, but not sure if it would make any > >> difference given that some lower bits are discarded anyway. > >> Probably depends on exact CPU. > > > > It is needed to pessimize this too. :-) > > > > As I have complained before, the loss of resolution from the shift > > is easy to see by reading the time from userland, even with syscall > > overhead taking 10-20 times longer than the read. On core2 with > > TSC-low, a clock- checking utility gives: > > > > % min 481, max 12031, mean 530.589452, std 51.633626 % 1th: 550 > > (1296487 observations) % 2th: 481 (448425 observations) % 3th: 482 > > (142650 observations) % 4th: 549 (61945 observations) % 5th: 551 > > (47619 observations) > > > > The numbers are diffences in nanoseconds measured by > > clock_gettime(). The jump from 481 to 549 is 68. 
From this I can > > tell that the clock frequency is 1.86 Ghz and the shift is 128, or > > the clock frequency is 3.72 Ghz and the shift is 256. > > > > On AthlonXP with TSC: > > > > % min 273, max 29075, mean 274.412811, std 80.425963 % 1th: 273 > > (853962 observations) % 2th: 274 (745606 observations) % 3th: 275 > > (400212 observations) % 4th: 276 (20 observations) % 5th: 280 (10 > > observations) > > > > Now the numbers cluster about the mean. Although syscalls take > > much longer than the loss of resolution with TSC-low, and even the > > core2 TSC takes almost as long to read as the loss, it is still > > possible to see things happening at the limits of the resolution > > (~0.5 nsec). > > > >> And, oh hmm, I read AMD Software Optimization Guide for AMD > >> Family 10h Processors and they suggest using cpuid (with a note > >> that it may be intercepted in virtualized environments) or > >> _mfence_ in the discussed role (Appendix F of the document). > >> Googling for 'rdtsc mfence lfence' yields some interesting > >> results. > > > > The second hit was for the shrd pessimization/loss of resolution > > and a memory access hack in lkml in 2011. I now seem to remember > > jkim mentioning the memory access hack. rmb() on i386 has a > > related memory access hack, but now with a lock prefix that defeats > > the point of the 2011 hack (it wanted to save 5 nsec by removing > > fences). rmb() on amd64 uses lfence. > > I believe I mentioned this thread at the time: > > https://patchwork.kernel.org/patch/691712/ > > FYI, r238755 is essentially this commit for Linux: > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=93ce99e849433ede4ce8b410b749dc0cad1100b2 > > > Some of the other hits are a bit old. The 8th one was by me in > > the thread about kib@ implementing gettimeofday() in userland. > > Since we have gettimeofday() in userland, the above Linux thread is > more relevant now, I guess. 
For some unrelated reasons, we already have an lfence;rdtsc sequence in userland. Well, it is not exactly that sequence, since there are some instructions in between, but the main fact is that two consecutive invocations of gettimeofday(2) (*) or clock_gettime(2) are interleaved with lfence on Intel CPUs, guaranteeing that a backstep of the counter is impossible. * - it is not a syscall anymore. As I said, using the recommended mfence;rdtsc sequence for AMD CPUs would require some work, but let's handle the kernel and userspace issues separately. And I really failed to find what the patch from the thread you referenced tried to fix. Was it really committed into Linux? I see an actual problem of us allowing timecounters to go backward, and a solution that exactly follows the words of both the Intel and AMD documentation. This is a good step forward IMHO.
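For readers who want to see the lfence;rdtsc sequence mentioned here, a minimal userland sketch follows. It is not the code from libc's __vdso_gettc.c; the function name rdtsc_lfence() is made up, and the sketch assumes an x86 CPU with SSE2 and a compiler that accepts GCC-style inline assembly.

```c
#include <stdint.h>

/*
 * Read the TSC after an LFENCE, so that the read is not started
 * before earlier instructions have finished (hypothetical helper,
 * x86 with SSE2 only).
 */
static inline uint64_t
rdtsc_lfence(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("lfence; rdtsc"
	    : "=a" (lo), "=d" (hi)
	    :
	    : "memory");
	return (((uint64_t)hi << 32) | lo);
}
```

Two back-to-back calls on the same CPU are then expected to return non-decreasing values, which is the property the thread is after.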
Re: svn commit: r238755 - head/sys/x86/x86
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 2012-07-25 14:05:37 -0400, Konstantin Belousov wrote: > On Wed, Jul 25, 2012 at 01:27:48PM -0400, Jung-uk Kim wrote: >> -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 >> >> On 2012-07-25 10:44:04 -0400, Bruce Evans wrote: >>> On Wed, 25 Jul 2012, Andriy Gapon wrote: >>> on 25/07/2012 13:21 Konstantin Belousov said the following: > ... diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c > index 085c339..229b351 100644 --- a/sys/x86/x86/tsc.c +++ > b/sys/x86/x86/tsc.c @@ -594,6 +594,7 @@ static u_int > tsc_get_timecount(struct timecounter *tc __unused) { > > +rmb(); return (rdtsc32()); } This makes sense to me. We probably want correctness over performance here. [BTW, I originally thought that the change was here; brain malfunction] >>> >>> And I liked the original change because it wasn't here :-). >>> > @@ -602,8 +603,9 @@ tsc_get_timecount_low(struct > timecounter *tc) { uint32_t rv; > > +rmb(); __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" > - : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); > + : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); > return (rv); } > It would correct here too, but not sure if it would make any difference given that some lower bits are discarded anyway. Probably depends on exact CPU. >>> >>> It is needed to pessimize this too. :-) >>> >>> As I have complained before, the loss of resolution from the >>> shift is easy to see by reading the time from userland, even >>> with syscall overhead taking 10-20 times longer than the read. >>> On core2 with TSC-low, a clock- checking utility gives: >>> >>> % min 481, max 12031, mean 530.589452, std 51.633626 % 1th: >>> 550 (1296487 observations) % 2th: 481 (448425 observations) % >>> 3th: 482 (142650 observations) % 4th: 549 (61945 observations) >>> % 5th: 551 (47619 observations) >>> >>> The numbers are diffences in nanoseconds measured by >>> clock_gettime(). The jump from 481 to 549 is 68. 
From this I >>> can tell that the clock frequency is 1.86 Ghz and the shift is >>> 128, or the clock frequency is 3.72 Ghz and the shift is 256. >>> >>> On AthlonXP with TSC: >>> >>> % min 273, max 29075, mean 274.412811, std 80.425963 % 1th: >>> 273 (853962 observations) % 2th: 274 (745606 observations) % >>> 3th: 275 (400212 observations) % 4th: 276 (20 observations) % >>> 5th: 280 (10 observations) >>> >>> Now the numbers cluster about the mean. Although syscalls >>> take much longer than the loss of resolution with TSC-low, and >>> even the core2 TSC takes almost as long to read as the loss, it >>> is still possible to see things happening at the limits of the >>> resolution (~0.5 nsec). >>> And, oh hmm, I read AMD Software Optimization Guide for AMD Family 10h Processors and they suggest using cpuid (with a note that it may be intercepted in virtualized environments) or _mfence_ in the discussed role (Appendix F of the document). Googling for 'rdtsc mfence lfence' yields some interesting results. >>> >>> The second hit was for the shrd pessimization/loss of >>> resolution and a memory access hack in lkml in 2011. I now >>> seem to remember jkim mentioning the memory access hack. rmb() >>> on i386 has a related memory access hack, but now with a lock >>> prefix that defeats the point of the 2011 hack (it wanted to >>> save 5 nsec by removing fences). rmb() on amd64 uses lfence. >> >> I believe I mentioned this thread at the time: >> >> https://patchwork.kernel.org/patch/691712/ >> >> FYI, r238755 is essentially this commit for Linux: >> >> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=93ce99e849433ede4ce8b410b749dc0cad1100b2 >> >>> >> Some of the other hits are a bit old. The 8th one was by me in >>> the thread about kib@ implementing gettimeofday() in userland. >> >> Since we have gettimeofday() in userland, the above Linux thread >> is more relevant now, I guess. 
> > For some unrelated reasons, we do have lfence;rdtsc sequence in > the userland already. Well, it is not exactly such sequence, there > are some instructions between, but the main fact is that two > consequtive invocations of gettimeofday(2) (*) or clock_gettime(2) > are interleaved with lfence on Intels, guaranteeing that backstep > of the counter is impossible. > > * - it is not a syscall anymore. > > As I said, using recommended mfence;rdtsc sequence for AMDs would > require some work, but lets handle the kernel and userspace issues > separately. Agreed. > And, I really failed to find what the patch from the thread you > referenced tried to fix. The patch was supposed to reduce a barrier, i.e., vsyscall optimization. Please note I brought it up at the time, not because it fixed any problem but because we completely lack necessary serialization. > Was it really committed into Linux ? Yes, it was committed in
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, Jul 25, 2012 at 11:00:41AM -0700, Jim Harris wrote: > On Wed, Jul 25, 2012 at 10:32 AM, Konstantin Belousov > wrote: > > I also asked Jim to test whether the cause the TSC sync test failure > > is the lack of synchronization between gathering data and tasting it, > > but ut appeared that the reason is genuine timecounter value going > > backward. > > I wonder if instead of timecounter going backward, that TSC test > fails because CPU speculatively performs rdtsc instruction in relation > to waiter checks in smp_rendezvous_action. Or maybe we are saying > the same thing. Ok, the definition of 'timecounter goes back', as I understand it: you have two events A and B in two threads that are provably ordered; say, A is a lock release and B is the subsequent acquisition of the same lock. Assume that you take rdtsc values tA and tB under the scope of the lock, right before A and right after B. Then it should be impossible to have tA > tB. I do not think that we can ever observe tA > tB if both threads are executing on the same CPU.
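The definition above boils down to a simple check over timestamps recorded in a provably ordered sequence: no sample may be smaller than the one taken before it. A minimal sketch of such a check follows; the function name first_backstep() is hypothetical, and this is not the comparison the kernel's TSC sync test actually performs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Given timestamps recorded in a provably ordered sequence
 * (tA before tB before ...), return the index of the first sample
 * that is smaller than its predecessor, or -1 if none steps back.
 */
static ptrdiff_t
first_backstep(const uint64_t *ts, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++) {
		if (ts[i] < ts[i - 1])
			return ((ptrdiff_t)i);
	}
	return (-1);
}
```

Equal neighbouring samples are accepted here, since the definition only forbids tA > tB.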
svn commit: r238778 - head/sys/dev/usb/serial
Author: gavin
Date: Wed Jul 25 20:46:22 2012
New Revision: 238778
URL: http://svn.freebsd.org/changeset/base/238778

Log:
  The baud rate on CP1201/2/3 devices can be set in one of two ways:
  - The USLCOM_SET_BAUD_DIV command (0x01)
  - The USLCOM_SET_BAUD_RATE command (0x1e)
  Devices based on the CP1204 will only accept the latter command, and ignore
  the former. As the latter command works on all chips that this driver
  supports, switch to always using it.

  A slight confusion here is that the previously used command was incorrectly
  named USLCOM_BAUD_RATE - even though we no longer use it, rename it to
  USLCOM_SET_BAUD_DIV to more closely match the name used in the datasheet.

  This change reflects a similar change made in the Linux driver, which was
  submitted by preston.fick at silabs.com, and has been tested on all of the
  uslcom(4) devices I have to hand.

  MFC after: 2 weeks

Modified:
  head/sys/dev/usb/serial/uslcom.c

Modified: head/sys/dev/usb/serial/uslcom.c
==
--- head/sys/dev/usb/serial/uslcom.c  Wed Jul 25 19:18:28 2012  (r238777)
+++ head/sys/dev/usb/serial/uslcom.c  Wed Jul 25 20:46:22 2012  (r238778)
@@ -70,12 +70,13 @@ SYSCTL_INT(_hw_usb_uslcom, OID_AUTO, deb

 /* Request codes */
 #define USLCOM_UART             0x00
-#define USLCOM_BAUD_RATE        0x01
+#define USLCOM_SET_BAUD_DIV     0x01
 #define USLCOM_DATA             0x03
 #define USLCOM_BREAK            0x05
 #define USLCOM_CTRL             0x07
 #define USLCOM_RCTRL            0x08
 #define USLCOM_SET_FLOWCTRL     0x13
+#define USLCOM_SET_BAUD_RATE    0x1e
 #define USLCOM_VENDOR_SPECIFIC  0xff

 /* USLCOM_UART values */
@@ -92,8 +93,8 @@ SYSCTL_INT(_hw_usb_uslcom, OID_AUTO, deb
 #define USLCOM_CTRL_RI          0x0040
 #define USLCOM_CTRL_DCD         0x0080

-/* USLCOM_BAUD_RATE values */
-#define USLCOM_BAUD_REF         0x384000
+/* USLCOM_SET_BAUD_DIV values */
+#define USLCOM_BAUD_REF         3686400 /* 3.6864 MHz */

 /* USLCOM_DATA values */
 #define USLCOM_STOP_BITS_1      0x00
@@ -511,19 +512,20 @@ uslcom_param(struct ucom_softc *ucom, st
 {
         struct uslcom_softc *sc = ucom->sc_parent;
         struct usb_device_request req;
-        uint32_t flowctrl[4];
+        uint32_t baudrate, flowctrl[4];
         uint16_t data;

         DPRINTF("\n");

+        baudrate = t->c_ospeed;
         req.bmRequestType = USLCOM_WRITE;
-        req.bRequest = USLCOM_BAUD_RATE;
-        USETW(req.wValue, USLCOM_BAUD_REF / t->c_ospeed);
+        req.bRequest = USLCOM_SET_BAUD_RATE;
+        USETW(req.wValue, 0);
         USETW(req.wIndex, USLCOM_PORT_NO);
-        USETW(req.wLength, 0);
+        USETW(req.wLength, sizeof(baudrate));

-        if (ucom_cfg_do_request(sc->sc_udev, &sc->sc_ucom,
-            &req, NULL, 0, 1000)) {
+        if (ucom_cfg_do_request(sc->sc_udev, &sc->sc_ucom,
+            &req, &baudrate, 0, 1000)) {
                 DPRINTF("Set baudrate failed (ignored)\n");
         }
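The two ways of programming the baud rate described in the log can be contrasted with a small stand-alone sketch. This is not driver code: the helper names `baud_divisor` and `baud_payload` are hypothetical, and the little-endian byte order of the 4-byte payload is an assumption based on the usual USB convention (the diff itself simply passes the host-order `baudrate` variable).

```c
#include <stdint.h>

/* Reference clock from the diff: USLCOM_BAUD_REF, 3.6864 MHz. */
#define BAUD_REF 3686400u

/*
 * Old scheme (request 0x01): a 16-bit divisor travels in wValue.
 * Hypothetical helper; speed must be nonzero, and very low speeds
 * would not fit in 16 bits, so a real caller should range-check.
 */
static uint16_t
baud_divisor(uint32_t speed)
{
        return ((uint16_t)(BAUD_REF / speed));
}

/*
 * New scheme (request 0x1e): the rate itself is sent as a 4-byte
 * data stage.  Little-endian layout is an assumption of this sketch.
 */
static void
baud_payload(uint32_t speed, uint8_t out[4])
{
        out[0] = (uint8_t)(speed & 0xff);
        out[1] = (uint8_t)((speed >> 8) & 0xff);
        out[2] = (uint8_t)((speed >> 16) & 0xff);
        out[3] = (uint8_t)((speed >> 24) & 0xff);
}
```

With the old scheme the device only ever sees the quotient, so rates that do not divide 3686400 evenly are rounded; the new request carries the requested rate unchanged.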
svn commit: r238779 - head/sys/dev/usb
Author: gavin
Date: Wed Jul 25 21:32:55 2012
New Revision: 238779
URL: http://svn.freebsd.org/changeset/base/238779

Log:
  Add vendor.product for a mouse I have laying around

Modified:
  head/sys/dev/usb/usbdevs

Modified: head/sys/dev/usb/usbdevs
==
--- head/sys/dev/usb/usbdevs  Wed Jul 25 20:46:22 2012  (r238778)
+++ head/sys/dev/usb/usbdevs  Wed Jul 25 21:32:55 2012  (r238779)
@@ -672,6 +672,7 @@ vendor STELERA 0x1a8d Stelera Wireless
 vendor MATRIXORBITAL  0x1b3d  Matrix Orbital
 vendor OVISLINK       0x1b75  OvisLink
 vendor TCTMOBILE      0x1bbb  TCT Mobile
+vendor SUNPLUS        0x1bcf  Sunplus Innovation Technology Inc.
 vendor WAGO           0x1be3  WAGO Kontakttechnik GmbH.
 vendor TELIT          0x1bc7  Telit
 vendor LONGCHEER      0x1c9e  Longcheer Holdings, Ltd.
@@ -3268,6 +3269,9 @@ product SUN KEYBOARD_TYPE_7 0x00a2 Type
 product SUN MOUSE     0x0100  Type 6 USB mouse
 product SUN KBD_HUB   0x100e  Kbd Hub

+/* Sunplus Innovation Technology Inc. products */
+product SUNPLUS USBMOUSE  0x0007  USB Optical Mouse
+
 /* Super Top products */
 product SUPERTOP IDE  0x6600  USB-IDE
svn commit: r238780 - head/usr.bin/find
Author: jilles
Date: Wed Jul 25 21:59:10 2012
New Revision: 238780
URL: http://svn.freebsd.org/changeset/base/238780

Log:
  find: Implement real -ignore_readdir_race.

  If -ignore_readdir_race is present, [ENOENT] errors caused by deleting a
  file after find has read its name from a directory are ignored.

  Formerly, -ignore_readdir_race did nothing.

  PR: bin/169723
  Submitted by: Valery Khromov and Andrey Ignatov

Modified:
  head/usr.bin/find/extern.h
  head/usr.bin/find/find.1
  head/usr.bin/find/find.c
  head/usr.bin/find/function.c
  head/usr.bin/find/main.c
  head/usr.bin/find/option.c

Modified: head/usr.bin/find/extern.h
==
--- head/usr.bin/find/extern.h  Wed Jul 25 21:32:55 2012  (r238779)
+++ head/usr.bin/find/extern.h  Wed Jul 25 21:59:10 2012  (r238780)
@@ -58,6 +58,7 @@ creat_f c_flags;
 creat_f c_follow;
 creat_f c_fstype;
 creat_f c_group;
+creat_f c_ignore_readdir_race;
 creat_f c_inum;
 creat_f c_links;
 creat_f c_ls;
@@ -111,7 +112,8 @@ exec_f f_size;
 exec_f f_type;
 exec_f f_user;

-extern int ftsoptions, isdeprecated, isdepth, isoutput, issort, isxargs;
+extern int ftsoptions, ignore_readdir_race, isdeprecated, isdepth, isoutput;
+extern int issort, isxargs;
 extern int mindepth, maxdepth;
 extern int regexp_flags;
 extern time_t now;

Modified: head/usr.bin/find/find.1
==
--- head/usr.bin/find/find.1  Wed Jul 25 21:32:55 2012  (r238779)
+++ head/usr.bin/find/find.1  Wed Jul 25 21:59:10 2012  (r238780)
@@ -31,7 +31,7 @@
 .\"@(#)find.1 8.7 (Berkeley) 5/9/95
 .\" $FreeBSD$
 .\"
-.Dd June 13, 2012
+.Dd July 25, 2012
 .Dt FIND 1
 .Os
 .Sh NAME
@@ -470,7 +470,9 @@ is numeric and there is no such group na
 .Ar gname
 is treated as a group ID.
 .It Ic -ignore_readdir_race
-This option is for GNU find compatibility and is ignored.
+Ignore errors because a file or a directory is deleted
+after reading the name from a directory.
+This option does not affect errors occurring on starting points.
 .It Ic -ilname Ar pattern
 Like
 .Ic -lname ,
@@ -618,7 +620,9 @@ is equivalent to
 .It Ic -nogroup
 True if the file belongs to an unknown group.
 .It Ic -noignore_readdir_race
-This option is for GNU find compatibility and is ignored.
+Turn off the effect of
+.Ic -ignore_readdir_race .
+This is default behaviour.
 .It Ic -noleaf
 This option is for GNU find compatibility.
 In GNU find it disables an optimization not relevant to

Modified: head/usr.bin/find/find.c
==
--- head/usr.bin/find/find.c  Wed Jul 25 21:32:55 2012  (r238779)
+++ head/usr.bin/find/find.c  Wed Jul 25 21:59:10 2012  (r238780)
@@ -197,8 +197,12 @@ find_execute(PLAN *plan, char *paths[])
                                 continue;
                         break;
                 case FTS_DNR:
-                case FTS_ERR:
                 case FTS_NS:
+                        if (ignore_readdir_race &&
+                            entry->fts_errno == ENOENT && entry->fts_level > 0)
+                                continue;
+                        /* FALLTHROUGH */
+                case FTS_ERR:
                         (void)fflush(stdout);
                         warnx("%s: %s",
                             entry->fts_path, strerror(entry->fts_errno));
@@ -228,7 +232,7 @@ find_execute(PLAN *plan, char *paths[])
                 for (p = plan; p && (p->execute)(p, entry); p = p->next);
         }
         finish_execplus();
-        if (errno)
+        if (errno && (!ignore_readdir_race || errno != ENOENT))
                 err(1, "fts_read");
         return (rval);
 }

Modified: head/usr.bin/find/function.c
==
--- head/usr.bin/find/function.c  Wed Jul 25 21:32:55 2012  (r238779)
+++ head/usr.bin/find/function.c  Wed Jul 25 21:59:10 2012  (r238780)
@@ -975,6 +975,25 @@ c_group(OPTION *option, char ***argvp)
 }

 /*
+ * -ignore_readdir_race functions --
+ *
+ *      Always true. Ignore errors which occur if a file or a directory
+ *      in a starting point gets deleted between reading the name and calling
+ *      stat on it while find is traversing the starting point.
+ */
+
+PLAN *
+c_ignore_readdir_race(OPTION *option, char ***argvp __unused)
+{
+        if (strcmp(option->name, "-ignore_readdir_race") == 0)
+                ignore_readdir_race = 1;
+        else
+                ignore_readdir_race = 0;
+
+        return palloc(option);
+}
+
+/*
  * -inum n functions --
  *
  * True if the file has inode # n.

Modified: head/usr.bin/find/main.c
==
--- head/usr.bin/find/main.c  Wed Jul 25 21:32:55 2012  (r23877
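The filter that the find.c hunk adds in front of the FTS_DNR and FTS_NS cases can be isolated as a pure predicate, which makes the three conditions easier to check by hand. This is only an illustrative sketch: the function name `should_skip_error` is hypothetical and does not exist in find(1); the plain `int` parameters stand in for the `ignore_readdir_race` flag and the `fts_errno` and `fts_level` fields of the FTSENT entry.

```c
#include <errno.h>

/*
 * Return nonzero when an error reported for a traversal entry should be
 * silently skipped: the option is on, the error is ENOENT (the file
 * vanished after its name was read), and the entry is not a starting
 * point (starting points have level 0).
 */
static int
should_skip_error(int ignore_readdir_race, int fts_errno, int fts_level)
{
        return (ignore_readdir_race && fts_errno == ENOENT &&
            fts_level > 0);
}
```

The level check is what implements the man page sentence "This option does not affect errors occurring on starting points": a missing path named on the command line is still reported.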
svn commit: r238781 - head/lib/libc/locale
Author: issyl0 (doc committer) Date: Wed Jul 25 22:17:44 2012 New Revision: 238781 URL: http://svn.freebsd.org/changeset/base/238781 Log: Add a new man page containing details of new locale-specific functions for wctype.h, iswalnum_l(3). Add it and its functions to the Makefile. Reviewed by: gavin, jilles Approved by: theraven MFC after:5 days Added: head/lib/libc/locale/iswalnum_l.3 (contents, props changed) Modified: head/lib/libc/locale/Makefile.inc Modified: head/lib/libc/locale/Makefile.inc == --- head/lib/libc/locale/Makefile.inc Wed Jul 25 21:59:10 2012 (r238780) +++ head/lib/libc/locale/Makefile.inc Wed Jul 25 22:17:44 2012 (r238781) @@ -30,7 +30,8 @@ MAN+= btowc.3 \ ctype.3 digittoint.3 isalnum.3 isalpha.3 isascii.3 isblank.3 iscntrl.3 \ isdigit.3 isgraph.3 isideogram.3 islower.3 isphonogram.3 isprint.3 \ ispunct.3 isrune.3 isspace.3 isspecial.3 \ - isupper.3 iswalnum.3 isxdigit.3 localeconv.3 mblen.3 mbrlen.3 \ + isupper.3 iswalnum.3 iswalnum_l.3 isxdigit.3 \ + localeconv.3 mblen.3 mbrlen.3 \ mbrtowc.3 \ mbsinit.3 \ mbsrtowcs.3 mbstowcs.3 mbtowc.3 multibyte.3 \ @@ -53,6 +54,18 @@ MLINKS+=iswalnum.3 iswalpha.3 iswalnum.3 iswalnum.3 iswphonogram.3 iswalnum.3 iswprint.3 iswalnum.3 iswpunct.3 \ iswalnum.3 iswrune.3 iswalnum.3 iswspace.3 iswalnum.3 iswspecial.3 \ iswalnum.3 iswupper.3 iswalnum.3 iswxdigit.3 +MLINKS+=iswalnum_l.3 iswalpha_l.3 iswalnum_l.3 iswcntrl_l.3 \ + iswalnum_l.3 iswctype_l.3 iswalnum_l.3 iswdigit_l.3 \ + iswalnum_l.3 iswgraph_l.3 iswalnum_l.3 iswlower_l.3 \ + iswalnum_l.3 iswprint_l.3 iswalnum_l.3 iswpunct_l.3 \ + iswalnum_l.3 iswspace_l.3 iswalnum_l.3 iswupper_l.3 \ + iswalnum_l.3 iswxdigit_l.3 iswalnum_l.3 towlower_l.3 \ + iswalnum_l.3 towupper_l.3 iswalnum_l.3 wctype_l.3 \ + iswalnum_l.3 iswblank_l.3 iswalnum_l.3 iswhexnumber_l.3 \ + iswalnum_l.3 iswideogram_l.3 iswalnum_l.3 iswnumber_l.3 \ + iswalnum_l.3 iswphonogram_l.3 iswalnum_l.3 iswrune_l.3 \ + iswalnum_l.3 iswspecial_l.3 iswalnum_l.3 nextwctype_l.3 \ + iswalnum_l.3 towctrans_l.3 
iswalnum_l.3 wctrans_l.3 MLINKS+=isxdigit.3 ishexnumber.3 MLINKS+=mbsrtowcs.3 mbsnrtowcs.3 MLINKS+=wcsrtombs.3 wcsnrtombs.3 Added: head/lib/libc/locale/iswalnum_l.3 == --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ head/lib/libc/locale/iswalnum_l.3 Wed Jul 25 22:17:44 2012 (r238781) @@ -0,0 +1,168 @@ +.\" Copyright (c) 2012 Isabell Long +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\"notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\"notice, this list of conditions and the following disclaimer in the +.\"documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD$ +.\" +.Dt ISWALNUM_L 3 +.Dd July 25, 2012 +.Os +.Sh NAME +.Nm iswalnum_l , +.Nm iswalpha_l , +.Nm iswcntrl_l , +.Nm iswctype_l , +.Nm iswdigit_l , +.Nm iswgraph_l , +.Nm iswlower_l , +.Nm iswprint_l , +.Nm iswpunct_l , +.Nm iswspace_l , +.Nm iswupper_l , +.Nm iswxdigit_l , +.Nm towlower_l , +.Nm towupper_l , +.Nm wctype_l , +.Nm iswblank_l , +.Nm iswhexnumber_l , +.Nm iswideogram_l , +.Nm iswnumber_l , +.Nm iswphonogram_l , +.Nm iswrune_l , +.Nm iswspecial_l , +.Nm nextwctype_l , +.Nm towctrans_l , +.Nm wctrans_l +.Nd wide character classification utilities +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In wctype.h +.Ft int +.Fn iswalnum_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswalpha_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswcntrl_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswctype_l "wint_t wc" "locale_t loc" +.Ft int +.Fn iswdigit_l "wint_t wc" "locale_t loc" +.Ft int +.F
Re: svn commit: r238741 - head/lib/libelf
On Jul 24, 2012, at 9:03 AM, "Andrey A. Chernov" wrote:

> Author: ache
> Date: Tue Jul 24 16:03:28 2012
> New Revision: 238741
> URL: http://svn.freebsd.org/changeset/base/238741
>
> Log:
>   Don't ever build files depending on the directory where they are placed in.
>   It is obvious that its modification time will change with each such file
>   builded.
>   This bug cause whole libelf to rebuild itself each second make run
>   (and relink that files on each first make run) in the loop.

A bunch of the sys/boot directories probably need this too.
svn commit: r238782 - head/lib/msun/src
Author: kargl
Date: Thu Jul 26 03:50:24 2012
New Revision: 238782
URL: http://svn.freebsd.org/changeset/base/238782

Log:
  Replace code that toggles between 53 and 64 bits on i386
  class hardware with the ENTERI and RETURNI macros, which
  are now available in math_private.h.

  Suggested by: bde
  Approved by: das (mentor)

Modified:
  head/lib/msun/src/s_cbrtl.c

Modified: head/lib/msun/src/s_cbrtl.c
==
--- head/lib/msun/src/s_cbrtl.c  Wed Jul 25 22:17:44 2012  (r238781)
+++ head/lib/msun/src/s_cbrtl.c  Thu Jul 26 03:50:24 2012  (r238782)
@@ -51,23 +51,12 @@ cbrtl(long double x)
         if (k == BIAS + LDBL_MAX_EXP)
                 return (x + x);

-#ifdef __i386__
-        fp_prec_t oprec;
-
-        oprec = fpgetprec();
-        if (oprec != FP_PE)
-                fpsetprec(FP_PE);
-#endif
+        ENTERI();
         if (k == 0) {
                 /* If x = +-0, then cbrt(x) = +-0. */
-                if ((u.bits.manh | u.bits.manl) == 0) {
-#ifdef __i386__
-                        if (oprec != FP_PE)
-                                fpsetprec(oprec);
-#endif
-                        return (x);
-                }
+                if ((u.bits.manh | u.bits.manl) == 0)
+                        RETURNI(x);
                 /* Adjust subnormal numbers. */
                 u.e *= 0x1.0p514;
                 k = u.bits.exp;
@@ -149,9 +138,5 @@ cbrtl(long double x)
         t=t+t*r;        /* error <= 0.5 + 0.5/3 + epsilon */

         t *= v.e;
-#ifdef __i386__
-        if (oprec != FP_PE)
-                fpsetprec(oprec);
-#endif
-        return (t);
+        RETURNI(t);
 }
svn commit: r238783 - in head/lib/msun: ld128 ld80
Author: kargl
Date: Thu Jul 26 03:59:33 2012
New Revision: 238783
URL: http://svn.freebsd.org/changeset/base/238783

Log:
  * ld80/expl.c:
    . Remove a few #ifdefs that should have been removed in the initial
      commit.
    . Sort fpmath.h to its rightful place.

  * ld128/s_expl.c:
    . Replace EXPMASK with its actual value.
    . Sort fpmath.h to its rightful place.

  Requested by: bde
  Approved by: das (mentor)

Modified:
  head/lib/msun/ld128/s_expl.c
  head/lib/msun/ld80/s_expl.c

Modified: head/lib/msun/ld128/s_expl.c
==
--- head/lib/msun/ld128/s_expl.c  Thu Jul 26 03:50:24 2012  (r238782)
+++ head/lib/msun/ld128/s_expl.c  Thu Jul 26 03:59:33 2012  (r238783)
@@ -29,12 +29,11 @@ __FBSDID("$FreeBSD$");

 #include

+#include "fpmath.h"
 #include "math.h"
 #include "math_private.h"
-#include "fpmath.h"

 #define BIAS    (LDBL_MAX_EXP - 1)
-#define EXPMASK (BIAS + LDBL_MAX_EXP)

 static volatile const long double twom1 = 0x1p-1L, tiny = 0x1p-1L;

@@ -205,7 +204,7 @@ expl(long double x)
         /* Filter out exceptional cases. */
         u.e = x;
         hx = u.xbits.expsign;
-        ix = hx & EXPMASK;
+        ix = hx & 0x7fff;
         if (ix >= BIAS + 13) {  /* |x| >= 8192 or x is NaN */
                 if (ix == BIAS + LDBL_MAX_EXP) {
                         if (u.xbits.manh != 0

Modified: head/lib/msun/ld80/s_expl.c
==
--- head/lib/msun/ld80/s_expl.c  Thu Jul 26 03:50:24 2012  (r238782)
+++ head/lib/msun/ld80/s_expl.c  Thu Jul 26 03:59:33 2012  (r238783)
@@ -45,13 +45,9 @@ __FBSDID("$FreeBSD$");
 #include
 #endif

+#include "fpmath.h"
 #include "math.h"
-#define FPSETPREC
-#ifdef NO_FPSETPREC
-#undef FPSETPREC
-#endif
 #include "math_private.h"
-#include "fpmath.h"

 #define BIAS    (LDBL_MAX_EXP - 1)
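That the removed EXPMASK macro really equals the literal 0x7fff can be checked with a few lines of arithmetic. The sketch below hard-codes LDBL_MAX_EXP as 16384, which is an assumption that holds for the ld80 and ld128 formats but not for platforms where long double is a plain double; the names `LD_MAX_EXP`, `LD_BIAS`, `LD_EXPMASK` and `exp_field` are hypothetical and chosen to avoid clashing with <float.h>.

```c
/* Assumed exponent range of the 80-bit and 128-bit long double formats. */
enum {
        LD_MAX_EXP = 16384,
        LD_BIAS = LD_MAX_EXP - 1,               /* 16383 = 0x3fff */
        LD_EXPMASK = LD_BIAS + LD_MAX_EXP       /* 32767 = 0x7fff */
};

/*
 * Strip the sign bit from a 16-bit sign+exponent word, as the
 * "ix = hx & 0x7fff" line in the diff does.
 */
static unsigned
exp_field(unsigned expsign)
{
        return (expsign & 0x7fff);
}
```

With these values the threshold `BIAS + 13` in the diff is the biased exponent of 2^13 = 8192, which matches the "|x| >= 8192" comment.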
svn commit: r238784 - in head/lib/msun: ld128 ld80
Author: kargl Date: Thu Jul 26 04:05:08 2012 New Revision: 238784 URL: http://svn.freebsd.org/changeset/base/238784 Log: Replace the macro name NUM with INTERVALS. This change provides compatibility with the INTERVALS macro used in the soon-to-be-commmitted expm1l() and someday-to-be-committed log*l() functions. Add a comment into ld128/s_expl.c noting at gcc issue that was deleted when rewriting ld80/e_expl.c as ld128/s_expl.c. Requested by: bde Approved by: das (mentor) Modified: head/lib/msun/ld128/s_expl.c head/lib/msun/ld80/s_expl.c Modified: head/lib/msun/ld128/s_expl.c == --- head/lib/msun/ld128/s_expl.cThu Jul 26 03:59:33 2012 (r238783) +++ head/lib/msun/ld128/s_expl.cThu Jul 26 04:05:08 2012 (r238784) @@ -35,6 +35,7 @@ __FBSDID("$FreeBSD$"); #defineBIAS(LDBL_MAX_EXP - 1) +/* XXX Prevent gcc from erroneously constant folding this: */ static volatile const long double twom1 = 0x1p-1L, tiny = 0x1p-1L; static const long double @@ -57,12 +58,12 @@ P9 = 2.755731922401038678178761995444688 P10 = 2.75573236172670046201884000197885520e-7L, P11 = 2.50517544183909126492878226167697856e-8L; -#defineNUM 128 +#defineINTERVALS 128 static const struct { long double hi; long double lo; -} s[NUM] = { +} s[INTERVALS] = { 0x1p0L, 0x0p0L, 0x1.0163da9fb33356d84a66aep0L, 0x3.36dcdfa4003ec04c360be2404078p-92L, 0x1.02c9a3e778060ee6f7cacap0L, 0x4.f7a29bde93d70a2cabc5cb89ba10p-92L, @@ -226,8 +227,8 @@ expl(long double x) fn = x * INV_L + 0x1.8p112 - 0x1.8p112; n = (int)fn; - n2 = (unsigned)n % NUM; /* Tang's j. */ - k = (n - n2) / NUM; + n2 = (unsigned)n % INTERVALS; /* Tang's j. */ + k = (n - n2) / INTERVALS; r1 = x - fn * L1; r2 = -fn * L2; Modified: head/lib/msun/ld80/s_expl.c == --- head/lib/msun/ld80/s_expl.c Thu Jul 26 03:59:33 2012(r238783) +++ head/lib/msun/ld80/s_expl.c Thu Jul 26 04:05:08 2012(r238784) @@ -36,7 +36,7 @@ __FBSDID("$FreeBSD$"); * in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 15, * 144-157 (1989). 
* - * where the 32 table entries have been expanded to NUM (see below). + * where the 32 table entries have been expanded to INTERVALS (see below). */ #include @@ -65,9 +65,9 @@ u_threshold = LD80C(0xb21dfe7f09e2baa9, static const double __aligned(64) /* - * ln2/NUM = L1+L2 (hi+lo decomposition for multiplication). L1 must have - * at least 22 (= log2(|LDBL_MIN_EXP-extras|) + log2(NUM)) lowest bits zero - * so that multiplication of it by n is exact. + * ln2/INTERVALS = L1+L2 (hi+lo decomposition for multiplication). L1 must + * have at least 22 (= log2(|LDBL_MIN_EXP-extras|) + log2(INTERVALS)) lowest + * bits zero so that multiplication of it by n is exact. */ L1 = 5.4152123484527692e-3, /* 0x162e42ff00.0p-60 */ L2 = -3.2819649005320973e-13, /* -0x1718432a1b0e26.0p-94 */ @@ -75,7 +75,7 @@ INV_L = 1.8466496523378731e+2,/* 0x17 /* * Domain [-0.002708, 0.002708], range ~[-5.7136e-24, 5.7110e-24]: * |exp(x) - p(x)| < 2**-77.2 - * (0.002708 is ln2/(2*NUM) rounded up a little). + * (0.002708 is ln2/(2*INTERVALS) rounded up a little). */ P2 = 0.5, P3 = 1.6119e-1, /* 0x155490.0p-55 */ @@ -84,16 +84,16 @@ P5 = 8.354987869413e-3,/* 0x P6 = 1.391738560272e-3; /* 0x16c16c651633ae.0p-62 */ /* - * 2^(i/NUM) for i in [0,NUM] is represented by two values where the - * first 47 (?!) bits of the significand is stored in hi and the next 53 + * 2^(i/INTERVALS) for i in [0,INTERVALS] is represented by two values where + * the first 47 (?!) bits of the significand is stored in hi and the next 53 * bits are in lo. */ -#defineNUM 128 +#defineINTERVALS 128 static const struct { double hi; double lo; -} s[NUM] __aligned(16) = { +} s[INTERVALS] __aligned(16) = { 0x1p+0, 0x0p+0, 0x1.0163da9fb330p+0, 0x1.ab6c25335719bp-47, 0x1.02c9a3e77804p+0, 0x1.07737be56527cp-47, @@ -265,8 +265,8 @@ expl(long double x) #else n = (int)fn; #endif - n2 = (unsigned)n % NUM; /* Tang's j. */ - k = (n - n2) / NUM; + n2 = (unsigned)n % INTERVALS; /* Tang's j. 
*/ + k = (n - n2) / INTERVALS; r1 = x - fn * L1; r2 = -fn * L2; ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
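The two lines touched by the rename split the integer n into a table index (Tang's j) and a power-of-two exponent k such that n = k * INTERVALS + j with 0 <= j < INTERVALS, including for negative n. The stand-alone sketch below reproduces just that step; the struct and function names are hypothetical and are not part of msun.

```c
#define INTERVALS 128

/* Hypothetical container for the two pieces of n. */
struct reduced {
        int k;  /* exponent adjustment: result is scaled by 2^k */
        int j;  /* index into the 2^(j/INTERVALS) table, 0..127 */
};

static struct reduced
split_n(int n)
{
        struct reduced r;

        /*
         * The cast to unsigned makes the remainder non-negative even for
         * negative n; this works because the unsigned modulus is a power
         * of two and therefore a multiple of INTERVALS.
         */
        r.j = (int)((unsigned)n % INTERVALS);
        r.k = (n - r.j) / INTERVALS;    /* exact: n - j is a multiple of 128 */
        return (r);
}
```

For example, n = -130 gives j = 126 and k = -2, since -2 * 128 + 126 = -130; a plain `n % INTERVALS` would have produced a negative index instead.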
svn commit: r238785 - head/sys/arm/conf
Author: imp Date: Thu Jul 26 05:35:10 2012 New Revision: 238785 URL: http://svn.freebsd.org/changeset/base/238785 Log: Update partitions to reflect "sam9 demo" defaults. Update i2c devices to just include the eeprom. Update dataflash chip select to be CS 1 (this doesn't work yet and needs changes to at91_spi and the spibus infrastructure). Fix typo in comment. Modified: head/sys/arm/conf/SAM9260EK head/sys/arm/conf/SAM9260EK.hints Modified: head/sys/arm/conf/SAM9260EK == --- head/sys/arm/conf/SAM9260EK Thu Jul 26 04:05:08 2012(r238784) +++ head/sys/arm/conf/SAM9260EK Thu Jul 26 05:35:10 2012(r238785) @@ -17,12 +17,12 @@ # # $FreeBSD$ -ident ETHERNUT5 +ident SAM9260EK -include "../at91/std.ethernut5" +include "../at91/std.sam9260ek" # To statically compile in device wiring instead of /boot/device.hints -hints "ETHERNUT5.hints" +hints "SAM9260EK.hints" #makeoptions DEBUG=-g# Build kernel with gdb(1) debug symbols @@ -103,13 +103,13 @@ devicebpf # Berkeley packet filter # Ethernet device mii # Minimal MII support -device ate # Atmel AT91 Ethernet friver +device ate # Atmel AT91 Ethernet driver # I2C device at91_twi# Atmel AT91 Two-wire Interface device iic # I2C generic I/O device driver device iicbus # I2C bus system -device pcf8563 # NXP PCF8563 clock/calendar +device icee# I2C eeprom # MMC/SD device at91_mci# Atmel AT91 Multimedia Card Interface Modified: head/sys/arm/conf/SAM9260EK.hints == --- head/sys/arm/conf/SAM9260EK.hints Thu Jul 26 04:05:08 2012 (r238784) +++ head/sys/arm/conf/SAM9260EK.hints Thu Jul 26 05:35:10 2012 (r238785) @@ -2,50 +2,48 @@ # Atmel AT45DB21D hint.at45d.0.at="spibus0" -hint.at45d.0.addr=0x00 -# user 132 kbytes +hint.at45d.0.cs=1 +# Area 0: to 41FF (RO) Bootstrap +# Area 1: 4200 to 83FF Environment +# Area 2: 8400 to 00041FFF (RO) U-Boot +# Area 3: 00042000 to 00251FFF Kernel +# Area 4: 00252000 to 0083 FS +# bootstrap hint.map.0.at="flash/spi0" hint.map.0.start=0x -hint.map.0.end=0x00020fff -hint.map.0.name="user" +hint.map.0.end=0x41ff 
+hint.map.0.name="bootstrap" hint.map.0.readonly=1 -# setup 132 kbytes +# uboot environment hint.map.1.at="flash/spi0" -hint.map.1.start=0x00021000 -hint.map.1.end=0x00041fff -hint.map.1.name="setup" -hint.map.1.readonly=1 -# uboot 528 kbytes +hint.map.1.start=0x4200 +hint.map.1.end=0x00083ff +hint.map.1.name="uboot-env" +#hint.map.1.readonly=1 +# uboot hint.map.2.at="flash/spi0" -hint.map.2.start=0x00042000 -hint.map.2.end=0x000c5fff +hint.map.2.start=0x8400 +hint.map.2.end=0x00041fff hint.map.2.name="uboot" hint.map.2.readonly=1 -# kernel 2640 kbytes +# kernel hint.map.3.at="flash/spi0" -hint.map.3.start=0x000c6000 -hint.map.3.end=0x00359fff -hint.map.3.name="kernel" +hint.map.3.start=0x00042000 +hint.map.3.end=0x00251fff +hint.map.3.name="fs" #hint.map.3.readonly=1 -# nutos 528 kbytes +# fs hint.map.4.at="flash/spi0" -hint.map.4.start=0x0035a000 -hint.map.4.end=0x003ddfff -hint.map.4.name="nutos" -hint.map.4.readonly=1 -# env 132 kbytes -hint.map.5.at="flash/spi0" -hint.map.5.start=0x003de000 -hint.map.5.end=0x003fefff -hint.map.5.name="env" -hint.map.5.readonly=1 -# env 132 kbytes -hint.map.6.at="flash/spi0" -hint.map.6.start=0x003ff000 -hint.map.6.end=0x0041 -hint.map.6.name="nutoscfg" -hint.map.6.readonly=1 +hint.map.4.start=0x00252000 +hint.map.4.end=0x0083 +hint.map.4.name="fs" +#hint.map.4.readonly=1 + +# EEPROM +hint.icee.0.at="iicbus0" +hint.icee.0.addr=0xa0 +hint.icee.0.type=16 +hint.icee.0.size=65536 +hint.icee.0.rd_sz=256 +hint.icee.0.wr_sz=256 -# NXP PCF8563 clock/calendar -hint.pcf8563_rtc.0.at="iicbus0" -hint.pcf8563_rtc.0.addr=0xa2 ___ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
svn commit: r238786 - head/sys/arm/conf
Author: imp
Date: Thu Jul 26 05:37:36 2012
New Revision: 238786
URL: http://svn.freebsd.org/changeset/base/238786

Log:
  Fix typo in comment. spibus uses cs= rather than addr=, so fix hints
  to use that (nop since spibus cs defaults to 0, and at91_spi assumes
  0).

Modified:
  head/sys/arm/conf/ETHERNUT5
  head/sys/arm/conf/ETHERNUT5.hints

Modified: head/sys/arm/conf/ETHERNUT5
==
--- head/sys/arm/conf/ETHERNUT5  Thu Jul 26 05:35:10 2012  (r238785)
+++ head/sys/arm/conf/ETHERNUT5  Thu Jul 26 05:37:36 2012  (r238786)
@@ -103,7 +103,7 @@ device bpf # Berkeley packet filter

 # Ethernet
 device mii              # Minimal MII support
-device ate              # Atmel AT91 Ethernet friver
+device ate              # Atmel AT91 Ethernet driver

 # I2C
 device at91_twi         # Atmel AT91 Two-wire Interface

Modified: head/sys/arm/conf/ETHERNUT5.hints
==
--- head/sys/arm/conf/ETHERNUT5.hints  Thu Jul 26 05:35:10 2012  (r238785)
+++ head/sys/arm/conf/ETHERNUT5.hints  Thu Jul 26 05:37:36 2012  (r238786)
@@ -2,7 +2,7 @@

 # Atmel AT45DB21D
 hint.at45d.0.at="spibus0"
-hint.at45d.0.addr=0x00
+hint.at45d.0.cs=0
 # user 132 kbytes
 hint.map.0.at="flash/spi0"
 hint.map.0.start=0x
svn commit: r238787 - head/sys/arm/at91
Author: imp
Date: Thu Jul 26 05:46:56 2012
New Revision: 238787
URL: http://svn.freebsd.org/changeset/base/238787

Log:
  Some models have 6 USARTS + DBGU. Set a consistent name.

Modified:
  head/sys/arm/at91/uart_bus_at91usart.c

Modified: head/sys/arm/at91/uart_bus_at91usart.c
==
--- head/sys/arm/at91/uart_bus_at91usart.c  Thu Jul 26 05:37:36 2012  (r238786)
+++ head/sys/arm/at91/uart_bus_at91usart.c  Thu Jul 26 05:46:56 2012  (r238787)
@@ -95,6 +95,12 @@ usart_at91_probe(device_t dev)
         case 4:
                 device_set_desc(dev, "USART3");
                 break;
+        case 5:
+                device_set_desc(dev, "USART4");
+                break;
+        case 6:
+                device_set_desc(dev, "USART5");
+                break;
         }
         sc->sc_class = &at91_usart_class;
         if (sc->sc_class->uc_rclk == 0)
Re: svn commit: r238755 - head/sys/x86/x86
On Wed, 25 Jul 2012, Konstantin Belousov wrote:

> On Wed, Jul 25, 2012 at 11:00:41AM -0700, Jim Harris wrote:
>> On Wed, Jul 25, 2012 at 10:32 AM, Konstantin Belousov wrote:
>>> I also asked Jim to test whether the cause of the TSC sync test failure
>>> is the lack of synchronization between gathering data and testing it,
>>> but it appeared that the reason is genuine timecounter value going
>>> backward.
>>
>> I wonder if instead of timecounter going backward, that TSC test
>> fails because CPU speculatively performs rdtsc instruction in relation
>> to waiter checks in smp_rendezvous_action. Or maybe we are saying
>> the same thing.
>
> Ok, the definition of the 'timecounter goes back', as I understand it:
>
> you have two events A and B in two threads, provable ordered, say, A is
> a lock release and B is the same lock acquisition. Assume that you
> take rdtsc values tA and tB under the scope of the lock right before A
> and right after B. Then it should be impossible to have tA > tB.

For the threaded case, there has to be something for the accesses to be
provably ordered. It is hard to see how the something can be strong
enough unless it serializes all thread state in A and B. The rdtsc
state is not part of the thread state as known to APIs, but it is hard to
see how threads can serialize themselves without also serializing the
TSC.

For most uses, the scope of the serialization and locking also needs
to extend across multiple timer reads. Otherwise you can have situations
like:

        read the time
        interrupt or context switch
        read later time in other intr handler/thread
        save late time
        back to previous context
        save earlier time

It is unclear how to even prevent such situations. You (at least, I)
don't want heavyweight locking/synchronization to prevent the context
switches. And the kernel rarely if ever does such synchronization.
binuptime() has none internally. It just spins if necessary until the
read becomes stable. Most callers of binuptime() just call it.
> I do not think that we can ever observe tA > tB if both threads are
> executing on the same CPU.

I thought that that was the problem, with a single thread and no context
switches seeing the TSC go backwards. Even then, it would take
non-useful behaviour (except for calibration and benchmarks) like
spinning executing rdtsc to see it going backwards. Normally there are
many instructions between rdtsc's and the non-serialization isn't as deep
as that. Using syscalls, you just can't read the timecounter without
about 1000 cycles between reads. When there is a context switch, there
is usually accidental serialization from locking.

I care about timestamps being ordered more than most people, and tried
to kill the get*time() APIs because they are weakly ordered relative
to the non-get variants (they return times in the past, and there is
no way to round down to get consistent times). I tried to fix them
by adding locking and updating them to the latest time whenever a
non-get variant gives a later time (by being used). This was too slow,
and breaks the design criteria that timecounter calls should not use
any explicit locking. However, if you want slowness, then you can get it
similarly by fixing the monotonicity of rdtsc in software. I think I
just figured out how to do this with the same slowness as serialization,
if a locked instruction serializes; maybe less otherwise:

spin:
        ptsc = prev_tsc;  /* memory -> local (intentionally !atomic) */
        tsc = rdtsc();    /* only 32 bits for timecounters */
        if (tsc <= ptsc) { /* I forgot about wrap at first -- see below */
                /*
                 * It went backwards, or stopped. Could handle more
                 * completely, starting with panic() to see if this
                 * happens at all.
                 */
                return (ptsc); /* stopped is better than backwards */
        }
        /* Usual case; update (32 bits). */
        if (atomic_cmpset_int(&prev_tsc, ptsc, tsc))
                return (tsc);
        goto spin;

The 32-bitness of timecounters is important for the algorithm, and for
efficiency on i386.
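The clamp-and-compare-and-swap loop above can be tried outside the kernel with C11 atomics. The following is only a user-space sketch under stated assumptions, not the code from the mail: `atomic_compare_exchange_strong` stands in for the kernel's atomic_cmpset_int(), the counter is passed in as a function pointer instead of calling rdtsc, and wrap is handled with a signed 32-bit difference, which is this sketch's own choice (the mail's treatment of wrap is not shown in this excerpt). `fake_counter` is a hypothetical deterministic source used only to exercise the logic.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Last value handed out to any caller. */
static _Atomic uint32_t prev_tsc;

/*
 * Return a counter value that never goes backwards relative to values
 * previously returned through this function.
 */
static uint32_t
monotonic_read(uint32_t (*read_counter)(void))
{
        uint32_t ptsc, tsc;

        for (;;) {
                ptsc = atomic_load_explicit(&prev_tsc, memory_order_relaxed);
                tsc = read_counter();
                /*
                 * Signed difference so that a wrapped counter still
                 * compares as "later" (assumption of this sketch; relies
                 * on the usual two's-complement conversion).
                 */
                if ((int32_t)(tsc - ptsc) <= 0)
                        return (ptsc);  /* stopped is better than backwards */
                /* Usual case: publish the newer value, retry if we raced. */
                if (atomic_compare_exchange_strong(&prev_tsc, &ptsc, tsc))
                        return (tsc);
        }
}

/* Hypothetical test source: the second sample "goes backwards". */
static const uint32_t fake_samples[] = { 100, 90, 120 };
static unsigned fake_idx;

static uint32_t
fake_counter(void)
{
        return (fake_samples[fake_idx++ % 3]);
}
```

Calling `monotonic_read(fake_counter)` three times yields 100, 100, 120: the backwards sample 90 is replaced by the last published value, which is the "stopped is better than backwards" behaviour described in the comment.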
We assume that the !atomic read gives coherent bits. The value may be
in the past. When tsc <= ptsc, the value is in the future, so value
must be up to date, unless there is massive non-seriality with another
CPU having just written a value more up to date than this CPU read. We
don't care about this, since losing this race is no different from being
preempted after we read.

When tsc > ptsc, we want to write it as 32 bits to avoid the cmpxchg8b
slowness/unportability. Again, the value may be out of date when we try
to update it, because we were preempted. We don't care about this
either, as above, but detected some cases as a side effect of checking
that ptsc is up to date. Normally ptsc was up to date when it was read,
but it could easily be out of date when it was che