Re: Unusual threading behavior on single processes
On Fri, Mar 27, 2020 at 09:03:40PM +, Stefmorino wrote:

> I have a question about a performance quirk on OpenBSD, but I'm not really
> sure how to address it, or what the root cause even is; that being how
> multithreaded applications (libpthread?) behave (notably, games).
>
> I have tested many applications, the behavior is the same in all of them,
> but I'll talk about OpenMW (an open-source game engine for Morrowind) since
> I have the most useful information about how this program is threaded. By
> default, OpenMW uses 4 threads (cited here:
> https://openmw.readthedocs.io/en/stable/reference/modding/settings/cells.html),
> one for main/generic processing, one for graphics, one for audio, and one
> for preloading terrain. You can see this if you look at the thread usage
> under top while running the game; however, this is exactly where my
> question comes into play. Instead of each thread processing the game
> independently with its own limits, each thread is "capped" to the total
> limit of one thread (i.e. instead of openmw's process using 100% of 4
> threads, or 400% CPU in top, the process uses 25% across 4 threads, or
> 100% CPU in top). I tested this using GENERIC instead of GENERIC.MP as
> well, and get identical performance on the one thread; it's almost like
> pthreads is acting as a placeholder of sorts and not actually improving
> performance where it should.
>
> Is it a lock (spin is at 0)? A placeholder? A limitation of how Ryzen SMP
> is implemented?

Hard to tell, no idea what that game engine does. But this is not a
general problem, e.g. the malloc_duel regress test
(/usr/src/regress/lib/libpthread/malloc_duel). I see > 100% as well
with other multi-threaded programs.

32013 otto 600 6020K 1552K onproc/3 - 1:07 228.81% malloc_due

Wild guess: it could be that your program actually does not do real
threading, but userland threading. Check with top -H if it really
creates threads. You should see multiple threads having the same PID.
Or all threads are using a resource that cannot be shared.

	-Otto

>
> I'd be happy to do any additional testing, I have a fresh -current source
> tree ready
>
> dmesg
> OpenBSD 6.6-current (GENERIC.MP) #75: Tue Mar 24 12:56:37 MDT 2020
>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 16603250688 (15834MB)
> avail mem = 16087437312 (15342MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.1 @ 0x986ec000 (62 entries)
> bios0: vendor LENOVO version "R0UET76W (1.56 )" date 11/05/2019
> bios0: LENOVO 20KVCTO1WW
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SSDT SSDT CRAT CDIT UEFI MSDM BATB HPET APIC MCFG SBST WSMT IVRS FPDT SSDT SSDT SSDT UEFI SSDT
> acpi0: wakeup devices GPP0(S3) GPP1(S3) GPP2(S3) GPP3(S3) GPP4(S3) GPP5(S3) GPP6(S3) GP17(S3) XHC0(S3) XHC1(S3) GP18(S3) LID_(S3) SLPB(S3)
> acpitimer0 at acpi0: 3579545 Hz, 32 bits
> acpihpet0 at acpi0: 14318180 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx, 1996.61 MHz, 17-11-00
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
> cpu0: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
> cpu0: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 24MHz
> cpu0: mwait min=64, max=64, C-substates=1.1, IBE
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx, 1996.23 MHz, 17-11-00
> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu1: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
> cpu1: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
> cpu1: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
> cpu1: smt 1, core
Zoom meeting via chromium web app
Greetings,

I'm trying to use the Zoom meeting platform on OpenBSD through the Chromium
web app (-current, very recent snapshot, Chromium 80.0.3987.149, amd64).

When I click on the app icon, a new browser window opens and the sign-in web
page appears, but soon after the browser is killed:

Mar 28 09:52:43 theseus /bsd: chrome(36809): pledge sysctl 2: 6 2
Mar 28 09:52:43 theseus /bsd: chrome[36809]: pledge "", syscall 202

Starting chrome with --disable-unveil doesn't help (same error).

Has anybody succeeded in using this (or a similar) platform? Any hints would
be much appreciated.

All the best

--
Alessandro De Laurenzis
[mailto:jus...@atlantide.mooo.com]
Web: http://www.atlantide.mooo.com
LinkedIn: http://it.linkedin.com/in/delaurenzis
Re: Zoom meeting via chromium web app
On Sat, Mar 28, 2020 at 10:00:28AM +0100, Alessandro De Laurenzis wrote:
> Greetings,
>
> I'm trying to use the Zoom meeting platform on OpenBSD through the Chromium
> web app (-current, very recent snapshot, Chromium 80.0.3987.149, amd64).
>
> When I click on the app icon, a new browser window opens and the sign-in web
> page appears, but soon after the browser is killed:
>
> Mar 28 09:52:43 theseus /bsd: chrome(36809): pledge sysctl 2: 6 2
> Mar 28 09:52:43 theseus /bsd: chrome[36809]: pledge "", syscall 202
>
> Starting chrome with --disable-unveil doesn't help (same error).
>
> Has anybody succeeded in using this (or a similar) platform?

You can use --no-sandbox.
But Zoom will not work anyway; at least for me, it recognizes neither my
audio nor my camera.
I use Windows for video conferencing.

--
Antoine
Re: Zoom meeting via chromium web app
On March 28, 2020 11:40:25 AM GMT+01:00, Antoine Jacoutot wrote:
> On Sat, Mar 28, 2020 at 10:00:28AM +0100, Alessandro De Laurenzis wrote:
>> Greetings,
>>
>> I'm trying to use the Zoom meeting platform on OpenBSD through the Chromium
>> web app (-current, very recent snapshot, Chromium 80.0.3987.149, amd64).
>>
>> When I click on the app icon, a new browser window opens and the sign-in web
>> page appears, but soon after the browser is killed:
>>
>> Mar 28 09:52:43 theseus /bsd: chrome(36809): pledge sysctl 2: 6 2
>> Mar 28 09:52:43 theseus /bsd: chrome[36809]: pledge "", syscall 202
>>
>> Starting chrome with --disable-unveil doesn't help (same error).
>>
>> Has anybody succeeded in using this (or a similar) platform?
>
> You can use --no-sandbox.
> But Zoom will not work anyway; at least for me, it recognizes neither my
> audio nor my camera.
> I use Windows for video conferencing.

Hello,

I haven't tried Zoom, but I successfully used Jitsi with Chromium on
-current. I just had to chown /dev/videoX. It worked nicely until my system
hung after 10 or 15 minutes, though, likely because of the lack of hardware
acceleration.

You should give it a try.

Cheers,
--
Tristan
RE: Unusual threading behavior on single processes
Haai,

Just to make a more-or-less general point (or two)...

"Otto Moerbeek" wrote:
> On Fri, Mar 27, 2020 at 09:03:40PM +, Stefmorino wrote:
>
>> I have tested many applications, the behavior is the same in all of them,
>> but I'll talk about OpenMW (an open-source game engine for Morrowind) since
>> I have the most useful information about how this program is threaded. By
>> default, OpenMW uses 4 threads (cited here:
>> https://openmw.readthedocs.io/en/stable/reference/modding/settings/cells.html),
>> one for main/generic processing, one for graphics, one for audio, and one
>> for preloading terrain.
>>[snip]
>>
>> Is it a lock (spin is at 0)? A placeholder? A limitation of how Ryzen SMP
>> is implemented?
>[snip]
>
> Wild guess: it could be that your program actually does not do real
> threading, but userland threading.

"Fibering", in other words.

> Check with top -H if it really
> creates threads. You should see multiple threads having the same PID.
> Or all threads are using a resource that cannot be shared.

Likely the latter.

It's always funny, isn't it... A coder thinks "hey, I want multi-threading
'cause it's 1337, I'll just neatly run these subsystems within separate
threads and I'm done!". The fact that such is frequently a naive
proposition should be clear to the more clueful reader.

Games tend to be heavy on global state, and are more likely to benefit
from a multi-process model w/ carefully thought-out boundaries than from
a shared-everything thread model. While that need not be the case here,
mestrongly suspects it is.

Take heed, and measure. Always measure.

Take care,

--zeurkous.

> -Otto

--
Friggin' Machines!
/bin/sh: ctags: Argument list too long (make tags)
Apparently the number of files in kern is on the hairy edge of ARG_MAX on
OpenBSD 6.6-current amd64. If I run the same command in /usr/src, it works,
which makes the problem easy to ignore until more files are added.

Should ctags grow an option to take a list of inputs from a file, or is -a
smart enough to be used with xargs to resolve this problem?

cd /home/greg/s/src/sys/kern; make tags
...
TDIR=`mktemp -d /tmp/_tagXX` || exit 1; eval "S=/home/greg/s/src/sys/arch/amd64/../.." && config -s /home/greg/s/src/sys/arch/amd64/../.. -b ${TDIR} /home/greg/s/src/sys/arch/amd64/conf/GENERIC.MP && eval "_arch=\"`make -V _arch -f ${TDIR}/Makefile`\"" && eval "_mach=\"`make -V _mach -f ${TDIR}/Makefile`\"" && eval "_machdir=\/home/greg/s/src/sys/arch/amd64/../../arch/${_mach}" && eval "_archdir=\/home/greg/s/src/sys/arch/amd64/../../arch/${_arch}" && eval "HFILES=\"`find /home/greg/s/src/sys/arch/amd64/../.. \( -path /home/greg/s/src/sys/arch/amd64/../../'arch' -o -path /home/greg/s/src/sys/arch/amd64/../../stand -o -path /home/greg/s/src/sys/arch/amd64/../../lib/libsa -o -path /home/greg/s/src/sys/arch/amd64/../..'/lib/libkern/arch' \) -prune -o -name '*.h'; find ${_machdir} ${_archdir} /home/greg/s/src/sys/arch/amd64/../../lib/libkern/arch/${_mach} \( -name boot -o -name stand \) -prune -o -name '*.h'`\"" && eval "SFILES=\"`make -V SFILES -f ${TDIR}/Makefile`\"" && eval "CFILES=\"`make -V CFILES -f ${TDIR}/Makefile`\"" && eval "AFILES=\"`make -V AFILES -f ${TDIR}/Makefile`\"" && ctags -wd -f /home/greg/s/src/sys/arch/amd64/tags ${CFILES} ${HFILES} && egrep "^[_A-Z]*ENTRY[_A-Z]*\(.*\)" ${SFILES} ${AFILES} | sed "s;\\([^:]*\\):\\([^(]*\\)(\\([^, )]*\\)\\(.*\\);\\3 \\1 /^\\2(\\3\\4$/;" >> /home/greg/s/src/sys/arch/amd64/tags && sort -o /home/greg/s/src/sys/arch/amd64/tags /home/greg/s/src/sys/arch/amd64/tags && rm -rf ${TDIR}
/bin/sh: ctags: Argument list too long
*** Error 1 in arch/amd64 (Makefile:42 'tags')

--
nest.cx is Gmail hosted, use PGP: https://pgp.key-server.io/0x0B1542BD8DF5A1B0
Fingerprint: 5E2B 2D0E 1E03 2046 BEC3 4D50 0B15 42BD 8DF5 A1B0
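
One possible way around the ARG_MAX limit raised above, as a rough sketch and not a tested change to the kernel Makefile: pipe the file list through xargs so that ctags runs in several smaller batches, each appending with -a, then sort once at the end as the existing target already does. The function name tags_in_batches and the CTAGS override variable are made up for illustration. The sketch assumes file names without whitespace, and it has not been checked whether -w or -d behave differently when the input is split across several ctags runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Makefile in the tree): read a
# newline-separated list of source files on stdin and tag them in batches.
# usage: <file list> | tags_in_batches /path/to/tags
tags_in_batches() {
	tagsfile=$1
	: > "$tagsfile"		# start from an empty tags file
	# xargs splits the list into as many ctags invocations as needed to
	# stay under ARG_MAX; -a makes each invocation append to the file.
	# CTAGS is an illustrative override, defaulting to plain ctags.
	xargs ${CTAGS:-ctags} -wd -a -f "$tagsfile" &&
	    sort -o "$tagsfile" "$tagsfile"	# sort once, after all batches
}
```

It could be driven with something like `find /usr/src/sys/kern -name '*.[ch]' | tags_in_batches /tmp/tags` (paths illustrative only).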
Re: Unusual threading behavior on single processes
Thank you, your information was very helpful.

I compiled and ran malloc_duel and it's working as intended. I wasn't aware
of the -H flag for top, and I can see that programs are threading as you
say, though the bottleneck behind my poor performance is still a mystery. I
took some screen captures so you can see what I'm seeing:

Xonotic: https://0x0.st/iBCD.png
OpenMW: https://0x0.st/iBC5.png
Terraria (fnaify if curious, thanks thfr :>): https://0x0.st/iMrr.png

In the case of OpenMW, the bottleneck actually seems pretty obvious from
what top -H reports. I don't really know what to say about the other
examples.

I would break out a profiling tool at this stage, but the results of testing
with top -H have left me with no idea where the bottleneck is (except
OpenMW, where it might actually be CPU); digging through systat hasn't
really given me any revelations either. :/

If anyone has a hunch where I should check, or if you need me to test
different software, I'd be more than happy to.

Regards,
Stefmorino

On Sat, Mar 28, 2020, at 09:00:21AM +, Otto Moerbeek wrote:
> On Fri, Mar 27, 2020 at 09:03:40PM +, Stefmorino wrote:
>> I have a question about a performance quirk on OpenBSD, but I'm not really
>> sure how to address it, or what the root cause even is; that being how
>> multithreaded applications (libpthread?) behave (notably, games).
>>
>> I have tested many applications, the behavior is the same in all of them,
>> but I'll talk about OpenMW (an open-source game engine for Morrowind) since
>> I have the most useful information about how this program is threaded. By
>> default, OpenMW uses 4 threads (cited here:
>> https://openmw.readthedocs.io/en/stable/reference/modding/settings/cells.html),
>> one for main/generic processing, one for graphics, one for audio, and one
>> for preloading terrain. You can see this if you look at the thread usage
>> under top while running the game; however, this is exactly where my
>> question comes into play.
>> Instead of each thread processing the game independently with its own
>> limits, each thread is "capped" to the total limit of one thread (i.e.
>> instead of openmw's process using 100% of 4 threads, or 400% CPU in top,
>> the process uses 25% across 4 threads, or 100% CPU in top). I tested this
>> using GENERIC instead of GENERIC.MP as well, and get identical performance
>> on the one thread; it's almost like pthreads is acting as a placeholder of
>> sorts and not actually improving performance where it should.
>>
>> Is it a lock (spin is at 0)? A placeholder? A limitation of how Ryzen SMP
>> is implemented?
>
> Hard to tell, no idea what that game engine does. But this is not a
> general problem, e.g. the malloc_duel regress test
> (/usr/src/regress/lib/libpthread/malloc_duel). I see > 100% as well
> with other multi-threaded programs.
>
> 32013 otto 600 6020K 1552K onproc/3 - 1:07 228.81% malloc_due
>
> Wild guess: it could be that your program actually does not do real
> threading, but userland threading. Check with top -H if it really
> creates threads. You should see multiple threads having the same PID.
> Or all threads are using a resource that cannot be shared.
>
> -Otto
>>
>> I'd be happy to do any additional testing, I have a fresh -current source
>> tree ready
>>
>> dmesg
>> OpenBSD 6.6-current (GENERIC.MP) #75: Tue Mar 24 12:56:37 MDT 2020
>>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> real mem = 16603250688 (15834MB)
>> avail mem = 16087437312 (15342MB)
>> mpath0 at root
>> scsibus0 at mpath0: 256 targets
>> mainbus0 at root
>> bios0 at mainbus0: SMBIOS rev. 3.1 @ 0x986ec000 (62 entries)
>> bios0: vendor LENOVO version "R0UET76W (1.56 )" date 11/05/2019
>> bios0: LENOVO 20KVCTO1WW
>> acpi0 at bios0: ACPI 5.0
>> acpi0: sleep states S0 S3 S4 S5
>> acpi0: tables DSDT FACP SSDT SSDT CRAT CDIT UEFI MSDM BATB HPET APIC MCFG SBST WSMT IVRS FPDT SSDT SSDT SSDT UEFI SSDT
>> acpi0: wakeup devices GPP0(S3) GPP1(S3) GPP2(S3) GPP3(S3) GPP4(S3) GPP5(S3) GPP6(S3) GP17(S3) XHC0(S3) XHC1(S3) GP18(S3) LID_(S3) SLPB(S3)
>> acpitimer0 at acpi0: 3579545 Hz, 32 bits
>> acpihpet0 at acpi0: 14318180 Hz
>> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
>> cpu0 at mainbus0: apid 0 (boot processor)
>> cpu0: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx, 1996.61 MHz, 17-11-00
>> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
>> cpu0: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 8-way L2 cache, 4MB 64b/line 16-w