Christian Ehrhardt <christian.ehrha...@canonical.com> writes:
> On Thu, Jul 16, 2020 at 6:27 PM Alex Bennée <alex.ben...@linaro.org> wrote:
>>
>> Christian Ehrhardt <christian.ehrha...@canonical.com> writes:
>>
>> > On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan <bala...@eik.bme.hu> wrote:
>> >
>> >> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
>> >> following it.
>> >
>> > Thank you Zoltan for pointing out this commit, I agree that it seems
>> > to be the trigger for the issues I'm seeing. Unfortunately the common
>> > CI host size is 1-2G, for example 1.5G on Ubuntu Autopkgtests.
>> > Those of them running guests do so at 0.5-1G guest size in TCG mode
>> > (as they often can't rely on having KVM available).
>> >
>> > The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
>> > on memory pressure (which never existed) makes these systems go
>> > OOM-killing the qemu process.
>>
>> Ooops. I admit the assumption was that most people running system
>> emulation would be doing it on beefier machines.
>>
>> > The patches indicated that the number of TB flushes on a full guest
>> > boot is a good indicator of TB size efficiency. From my old checks I had:
>> >
>> > - Qemu 4.2, 512M guest, 32M default overwritten by ram-size/4:
>> >   TB flush count 14, 14, 16
>> > - Qemu 5.0, 512M guest, 1G default:
>> >   TB flush count 1, 1, 1
>> >
>> > I agree that ram/4 seems odd; especially on huge guests that is a lot
>> > of potentially wasted memory. Most environments have a bit of breathing
>> > room, but 1G is too big on small host systems, and the common CI system
>> > falls into this category. So I tuned it down to 256M for a test.
>> >
>> > - Qemu 4.2, 512M guest, tb-size 256M:
>> >   TB flush count 5, 5, 5
>> > - Qemu 5.0, 512M guest, tb-size 256M:
>> >   TB flush count 5, 5, 5
>> > - Qemu 5.0, 512M guest, 256M default in code:
>> >   TB flush count 5, 5, 5
>> >
>> > So performance-wise the results are as much in between as you'd expect
>> > from a TB size in between.
>> > And the memory consumption, which (for me) is the actual current issue
>> > to fix, would be back in line again as expected.
>>
>> So I'm glad you have the workaround.
>>
>> > So on one hand I'm suggesting something like:
>> >
>> > --- a/accel/tcg/translate-all.c
>> > +++ b/accel/tcg/translate-all.c
>> > @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
>> >   * Users running large scale system emulation may want to tweak their
>> >   * runtime setup via the tb-size control on the command line.
>> >   */
>> > -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
>> > +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)
>>
>> The problem we have is that any number we pick here is arbitrary. And
>> while it did regress your use case, changing it again just pushes a
>> performance regression onto someone else.
>
> Thanks for your feedback Alex!
>
> That is true "for you", since 5.0 is released from upstream's POV.
> But from the downstream POV no 5.0 exists for Ubuntu yet, and I'd break
> many places releasing it like that.
> Sadly, the performance gain for the other cases will most likely go
> unnoticed.
>
>> Most (*) 64-bit desktop PCs have 16GB of RAM, and almost all have more
>> than 8GB. And there is a workaround.
>
> Due to our work around virtualization, the values representing
> "most 64-bit desktop PCs" aren't the only thing that matters :-)
>
> ...
>
>> > This is a bit more tricky than it seems, as ram_size is no longer
>> > present in that context, but it is enough to discuss it.
>> > Wouldn't that serve all cases - small and large - better than a pure
>> > static default of 1G or always ram/4?
>>
>> I'm definitely against re-introducing ram_size into the mix. The
>> original commit (a1b18df9a4) that broke this introduced an ordering
>> dependency which we don't want to bring back.
>
> I agree with that reasoning, but currently, without any size dependency,
> the "arbitrary value" we picked to be 1G is even more fixed than it was
> before.
> Compared to pre-v5.0, for now I can only decide to:
> a) tune it down -> performance impact for huge guests
> b) keep it at 1G -> functional breakage on small hosts
>
>> I'd be more amenable to something that took into account host memory and
>> limited the default if it was smaller than a threshold. Is there a way
>> to probe that that doesn't involve slurping /proc/meminfo?
>
> I agree that a host-size dependency might be the better way to go,
> yet I have no great cross-platform, resilient way to get that.
> Maybe we can make it "if I can get some value, consider it;
> otherwise use the current default".
> That would improve many places already, while keeping the rest at the
> current behavior.
>
>> > P.S. I added Alex for being the author of the offending patch and
>> > Richard/Paolo for being listed in the MAINTAINERS file for TCG.
>
> From Zoltan (unifying the thread a bit):
>
>> I agree that this should be dependent on host memory size, not guest
>> ram_size, but it might be tricky to get that value because different
>> host OSes would need different ways.
>
> Well - where it isn't available we will continue to take the default
> qemu 5.0 already had. If required on other platforms as well, they can
> add their way of host memory detection into this as needed.
>
>> Maybe a new qemu_host_mem_size portability function will be needed that
>> implements this for different host OSes. POSIX may or may not have
>> sysconf _SC_PHYS_PAGES and _SC_AVPHYS_PAGES
>
> We should not try to get into the business of _SC_AVPHYS_PAGES and
> try to understand/assume what might be cache or otherwise (re)usable.
> Since we only look for some alignment to the host size, _SC_PHYS_PAGES
> should be good enough and available in more places than the other options.
>
>> and Linux has sysinfo, but I don't know how reliable these are.
>
> sysconf is slightly more widely available than sysinfo and has enough
> for what we need.
> I have combined the thoughts above into a patch and it works well in
> my tests.
>
> 32G host:
> pages 8187304.000000
> pagesize 4096.000000
> max_default 4191899648
> final tb_size 1073741824
>
> (qemu) info jit
> Translation buffer state:
> gen code size 210425059/1073736659
> TB count 368273
> TB avg target size 20 max=1992 bytes
> TB avg host size 330 bytes (expansion ratio: 16.1)
> cross page TB count 1656 (0%)
> direct jump count 249813 (67%) (2 jumps=182112 49%)
> TB hash buckets 197613/262144 (75.38% head buckets used)
> TB hash occupancy 34.15% avg chain occ. Histogram: [0,10)%|▆ █ ▅▁▃▁▁|[90,100]%
> TB hash avg chain 1.020 buckets. Histogram: 1|█▁▁|3
>
> Statistics:
> TB flush count 1
> TB invalidate count 451673
> TLB full flushes 0
> TLB partial flushes 154819
> TLB elided flushes 191627
> [TCG profiler not compiled]
>
> => 1G TB size not changed compared to v5.0 - as intended
>
> But on a small 1.5G host it now works without OOM:
>
> pages 379667.000000
> pagesize 4096.000000
> max_default 194389504
> final tb_size 194389504
>
> (qemu) info jit
> Translation buffer state:
> gen code size 86179731/194382803
> TB count 149995
> TB avg target size 20 max=1992 bytes
> TB avg host size 333 bytes (expansion ratio: 16.5)
> cross page TB count 716 (0%)
> direct jump count 98854 (65%) (2 jumps=74962 49%)
> TB hash buckets 58842/65536 (89.79% head buckets used)
> TB hash occupancy 51.46% avg chain occ. Histogram: [0,10)%|▃ ▇ █▂▆▁▄|[90,100]%
> TB hash avg chain 1.091 buckets. Histogram: 1|█▁▁|3
>
> Statistics:
> TB flush count 10
> TB invalidate count 31733
> TLB full flushes 0
> TLB partial flushes 180891
> TLB elided flushes 244107
> [TCG profiler not compiled]
>
> => ~185M, which is way more reasonable given the host size
>
> The patch will have a rather large comment in it. I'm not sure the full
> comment is needed, but I wanted to leave a trace of what and why is going
> on in the code for the next one who comes by.
> Submitting as a proper patch to the list in a bit ...

Ahh, did you not get CC'ed by my patch this morning?

-- 
Alex Bennée