On Thu, Jul 16, 2020 at 6:27 PM Alex Bennée <alex.ben...@linaro.org> wrote:
>
> Christian Ehrhardt <christian.ehrha...@canonical.com> writes:
>
> > On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan <bala...@eik.bme.hu> wrote:
> >
> >> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
> >> following it.
> >
> > Thank you Zoltan for pointing out this commit, I agree that this seems to be
> > the trigger for the issues I'm seeing. Unfortunately the common CI host size
> > is 1-2G. For example on Ubuntu Autopkgtests 1.5G.
> > Those of them running guests do so in 0.5-1G size in TCG mode
> > (as they often can't rely on having KVM available).
> >
> > The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
> > on memory pressure (never existed) makes these systems go OOM-Killing
> > the qemu process.
>
> Ooops. I admit the assumption was that most people running system
> emulation would be doing it on beefier machines.
>
> > The patches indicated that the TB flushes on a full guest boot are a
> > good indicator of the TB size efficiency. From my old checks I had:
> >
> > - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4
> >   TB flush count 14, 14, 16
> > - Qemu 5.0 512M guest with 1G default
> >   TB flush count 1, 1, 1
> >
> > I agree that ram/4 seems odd, especially on huge guests that is a lot
> > potentially wasted. And most environments have a bit of breathing
> > room, but 1G is too big in small host systems and the common CI system falls
> > into this category. So I tuned it down to 256M for a test.
> >
> > - Qemu 4.2 512M guest with tb-size 256M
> >   TB flush count 5, 5, 5
> > - Qemu 5.0 512M guest with tb-size 256M
> >   TB flush count 5, 5, 5
> > - Qemu 5.0 512M guest with 256M default in code
> >   TB flush count 5, 5, 5
> >
> > So performance wise the results are as much in-between as you'd think from a
> > TB size in between. And the memory consumption which (for me) is the actual
> > current issue to fix would be back in line again as expected.
>
> So I'm glad you have the workaround.
>
> > So on one hand I'm suggesting something like:
> > --- a/accel/tcg/translate-all.c
> > +++ b/accel/tcg/translate-all.c
> > @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
> >  * Users running large scale system emulation may want to tweak their
> >  * runtime setup via the tb-size control on the command line.
> >  */
> > -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
> > +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)
>
> The problem we have is any number we pick here is arbitrary. And while
> it did regress your use-case changing it again just pushes a performance
> regression onto someone else.

Thanks for your feedback Alex!

That is true "for you" since 5.0 is released from upstream's POV.
But from the downstream POV no 5.0 exists for Ubuntu yet and I'd break
many places releasing it like that. Sadly the performance gain for the
other cases will most likely go unnoticed.

> The most (*) 64 bit desktop PCs have 16Gb
> of RAM, almost all have more than 8gb. And there is a workaround.

Due to our work around virtualization the values representing "most
64 bit desktop PCs" aren't the only thing that matters :-) ...
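As an aside for anyone else hitting this before a new default lands:
the workaround amounts to pinning the TB cache size explicitly on the
command line. Assuming the current TCG accelerator property (value in
MiB), that is something like

  qemu-system-x86_64 -accel tcg,tb-size=256 ...

which is roughly what the "tb-size 256M" runs above used.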
> > This is a bit more tricky than it seems as ram_sizes is no more
> > present in that context but it is enough to discuss it.
> > That should serve all cases - small and large - better than a pure
> > static default of 1G or always ram/4?
>
> I'm definitely against re-introducing ram_size into the mix. The
> original commit (a1b18df9a4) that broke this introduced an ordering
> dependency which we don't want to bring back.

I agree with that reasoning, but currently, without any size dependency,
the "arbitrary value" we picked to be 1G is even more fixed than it was
before.
Compared to pre-v5.0, for now I can only decide to
a) tune it down -> performance impact for huge guests
b) keep it at 1G -> functional breakage on small hosts

> I'd be more amenable to something that took into account host memory and
> limited the default if it was smaller than a threshold. Is there a way
> to probe that that doesn't involve slurping /proc/meminfo?

I agree that a host-size dependency might be the better way to go, yet
I have no great cross-platform resilient way to get that.
Maybe we can make it like "if I can get some value consider it,
otherwise use the current default". That would improve many places
already, while keeping the rest at the current behavior.

> > P.S. I added Alex, being the author of the offending patch, and
> > Richard/Paolo for being listed in the Maintainers file for TCG.

From Zoltan (unifying the thread a bit):

> I agree that this should be dependent on host memory size not guest
> ram_size but it might be tricky to get that value because different host
> OSes would need different ways.

Well - where it isn't available we will continue to take the default
qemu 5.0 already had. If required on other platforms as well they can
add their way of host memory detection into this as needed.

> Maybe a new qemu_host_mem_size portability
> function will be needed that implements this for different host OSes.
> POSIX may or may not have sysconf _SC_PHYS_PAGES and _SC_AVPHYS_PAGES

We should not try to get into the business of _SC_AVPHYS_PAGES and try
to understand/assume what might be cache or otherwise (re)usable.
Since we only look for some alignment to the host's size, _SC_PHYS_PAGES
should be good enough and available in more places than the other
options.

> and linux has sysinfo but don't know how reliable these are.

sysconf is slightly more widely available than sysinfo and has enough
for what we need.

I have combined the thoughts above into a patch and it works well in
my tests.
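To make the idea concrete, here is a minimal sketch of the approach -
not the actual patch. The helper name and the /8 fraction are only an
illustration inferred from the max_default values printed below; the
real change is in accel/tcg/translate-all.c:

  /*
   * Sketch only: cap the default code_gen buffer by a fraction of host
   * memory where we can probe it, and keep the existing 1 GiB default
   * everywhere else. The /8 divisor is an assumption inferred from the
   * max_default values shown below.
   */
  #include <stddef.h>
  #include <stdint.h>
  #include <unistd.h>

  /* 1 GiB, as in translate-all.c */
  #define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1024 * 1024 * 1024ULL)

  static size_t default_code_gen_buffer_size(void)
  {
      size_t tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE_1;

  #if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
      long pages = sysconf(_SC_PHYS_PAGES);
      long pagesize = sysconf(_SC_PAGESIZE);

      if (pages > 0 && pagesize > 0) {
          /* Don't let the default grab more than ~1/8 of host RAM. */
          uint64_t max_default = (uint64_t)pages * (uint64_t)pagesize / 8;
          if (max_default < tb_size) {
              tb_size = max_default;
          }
      }
  #endif
      /* Hosts where the probe is unavailable keep the 1 GiB default. */
      return tb_size;
  }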
32G Host:
  pages 8187304.000000 pagesize 4096.000000 max_default 4191899648
  final tb_size 1073741824

(qemu) info jit
Translation buffer state:
gen code size       210425059/1073736659
TB count            368273
TB avg target size  20 max=1992 bytes
TB avg host size    330 bytes (expansion ratio: 16.1)
cross page TB count 1656 (0%)
direct jump count   249813 (67%) (2 jumps=182112 49%)
TB hash buckets     197613/262144 (75.38% head buckets used)
TB hash occupancy   34.15% avg chain occ. Histogram: [0,10)%|▆ █ ▅▁▃▁▁|[90,100]%
TB hash avg chain   1.020 buckets. Histogram: 1|█▁▁|3

Statistics:
TB flush count      1
TB invalidate count 451673
TLB full flushes    0
TLB partial flushes 154819
TLB elided flushes  191627
[TCG profiler not compiled]

=> 1G TB size not changed compared to v5.0 - as intended

But on a small 1.5G host it now works without OOM:
  pages 379667.000000 pagesize 4096.000000 max_default 194389504
  final tb_size 194389504

(qemu) info jit
Translation buffer state:
gen code size       86179731/194382803
TB count            149995
TB avg target size  20 max=1992 bytes
TB avg host size    333 bytes (expansion ratio: 16.5)
cross page TB count 716 (0%)
direct jump count   98854 (65%) (2 jumps=74962 49%)
TB hash buckets     58842/65536 (89.79% head buckets used)
TB hash occupancy   51.46% avg chain occ. Histogram: [0,10)%|▃ ▇ █▂▆▁▄|[90,100]%
TB hash avg chain   1.091 buckets. Histogram: 1|█▁▁|3

Statistics:
TB flush count      10
TB invalidate count 31733
TLB full flushes    0
TLB partial flushes 180891
TLB elided flushes  244107
[TCG profiler not compiled]

=> ~185M which is way more reasonable given the host size

The patch will have a rather large comment in it; I'm not sure the full
comment is needed, but I wanted to leave a trace of what and why is
going on in the code for the next one who comes by.
Submitting as a proper patch to the list in a bit ...

> --
> Alex Bennée

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd