Richard Henderson writes:

> On 12/22/2016 10:35 AM, Lluís Vilanova wrote:
>> To handle both issues, this series replicates the shared physical TB cache,
>> creating a separate physical TB cache for every combination of event states
>> (those with the 'vcpu' and 'tcg' properties). Then, all vCPUs tracing the
>> same events will use the same physical TB cache.
> Why do we need to "split the physical TB cache" as opposed to simply including
> the trace state into the TB hash function?

Mmmm, that's an interesting alternative I did not consider. Are you aiming to
minimize the changes, or do you also think it would be more efficient?

The dynamic tracing state would then be an arbitrarily long bitmap (its length
is defined by the number of events with the 'vcpu' property), so I'm not sure
how to fit it into the hashing function with minimal collisions (the bitmap is
currently limited to an unsigned long so that it can be used as an index into
the TB cache "matrix").

The other drawback I see is that computing the hash function would take longer
than the simpler array indexing. As a benefit, workloads with a high frequency
of TB-flushing operations might be a bit faster (there would be a single QHT).

If someone can provide me with the code for a modified hash lookup function that
accounts for the contents of the trace dstate bitmap, I will integrate it and
measure whether there is any significant change in performance.

Cheers,
  Lluis
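
For reference, a minimal sketch of one way an arbitrarily long dstate bitmap
could be folded into an existing 32-bit TB hash. This is an illustration under
assumptions, not QEMU code: the names dstate_fold, tb_hash_with_dstate and
base_hash are hypothetical, and the final mixing step is borrowed from the
32-bit finalizer of MurmurHash3 rather than from whatever hash the TB lookup
actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold a bitmap of any length into 32 bits.
 * Each word is split into its low and high halves (the high half is
 * zero when unsigned long is 32 bits wide) and XORed into a rotating
 * accumulator, so the position of a word affects the result.
 */
static uint32_t dstate_fold(const unsigned long *bitmap, size_t nwords)
{
    uint32_t h = 0;

    assert(bitmap != NULL || nwords == 0);
    for (size_t i = 0; i < nwords; i++) {
        uint64_t w = bitmap[i];

        h = (h << 5) | (h >> 27);              /* rotate left by 5 */
        h ^= (uint32_t)w ^ (uint32_t)(w >> 32);
    }
    return h;
}

/*
 * Hypothetical helper: combine an already computed TB hash (base_hash)
 * with the folded dstate, then apply the MurmurHash3 32-bit finalizer
 * so that single-bit differences spread over the whole result.
 */
static uint32_t tb_hash_with_dstate(uint32_t base_hash,
                                    const unsigned long *bitmap,
                                    size_t nwords)
{
    uint32_t h = base_hash ^ dstate_fold(bitmap, nwords);

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}
```

With a scheme like this the extra cost per lookup is one pass over the bitmap
words. The hash only selects a bucket, so the lookup's comparison function
would presumably still have to compare the full dstate bitmap of each candidate
TB to rule out collisions.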