Looks like nouveau works fine when setting PIPE_CAP_USER_*_BUFFERS to false. I'm not sure why, as the u_vbuf code seems similarly busted -- e.g. it does
map -= vb->stride * min_index; But... perhaps I'm misreading/misunderstanding the code.When I fudge the code to force unroll_indices = true, it does in fact break. I'll send re-send this as a patch series which also fixes the u_vbuf code, I guess. On Wed, Jan 22, 2014 at 3:50 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote: > Do you think that the 32-bit multiply overflow when adding to the > pointer is the correct behaviour? If it causes radeon/whatever to > fail, I'd argue that they're adjusting their pointer incorrectly as > well. I'll see what's up by setting that cap to 0, and if that causes > the test to break, I'll figure out why and fix util/u_vbuf as well. > > On Wed, Jan 22, 2014 at 3:46 PM, Marek Olšák <mar...@gmail.com> wrote: >> The draw-elements-base-vertex-neg test passes on Radeon, which uses >> the common util/u_vbuf for uploading vertices. I know Nouveau is >> probably the only driver which doesn't use it, not counting the swrast >> drivers. I'm afraid that your change from fail to pass for Nouveau >> will break the test for everybody else. You can switch to using >> util/u_vbuf by reporting PIPE_CAP_USER_VERTEX_BUFFERS = 0. Then you >> will hit the same code path as Radeon. >> >> Marek >> >> On Wed, Jan 22, 2014 at 9:32 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote: >>> On Wed, Jan 22, 2014 at 3:27 PM, Marek Olšák <mar...@gmail.com> wrote: >>>> Does Nouveau still work if you report PIPE_CAP_USER_VERTEX_BUFFERS = 0? >>> >>> I'm not in front of a machine with nouveau, so I can't tell you right >>> now, but I'll test it out later tonight. Out of curiousity though, why >>> do you ask? Is it related to this patch, or just idle curiiousity on >>> your end? >>> >>>> >>>> Marek >>>> >>>> On Wed, Jan 22, 2014 at 3:37 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote: >>>>> This was discovered as a result of the draw-elements-base-vertex-neg >>>>> piglit test, which passes very negative offsets in, followed up by large >>>>> indices. The nouveau code correctly adjusts the pointer, but the >>>>> transfer code needs to do the proper inverse correction. Similarly fix >>>>> up the SSE code to do a 64-bit multiply to compute the proper offset. >>>>> >>>>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> >>>>> --- >>>>> >>>>> With this change, nouveau passes for the draw-elements-base-vertex-neg >>>>> piglit >>>>> test with user_varrays, on a 64-bit setup both with and without >>>>> GALLIUM_NOSSE=1. I'm pretty sure that the change should be minimal to a >>>>> non-x86 setup since the rexw will be a no-op. I guess there will be an >>>>> extra >>>>> register use for the mov, but it shouldn't be too expensive, esp on >>>>> anything >>>>> remotely current. >>>>> >>>>> src/gallium/auxiliary/translate/translate_generic.c | 2 +- >>>>> src/gallium/auxiliary/translate/translate_sse.c | 8 ++++++-- >>>>> 2 files changed, 7 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/src/gallium/auxiliary/translate/translate_generic.c >>>>> b/src/gallium/auxiliary/translate/translate_generic.c >>>>> index 5bf97db..5ffce32 100644 >>>>> --- a/src/gallium/auxiliary/translate/translate_generic.c >>>>> +++ b/src/gallium/auxiliary/translate/translate_generic.c >>>>> @@ -638,7 +638,7 @@ static ALWAYS_INLINE void PIPE_CDECL generic_run_one( >>>>> struct translate_generic * >>>>> } >>>>> >>>>> src = tg->attrib[attr].input_ptr + >>>>> - tg->attrib[attr].input_stride * index; >>>>> + (ptrdiff_t)tg->attrib[attr].input_stride * index; >>>>> >>>>> copy_size = tg->attrib[attr].copy_size; >>>>> if(likely(copy_size >= 0)) >>>>> diff --git a/src/gallium/auxiliary/translate/translate_sse.c >>>>> b/src/gallium/auxiliary/translate/translate_sse.c >>>>> index a78ea91..a72454a 100644 >>>>> --- a/src/gallium/auxiliary/translate/translate_sse.c >>>>> +++ b/src/gallium/auxiliary/translate/translate_sse.c >>>>> @@ -1121,7 +1121,9 @@ static boolean init_inputs( struct translate_sse *p, >>>>> x86_cmovcc(p->func, tmp_EAX, buf_max_index, cc_AE); >>>>> } >>>>> >>>>> - x86_imul(p->func, tmp_EAX, buf_stride); >>>>> + x86_mov(p->func, p->tmp2_EDX, buf_stride); >>>>> + x64_rexw(p->func); >>>>> + x86_imul(p->func, tmp_EAX, p->tmp2_EDX); >>>>> x64_rexw(p->func); >>>>> x86_add(p->func, tmp_EAX, buf_base_ptr); >>>>> >>>>> @@ -1207,7 +1209,9 @@ static struct x86_reg get_buffer_ptr( struct >>>>> translate_sse *p, >>>>> x86_cmp(p->func, ptr, buf_max_index); >>>>> x86_cmovcc(p->func, ptr, buf_max_index, cc_AE); >>>>> >>>>> - x86_imul(p->func, ptr, buf_stride); >>>>> + x86_mov(p->func, p->tmp2_EDX, buf_stride); >>>>> + x64_rexw(p->func); >>>>> + x86_imul(p->func, ptr, p->tmp2_EDX); >>>>> x64_rexw(p->func); >>>>> x86_add(p->func, ptr, buf_base_ptr); >>>>> return ptr; >>>>> -- >>>>> 1.8.3.2 >>>>> >>>>> _______________________________________________ >>>>> mesa-dev mailing list >>>>> mesa-dev@lists.freedesktop.org >>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev