> The generic vector types (used with the vector_size attribute) could be
> seen as the beginnings of such an abstract layer.

Yes, this is very likely going to be the starting point. I'm sorry that I did not mention it in my first email. There may be alternative ideas about how such a layer should look, but this is the most obvious one.

Basically, generic vectors support only a restricted set of operations: +, -, *, /, &, |, ^, ~

Indexing, as you already said, is not supported. I know about this patch, but the question is what the most efficient way to implement it would be. Do we always want to return the value or the memory address of a particular vector element, or can we sometimes optimize a set of operations using vector shifts and vector masks, so that the element stays inside the vector? Sometimes that could be faster.

You cannot compare two vectors, although there are built-in instructions for that.

You cannot do shifts within a vector, which could be very useful.

Sometimes the generic vector extension just fails, producing code that causes a segmentation fault. For example:

    #include <stdio.h>

    #define N 1024

    typedef short __attribute__((vector_size(16))) v8hi;

    short a[N];
    v8hi *pa = (v8hi *)a, *pvt;
    v8hi va;

    int main(int argc, char *argv[])
    {
        FILE *f;
        int i, var;

        /* Read N integers from the file named on the command line. */
        if (argc < 2 || (f = fopen(argv[1], "r")) == NULL)
            return 1;
        for (i = 0; i < N; i++) {
            fscanf(f, "%i", &var);
            a[i] = (short) var;
        }

        printf("Before the assignment\n");
        va = *((v8hi *)&(a[0]));
        /* &a[3] is only 2-byte aligned, so this vector store is misaligned. */
        pvt = ((v8hi *)&(a[3]));
        *pvt = va;
        printf("After the assignment\n");

        for (i = 0; i < 20; i++) {
            printf("%i ", a[i]);
        }
        printf("\n");

        fclose(f);
        return 0;
    }

On the Intel architecture, all the vector assignments are converted into the "movdqa" instruction, which works only if the memory operand is 16-byte aligned. That is not the case in this example, because the store goes to &a[3]. The compiler cannot figure this out and produces code that causes a segmentation fault. If you compile the same code on an architecture without SIMD support, it works fine. It is surely not a very serious bug, but it makes generic vector support hard to use.

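For what it is worth, the misaligned store in the example above can be avoided in user code by copying through memcpy, which makes no alignment assumption. The following is only a minimal sketch of that workaround, not a proposal for how the extension itself should behave; load_unaligned and store_unaligned are hypothetical helper names, not GCC built-ins.

```c
#include <string.h>

typedef short __attribute__((vector_size(16))) v8hi;

/* Hypothetical helper: read 8 shorts from an address that is not
   known to be 16-byte aligned.  memcpy does not assume alignment,
   so the compiler cannot rely on an aligned-only move here. */
static v8hi load_unaligned(const short *p)
{
    v8hi v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: write 8 shorts to a possibly misaligned address. */
static void store_unaligned(short *p, v8hi v)
{
    memcpy(p, &v, sizeof v);
}
```

With these helpers, the assignment from the example would read va = load_unaligned(&a[0]); store_unaligned(&a[3], va); and the element-wise operators (for instance va + va) keep working on the vector value in between.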
Reduction operations are not supported: you cannot sum over the elements of a vector. Some architectures support this feature as well.

Permutation of elements within a vector.

Saturated arithmetic. I am not an expert in that field, though. I mean that I do not know what kind of instructions each architecture provides for saturated arithmetic, but I think it would not be hard to find out.

And some more.

The question is what should be done at the first stage. It is surely a very big project, not one for a single summer. Depending on the taste of the mentor, different things could be done at the beginning.

Are you interested in mentoring this project?

--
Artem Shinkarov