About VLIW backend
Hi,

I wonder whether any effort has been made to retarget GCC to a VLIW backend. Is there any project trying to do that? Is it included in mainline GCC? Thanks.

Regards,
Li Wang
Re: About VLIW backend
Hi,

I know that. But I am talking about a more _pure_ VLIW architecture, one that relies entirely on static scheduling, rather than an EPIC architecture. Thanks.

Regards,
Li Wang

> Li Wang wrote:
>> Hi, I wonder if any efforts have been made to retarget GCC to VLIW
>> backend. Is there any project trying to do that? Is it included in the
>> GCC mainstream? Thanks.
>
> the ia64 is a VLIW architecture!
Re: How to describe function units allocation
Hi,

Thanks. As you know, I am trying to retarget GCC to a somewhat different VLIW backend, starting from an understanding of the TMS320C6x port code. I now know that functional unit allocation can be done in the assembler. However, I am still interested in whether it is possible to do it by modifying only cc1, without involving the assembler (gas). Can it be achieved by coding only the .md, .h and .c files?

Regards,
Li Wang

>> Hi,
>>
>> For the TI DSP TMS320C6x backend, there are four types of functional
>> units: .L, .M, .S and .D. Each type consists of two units, named .X1
>> and .X2 respectively (for example, .L1 and .L2), so there are 8 units
>> in total. Apart from the .M units, which serve only for multiplication,
>> the units share many functions. For example, they all support 32-bit
>> arithmetic operations. In the assembly, the functional unit used to
>> perform an operation must be indicated explicitly. For example,
>>
>>     ADD .S1 A0, A1, A2
>>     ADD .L1 A0, A1, A2
>>     ADD .D1 A0, A1, A2
>>
>> achieve the same goal using different units. So, when producing
>> assembly, a functional unit allocation step somewhat like register
>> allocation is needed. I wonder how I can describe this relationship in
>> the machine description file. Do I need to write a functional unit
>> allocation algorithm myself, or is it done by a general-purpose
>> allocation algorithm embedded in GCC, as with register allocation, so
>> that I only need to give some architecture descriptions? Thanks in
>> advance for your kind assistance.
>
> IMHO, the functional units that accompany the assembly instruction are
> optional. However, for c6x-gcc the reason cc1 doesn't allocate
> functional units is that the assembler (as part of the c6x binutils)
> does the functional unit allocation on its own. There are some notes
> about how the assembler does this in "Extending the GNU Assembler for
> Texas Instruments TMS320C6x-DSP.pdf".
>
> HTH,
> Pranav
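On the question above of whether cc1 could pick the units itself: the thread does not say how c6x gas does its allocation, so the following is only a minimal sketch of one possible approach, a first-fit assignment within a single execute packet. The names `alloc_unit` and `unit_name`, the bit encoding of the units, and the restriction to side 1 of the machine are all assumptions made for illustration. The only facts taken from the messages are that ADD can go to .L, .S or .D, and that .M handles multiplication.

```c
/* Hypothetical encoding of the side-1 units of a C6x-like machine. */
enum { UNIT_L = 1 << 0, UNIT_S = 1 << 1, UNIT_D = 1 << 2, UNIT_M = 1 << 3 };

/* First-fit allocation for one instruction in the current packet.
   `allowed` is the set of units able to execute the instruction,
   `*busy` is the set already taken in this packet.
   Returns the chosen unit bit and marks it busy, or returns 0 if
   every allowed unit is taken (the instruction must wait a cycle). */
static int alloc_unit(int allowed, int *busy)
{
    for (int bit = UNIT_L; bit <= UNIT_M; bit <<= 1) {
        if ((allowed & bit) && !(*busy & bit)) {
            *busy |= bit;
            return bit;
        }
    }
    return 0;
}

/* Assembly suffix for a unit bit, as it would be printed after the mnemonic. */
static const char *unit_name(int unit)
{
    switch (unit) {
    case UNIT_L: return ".L1";
    case UNIT_S: return ".S1";
    case UNIT_D: return ".D1";
    case UNIT_M: return ".M1";
    default:     return "";
    }
}
```

With `busy` starting at 0, three ADD instructions in the same packet would receive .L1, .S1 and .D1 in that order, and a fourth would get 0 and have to move to the next packet. A real allocator would also have to consider both sides of the machine and the cross paths, which this sketch ignores.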
Re: How to let GCC produce flat assembly
Hi,

I may need to explain this problem more clearly. Consider a backend that runs as a coprocessor to a host processor, in the way a GPU does: it incorporates a large number of ALUs, processes only arithmetic and a few other simple operations, and runs in VLIW fashion to accelerate the host processor. Call this coprocessor a 'raw processor'. Note that I do not mean a GPU; a GPU is similar in mechanism but more complex than this. The coprocessor has a simple ISA and no dedicated registers such as ESP and EBP to support function calls. It fetches VLIW instructions from instruction memory one by one and executes them. If I want GCC to produce assembly for it, how should I write the machine description file? Should I first let cc1 produce ELF assembly for it, and then let binutils truncate it to flat assembly? That seems an ugly hack. Thanks.

Regards,
Li Wang

> Li Wang wrote:
>
>> Hi,
>> I wonder how to let GCC produce flat assembly, say, just like the .com
>> file under the DOS, without function calls and complicated executable
>> file headers, only instructions. How to modify the machine description
>> file to achieve that? Thanks in advance.
>
> Perhaps you are asking on the wrong list.
>
> And what exactly do you want to achieve and why?
>
> What is your target system?
>
> Why is using (and appropriately configuring) the binutils (in particular
> its linker, ld, implicitly invoked by gcc) not appropriate for your
> needs? I am sure that you can configure it appropriately (binutils is
> very powerful).
>
> You still will need other generated data than the instructions.
> Typically, constants such as strings. And much other stuff.
How to let GCC produce flat assembly
Hi,

I wonder how to let GCC produce flat assembly, say, just like a .com file under DOS, without function calls and complicated executable file headers, only instructions. How should the machine description file be modified to achieve that? Thanks in advance.

Regards,
Li Wang
Re: How to let GCC produce flat assembly
Hi,

Thanks for your attention and response. I think I have still not described accurately what I want to do; I was too anxious and explained it far from clearly. Permit me to use a simple example. For the simple C program below, compiled by cc1 targeting the x86 platform, the assembly is as follows.

    int main()
    {
        int a, b, c;
        a = 2;
        b = 2;
        c = a + b;
        return 0;
    }

        .file   "test.c"
        .text
    .globl main
        .type   main,@function
    main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        andl    $-16, %esp
        movl    $0, %eax
        subl    %eax, %esp
        movl    $2, -4(%ebp)
        movl    $2, -8(%ebp)
        movl    -8(%ebp), %eax
        addl    -4(%ebp), %eax
        movl    %eax, -12(%ebp)
        movl    $0, %eax
        leave
        ret

As you said, if the coprocessor has no ABI describing a stack and a function interface, then inlining applies. But how could I inline 'main'? I am also sorry for misusing the term 'elf assembly'. What I actually mean is how to omit the sections and any other information that helps the linker organize an executable from the cc1 output. In short, code like the following is what I want. Is it possible to let cc1 produce such assembly? Thanks.

        movl    $2, -4(%ebp)
        movl    $2, -8(%ebp)
        movl    -8(%ebp), %eax
        addl    -4(%ebp), %eax

Regards,
Li Wang

> On Thu, Nov 15, 2007 at 04:20:49PM -0800, Li Wang wrote:
>> I may need explain this problem more clearly.
>
> Yes, my earlier message directing you to gcc-help was because I thought
> you didn't grasp what the compiler should do and what the linker should
> do; sorry about that.
>
>> For a backend which runs as coprocessor to a host processor, such as
>> GPU, which incorporates large numbers of ALUs and processes only
>> arithmetic operations and some other simple operations, runs in VLIW
>> pattern to accelerate the host processor. Say, this coprocessor is
>> referred as 'raw processor', note, I don't mention GPU, GPU is similar
>> in mechanism but more complex than this. It owns simple ISA, and has no
>> dedicated ESP, EBP to support function call.
>
> But those registers aren't dedicated to support function calls on the
> x86 except by convention.
>
> If your coprocessor has no ABI to describe a stack and a function
> interface, you need to invent one, so that you can do function calls.
> gcc can inline the calls where it makes sense, and the scores can be
> adjusted so that a lot of inlining happens if your stack is inefficient.
>
>> If I want to let GCC produce assembly for it, how should I code the
>> machine description file? Should I first let cc1 produce a elf assembly
>> for it, and then let binutils truncate it to a flat assembly? It seems
>> ugly hacking. Thanks.
>
> gcc produces assembler code. as turns it into object code. ld links to
> form an executable. That's the way that it works.
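The remark above that gcc can inline calls can be illustrated with GCC's `always_inline` function attribute. This is a minimal sketch, not something proposed in the thread: the helper `add_pair` and the function `kernel` are hypothetical names, and the example only shows how a helper can be folded into its caller so that no call or return is needed for it.

```c
/* Force inlining of the helper, so the caller needs no call/return
   sequence for it even when optimization is off. */
static inline __attribute__((always_inline)) int add_pair(int a, int b)
{
    return a + b;
}

/* Hypothetical kernel: after inlining, the body is straight-line code
   equivalent to the a = 2; b = 2; c = a + b example above. */
int kernel(void)
{
    int a = 2;
    int b = 2;
    return add_pair(a, b);
}
```

This removes only the call to the helper. The outermost function still receives whatever prologue and epilogue the backend generates, which is the part the question about inlining 'main' is concerned with.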
Re: How to let GCC produce flat assembly
Dave Korn wrote:
> On 16 November 2007 05:56, Li Wang wrote:
>> As you said, the coprocessor has no ABI to describe a stack and a
>> function interface, then inline applies. But how could I inline 'main'?
>> And I am sorry for I misuse the word 'elf assembly', what exactly I
>> mean by that is how to omit the section or any other informations helps
>> linker to organize a executable from the cc1 output. In a word, codes
>> something like the following is what I want, If possible to let cc1
>> produce such assembly? Thanks.
>>
>>     movl    $2, -4(%ebp)
>>     movl    $2, -8(%ebp)
>>     movl    -8(%ebp), %eax
>>     addl    -4(%ebp), %eax
>
> Various CPU backends (but IIRC not i386) implement a "naked" function
> attribute, which suppresses function epilogue and prologue generation.
> You could implement something like that.

That seems to be what I want. Could you please give more clues? Which backend has it, and where can I find that "naked" function attribute? Thanks.

> cheers,
> DaveK

Regards,
Li Wang
Generating code for something like a stack/dataflow computer
Hi,

We are retargeting GCC to a VLIW chip that runs as a coprocessor to a general-purpose processor. The coprocessor is responsible for accelerating code sections that have good parallel characteristics and no dependences. Its ISA allows it to fetch data only sequentially, not by random access, from an on-chip memory shared with the host processor, through dedicated functional units named DBx. The host processor is responsible for placing the data there and telling each DBx the base address and data length. Once data is fetched by the coprocessor, it is stored in the coprocessor's local registers and stays there until the computation ends. In other words, there are no spills, and the hardware permits none. From the coprocessor's standpoint, the instructions support no memory operands and no addressing modes, only register moves and arithmetic operations. It looks something like a dataflow computer or a stack computer. Let's take the following code as an example:

    void compute(int a[], int b[], int c[]);

    int main()
    {
        int a[16], b[16], c[16];
        compute(a, b, c);
        return 0;
    }

    void compute(int a[], int b[], int c[])
    {
        for (int j = 0; j < 16; j++)
            c[j] = a[j] + b[j];
        return;
    }

We want the function compute() to execute on the coprocessor, while the host processor organizes the data, places it at the proper positions in the on-chip memory, and prepares the DBx functional units. Assume DB0 is allocated to array a[], DB1 to b[], and DB2 to c[]. Then the assembly code we want to generate for the coprocessor looks as follows:

    L3: if (data in DB0 not exhausted) goto L1; else goto L2;
    L1: get R0, DB0;      // load a data item from the on-chip memory through DB0 into R0
        get R1, DB1;
        add R2, R0, R1;
        put R2, DB2;      // store the result to DB2
        goto L3;
    L2: end;

Could anyone give some hints on how to implement this? Can the current GCC internals for addressing modes in the machine description support it?

Li
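To make the intended execution model concrete, here is a small host-side C simulation of the loop above. It is only a sketch under assumptions: the message does not define the DBx hardware beyond "base address and data length", so `struct db`, `db_exhausted`, `db_get`, `db_put` and `run_kernel` are hypothetical names, and each DBx unit is modeled simply as a sequential cursor over an array. It says nothing about how GCC would generate such code.

```c
#include <stddef.h>

/* Hypothetical model of one DBx unit: the host supplies a base address
   and a length; the coprocessor can only step through it in order. */
struct db {
    int   *base;
    size_t len;
    size_t pos;
};

/* Condition tested at label L3. */
static int db_exhausted(const struct db *d)
{
    return d->pos >= d->len;
}

/* "get Rn, DBx": read the next element and advance the cursor. */
static int db_get(struct db *d)
{
    return d->base[d->pos++];
}

/* "put Rn, DBx": write the next element and advance the cursor. */
static void db_put(struct db *d, int value)
{
    d->base[d->pos++] = value;
}

/* Mirrors the L3/L1/L2 loop. Assumes db1 and db2 are at least as long
   as db0, since only db0 is tested for exhaustion. */
static void run_kernel(struct db *db0, struct db *db1, struct db *db2)
{
    while (!db_exhausted(db0)) {     /* L3 */
        int r0 = db_get(db0);        /* L1: get R0, DB0 */
        int r1 = db_get(db1);        /*     get R1, DB1 */
        int r2 = r0 + r1;            /*     add R2, R0, R1 */
        db_put(db2, r2);             /*     put R2, DB2 */
    }                                /* L2: end */
}
```

In this model the host would fill a[] and b[], build three `struct db` values with length 16 and `pos` set to 0, and call `run_kernel`. Every value lives in a local variable between `db_get` and `db_put`, which corresponds to the requirement that data stay in registers with no spills.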