MIPSpro N32 ABI Handbook
(document number: 007-2816-005 / published: 2002-11-19)
Chapter 4. N32 Examples and Case Studies
This chapter provides examples and case studies of
programs that have been converted from o32 to n32. Each step in the conversion
is presented and examined in detail.
An examination of the following sample application,
app1, illustrates the steps necessary to port from o32 to n32.
As you can see, app1 is trivial in functionality, but
it is constructed to illustrate several of the issues involved in converting
code from o32 to n32.
The app1 application contains the following files:
main.c, which contains the function
main().
foo.c, which
contains foo(), a varargs function.
gp.s, which contains the assembly language
leaf routine, get_gp(). This function returns the value
of the global pointer register ($gp).
regs.s, which contains
the assembly language function regs(). This function is
linked separately into its own DSO. The function regs()
returns the value of $gp, the return address register (
$ra), and the stack pointer ($sp). This function
also calls the libc routines malloc() and free(),
calculates the sum of two double precision values (one passed by value,
the other by address), and returns the sum through a pointer that is also
passed to it as an argument.
Figure 4-1 shows a call tree for the
app1 program. It illustrates that main() calls
get_gp(), foo() and printf().
The function foo() calls regs()
and printf(), while regs() calls
malloc() and free(). The figure also shows that
app1 is linked against two shared objects, libc.so
and regs.so.
The source code for the original versions of main.c,
foo.c, gp.s, and regs.s
is shown in the following:
/* main.c */
extern void foo();
main()
{
    unsigned gp,ra,sp, get_regs();
    double d1 = 1.0;
    double d2 = 2.0;
    double res;
    gp = get_gp();
    printf("gp is 0x%x\n", gp);
    foo(7, 3.14, &gp, &ra,
        &sp, d1, &d2, &res);
}
/* foo.c */
#include <stdarg.h>
void foo(int narg, ...)
{
    va_list ap;
    double d1;
    double daddr1, *daddr2, *resaddr;
    unsigned *gp, *ra, *sp;
    va_start(ap, narg);
    printf("Number of Arguments is: %d\n",narg);
    d1 = va_arg(ap, double);
    printf("%e\n",d1);
    gp = va_arg(ap, unsigned*);
    ra = va_arg(ap, unsigned*);
    sp = va_arg(ap, unsigned*);
    daddr1 = va_arg(ap, double);
    daddr2 = va_arg(ap, double*);
    resaddr = va_arg(ap, double*);
    printf("first double precision argument is %e\n",daddr1);
    printf("second double precision argument is %e\n",*daddr2);
    regs(gp, ra, sp, daddr1, daddr2, resaddr);
    printf("Back from assembly routine\n");
    printf("gp is 0x%x\n",*gp);
    printf("ra is 0x%x\n",*ra);
    printf("sp is 0x%x\n",*sp);
    printf("result of double precision add is %e\n",*resaddr);
    va_end(ap);
}
/* gp.s */
#include <regdef.h>
#include <asm.h>
LEAF(get_gp)
move v0, gp
j ra
.end get_gp
/* regs.s */
#include <regdef.h>
.text
.globl regs # make regs external
.ent regs 2
regs:
.set noreorder
.cpload t9 # setup gp
.set reorder
subu sp, 32 # create stack frame
sw ra, 28(sp) # save return address
.cprestore 24 # for caller saved gp
# save gp 24(sp)
sw gp, 0(a0) # return gp in first arg
sw ra, 0(a1) # return ra in second arg
sw sp, 0(a2) # return sp in third arg
li a0, 1000 # call libc routines
jal malloc # for illustrative purposes
move a0, v0 # to make regs
jal free # a nested function
lw t0, 56(sp) # get fifth argument from stack
lwc1 $f4, 4(t0) # load it in fp register
lwc1 $f5, 0(t0) # fp values are stored in LE
# format
lwc1 $f6, 52(sp) # get fourth argument from stack
lwc1 $f7, 48(sp) # fp values are stored in LE
# format
add.d $f8, $f4, $f6 # do the calculation
lw t0, 60(sp) # get the sixth argument
# from the stack
swc1 $f8, 4(t0) # save the result
swc1 $f9, 0(t0) # fp values are stored in LE
lw ra, 28(sp) # get return address
addu sp, 32 # pop stack
j ra # return to caller
.end regs
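For readers less familiar with MIPS assembly, the data flow of regs() can be approximated in C. This is only a sketch: the name regs_model is hypothetical, and because portable C cannot read $gp, $ra, or $sp, the three stores of those registers are left out.

```c
#include <stdlib.h>

/*
 * Hypothetical C approximation of the work done by regs() in regs.s.
 * The stores of $gp, $ra, and $sp are omitted because portable C
 * has no way to read those registers.
 */
void regs_model(double d1, const double *d2, double *res)
{
    void *p = malloc(1000);   /* regs() calls malloc(1000) ...          */
    free(p);                  /* ... and then frees the block again     */

    *res = d1 + *d2;          /* the add.d, stored through the pointer  */
}
```

Calling regs_model(1.0, &d2, &res) with d2 set to 2.0 leaves 3.0 in res, which corresponds to the last line of output printed by app1.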
Building and Running the o32 Application
The commands used to build app1 are shown
below. As mentioned previously, regs.s is compiled and
linked separately into its own DSO, while main.c,
foo.c, and gp.s are compiled and linked together.
% cc -32 -O -shared -o regs.so regs.s
% cc -32 -O -o app1 main.c foo.c gp.s regs.so
In order to run the application, the LD_LIBRARY_PATH
environment variable must be set to the directory where regs.so
resides as shown in the following command:
% setenv LD_LIBRARY_PATH .
Running the application produces the following results. Note that the
value of $gp is different when code is executing
in the regs.so DSO.
% app1
gp is 0x100090f0
Number of Arguments is: 7
3.140000e+00
first double precision argument is 1.000000e+00
second double precision argument is 2.000000e+00
Back from assembly routine
gp is 0x5fff8ff0
ra is 0x400d10
sp is 0x7fff2e28
result of double precision add is 3.000000e+00
If the files foo.c and main.c
were recompiled for n32, the resulting executable would not work for a variety
of reasons. Each reason is examined below and a solution is given. The resulting
set of files will work when compiled either for o32 or for n32.
Attempting to recompile main.c and foo.c with
-n32 results in the two sets of warnings shown below:
% cc -n32 -O -o app1 main.c foo.c gp2.s
foo.c
!!! Warning (user routine 'foo'):
!!! Prototype required when passing floating point parameter to
varargs routine: printf
!!! Use '#include <stdio.h>' (see ANSI X3.159-1989, Section 3.3.2.2)
ld32: WARNING 110: floating-point parameters exist in the call for
“foo”, a VARARG function, in object “main.o” without a prototype --
would result in invalid result. Definition can be found in object
“foo.o”
ld32: WARNING 110: floating-point parameters exist in the call for
“printf”, a VARARG function, in object “foo.o” without a prototype
-- would result in invalid result. Definition can be found in
object “/usr/lib32/mips4/libc.so”
The first warning points out that printf() is a
varargs routine that is being called with floating point arguments.
Under these circumstances, a prototype must exist for printf().
This is accomplished by adding the following line to the top of
foo.c:
#include <stdio.h>
The second warning points out that foo() is also
a varargs routine with floating point arguments and must
also be prototyped. This is fixed by changing the declaration of
foo() in main.c to:
extern void foo(int, ...);
For completeness, <stdio.h> is also included
in main.c to provide a prototype for printf()
should it ever use floating point arguments.
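The role of the prototype can be seen in a small stand-alone function. The sketch below is not part of app1, and sum_doubles is a hypothetical name. With the ellipsis prototype in scope at every call site, the compiler knows that the callee is a varargs routine and can pass the floating point arguments in the way that va_arg expects to retrieve them.

```c
#include <stdarg.h>

/* Prototype: must be visible wherever sum_doubles() is called. */
double sum_doubles(int count, ...);

/* Adds up the 'count' double arguments that follow the fixed parameter. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, double);   /* each variable argument is a double */
    va_end(ap);
    return total;
}
```

For example, sum_doubles(3, 1.0, 2.0, 3.0) adds the three doubles that follow the count.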
As a result of these small changes, the C files are fixed and ready
to be compiled using the -n32 option. The new versions
are shown below.
/* main.c */
#include <stdio.h>
extern void foo(int, ...);
main()
{
    unsigned gp,ra,sp, get_regs();
    double d1 = 1.0;
    double d2 = 2.0;
    double res;
    gp = get_gp();
    printf("gp is 0x%x\n", gp);
    foo(7, 3.14, &gp, &ra,
        &sp, d1, &d2, &res);
}
/* foo.c */
#include <stdio.h>
#include <stdarg.h>
void foo(int narg, ...)
{
    va_list ap;
    double d1;
    double daddr1, *daddr2, *resaddr;
    unsigned *gp, *ra, *sp;
    va_start(ap, narg);
    printf("Number of Arguments is: %d\n",narg);
    d1 = va_arg(ap, double);
    printf("%e\n",d1);
    gp = va_arg(ap, unsigned*);
    ra = va_arg(ap, unsigned*);
    sp = va_arg(ap, unsigned*);
    daddr1 = va_arg(ap, double);
    daddr2 = va_arg(ap, double*);
    resaddr = va_arg(ap, double*);
    printf("first double precision argument is %e\n",daddr1);
    printf("second double precision argument is %e\n",*daddr2);
    regs(gp, ra, sp, daddr1, daddr2, resaddr);
    printf("Back from assembly routine\n");
    printf("gp is 0x%x\n",*gp);
    printf("ra is 0x%x\n",*ra);
    printf("sp is 0x%x\n",*sp);
    printf("result of double precision add is %e\n",*resaddr);
    va_end(ap);
}
Because get_gp()
is a leaf routine that is linked in the same DSO where it is called,
no changes are required to port it to n32. However, you have to recompile
it.
On the other hand, regs() requires a lot of work.
The issues that need to be addressed are detailed in the following sections.
As explained
throughout this book, the o32 ABI follows the convention that $gp
(global pointer register) is caller saved. This means that the
global pointer is saved before each function call and restored after each
function call returns. This is accomplished by using the .cpload
and .cprestore assembler pseudo instructions
respectively. Both lines are present in the original version of
regs.s.
The n32 ABI, on the other hand, follows the convention that
$gp is callee saved. This means that $gp is saved
at the beginning of each routine and restored right before that routine itself
returns. This is accomplished through the use of .cpsetup,
an assembler pseudo instruction.
The recommended way to deal with these various pseudo instructions
is to use the macros provided in <sys/asm.h>. The
following macros provide correct use of these pseudo instructions whether
compiled for o32 or for n32:
SETUP_GP expands to the .cpload
t9 pseudo instruction for o32. For n32 it is null.
SAVE_GP(GPOFF) expands to the
.cprestore pseudo instruction for o32. For n32 it is null.
SETUP_GP64(GPOFF, regs) expands to the
.cpsetup pseudo instruction for n32. For o32 it is null.
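The new version of regs.s, shown later in this chapter, selects between o32 and n32 code by testing _MIPS_SIM against _MIPS_SIM_ABI32. As a minimal sketch, the same test can be made from C. Several things here are assumptions rather than facts taken from this chapter: abi_name is a hypothetical helper, the __sgi symbol is assumed to be predefined by the compiler, and the _MIPS_SIM_NABI32 constant is assumed to be provided by <sgidefs.h> alongside _MIPS_SIM_ABI32. The defined() guards are added only so that the code also compiles on systems that lack those symbols.

```c
#if defined(__sgi)
#include <sgidefs.h>   /* assumed source of _MIPS_SIM_ABI32 and _MIPS_SIM_NABI32 */
#endif

/* Hypothetical helper: names the ABI this file was compiled for. */
const char *abi_name(void)
{
#if defined(_MIPS_SIM) && defined(_MIPS_SIM_ABI32) && (_MIPS_SIM == _MIPS_SIM_ABI32)
    return "o32";
#elif defined(_MIPS_SIM) && defined(_MIPS_SIM_NABI32) && (_MIPS_SIM == _MIPS_SIM_NABI32)
    return "n32";
#else
    return "unknown";   /* not a MIPS ABI this sketch recognizes */
#endif
}
```

Because the choice is made by the preprocessor, only one of the three return statements is ever compiled into the object file, which is the same property that lets a single assembly source serve both ABIs.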
Under o32, registers
are 32 bits wide. Under n32, they are 64 bits wide. As a result, assembly
language routines must be careful in the way they operate on registers. The
following macros defined in <sys/asm.h> are useful
because they expand to 32-bit instructions under o32 and to 64-bit instructions
under n32.
REG_S expands to sw
for o32 and to sd for n32.
REG_L expands to lw
for o32 and to ld for n32.
PTR_SUBU expands to subu
for o32 and to sub for n32.
PTR_ADDU expands to addu
for o32 and to add for n32.
The
regs() function in regs.s is called with
six arguments. Under o32, the first three are passed in registers
a0 through a2. The fourth argument (a double
precision parameter) is passed at offset 16 relative to the stack pointer. The fifth
and sixth arguments are passed at offsets 24 and 28 relative to the stack pointer,
respectively. Under n32, however, all of the arguments are passed in registers.
The first three arguments are passed in registers a0 through
a2 as they were under o32. The fourth argument is passed in the floating point register
$f15. The last two arguments are passed in registers
a4 and a5 respectively. Table 4-1
summarizes where each of the arguments is passed under the two conventions.
Table 4-1. Argument Passing
| Argument | o32 | n32 |
|---|---|---|
| argument1 | a0 | a0 |
| argument2 | a1 | a1 |
| argument3 | a2 | a2 |
| argument4 | $sp+16 | $f15 |
| argument5 | $sp+24 | a4 |
| argument6 | $sp+28 | a5 |

Note: Under o32, there are no a4 and a5
registers, but under n32 they must be saved on the stack because
they are used after calls to an external function.
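The entries in Table 4-1 can be reproduced with a deliberately simplified model of the two conventions. The sketch below is an illustration, not a full statement of either ABI: the type and function names are hypothetical, only word-sized (integer or pointer) and double arguments are handled, the o32 model is valid only when the first argument is not floating point, and the n32 model covers at most eight arguments.

```c
#include <stdio.h>

enum arg_kind { ARG_WORD, ARG_DOUBLE };   /* 32-bit int/pointer, or double */

/*
 * Simplified o32 model: arguments occupy an area addressed in bytes.
 * Doubles are aligned to 8 bytes. The first 16 bytes map to a0..a3
 * (a double there really occupies a register pair); the rest stay
 * on the stack at that same offset from $sp.
 */
void o32_locations(const enum arg_kind *kinds, int n, char out[][16])
{
    int offset = 0;
    int i;

    for (i = 0; i < n; i++) {
        int size = (kinds[i] == ARG_DOUBLE) ? 8 : 4;

        offset = (offset + size - 1) & ~(size - 1);   /* align to size */
        if (offset < 16)
            snprintf(out[i], 16, "a%d", offset / 4);
        else
            snprintf(out[i], 16, "$sp+%d", offset);
        offset += size;
    }
}

/*
 * Simplified n32 model: argument i (counting from 0) uses slot i.
 * A double in slot i goes to $f(12+i); anything else goes to a<i>.
 * Returns -1 for more than eight arguments, which this sketch
 * does not attempt to model.
 */
int n32_locations(const enum arg_kind *kinds, int n, char out[][16])
{
    int i;

    if (n > 8)
        return -1;
    for (i = 0; i < n; i++) {
        if (kinds[i] == ARG_DOUBLE)
            snprintf(out[i], 16, "$f%d", 12 + i);
        else
            snprintf(out[i], 16, "a%d", i);
    }
    return 0;
}
```

Feeding both functions the argument kinds of regs() (three pointers, a double, and two more pointers) yields the o32 and n32 columns of the table. In the o32 model the double is pushed from byte offset 12 to 16 by its 8-byte alignment, which is why a3 goes unused.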
The following code fragment illustrates accessing the arguments under
n32:
mov.d $f4,$f15 # fourth argument arrives in
# fp arg. register $f15
l.d $f6,0(a4) # fifth argument (a pointer)
# arrives in a4
s.d $f8,0(a5) # store the result through the
# sixth argument, held in a5
Extra Floating Point Registers
As explained in Chapter 3, “Compatibility, Porting, and Assembly Language Programming
Issues”, floating point registers
are 64 bits wide under n32. They are no longer accessed as pairs of single
precision registers for double precision calculations. As a result, the section
of code that uses the pairs of lwc1 or swc1
instructions must be changed. The simplest way to accomplish this is to use
the l.d assembly language instruction. This instruction
expands to two lwc1 instructions under -mips1;
under -mips2 and above, it expands to the
ldc1 instruction.
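The reason a double needed two lwc1 or swc1 instructions in the original code is that it occupies two 32-bit words. The following C sketch makes that layout visible; the function names are hypothetical, and the code assumes an 8-byte IEEE 754 double, which the typedef checks at compile time.

```c
#include <stdint.h>
#include <string.h>

/* Compile-time check of the assumption that a double is 64 bits wide. */
typedef char double_is_64_bits[(sizeof(double) == 8) ? 1 : -1];

/* Copies a double out as two 32-bit words, as a pair of swc1 stores would. */
void double_to_words(double value, uint32_t words[2])
{
    memcpy(words, &value, sizeof value);
}

/* Rebuilds the double from the same two words, in the same memory order. */
double words_to_double(const uint32_t words[2])
{
    double value;

    memcpy(&value, words, sizeof value);
    return value;
}
```

Which of the two words holds the sign and exponent depends on the byte order of the machine. That dependence is what the paired lwc1 and swc1 offsets in the original regs.s had to get right, and what a single l.d or s.d hides.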
This
section shows the new version of regs.s. It is coded
so that it will compile and execute for either o32 or n32 environments.
/* regs.s */
#include <sys/regdef.h>
#include <sys/asm.h>
.text
LOCALSZ=5 # save ra, a4, a5, gp, $f15
FRAMESZ= (((NARGSAVE+LOCALSZ)*SZREG)+ALSZ)&ALMASK
RAOFF=FRAMESZ-(1*SZREG) # stack offset where ra is saved
A4OFF=FRAMESZ-(2*SZREG) # stack offset where a4 is saved
A5OFF=FRAMESZ-(3*SZREG) # stack offset where a5 is saved
GPOFF=FRAMESZ-(4*SZREG) # stack offset where gp is saved
FPOFF=FRAMESZ-(5*SZREG) # stack offset where $f15 is
# saved
# a4, a5, and $f15 don't have to
# be saved, but no harm done in
# doing so
NESTED(regs, FRAMESZ, ra)
# define regs to be a nested
# function
SETUP_GP # used for caller saved gp
PTR_SUBU sp,FRAMESZ # setup stack frame
SETUP_GP64(GPOFF, regs) # used for callee saved gp
SAVE_GP(GPOFF) # used for caller saved gp
REG_S ra, RAOFF(sp) # save ra on stack
#if (_MIPS_SIM != _MIPS_SIM_ABI32)
# not needed for o32
REG_S a4, A4OFF(sp) # save a4 on stack (argument 5)
REG_S a5, A5OFF(sp) # save a5 on stack (argument 6)
s.d $f15,FPOFF(sp) # save $f15 on stack (argument 4)
#endif /* _MIPS_SIM != _MIPS_SIM_ABI32 */
sw gp, 0(a0) # return gp in first arg
sw ra, 0(a1) # return ra in second arg
sw sp, 0(a2) # return sp in third arg
li a0, 1000 # call malloc
jal malloc # for illustration purposes only
move a0, v0 # call free
jal free # go into libc.so twice
# this is why a4, a5, $f15
# had to be saved
#if (_MIPS_SIM != _MIPS_SIM_ABI32)
# not needed for o32
l.d $f15,FPOFF(sp) # restore $f15 (argument #4)
REG_L a4, A4OFF(sp) # restore a4 (argument #5)
REG_L a5, A5OFF(sp) # restore a5 (argument #6)
#endif /* _MIPS_SIM != _MIPS_SIM_ABI32 */
#if (_MIPS_SIM == _MIPS_SIM_ABI32)
# for o32 arguments will
# need to be pulled from the
# stack
lw t0,FRAMESZ+24(sp) # fifth argument is 24
# relative to original sp
l.d $f4,0(t0) # use l.d for correct code
# on both mips1 & mips2
l.d $f6,FRAMESZ+16(sp) # fourth argument is 16
# relative to original sp
add.d $f8, $f4, $f6 # do the calculation
lw t0,FRAMESZ+28(sp) # sixth argument is 28
# relative to original sp
s.d $f8,0(t0) # save the result there
#else
# n32 args are in regs
mov.d $f4,$f15 # fourth argument is in
# fp arg. register $f15
l.d $f6,0(a4) # fifth argument (a pointer)
# is in a4
add.d $f8, $f4, $f6 # do the calculation
s.d $f8,0(a5) # store via 6th argument (a5)
#endif /* _MIPS_SIM != _MIPS_SIM_ABI32 */
REG_L ra, RAOFF(sp) # restore return address
RESTORE_GP64 # restore gp for n32
# (callee saved)
PTR_ADDU sp,FRAMESZ # pop stack
j ra # return to caller
.end regs
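The FRAMESZ expression at the top of the listing rounds the space for the saved registers up to the stack alignment. The C sketch below reproduces that arithmetic so it can be tried with concrete numbers. The function names are hypothetical, and the constants quoted afterward are assumptions about what <sys/asm.h> defines for each ABI, not values stated in this chapter; check the header on your system.

```c
/*
 * Mirrors FRAMESZ = (((NARGSAVE+LOCALSZ)*SZREG)+ALSZ)&ALMASK.
 * 'alsz' is the alignment minus one; ALMASK is assumed to be its
 * bitwise complement.
 */
unsigned frame_size(unsigned nargsave, unsigned localsz,
                    unsigned szreg, unsigned alsz)
{
    return (((nargsave + localsz) * szreg) + alsz) & ~alsz;
}

/*
 * Offset of the k-th saved register from sp, as in RAOFF (k = 1)
 * through FPOFF (k = 5): FRAMESZ - (k * SZREG).
 */
unsigned save_offset(unsigned framesz, unsigned k, unsigned szreg)
{
    return framesz - k * szreg;
}
```

Assuming NARGSAVE=4, SZREG=4, and ALSZ=7 for o32, the frame for LOCALSZ=5 is 40 bytes and ra is saved at offset 36. Assuming NARGSAVE=0, SZREG=8, and ALSZ=15 for n32, the frame is 48 bytes and ra is saved at offset 40.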
Building and Running the N32 Application
The commands for building an n32 version of app1
are shown below. The only difference is in the use of the -n32
argument on the compiler command line. If app1 were a
large application using many libraries, the command line or makefile would
possibly need to be modified to refer to the correct library paths. In the
case of app1, the correct libc.so
is automatically used as a result of the -n32 argument.
% cc -n32 -O -shared -o regs.so regs.s
% cc -n32 -O -o app1 main.c foo.c gp.s regs.so
In order to run the application, the LD_LIBRARY_PATH
environment variable must again be set to the directory where regs.so
resides.
% setenv LD_LIBRARY_PATH .
Running the application produces the following results. Note that the
values of some of the returned registers are different from those returned
by the o32 version of app1.
% app1
gp is 0x100090e8
Number of Arguments is: 7
3.140000e+00
first double precision argument is 1.000000e+00
second double precision argument is 2.000000e+00
Back from assembly routine
gp is 0x5fff8ff0
ra is 0x10000d68
sp is 0x7fff2e30
result of double precision add is 3.000000e+00
Building Multiple Versions of the Application
Following
the previous procedure generates new n32 versions of app1
and regs.so; however, they overwrite the old o32 versions.
To build multiple versions of app1, use one of the following
methods:
Use different names for the n32 and o32 versions of the application
and DSO. This method is simple, but for large applications, you must rename
each DSO.
Create separate directories for the o32 and n32 applications
and DSOs, respectively. Modify the preceding commands or modify the makefiles
to create app1 and regs.so in the
appropriate directory. This method offers more organization than the first
approach, but you must set the LD_LIBRARY_PATH accordingly.
Create separate directories as specified above, but add the
-rpath argument to the command line that builds app1.