An Explanation of the PPC GOT
2010-12-09 14:01:16
The code that sets up the GOT (Global Offset Table) in ./cpu/mpc85xx/start.S is as follows:
/*
* Set up GOT: Global Offset Table
*
* Use r14 to access the GOT
*/
START_GOT
GOT_ENTRY(_GOT2_TABLE_)
GOT_ENTRY(_FIXUP_TABLE_)
GOT_ENTRY(_start)
GOT_ENTRY(_start_of_vectors)
GOT_ENTRY(_end_of_vectors)
GOT_ENTRY(transfer_to_handler)
GOT_ENTRY(__init_end)
GOT_ENTRY(_end)
GOT_ENTRY(__bss_start)
#if defined(CONFIG_FADS)
GOT_ENTRY(environment)
#endif
END_GOT
The related macros are defined in ./include/ppc_asm.tmpl:
#define START_GOT \
.section ".got2","aw"; \
.LCTOC1 = .+32768
#define END_GOT \
.text
#define GET_GOT \
bl 1f ; \
.text 2 ; \
0: .long .LCTOC1-1f ; \
.text ; \
1: mflr r14 ; \
lwz r0,0b-1b(r14) ; \
add r14,r0,r14 ;
#define GOT_ENTRY(NAME) .L_ ## NAME = . - .LCTOC1 ; .long NAME
#define GOT(NAME) .L_ ## NAME (r14)
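In short: START_GOT marks the start of the table, END_GOT marks its end, GOT_ENTRY writes one entry (the address of a symbol) into the table, GOT reads an entry back from the table, and GET_GOT sets up r14 so that the table can be accessed.
Each macro is explained in detail below: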
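START_GOT opens the section ".got2" with the flags "aw" (allocatable and writable) and defines the symbol .LCTOC1 as the current location plus 32768. If the table starts at TABLE_START, then .LCTOC1 = TABLE_START + 0x8000. The 0x8000 bias is there because a load instruction takes a signed 16-bit displacement: a base register that points 32 KB past the start of the table can reach every address from TABLE_START to TABLE_START + 0xFFFF.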
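The effect of the 0x8000 bias can be checked with a little arithmetic. The sketch below is plain Python, not U-Boot code, and TABLE_START is an arbitrary address picked only for illustration.

```python
# lwz and similar D-form instructions carry a signed 16-bit displacement.
DISP_MIN, DISP_MAX = -0x8000, 0x7FFF

TABLE_START = 0x00010000        # hypothetical address where .got2 begins
LCTOC1 = TABLE_START + 32768    # START_GOT: .LCTOC1 = .+32768


def reachable(base):
    """Lowest and highest address reachable as displacement(base)."""
    return base + DISP_MIN, base + DISP_MAX


biased = reachable(LCTOC1)          # base register points at .LCTOC1
unbiased = reachable(TABLE_START)   # base register points at the table start

print([hex(a) for a in biased])     # window begins at TABLE_START
print([hex(a) for a in unbiased])   # half of the window lies below the table
```

With the biased base the whole 64 KB window lies inside the table; with an unbiased base only the upper 32 KB would be usable.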
END_GOT switches back to the .text section (subsection 0), which ends the table.
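GOT_ENTRY defines the symbol .L_NAME as the address of the current entry (.) minus .LCTOC1. If the entry for NAME lies NAME_OFFSET bytes into the table, then .L_NAME = . - .LCTOC1 = TABLE_START + NAME_OFFSET - (TABLE_START + 0x8000) = NAME_OFFSET - 0x8000. The macro then stores the address of NAME in the current entry (.long NAME); this value is fixed at link time.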
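As a rough check of the formula, the sketch below computes .L_NAME for every entry of the table above. It assumes each entry is one 4-byte word and that the entries follow START_GOT in the order shown; the helper name got_entry_symbols is made up for this example.

```python
BIAS = 0x8000   # .LCTOC1 lies this many bytes past the START_GOT location
WORD = 4        # each GOT_ENTRY emits one .long

# Entry names in the order they appear between START_GOT and END_GOT
# (the CONFIG_FADS-only entry is left out).
NAMES = [
    "_GOT2_TABLE_", "_FIXUP_TABLE_", "_start", "_start_of_vectors",
    "_end_of_vectors", "transfer_to_handler", "__init_end", "_end",
    "__bss_start",
]


def got_entry_symbols(names):
    """Map each .L_NAME symbol to NAME_OFFSET - 0x8000."""
    return {".L_" + name: i * WORD - BIAS for i, name in enumerate(names)}


symbols = got_entry_symbols(NAMES)
for sym, value in symbols.items():
    print(f"{sym:26} = {value}")
```

The value obtained for .L_transfer_to_handler, -32748, is the same displacement that shows up in the disassembly quoted later in this post (lwz r3,-32748(r14)).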
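GOT(NAME) expands to .L_NAME(r14), a displacement operand relative to r14. Here r14 holds the value of .LCTOC1 (see the description of GET_GOT below). The effective address is therefore .L_NAME + r14 = .L_NAME + .LCTOC1 = NAME_OFFSET - 0x8000 + TABLE_START + 0x8000 = NAME_OFFSET + TABLE_START, which is the address of NAME's entry. A load through this operand returns the value that was stored in the table for NAME.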
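The point of this arrangement is that the displacement is a constant while the table itself can sit anywhere. A minimal sketch, assuming 32-bit address arithmetic and two example table locations picked for illustration (the function names are hypothetical):

```python
BIAS = 0x8000
MASK = 0xFFFFFFFF   # 32-bit address arithmetic


def got_operand(name_offset):
    """.L_NAME: the displacement that GOT(NAME) places in the instruction."""
    return name_offset - BIAS


def effective_address(r14, displacement):
    """Address formed by a displacement(r14) operand."""
    return (r14 + displacement) & MASK


NAME_OFFSET = 20    # hypothetical: NAME is the sixth 4-byte entry

# Two example table locations; the displacement does not change.
results = {}
for table_start in (0x00009624, 0xFFF80000):
    r14 = (table_start + BIAS) & MASK      # value GET_GOT leaves in r14
    results[table_start] = effective_address(r14, got_operand(NAME_OFFSET))
    print(hex(table_start), "->", hex(results[table_start]))
```

In both cases the effective address comes out as table_start + NAME_OFFSET, even though the instruction encodes the same displacement.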
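GET_GOT sets up r14 for GOT access. First, bl 1f branches to label "1" and leaves the address of "1" in lr; mflr then copies lr into r14. Next, lwz r0,0b-1b(r14) loads the word at r14 + (0b - 1b), where 0b and 1b are the addresses of labels "0" and "1". Since r14 = 1b, this is the word stored at "0", namely .LCTOC1 - 1f. Finally, r14 = r0 + r14 = .LCTOC1 - 1f + 1f = .LCTOC1, the biased base address of the GOT.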
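The three instructions can be traced in a small model. This is only a sketch: the addresses are hypothetical, and the shift parameter (run address minus link address) is my own addition to show why the sequence is position independent.

```python
MASK = 0xFFFFFFFF


def get_got(link_bl, link_word, lctoc1, shift=0):
    """Model GET_GOT.

    link_bl   -- link-time address of the 'bl 1f' instruction
    link_word -- link-time address of label 0 (the .long in .text 2)
    lctoc1    -- link-time value of .LCTOC1
    shift     -- run address minus link address of the whole image
    """
    label1 = link_bl + 4                          # label 1 follows the bl
    # 0: .long .LCTOC1-1f  (a constant; both symbols move together)
    memory = {(link_word + shift) & MASK: (lctoc1 - label1) & MASK}

    lr = (label1 + shift) & MASK                  # bl 1f
    r14 = lr                                      # mflr r14
    disp = link_word - label1                     # 0b-1b, fixed at assembly
    r0 = memory[(r14 + disp) & MASK]              # lwz r0,0b-1b(r14)
    r14 = (r0 + r14) & MASK                       # add r14,r0,r14
    return r14


# Hypothetical link-time addresses, chosen only for illustration.
print(hex(get_got(0x1000, 0x1400, 0x20000)))                  # run in place
print(hex(get_got(0x1000, 0x1400, 0x20000, shift=0x100000)))  # image moved
```

When the image runs at its link address r14 ends up equal to .LCTOC1; when the whole image is moved, r14 moves by the same amount, so it still points at the biased base of the table.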
About the GOT:
http://code.google.com/p/cellos/wiki/UnderstandingPICGOT
UnderstandingPICGOT
This article describes the PIC (Position Independent Code) and GOT (Global Offset Table) used in CellOS.
Introduction
Sometimes we have to write PIC (Position Independent Code). PIC sometimes needs to refer to absolute symbols, but it cannot itself contain absolute virtual addresses. The GOT is used to solve this problem.
CellOS uses the same mechanism to let PIC refer to absolute addresses. The rest of this article works through the details of the GOT in a "reverse engineering" way.
Details
The theory for PIC and GOT
The following theory section is copied from the <System V Application Binary Interface PowerPC Processor Supplement>:
When the system creates a process image, the executable file portion of the process has fixed addresses and the system chooses shared object library virtual addresses to avoid conflicts with other segments in the process. To maximize text sharing, shared objects conventionally use position-independent code, in which instructions contain no absolute addresses. Shared object text segments can be loaded at various virtual addresses without having to change the segment images. Thus multiple processes can share a single shared object text segment, even if the segment resides at a different virtual address in each process.
Position-independent code relies on two techniques:
* Control transfer instructions hold addresses relative to the Effective Address (EA) or use registers that hold the transfer address. An EA-relative branch computes its destination address in terms of the current EA, not relative to any absolute address.
* When the program requires an absolute address, it computes the desired value. Instead of embedding absolute addresses in instructions (in the text segment), the compiler generates code to calculate an absolute address (in a register or in the stack or data segment) during execution.
Because the PowerPC Architecture provides EA-relative branch instructions and also branch instructions using registers that hold the transfer address, compilers can satisfy the first condition easily.
A "Global Offset Table," or GOT, provides information for address calculation. Position independent object files (executable and shared object files) have a table in their data segment that holds addresses. When the system creates the memory image for an object file, the table entries are relocated to reflect the absolute virtual address as assigned for an individual process. Because data segments are private for each process, the table entries can change, unlike text segments, which multiple processes share.
Position-independent code cannot, in general, contain absolute virtual addresses. Global offset tables hold absolute addresses in private data, thus making the addresses available without compromising the position-independence and sharability of a program’s text. A program references its global offset table using position-independent addressing and extracts absolute values, thus redirecting position-independent references to absolute locations.
When the dynamic linker creates memory segments for a loadable object file, it processes the relocation entries, some of which will be of type R_PPC_GLOB_DAT, referring to the global offset table. The dynamic linker determines the associated symbol values, calculates their absolute addresses, and sets the global offset table entries to the proper values. Although the absolute addresses are unknown when the link editor builds an object file, the dynamic linker knows the addresses of all memory segments and can thus calculate the absolute addresses of the symbols contained therein.
A global offset table entry provides direct access to the absolute address of a symbol without compromising position-independence and sharability. Because the executable file and shared objects have separate global offset tables, a symbol may appear in several tables. The dynamic linker processes all the global offset table relocations before giving control to any code in the process image, thus ensuring the absolute addresses are available during execution.
The dynamic linker may choose different memory segment addresses for the same shared object in different programs; it may even choose different library addresses for different executions of the same program. Nonetheless, memory segments do not change addresses once the process image is established. As long as a process exists, its memory segments reside at fixed virtual addresses.
A global offset table’s format and interpretation are processor specific. For PowerPC, the symbol _GLOBAL_OFFSET_TABLE_ may be used to access the table. The symbol may reside in the middle of the .got section, allowing both positive and negative "subscripts" into the array of addresses. Four words in the global offset table are reserved:
* The word at _GLOBAL_OFFSET_TABLE_[-1] shall contain a blrl instruction (see the text relating to Figure 3-33, "Prologue and Epilogue Sample Code").
* The word at _GLOBAL_OFFSET_TABLE_[0] is set by the link editor to hold the address of the dynamic structure, referenced with the symbol _DYNAMIC. This allows a program, such as the dynamic linker, to find its own dynamic structure without having yet processed its relocation entries. This is especially important for the dynamic linker, because it must initialize itself without relying on other programs to relocate its memory image.
* The word at _GLOBAL_OFFSET_TABLE_[1] is reserved for future use.
* The word at _GLOBAL_OFFSET_TABLE_[2] is reserved for future use.
The global offset table resides in the ELF .got section.
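To make the quoted theory a bit more concrete, here is a deliberately simplified model of the relocation step: the same link-time table is turned into per-process absolute addresses by adding the load base. Real R_PPC_GLOB_DAT processing also involves symbol lookup across objects, which is left out; the symbol names, addresses and the function relocate_got are all made up for this sketch.

```python
def relocate_got(link_time_got, load_base):
    """Return the GOT as the dynamic linker would leave it for one process.

    link_time_got -- {symbol: address relative to the object's link base}
    load_base     -- virtual address chosen for the object in this process
    """
    return {sym: load_base + value for sym, value in link_time_got.items()}


# Hypothetical shared object with two symbols.
got = {"counter": 0x2010, "table": 0x2400}

process_a = relocate_got(got, 0x10000000)
process_b = relocate_got(got, 0x30000000)
print(hex(process_a["counter"]), hex(process_b["counter"]))
```

The text segment that indexes the table is identical in both processes; only the private copy of the table differs.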
The implementation for GOT in CellOS
1. The various macros used to define GOT
The following code defines the GOT macros and entries:
/***************************************************************************
*
* These definitions simplify the ugly declarations necessary for GOT
* definitions.
*
* Stolen from prepboot/bootldr.h, (C) 1998 Gabriel Paubert, paubert@iram.es
*
* Uses r14 to access the GOT
*/
#define START_GOT \
.section ".got2","aw"; \
.LCTOC1 = .+32768
#define END_GOT \
.text
#define GET_GOT \
bl 1f ; \
.text 2 ; \
0: .long .LCTOC1-1f ; /* offset from CIA to center of GOT */ \
.text ; \
1: mflr r14 ; /* Get CIA */ \
lwz r0,0b-1b(r14) ; /* Get the offset */ \
add r14,r0,r14 ;
#define GOT_ENTRY(NAME) .L_ ## NAME = . - .LCTOC1 ; .long NAME
#define GOT(NAME) .L_ ## NAME (r14)
/*
* Set up GOT: Global Offset Table
*
* Use r14 to access the GOT
*/
START_GOT
GOT_ENTRY(_GOT2_TABLE_)
GOT_ENTRY(_FIXUP_TABLE_)
GOT_ENTRY(_start)
GOT_ENTRY(_start_of_vectors)
GOT_ENTRY(_end_of_vectors)
GOT_ENTRY(transfer_to_handler)
GOT_ENTRY(__init_end)
GOT_ENTRY(_end)
GOT_ENTRY(__bss_start)
END_GOT
2. How is the GOT actually set up at run time?
In the cellosEntry code, GET_GOT is invoked before control transfers to the normal C code:
GET_GOT /* initialize GOT access */
/* NEVER RETURNS! */
bl cellMain
The disassembly of the above code section is:
0x000021b8 <cellosEntry+184>: bl 0x21bc <cellosEntry+188> #bl 1f
####Here is an assembler trick, described below###
0x000021bc <cellosEntry+188>: mflr r14 #Get CIA, r14=0x000021bc
0x000021c0 <cellosEntry+192>: lwz r0,936(r14) #load the offset into r0, r0=0xf468
0x000021c4 <cellosEntry+196>: add r14,r0,r14 #r14 = 0xf468 + 0x21bc = 0x11624
0x000021c8 <cellosEntry+200>: bl 0x4cb0 <cellMain>
Even though it is not strictly related, here is the assembler trick shown above.
We have seen that GET_GOT is written like this:
#define GET_GOT \
bl 1f ; \
.text 2 ; \
0: .long .LCTOC1-1f ; /* offset from CIA to center of GOT */ \
.text ; \
1: mflr r14 ; /* Get CIA */ \
lwz r0,0b-1b(r14) ; /* Get the offset */ \
add r14,r0,r14
Without the ".text 2" in the above macro, the word defined by ".long" would sit directly between the "bl 1f" and "1: mflr r14" instructions, and (0b - 1b) would be -4. The ".text 2" directive instead tells the assembler to place that word in subsection 2 of .text, which is emitted after the code in subsection 0, so it does not end up between the two instructions.
With the ".text 2", (0b - 1b) becomes 936 (in this build; the value changes when other code is added or changed), because the word has been "moved" to an address higher than the CIA.
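The placement rule can be imitated with a toy layout function: items are emitted in ascending subsection order, keeping source order inside each subsection. This is only a model of the behaviour described above, not of a real assembler, and the 924 bytes of trailing code are an assumption chosen so that the result matches the 936 seen in this build.

```python
def layout(items):
    """Assign addresses to (subsection, label, size) items.

    Items are emitted in ascending subsection order; within one
    subsection the source order is kept (sorted() is stable).
    """
    labels, addr = {}, 0
    for _sub, label, size in sorted(items, key=lambda item: item[0]):
        if label is not None:
            labels[label] = addr
        addr += size
    return labels


def get_got_items(word_subsection, trailing_code_bytes):
    """GET_GOT plus whatever code follows it in subsection 0."""
    return [
        (0, None, 4),                    # bl 1f
        (word_subsection, "0", 4),       # 0: .long .LCTOC1-1f
        (0, "1", 4),                     # 1: mflr r14
        (0, None, 4),                    # lwz r0,0b-1b(r14)
        (0, None, 4),                    # add r14,r0,r14
        (0, None, trailing_code_bytes),  # rest of subsection 0 (assumed size)
    ]


for sub in (0, 2):
    labels = layout(get_got_items(sub, 924))
    print(f".text {sub}: 0b-1b =", labels["0"] - labels["1"])
```

With the word left in subsection 0 the difference is -4; with it in subsection 2 the word lands after all of the subsection 0 code.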
The value of (0b - 1b), 936 in this case, is an intermediate offset: the word stored at that offset from the CIA holds the real offset from the CIA to the center of the GOT. In our case the CIA of "1: mflr r14" is 0x000021bc, and 0x000021bc + 936 = 0x2564. Checking the disassembly or debugging the code shows the following:
(gdb) x/xw 0x2564 #to display the memory at 0x2564
0x2564 <in32+8>: 0x0000f468
(gdb)
So "lwz r0,936(r14)", that is "lwz r0,0b-1b(r14)", loads r0 with the value 0x0000f468. This 0x0000f468 is the real offset from the CIA to the center of the GOT. The "add r14,r0,r14" then adds that offset to the CIA, which in effect sets r14 to "sit" at the center of the GOT.
Note that I keep saying "center of the GOT" because, as I understand it, a signed 16-bit displacement can be added to r14 to access the contents around it. I call r14 the GOT anchor. This is somewhat like the SDA (Small Data Area) concept.
So r14 now sits at the center of the GOT area, where it can easily be used to locate the values around it (+/- 32 KB). r14 is not changed from now on. Remember the value of r14, which is 0x0000f468 + 0x000021bc = 0x11624; we need this 0x11624 to calculate the addresses of the entries in the GOT.
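The numbers from this build can be replayed step by step. The sketch below only redoes the arithmetic with the values shown in the disassembly and in the gdb output; the one-entry memory dictionary stands in for the real memory.

```python
MASK = 0xFFFFFFFF

CIA = 0x000021BC                  # address of "1: mflr r14"
DISP = 936                        # 0b-1b in this build
memory = {0x2564: 0x0000F468}     # word shown by "x/xw 0x2564"

r14 = CIA                         # mflr r14
r0 = memory[(r14 + DISP) & MASK]  # lwz r0,936(r14)
r14 = (r0 + r14) & MASK           # add r14,r0,r14

print(hex(r14))
```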
3. How is the GOT used to access absolute symbols?
Let's find a case where the GOT is used in CellOS.
When the system is running and an interrupt occurs, the interrupt handling code runs, and it uses the GOT:
/*
* Exception vectors.
*
* The data words for `hdlr' and `int_return' are initialized with
* OFFSET values only; they must be relocated first before they can
* be used!
*/
#define STD_EXCEPTION(n, label, hdlr) \
. = n; \
label: \
EXCEPTION_PROLOG(SRR0, SRR1); \
lwz r3,GOT(transfer_to_handler); /* GOT */ \
mtlr r3; \
addi r3,r1,STACK_FRAME_OVERHEAD; \
li r20,MSR_KERNEL; \
rlwimi r20,r23,0,25,25; \
blrl; \
.L_ ## label : \
.long hdlr - _start + _START_OFFSET; \
.long int_return - _start + _START_OFFSET
The disassembly of the above code section looks like this:
00003568 <PIT>:
3568: 7e 90 43 a6 mtsprg 0,r20
356c: 7e b1 43 a6 mtsprg 1,r21
3570: 7e 80 00 26 mfcr r20
3574: 3a a1 ff 00 addi r21,r1,-256
3578: 92 95 00 a8 stw r20,168(r21)
357c: 92 d5 00 68 stw r22,104(r21)
3580: 92 f5 00 6c stw r23,108(r21)
3584: 7e 90 42 a6 mfsprg r20,0
3588: 92 95 00 60 stw r20,96(r21)
358c: 7e d1 42 a6 mfsprg r22,1
3590: 92 d5 00 64 stw r22,100(r21)
3594: 7e 88 02 a6 mflr r20
3598: 92 95 00 a0 stw r20,160(r21)
359c: 7e c9 02 a6 mfctr r22
35a0: 92 d5 00 9c stw r22,156(r21)
35a4: 7e 81 02 a6 mfxer r20
35a8: 92 95 00 a4 stw r20,164(r21)
35ac: 7e 95 f2 a6 mfdear r20
35b0: 92 95 00 b4 stw r20,180(r21)
35b4: 7e da 02 a6 mfsrr0 r22
35b8: 7e fb 02 a6 mfsrr1 r23
35bc: 90 15 00 10 stw r0,16(r21)
35c0: 90 35 00 14 stw r1,20(r21)
35c4: 90 55 00 18 stw r2,24(r21)
35c8: 90 35 00 00 stw r1,0(r21)
35cc: 7e a1 ab 78 mr r1,r21
35d0: 90 75 00 1c stw r3,28(r21)
35d4: 90 95 00 20 stw r4,32(r21)
35d8: 90 b5 00 24 stw r5,36(r21)
35dc: 90 d5 00 28 stw r6,40(r21)
35e0: 80 6e 80 14 lwz r3,-32748(r14) #GOT, r14 = 0x11624
35e4: 7c 68 03 a6 mtlr r3
35e8: 38 61 00 10 addi r3,r1,16
35ec: 3a 80 10 00 li r20,4096
35f0: 52 f4 06 72 rlwimi r20,r23,0,25,25
35f4: 4e 80 00 21 blrl
35f8: 00 00 5a cc .long 0x5acc
35fc: 00 00 23 38 .long 0x2338
So we see a simple use of r14: "lwz r3,-32748(r14)", that is "lwz r3,GOT(transfer_to_handler)". Since r14 = 0x11624, this instruction loads from EA = 0x11624 - 32748 = 0x9638.
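Both the displacement and the effective address can be recovered from the instruction bytes 80 6e 80 14 in the listing above. A small sketch, assuming the usual D-form layout (6-bit opcode, 5-bit RT, 5-bit RA, 16-bit signed displacement); the function name decode_d_form is made up here.

```python
def decode_d_form(word):
    """Split a 32-bit D-form instruction into (opcode, rt, ra, d)."""
    opcode = (word >> 26) & 0x3F
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    d = word & 0xFFFF
    if d & 0x8000:          # sign-extend the 16-bit displacement
        d -= 0x10000
    return opcode, rt, ra, d


opcode, rt, ra, d = decode_d_form(0x806E8014)   # bytes 80 6e 80 14
R14 = 0x11624                                   # value set by GET_GOT
ea = (R14 + d) & 0xFFFFFFFF

print(opcode, rt, ra, d, hex(ea))
```

Opcode 32 is lwz, the registers are r3 and r14, and 0x8014 read as a signed 16-bit value is -32748.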
So what is the value at 0x9638?
(gdb) x/xw 0x9638
0x9638 <_GOT2_TABLE_+56>: 0x0000229c
(gdb)
OK, we see something called "_GOT2_TABLE_". Seems familiar? Yes, the GOT! Looking back in this article, there is (yes, I copied it twice, because the space is free :-))
/*
* Set up GOT: Global Offset Table
*
* Use r14 to access the GOT
*/
START_GOT
GOT_ENTRY(_GOT2_TABLE_)
GOT_ENTRY(_FIXUP_TABLE_)
GOT_ENTRY(_start)
GOT_ENTRY(_start_of_vectors)
GOT_ENTRY(_end_of_vectors)
GOT_ENTRY(transfer_to_handler)
GOT_ENTRY(__init_end)
GOT_ENTRY(_end)
GOT_ENTRY(__bss_start)
END_GOT
GOT_ENTRY defines an entry "around" the center of the GOT (the GOT anchor).
#define GOT_ENTRY(NAME) .L_ ## NAME = . - .LCTOC1 ; .long NAME
The actual GOT table looks like this:
00009600 <_GOT2_TABLE_>:
9600: 00 00 96 00 .long 0x9600
9604: 00 00 96 48 .long 0x9648
9608: 00 00 46 68 .long 0x4668
960c: 00 00 26 68 .long 0x2668
9610: 00 00 46 00 .long 0x4600
9614: 00 00 48 04 .long 0x4804
9618: 00 00 9c 00 .long 0x9c00
961c: 00 60 d0 00 .long 0x60d000
9620: 00 00 9c 00 .long 0x9c00
9624: 00 00 96 00 .long 0x9600
9628: 00 00 96 48 .long 0x9648
962c: 00 00 21 00 .long 0x2100
9630: 00 00 01 00 .long 0x100
9634: 00 00 20 98 .long 0x2098
9638: 00 00 22 9c .long 0x229c
963c: 00 00 9c 00 .long 0x9c00
9640: 00 60 d0 00 .long 0x60d000
9644: 00 00 9c 00 .long 0x9c00
Disassembly of section .data:
So the entry at "9638: 00 00 22 9c .long 0x229c" holds the value 0x229c:
(gdb) x/xw 0x229c
0x229c <transfer_to_handler>: 0x92d50090
(gdb)
Right, that is what we want! The address of the function transfer_to_handler is stored in the GOT entry.
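The whole lookup can be replayed against the dump. One thing to note: r14 - 0x8000 = 0x9624, so the entries declared with GOT_ENTRY appear to start at 0x9624, in the second half of the dump, rather than at 0x9600. That mapping is my own reading of the numbers, not something stated in the CellOS sources, and got_load is a name invented for this sketch.

```python
GOT2_BASE = 0x9600
WORDS = [   # the 18 words of the dump, 0x9600 through 0x9644
    0x9600, 0x9648, 0x4668, 0x2668, 0x4600, 0x4804, 0x9C00, 0x60D000, 0x9C00,
    0x9600, 0x9648, 0x2100, 0x0100, 0x2098, 0x229C, 0x9C00, 0x60D000, 0x9C00,
]
memory = {GOT2_BASE + 4 * i: w for i, w in enumerate(WORDS)}

NAMES = [   # GOT_ENTRY order between START_GOT and END_GOT
    "_GOT2_TABLE_", "_FIXUP_TABLE_", "_start", "_start_of_vectors",
    "_end_of_vectors", "transfer_to_handler", "__init_end", "_end",
    "__bss_start",
]
R14 = 0x11624   # GOT anchor computed by GET_GOT
BIAS = 0x8000


def got_load(name):
    """Value that 'lwz rX,GOT(name)' would load."""
    disp = 4 * NAMES.index(name) - BIAS    # .L_name
    return memory[R14 + disp]


for name in NAMES:
    print(f"{name:20} {got_load(name):#x}")
```

Read this way, the entry for _GOT2_TABLE_ holds 0x9600 and the entry for transfer_to_handler holds 0x229c, which agrees with the gdb output.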
So now we have a clear understanding of the GOT. It is essentially a "jump table" through which absolute addresses are reached indirectly. The technique is mostly used by operating systems to load executable images that use shared objects. For CellOS, which is derived from u-boot, it simply serves as a way to locate some absolute symbols.
4. A note for those curious about where _GOT2_TABLE_ comes from
In the linker script, there is a section:
.reloc :
{
*(.got)
_GOT2_TABLE_ = .;
*(.got2)
_FIXUP_TABLE_ = .;
*(.fixup)
}
__got2_entries = (_FIXUP_TABLE_ - _GOT2_TABLE_) >>2;
__fixup_entries = (. - _FIXUP_TABLE_)>>2;
Note that the first entry of "_GOT2_TABLE_" holds the address of _GOT2_TABLE_ itself (0x9600).
00009600 <_GOT2_TABLE_>:
9600: 00 00 96 00 .long 0x9600
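As a last sanity check, the __got2_entries expression from the linker script can be compared with the size of the dump. The value used for _FIXUP_TABLE_ (0x9648) is an assumption: it is the address just past the last word of the dump, and it also matches the second word of the table.

```python
GOT2_TABLE = 0x9600     # _GOT2_TABLE_, from the dump header
FIXUP_TABLE = 0x9648    # _FIXUP_TABLE_, assumed (see the note above)

# __got2_entries = (_FIXUP_TABLE_ - _GOT2_TABLE_) >> 2
got2_entries = (FIXUP_TABLE - GOT2_TABLE) >> 2

# Addresses of the words listed in the dump: 0x9600, 0x9604, ..., 0x9644
dump_addresses = range(0x9600, 0x9644 + 4, 4)

print(got2_entries, len(dump_addresses))
```

Both numbers come out the same, one 4-byte entry per word of the dump.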