Worknotes for the Android ports project

1/17/2012 - the dynamic linker on ice cream sandwich

Ice Cream Sandwich (ICS) changed the runtime environment, and running a linker compiled against the Gingerbread (GB) environment on ICS causes a segmentation fault:

#0  0xb00099ac in __set_errno (n=2) at bionic/libc/bionic/__set_errno.c:34
#1  0xb00099c4 in __set_syscall_errno (n=)
    at bionic/libc/bionic/__set_errno.c:51
#2  0xb0002110 in _open_lib (name=0xbed74934 "/vendor/lib/libc.so")
    at linker.c:609
#3  0xb0002248 in open_library (name=0x84b5 "libc.so") at linker.c:638

The linker is statically linked (for obvious reasons) and __set_errno gets copied into the linker from libc.a at build time. Updating the NDK to r7 (from r5b) fixes this problem.

7/24/2011 - the dynamic linker, /system/bin/linker

Not much has been written about bionic's dynamic linker (that I can find). It supports the environment variable LD_LIBRARY_PATH, but does not support a ld.so.conf file or an rpath built into the binary. This makes it hard to add shared libraries, as the /system/lib directory is mounted read-only on most (all?) phones. So, changing the dynamic linker is needed in order to change the library search path (using the environment variable LD_LIBRARY_PATH is possible, but does not seem very reliable).

The change to the library path is very simple. I've added /data/local/lib to the library search path in my copy of the dynamic linker. To tell gcc to use the new path to the dynamic linker, pass the flag "-Wl,-dynamic-linker,/data/local/bin/linker". I've verified that the bind binaries work with their shared libraries in /data/local/lib, and it has reduced the install size of the bind-utils package from 17.3 MB to 5.5 MB.
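A minimal sketch of the search-order idea, not the actual linker source. It assumes the linker tries a fixed list of directories in order and takes the first match. The names default_dirs and find_library are hypothetical, access() stands in for whatever check the real linker performs, and putting /data/local/lib first in the list is my assumption; the notes only say that the directory was added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical search list. The position of /data/local/lib is an assumption. */
const char *const default_dirs[] = {
    "/data/local/lib",
    "/vendor/lib",
    "/system/lib",
    NULL
};

/*
 * Try "<dir>/<name>" for each directory in the NULL-terminated list dirs.
 * On success, buf holds the first readable path and 0 is returned.
 * Returns -1 if no directory holds a readable file with that name
 * (buf then holds the last candidate that was tried, if any).
 */
int find_library(const char *const *dirs, const char *name,
                 char *buf, size_t size)
{
    assert(dirs != NULL && name != NULL && buf != NULL);

    for (size_t i = 0; dirs[i] != NULL; i++) {
        /* dir + '/' + name + terminating NUL must fit in buf */
        if (strlen(dirs[i]) + 1 + strlen(name) + 1 > size)
            continue;
        snprintf(buf, size, "%s/%s", dirs[i], name);
        if (access(buf, R_OK) == 0)
            return 0;
    }
    return -1;
}
```

A caller would use it as find_library(default_dirs, "libc.so", buf, sizeof buf) and then open the path left in buf.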

7/21/2011 - Resolver library

Changing the DNS server at runtime for a single process (through the _res global object) is highly discouraged on bionic, to the point where there is no global object named _res. I took the resolver code from dietlibc and gave it non-conflicting function names in order to get the nslookup applet in busybox working.

7/7/2011 - Busybox

I want to port busybox to Android's libc because the busybox ports I've seen are statically compiled against glibc or uClibc. This means no user/group info and no DNS configuration. No DNS is a problem for tools like wget, ping, and traceroute.

After filling in the missing functions and other porting work, I was having trouble with output/input not showing up or otherwise acting strange. After doing some investigation on my busybox binary, I found that inline functions (from stdio.h) were failing but regular functions (in libc.so) worked. I also found that the busybox binary had the symbol "__sF" and so did libc.so. Normally, the binary would have an undefined reference that would be satisfied at runtime linking.

The "__sF" symbol is the array of FILE structures behind the stdin, stdout, and stderr macros. Inline functions would get messed up by the binary's copy of the symbol, while the functions inside libc.so were not affected. That brings up the next question: how did that symbol get in the binary? I went through all the .a and .o files and only found undefined references (as expected). I eventually found that it depended on which busybox applets I built into the binary.

Since the problem was triggered by one (or more) applets, I used a binary search to find one of the applets causing it. It turned out that only one applet was responsible: "diff". Next came a binary search of the functions inside the diff applet, which narrowed it down to one line of code: "FILE *fp[2] = { stdout, stdin };". Assigning the elements at run time, instead of using an initializer that the dynamic linker has to relocate, works around the issue: "FILE *fp[2]; fp[0] = stdout; fp[1] = stdin;"
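The workaround, written out as a small helper. This is a sketch of the pattern only; the function name init_streams is hypothetical and does not come from busybox.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fill fp[] at run time. Each element is assigned by ordinary code, so
 * the compiler does not need a pre-built table of stdio addresses for
 * the initializer list.
 */
void init_streams(FILE *fp[2])
{
    assert(fp != NULL);
    fp[0] = stdout;
    fp[1] = stdin;
}
```

A caller declares "FILE *fp[2];" without an initializer and then calls init_streams(fp).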

A comparison of objdump between the broken code and the working code:

$ aobjdump -R puts | grep __sF
00009538 R_ARM_GLOB_DAT    __sF
0000953c R_ARM_COPY        __sF
$ aobjdump -R puts-ok | grep __sF
00009534 R_ARM_GLOB_DAT    __sF
$ aobjdump -T puts | grep __sF
0000953c g    DO .bss	000000fc __sF
$ aobjdump -T puts-ok | grep __sF
00000000      DO *UND*	00000000 __sF

Also, comparing the intermediate forms (agcc -c X.c):

$ aobjdump -r puts.o

puts.o:     file format elf32-littlearm

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000090 R_ARM_PLT32       __swbuf
000000f0 R_ARM_GOTPC       _GLOBAL_OFFSET_TABLE_
000000f4 R_ARM_REL32       .data.rel.ro
000000f8 R_ARM_GOT32       __sF


RELOCATION RECORDS FOR [.data.rel.ro]:
OFFSET   TYPE              VALUE 
00000000 R_ARM_ABS32       __sF
00000004 R_ARM_ABS32       __sF

$ aobjdump -r puts-ok.o

puts-ok.o:     file format elf32-littlearm

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000090 R_ARM_PLT32       __swbuf
000000f8 R_ARM_GOTPC       _GLOBAL_OFFSET_TABLE_
000000fc R_ARM_GOT32       __sF

Some further comparison between data relocation vs run-time reference:

# gdb ./print
...
(gdb) info shared
From        To          Syms Read   Shared Object Library
0xb0001000  0xb00090e8  Yes (*)     /system/bin/linker
0xafd0a720  0xafd38908  Yes (*)     /system/lib/libc.so
(*): Shared library is missing debugging information.
(gdb) cont
Continuing.
9500 =? 9500
Program exited with code 015.
(gdb) quit
# gdb ./print-ok
...
(gdb) info shared
From        To          Syms Read   Shared Object Library
0xb0001000  0xb00090e8  Yes (*)     /system/bin/linker
0xafd0a720  0xafd38908  Yes (*)     /system/lib/libc.so
(*): Shared library is missing debugging information.
(gdb) cont
Continuing.
afd424f8 =? afd424f8

This shows that in the broken build the stdout macro (and the __sF symbol) refer to the wrong location. In print, stdout points to 0x9500, which lies in the bss segment of the executable. In print-ok, it points into libc.so's memory, where it belongs.
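The sources of print.c and print-ok.c are not included in these notes. The following is a hypothetical reconstruction of that kind of check, not the original program: it prints the stdout address as seen through an initializer list and through the macro directly. The function name print_stdout_views is made up, and it uses %p, so the output format differs slightly from the runs above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Print the address of stdout as stored by an initializer list and as
 * given by the stdout macro. Returns 1 if both views agree, 0 otherwise.
 */
int print_stdout_views(void)
{
    FILE *fp[2] = { stdout, stdin };   /* same form as the line found in diff */
    void *from_table = (void *)fp[0];
    void *from_macro = (void *)stdout;

    int n = printf("%p =? %p\n", from_table, from_macro);
    assert(n > 0);                     /* printf returns a negative value on error */

    return from_table == from_macro;
}
```

The two values can agree and still be wrong, as in the broken run above; what matters is whether the printed address falls inside the executable or inside libc.so.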