Worknotes for the Android ports project

1/5/2014 - math symbols in libgcc.a sneaking into dynamic libraries

When linking for Android, special attention has to be paid to the symbols in libgcc.a, especially when using libgnustl. When source is compiled without hardware math support, the compiler generates calls to various __aeabi_ routines. This can lead to a problem if you link against a dynamic library that has these symbols (copied from libgcc.a) and then run the binary against a copy of that library that lacks them. You'll get an error message as follows:

CANNOT LINK EXECUTABLE: cannot locate symbol "__aeabi_d2uiz" referenced by "iperf"...
Adding -Wl,-y__aeabi_d2uiz (trace the symbol __aeabi_d2uiz) to the LDFLAGS to find out where this is coming from, I get the output:
/home/admin/droid/lib/libstdc++/libs/armeabi//libgnustl_shared.so: definition of __aeabi_d2uiz
/home/admin/droid/android-ndk-r9c/toolchains/arm-linux-androideabi-4.8/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.8/libgcc.a(_fixunsdfdi.o): reference to __aeabi_d2uiz
You can see here that the symbol is assumed to come from libgnustl_shared.so. But if you have the armeabi-v7a version of libgnustl_shared.so, it does not have a copy of __aeabi_d2uiz. Looking at the symbols in the libraries:
# nm -D ~/droid/android-ndk/sources/cxx-stl/gnu-libstdc++/4.6/libs/armeabi-v7a/libgnustl_shared.so | grep __aeabi_ | wc -l
23
# nm -D ~/droid/android-ndk/sources/cxx-stl/gnu-libstdc++/4.6/libs/armeabi/libgnustl_shared.so | grep __aeabi_ | wc -l
52
This is because armv7a has hardware math support and does not need these routines from libgcc.a.
Summary: make sure your running environment matches your cross-compile environment.

1/17/2012 - the dynamic linker on ice cream sandwich

Ice Cream Sandwich (ICS) has changed its environment, and using a linker compiled against the Gingerbread (GB) environment on ICS causes a segmentation fault:

#0  0xb00099ac in __set_errno (n=2) at bionic/libc/bionic/__set_errno.c:34
#1  0xb00099c4 in __set_syscall_errno (n=)
    at bionic/libc/bionic/__set_errno.c:51
#2  0xb0002110 in _open_lib (name=0xbed74934 "/vendor/lib/libc.so")
    at linker.c:609
#3  0xb0002248 in open_library (name=0x84b5 "libc.so") at linker.c:638
The linker is statically linked (for obvious reasons) and __set_errno gets copied into the linker from libc.a at build time. Updating the NDK to r7 (from r5b) fixes this problem.

7/24/2011 - the dynamic linker, /system/bin/linker

Not much has been written about bionic's dynamic linker (that I can find). It supports the environment variable LD_LIBRARY_PATH, but does not support a ld.so.conf file or an rpath built into the binary. This makes it hard to add shared libraries, as the /system/lib directory is mounted read-only on most (all?) phones. So, changing the dynamic linker is needed in order to change the library search path (using the environment variable LD_LIBRARY_PATH is possible, but does not seem very reliable).

The change to the library path is very simple. I've added /data/local/lib to the library search path in my copy of the dynamic linker. To tell gcc to use the new path to the dynamic linker, pass the flag "-Wl,-dynamic-linker,/data/local/bin/linker". I've verified that the bind binaries work with their shared libraries in /data/local/lib, and it has reduced the install size of the bind-utils package from 17.3 MB to 5.5 MB.

7/21/2011 - Resolver library

Changing the DNS server at runtime for a single process (through the _res global object) is highly discouraged on bionic, to the point where there is no global object named _res. I took the resolver code from dietlibc and gave it non-conflicting function names in order to get the nslookup applet in busybox working.

7/7/2011 - Busybox

I want to port busybox to Android's libc because the busybox ports I've seen are statically compiled against glibc or uclibc. This means no user/group info and no DNS configuration. No DNS is a problem for tools like wget, ping, and traceroute.

After filling in the missing functions and other porting work, I was having trouble with output/input not showing up or otherwise acting strange. After doing some investigation on my busybox binary, I found that inline functions (from stdio.h) were failing but regular functions (in libc.so) worked. I also found that the busybox binary had the symbol "__sF" and so did libc.so. Normally, the binary would have an undefined reference that would be satisfied at runtime linking.

The "__sF" symbol is the array of files behind the stdin, stdout, and stderr macros. Inline functions would get messed up by the binary's copy of the symbol, while libc functions would keep using the copy inside libc.so. That brings up the next question: how did that symbol get into the binary? I went through all the .a and .o files and only found undefined references (as expected). I eventually found that it depended on which busybox applets I built into the binary.

Since it was a problem triggered by one (or more) applets, I used a binary search to find one of the applets that was causing the problem. It turns out there was only one applet causing the problem, "diff". So next up was a binary search of the functions inside the diff applet. And it turns out it was one line of code: "FILE *fp[2] = { stdout, stdin };". Changing the code to assign the values at runtime, instead of having them initialized by the dynamic linker, works around the issue: "FILE *fp[2]; fp[0] = stdout; fp[1] = stdin;"

A comparison of objdump between the broken code and the working code:

$ aobjdump -R puts | grep __sF
00009538 R_ARM_GLOB_DAT    __sF
0000953c R_ARM_COPY        __sF
$ aobjdump -R puts-ok | grep __sF
00009534 R_ARM_GLOB_DAT    __sF
$ aobjdump -T puts | grep __sF
0000953c g    DO .bss	000000fc __sF
$ aobjdump -T puts-ok | grep __sF
00000000      DO *UND*	00000000 __sF
Also, comparing the intermediate forms (agcc -c X.c):
$ aobjdump -r puts.o

puts.o:     file format elf32-littlearm

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000090 R_ARM_PLT32       __swbuf
000000f0 R_ARM_GOTPC       _GLOBAL_OFFSET_TABLE_
000000f4 R_ARM_REL32       .data.rel.ro
000000f8 R_ARM_GOT32       __sF


RELOCATION RECORDS FOR [.data.rel.ro]:
OFFSET   TYPE              VALUE 
00000000 R_ARM_ABS32       __sF
00000004 R_ARM_ABS32       __sF

$ aobjdump -r puts-ok.o

puts-ok.o:     file format elf32-littlearm

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000090 R_ARM_PLT32       __swbuf
000000f8 R_ARM_GOTPC       _GLOBAL_OFFSET_TABLE_
000000fc R_ARM_GOT32       __sF

Some further comparison between data relocation vs run-time reference:

# gdb ./print
...
(gdb) info shared
From        To          Syms Read   Shared Object Library
0xb0001000  0xb00090e8  Yes (*)     /system/bin/linker
0xafd0a720  0xafd38908  Yes (*)     /system/lib/libc.so
(*): Shared library is missing debugging information.
(gdb) cont
Continuing.
9500 =? 9500
Program exited with code 015.
(gdb) quit
# gdb ./print-ok
...
(gdb) info shared
From        To          Syms Read   Shared Object Library
0xb0001000  0xb00090e8  Yes (*)     /system/bin/linker
0xafd0a720  0xafd38908  Yes (*)     /system/lib/libc.so
(*): Shared library is missing debugging information.
(gdb) cont
Continuing.
afd424f8 =? afd424f8

From this, you can see that the stdout macro (and __sF symbol) are referring to the wrong location. With print.c, it points to 0x9500, which is the bss segment of the executable. With print-ok.c, it points into the libc.so memory location, where it belongs.