NXFLAT
Last Updated: October 1, 2017
1.0 Overview
1.1 Functionality
NXFLAT is a customized and simplified version of a binary format called XFLAT that was implemented a few years ago. With the NXFLAT binary format you will be able to do the following:
- Place separately linked programs in a file system, and
- Execute those programs by dynamically linking them to the base NuttX code.
This allows you to extend the NuttX base code after it has been written into FLASH. One motivation for implementing NXFLAT is to support clean CGI under an HTTPD server.
This feature is especially attractive when combined with the NuttX ROMFS support: ROMFS allows you to execute programs in place (XIP) in flash without copying anything other than the .data section to RAM. In fact, the initial NXFLAT release only worked on ROMFS. Later extensions also support executing NXFLAT binaries from a copy in SRAM.
This NuttX feature includes:
- A dynamic loader that is built into the NuttX core (See GIT).
- Minor changes to the RTOS to support position independent code, and
- A linker to bind ELF binaries to produce the NXFLAT binary format (See GIT).
1.2 Background
NXFLAT is derived from XFLAT. XFLAT is a toolchain add-on that provides full shared library and XIP executable support for processors that have no Memory Management Unit (MMU [1]). NXFLAT is greatly simplified for the deeply embedded environment targeted by NuttX:
- NXFLAT does not support shared libraries, because
- NXFLAT does not support exportation of symbol values from a module
Rather, the NXFLAT module only imports symbol values. In the NXFLAT model, the (PIC [2]) NXFLAT module resides in a FLASH file system and, when it is loaded at run time, it is dynamically linked only to the (non-PIC) base NuttX code: The base NuttX exports a symbol table; the NXFLAT module imports those symbol values to dynamically bind the module to the base code.
[1] MMU: "Memory Management Unit"
[2] PIC: "Position Independent Code"
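The import-only binding described in this section can be pictured with a minimal C sketch. This is not the NuttX loader: the structure names, the find_export() and bind_imports() helpers, and the return values are all hypothetical, chosen only to illustrate how a module's imported names might be resolved against a table exported by the base code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical exported symbol: a name and the address it resolves to. */

struct export_sym
{
  const char *name;
  const void *value;
};

/* Hypothetical import slot in a module: the wanted name and the address
 * that the loader fills in at bind time.
 */

struct import_sym
{
  const char *name;
  const void *value;
};

/* Look up one name in the exported table; NULL if it is not exported. */

static const void *find_export(const struct export_sym *exports,
                               size_t nexports, const char *name)
{
  size_t i;

  for (i = 0; i < nexports; i++)
    {
      if (strcmp(exports[i].name, name) == 0)
        {
          return exports[i].value;
        }
    }

  return NULL;
}

/* Resolve every import.  Returns 0 on success or -1 if any symbol is
 * missing (an assumed convention for this sketch only).
 */

static int bind_imports(struct import_sym *imports, size_t nimports,
                        const struct export_sym *exports, size_t nexports)
{
  size_t i;

  for (i = 0; i < nimports; i++)
    {
      imports[i].value = find_export(exports, nexports, imports[i].name);
      if (imports[i].value == NULL)
        {
          return -1;
        }
    }

  return 0;
}
```

Note that nothing flows in the other direction: the module has no table of its own for the base code to search, which mirrors the statement that NXFLAT modules import symbol values but never export them.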
1.3 Limitations
- ROMFS (or RAM mapping) Only
The current NXFLAT release will work only with either (1) NXFLAT executable modules residing on a ROMFS file system, or (2) executables residing on other file systems provided that CONFIG_FS_RAMMAP is defined. This limitation exists because the loader depends on the capability to mmap() the code segment. See the NuttX User Guide for further information.
NuttX does not provide any general kind of file mapping capability. In fact, true file mapping is only possible with MCUs that provide an MMU [1]. Without an MMU, a file system may support eXecution In Place (XIP) to mimic file mapping. Only the ROMFS file system supports the kind of XIP execution needed by NXFLAT.
It is also possible to simulate file mapping by allocating memory, copying the NXFLAT binary file into memory, and executing from the copy of the executable file in RAM. That capability can be enabled with the CONFIG_FS_RAMMAP configuration option. With that option enabled, NXFLAT will work with other kinds of file systems but will require copying all NXFLAT executables to RAM.
- GCC/ARM/Cortex-M3/4 Only
At present, the NXFLAT toolchain is only available for ARM and Cortex-M3/4 (thumb2) targets.
- Read-Only Data in RAM
With older GCC compilers (at least up to 4.3.3), read-only data must reside in RAM. In code generated by GCC, all data references are indexed by the PIC [2] base register (that is usually R10 or sl for the ARM processors). This includes read-only data (.rodata). Embedded firmware developers normally like to keep .rodata in FLASH with the code sections. But because all data is referenced with the PIC base register, all of that data must lie in RAM. An NXFLAT change to work around this is under investigation [3].
With newer GCC compilers (at least from 4.6.3), read-only data is no longer GOT-relative, but is now accessed PC-relative. With PC-relative addressing, read-only data must reside in the I-Space.
- Globally Scoped Function Pointers
If a function pointer is taken to a statically defined function, then (at least for ARM) GCC will generate a relocation that NXFLAT cannot handle. The workaround is to make all such functions global in scope. A fix would involve a change to the GCC compiler as described in Appendix B.
- Special Handling of Callbacks
Callbacks through function pointers must be avoided or, when they cannot be avoided, handled very specially. The reason for this is that the PIC module requires setting of a special value in a PIC register. If the callback does not set the PIC register, then the called back function will fail because it will be unable to correctly access data memory. Special logic is in place to handle some NuttX callbacks: signal callbacks and watchdog timer callbacks. But other callbacks (like those used with qsort()) must be avoided in an NXFLAT module.
[1] MMU: "Memory Management Unit"
[2] PIC: "Position Independent Code"
[3] A workaround is under consideration: At run time, the .rodata offsets will be indexed by a RAM address. If the dynamic loader were to offset those .rodata offsets properly, it still might be possible to reference .rodata in ROM. That workaround is still a topic of investigation at this time.
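Because qsort() calls back into the module through a function pointer, one way to stay within the callback limitation above is to sort with code that performs the comparison inline. The sketch below is an illustration only; sort_ints() is a hypothetical helper and not part of NuttX.

```c
#include <stddef.h>

/* Insertion sort with the comparison written inline, so that no function
 * pointer is ever handed to code outside of the module.
 */

void sort_ints(int *values, size_t count)
{
  size_t i;

  for (i = 1; i < count; i++)
    {
      int key = values[i];
      size_t j = i;

      /* Shift larger elements one slot to the right */

      while (j > 0 && values[j - 1] > key)
        {
          values[j] = values[j - 1];
          j--;
        }

      values[j] = key;
    }
}
```

The trade-off is a less general routine (one element type, one ordering) in exchange for never depending on the caller of a callback to set the PIC register.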
1.4 Supported Processors
As mentioned above, the NXFLAT toolchain is only available for ARM and Cortex-M3 (thumb2) targets. Furthermore, NXFLAT has only been tested on the Eagle-100 LM3S6918 Cortex-M3 board.
1.5 Development Status
The initial release of NXFLAT was made in NuttX version 0.4.9. Testing is limited to the tests found under apps/examples/nxflat in the source tree. Some known problems exist (see the TODO list). As such, NXFLAT is currently in an early alpha phase.
2.0 NXFLAT Toolchain
2.1 Building the NXFLAT Toolchain
In order to use NXFLAT, you must use special NXFLAT tools to create the binary module in FLASH. To do this, you will need to download the buildroot package and build it on your Linux or Cygwin machine. The buildroot can be downloaded from Bitbucket.org. You will need version 0.1.7 or later.
Here are some general build instructions:
- You must have already configured NuttX in <some-dir>/nuttx
- Download the buildroot package buildroot-0.x.y into <some-dir>
- Unpack <some-dir>/buildroot-0.x.y.tar.gz using a command like tar zxf buildroot-0.x.y.tar.gz. This will result in a new directory like <some-dir>/buildroot-0.x.y
- Move this into position: mv <some-dir>/buildroot-0.x.y <some-dir>/buildroot
- cd <some-dir>/buildroot
- Copy a configuration file into the top buildroot directory: cp boards/abc-defconfig-x.y.z .config
- Enable building of the NXFLAT tools with make menuconfig. Select to build the NXFLAT toolchain with GCC (you can also omit building GCC and build only the NXFLAT toolchain for use with your own GCC toolchain).
- Make the toolchain: make. When the make completes, the tool binaries will be available under <some-dir>/buildroot/build_abc/staging_dir/bin
2.2 mknxflat
mknxflat is used to build a thunk file. See below for usage.
    Usage: mknxflat [options] <bfd-filename>
    Where options are one or more of the following. Note that a space is always required between the option and any following arguments.
      -d Use dynamic symbol table. [symtab]
      -f <cmd-filename> Take next commands from <cmd-filename> [cmd-line]
      -o <out-filename> Output to <out-filename> [stdout]
      -v Verbose output [no output]
      -w Import weakly declared functions, i.e., weakly declared functions are expected to be provided at load-time [not imported]
2.3 ldnxflat
ldnxflat is used to link your object files along with the thunk file generated by mknxflat to produce the NXFLAT binary module. See below for usage.
    Usage: ldnxflat [options] <bfd-filename>
    Where options are one or more of the following. Note that a space is always required between the option and any following arguments.
      -d Use dynamic symbol table [Default: symtab]
      -e <entry-point> Entry point to module [Default: _start]
      -o <out-filename> Output to <out-filename> [Default: <bfd-filename>.nxf]
      -s <stack-size> Set stack size to <stack-size> [Default: 4096]
      -v Verbose output. If -v is applied twice, additional debug output is enabled [Default: no verbose output].
2.4 mksymtab
There is a small helper program available in nuttx/tools called mksymtab. mksymtab can be used to generate symbol tables for the NuttX base code that would be usable by the typical NXFLAT application. mksymtab builds symbol tables from comma-separated value (CSV) files. In particular, the CSV files:
- nuttx/syscall/syscall.csv that describes the NuttX RTOS interface,
- nuttx/libc/libc.csv that describes the NuttX C library interface, and
- nuttx/libc/math.csv that describes any math library.
    USAGE: ./mksymtab <cvs-file> <symtab-file>
    Where:
      <cvs-file>   : The path to the input CSV file
      <symtab-file>: The path to the output symbol table file
      -d           : Enable debug output
For example,
    cd nuttx/tools
    cat ../syscall/syscall.csv ../libc/libc.csv | sort >tmp.csv
    ./mksymtab.exe tmp.csv tmp.c
2.5 Making an NXFLAT module
Below is a snippet from an NXFLAT make file (simplified from the NuttX Hello, World! example).
Target | Rule | Command |
---|---|---|
Target 1 | hello.r1: hello.o | abc-nuttx-elf-ld -r -d -warn-common -o $@ $^ |
Target 2 | hello-thunk.S: hello.r1 | mknxflat -o $@ $^ |
Target 3 | hello.r2: hello-thunk.S | abc-nuttx-elf-ld -r -d -warn-common -T binfmt/libnxflat/gnu-nxflat-gotoff.ld -no-check-sections -o $@ hello.o hello-thunk.o |
Target 4 | hello: hello.r2 | ldnxflat -e main -s 2048 -o $@ $^ |
Target 1. This target links all of the module's object files together into one relocatable object. Two relocatable objects will be generated; this is the first one (hence, the suffix .r1). In this "Hello, World!" case, there is only a single object file, hello.o, that is linked to produce the hello.r1 object.
When the module's object files are compiled, some special compiler CFLAGS must be provided. First, the option -fpic is required to tell the compiler to generate position independent code (other GCC options, like -fno-jump-tables, might also be desirable). For ARM compilers, two additional compilation options are required: -msingle-pic-base and -mpic-register=r10.
Target 2. Given the hello.r1 relocatable object, this target will invoke mknxflat to make the thunk file, hello-thunk.S. This thunk file contains all of the information needed to create the imported function list.
Target 3. This target is similar to Target 1. In this case, it will link together the module's object files (only hello.o here) along with the assembled thunk file, hello-thunk.o, to create the second relocatable object, hello.r2. The linker script, gnu-nxflat-gotoff.ld, is required at this point to correctly position the sections. This linker script produces two segments: an I-Space (Instruction Space) segment containing mostly .text and a D-Space (Data Space) segment containing the .got, .data, and .bss sections. The I-Space segment must be origined at address 0 (so that the segment's addresses are really offsets into the I-Space segment) and the D-Space segment must also be origined at address 0 (so that the segment's addresses are really offsets into the D-Space segment). The option -no-check-sections is required to prevent the linker from failing because these segments overlap.
NOTE: There are two linker scripts located at binfmt/libnxflat/.
- binfmt/libnxflat/gnu-nxflat-gotoff.ld. Older versions of GCC (at least up to GCC 4.3.3) use GOT-relative addressing to access RO data. In that case, read-only data (.rodata) must reside in D-Space and this linker script should be used.
- binfmt/libnxflat/gnu-nxflat-pcrel.ld. Newer versions of GCC (at least as of GCC 4.6.3) use PC-relative addressing to access RO data. In that case, read-only data (.rodata) must reside in I-Space and this linker script should be used.
Target 4. Finally, this target will use the hello.r2 relocatable object to create the final NXFLAT module, hello, by executing ldnxflat.
3.0 Binary Loader APIs
Relevant Header Files:
- The interface to the binary loader is described in the header file include/nuttx/binfmt/binfmt.h. A brief summary of the APIs prototyped in that header file is listed below.
- NXFLAT APIs needed to register NXFLAT as a binary loader appear in the header file include/nuttx/binfmt/nxflat.h.
- The format of an NXFLAT object itself is described in the header file include/nuttx/binfmt/nxflat.h.
binfmt Registration. These first interfaces are used only by a binary loader module, such as NXFLAT itself. NXFLAT (or any future binary loader) calls register_binfmt() to incorporate itself into the system. In this way, the binary loader logic is dynamically extensible to support any kind of loader. Normal application code should not be concerned with these interfaces.
int register_binfmt(FAR struct binfmt_s *binfmt)
Description: Register a loader for a binary format
Returned Value: This is a NuttX internal function so it follows the convention that 0 (OK) is returned on success and a negated errno is returned on failure.
int unregister_binfmt(FAR struct binfmt_s *binfmt)
Description: Unregister a loader for a binary format
Returned Value: This is a NuttX internal function so it follows the convention that 0 (OK) is returned on success and a negated errno is returned on failure.
Binary Loader Interfaces. The remaining APIs are called by user applications to maintain modules in the file system.
int load_module(FAR struct binary_s *bin)
Description: Load a module into memory, bind it to an exported symbol table, and prepare the module for execution.
Returned Value: This is a NuttX internal function so it follows the convention that 0 (OK) is returned on success and a negated errno is returned on failure.
int unload_module(FAR struct binary_s *bin)
Description: Unload a (non-executing) module from memory. If the module has been started (via exec_module()), calling this will be fatal.
Returned Value: This is a NuttX internal function so it follows the convention that 0 (OK) is returned on success and a negated errno is returned on failure.
int exec_module(FAR const struct binary_s *bin)
Description: Execute a module that has been loaded into memory by load_module().
Returned Value: This is a NuttX internal function so it follows the convention that 0 (OK) is returned on success and a negated errno is returned on failure.
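The calling order implied by these descriptions can be sketched as follows. To keep the example self-contained outside of NuttX, struct binary_s and the three loader functions are replaced by stand-ins: the filename field and the stub bodies are assumptions made for illustration, and run_module() is a hypothetical wrapper. Only the prototypes and the 0 / negated errno convention are taken from the text above.

```c
#include <errno.h>
#include <stddef.h>

#define FAR /* NuttX pointer qualifier; assumed empty for this sketch */

/* Stand-in for the real struct binary_s from
 * include/nuttx/binfmt/binfmt.h.  The single field is an assumption made
 * so that the sketch compiles on its own.
 */

struct binary_s
{
  FAR const char *filename; /* Assumed: path to the module */
};

/* Stub loader functions using the prototypes listed above.  A real build
 * would link against the NuttX implementations instead.
 */

int load_module(FAR struct binary_s *bin)
{
  return bin->filename != NULL ? 0 : -ENOENT;
}

int exec_module(FAR const struct binary_s *bin)
{
  (void)bin;
  return 0;
}

int unload_module(FAR struct binary_s *bin)
{
  (void)bin;
  return 0;
}

/* Hypothetical wrapper: load a module, start it, and unload it only if
 * it never started.
 */

int run_module(FAR struct binary_s *bin)
{
  int ret;

  ret = load_module(bin);
  if (ret < 0)
    {
      return ret; /* Negated errno from the loader */
    }

  ret = exec_module(bin);
  if (ret < 0)
    {
      /* The module was never started, so unloading it is safe */

      (void)unload_module(bin);
    }

  return ret;
}
```

The wrapper deliberately does not call unload_module() after a successful exec_module(), since the description above warns that unloading a started module is fatal.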
Appendix A. No GOT Operation
When GCC generates position independent code, new code sections will appear in your programs. One of these is the GOT (Global Offset Table) and, in ELF environments, another is the PLT (Procedure Linkage Table). For example, if your C code generated (ARM) assembly language like this without PIC:
            ldr     r1, .L0         @ Fetch the offset to 'x'
            ldr     r0, [r10, r1]   @ Load the value of 'x' with PIC offset
            ...
    .L0:    .word   x
Then when PIC is enabled (say with the -fpic compiler option), it will generate code like this:
            ldr     r1, .L0         @ Fetch the offset to the GOT entry
            ldr     r1, [r10, r1]   @ Fetch the (relocated) address of 'x' from the GOT
            ldr     r0, [r1, #0]    @ Fetch the value of 'x'
            ...
    .L0:    .word   x(GOT)          @ Offset to the entry in the GOT
See reference
Notice that this generates an extra level of indirection through the GOT. This indirection is not needed by NXFLAT and only adds more RAM usage and execution time.
NXFLAT (like XFLAT) can work even better without the GOT. Patches against older versions of GCC exist to eliminate the GOT indirections. Several are available here if you are inspired to port them to a new GCC version.
Appendix B. PIC Text Workaround
There is a problem with the memory model in GCC that prevents it from being used in the way that the NXFLAT context requires. The problem is that the GCC PIC model assumes that the executable lies in a flat, contiguous (virtual) address space like:
Virtual |
---|
.text |
.got |
.data |
.bss |
It assumes that the PIC base register (usually r10 for ARM) points to the base of .text so that any address in .text, .got, .data, or .bss can be found with an offset from the same base address. But that is not the memory arrangement that we need in the XIP embedded environment. We need two memory regions, one in FLASH containing shared code and one per task in RAM containing task-specific data:
| Flash | RAM |
|---|---|
| .text | .got |
| | .data |
| | .bss |
The PIC base register needs to point to the base of the .got, and only addresses in the .got, .data, and .bss sections can be accessed as an offset from the PIC base register. See also this XFLAT discussion.
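The two-region arrangement can be modeled in plain C as a rough analogy. In this sketch, struct dspace and counter_increment() are hypothetical names: the structure stands in for one task's RAM copy of .got/.data/.bss, and the base argument plays the role of the PIC base register. It is a model of the idea, not of the code that GCC actually emits.

```c
#include <stddef.h>

/* Hypothetical per-task data region: stands in for the .got/.data/.bss
 * copy that each task owns in RAM.
 */

struct dspace
{
  int counter; /* A variable that would live in .data */
};

/* Shared "code": it never names a fixed data address.  Every data access
 * is an offset from 'base', which plays the role of the PIC base
 * register.
 */

int counter_increment(void *base)
{
  int *counter = (int *)((char *)base + offsetof(struct dspace, counter));

  *counter += 1;
  return *counter;
}
```

Calling counter_increment() with two different struct dspace instances updates each one independently, which is the property that lets a single copy of .text in FLASH serve several tasks.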
Patches against older versions of GCC exist to correct this GCC behavior. Several are available here if you are inspired to port them to a new GCC version.