Eventually, this chapter will describe the ARC kernel in detail. However, at this time it is still under construction, and is sketchy in many places. Over time it should improve.
The ARC kernel is the part of the ARC development system which runs on the embedded processor. The kernel boots the system and provides a user-interaction console, a remote downloading and debugging interface, a multitasker, and libraries which can be shared by user programs. The full system currently runs on Motorola 68332 processors. A partial port (not including a multitasker or boot code) also runs on Intel 386, 486, and Pentium processors.
In creating the ARC kernel, I sought to provide powerful debugging support and multitasking control with a minimum of system overhead in order to facilitate real-time applications. I also sought to allow multiple processes to use shared resources, like the serial port, in a clear and efficient manner.
The kernel performs system services and facilitates the loading and running of programs, which are key features of any operating system. However, unlike most other operating systems, it does not put up any firewalls that isolate user programs from access to the hardware or the kernel internals. When a user program is compiled, it is linked to an image of the kernel. The user program therefore has direct access to the kernel globals and functions.
This approach is more efficient at runtime, and allows more complete code sharing and lower memory usage, than the standard way operating systems are implemented. However, it also has some disadvantages. Because a user program has complete access to the operating system, it can more easily corrupt the operating system's state, and the problems of version skew become worse.
The standard way for an operating system to provide services to client programs is for there to be some library code linked into the user program which sets up and performs a trap (a type of interrupt). The trap allows the operating system to take control of the processor in a well defined manner and provides a way for the user code to interact with the operating system without having access to its internals.
However, the overhead for performing a trap is much greater than the overhead for a normal function call. I believe that the added security of using traps is not worth the extra time expense when the goal is to make good real-time systems, as was my goal in designing the ARC kernel.
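The difference between the two calling styles can be sketched in portable C. This is only an illustration: no real trap is performed, and every name here (kernel_putc_direct, service_dispatch, user_putc_via_trap, the service number) is hypothetical rather than part of the ARC kernel. The dispatch function stands in for a trap handler, which has to decode a service number and unpack arguments before doing the same work a direct call does immediately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel state: a small buffer standing in for the serial port. */
static char   out_buf[64];
static size_t out_len = 0;

/* Direct style: the user program is linked against the kernel image
   and calls the kernel function like any other function. */
int kernel_putc_direct(char c)
{
    if (out_len >= sizeof out_buf)
        return -1;              /* buffer full */
    out_buf[out_len++] = c;
    return 0;
}

/* Trap style (simulated): the caller packs a service number and an
   argument, and one entry point decodes them. On real hardware the
   entry would also cost an exception stack frame and a mode switch. */
enum { SVC_PUTC = 0 };

struct svc_request {
    int  number;
    long arg;
};

int service_dispatch(const struct svc_request *req)
{
    assert(req != NULL);
    switch (req->number) {
    case SVC_PUTC:
        return kernel_putc_direct((char)req->arg);
    default:
        return -1;              /* unknown service */
    }
}

/* Library stub a user program would call in the trap-based design. */
int user_putc_via_trap(char c)
{
    struct svc_request req = { SVC_PUTC, c };
    return service_dispatch(&req);
}

/* Number of characters written so far, by either route. */
size_t kernel_output_length(void)
{
    return out_len;
}
```

Both routes end in the same kernel function. The trap-style route adds the packing and decoding steps on every call, and in exchange it never exposes the kernel's internals to the caller.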
Also, because the ARC kernel is multitasking and the user interaction and debugging processes are independent of any user processes, the supervisory processes remain available to help you debug a user process even after it crashes. Without hardware memory protection, you could not really do better than that (unless you store the interrupt table in ROM), even if the only interaction between the user processes and the operating system were through traps: there is no way to stop a user program from overwriting the interrupt vector table, which could bring the operating system to its knees under either method.
On reset, the kernel sets up the necessary system resources. It initializes the chip selects, globals, and serial port; checks the checksums on the memory map and persistent table; sets up multitasking, the TPU, the memory manager, the console process, and the gdb process; then checks the memory map for programs which wish to be run on reset. After this, the console, gdb, and user programs are free to be used.
Because ARC is intended to primarily run on embedded controllers, "memory management" refers only to the way the kernel manages the contents of the RAM. Blocks cannot be swapped out to disk because there is no disk. Because of this, the memory management facilities are fine-grained. They are also geared towards allowing programs to persist across reset and power cycling so long as the RAM is not unduly corrupted.
The memory management consists of two modules -- the memory map, which is relatively coarse-grained and persists across resets, and malloc, which is fine-grained and does not persist across resets.
The memory map keeps track of the name and type of data contained in the memory. It can manage multiple devices, and keep track of whether each device is RAM or ROM. It is not currently possible for users to add their own devices to the memory map, but I'm working on it. However, the current kernel does automatically detect the size of the RAM and ROM in the system and bounds the memory map appropriately.
When the board is reset, the kernel checks whether the memory map is still valid. It determines this by comparing the stored checksum against a freshly computed one. If the stored memory map is valid, the kernel uses it. If it is not valid, the kernel creates a new memory map with default sizing for the system tables and heap. It also searches through ROM and RAM to find programs which are still there and valid, and adds those to the map.
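The reset-time check can be sketched as follows. This is a minimal illustration under stated assumptions: the structure layout, the function names, and the byte-sum checksum are all hypothetical, since the actual ARC memory map format and checksum algorithm are not described here. The default map contains only a single placeholder entry, and the search for surviving programs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_MAX_BLOCKS 8

/* Hypothetical layout; the real memory map format is not shown here. */
struct map_block {
    uint32_t start;
    uint32_t end;               /* inclusive end address */
    char     name[16];
};

struct memory_map {
    uint32_t         count;
    struct map_block blocks[MAP_MAX_BLOCKS];
    uint32_t         checksum;  /* covers every field above it */
};

/* Assumed algorithm: plain sum of the bytes before the checksum field. */
uint32_t map_checksum(const struct memory_map *m)
{
    const unsigned char *p = (const unsigned char *)m;
    size_t n = offsetof(struct memory_map, checksum);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

int map_is_valid(const struct memory_map *m)
{
    return m->count <= MAP_MAX_BLOCKS && m->checksum == map_checksum(m);
}

/* Build a fresh map with one placeholder block and store its checksum. */
void map_reset_to_defaults(struct memory_map *m)
{
    memset(m, 0, sizeof *m);
    m->count = 1;
    m->blocks[0].start = 0x100000;
    m->blocks[0].end   = 0x1003ff;
    strcpy(m->blocks[0].name, "Vector Table");
    m->checksum = map_checksum(m);
}

/* Returns 1 if the stored map was kept, 0 if it had to be rebuilt. */
int map_check_on_reset(struct memory_map *m)
{
    assert(m != NULL);
    if (map_is_valid(m))
        return 1;
    map_reset_to_defaults(m);
    return 0;
}
```

A plain byte sum is used only to keep the sketch short. It accepts an all-zero region as a valid empty map, so a real implementation would likely want something stronger.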
Because the kernel and the user programs will potentially be sharing a small amount of memory, blocks may be specified to start at any address and may be of any size. Use the memmap console command to view the memory map.
A typical memory map looks like this:
Start   End     Size    Type          Name           Device
000000  012393  012394  Code          vestaboot.919  ROM
012394  0125c3  000230  Data          Kernel inits   ROM
100000  1003ff  000400  Vector Table  Vector Table   RAM
100400  100dff  000a00  Memory Map    Memory Map     RAM
100e00  102603  001804  Data          Persistents    RAM
102f00  105a9f  002ba0  Data          Kernel data    RAM
105aa0  105e9f  000400  Data          GDB Block      RAM
106000  1062ef  0002f0  User Code     mobot-vision   RAM
1310a8  13f7ff  00e758  Heap          Heap           RAM
13f800  13ffff  000800  Stacks        Root stack     RAM
Blocks which are not being used for programs, data, or system tables are used by malloc. Therefore, if you want to use a block of data, you should first reserve it by using the memblock console command (see section Viewing memory usage).
The malloc module manages blocks from the memory map, and partitions them up when malloc(), realloc(), or free() are called.
The "Heap" block in the memory map is always reserved for use by the malloc module and you are prevented from downloading a program in the block reserved for it. However, to achieve maximum usage of a potentially small amount of memory, the malloc module incorporates unused blocks in the memory map for temporary use. Temporary use in this case refers to memory which does not persist across downloads.
The output of the `malloc' console command (done during the same session as the memory map in the previous section) shows the temporary heaps:
Heap 0x1062f0, size 175544 (0x2adb8)
  0x1062f8: Free    real size=175536
  Total free = 175536, Total used = 0
Heap 0x102604, size 2300 (0x8fc)
  0x10260c: Free    real size=  2292
  Total free = 2292, Total used = 0
Heap 0x1310a8, size 59224 (0xe758)
  0x1310b0: Alloced real size=    38 PID=0
  0x1310d6: Alloced real size=   270 PID=0
  0x1311e4: Alloced real size=    38 PID=0
  0x13120a: Alloced real size=   270 PID=0
  0x131318: Alloced real size=    42 PID=0
  0x131342: Alloced real size=    28 PID=0
  0x13135e: Alloced real size=  2062 PID=0
  0x131b6c: Alloced real size=    38 PID=0
  0x131b92: Alloced real size=   270 PID=0
  0x131ca0: Alloced real size=    42 PID=0
  0x131cca: Alloced real size=    30 PID=0
  0x131ce8: Alloced real size=  1038 PID=0
  0x1320f6: Alloced real size=    38 PID=33
  0x13211c: Alloced real size=    54 PID=33
  0x132152: Alloced real size=    98 PID=33
  0x1321b4: Free    real size=    98
  0x132216: Free    real size= 54762
  Total free = 54860, Total used = 4356
-----------------------------------------------------
Total free = 232688, Total used = 4356
The first two heaps are temporary and were allocated in holes in the memory map. The last heap, and the only one which has been used in this example, is the "Heap" block. In this example, the board has a total of 256K of memory, of which less than 30K is being used by system overhead and a small user program. Malloc is managing 230K of free memory, most of which has been recovered from blanks in the memory map.
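The way holes in the memory map turn into temporary heaps can be illustrated with a short sketch. The block addresses below are the RAM entries from the sample memory map; the structures and the find_holes function are hypothetical and are not the kernel's actual code. The function walks the blocks in address order and reports every gap between the end of one block and the start of the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct block {
    uint32_t start;
    uint32_t end;               /* inclusive end address */
};

struct hole {
    uint32_t start;
    uint32_t size;              /* in bytes */
};

/* RAM blocks from the sample memory map, in address order. */
const struct block ram_blocks[] = {
    { 0x100000, 0x1003ff },     /* Vector Table */
    { 0x100400, 0x100dff },     /* Memory Map   */
    { 0x100e00, 0x102603 },     /* Persistents  */
    { 0x102f00, 0x105a9f },     /* Kernel data  */
    { 0x105aa0, 0x105e9f },     /* GDB Block    */
    { 0x106000, 0x1062ef },     /* mobot-vision */
    { 0x1310a8, 0x13f7ff },     /* Heap         */
    { 0x13f800, 0x13ffff },     /* Root stack   */
};
#define RAM_BLOCK_COUNT (sizeof ram_blocks / sizeof ram_blocks[0])

/* Report gaps between consecutive blocks. Assumes the blocks are
   sorted by address and do not overlap. Returns the number of holes
   written to out (at most max_out). */
size_t find_holes(const struct block *blocks, size_t n,
                  struct hole *out, size_t max_out)
{
    size_t found = 0;
    assert(blocks != NULL && out != NULL);
    for (size_t i = 0; i + 1 < n; i++) {
        uint32_t gap_start  = blocks[i].end + 1;
        uint32_t next_start = blocks[i + 1].start;
        if (next_start > gap_start && found < max_out) {
            out[found].start = gap_start;
            out[found].size  = next_start - gap_start;
            found++;
        }
    }
    return found;
}
```

Run on the sample map, this finds gaps at 0x102604 (0x8fc bytes) and 0x1062f0 (0x2adb8 bytes), which match the two temporary heaps in the malloc output. It also finds a 0x160-byte gap at 0x105ea0 that does not appear there; the output does not say why, so the sketch makes no attempt to filter it.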
These optimizations have the following repercussions:
- The durable_malloc function was created (see section Malloc functions). The pointers it returns are in the main Heap and are not automatically freed (except on reset). Therefore, you should be careful when using durable_malloc to avoid creating memory leaks.
- A block of memory that you want to use directly should first be reserved with the memblock console command (see section Viewing memory usage).
Responding to asynchronous events is one of the most basic and necessary functions of a computer system. There are two fundamental approaches to dealing with asynchronous events: polling and interrupts.
In a polling scheme, you typically have a flag which is set by an asynchronous event and polled by the processor. The processor periodically checks the status of the flag. When the processor sees that the flag is set, it takes the appropriate actions (reads the data from a latch, writes data to a latch, turns on a motor, etc.) and clears the status flag. Polling works, but can take a lot of processor time if you want to respond to an event quickly.
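A polling step might look like the following sketch. The flag and latch are ordinary variables standing in for hardware registers, and all names (data_ready, data_latch, simulate_event, poll_once) are hypothetical. On a real board the flag would be set by the hardware rather than by a function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for hardware: a status flag and a data latch. They are
   volatile because something outside normal program flow sets them. */
volatile bool    data_ready = false;
volatile uint8_t data_latch = 0;

/* Simulates the asynchronous event (for example, a byte arriving). */
void simulate_event(uint8_t value)
{
    data_latch = value;
    data_ready = true;
}

/* One polling step: check the flag, service the event if it is set,
   then clear the flag. Returns true if an event was serviced. */
bool poll_once(uint8_t *out)
{
    assert(out != NULL);
    if (!data_ready)
        return false;           /* nothing happened since the last check */
    *out = data_latch;          /* the "appropriate action": read the latch */
    data_ready = false;         /* clear the status flag */
    return true;
}
```

A main loop would call poll_once between its other tasks. The worst-case response time is then set by how long the loop takes to come back around, which is why fast response by polling costs so much processor time.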
Interrupts take advantage of special hardware in a processor which allows normal processing to be preempted by asynchronous events. Using interrupts to process asynchronous events has the advantage that the processor only spends time servicing the event when it actually happens. Therefore, for the same desired response time, the CPU overhead for interrupt servicing can be much lower.
The disadvantages of using interrupts are that there is more setup involved and it is typically harder to debug interrupt routines. Therefore, it is often a good idea to start out with a polling scheme and, when you feel confident that the servicing routines work, install them on interrupts.
Typically there will be a range of possible interrupt sources, each of which is assigned a number. An interrupt vector table, located in memory, stores the address of a handler for each possible interrupt. When an interrupt occurs, the processor saves some of the current processing state (which typically includes the status flags, program counter, and interrupt source, and may include other registers or data), looks up the address of the correct interrupt handler, and resumes execution in the handler.
The handler must save any registers it will use before it starts. The handler may also need to perform some action to clear the interrupt request (this is usually the case with hardware interrupts). When the handler is done, it restores the registers to their original condition and performs a return-from-interrupt instruction to return to the processing that was preempted.
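The vector table mechanism can be mimicked in plain C with an array of function pointers. This is a simulation under stated assumptions, not 68332 code: the table size, the vector number, and all function names are hypothetical, and the state saving and return-from-interrupt that the processor and handler perform on real hardware are represented only by an ordinary function call and return.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 8           /* hypothetical table size */
#define VEC_TIMER   3           /* hypothetical interrupt source number */

typedef void (*handler_fn)(void);

/* The vector table: one handler address per interrupt source. */
static handler_fn vector_table[NUM_VECTORS];
/* Simulated request lines; a set entry means the source still asserts. */
static int pending_request[NUM_VECTORS];

int serviced_count = 0;
int spurious_count = 0;

/* Used for any source that has no handler installed. */
void default_handler(void)
{
    spurious_count++;
}

void install_handler(int vec, handler_fn fn)
{
    assert(vec >= 0 && vec < NUM_VECTORS);
    vector_table[vec] = fn;
}

int request_pending(int vec)
{
    assert(vec >= 0 && vec < NUM_VECTORS);
    return pending_request[vec];
}

/* Simulates the hardware side: assert the request, look up the
   handler for this source in the table, and transfer control to it. */
void raise_interrupt(int vec)
{
    assert(vec >= 0 && vec < NUM_VECTORS);
    pending_request[vec] = 1;
    handler_fn h = vector_table[vec] ? vector_table[vec] : default_handler;
    h();
}

/* Example handler: clears the interrupt request, then does its work. */
void timer_handler(void)
{
    pending_request[VEC_TIMER] = 0;
    serviced_count++;
}
```

Note that the default handler leaves the request set. On real hardware a handler that fails to clear a level-sensitive request would typically be re-entered as soon as it returns, which is one reason interrupt routines are harder to debug than polled ones.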
The console is the default interaction mode of the root process. The model for using the console is that of a Unix shell. At the console you can view and modify the contents of memory, run programs, view and kill processes, view and modify the value of persistents, view the memory map, list and unload programs, change the baud and clock rates, and much more. Commands are typed at the console prompt and are executed by the root process. If any console command hangs, pressing the abort button (an active-low button attached to the IRQ7 pin) will return you to the interaction prompt.
If not otherwise specified, assume that all numerical arguments are in hexadecimal.
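To show what the hexadecimal convention means in practice, here is a small sketch of parsing a numeric argument with the standard C strtoul function in base 16. The parse_hex_arg function is hypothetical and is not the console's actual parser; in particular, strtoul also accepts an optional 0x prefix, and whether the ARC console does the same is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a console argument as hexadecimal.
   Returns 0 and stores the value on success, -1 on malformed input. */
int parse_hex_arg(const char *text, unsigned long *out)
{
    char *end;
    assert(text != NULL && out != NULL);
    errno = 0;
    unsigned long value = strtoul(text, &end, 16);
    /* Reject empty input, trailing junk, and out-of-range values. */
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;
    *out = value;
    return 0;
}
```

With this convention an argument typed as 100 means 256 decimal, so a size of 100 is a quarter of a kilobyte, not one hundred bytes.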