Process Address Space

Memory areas can contain:
- A memory map of the executable file’s code (text section)
- A memory map of the executable file’s initialized global variables (data section)
- A memory map of the zero page containing uninitialized global variables (bss section)
- A memory map of the zero page used for the process’s user-space stack
- An additional text, data, and bss section for each shared library
- Any memory-mapped files
- Any shared memory segments
- Any anonymous memory mappings, such as those associated with malloc()

[Figure: virtual address space, with regions labeled User context, Kernel context, Stack, Heap, Uninitialized data, Initialized read-write data, Initialized read-only data, Text, Kernel data]

Memory Descriptor

The kernel represents a process’s address space with a data structure called the memory descriptor. The structure contains all the information related to the process address space. The memory descriptor is represented by struct mm_struct, defined in <linux/sched.h>.

struct mm_struct {
    struct vm_area_struct *mmap;    /* list of memory areas */
    struct rb_root mm_rb;           /* red-black tree of VMAs */
    pgd_t *pgd;                     /* page global directory */
    atomic_t mm_users;              /* address space users */
    atomic_t mm_count;              /* primary usage counter */
    int map_count;                  /* number of memory areas */
    unsigned long start_code;
    unsigned long end_code;
    unsigned long start_data;
    unsigned long end_data;
};

Allocating a Memory Descriptor

The memory descriptor associated with a given task is stored in the mm field of the task’s process descriptor. Thus current->mm is the current process’s memory descriptor. The mm_struct structure is allocated from the mm_cachep slab cache via the allocate_mm() macro in kernel/fork.c. Each process receives a unique mm_struct and thus a unique process address space.

Destroying a Memory Descriptor

When the process associated with a specific address space exits, the exit_mm() function is invoked. It then calls mmput(), which decrements the memory descriptor’s mm_users counter. If the user count reaches zero, mmdrop() is called to decrement the mm_count usage counter. If that counter also reaches zero, the free_mm() macro is invoked to return the mm_struct to the slab cache.

The mm_struct and Kernel Threads

Kernel threads do not have a process address space and therefore have no associated memory descriptor. Thus the mm field of a kernel thread’s process descriptor is NULL. Whenever a kernel thread begins running, it uses the memory descriptor of whatever task ran previously.

Memory Areas

Memory areas are represented by a memory area object, which is stored in the vm_area_struct structure and defined in <linux/mm.h>. In the kernel, memory areas are called virtual memory areas, or VMAs. The vm_area_struct structure describes a single memory area over a contiguous interval in a given address space.

struct vm_area_struct { struct
    mm_struct *vm_mm;                       /* associated mm_struct */
    unsigned long vm_start;                 /* VMA start, inclusive */
    unsigned long vm_end;                   /* VMA end, exclusive */
    struct vm_area_struct *vm_next;         /* list of VMAs */
    pgprot_t vm_page_prot;                  /* access permissions */
    unsigned long vm_flags;                 /* flags */
    struct vm_operations_struct *vm_ops;    /* associated ops */
    unsigned long vm_pgoff;                 /* offset within file */
    struct file *vm_file;                   /* mapped file, if any */
};

VMA Flags

The vm_flags field contains bit flags, defined in <linux/mm.h>, that specify the behavior of, and provide information about, the pages contained in the memory area.

VM_READ      Pages can be read from
VM_WRITE     Pages can be written to
VM_EXEC      Pages can be executed
VM_SHARED    Pages are shared
VM_SHM       The area is used for shared memory
VM_IO        The area maps a device’s I/O space

VMA Operations

The vm_ops field in the vm_area_struct structure points to the table of operations associated with a given memory area, which the kernel can invoke to manipulate the VMA. The operations table is represented by struct vm_operations_struct and is defined in <linux/mm.h>.

struct vm_operations_struct { void
         (*open)(struct vm_area_struct *);
    void (*close)(struct vm_area_struct *);
    struct page *(*nopage)(struct vm_area_struct *, unsigned long, int);
    int (*populate)(struct vm_area_struct *, unsigned long, unsigned long,
                    pgprot_t, unsigned long, int);
};

open: invoked when the given memory area is added to an address space.
close: invoked when the given memory area is removed from an address space.
nopage: invoked by the page fault handler when a page that is not present in physical memory is accessed.
populate: invoked by the remap_file_pages() system call to prefault a new mapping.

Lists and Trees of Memory Areas

Memory areas are accessed via both the mmap and the mm_rb fields of the memory descriptor. These two data structures independently point to all the memory area objects associated with the memory descriptor. The mmap field links all the memory area objects together in a singly linked list. The mm_rb field links all the memory area objects together in a red-black tree. A red-black tree is a type of balanced binary tree; each element in it is called a node. The linked list is used when every memory area needs to be traversed. The red-black tree is used to locate a specific memory area in the address space.

Memory Areas in Real Life

Let’s look at a particular process’s address space and the memory areas inside it. We can use the /proc filesystem and the pmap(1) utility.

Example:
int main(int argc,
         char *argv[])
{
    return 0;
}

The output from /proc/<pid>/maps lists the memory areas in the process’s address space:

# cat /proc/1426/maps
start-end          permission  offset    major:minor  inode   file
00e80000-00faf000  r-xp        00000000  03:01        208530  /lib/libc-2.3.2.so
00fb2000-00fb4000  rw-p        00000000  00:00        0

The pmap utility formats the information in a more readable manner:

# pmap 1426
00e80000 (1212 KB)  r-xp  (03:01 208530)  /lib/libc-2.3.2.so
00fb2000 (8 KB)     rw-p  (00:00 0)
bfffe000 (8 KB)     rwxp  (00:00 0)       [ stack ]
mapped: 1340 KB   writable/private: 40 KB   shared: 0 KB

Manipulating Memory Areas

The kernel often has to determine whether any memory area in a process address space matches given criteria, such as whether a given address lies within a memory area. The functions for this are all declared in <linux/mm.h>:
- find_vma()
- find_vma_prev()
- find_vma_intersection()

mmap() and do_mmap(): Creating an Address Interval

The do_mmap() function is used by the kernel to create a new linear address interval. It is declared in <linux/mm.h>:

unsigned long do_mmap(struct file
                      *file, unsigned long addr, unsigned long len,
                      unsigned long prot, unsigned long flag,
                      unsigned long offset);

If the file parameter is NULL and offset is zero, the mapping will not be backed by a file.

The prot parameter specifies the access permissions for pages in the memory area:

PROT_READ     corresponds to VM_READ
PROT_WRITE    corresponds to VM_WRITE
PROT_EXEC     corresponds to VM_EXEC
PROT_NONE     the pages cannot be accessed (none of the above flags set)

The flag parameter specifies flags that correspond to the remaining VMA flags:

MAP_SHARED    the mapping can be shared
MAP_PRIVATE   the mapping cannot be shared
MAP_FIXED     the new interval must start at the given address addr

The mmap() System Call

The mmap() system call is defined as:

void *mmap2(void *start, size_t length, int prot, int flags, int fd, off_t pgoff);

The offset pgoff is given in pages, whereas the older mmap() took an offset in bytes. This enables larger files with larger offsets to be mapped.

munmap() and do_munmap(): Removing an Address Interval

The do_munmap() function removes an address interval from a specified process address space. The function is declared in <linux/mm.h>:

int do_munmap(struct mm_struct
              *mm, unsigned long start, size_t len);

On success, zero is returned; otherwise, a negative error code is returned.

The munmap() System Call

The munmap() system call is exported to user space as a means for processes to remove address intervals from their address space.

int munmap(void *start, size_t length);

The system call is defined in mm/mmap.c and acts as a very simple wrapper around do_munmap().

Page Tables

Applications operate on virtual memory that is mapped to physical addresses, whereas processors operate directly on those physical addresses. When an application accesses a virtual memory address, the address must first be converted to a physical address before the processor can resolve the request. This lookup is performed via page tables. Page tables work by splitting the virtual address into chunks. Each chunk is used as an index into a table. The table points either to another table or to the associated physical page. In Linux, the page tables consist of three levels. The multiple levels allow a sparsely populated address space to be described without wasting memory on unused ranges. Page table data structures are architecture dependent and are defined in <asm/page.h>.

Page Cache

The Linux kernel implements a disk cache called the page cache. The goal of this cache is to minimize disk I/O by keeping in physical memory data that would otherwise require a disk access. The page cache consists of physical pages in RAM. Each page in the cache corresponds to multiple blocks on the disk. Whenever the kernel begins a page I/O operation, it first checks whether the requisite data is in the page cache. If it is, the kernel does not access the disk and gets the data from the page cache instead.
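
The check-the-cache-first behavior described for the page cache can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not kernel code: the names cached_page, read_block_from_disk(), and get_page(), the four-slot direct-mapped table, and the fake disk read are all hypothetical and exist purely for illustration. The real page cache is considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096   /* assumed page size for the simulation */
#define TOY_SLOTS     4      /* hypothetical, tiny direct-mapped cache */

struct cached_page {
    int           valid;                /* slot currently holds a page */
    unsigned long index;                /* which page of the "file" it holds */
    char          data[TOY_PAGE_SIZE];
};

static struct cached_page cache[TOY_SLOTS];
static int disk_reads;                  /* counts simulated disk accesses */

/* Stand-in for real disk I/O: fills the buffer with a recognizable byte. */
static void read_block_from_disk(unsigned long index, char *buf)
{
    assert(buf != NULL);
    disk_reads++;
    memset(buf, (int)(index & 0xff), TOY_PAGE_SIZE);
}

/* Return the page's data, touching the "disk" only on a cache miss. */
static const char *get_page(unsigned long index)
{
    struct cached_page *slot = &cache[index % TOY_SLOTS];

    if (slot->valid && slot->index == index)
        return slot->data;              /* hit: no disk access needed */

    read_block_from_disk(index, slot->data);    /* miss: fetch and remember */
    slot->valid = 1;
    slot->index = index;
    return slot->data;
}
```

Calling get_page() twice with the same index performs one simulated disk read; the second call is served from the table. A request for a different page that maps to the same slot simply overwrites it, which is a deliberate simplification of this sketch.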