Archive for the ‘University Project’ Category

TFC Presentation

Wednesday, November 23rd, 2005

This is the presentation on the TizOs kernel that I gave last week. Here you can find both the PDF and its sources, made with latex-beamer.

PDF TFC

Or you can download the full latex sources

University Project Finished

Friday, September 30th, 2005

Yesterday I handed in the university project document (TFC). I haven’t been posting updates on it for lack of time… but you can finally download the whole project. Comments on it will be really appreciated.

PDF TFC

Or you can download the full latex sources

From Grub to protected mode (2)

Sunday, April 24th, 2005

In this second post I’ll try to explain how to produce the binary image to use with Grub.

As we saw in the last post, we have chosen an ELF Multiboot binary. The layout of the binary will be:

  • The Multiboot header at the beginning of the binary.
  • Our whole binary will be loaded at physical address 1 MB.
  • With the ELF headers (using the linker) we will specify that our entry point is the start function.
  • The first stage of our binary, the one that executes before paging is enabled, will be linked starting at physical address 1 MB. A second part of our binary will be linked at the logical address 0xC0000000, which, after enabling paging, will correspond to the physical address 1 MB.
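
The relation between the two link addresses in the layout above can be sketched in C. This is only an illustration: the names KERNEL_PHYS_BASE, KERNEL_VIRT_BASE, virt_to_phys and phys_to_virt are made up for this example and are not taken from the TizOs sources.

```c
#include <stdint.h>

/* Assumed constants, taken from the layout described in the post. */
#define KERNEL_PHYS_BASE 0x00100000u  /* 1 MB: where Grub loads the image */
#define KERNEL_VIRT_BASE 0xC0000000u  /* link address of the second stage  */

/* Convert an address in the higher-half part of the image to the
   physical address it occupies once the image is loaded at 1 MB. */
static inline uint32_t virt_to_phys(uint32_t vaddr)
{
    return vaddr - KERNEL_VIRT_BASE + KERNEL_PHYS_BASE;
}

/* Inverse conversion: physical address inside the image to the
   logical address it will have after paging is enabled. */
static inline uint32_t phys_to_virt(uint32_t paddr)
{
    return paddr - KERNEL_PHYS_BASE + KERNEL_VIRT_BASE;
}
```

Code that runs before paging is enabled has to use the physical form of any symbol linked at 0xC0000000, which is why a helper of this kind is handy in the first stage.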

The Multiboot header in our case is contained in the loader.asm file, and consists only of the MAGIC value, the FLAGS and the CHECKSUM. What follows is the declaration of the Multiboot header in nasm format.

ALIGN_VAR equ 0x01              ; align loaded modules on page boundaries
MEMINFO   equ 0x02              ; provide memory map
FLAGS     equ ALIGN_VAR | MEMINFO ; this is the Multiboot 'flag' field
MAGIC     equ 0x1BADB002        ; 'magic number' lets bootloader find the header
CHECKSUM  equ -(MAGIC + FLAGS)  ; checksum required

;; Align to 32 bits, as the Multiboot header requires: http://www.nilo.org/multiboot.html
align 4

;;Multiboot header
dd MAGIC
dd FLAGS
dd CHECKSUM
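
The CHECKSUM definition works because the bootloader requires the three fields to add up to zero modulo 2^32. A small C sketch of that rule follows; the function and macro names are hypothetical and only serve the illustration.

```c
#include <stdint.h>

#define MB_MAGIC   0x1BADB002u  /* same value as MAGIC in loader.asm */
#define MB_ALIGN   0x1u         /* bit 0: align modules on page boundaries */
#define MB_MEMINFO 0x2u         /* bit 1: provide memory map */

/* Checksum that makes magic + flags + checksum wrap around to 0. */
static uint32_t multiboot_checksum(uint32_t flags)
{
    return (uint32_t)(0u - (MB_MAGIC + flags));
}

/* The test a Multiboot bootloader applies to a candidate header. */
static int multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == MB_MAGIC && (uint32_t)(magic + flags + checksum) == 0u;
}
```

With the flags used here (ALIGN_VAR | MEMINFO = 3) the checksum field comes out as 0xE4524FFB.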

After this our kernel needs to set up its own descriptor table, since the Multiboot specification says that the descriptors left by a Multiboot-compliant bootloader are not guaranteed to remain valid after boot. So the first thing our loader has to do is set up three segment descriptors: a null descriptor, a code descriptor and a data descriptor. It then loads the base and limit of the table into the GDTR and reloads the segment registers (CS through a far jump), so that the new descriptors take effect.

Now our segment registers are loaded with selectors for our own segment descriptors, and we can initialize the paging structures. Once they have been created we can safely activate paging and jump to the main function linked at 0xC0000000.

From Grub to protected mode (1)

Saturday, April 23rd, 2005

This is part of the project but not in its final presentation; these are just some notes on how to progress from Grub to protected mode.

First of all we need a bootable floppy image with Grub installed, with our kernel copied onto it. Our kernel must be Multiboot compliant. I’ve chosen to use an ELF with the Multiboot info inside, and we will see how to produce such a binary. The Multiboot header must be contained in the executable, and its structure is the following:

Offset  Type  Field Name     Note
0       u32   magic          required
4       u32   flags          required
8       u32   checksum       required
12      u32   header_addr    if flags[16] is set
16      u32   load_addr      if flags[16] is set
20      u32   load_end_addr  if flags[16] is set
24      u32   bss_end_addr   if flags[16] is set
28      u32   entry_addr     if flags[16] is set
32      u32   mode_type      if flags[2] is set
36      u32   width          if flags[2] is set
40      u32   height         if flags[2] is set
44      u32   depth          if flags[2] is set

As we will be producing an ELF image, the address information does not need to be contained in the Multiboot header, so bit 16 won’t be set. In this first example we do not need to provide the video information either, as text mode is fine for now, so bit 2 will be left at 0.

The idea behind loading an OS image with Grub is the following: when the computer boots, Grub is the default bootloader and prompts the user to choose which boot image to load. Once our ELF is selected, our OS image starts running.

The state of the machine right after Grub has jumped to the entry point of our image is the following (the info was extracted using Bochs):

eax:0x2badb002
ebx:0x2ca20
ecx:0x1f600
edx:0x0
ebp:0x67ee4
esi:0x2cb3f
edi:0x2cb40
esp:0x67ed4
eflags:0x46
eip:0x110098
cs:s=0x8, dl=0xffff, dh=0xcf9a00, valid=1
ss:s=0x10, dl=0xffff, dh=0xcf9300, valid=7
ds:s=0x10, dl=0xffff, dh=0xcf9300, valid=7
es:s=0x10, dl=0xffff, dh=0xcf9300, valid=1
fs:s=0x10, dl=0xffff, dh=0xcf9300, valid=1
gs:s=0x10, dl=0xffff, dh=0xcf9300, valid=1
ldtr:s=0x0, dl=0x0, dh=0x0, valid=0
tr:s=0x0, dl=0x0, dh=0x0, valid=0
gdtr:base=0x8f5c, limit=0x27
idtr:base=0x0, limit=0x3ff
dr0:0x0
dr1:0x0
dr2:0x0
dr3:0x0
dr6:0xffff0ff0
dr7:0x400
tr3:0x0
tr4:0x0
tr5:0x0
tr6:0x0
tr7:0x0
cr0:0x60000011
cr1:0x0
cr2:0x0
cr3:0x0
cr4:0x0
inhibit_mask:0
done

Analyzing this information a little, we can draw the following conclusions:

  • cr0:0x60000011:
    • bit 0 == 1: PE (Protection Enable). We are in protected mode.
    • bit 31 == 0: PG (Paging). Paging is disabled.
  • GDTR: We will have a look at the descriptor table.
    Null Segment Descriptor (SR=0):
    0x00008f5c : 0x00000000
    0x00008f60 : 0x00000000

    Code Segment Descriptor (SR=0x08)
    0x00008f64 : 0x0000ffff
    0x00008f68 : 0x00cf9a00

    The rest of the segments (SR = 0x10):
    0x00008f6c : 0x0000ffff
    0x00008f70 : 0x00cf9300

From the previous information we can conclude that when our code starts executing the processor has the A20 gate enabled, it is running in protected mode with paging disabled, and all the segments are 4 GB in size, so we can access all of our memory linearly.
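
These conclusions can be checked mechanically. The following C sketch decodes the CR0 value and the two descriptor words from the Bochs dump; the function names are made up for this example, and the bit positions are those of the standard x86 descriptor layout.

```c
#include <stdint.h>

#define CR0_PE (1u << 0)   /* Protection Enable */
#define CR0_PG (1u << 31)  /* Paging */

static int cr0_protected_mode(uint32_t cr0) { return (cr0 & CR0_PE) != 0; }
static int cr0_paging_enabled(uint32_t cr0) { return (cr0 & CR0_PG) != 0; }

/* Base address: scattered over three fields of the 8-byte descriptor.
   low is the first 32-bit word in memory, high is the second one. */
static uint32_t desc_base(uint32_t low, uint32_t high)
{
    return (low >> 16) | ((high & 0xFFu) << 16) | (high & 0xFF000000u);
}

/* Segment size in bytes: the 20-bit limit plus one, scaled by 4 KB
   when the granularity bit (bit 23 of the high word) is set. */
static uint64_t desc_size(uint32_t low, uint32_t high)
{
    uint32_t limit = (low & 0xFFFFu) | (high & 0x000F0000u);
    int granular = (int)((high >> 23) & 1u);
    uint64_t size = (uint64_t)limit + 1u;
    return granular ? size << 12 : size;
}
```

For the code descriptor in the dump (0x0000ffff, 0x00cf9a00) this gives a base of 0 and a size of 0x100000000 bytes, that is, 4 GB.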

In the following post I’ll explain how to progress from here to a paged memory model.

Protected Mode

Monday, April 18th, 2005

Protected Mode

This mode has been provided on the Intel x86 family since the 80286. It is the main operating mode, and most features are only accessible from it.
This mode provides us with two memory management mechanisms, segmentation and paging. Segmentation is used to isolate and protect the different regions
of memory (i.e. code, data, stack…) of different tasks. Paging can be used as a protection facility and/or as a translation that maps physical memory into
a virtual space. Segmentation is always turned on, but we can choose whether or not to use paging. We will see later a trick to effectively turn off segmentation.

Segmentation Theory

Segmentation gives us a method for separating different ranges of addresses into groups: we can have one segment for the data, one for the code and
another for the stack, and the processor provides some protection between them, so that, for example, data will not be written into the code segment. In a multitasking system,
where multiple processes are running, we can give each process its own set of segments, and the processor will protect us against one process writing to
an address space that does not belong to its segments.

Segmentation

As can be seen in the figure above, the logical address consists of a segment selector and an offset. The segment selector is a unique identifier for
each segment; it points to an entry in a descriptor table, called a segment descriptor, and there is one segment descriptor for every segment we have defined.
Each segment descriptor specifies the size of the segment, its access rights and privilege level, the segment type, and the location of the first byte
of the segment in the linear address space (called the base address of the segment). To form the linear address, which is also the physical address when paging is disabled,
the base address of the segment is added to the offset provided in the logical address.

Paging Theory

Operating systems normally need a way of “virtualizing” the memory, for example to give a different address space to each process, and paging allows us to
do so. Paging lets us map different logical address regions to physical memory. A page is the minimum amount of contiguous physical memory
that we can allocate. Virtual addresses are mapped to physical ones at page granularity, so each page is contiguous in both virtual and physical memory.

When we try to access a page that is not in memory, the processor raises an exception, which the operating system can handle
by fetching the page from disk and putting it back in memory. This is called demand paging, and it allows us to run processes that would not
otherwise fit in memory.

Paging

In the figure above we can see a simple paging scheme. The linear address consists of a directory entry pointer, a table offset and a byte offset.
The directory entry pointer selects a page directory entry (PDE), which contains a set of attributes and the base address of a page table.
The table offset (from the linear address) is added to this base address to obtain a page table entry, which also contains a set of
attributes and a physical base address, the base of the page. The byte offset is added to the base of the page to obtain the physical address.
This same model can be extended with more levels of tables, but the idea is the same.

Memory Model under Protected Mode

The memory model that we will be using for the operating system under protected mode is paging without segmentation. As has been said before, it is not
actually possible to turn off segmentation; the trick is to use a basic flat-model segmentation scheme. In the basic flat model the
whole address space is available as one contiguous space, which is accomplished by assigning the data, code and stack segments the whole 4 GB of address space. This way
the segmentation unit will not generate exceptions for any address.

On top of this segmentation scheme we will add paging. In practice it will be as if we only had paging.

In protected mode the logical address space provided is $2^{32}$ bytes (4 GB); this is the flat segment of memory we will be logically addressing from
our processor. This logical space will be mapped onto different regions of physical memory that do not have to be contiguous. If a bigger physical address space is needed,
it is possible to use a 64 GB physical address space ($2^{36}$ bytes). This extension is enabled by setting the PAE bit in CR4 (available since the Pentium Pro),
or by using the 36-bit page size extension (PSE-36). We will not be using these extensions though.

When accessing a physical address the system always goes through two stages. Even with the aforementioned flat model, a logical address consists of
a 16-bit segment selector and a 32-bit offset; the segment selector identifies the segment in which the data resides. The first stage
translates the logical address into a linear address: the segment selector is used to get the segment base, and this base is added to the offset. The
second stage transforms this linear address into a physical address: if paging is not enabled, the linear address is directly the physical address;
otherwise the paging tables are used to get the physical address.

Segment Selectors

Segment selectors are 16-bit identifiers that point to the segment descriptors held in the global or local descriptor tables (GDT or LDT).
The segment selectors tell the processor which segment to use, but instead of directly holding the segment address, as in real-address mode, they hold an
index into a descriptor table.

Each segment selector contains:

  • Index: (Bits 3 through 15) Selects one of the 8192 descriptors in the GDT or LDT. The processor multiplies the index value by 8
    (the number of bytes in a segment descriptor) and adds the result to the base address of the GDT or LDT (from the GDTR or LDTR register, respectively).
  • TI: (Bit 2) Specifies which table to use, GDT or LDT.
  • RPL: (Bits 0,1) Specifies the privilege level of the selector. The privilege level can range from 0 to 3, with 0 being the most privileged level.
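
The three fields of a selector can be pulled apart with a few shifts and masks. This C sketch uses hypothetical helper names; the GDT base 0x8f5c used in the note below is the one reported by Bochs in the Grub post.

```c
#include <stdint.h>

/* Index: bits 3 through 15, selects one of 8192 descriptors. */
static uint16_t sel_index(uint16_t sel) { return (uint16_t)(sel >> 3); }

/* TI: bit 2, 0 means GDT and 1 means LDT. */
static int sel_is_ldt(uint16_t sel) { return (sel >> 2) & 1; }

/* RPL: bits 0 and 1, requested privilege level from 0 to 3. */
static int sel_rpl(uint16_t sel) { return sel & 3; }

/* Address of the descriptor: table base plus index times 8,
   8 being the size in bytes of a segment descriptor. */
static uint32_t descriptor_address(uint32_t table_base, uint16_t sel)
{
    return table_base + (uint32_t)sel_index(sel) * 8u;
}

/* Build a selector from its three fields. */
static uint16_t make_selector(uint16_t index, int ldt, int rpl)
{
    return (uint16_t)((index << 3) | ((ldt & 1) << 2) | (rpl & 3));
}
```

With a GDT based at 0x8f5c, selector 0x08 resolves to the descriptor at 0x8f64 and selector 0x10 to the one at 0x8f6c, which matches the table dumped from Bochs.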

Segment selectors are loaded into the segment registers; each register holds a segment selector for a specific kind of data. The list of segment registers is
as follows:

  • CS: Code segment, instructions are fetched from this segment.
  • DS: Data segment, data instructions get their content from this segment.
  • SS: Stack segment, the stack is held in this segment.
  • ES, FS, GS: The processor also provides these three additional data-segment registers, which can be used to make additional data
    segments available to the currently executing program (or task).

Only 6 segments can be loaded at any time; however, we can have up to 8192 segments defined, each of which has to be loaded into a segment register before
it can be used. When the segment tables are modified it is our job to reload the segment registers, as the processor caches part of the information
from the segment table every time a segment register is loaded.

Segment Descriptor

The segment descriptor, shown in the figure below, is the structure held in the segment table (LDT or GDT) that holds the information about a segment.
Our segment descriptors will be really simple and will give us the illusion of not having segments at all.

Each segment descriptor is 8 bytes long and has the following information:

Segment Descriptor

  • Base Address: A 32-bit value specifying the base address of the segment.
  • Limit: A 20-bit value which specifies the size of the segment. This value can be interpreted in two ways depending on bit G (granularity), which
    is also contained in the segment descriptor. If G is clear the segment can be from 1 byte to 1 MB ($2^{20}$) in size. If G is set, the size is in multiples of 4 KB
    ($2^{12}$), thus the available range is from 4 KB to 4 GB ($2^{32}$).
  • Type: Indicates the kind of segment or gate and specifies the access rights.
  • S: Flag that specifies whether this is a system segment (S=0) or a code or data segment (S=1).
  • DPL: Privilege level from 0 to 3.
  • P: Segment present flag. If the flag is not set the processor generates an exception when there is an access to this segment.
  • D/B: Its function depends on the kind of segment, we will see its values later in each case.
  • G: As we have seen, it determines the granularity of the limit.
  • Reserved Bits: A reserved bit that should always be 0 for the processor and another bit (AVL) that can be used for our own purposes.
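
As a rough illustration of how these fields are packed into the 8 bytes, here is a C sketch that builds a descriptor as two 32-bit words. The type seg_desc_t and the function make_descriptor are hypothetical names; the access byte groups P, DPL, S and Type, and the flags nibble groups G, D/B, the reserved bit and the available bit.

```c
#include <stdint.h>

/* One 8-byte descriptor, as the two 32-bit words seen in a memory dump. */
typedef struct {
    uint32_t low;   /* limit 15..0 and base 15..0 */
    uint32_t high;  /* base 23..16, access byte, limit 19..16, flags, base 31..24 */
} seg_desc_t;

/* access: P (bit 7), DPL (bits 6..5), S (bit 4), Type (bits 3..0).
   flags:  G (bit 3), D/B (bit 2), reserved (bit 1), AVL (bit 0). */
static seg_desc_t make_descriptor(uint32_t base, uint32_t limit,
                                  uint8_t access, uint8_t flags)
{
    seg_desc_t d;
    d.low  = (limit & 0xFFFFu) | ((base & 0xFFFFu) << 16);
    d.high = ((base >> 16) & 0xFFu)
           | ((uint32_t)access << 8)
           | (limit & 0x000F0000u)
           | ((uint32_t)(flags & 0xFu) << 20)
           | (base & 0xFF000000u);
    return d;
}
```

Calling make_descriptor(0, 0xFFFFF, 0x9A, 0xC) yields the words 0x0000ffff and 0x00cf9a00, the same flat 4 GB code descriptor that Grub leaves in its own table.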

Segment Descriptor Tables

A segment descriptor table is an array of segment descriptors, starting at some base address. There are two kinds of tables, the global descriptor table (GDT)
and the local descriptor table (LDT).

The GDT is per system, and it is the only one that must be defined. The GDT is used by all programs, and we can optionally have one or more LDTs, which
can be used by different tasks. The GDT resides in the linear address space; its base and size are loaded into the GDTR register, and the base
should be 8-byte aligned.

The first descriptor of the GDT is the null descriptor and is not used as a regular segment. Loading its selector into
one of the DS, ES, FS or GS segment registers does not generate an exception, but any memory access through it does.

An LDT is contained in a system (S=0) segment of the LDT type, and its selector is loaded into the LDTR for it to be used. We will not explain LDTs any further, as we will
not be using them for the operating system.

Paging

Now that we have seen all the segmentation options, let’s have a look at the paging configurations. Paging is activated by turning on the PG flag in the CR0 register.
Two more flags, both in CR4, affect it: PSE, which enables large (2 MB or 4 MB) pages; if PSE is not enabled the page size is 4 KB. PAE
extends the physical address to 36 bits; this extension can only be used together with paging.

The structures used to translate linear to physical addresses when paging is on are the following (when PAE and PSE are both disabled):

  • Page Directory: An array of 32-bit page-directory entries (PDEs) contained in a 4-KByte page. Up to 1024 page-directory entries can be held in a page directory.
  • Page table: An array of 32-bit page-table entries (PTEs) contained in a 4-KByte page. Up to 1024 page-table entries can be held in a page table.
  • The Page: 4 KBytes of flat, contiguous physical memory.

Linear Address Translation (4Kbytes Page)

Register CR3 (PDBR) points to the base of the page directory. To get the physical translation of a linear address, first a page directory entry is obtained by
indexing from CR3 (PDBR) with the directory offset (bits 31 through 22 of the linear address). The directory entry provides us with the base of the page table; this base
is indexed with the table offset (bits 21 through 12 of the linear address) to obtain the page table entry. This entry provides us with the upper 20 bits of the physical address
of the desired page. Once the physical base of the page has been obtained, the offset (bits 11 through 0 of the linear address) is added
to obtain the final physical byte address.
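
The way a linear address is split for this translation can be sketched in C as follows. The helper names are invented for the example, and the page table entry value used in the note is made up as well.

```c
#include <stdint.h>

/* Bits 31..22: index into the page directory (0..1023). */
static uint32_t pd_index(uint32_t linear) { return linear >> 22; }

/* Bits 21..12: index into the page table (0..1023). */
static uint32_t pt_index(uint32_t linear) { return (linear >> 12) & 0x3FFu; }

/* Bits 11..0: byte offset inside the 4 KB page. */
static uint32_t page_offset(uint32_t linear) { return linear & 0xFFFu; }

/* Last step of the walk: combine the page base held in the upper
   20 bits of the page table entry with the byte offset. */
static uint32_t phys_from_pte(uint32_t pte, uint32_t linear)
{
    return (pte & 0xFFFFF000u) | page_offset(linear);
}
```

The kernel link address 0xC0000000 falls in directory entry 768, table entry 0, offset 0. With a hypothetical table entry of 0x00201003, the linear address 0xC0101234 would translate to the physical address 0x00201234.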

As we have seen, CR3 (PDBR) contains the base address of the page directory. It has to be loaded with a valid value before enabling paging, and the page directory must
remain in memory while the task that needs it is active.

Page Table Entry

Page directory and page table entries have a similar format. Each entry is 32 bits wide, and each directory or table must be page aligned and is always the size of a page.
The fields shown in the figure above have the following meaning:

  • Base Address: (bits 31 through 12) Specifies the 4 KB page that contains the page table or the physical page, depending on whether the entry is a directory or a table entry.
  • Present Flag: Tells whether or not the page is in memory.
  • R/W Flag: Read/write privileges for a page or a group of pages.
  • U/S Flag: User or supervisor privilege for a page or group of pages.
  • PWT Flag: Page-level write-through, controls the caching policy for a page or a set of pages.
  • PCD Flag: Page cache disable, controls whether a page should be cached or not.
  • A Flag: Accessed, indicates that the page has been written or read. The software must clear this flag.
  • D Flag: Dirty, indicates whether the page has been written to or not. The software must clear this flag.
  • PS Flag: Page Size, indicates the size of the page. For example 4 KB or 4 MB.
  • PAT Flag: Selects the PAT (Page Attribute Table) entry for the page.
  • G Flag: Global, when a page is marked as global and CR4 also has the PGE bit set, its TLB entry is not invalidated when CR3 changes. This is used for
    pages that do not change on context switches.
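
To make the layout concrete, the flags above can be written as bit masks in C. The macro and function names here are hypothetical; the bit positions follow the standard 32-bit entry format with 4 KB pages.

```c
#include <stdint.h>

#define PTE_PRESENT  0x001u  /* P:   page is in memory */
#define PTE_RW       0x002u  /* R/W: writable when set */
#define PTE_USER     0x004u  /* U/S: user accessible when set */
#define PTE_PWT      0x008u  /* page-level write-through */
#define PTE_PCD      0x010u  /* page cache disable */
#define PTE_ACCESSED 0x020u  /* A: set by the processor on access */
#define PTE_DIRTY    0x040u  /* D: set by the processor on write */
#define PTE_PS_PAT   0x080u  /* PS in a directory entry, PAT in a table entry */
#define PTE_GLOBAL   0x100u  /* G: keep the TLB entry across CR3 reloads */

/* Build an entry from a page-aligned physical address and a set of flags. */
static uint32_t make_pte(uint32_t frame_phys, uint32_t flags)
{
    return (frame_phys & 0xFFFFF000u) | (flags & 0xFFFu);
}

/* Physical base address stored in an entry. */
static uint32_t pte_frame(uint32_t pte) { return pte & 0xFFFFF000u; }

static int pte_present(uint32_t pte) { return (pte & PTE_PRESENT) != 0; }
```

For example, a present and writable supervisor mapping of the page at 1 MB is make_pte(0x00100000, PTE_PRESENT | PTE_RW), which gives 0x00100003.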