Wednesday, December 3, 2014

Interrupt handling inside Windows CE BSP [Windows]

  In this article, I will explain how to trace your code through InterruptInitialize/OEMInterruptEnable, the interrupt handler (ISR), InterruptDone, and InterruptDisable.

Let's start from the code in xxx_Init, where we generate a system interrupt ID (SYSINTR) for our hardware IRQ using KernelIoControl (IOCTL_HAL_REQUEST_SYSINTR); the OAL saves this IRQ-to-SYSINTR mapping in its table.

Now we create an interrupt service thread, or IST (CreateThread).

After creating an event, we bind the SYSINTR to that event with InterruptInitialize. This is the point where OEMInterruptEnable (which actually enables the interrupt) gets called, and it in turn calls ABCInterruptEnable (an implementation-dependent name). Because InterruptInitialize was called with a SYSINTR, OEMInterruptEnable converts the SYSINTR back to the physical IRQ and passes that IRQ to ABCInterruptEnable, which then enables the interrupt in the interrupt controller registers. A minimal sketch of this initialization sequence follows.
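
Here is a minimal sketch of a driver's xxx_Init doing the steps above. KernelIoControl, IOCTL_HAL_REQUEST_SYSINTR, CreateEvent, InterruptInitialize and CreateThread are the standard Windows CE APIs; the device name, IRQ number, global variable names and IST routine are hypothetical placeholders.

```c
#include <windows.h>
#include <nkintr.h>    /* SYSINTR_UNDEFINED and friends                 */
#include <pkfuncs.h>   /* IOCTL_HAL_REQUEST_SYSINTR (location may vary) */

#define ABC_IRQ  25                     /* hypothetical IRQ line of our device */

DWORD  g_sysIntr = SYSINTR_UNDEFINED;   /* SYSINTR returned by the OAL   */
HANDLE g_hIsrEvent;                     /* event signalled by the kernel */
HANDLE g_hIst;                          /* interrupt service thread      */

DWORD WINAPI AbcIstThread(LPVOID pContext);   /* IST, shown further below */

/* Stream-driver Init, with error handling simplified for the sketch. */
DWORD ABC_Init(LPCTSTR pszActiveKey, LPCVOID pBusContext)
{
    DWORD irq = ABC_IRQ;

    /* 1. Ask the OAL to map our hardware IRQ to a SYSINTR. */
    if (!KernelIoControl(IOCTL_HAL_REQUEST_SYSINTR, &irq, sizeof(irq),
                         &g_sysIntr, sizeof(g_sysIntr), NULL))
        return 0;

    /* 2. Event that the kernel will signal when the ISR returns our SYSINTR. */
    g_hIsrEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
    if (g_hIsrEvent == NULL)
        return 0;

    /* 3. Bind the SYSINTR to the event; this is where OEMInterruptEnable
          (and, from it, ABCInterruptEnable) runs. */
    if (!InterruptInitialize(g_sysIntr, g_hIsrEvent, NULL, 0))
        return 0;

    /* 4. The IST that waits on the event and services the device. */
    g_hIst = CreateThread(NULL, 0, AbcIstThread, NULL, 0, NULL);
    return (g_hIst != NULL) ? 1 : 0;
}
```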

InterruptDisable and InterruptDone call OEMInterruptDisable and OEMInterruptDone, respectively.

Now I will talk about the ISR part. As soon as an interrupt occurs in the device, execution enters through the vector table set up in vector.inc (private code) and jumps to the IRQ exception handler in armtrap.s. Before doing any real work, this handler saves the register context and switches the CPU to the IRQ handling mode.

  After saving the context and executing the necessary prologue, the exception handler calls the OAL's interrupt handler, OEMInterruptHandler (the ISR). (I suppose we call it a vector table because each entry has a magnitude, the address of the handler, and a direction that tells the CPU where to jump.) Inside OEMInterruptHandler, the code reads the interrupt controller's status register to identify which device raised the interrupt, then does the appropriate work: masking the lower-priority interrupts as well as the interrupt itself, and clearing the pending register. Finally it returns the SYSINTR corresponding to that IRQ; a sketch is shown below.
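
A minimal sketch of such an OAL ISR. OEMInterruptHandler and the SYSINTR_* values are standard; the ABC device, its IRQ number, and the interrupt controller registers (STATUS/MASK/PENDING) are hypothetical placeholders for whatever your SoC actually provides.

```c
#include <windows.h>
#include <nkintr.h>

/* Hypothetical interrupt controller registers, mapped by the OAL at startup. */
extern volatile DWORD *g_pIntcStatus;   /* which IRQs are asserted          */
extern volatile DWORD *g_pIntcMask;     /* set bit = interrupt masked       */
extern volatile DWORD *g_pIntcPending;  /* write 1 to clear a pending IRQ   */

#define ABC_IRQ  25
extern DWORD g_sysIntrAbc;              /* SYSINTR recorded when the driver
                                           requested it via the OAL          */

/* Called from the IRQ exception handler in armtrap.s. */
DWORD OEMInterruptHandler(DWORD ra)
{
    DWORD status = *g_pIntcStatus;

    if (status & (1u << ABC_IRQ))
    {
        /* Mask this IRQ (and, in a priority scheme, lower-priority ones)
           so it cannot re-enter until the IST calls InterruptDone. */
        *g_pIntcMask   |= (1u << ABC_IRQ);
        /* Clear the pending bit so the controller can latch the next one. */
        *g_pIntcPending = (1u << ABC_IRQ);

        /* Returning the SYSINTR tells the kernel which event to signal. */
        return g_sysIntrAbc;
    }

    return SYSINTR_NOP;   /* not our interrupt / spurious */
}
```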

After the ISR returns the SYSINTR, the kernel looks up which event that SYSINTR is mapped to and signals it. Any IST blocked in WaitForSingleObject (or another wait API) on that event then wakes up and services the interrupt, as in the sketch below.
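
A minimal sketch of the corresponding IST loop. WaitForSingleObject, InterruptDone, and InterruptDisable are the standard APIs; AbcIstThread, AbcHandleDevice, and the globals are hypothetical and match the earlier sketch.

```c
#include <windows.h>

extern DWORD  g_sysIntr;     /* SYSINTR obtained in ABC_Init           */
extern HANDLE g_hIsrEvent;   /* event bound with InterruptInitialize   */
extern BOOL   g_fShutdown;   /* set by ABC_Deinit to stop the IST      */

/* Hypothetical device-specific servicing routine. */
static void AbcHandleDevice(void)
{
    /* read/acknowledge device hardware, move data, etc. */
}

DWORD WINAPI AbcIstThread(LPVOID pContext)
{
    for (;;)
    {
        /* Sleep until the kernel signals the event from the ISR path. */
        WaitForSingleObject(g_hIsrEvent, INFINITE);

        if (g_fShutdown)
            break;

        AbcHandleDevice();

        /* Tell the kernel we are done; OEMInterruptDone re-enables the IRQ. */
        InterruptDone(g_sysIntr);
    }

    /* Cleanup: unhook the interrupt; this ends up in OEMInterruptDisable. */
    InterruptDisable(g_sysIntr);
    return 0;
}
```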

WinCE 5 vs WinCE 6 [Windows]

Boot Sequence

Windows CE 5.0:
- The kernel was hard-coded to launch filesys.exe as the second process.
- The starting order of system processes and applications was controlled by registry settings under HKEY_LOCAL_MACHINE\Init.

Windows Embedded CE 6.0:
- Most of the system processes have been converted to DLLs that load inside the kernel process; the system uses a loading mechanism that supports both DLLs and executables.
- You cannot specify command-line parameters for executables. This is unchanged from previous versions of Windows Embedded CE.
- Application-loading sequences are specified under the HKEY_LOCAL_MACHINE\Init registry key. You can specify the load order of system DLLs, executables, and applications there; the OS launches the executables listed under this key and loads the DLLs listed under it into the kernel.
- You can specify which DLL entry point to call when the application starts.
- The startup sequence is similar to the service startup sequence in Services.exe and the registry enumerator.

Virtual Memory Layout

Windows CE 5.0:
1. There was a limit of 32 processes, and a 32 MB limit on virtual memory (VM) for each process.
2. All of the processes shared the same 4 GB address space.
3. The current application executed in slot zero.

Windows Embedded CE 6.0:
1. The kernel process resides in the upper 2 GB of the 4 GB (32-bit) virtual address space, and the bottom 2 GB is unique for each process.
2. There is a limit of about 32,000 processes, due to the number of handles that can be created. The practical limit on the number of processes is bounded by the amount of physical memory.
3. Slot zero, and all other process slots, are effectively 2 GB each.
4. Because virtual memory access is translated into hardware access through the memory management unit (MMU), virtual memory code is CPU-dependent. ARM and x86 CPUs use hardware page tables, so the contents of virtual memory are accessed directly by the hardware.

Handle Tables

Windows CE 5.0:
1. A handle to an event, a thread, a process, a mutex, and so on, could be passed to a different process and be usable by that process.
2. If two handles were opened to the same object, the handle value, which is the 32-bit identifier that an application holds onto, was the same for both handles. The same handle value was also returned for duplicated handles, or for a named handle of the same name.
3. Handle values could be generated (guessed), which could be used to elevate user privileges.
4. There was one global list of handles that was shared by all processes.
5. A handle object could be destroyed while it was still in use, for example, in the middle of a call to an API.
6. The reuse count was only 3 bits, so a handle could easily be reused. Therefore, stale handles could be a problem.

Windows Embedded CE 6.0:
1. Each process has its own individual handle table.
2. If two handles to the same object are opened, they have different values, even within the same process. Similar to Windows desktop-based operating systems, every newly created handle, including newly duplicated handles, has a unique value (illustrated in the sketch after this list).
3. A handle value in one process cannot be used to access that object in another process. This includes the kernel: the kernel process, including the kernel-mode servers running under it, has its own handle table and cannot use a handle that was opened by a different process.
4. If you try to pass a handle from one process to another, that handle value may already be used by the other process to refer to a different object, and the other process would end up accessing that different object. This improves security because handle guessing is eliminated.
5. Handles are looked up in the handle table, which the user cannot access. This prevents users from generating handle values.
6. Handle objects are locked before they are used and unlocked after they are used. The locking mechanism is a reference counter, which is incremented before the handle is used and decremented after the handle is used. Handle objects are destroyed only when their reference counter equals zero, so handle objects in use are never destroyed.
7. Handles cannot easily be reused because there is a 16-bit reuse count. Therefore, about 64,000 operations to create and close handles would be required to exhaust the reuse count.

Tradeoffs of the Handle Tables
Handles are only valid within the process that created them. Handles cannot be passed between processes. Existing code that passes handles from process to process must be changed to use DuplicateHandle to create a new handle that the other process can use, as sketched below. If necessary, use OpenProcess to access the other process.
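
For example, a minimal sketch of handing an event handle to another process under CE 6.0. DuplicateHandle, OpenProcess, and GetCurrentProcess are the standard APIs; the function name, the target process ID, and the error handling are simplified placeholders.

```c
#include <windows.h>

/* Give another process (identified by its process ID) access to our event.
   Returns the handle value that is valid *in the target process*. */
HANDLE ShareEventWithProcess(HANDLE hEvent, DWORD dwTargetProcId)
{
    HANDLE hTargetProc;
    HANDLE hRemote = NULL;

    hTargetProc = OpenProcess(0, FALSE, dwTargetProcId);
    if (hTargetProc == NULL)
        return NULL;

    /* Create an equivalent handle in the target process's handle table;
       our local handle value would be meaningless over there. */
    if (!DuplicateHandle(GetCurrentProcess(),  /* source process        */
                         hEvent,               /* handle in our table   */
                         hTargetProc,          /* target process        */
                         &hRemote,             /* value valid in target */
                         0, FALSE,
                         DUPLICATE_SAME_ACCESS))
    {
        hRemote = NULL;
    }

    CloseHandle(hTargetProc);
    /* hRemote must then be communicated to the target process by some IPC
       mechanism (message queue, registry, API call, ...). */
    return hRemote;
}
```
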
Mapping Pointers and Sharing Memory between Processes

Windows CE 5.0:
- When a function was called, the kernel performed only a 1-byte access check on pointer parameters and marshaled each pointer by mapping it.
- The OS address space was arranged so that the 32-MB virtual-memory slot of every process was accessible at all times. Therefore, mapping required only modifying the pointer to point into the right memory slot of the target process.
- The server process was responsible for calling a function such as MapCallerPtr to verify that the caller process had sufficient privileges to access the entire buffer, and it was responsible for access-checking and marshaling any embedded pointers. Both the access check and the marshaling were covered by calling MapCallerPtr.
- The server was also responsible for performing a secure copy, if necessary.

Windows Embedded CE 6.0:
- When an application calls a function, the kernel performs the full access check on buffer parameters, taking responsibility for that protection away from the server processes. This is made possible by giving the kernel more information about parameter sizes.
- The kernel still marshals pointer parameters, but servers must still marshal embedded pointers and must still perform a secure copy, if necessary.
- What has changed:
-- The kernel has taken over full access checking of pointer parameters.
-- The functions that servers must call to marshal parameters.
-- The options available for servers to perform a secure copy.
-- Marshaling must be handled explicitly if the server needs to access a buffer asynchronously (a sketch follows this list).
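
As a concrete illustration of the asynchronous case, here is a sketch of a CE 6.0 kernel-mode server that keeps using a caller's buffer after the API call returns. CeAllocAsynchronousBuffer/CeFreeAsynchronousBuffer are CE 6.0 marshaling helpers, but their signatures are quoted from memory and the surrounding driver code (AbcStartAsyncFill, AbcCompleteAsyncFill, the globals) is hypothetical, so verify against the marshaling header (marshal.hpp) in your tree.

```c
#include <windows.h>
#include <marshal.hpp>   /* CE 6.0 marshaling helpers for servers */

static LPVOID g_pAsyncBuf;   /* alias that stays valid after the call returns */
static LPVOID g_pSyncBuf;    /* kernel-marshaled pointer parameter            */
static DWORD  g_cbBuf;

/* Runs in the context of the caller's API call. */
HRESULT AbcStartAsyncFill(LPVOID pMarshalledParam, DWORD cbParam)
{
    g_pSyncBuf = pMarshalledParam;
    g_cbBuf    = cbParam;

    /* Re-marshal the output buffer so it can be touched later, outside
       the caller's call context (ARG_O_PTR = output pointer). */
    return CeAllocAsynchronousBuffer(&g_pAsyncBuf, g_pSyncBuf, g_cbBuf,
                                     ARG_O_PTR);
}

/* Runs later, e.g. from a worker thread or an IST, when data is ready. */
void AbcCompleteAsyncFill(const BYTE *pData, DWORD cbData)
{
    if (cbData > g_cbBuf)
        cbData = g_cbBuf;
    memcpy(g_pAsyncBuf, pData, cbData);

    /* Release the asynchronous alias (writes output data back if needed). */
    CeFreeAsynchronousBuffer(g_pAsyncBuf, g_pSyncBuf, g_cbBuf, ARG_O_PTR);
    g_pAsyncBuf = NULL;
}
```
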
Loading DLLs in Kernel Mode or User Mode
- One DLL cannot simultaneously load into both kernel mode and user mode.
- When you run an executable file or a DLL, the code must be able to run at its specified address. For example, jumps within the code or references to global variables must be modified to refer to the hard-coded addresses. Because kernel DLLs load above 0x80000000 and user DLLs load below 0x80000000, it is impossible for a single copy of the DLL to be able to run in both locations at once.
- In Windows Embedded CE 6.0, the OS bypasses this limitation by having two versions of some DLLs; one version to load in kernel mode and one version to load in user mode.

- The kernel has a new naming standard for DLLs that load in both kernel mode and user mode: the kernel-mode DLL has a "k." at the beginning of its name. For example, the user-mode version of coredll is coredll.dll, whereas the kernel-mode version is k.coredll.dll.

- The kernel automatically redirects kernel-mode accesses to user-mode DLLs to the proper kernel-mode version. For example, if a DLL is linked to the user-mode coredll.dll, as almost all DLLs in the build system are, then when that DLL is loaded into the kernel process it imports from the kernel-mode k.coredll.dll instead. There is no load error in this case, because the imports are seamlessly redirected to the kernel DLL. Similarly, if code in the kernel process calls LoadLibrary on coredll.dll, it actually loads a reference to k.coredll.dll; therefore, if it then calls GetProcAddress and calls a function, it calls the proper kernel-mode function (see the sketch after this list).

- If you implement DLLs, you do not need to change all coredll.dll references to k.coredll.dll references.
- In fact, that would destroy portability of your code to user mode. In the future, you may want to run your code in user mode, instead of inside the kernel.
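
A quick sketch of what the redirection described above means in practice for code running in the kernel process. LoadLibrary, GetProcAddress, and FreeLibrary are the standard APIs; AbcDoSomething is a hypothetical export used purely for illustration.

```c
#include <windows.h>

typedef BOOL (WINAPI *PFN_ABCDOSOMETHING)(DWORD);

void KernelModeLoadDemo(void)
{
    /* Asking for coredll.dll from kernel-mode code: the loader silently
       resolves this to k.coredll.dll, so the load does not fail. */
    HMODULE hCore = LoadLibrary(TEXT("coredll.dll"));
    if (hCore == NULL)
        return;

    /* GetProcAddress therefore hands back the kernel-mode entry point. */
    PFN_ABCDOSOMETHING pfn =
        (PFN_ABCDOSOMETHING)GetProcAddress(hCore, TEXT("AbcDoSomething"));
    if (pfn != NULL)
        pfn(0);

    FreeLibrary(hCore);
}
```
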
OEMs may need to specify where their DLLs load by using .bib file flags. The Z flag specifies Windows CE 5.0 and prior versions; Z cannot be combined with the compression flag C.

Process Switching

Windows CE 5.0:
- Filesys.exe, GWES.exe, and Device.exe are separate processes, so calling between them requires a process switch.

Windows Embedded CE 6.0:
- Filesys, GWES, and Device have become libraries (filesys.dll, gwes.dll, device.dll) that are loaded into the kernel. Now that they are libraries, there is no process switch between them.
- Services.exe is still a user-mode process.
Paging Pool

Windows CE 5.0:
- The paging pool was a fixed-size portion of memory reserved for executable code and read-only, memory-mapped files.
- If the amount of code loaded into memory at any time was smaller than the pool, the pool did not shrink to match.
- If the amount of code loaded at any time was larger than the pool, only the most recently used pages were held in memory, up to the size of the pool; the pool did not grow to hold more code, so executable code was discarded.
- OEMs controlled the size of the paging pool with an OEM adaptation layer (OAL) variable, cbNKPagingPoolSize.
- The paging pool maintained a physically contiguous range of pages and kept an array of information to track each page in the pool.
- Pages were kept in a global first-in, first-out (FIFO) list, and in a per-module list for quick discarding of a module on unload.

Windows Embedded CE 6.0:
- The pool can temporarily exceed its reserved size. Rather than setting a single size for the pool, an OEM can now choose a target size for the pool, plus a maximum size.
- There is always enough memory reserved for the pool to grow to reach its target. Above that, if memory is available, the pool continues to grow, up to the maximum limit; the pool never exceeds the maximum.
- As soon as the pool exceeds the target, the kernel wakes up a trimming thread that walks through modules and discards pages. The trimming thread normally runs at low priority, so that paging operations do not disturb other system activity. If the pool approaches its maximum limit, the trimming thread temporarily runs at higher priority to ensure that pool memory is released.
- There are now two pools. One pool, the loader pool, holds executable code and read-only, memory-mapped files, as the Windows CE 5.0 paging pool did. A second pool, the file pool, holds data pages from memory-mapped files and from the new CE 6.0 file system cache manager.
- The file pool differs from the loader pool in that its pages may also be read/write, so they may need to be written back to disk before being discarded. Therefore, the file pool can be configured separately from the loader pool to maximize system performance. Because file writes are relatively slow operations, the file pool trimming thread runs at lower priority than the loader pool trimming thread.
- The OAL variable cbNKPagingPoolSize is deprecated. Instead, to configure paging pool parameters, OEMs can implement a new IOCTL in their OAL, IOCTL_HAL_GET_POOL_PARAMETERS (a sketch follows this list). If the OAL does not implement this IOCTL, the kernel chooses suitable default settings.
- The kernel also exposes information about the current state of the paging pools to applications through the new kernel IOCTL IOCTL_KLIB_GET_POOL_STATE. Applications can call this IOCTL to query current memory usage and other state information about the paging pools.
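
A heavily hedged sketch of the OAL side of this. IOCTL_HAL_GET_POOL_PARAMETERS is the IOCTL named above and OEMIoControl is the standard OAL IOCTL entry point, but the parameter structure shown here (AbcPoolParams and its fields) is a made-up stand-in, and whether the kernel expects it in the input or output buffer should also be verified: consult the CE 6.0 headers (pkfuncs.h and the OAL IOCTL headers) for the real definition.

```c
#include <windows.h>
/* IOCTL_HAL_GET_POOL_PARAMETERS comes from the OAL/kernel IOCTL headers. */

/* Hypothetical stand-in for the kernel's pool-parameter structure. */
typedef struct {
    DWORD cbLoaderPoolTarget;
    DWORD cbLoaderPoolMaximum;
    DWORD cbFilePoolTarget;
    DWORD cbFilePoolMaximum;
} AbcPoolParams;

BOOL OEMIoControl(DWORD dwCode, LPVOID pInBuf, DWORD cbIn,
                  LPVOID pOutBuf, DWORD cbOut, LPDWORD pcbReturned)
{
    switch (dwCode)
    {
    case IOCTL_HAL_GET_POOL_PARAMETERS:
        if (pOutBuf == NULL || cbOut < sizeof(AbcPoolParams))
            return FALSE;        /* kernel then uses its default settings */
        {
            AbcPoolParams *p = (AbcPoolParams *)pOutBuf;
            p->cbLoaderPoolTarget  =  4 * 1024 * 1024;   /* example sizes */
            p->cbLoaderPoolMaximum = 16 * 1024 * 1024;
            p->cbFilePoolTarget    =  2 * 1024 * 1024;
            p->cbFilePoolMaximum   =  8 * 1024 * 1024;
        }
        if (pcbReturned)
            *pcbReturned = sizeof(AbcPoolParams);
        return TRUE;

    /* ... other IOCTLs handled as usual ... */
    default:
        return FALSE;
    }
}
```
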
Kernel Mode

Windows CE 5.0:
1. SetKMode and the ALLKMODE flag exist.

Windows Embedded CE 6.0:
1. Modules (DLLs) are loaded in the kernel process, a user process, or both.
2. Any thread owned by a module loaded in the kernel process is said to run in kernel mode.
3. Any thread owned by a module loaded in a user process is said to run in user mode.
4. It is possible for threads to migrate to different processes, but in general this happens only for a short duration, when the threads are in process server library (PSL) servers.
Kernel Servers

Windows CE 5.0:
- A process server library (PSL) is a separate process that implements a set of APIs for applications to call.

Windows Embedded CE 6.0:
- A kernel-mode server is a DLL that loads into the kernel process and implements a set of APIs for applications to call. Kernel.dll, filesys.dll, device.dll, gwes.dll, and most device drivers are kernel-mode servers.
- The kernel-mode servers are supported by a kernel-only version of coredll named k.coredll.dll. Any code that loads into the kernel but was linked to coredll.dll is automatically redirected to use k.coredll.dll instead.
- A user-mode server is a process that registers an API set. Services.exe is a user-mode server that loads some drivers that in previous versions of Windows Embedded CE were loaded by filesys, device, and gwes.
- This gives simplified and improved security for API calls, and improved performance for most API calls.


Marshalling in WinCE [Windows]

Marshaling Mechanics
- There are two possible ways that a buffer can be marshaled so that the server can use it:
-- Duplication: Allocate a new buffer for the server to use, and copy data into and out of the buffer as necessary.
- The server owns the buffer for the duration of the call.
- Duplication gives the server access to the buffer and prevents the caller from asynchronously modifying the buffer after it has been validated by the server.


-- Aliasing: Map part of the memory of the caller process into the memory space of the server process.
- Because every process has a separate address space, it is not possible to access another process's memory simply by using a pointer. Instead, the kernel aliases part of the caller's memory into the address space of the server, using the VirtualCopyEx function.
- The same piece of physical memory is temporarily shared between the two processes by mapping it to two different virtual address ranges inside the two processes.
- This form of mapping involves allocating a new virtual address range, instead of just re-basing a pointer.


Access checking: Verifying that the caller process has sufficient privileges to access a buffer.


Marshaling: Preparing a pointer that a server can use to access a caller’s buffer.


Secure-copy: Making a copy of a buffer to prevent asynchronous modification by the caller.


Pointer parameter: Pointer that is passed as a parameter to an API.


Embedded pointer: Pointer that is passed to an API by storing it inside a buffer or inside of another embedded pointer.
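
To make the embedded-pointer case concrete, here is a sketch of a CE 6.0 kernel-mode server marshaling a pointer stored inside a caller-supplied structure. CeOpenCallerBuffer/CeCloseCallerBuffer are CE 6.0 marshaling helpers, but their signatures are given here as I recall them, and the structure ABC_REQUEST and the handler are hypothetical, so verify against marshal.hpp in your SDK.

```c
#include <windows.h>
#include <marshal.hpp>   /* CE 6.0 marshaling helpers for servers */

/* Hypothetical request structure passed to a server API: it carries an
   embedded pointer (pData) that the kernel does NOT marshal for us. */
typedef struct {
    DWORD  cbData;
    BYTE  *pData;      /* embedded pointer into the caller's address space */
} ABC_REQUEST;

/* pReq itself is a pointer parameter, already access-checked and
   marshaled by the kernel before this function runs. */
BOOL AbcProcessRequest(ABC_REQUEST *pReq)
{
    LPVOID  pDataMarshalled = NULL;
    HRESULT hr;
    BOOL    fOk = FALSE;

    /* The embedded pointer is the server's responsibility: access-check
       and marshal it explicitly (ARG_I_PTR = input-only buffer). */
    hr = CeOpenCallerBuffer(&pDataMarshalled, pReq->pData, pReq->cbData,
                            ARG_I_PTR, FALSE /* no forced duplication */);
    if (SUCCEEDED(hr))
    {
        /* ... consume the data through pDataMarshalled ... */
        fOk = TRUE;
        CeCloseCallerBuffer(pDataMarshalled, pReq->pData, pReq->cbData,
                            ARG_I_PTR);
    }
    return fOk;
}
```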

Virtual memory layout - WinCE6 [Windows]

Benefits of the Virtual Memory Layout
  1. Switching processes on ARM and x86 is faster because it is simpler to make use of the hardware page tables.
  2. The time to handle TLB misses on ARM and x86 remains about the same.

Tradeoffs of the Virtual Memory Layout


  1. The virtual memory for every process is no longer accessible at all times; only the virtual memory for the kernel process and the current process is accessible at all times.
  2. Therefore, accessing the memory of another process, particularly buffer parameters that are passed to a server, is no longer as simple as mapping a pointer.
  3. More complicated reference counting.
  4. More complicated interprocess communication (IPC) and buffer passing.