Tuesday, August 15, 2023

Windows

There's a ton of stuff one could say about Windows (and operating systems in general). I'm not sure it could all be succinctly described in a single human lifetime. This is just a scratch note about Windows system programming. Windows is, in some sense, quite a lot like Linux. But in some ways it's different, and at times more complex.

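In the Linux world, things exist in a monolithic kernel: drivers, file systems, and the network stack all run in kernel space, and most devices are exposed to programs as files. Windows NT, by contrast, is usually described as a hybrid kernel. Both systems separate user space from kernel space, but Windows places more of its functionality in user-mode subsystems and services layered on top of the kernel.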

Windows gives each user-mode application a block of virtual addresses. This is known as the user space of that application. The other large block of addresses, known as system space or kernel space, cannot be directly accessed by the application.
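The bounds of that user-mode block can be queried at run time. The following is a minimal sketch, assuming the documented GetSystemInfo call; the helper name is_user_mode_address is hypothetical and introduced here only for illustration. On a non-Windows compiler only the helper is built:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if addr lies within [lo, hi], the range of
   addresses the OS reports as usable by applications, and 0 otherwise. */
static int is_user_mode_address(uintptr_t addr, uintptr_t lo, uintptr_t hi) {
    assert(lo <= hi);
    return addr >= lo && addr <= hi;
}

#ifdef _WIN32
#include <windows.h>

int main(void) {
    SYSTEM_INFO si;
    int local = 0;

    /* GetSystemInfo fills in the lowest and highest addresses available
       to user-mode applications and DLLs. */
    GetSystemInfo(&si);

    uintptr_t lo = (uintptr_t)si.lpMinimumApplicationAddress;
    uintptr_t hi = (uintptr_t)si.lpMaximumApplicationAddress;

    printf("User space: %p - %p\n",
           si.lpMinimumApplicationAddress, si.lpMaximumApplicationAddress);
    printf("&local is %s the user-mode range\n",
           is_user_mode_address((uintptr_t)&local, lo, hi) ? "inside" : "outside");
    return 0;
}
#endif
```

Any address above the reported maximum belongs to system space, and a user-mode read or write to it raises an access violation.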

Windows also has varying functionality for various devices, etc. Simply put, Windows implements a lot of different systems within itself. We have user space and kernel space, as outlined in the architecture overview below. In the world of Windows, more or less everything in user space goes through NTDLL.DLL to hand work off to the Windows kernel, which involves a transition from user mode to kernel mode. Other calls, such as C and C++ runtime functions, are served by libraries such as MSVCRT.DLL, MSVCP*.DLL, and CRTDLL.DLL.

But nonetheless, kernel mode is quite monolithic in some sense. This is both good and bad, depending on how you look at it. A lot of the Windows kernel is written in C, although there are some recent efforts to introduce memory-safe languages like Rust, which may reduce some of the more negative tradeoffs.

All code that runs in kernel mode shares a single virtual address space. Therefore, a kernel-mode driver isn't isolated from other drivers and the operating system itself. If a kernel-mode driver accidentally writes to the wrong virtual address, data that belongs to the operating system or another driver could be compromised. If a kernel-mode driver crashes, the entire operating system crashes.

Windows Architecture Overview


User Space:
  • System Processes
    • Session manager
    • LSASS
    • Winlogon
  • Services
    • Service control manager
    • SvcHost.exe
    • WinMgmt.exe
    • SpoolSv.exe
    • Services.exe
  • Applications
    • Task Manager
    • Explorer
    • User apps
    • Subsystem DLLs
  • Environment Subsystems
    • Win32
    • POSIX
    • OS/2
Kernel Space:
  • Kernel Mode
    • Kernel mode drivers
    • Hardware Abstraction Layer (HAL)
  • System Threads
    • System Service Dispatcher
    • Virtual Memory
    • Processes and Threads
  • Security
    • Security Reference Monitor
  • Device & File Systems
    • Device & File System cache
    • Kernel Drivers
    • I/O manager
    • Plug and play manager
    • Local procedure call
    • Graphics drivers
  • Hardware Abstraction Layer (HAL)
    • Hardware interfaces

Windows Ecosystem

OK, so of course the real question is, how do we interact with the Windows ecosystem to actually do things? Like other software ecosystems, Windows gives us a set of libraries that expose functions we can call. Consider the CreateFileA API. Per Microsoft's documentation, here is the prototype for this interface:

HANDLE CreateFileA(
  [in]           LPCSTR                lpFileName,
  [in]           DWORD                 dwDesiredAccess,
  [in]           DWORD                 dwShareMode,
  [in, optional] LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  [in]           DWORD                 dwCreationDisposition,
  [in]           DWORD                 dwFlagsAndAttributes,
  [in, optional] HANDLE                hTemplateFile
);

The parameters are a file name, desired access, share mode, security attributes (optional), a creation disposition, flags and attributes, and a template file (optional). We'll also use printf and scanf to read some input. First we'll get a directory path, and then a name for our new file. We'll concatenate the two into a full path, pass it to CreateFileA, and store the returned handle in hFile. And we'll define a constant that points to the content we wish to write to our text file.

We'll use FormatMessageA, as listed in Microsoft's documentation, to obtain an error message if CreateFileA fails. We also check the WriteFile call with our if (!WriteFile(...)) statement: if the write fails, we report the failure, close our handle, and return a failure status. Otherwise, once the file has been written, we close our handle, print a message with the joined fullPath of our file, and exit cleanly with 0:

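#include <stdio.h>
#include <string.h>
#include <windows.h>

int main(void) {
    char path[MAX_PATH];
    char filename[MAX_PATH];
    char fullPath[MAX_PATH];
    HANDLE hFile;
    DWORD bytesWritten;

    // Get user input for path and filename.
    // %259s caps each read at MAX_PATH - 1 characters so the buffers cannot
    // overflow. Note that %s stops reading at the first whitespace character.
    printf("Enter the path: ");
    if (scanf("%259s", path) != 1) {
        printf("Failed to read the path.\n");
        return 1;
    }

    printf("Enter the filename: ");
    if (scanf("%259s", filename) != 1) {
        printf("Failed to read the filename.\n");
        return 1;
    }

    // Join the two parts. snprintf returns the length it needed, so a value
    // of sizeof(fullPath) or more means the result was truncated.
    int len = snprintf(fullPath, sizeof(fullPath), "%s\\%s", path, filename);
    if (len < 0 || len >= (int)sizeof(fullPath)) {
        printf("The full path is too long.\n");
        return 1;
    }

    // CREATE_NEW fails if the file already exists; a share mode of 0 means
    // no other handle can open the file while ours is open.
    hFile = CreateFileA(fullPath, GENERIC_WRITE, 0, NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);

    if (hFile == INVALID_HANDLE_VALUE) {
        DWORD error = GetLastError();
        LPSTR errorMsg = NULL;
        // With FORMAT_MESSAGE_ALLOCATE_BUFFER the system allocates the buffer
        // and stores its address in errorMsg; we release it with LocalFree.
        DWORD msgLen = FormatMessageA(
            FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
            NULL,
            error,
            0, // Default language
            (LPSTR)&errorMsg,
            0,
            NULL
        );
        if (msgLen != 0) {
            printf("Failed to create the file: %s\n", errorMsg);
            LocalFree(errorMsg);
        } else {
            printf("Failed to create the file: error %lu\n", (unsigned long)error);
        }
        return 1;
    }

    const char* content = "Noted";
    if (!WriteFile(hFile, content, (DWORD)strlen(content), &bytesWritten, NULL)) {
        printf("Failed to write to the file.\n");
        CloseHandle(hFile);
        return 1;
    }

    CloseHandle(hFile);

    printf("File created successfully: %s\n", fullPath);

    return 0;
}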
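One limitation of reading input with scanf's %s conversion is that it stops at the first whitespace character, so a path such as C:\Program Files cannot be entered. A minimal sketch of a line-based alternative follows; read_line is a hypothetical helper name, not a Windows or C library function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` and strips the
   trailing newline. Returns 0 on success, or -1 on end of file, on a read
   error, or when the buffer filled up before a newline was seen. */
static int read_line(FILE *in, char *buf, size_t size) {
    assert(in != NULL && buf != NULL && size > 1);

    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }

    /* Index of the first CR or LF, or of the terminating NUL if none. */
    size_t end = strcspn(buf, "\r\n");

    /* Buffer is full and no newline was found: the line may have been cut. */
    if (buf[end] == '\0' && end == size - 1) {
        return -1;
    }

    buf[end] = '\0';
    return 0;
}
```

In the program above, a call such as read_line(stdin, path, sizeof(path)) could stand in for the scanf call, which would keep spaces in the path intact.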

Just a Prologue

After compiling, we can run the program and make some observations about Windows system behavior: the prologue, stack unwinding, and its use of undocumented calls, all of which are abstracted and hidden away from the user.

C:\Users\User\Downloads>.\createfile.exe
Enter the path: C:\Users\User
Enter the filename: ts
File created successfully: C:\Users\User\ts

For example, when we first run our program, we immediately land in NTDLL, which sets up the thread and begins the work of executing our program. RtlUserThreadStart is the user-mode entry point of the thread. We can see this here:

0, ntdll.dll!RtlUserThreadStart

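After hitting the first return, we can pull a stack trace to see that our thread has unwound a bit and now passes through KERNEL32.DLL. Despite its name, KERNEL32.DLL is not the kernel: it is a user-mode DLL that exports much of the documented Win32 API, and it relies on NTDLL.DLL to reach the kernel.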

0, ntdll.dll!NtWaitForWorkViaWorkerFactory+0x14
1, ntdll.dll!RtlClearThreadWorkOnBehalfTicket+0x35e
2, kernel32.dll!BaseThreadInitThunk+0x1d
3, ntdll.dll!RtlUserThreadStart+0x28

During this time, we see multiple calls to LdrpInitializeProcess, which initializes the structures in our process. Then we see BaseThreadInitThunk, a user-mode thread start routine in KERNEL32.DLL that plays a role similar to LdrInitializeThunk in NTDLL.DLL, and a call to RtlImageNtHeader to get the image headers for our process.

Skipping forward a bit, when we enter our path and filename, those values are moved into registers, as shown here. Following this, many cmp comparisons are made to check that the path is OK:

mov rbx,qword ptr ss:[rsp+70] | __pioinfo
mov rsi,qword ptr ss:[rsp+78] | Users\\User\n\n 

After a very long dance handling the file path, we finally see instructions involving our filename emerge. A pointer to the filename is loaded into a register like so:

push rbx                        | rbx:&"ts\n\nsers\\User\n\n"
sub rsp,20                      |
mov rbx,rcx                     | rbx:&"ts\n\nsers\\User\n\n"
lea rcx,qword ptr ds:[<_iob>]   | 
cmp rbx,rcx                     | 
jb msvcrt.7FFF040306F5          |
lea rax,qword ptr ds:[7FFF04088 |
cmp rbx,rax                     | rbx:&"ts\n\nsers\\User\n\n"
ja msvcrt.7FFF040306F5          |

Much later on, when our file is created, we see a reference to EtwEventWriteTransfer, which suggests that this file creation could have been logged by Event Tracing for Windows (ETW).

call createfile.7FF60B1C6D00    |
jmp createfile.7FF60B1C860C     |
sub r10d,2                      |
mov rcx,qword ptr ds:[r13]      | rcx:"ts", [r13]:"ts"
lea rbx,qword ptr ds:[r13+8]    | [r13+8]:EtwEventWriteTransfer+260

Many instructions later, we finally see an lea (load effective address) that loads the address of the message we're writing to the text file, "Noted":

call rax                        |
mov eax,1                       |
jmp createfile.7FF60B1C1743     |
lea rax,qword ptr ds:[7FF60B1D1 | 00007FF60B1D104F:"Noted"
mov qword ptr ss:[rbp+300],rax  |
mov rax,qword ptr ss:[rbp+300]  |

And a syscall for NtWriteFile:

mov r10,rcx                     | NtWriteFile
mov eax,8                       |
test byte ptr ds:[7FFE0308],1   |
jne ntdll.7FFF055AEE55          |
syscall                         |
ret                             |
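The stub above starts with a recognizable byte pattern on x64: 4C 8B D1 encodes mov r10,rcx and B8 followed by a 32-bit little-endian value encodes mov eax,imm32. As a rough sketch, the system call number can be read back out of those bytes. The helper name syscall_number_from_stub is hypothetical, and system call numbers are not stable across Windows builds, so the value 8 should be treated as specific to the system traced here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the system call number encoded in an x64
   ntdll stub, or -1 if the bytes do not begin with
   4C 8B D1 (mov r10,rcx) followed by B8 imm32 (mov eax,imm32). */
static int64_t syscall_number_from_stub(const uint8_t *stub, size_t len) {
    assert(stub != NULL);

    if (len < 8) {
        return -1;
    }
    if (stub[0] != 0x4C || stub[1] != 0x8B || stub[2] != 0xD1 || stub[3] != 0xB8) {
        return -1;
    }

    /* The immediate operand is stored least significant byte first. */
    uint32_t number = (uint32_t)stub[4]
                    | ((uint32_t)stub[5] << 8)
                    | ((uint32_t)stub[6] << 16)
                    | ((uint32_t)stub[7] << 24);
    return (int64_t)number;
}
```

The test instruction that follows in the listing checks a flag byte in shared memory before choosing the syscall path. The sketch ignores it and only decodes the first two instructions.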

And lastly, our call to CloseHandle:

mov rax,qword ptr ds:[<CloseHandle>] | rax:CloseHandle
call rax                             | rax:CloseHandle

Much more happens, but this is the gist of it.

Most of the stuff in the Microsoft API is well documented. Some of the code is even partially compatible with Unix systems. Other things in the Microsoft ecosystem, however, are not officially documented. Microsoft gives us some public APIs, some of which are just wrappers that call undocumented features under the hood. In a future post, we'll use an undocumented API to talk to the Windows kernel.
