A thread is an instance of execution that operates within the boundaries of a process. A process is not scheduled to execute; the threads within a process are. There may be many threads executing in the context of one process. Although a thread may have "thread-specific storage", in general all memory and resources created in the context of the process can be used by any executing thread.

Global and Local Resources

Not to confuse matters, but there are exceptions. Some resources are created globally rather than locally, which means they may be used outside the context of the process in which they were created. One such example is a window handle. These resources have their own boundaries outside of a process: some may be system wide, others desktop or session wide. There are also "shared" resources, where processes can negotiate sharing of a resource through other means and mechanisms.

What is Virtual Memory?

"Virtual Memory" is generally thought of as fooling the system into thinking there's more physical memory than there really is. This is true and false at the same time; it depends on who "the system" is, and there's really more to it than that. The system is not being fooled into thinking there's more memory than there really is. The hardware already knows there's less memory, and the hardware is actually what implements the mechanisms needed to support virtual memory. The Operating System is what uses those capabilities to provide virtual memory, so it is not being fooled either. So, who is being fooled? If anyone is being fooled, it's the processes running on the system. I don't believe that to be the case either. The application programmer generally knows the system he is programming for. That means he knows whether the Operating System uses virtual memory (or does not, such as DOS), and he programs for that platform. In general, it doesn't mean anything; a simple application really doesn't care, as long as it gets to execute. The only time you really run into trouble is a "cooperative multitasking" system versus a "preemptive multitasking" system, but then again, the programmer knows his target platform and programs appropriately. The differences between those two types of Operating Systems are beyond the scope of this article.

So, back to answering this question. The first thing that "Virtual Memory" does is abstract the physical address space of the machine. Application programs do not see or know about physical addresses; they know only "virtual" addresses. The CPU is then capable of converting a "virtual" address to a "physical" address based on tables set up by the Operating System. The details of that mechanism are beyond the scope of this document; just understand that the application uses a virtual address and the processor maps it to a physical address.

The next part of the equation is that a virtual address does not always need to point to physical memory. The Operating System can use a swap file to keep memory on disk; that way, the entire program does not have to be in physical memory at the same time. This allows many programs to be in memory and execute. If a program attempts to access a memory location that is not in physical memory, the CPU knows this: it raises a page fault, recognizing that the memory being accessed is out on disk. The Operating System is then notified and pulls that memory from disk into physical memory.
Once this is complete, the program is given back execution and continues where it left off. There are many algorithms to decide how to pull memory in from disk. Unless you plan to grow the footprint of a process in physical memory, you usually swap a page out to swap one in. There are many algorithms the OS can use to grow the process's physical footprint and swap pages in and out; a simple one is basically to evict the least frequently used page in memory. You generally want to avoid writing programs that cross page boundaries frequently, as this helps avoid "thrashing", which is swapping pages between memory and disk too often. These topics are outside the scope of this tutorial.

The next advantage of "Virtual Memory" is protection. A process cannot directly access another process's memory. At any one time, the CPU has only the virtual address mappings for the current process, so it can't resolve a virtual address in another process. This makes sense: since they are separate mappings, two processes can (and will) have the same virtual address pointing to different locations! That doesn't mean it's impossible to read another process's memory. If the Operating System has built-in support, as Windows does, you can access another process's memory. You could also do it if you could gain access to arbitrary memory locations and manipulate the CPU registers that control virtual memory mapping. Luckily, you can't: the CPU checks your privilege level before you execute sensitive assembly instructions, and "Virtual Memory" itself keeps a usermode process away from the page and descriptor tables (although there is a method in Windows 9x to get at the LDT from usermode).

What is the stack?

Now that I've described the basics of the system, I can get back to "What is a stack?". In general, a stack is a general-purpose data structure that allows items to be pushed onto it and popped off. Think of it as a stack of plates: you can put items on the top, and you can only take items off the top (without cheating). If you follow that strict rule, you have a stack. A stack is generally referred to as "LIFO", or "Last In, First Out". Programs generally use the stack as a means of temporary storage. This is generally unknown to the non-assembly programmer, as the language hides these details. However, the generated code produced by your program will use a stack, and the CPU has built-in stack support! On Intel, the assembly instructions to put something on the stack and take something off are PUSH and POP.

Getting back on track, every "thread" executing in a process has its own stack. This is because we can't have multiple threads attempting to use the same temporary storage location, as we will see in a moment.

How is a function call made?

The function call depends on the "calling convention". The "calling convention" is a basic method that the caller (the function making the call) and the callee (the function being called) have agreed on in order to pass parameters to the function and clean up the parameters afterwards. In Windows, we generally support three different calling conventions. These are "this call", "standard call" and "CDECL, or C calling convention".

This Call

This is a C++ calling convention. If you're familiar with C++ internals, member functions of an object require the "this" pointer to be passed to them; this convention is how that is done (Microsoft compilers pass it in the ECX register).

Standard Call

"Standard Call" is when the parameters are pushed onto the stack in reverse order (right to left) and the callee cleans up the stack.
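To make this concrete, here is a hypothetical sketch of how standard call typically appears in C source (assuming a Microsoft compiler targeting x86; the function name is made up):

/* Hypothetical function using the standard call convention. Most Win32
 * APIs are declared this way (on x86, the WINAPI macro expands to
 * __stdcall). Because the callee pops its own parameters, the compiler
 * ends this function with "ret 8" instead of a plain "ret". */
int __stdcall AddNumbers(int a, int b)
{
    return a + b;
}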
CDECL or C Calling Convention

The "C calling" convention basically means that the parameters are pushed onto the stack in reverse order and the caller cleans up the stack.

Pascal Calling Convention

If you've seen old programs, you will see "PASCAL" as their calling convention. In WIN32, you are actually not allowed to use it; the PASCAL macro now simply maps to the standard call convention.

Cleans up the stack?

The difference in who cleans up the stack is a big deal. The first benefit of having the callee do it is saving bytes: if the callee cleans up the stack, there don't have to be extra instructions generated at every function call to clean up the stack. The disadvantage is that you cannot use variable arguments. Variable arguments are used by functions like printf. Although it would be possible for the callee to clean up the stack even then, it's not entirely feasible: since the function does not know at compile time how many parameters are sent to it, it would have to manipulate the stack and move the return address around in order to clean up. It's easier to just let the caller clean up the stack in this case. Intel supports an instruction for the callee to clean up the stack: RET n, a return that also pops n bytes of parameters.

So, what is the stack?

A stack is a location for temporary storage. Parameters are pushed
onto the stack, then the return address is pushed onto the stack. The
flow of execution must know where to return to. The CPU is stupid; it
just executes one instruction after the other. You have to tell it
where to go. In order to tell it how to get back, we need to save the
return address, the location after the function call. There is an
assembly instruction that does this for us: CALL, which pushes the return address and then jumps to the function. The layout of the stack would then be the following:

[Parameter n      ]
[...              ]
[Parameter 2      ]
[Parameter 1      ]
[Return Address   ]
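As a rough illustration (a hypothetical sketch; the function names are made up, and the assembly in the comments is only approximately what an unoptimized Microsoft x86 compiler emits), here is a C-calling-convention call and what it does to the stack:

#include <stdio.h>

/* A plain C-calling-convention function: the caller pushes c, b, a (in
 * that order), then the CALL instruction pushes the return address. */
int __cdecl Sum3(int a, int b, int c)
{
    return a + b + c;
}

int main(void)
{
    /* For Sum3(1, 2, 3) an unoptimized x86 build roughly emits:
     *     push 3            ; parameters pushed in reverse order
     *     push 2
     *     push 1
     *     call Sum3         ; pushes the return address, jumps to Sum3
     *     add  esp, 12      ; caller (cdecl) cleans the parameters up
     * so inside Sum3 the stack holds the three parameters above the
     * return address, matching the layout shown above. */
    printf("%d\n", Sum3(1, 2, 3));
    return 0;
}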
Before returning, the stack is cleaned up down to the return address and then a "return" is issued. If the stack is not kept in proper order, we may get out of sync and return to the wrong address! This can cause a trap, obviously!

What is the "base pointer"?

The "base pointer" on Intel is generally EBP. Generally, it's used as a fixed reference point into the current stack frame: on entry, a function pushes the previous base pointer, sets EBP to the current stack pointer, and then addresses its parameters and local variables relative to EBP.

Putting it all together

So, you can now see the reason each thread has its own stack. If they shared the same stack, they would overwrite each other's return addresses and data, or eventually would, once they ran out of stack space. That's the next problem we will discuss.

Stack Overflow

A stack overflow is when you have reached the end of your stack. Windows generally gives the program a fixed amount of usermode stack space (the kernel has its own stack), and the overflow occurs when you run out of it. Recursion is a good way to run out of stack space: if you keep recursively calling a function, you may eventually run out of stack and trap. Windows generally does not allocate all of the stack at once, but instead grows the stack as you need it; this is an optimization, obviously. We can write a small program that overflows the stack (a sketch follows below) and then, once it traps under the debugger, find out how much stack Windows gave us:

0:000> g
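Here is a minimal sketch of such a program (a hypothetical example, not necessarily the actual program used for this trace; the names are made up):

#include <stdio.h>

/* Recurses forever. Every call pushes a return address and a parameter
 * and reserves another local buffer, so the thread's stack is eventually
 * exhausted and the program traps with a stack overflow. */
static unsigned long Recurse(unsigned long depth)
{
    volatile char buffer[256];              /* burn some stack on every call */
    buffer[0] = (char)depth;
    return buffer[0] + Recurse(depth + 1);  /* addition happens after the call
                                               returns, so this is not a tail
                                               call and cannot become a loop */
}

int main(void)
{
    printf("%lu\n", Recurse(0));
    return 0;
}

Letting it run under the debugger eventually raises the stack overflow exception, at which point we can inspect the thread's stack.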
To do this, I simply used !teb, which displays all elements of the TEB, or "Thread Environment Block" (found at FS:0, as mentioned in a previous tutorial). If you subtract the stack limit from the stack base, you get the size: 1,044,480 bytes is how big a stack Windows gave us.

Stack Underflow

In general, a stack underflow is the opposite of an overflow. You've somehow thought you put more on the stack than you really have, and you've popped off too much. You've reached the beginning of the stack and it's empty, but you thought there was more data and kept attempting to pop data off.

Overflows and Underflows

Overflows and underflows can also be said to occur when your program gets out of sync and crashes thinking the stack is in a different position. The stack could underflow if you clean up too much in a function and then attempt to return: your stack is out of sync and you return to the wrong address. The reason your stack is out of sync is that you thought you had more data on it than you did; you could consider that an underflow. The opposite can also occur: you've cleaned up too little because you didn't think you had that much data on the stack, and you return. You trap when you return because you went to the wrong address. You "could" consider this an overflow, as you are out of sync, thinking you have less data on the stack than you really do.

How does the debugger get a stack trace?

This brings me to my next topic: how does a debugger get a stack trace? The first answer is simply by using "symbols". The symbols can tell the debugger how many parameters are on the stack, how many local variables there are, and so on, so the debugger can use the symbols to determine how to walk the stack and display the information.

If there are no symbols, it uses the base pointer. Each base pointer
points to the previous base pointer, and the base pointer + 4 points to the return address. This is how it then walks the stack. If every function uses the base pointer this way, the walk works; if a function does not set up a standard frame, this method falls apart, as we will see later. A rough sketch of this walk is shown below.

Here is a simple table of some function calls. I am going to use the stack trace from the first tutorial.

0:000> kb
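As a rough illustration of the frame-pointer walk just described, here is a hypothetical C sketch. It assumes x86, a Microsoft compiler (for the inline assembly), and that every function in the chain sets up a standard EBP frame; it does none of the validation a real debugger performs:

#include <stdio.h>

/* Walks the current thread's stack by following the chain of saved base
 * pointers. Assumes x86 and standard frames: [EBP+0] holds the previous
 * EBP and [EBP+4] holds the return address. A real debugger validates
 * every pointer against the stack limits in the TEB before reading it. */
void WalkStack(void)
{
    unsigned long *frame;

    __asm mov frame, ebp    /* start from our own frame (MSVC inline assembly) */

    while (frame != NULL)
    {
        unsigned long returnAddress = frame[1];                    /* saved return address */
        unsigned long *previousFrame = (unsigned long *)frame[0];  /* saved EBP            */

        printf("frame %p  return address %08lx\n", (void *)frame, returnAddress);

        /* The stack grows down, so each previous frame must sit at a higher
         * address; anything else means the chain is broken, so stop. */
        if (previousFrame <= frame)
            break;

        frame = previousFrame;
    }
}

int main(void)
{
    WalkStack();
    return 0;
}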
Since Our current

[Stack Address | Value | Description]
So, 0012fefc 77c5aca0 MSVCRT!_iob+0x20
So, we can assemble the first function: MSVCRT!_output+0x18(77c5aca0, 00000000, 0012ff44);
The second function is MSVCRT!printf. This is the calling function:

0012fef8 77c3e68d MSVCRT!printf+0x35
It then goes to the previous base pointer:

0012ff40 00000000
This is the calling function with its parameters. MSVCRT!printf+0x35(00000000, 77f944a8, 00000007);
As you can see, if anything is off, this information is wrong. That is why you must use your judgment when interpreting these values. The next 0012ff3c 00401044 temp!main+0x44
The previous parameters were at 0012ffc0 + 8. Remember, this also assumes that These are the parameters: 0012ffc8 77f944a8 ntdll!RtlpAllocateFromHeapLookaside+0x42
Our next 0012ffc4 77e814c7 kernel32!BaseProcessStart+0x23
So, 0:000> dds 0012fff0
This should be good enough since our previous return value is MSVCRT!_output+0x18 (77c5aca0, 00000000, 0012ff44);
This was our stack trace from the debugger:

ChildEBP RetAddr Args to Child
What's different and why? Well, we followed a simple rule to walk the stack.

0:000> kb
The same as ours! So, the debugger used symbolic information to walk the stack and display a more accurate picture. However, without symbolic information, there are function calls missing. That means we cannot always trust the stack trace if symbols are wrong, missing or incomplete. If we do not have symbol information for all modules, then we have a problem! If I continue with these tutorials, one of the next ones will attempt to explain symbols and how to validate them. However, I will attempt to show you one trick for validating function calls in this tutorial. As we can see, we notice we are missing a function call. How do you validate function calls? By verifying they were made.

Verifying Function Calls

I ran the program again and got a new stack trace.

0:000> kb
Some of the values on the stack are different, but that's what happens when you run programs again. You're not guaranteed the same run every time! This is your first return value: 77c3e68d. If you un-assemble it, you will get this:

0:000> u 77c3e68d
The listing reads like this: <address> <opcode> <assembly instruction in English or mnemonic>
This is the return value. What is a return value? It's the address of the next instruction after a call is made. Thus, if we keep subtracting from this value, we will eventually un-assemble the call instruction. The trick is to un-assemble enough to make out the call. Be warned though: Intel opcodes are variable length. That means they are not a fixed size, and un-assembling in the middle of an instruction can generate a completely different instruction and even a different instruction listing! So, we have to guess. Usually if we go back far enough, the instructions eventually get back on track and are un-assembled correctly.

0:000> u 77c3e68d - 20
As you can see, the return address is 77c3e68d, so 77c3e688 is the function call. Thus, we can see what function is being called. The next return address listed in the stack trace is 00401044. Let's try the same:

0:000> u 00401044 - 20
Unfortunately yes, this is assembly. This is basically a call through a function pointer: it means call the function whose address is stored at address 00402010. Use "dd" to get the value at that address.

0:000> dd 00402010
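As an aside, what the compiler generated here is simply an indirect call: the target's address lives in a pointer-sized memory slot (for imported functions, the module's import address table), and the call reads the address from there. In C terms, it is the same as calling through a function pointer (a hypothetical sketch; the name pPrint is made up):

#include <stdio.h>

/* A call through a function pointer compiles to an indirect call such as
 * "call dword ptr [pPrint]": the CPU fetches the target address from the
 * pointer's storage location and then calls it. Calls to imported
 * functions like printf go through the import address table in exactly
 * the same way. */
static int (*pPrint)(const char *format, ...) = printf;

int main(void)
{
    return pPrint("Hello\n") > 0 ? 0 : 1;
}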
We will now un-assemble this address, since we know it's the function being called.

0:000> u 77c3e658
So, yes, that is the function being called. The next return value is 77e814c7. Let's see what is being called there:

0:000> u 77e814c7 - 20
It's calling the first parameter. This is the first parameter:

0012fff0 00000000 00401064 00000000 78746341 kernel32!BaseProcessStart+0x23
So, it is calling something inside of the temp module.

0040103e ff1510204000 call dword ptr [temp+0x2010 (00402010)]
That means that we have to un-assemble this function from its start all the way to 0401064. Another way to do it would be to use DDS on the stack, find out if there are any other symbols on the stack, and verify them. If we do DDS on EBP:

0:000> dds ebp
There are a lot of unknown values. This is where guesswork comes in. If you think the function does not jump backwards, you could attempt to only look at values that are greater than this one. You could attempt to un-assemble every single reference, but you have to start somewhere. I would say, look at the symbols closest to this one first. Here is one:

0012ff50 00401147 temp+0x1147
We found this function call and it looks to be a valid address. The way to distinguish an invalid return value on the stack is that the instruction before it is not a call. Let's un-assemble this one:

0:000> u 00401000
This looks like a valid function call, and it looks like it calls 00401064.

0:000> u 0401064
If you know assembly, you can simply read through the logic and judge whether execution could or could not have made it here. If it could, then, provided two functions do not share the same assembled code base, there are two function calls missing. Now, if you just find their return values on the stack, you can find their parameter lists. What can we assume from this? Some of these functions do not use EBP as a base pointer. 00401147 is the missing return value. If we find it on the stack, we can update the correct parameters:

00000000
So, here's the one generated from KB:

0:000> kb
Here's our modified one:

ChildEBP RetAddr Args to Child
We know that the
0:000> dd 322470
Dumping the array, which only contains 1 string as per

Multiple Return Addresses On The Stack?

Why are there multiple return addresses on the stack? The stack may generally be initialized to zero, but as it's being used, it becomes dirty. You know that local variables aren't always initialized, so if you make a function call, those values aren't reset to zero when the stack moves up. If you pop a value off the stack, the stack shrinks, but the values stay in memory unless they are explicitly cleaned up. Sometimes the generated code optimizes things and doesn't clean up variables, either. So, seeing "ghost" values on the stack is very common. It is not always desirable to leave values on the stack. For
example, if your function puts the password on the stack and traps
sometime later. A stack dump may still show the password on the stack!
So, sometimes when you have sensitive information, you may want to
clean up the values on the stack before you return. One way to do this
is with the simple approach of zeroing out the sensitive buffer before the function returns.

Buffer Overflows

Buffer overflows are a common occurrence on the stack. The stack grows down in memory, but arrays grow up in memory. This is because you usually "increment" a pointer or array index to get to the next element, rather than decrementing it. Thus, let's say this was your C function:

void Function(void)
{
    char szBuffer[100];
    /* ... */
}

That would evaluate to a stack like this:

424 [Return Address ]
420 [Base Pointer   ]
320 [szBuffer       ]
As you can see, if you index your array past its last element, you start overwriting the base pointer and then the return address.

Windows 2003

Windows 2003 has a new method to attempt to prevent buffer overflows. This can be enabled in VS.NET using the /GS compiler flag. A random value is generated as a cookie on startup of the application. The cookie is then XOR'd with the return address of the function and placed on the stack after the base pointer. This is a simple example:

[Return Address ]
[Base Pointer   ]
[Cookie         ]
[szBuffer       ]
Upon return, the cookie is checked against its expected value. If it is unchanged, the return occurs; if not, then we have a problem. The reason for this security is not to prevent code from trapping without proper handling, but rather to protect the program from executing injected code. A security risk is when someone finds out how to overflow a buffer with actual code and an address pointing to that code; this will cause the program to return to and execute that code. This URL provides the full details of this:

Conclusion

I have confused beginners and probably bored advanced programmers; however, it's hard to portray advanced concepts in a simple manner. I am trying my best, though. If you like or dislike these tutorials, leave me a comment. If you want these to end, let me know too! I've probably started off too simple and then got too advanced too fast. I can't help it, though; programmers should study this information and supplement it with other sources to gain full knowledge of the subject. Do not take what you read or what is posted on message boards as concrete fact. Everyone is human, everyone errs, and no one person knows everything. These sites let just about anyone post information, so always be skeptical. Let me know if you found an error. Thanks.