 How to disassemble an executable?

TSTullamarine
post Apr 10 2020, 03:04 PM, updated 4y ago

Getting Started
**
Validating
163 posts

Joined: Apr 2020
Recently I have been motivated to revive my own x86 compiler project.

To get started, I built this console app (Release, Any CPU) on Windows 10 using .NET Framework 4.7.1.

CODE
using System;

namespace HelloWorld
{
   class Program
   {
       static void Main(string[] args)
       {
           Console.WriteLine("Hello World");
       }
   }
}


After that, I ran a CLI tool to dump the PE headers of HelloWorld.exe (4608 bytes):

CODE
[DOS Header]
NT Headers RVA         : 0x00000080

[File Header]
Machine                : 0x014C
                       IMAGE_FILE_MACHINE_I386 0x0000014C
Number Of Section      : 3
Linked At              : 0xB2A5CFAC (2064-12-23 06:37:32)
Optional Header Size   : 224
Characteristics        : 0x00000022
                       IMAGE_FILE_EXECUTABLE_IMAGE     0x00000002
                       IMAGE_FILE_LARGE_ADDRESS_AWARE  0x00000020

[Optional Header]
Magic                  : 0x10B
                       IMAGE_NT_OPTIONAL_HDR32_MAGIC   0x0000010B
ImageBase              : 0x00400000
Base Of Code           : 0x00002000
Base Of Data           : 0x00004000
OEP                    : 0x00002702
Section Alignment      : 0x2000
File Alignment         : 0x200
Linker Version         : 48.0
OS Version             : 4.0
Image Version          : 0.0
Subsystem Version      : 6.0
Image Size             : 0x8000
Code Size              : 0x800
Headers Size           : 0x200
CheckSum               : 0x00000000
Subsystem              : 3
                       IMAGE_SUBSYSTEM_WINDOWS_CUI     0x00000003
Dll Characteristics    : 0x00008560
Data Directory Size    : 16
Loader Flags           : 0x00000000

[Data Directory]
Name                            RVA             Size
Export Table                    (N/A)           (N/A)
Import Table                    0x000026AD      0x0000004F
Resource Table                  0x00004000      0x000005BC
Exception Table                 (N/A)           (N/A)
Security Table                  (N/A)           (N/A)
Base relocation Table           0x00006000      0x0000000C
Debug                           0x00002634      0x00000038
Copyright                       (N/A)           (N/A)
Global Pointer                  (N/A)           (N/A)
Thread local storage            (N/A)           (N/A)
Load Configuration              (N/A)           (N/A)
Bound Import Table              (N/A)           (N/A)
Import Address Table            0x00002000      0x00000008
Delay Import Descriptor         (N/A)           (N/A)
COM Header                      0x00002008      0x00000048
Reserved                        (N/A)           (N/A)

[Sections]
Name            RVA             V.Size          RAW             R.Size          Characteristics
.text           0x00002000      0x00000708      0x00000200      0x00000800      0x60000020
.rsrc           0x00004000      0x000005BC      0x00000A00      0x00000600      0x40000040
.reloc          0x00006000      0x0000000C      0x00001000      0x00000200      0x42000040

[Import Table]
mscoree.dll
0       _CorExeMain


If I understand correctly, the code section starts at 0x00402000 in memory, given the values below:

ImageBase : 0x00400000
Base Of Code : 0x00002000

I am not sure whether I am using a proper disassembler, because I just chose one at random: https://onlinedisassembler.com/

I cannot seem to find an x86 instruction of the form CALL DWORD PTR [XXXXXXXX] anywhere from 0x00402000 to 0x00402708. There must be a call into the Win32 API, or into mscoree.dll specifically?
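
A note on the arithmetic: ImageBase + RVA gives the address in memory, but to find the same bytes in the file on disk, the RVA has to be mapped through the section table (the RAW column). Below is a minimal Python sketch with the section values copied from the dump above; the function name `rva_to_file_offset` is made up for illustration. For a .NET x86 executable, the 6 bytes at the entry point are typically `FF 25` plus an address, i.e. JMP DWORD PTR [IAT slot of _CorExeMain] rather than a CALL, which would fit the entry point sitting 6 bytes before the end of .text (0x2702 + 6 = 0x2708).

```python
# Section table copied from the PE dump above: (name, RVA, V.Size, RAW, R.Size)
SECTIONS = [
    (".text",  0x2000, 0x708, 0x200,  0x800),
    (".rsrc",  0x4000, 0x5BC, 0xA00,  0x600),
    (".reloc", 0x6000, 0x00C, 0x1000, 0x200),
]
IMAGE_BASE = 0x00400000
OEP_RVA = 0x2702


def rva_to_file_offset(rva, sections=SECTIONS):
    """Map an RVA to a file offset using the section that contains it."""
    for name, sec_rva, vsize, raw, rsize in sections:
        if sec_rva <= rva < sec_rva + vsize:
            return rva - sec_rva + raw
    raise ValueError(f"RVA {rva:#x} is not inside any section")


entry_va = IMAGE_BASE + OEP_RVA             # address once loaded in memory
entry_offset = rva_to_file_offset(OEP_RVA)  # position in the .exe on disk
print(f"entry point VA     : {entry_va:#010x}")
print(f"entry point offset : {entry_offset:#x}")
```

With these numbers the entry point is at 0x00402702 in memory and at file offset 0x902, so that is where to start reading in a hex dump.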

Although I have made good progress since my last attempt to understand Windows PE (the "PE Dump" tool especially comes in handy), I still have a long way to go before I can construct my own Windows PE in its simplest form.

My initial plan is to support compilation of the following calls:
(1) Console.WriteLine
(2) Console.Write
(3) Console.ReadLine

Any help is greatly appreciated.


Reference: https://docs.microsoft.com/en-us/archive/ms...ormat-in-detail

* Mussel and Tullamarine are the same middle-aged guy
hmmhmm
post May 17 2020, 11:53 PM

New Member
*
Junior Member
30 posts

Joined: Feb 2020
At this juncture, suddenly understanding wimminz seem easier.
TruboXL
post May 18 2020, 05:12 PM

Over 900 useless posts
*****
Junior Member
984 posts

Joined: Jan 2016


Visual Studio has a disassembler.
steady bro
post May 18 2020, 11:13 PM

Getting Started
**
Junior Member
75 posts

Joined: Apr 2020
Use a disassembler.
TSTullamarine
post May 20 2020, 11:46 PM

Getting Started
**
Validating
163 posts

Joined: Apr 2020
QUOTE(TruboXL @ May 18 2020, 05:12 PM)
visual studio has disassembler
*
I cannot seem to find the disassembler, but I know there is a binary editor.


QUOTE(steady bro @ May 18 2020, 11:13 PM)
use disassembler
*
Yes.
TSTullamarine
post May 21 2020, 12:00 AM

Getting Started
**
Validating
163 posts

Joined: Apr 2020
Tonight I tested with a different executable (not .NET), and the result is more obvious:

CODE

push STD_OUTPUT_HANDLE     ;STD_OUTPUT_HANDLE = (DWORD)-11
call [GetStdHandle]

push 0                     ;LPVOID  lpReserved
push dummy                 ;LPDWORD lpNumberOfCharsWritten
push len                   ;DWORD   nNumberOfCharsToWrite
push msg                   ;VOID    *lpBuffer
push eax                   ;HANDLE  hConsoleOutput
call [WriteConsole]

push 0
call [ExitProcess]


The assembly code above is more readable when compared with the disassembled code below:

user posted image
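
For reference, the instructions in the FASM listing map to only a handful of machine-code patterns, which helps when matching source against disassembly. Here is a rough Python sketch of those encodings (PUSH imm8 = `6A ib`, PUSH imm32 = `68 id`, PUSH EAX = `50`, CALL DWORD PTR [addr] = `FF 15 addr`). The helper names and the address `GET_STD_HANDLE_SLOT` are assumptions made up for the demo, not values from the actual executable.

```python
import struct

STD_OUTPUT_HANDLE = -11


def push_imm(value):
    """PUSH imm8 (6A ib) if the value fits in a signed byte, else PUSH imm32 (68 id)."""
    if -128 <= value <= 127:
        return b"\x6A" + struct.pack("<b", value)
    return b"\x68" + struct.pack("<I", value & 0xFFFFFFFF)


def push_eax():
    """PUSH EAX is the single byte 50."""
    return b"\x50"


def call_indirect(address):
    """CALL DWORD PTR [address] is FF 15 followed by the little-endian address."""
    return b"\xFF\x15" + struct.pack("<I", address)


# Hypothetical import slot address, chosen only for this demo
GET_STD_HANDLE_SLOT = 0x00403044

code = push_imm(STD_OUTPUT_HANDLE) + call_indirect(GET_STD_HANDLE_SLOT)
print(" ".join(f"{b:02x}" for b in code))
```

So in a hex dump, the byte pair `FF 15` followed by a 4-byte address is the pattern to look for when hunting for CALL DWORD PTR [...].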

Online Disassembler automatically formats the disassembly output if I upload a PE (Windows) or ELF (Linux) executable, but intriguingly I cannot see the .code section (it is somehow hidden), only the .data and .idata sections.

So I have to dump the executable file and paste the hexadecimal values into Online Disassembler. I still do not know how to find where the code section begins in the file, because "ImageBase" and "Base Of Code" are addresses in memory (not on disk) while the executable is running. What I did was look for a CALL DWORD PTR [......] instruction, which turned out to be located near the bottom of the executable file.
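
One way to locate a section on disk without guessing is to read the section table straight from the file: each section header carries both the in-memory RVA and the file offset (PointerToRawData, shown as RAW in the dump). The following is a minimal Python sketch under the assumption of a well-formed PE file; `read_sections` is a hypothetical helper name, and error handling is kept to the bare minimum.

```python
import struct


def read_sections(data):
    """Return [(name, rva, vsize, raw, rsize)] from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte file header follows the signature: Machine, NumberOfSections,
    # TimeDateStamp, PointerToSymbolTable, NumberOfSymbols,
    # SizeOfOptionalHeader, Characteristics
    _, nsections, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_off + 4
    )
    table = pe_off + 4 + 20 + opt_size  # section table starts after the optional header
    sections = []
    for i in range(nsections):
        # Each section header is 40 bytes: Name(8), VirtualSize, VirtualAddress,
        # SizeOfRawData, PointerToRawData, then 16 bytes not needed here
        name, vsize, rva, rsize, raw = struct.unpack_from(
            "<8sIIII", data, table + i * 40
        )
        label = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((label, rva, vsize, raw, rsize))
    return sections


# Usage sketch (the file name is only an example):
# with open("hello.exe", "rb") as f:
#     for name, rva, vsize, raw, rsize in read_sections(f.read()):
#         print(f"{name:8} RVA={rva:#010x} RAW={raw:#010x} R.Size={rsize:#x}")
```

The code bytes of a section then sit in the file at `data[raw:raw + rsize]`, which is the slice to paste into a disassembler.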

.NET executables are so complicated that I think I will use EXEs compiled by FASM for future study.

TSTullamarine
post May 21 2020, 05:13 AM

Getting Started
**
Validating
163 posts

Joined: Apr 2020
Does this make sense to anyone?

user posted image


kok132
post May 21 2020, 05:27 AM

New Member
*
Newbie
1 posts

Joined: Sep 2004


.NET is compiled to bytecode (IL) instead of machine code. To disassemble a .NET binary into IL, you can use ildasm, which comes with the .NET Framework SDK tools.

Or, to make it more readable, you can easily decompile it using the JetBrains decompiler (dotPeek).
TSTullamarine
post May 21 2020, 09:50 PM

Getting Started
**
Validating
163 posts

Joined: Apr 2020
QUOTE(kok132 @ May 21 2020, 05:27 AM)
.NET is compiled to bytecode (IL) instead of machine code. To disassemble a .NET binary into IL, you can use ildasm, which comes with the .NET Framework SDK tools.

Or, to make it more readable, you can easily decompile it using the JetBrains decompiler (dotPeek).
*
Learned something new today.
TSTullamarine
post Jun 10 2020, 04:49 PM

Getting Started
**
Validating
163 posts

Joined: Apr 2020
I just want to add a note that I have succeeded in finding the file offsets of the data and code sections in a Windows executable compiled with flat assembler.

Referring to the PE dump output, the RVA is the offset in memory (relative to ImageBase) once the executable is loaded at runtime. What we need to look at instead is RAW, the section's offset in the file (R.Size is the size of each section on disk).

For example, this "Hello Lowyat.Net" console app has its code section starting at file offset 0x400 and its data section starting at file offset 0x200. Each section occupies 0x200 (512) bytes on disk, because the FileAlignment setting in the PE header (a power of two, at least 512) requires raw sizes to be a multiple of it.
The import table starts at file offset 0x600. Each section is padded with null bytes if its content is shorter than R.Size (512 bytes in this case).

CODE
[Sections]
Name            RVA             V.Size          RAW             R.Size          Characteristics
.data           0x00001000      0x00000018      0x00000200      0x00000200      0xC0000040
.code           0x00002000      0x00000025      0x00000400      0x00000200      0xE0000000
.idata          0x00003000      0x00000086      0x00000600      0x00000200      0xC0000040

[Import Table]
KERNEL32.DLL
0       ExitProcess
0       GetStdHandle
0       WriteConsoleA
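
The RAW and R.Size columns above can be reproduced with a little rounding arithmetic. Here is a small Python sketch, assuming the headers occupy the first 0x200 bytes of the file (an assumption inferred from the first section starting at RAW 0x200) and that each section's raw size is simply its V.Size rounded up to FileAlignment; `align_up` is a made-up helper name.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (alignment is a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


FILE_ALIGNMENT = 0x200
HEADERS_SIZE = 0x200  # assumed: the first section starts right after the headers

# (name, V.Size) copied from the section table above
virtual_sizes = [(".data", 0x18), (".code", 0x25), (".idata", 0x86)]

layout = []
offset = HEADERS_SIZE
for name, vsize in virtual_sizes:
    rsize = align_up(vsize, FILE_ALIGNMENT)  # size on disk, null-padded
    layout.append((name, offset, rsize))
    offset += rsize

for name, raw, rsize in layout:
    print(f"{name:7} RAW={raw:#06x} R.Size={rsize:#06x}")
```

This gives .data at 0x200, .code at 0x400 and .idata at 0x600, each 0x200 bytes long, matching the dump.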


Furthermore, we can use the DOS stub in a Windows executable to run 16-bit x86 instructions, while the code section of the same executable uses i386 (32-bit) instructions, making it runnable both under DOS and in a modern Windows command prompt.

For example, I can issue a DOS interrupt call in the DOS stub to display the "Hello World" text string, and at the same time call GetStdHandle and WriteConsoleA to display the same string on 32-bit Windows and above. If we run this EXE, we will see the same console output (i.e. "Hello World") regardless of whether it runs in DOS (or an emulator) or modern Windows.
cikelempadey
post Jun 13 2020, 05:47 PM

Getting Started
**
Junior Member
86 posts

Joined: Apr 2011
From: Your Nen Nen


For a native Windows PE program, the OEP in the optional header is the RVA at which execution of your program starts.

A .NET program is similar to a Java program in that it has its own instruction set. But instead of using another file format (e.g. the class file format for Java), a .NET program uses the same PE format, with the COM Header entry in the Data Directory pointing to the CLI header of the .NET program. The OEP in this case just points to a piece of code that transfers execution to mscoree.dll, which loads the CLR execution environment for the program. The real entry point RVA/token of the .NET program can be found inside the CLI header.
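
To make the last point concrete, here is a rough Python sketch that decodes the start of the CLI header (IMAGE_COR20_HEADER), where the managed entry point lives. In the HelloWorld.exe dump earlier in the thread, the COM Header RVA 0x2008 would map to file offset 0x208 (0x2008 - 0x2000 + 0x200). The demo bytes below are invented for illustration, not read from that file, and `parse_cli_header` is a hypothetical name.

```python
import struct

COMIMAGE_FLAGS_NATIVE_ENTRYPOINT = 0x00000010


def parse_cli_header(blob):
    """Decode the first 24 bytes of a CLI header (IMAGE_COR20_HEADER)."""
    cb, major, minor, meta_rva, meta_size, flags, entry = struct.unpack_from(
        "<IHHIIII", blob, 0
    )
    info = {
        "header_size": cb,
        "runtime_version": (major, minor),
        "metadata_rva": meta_rva,
        "metadata_size": meta_size,
        "flags": flags,
    }
    if flags & COMIMAGE_FLAGS_NATIVE_ENTRYPOINT:
        # With this flag set, the field is an RVA to native code
        info["entry_point_rva"] = entry
    else:
        # Otherwise it is a metadata token: the top byte selects the table
        # (0x06 = MethodDef), the low 24 bits are the row number
        info["entry_point_table"] = entry >> 24
        info["entry_point_row"] = entry & 0x00FFFFFF
    return info


# Made-up header bytes for the demo; a real run would slice the bytes out of
# the file at the offset that the COM Header RVA maps to
demo = struct.pack("<IHHIIII", 0x48, 2, 5, 0x2068, 0x05CC, 0x1, 0x06000001)
info = parse_cli_header(demo)
print(info)
```

A token such as 0x06000001 would mean "row 1 of the MethodDef table", i.e. the managed Main method is looked up in the metadata rather than jumped to directly.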

QUOTE
And further more, we can make use of DOS stub in Windows executable to run old x86 (with 16-bit registers) instruction, but at the same time use i386 (with 32-bit registers) instruction in code section of the said executable, making it compatible under DOS mode and modern Windows command prompt.
If you notice, FASM is also built like this, so you can run it both in Windows and in a DOS environment.
brogan7
post Jul 20 2020, 04:21 PM

New Member
*
Newbie
2 posts

Joined: Apr 2005


sure...
kahbeng
post Jul 20 2020, 04:27 PM

New Member
*
Newbie
4 posts

Joined: Feb 2005
yes
Trung
post Jul 20 2020, 04:31 PM

New Member
*
Newbie
1 posts

Joined: Oct 2005
no
FlierMate
post Jun 22 2021, 10:12 PM

On my way
****
Validating
543 posts

Joined: Nov 2020
As time goes by, I have gained a better understanding of how to disassemble a binary executable in Windows.

Here goes:
» Click to show Spoiler - click again to hide... «

Originally posted by FlierMate on April 12, 2021 on Kaki.gg coding forum.

This post has been edited by FlierMate: Jun 22 2021, 10:18 PM
junyian
post Jul 7 2021, 02:29 PM

Casual
***
Junior Member
401 posts

Joined: Jan 2003


Stumbled into this post after many years of not logging on to lowyat forum (and I was crazy bored from work). Just a couple of comments based on what I remember (and forgive me if you already know this). I used to do this as a hobby. But these days job and family have taken priority.

1. Nice to know ODA exists. In the past we usually used IDA (from Hex-Rays). It might be overkill for what you're doing, but since they have a free version now, it could be something you'd like to explore as part of your RE tools. It has very neat features, especially the FLIRT libraries for compiler identification.
2. If you're reversing an EXE compiled from an HLL, the compiler usually adds a startup stub. So the entry point you see in the PE header is not the one that points to your actual main(), or WinMain() for Windows apps. But if you're reversing an EXE built in assembly, then of course the compiler stub doesn't exist.
3. PUSH adds a value to the stack; POP removes it from the stack into the selected register. PUSHing is the typical Microsoft way of passing function arguments. Within the function, the arguments are then accessed relative to ESP (or EBP once a stack frame is set up). I think Borland (if it still exists) used a different method, but I don't remember the details now.
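
Point 3 can be pictured with a toy model of the stack. The Python sketch below is not real machine behaviour, just an illustration under simplifying assumptions: 4-byte slots, a made-up starting ESP, and argument names used in place of values. It pushes the WriteConsole arguments from the FASM listing earlier in the thread and shows where the callee would find them.

```python
class Stack:
    """Toy model of a 32-bit x86 stack: grows downward, 4 bytes per slot."""

    def __init__(self, top=0x0012FFFC):  # arbitrary starting ESP for the demo
        self.esp = top
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem.pop(self.esp)
        self.esp += 4
        return value

    def read(self, offset):
        """Read the slot at [esp + offset] without changing ESP."""
        return self.mem[self.esp + offset]


stack = Stack()
# Arguments are pushed right to left, so the last parameter goes first
for arg in ["lpReserved", "lpNumberOfCharsWritten", "nNumberOfCharsToWrite",
            "lpBuffer", "hConsoleOutput"]:
    stack.push(arg)
stack.push("return address")  # CALL pushes the return address last

print(stack.read(0))   # [esp]    -> return address
print(stack.read(4))   # [esp+4]  -> first argument
print(stack.read(20))  # [esp+20] -> last argument
```

On entry to the function, [esp] holds the return address, [esp+4] the first argument (hConsoleOutput) and [esp+20] the last one (lpReserved).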

FlierMate
post Jul 7 2021, 04:38 PM

On my way
****
Validating
543 posts

Joined: Nov 2020
QUOTE(junyian @ Jul 7 2021, 02:29 PM)
Stumbled into this post after many years of not logging on to lowyat forum (and I was crazy bored from work). Just a couple of comments based on what I remember (and forgive me if you already know this). I used to do this as a hobby. But these days job and family have taken priority.

1. Nice to know ODA exists. In the past we usually used IDA (from Hex Rays). It might be an overkill for what you're doing, but since they have a free version now, it could be something you'd like to explore as part of your RE tools. It have very neat features especially the FLIRT libraries for compiler identification.
2. If you're reversing an EXE compiled from HLL, compilers usually add a stub. So the entry point you see from the PE header is not the one that points to your actual main(), or WinMain() for Windows apps. But if you're reversing an EXE built in assembly then of course the compiler stub doesn't exist.
3. PUSH adds the value to stack, POP removes it from stack to the selected register. PUSHing it's a typical Microsoft way of passing function arguments. And within the function, the arguments are accessed using reference to ESP. I think Borland (if it still exists) used a different method, but I don't remember the details now.
*
Your point 2 is what I needed to know. Good to know there are more people like you who like reversing/decompiling.

Maybe you can also look up my Sambal Compiler for Win32 (based on Asm model)? Heh. (It's on the second page of this Codemasters forum.)

I did know tools like IDA exist, but I am not yet up to that level. I am trying to learn the basics step by step.
junyian
post Jul 7 2021, 05:53 PM

Casual
***
Junior Member
401 posts

Joined: Jan 2003


QUOTE(FlierMate @ Jul 7 2021, 04:38 PM)
Your point 2 is what I need to know.  Good to know there is more people like you who like reversing/ decompiling.....

Maybe you can also look up my Sambal Compiler for Win32 (based on Asm model)?  Heh. (It's on second page on this codemasters forum)

I did know tools like IDA exist, but I am not yet up to that level. I try to learn basics step-by-step.
*
I saw the Sambal compiler thread. Very nice effort! PE stuff takes a bit of time to learn. I'm super rusty with that already. The way I learned PE was to reverse engineer how exe packers/cryptors worked though. Not sure if such things exist anymore. I believe those tools have a tendency to trigger AVs too nowadays.

IDA is very easy to use. I highly recommend compiling a simple EXE and trying IDA on it.
FlierMate
post Jul 8 2021, 11:58 AM

On my way
****
Validating
543 posts

Joined: Nov 2020
QUOTE(junyian @ Jul 7 2021, 05:53 PM)
I saw the Sambal compiler thread. Very nice effort! PE stuff takes a bit of time to learn. I'm super rusty with that already. The way I learned PE was to reverse engineer how exe packers/cryptors worked though. Not sure if such things exist anymore. I believe those tools have a tendency to trigger AVs too nowadays.

IDA is very easy to use. I highly recommend to try it and compile some simple exe. smile.gif
*
Thanks for looking it up. Yes, your RE skills are still relevant today.

Maybe you can try the Flare-On RE challenge this year.

Here are last year's problem set and solutions. (Challenge #2 is exactly about recovering a flag from a corrupted PE packed with an EXE packer.)

Problem set: http://flare-on.com/files/Flare-On7_Challenges.zip

Solutions: https://www.fireeye.com/blog/threat-researc...-solutions.html

The PE format has not changed (or has not changed much) for the past 30 years!

This post has been edited by FlierMate: Jul 8 2021, 11:59 AM
junyian
post Jul 8 2021, 12:15 PM

Casual
***
Junior Member
401 posts

Joined: Jan 2003


QUOTE(FlierMate @ Jul 8 2021, 11:58 AM)
Thanks for looking it up.  Yes, your RE is still relevant today.

Maybe you can try their Flare-On RE challenge this year.

For last year, this is the problem set and solutions. (#2 challenge is exactly about finding out flag from corrupted PE with EXE packers)

Problems Set:  http://flare-on.com/files/Flare-On7_Challenges.zip

Solutions:  https://www.fireeye.com/blog/threat-researc...-solutions.html

PE format has not changed (or changed much) for the past 30 years....!
*
Interesting! I barely have time to do challenges, but I don't mind trying it offline if time permits. Do you happen to have the binaries for the 2020 challenge? The link is dead.
