Despite the criticism, e.g.
QUOTE
Indeed, without analyzing the code flow, a disassembler, even one that decodes every byte sequence correctly, will produce a result that leaves something to be desired.
...I still plan to go ahead with an x86 disassembler.
While I am still studying the decoding of CPU opcodes, I have finished the program that reads an EXE file and dumps its code section.
In part 1 (I hope there will be a part 2 in the coming weeks or months), I will share how to find the code section in an EXE file.
In the meantime, you can find my code repo "exedump" on GitHub:
You may also refer to PE Format
1. Open the file and read its first two bytes
2. If they are 'MZ', go to next step; if not, quit
3. Set file position to 0x3C, read the DWORD offset value (start of PE)
4. Set file position to value read in previous step
5. If 'PE\0\0', go to next step; if not, quit
6. Read adjacent WORD for machine type (optional)
7. Read next adjacent WORD for number of sections
8. Set file position to 0x18 relative to 'start of PE'
9. Read magic number WORD value (for 32-bit or 64-bit PE)
10. Set file position to 0x2C relative to 'start of PE'
11. Read BaseOfCode DWORD value
12. If magic number (read in step 9) is 0x10B (PE32),
    set file position to 0xF8 relative to 'start of PE', or else (0x20B, PE32+)
    set file position to 0x108 relative to 'start of PE'
13. Read the section table (each entry is 40 bytes long):
    its VirtualAddress, SizeOfRawData, PointerToRawData DWORD values
14. If VirtualAddress is equal to BaseOfCode (read in step 11), then go to print hexdump
    If it does not match, move on to the next entry, up to 'number of sections' (read in step 7)
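The steps above can be sketched roughly as follows. This is only a minimal Python sketch, not the code from my repo: the function name find_code_section is made up for illustration, it works on the whole file read into memory, and it assumes the fixed offsets from step 12 (i.e. the usual optional header size).

```python
import struct

def find_code_section(data: bytes):
    """Follow steps 1-14 on an in-memory copy of an EXE file.

    Returns (name, VirtualAddress, SizeOfRawData, PointerToRawData)
    for the section whose VirtualAddress equals BaseOfCode, else None.
    """
    # Steps 1-2: the file must start with 'MZ'
    if data[:2] != b"MZ":
        return None
    # Steps 3-5: the DWORD at 0x3C points to the 'PE\0\0' signature
    (pe,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe:pe + 4] != b"PE\0\0":
        return None
    # Steps 6-7: machine type and number of sections follow the signature
    machine, num_sections = struct.unpack_from("<HH", data, pe + 4)
    # Steps 8-9: magic number (0x10B = PE32, 0x20B = PE32+)
    (magic,) = struct.unpack_from("<H", data, pe + 0x18)
    # Steps 10-11: BaseOfCode
    (base_of_code,) = struct.unpack_from("<I", data, pe + 0x2C)
    # Step 12: start of the section table (assumes the usual header size)
    table = pe + (0xF8 if magic == 0x10B else 0x108)
    # Steps 13-14: walk the 40-byte entries until VirtualAddress matches
    for i in range(num_sections):
        off = table + i * 40
        name = data[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        # VirtualAddress, SizeOfRawData, PointerToRawData sit at +12, +16, +20
        va, raw_size, raw_ptr = struct.unpack_from("<III", data, off + 12)
        if va == base_of_code:
            return name, va, raw_size, raw_ptr
    return None
```

For a real file you would pass in the bytes read from the EXE opened in binary mode; the returned PointerToRawData and SizeOfRawData then give the byte range to hexdump.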
You may ask: why not just use AddressOfEntryPoint? Well, because it is a relative virtual address (RVA), not a file offset on disk.
To set the file position on disk, I have to read the section table (or section headers), specifically the entry for the code section, for its PointerToRawData.
But how do we know which one is the code section? I use a trick here: comparing the VirtualAddress found in each section table entry with the BaseOfCode found in the header.
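To illustrate why the section table is needed, here is a small Python sketch of turning an RVA such as AddressOfEntryPoint into a file offset. The function name rva_to_file_offset and the tuple layout are my own choices for illustration, and using SizeOfRawData as the upper bound of each section is a simplifying assumption.

```python
def rva_to_file_offset(rva, sections):
    """Map a relative virtual address to an offset in the file on disk.

    sections: list of (VirtualAddress, SizeOfRawData, PointerToRawData)
    tuples taken from the section table.
    """
    for virtual_address, size_of_raw, ptr_to_raw in sections:
        # Assumption: the RVA belongs to a section if it falls inside
        # the range covered by that section's raw data.
        if virtual_address <= rva < virtual_address + size_of_raw:
            return ptr_to_raw + (rva - virtual_address)
    return None  # the RVA is not backed by any section's raw data
```

For instance, with a section at VirtualAddress 0x2000 whose PointerToRawData is 0x600, an RVA of 0x2010 maps to file offset 0x610.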
For example (virtual addresses):
CODE
'.data' section: 0x1000
'.text' section: 0x2000
'.idata' section: 0x3000
The virtual address is unique for each section, so if my BaseOfCode is 0x2000, then I know the '.text' section is the code section.
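The same lookup on the example values, as a tiny Python sketch (the dictionary simply restates the three addresses above; the variable names are for illustration only):

```python
# Section name -> VirtualAddress, taken from the example above
sections = {".data": 0x1000, ".text": 0x2000, ".idata": 0x3000}
base_of_code = 0x2000

# Pick the section whose VirtualAddress equals BaseOfCode
code_section = next(
    (name for name, va in sections.items() if va == base_of_code), None
)
print(code_section)  # .text
```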
Until part 2!
This post has been edited by FlierMate4: Mar 14 2023, 12:20 PM
Mar 13 2023, 02:31 PM, updated 3y ago