Reversing: DSYM
Introduction
DSYM is a medium-difficulty reversing challenge on Hack The Box. With Ghidra the challenge is solved quite easily; however, obtaining the actual flag required guesswork, and I am not a fan of that type of challenge.
Although this challenge can be fully solved with static analysis, it is worth noting that it ships with 2 files, and the 2nd file contains the debug symbols for the 1st. The Dynamic Analysis section explains how this 2nd file is used.
Static Analysis
The first step is to figure out what kind of challenge this is. Using file, the type of both files in this challenge can be inspected.
$file *
dunnoWhatIAm: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, for GNU/Linux 3.2.0, BuildID[sha1]=7e677f09cf2db7922096c6da0ad55a1e9a6895b8, with debug_info, not stripped
getme: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=7e677f09cf2db7922096c6da0ad55a1e9a6895b8, stripped
Both are ELF binaries and they share a BuildID, which means they originate from the same build. One has debug info, the other is stripped. Note that dunnoWhatIAm has an empty interpreter, so it cannot be executed directly.
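As a quick sanity check, the BuildID comparison can be scripted. The sketch below is only an illustration: it parses the two lines of file output (trimmed to the relevant parts) with a regular expression, and the helper name build_id is made up for this example.

```python
import re

# The two lines printed by `file *`, trimmed to the relevant fields.
FILE_OUTPUT = """\
dunnoWhatIAm: ELF 64-bit LSB shared object, x86-64, BuildID[sha1]=7e677f09cf2db7922096c6da0ad55a1e9a6895b8, with debug_info, not stripped
getme: ELF 64-bit LSB shared object, x86-64, BuildID[sha1]=7e677f09cf2db7922096c6da0ad55a1e9a6895b8, stripped
"""

def build_id(line):
    """Return the hex BuildID found in one line of `file` output, or None."""
    match = re.search(r"BuildID\[sha1\]=([0-9a-f]+)", line)
    return match.group(1) if match else None

# Map each file name (the part before the first colon) to its BuildID.
ids = {line.split(":", 1)[0]: build_id(line) for line in FILE_OUTPUT.splitlines()}
print(ids)
print("same build:", ids["dunnoWhatIAm"] == ids["getme"])
```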
Digging deeper, objdump -d does not show any disassembly for dunnoWhatIAm; this is by no means a normal executable.
The next step is to get some textual information using strings. On getme this yields some strings that look like a win condition. The other binary only lists symbol names, which points to it being a debug symbol file.
$strings -n 15 getme
/lib64/ld-linux-x86-64.so.2
__libc_start_main
_ITM_deregisterTMCloneTable
_ITM_registerTMCloneTable
You almost got me :D
Here is small price for you:
GCC: (Debian 8.2.0-7) 8.2.0
.note.gnu.build-id
For now, getme seems to be the binary that should be examined first. Load it into Ghidra. The entry function (found in Symbol Tree > Exports) contains the standard libc start call, __libc_start_main, whose first argument should be the actual entry point.
__libc_start_main(FUN_00101258,in_stack_00000000,&stack0x00000008,&LAB_00101270,&DAT_001012e0, param_3,auStack8);
However, the first argument, FUN_00101258, points to a function that only returns.
undefined8 FUN_00101258(void) {
return 0;
}
Instead, use Window > Defined Strings to list the strings in the binary and search for price. Double-clicking the result brings the listing display to the point where the string is defined. On the right-hand side there is an XREF to a single function.
s_You_almost_got_me_:D_Here_is_s XREF[1]: FUN_00101145
00102008 59 6f ds "You almost got me :D\nHere is s
75 20
61 6c
Double click the function to navigate to it. This seems to be the function that will solve the challenge.
void FUN_00101145(void)
{
At the bottom of the function is the string hinting that it will print our prize (spelled "price" in the binary). The code consists of a printf followed by a loop of 0x16 (decimal 22) iterations in which each entry is XORed with 0x29a. Each result is printed in hex (%x).
printf("You almost got me :D\nHere is small price for you: ");
local_c = 0;
while (local_c < 0x16) {
auStack200[local_c] = local_68[local_c] ^ 0x29a;
printf("%x",(ulong)auStack200[local_c]);
local_c = local_c + 1;
}
The loop reads from the variable local_68. Just before the loop the array is assigned its values; however, only 4 positions are assigned through local_68, and the remaining values end up in separate variables. This is a common problem with Ghidra: it is unable to correctly identify the array length.
local_68[0] = 0x2cf;
local_68[1] = 0x2dd;
local_68[2] = 0x2d5;
local_68[3] = 0x2e1;
local_58 = 0x2f6;
local_54 = 0x2aa;
At the top of the function, where the variables are defined, local_68 should be an array of 22 positions, but it is declared with only 4. Press CTRL+L on local_68 and retype it to hold 22 positions. The original declarations look like this:
uint local_68 [4];
undefined4 local_58;
undefined4 local_54;
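Why the separate variables belong to the same array can be checked with a little arithmetic. By default Ghidra names stack locals after their hexadecimal stack offset, so the distance between local_68 and local_58 is known. The sketch below assumes 4-byte elements (uint), as in the declaration above; the function name index_in_array is hypothetical.

```python
ELEMENT_SIZE = 4  # sizeof(uint), assumed from `uint local_68 [4]`

def index_in_array(base_offset, var_offset, size=ELEMENT_SIZE):
    """Index a stack variable would have if it were part of the array at base_offset."""
    distance = base_offset - var_offset  # bytes between the two locals
    if distance % size:
        raise ValueError("variable is not aligned to an array element")
    return distance // size

print(index_in_array(0x68, 0x58))  # position of local_58 relative to local_68
print(index_in_array(0x68, 0x54))  # position of local_54 relative to local_68
print(hex(22 * ELEMENT_SIZE))      # total size in bytes of a 22-element array
```

Under these assumptions local_58 and local_54 line up with local_68[4] and local_68[5], which is why retyping the array to 22 entries folds them back in.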
The resulting list of array assignments can now easily be copied to a text editor and turned into a Python script. All that needs to be done is to loop over the array and perform the XOR operation.
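# Values assigned to local_68 in FUN_00101145, copied from Ghidra
a = [0x2cf, 0x2dd, 0x2d5, 0x2e1, 0x2f6, 0x2aa, 0x2f2, 0x2c5,
     0x2ff, 0x2a9, 0x2ae, 0x2e3, 0x2e3, 0x2f6, 0x2c5, 0x2ee,
     0x2aa, 0x2fd, 0x2c5, 0x2e0, 0x2a9, 0x2e7]
# XOR every entry with the key and print it as a character
for i in a:
    print(chr(i ^ 0x29a), end='')
print()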
The result should be the flag, sort of…
UGO{l0h_e34yyl_t0g_z3}
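Note that the binary itself would not print these characters directly: the loop uses %x, so if the function were called it would emit the hex value of every XORed entry. A rough emulation of that output, and of decoding it back to text, could look like this. The variable names are my own; the values are the 22 array entries recovered from Ghidra.

```python
KEY = 0x29a
values = [0x2cf, 0x2dd, 0x2d5, 0x2e1, 0x2f6, 0x2aa, 0x2f2, 0x2c5,
          0x2ff, 0x2a9, 0x2ae, 0x2e3, 0x2e3, 0x2f6, 0x2c5, 0x2ee,
          0x2aa, 0x2fd, 0x2c5, 0x2e0, 0x2a9, 0x2e7]

# Emulate printf("%x", ...) for every entry: lowercase hex, no padding.
hex_output = "".join(format(v ^ KEY, "x") for v in values)
print(hex_output)

# Every XORed value here is a two-digit hex number, so the stream
# can be split back into bytes without ambiguity.
decoded = bytes.fromhex(hex_output).decode("ascii")
print(decoded)
```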
Flag
The flag is obfuscated. This is where the guessing comes into play. Although the structure of the flag is correct, it does not start with HTB. It is nearly there, but not quite.
Using CyberChef I played around with various ciphers. Since the digits, braces and underscores are already in the right spots, only the letters need to change, and UGO is exactly HTB shifted by 13 letters, so the most logical cipher to try is ROT13. Indeed, it produces the actual flag.
HTB{y0u_r34lly_g0t_m3}
Solved.
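The ROT13 step can also be done without CyberChef. A minimal sketch using the rot13 text codec from Python's standard library; the variable names are only illustrative.

```python
import codecs

obfuscated = "UGO{l0h_e34yyl_t0g_z3}"

# ROT13 only rotates A-Z and a-z; digits, braces and underscores pass through.
flag = codecs.decode(obfuscated, "rot13")
print(flag)
```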
Dynamic Analysis
Although the challenge is solved, the question remains: “what is dunnoWhatIAm for?” The assumption is that the file contains debug symbols for the getme binary, but how are they useful?
The GDB docs on Separate Debug Files, reproduced online, explain the use of separate debug files and their connection to the BuildID.
GDB supports two ways of specifying the separate debug info file:
- The executable contains a debug link that specifies the name of the separate debug info file. The separate debug file’s name is usually executable.debug, where executable is the name of the corresponding executable file without leading directories (e.g., ls.debug for /usr/bin/ls). In addition, the debug link specifies a 32-bit Cyclic Redundancy Check (CRC) checksum for the debug file, which GDB uses to validate that the executable and the debug file came from the same build.
- The executable contains a build ID, a unique bit string that is also present in the corresponding debug info file. (This is supported only on some operating systems, when using the ELF or PE file formats for binary files and the GNU Binutils.) For more details about this feature, see the description of the --build-id command-line option in Command Line Options in The GNU Linker. The debug info file’s name is not specified explicitly by the build ID, but can be computed from the build ID, see below.
Depending on the way the debug info file is specified, GDB uses two different methods of looking for the debug file:
- For the “debug link” method, GDB looks up the named file in the directory of the executable file, then in a subdirectory of that directory named .debug, and finally under each one of the global debug directories, in a subdirectory whose name is identical to the leading directories of the executable’s absolute file name. (On MS-Windows/MS-DOS, the drive letter of the executable’s leading directories is converted to a one-letter subdirectory, i.e. d:/usr/bin/ is converted to d/usr/bin, because Windows filesystems disallow colons in file names.)
- For the “build ID” method, GDB looks in the .build-id subdirectory of each one of the global debug directories for a file named nn/nnnnnnnn.debug, where nn are the first 2 hex characters of the build ID bit string, and nnnnnnnn are the rest of the bit string. (Real build ID strings are 32 or more hex characters, not 10.)
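To make the “build ID” rule concrete, the path GDB would look for can be computed from the BuildID reported by file. This is a sketch under assumptions: /usr/lib/debug is used as the global debug directory, which is a common default but depends on how GDB was configured, and the function name build_id_debug_path is hypothetical.

```python
from pathlib import PurePosixPath

def build_id_debug_path(build_id, debug_dir="/usr/lib/debug"):
    """Build the nn/nnnnnnnn.debug path described in the GDB docs.

    debug_dir is an assumed global debug directory, not taken from the challenge.
    """
    build_id = build_id.lower()
    # First 2 hex characters form the directory, the rest the file name.
    return PurePosixPath(debug_dir) / ".build-id" / build_id[:2] / (build_id[2:] + ".debug")

# BuildID shared by getme and dunnoWhatIAm
path = build_id_debug_path("7e677f09cf2db7922096c6da0ad55a1e9a6895b8")
print(path)
```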
To find out which file name the debug link points to, the readelf utility can be used.
$readelf -x.gnu_debuglink getme
Hex dump of section '.gnu_debuglink':
0x00000000 64756e6e 6f576861 7449416d 00000000 dunnoWhatIAm....
0x00000010 f73fe528 .?.(
This means that when a file called dunnoWhatIAm is in the debug search path (for example, the directory of the executable) when gdb is started, it will find the debug symbols for getme.
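The hex dump can also be decoded by hand. The sketch below assumes the documented .gnu_debuglink layout: a NUL-terminated file name, padded to a 4-byte boundary, followed by a 4-byte CRC stored in the byte order of the executable (little-endian for this x86-64 binary). The variable names are illustrative.

```python
# Bytes copied from the readelf hex dump of .gnu_debuglink
section = bytes.fromhex("64756e6e 6f576861 7449416d 00000000 f73fe528")

# The file name runs up to the first NUL byte.
name, _, _ = section.partition(b"\x00")

# The last 4 bytes hold the CRC of the debug file (assumed little-endian).
crc = int.from_bytes(section[-4:], "little")

print(name.decode("ascii"))
print(hex(crc))
```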
Listing the functions in the binary (for example with gdb's info functions command) now makes it clear that there are 2 actual functions in getme.c:
File getme.c:
25: int main();
8: int notme();
Knowing the function name enables the quick disassembly of the function.
gef➤ disas notme
Dump of assembler code for function notme:
0x0000555555555145 <+0>: push rbp
0x0000555555555146 <+1>: mov rbp,rsp
0x0000555555555149 <+4>: sub rsp,0xc0
0x0000555555555150 <+11>: mov DWORD PTR [rbp-0x60],0x2cf
0x0000555555555157 <+18>: mov DWORD PTR [rbp-0x5c],0x2dd
0x000055555555515e <+25>: mov DWORD PTR [rbp-0x58],0x2d5
0x0000555555555165 <+32>: mov DWORD PTR [rbp-0x54],0x2e1
0x000055555555516c <+39>: mov DWORD PTR [rbp-0x50],0x2f6
...
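The immediates moved onto the stack are the same values found during static analysis, so decoding them the same way results in the discovery of the flag.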
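As an illustration, the immediates can be pulled straight out of the disassembly text and decoded. The sketch below only uses the five mov instructions shown in the listing (the remainder is elided there), so it recovers just the start of the flag; the regular expression and variable names are my own.

```python
import codecs
import re

# The mov instructions visible in the gdb listing (the remaining ones are elided).
DISAS = """\
0x0000555555555150 <+11>: mov DWORD PTR [rbp-0x60],0x2cf
0x0000555555555157 <+18>: mov DWORD PTR [rbp-0x5c],0x2dd
0x000055555555515e <+25>: mov DWORD PTR [rbp-0x58],0x2d5
0x0000555555555165 <+32>: mov DWORD PTR [rbp-0x54],0x2e1
0x000055555555516c <+39>: mov DWORD PTR [rbp-0x50],0x2f6
"""

# Grab the immediate operand that follows the closing bracket of each store.
immediates = [int(m, 16) for m in re.findall(r"\],(0x[0-9a-f]+)", DISAS)]

# Same decoding as in the static analysis: XOR with 0x29a, then ROT13.
partial = "".join(chr(v ^ 0x29a) for v in immediates)
print(partial)
print(codecs.decode(partial, "rot13"))
```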