I'm actually from a Super Monkey Ball 2 hacking community which, among other things, has been working on a Ghidra decompilation for a bit over a year now. We collaborate on a shared Ghidra server and have tons of stuff labelled and annotated at this point!
We also have the ability to inject C++ directly into the game for modding. I'm also the author of ApeSphere, a practice mod for SMB2 with features like savestates: https://github.com/complexplane/apesphere . Take a look at rel/savestate.cpp for a taste of what we can do.
Do you have any tips for setting up a shared Ghidra server? The user management especially seemed scary, and has kept me from setting one up in the past.
How is it that the decompiled camera code has interesting variable names and even comments? I was expecting it to look more like the decompiled code from this article... With made-up and uninformative variable names.
You mentioned collaborating on annotating the decompiled code. Is the linked code something created manually based on the "raw" results of the decompilation? Or does the toolset allow you to annotate memory addresses etc in such a way that when you invoke the decompilation it uses the nice names automatically?
Can anybody recommend some good resources and/or books (or some kind of roadmap) to get into practical reverse engineering "starting from zero" (as in: a person with general higher level programming experience)?
For learning programming I always recommend that you have a specific project in mind, so instead of following a book, you follow your project's needs and use books as a reference.
I learned (Intel x86) assembly by joining a cracking group when I was an adolescent; it was my first programming language. Nothing beats being taught by masters of the craft and competing and contributing as a team.
Today I teach kids ARM assembly using microcontrollers like Arduino Unos or Nanos to control a motor or servo. Then I add more and more complexity, like explaining the Klipper architecture, so they learn C and Python.
With kids, the social part is very important, and so is making real things that move and react in the real world. I suspect it is of great help for adults too.
It is probably too dull to learn assembler today on your own, alone, on a powerful computer. With a microcontroller you have a machine so constrained in resources that you really need to use C or assembly.
HN has very good resources about assembly; you can search Google for "hacker news assembly" or "hacker news reverse engineer", and use Zotero to aggregate the links.
Crackmes are how I got into it. crackmes.one has a bunch of good ones. I whipped up a quick and easy one for you here. You can run it if you want, but it doesn't need to be run (generally, don't run random executables you download off the internet). Just load it in Ghidra/radare/<decompiler of choice> and figure it out. It's just about finding the main entry point of the program and following the flow. https://filebin.net/h4kp7s0z04kqa5om
Hey, article author here. I'm not very good at this stuff, it's just applying knowledge I've picked up over my career working in systems-level programming and various little game hacking projects.
The one thing I'd say is a real pre-requisite is to learn C. And I mean really learn C: pointers, including arithmetic and aliasing; memory allocation; stack vs heap; statics and globals. C is the lingua franca of this environment, and learning C will give you a good window into what the compiler is doing to turn your code into what's actually running on the CPU.
It's also good to understand how computers actually work. I strongly recommend the book "Code" by Charles Petzold (Microsoft Press, be wary of counterfeits and don't buy from Amazon). It starts literally from scratch. Like, light bulbs with electricity and a switch. Then it builds on that to create a fundamental CPU, and then shows how to go from there into the real CPUs used today, including an introduction to assembly language and machine code. It's a fantastic book if you ever want to do anything low-level on a computer.
Once you've got C and a basic level of understanding of assembly/machine code, I think you're ready to do something real. Start simple. The NES runs on a 6502, which is 8-bit, has three registers and only a handful of opcodes, and is single-threaded. In the past I've used FCEUX-DSP's debugger features[1], but I think these days most people use Mesen. You can use the same basic techniques I did in these articles: scan memory for the values you want (the player's score or life count; the player character's velocity), then set watch points to find the code that modifies them, then go understand what that code is doing and change it to do what you want. Give yourself infinite lives or a crazy high jump in Super Mario Bros or something.
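The scan-and-narrow step described above can be sketched in a few lines. This is only an illustration of the idea, not how FCEUX or Mesen implement their cheat search; the RAM snapshots are made up and `LIVES_ADDR` is a hypothetical address chosen for the example.

```python
def first_scan(memory: bytes, value: int) -> set[int]:
    """Return every address whose byte currently equals `value`."""
    return {addr for addr, byte in enumerate(memory) if byte == value}


def next_scan(candidates: set[int], memory: bytes, value: int) -> set[int]:
    """Keep only the earlier candidates that match the new `value`."""
    return {addr for addr in candidates if memory[addr] == value}


# Simulated 2 KiB of NES work RAM. Both addresses below are hypothetical.
ram = bytearray(0x800)
LIVES_ADDR = 0x075A
ram[LIVES_ADDR] = 3      # the life counter we are hunting for
ram[0x0010] = 3          # an unrelated byte that also happens to hold 3
snapshot_a = bytes(ram)

ram[LIVES_ADDR] = 2      # the player loses a life; the unrelated byte stays 3
snapshot_b = bytes(ram)

candidates = first_scan(snapshot_a, 3)               # two matches at first
candidates = next_scan(candidates, snapshot_b, 2)    # only one survives
print([hex(addr) for addr in sorted(candidates)])
```

In a real emulator you would repeat the second step until one or two addresses remain, then set a write watchpoint on the survivor to find the code that modifies it.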
From there you can move on to more modern CPUs and such, but the complexity goes _way_ up once you move on from environments where most stuff was actually written in ASM or very simple C without fancy compilers. I was very relieved to see how readable Super Monkey Ball's disassembly actually was.
[1] My first real game RE project was dumping PNG pictures of the levels in M.C. Kids for NES, for comparison on TCRF:
Thanks for the suggestion, I have done that. If you know of other communities that might enjoy the article, please do feel free to pass it along anywhere you like.
Haha, this is very true. This also seems very true for algebra and calculus. I never really remembered/internalized all of the rules of algebra (moving around log, e, ln, a^x, etc.) until doing derivatives/integrals by hand.
My 2 cents is do some CTF questions. PicoCTF's are good for beginners, they have a good progression in difficulty.
And once you're more or less proficient, just crack some software. Find something old and on the simpler side, like something protected by a purchase key; don't go straight for Photoshop or whatever. It's gonna take a long time, but even if you don't succeed you'll get a much better feel for it in the process. The best way to learn is by doing; that's my experience anyway.
HackTheBox (hackthebox.eu) has some great reverse engineering challenges that you can learn a lot from, and have a clear difficulty progression. The non-retired challenges are all free with a basic account, but if you're starting from scratch, it might be beneficial to try out a premium account to access retired challenges (and their associated write-ups).
As someone else said, Practical Malware Analysis is great. The Ghidra book by Eagle is also decent.
The best thing for learning RE, though, is to use Visual Studio to write a few programs. From the debugger you can view the disassembly and see what your program ends up as.
You also want to think about it as the programmer. Say you have a MASSIVE program and you want to see where it creates files. You know that on Windows it HAS to go through CreateFile, or at least the NtCreateFile function or system call. So you can watch for these, or look at where they are called in Ghidra. Now you can mark all the functions in the chain using xrefs (what references this) and get all the functions that use CreateFile out of the way!
And lastly, as a programmer, you know the APIs. So think of the cases where someone would use printf, for example. There aren't many. You know from the use of printf that there's some sort of logging at that location. If they use OpenFile, you have a good idea that all the code surrounding that call is going to be about the file being opened.
TL;DR: start from API calls and work backwards. Use your knowledge as a dev of what these API calls are used for, walk that call chain, and mark as you go. Eventually every function is mapped.
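Walking the call chain backwards from a known API is really just a breadth-first search over cross-references. Here is a rough sketch of that idea, assuming you have already exported a caller-to-callee map from your tool; the function names in `calls` are invented for illustration and this does not use Ghidra's scripting API.

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it calls.
calls = {
    "main": ["load_config", "render_frame"],
    "load_config": ["read_settings"],
    "read_settings": ["CreateFileW"],
    "save_replay": ["CreateFileW"],
    "render_frame": ["draw_hud"],
    "draw_hud": [],
}


def build_xrefs(calls):
    """Invert the call graph: callee -> set of functions that reference it."""
    xrefs = {}
    for caller, callees in calls.items():
        for callee in callees:
            xrefs.setdefault(callee, set()).add(caller)
    return xrefs


def callers_of(api, calls):
    """Return {function: distance} for everything that can reach `api`."""
    xrefs = build_xrefs(calls)
    marked = {}
    queue = deque([(api, 0)])
    while queue:
        func, depth = queue.popleft()
        for caller in xrefs.get(func, ()):
            if caller not in marked:      # mark each function only once
                marked[caller] = depth + 1
                queue.append((caller, depth + 1))
    return marked


file_related = callers_of("CreateFileW", calls)
for name, distance in sorted(file_related.items(), key=lambda kv: kv[1]):
    print(f"{name}: {distance} call(s) away from CreateFileW")
```

Functions that never show up in the result (here `render_frame` and `draw_hud`) can be set aside when you are only interested in file handling.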
Pretty easy and fun. There is a difficulty wall after a few stages, though, but once you get there, you might be able to get through it with some general deduction skills.
I think the author made these available for free in the past but now they are paywalled. (Nothing wrong with that at all, just thought it would help others trying to find the download on the site.)
There was MTP Target, a clone with penguins, which had a pretty active community circa 2008[1][2].
I loved this game. It had an IRC server included so you could chat while in game, a forum board, and a pretty diverse community. The gameplay was simple to learn but difficult to master, with a lot of fun for everyone.
Sadly, it was open source, but not really. The sole developer never opened up the source of the server, and even the client's source was not kept up to date, so it was impossible to play anywhere other than the official server.
The sole developer had a little revenue stream from selling premium accounts (which enabled custom skins and access to a private server) and never agreed to let the community fix bugs or host other servers. There was quite a bit of friction and drama between him and some members of the community, and in the end he preferred letting the game die to giving it to the community.
An indie developer announced today that their Super Monkey Ball-inspired game will be going into early access on Steam later this month: https://www.youtube.com/watch?v=YN_XnekG6Ac
It looks to be incredibly close to those original games.
Is the bomb lab the same one as in CS61A? I did that one and it was extremely interesting. Sadly I'm not really sure how to go from there to start a reverse engineering life but at least I'm taking some univ courses.
I've seen bomblab, etc, at a couple of universities. Does anyone know the history of these exercises? I saw them at CMU as well. Really cool that these are used across several great programs.
I believe they are a part of Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron. You can see a list of the labs at http://csapp.cs.cmu.edu/3e/labs.html.
My grad school had the same auto-grading and a scoreboard where you could anonymously see each other's attempts and explosions. We did it as part of our Systems I class, which was essentially an introduction to C and assembly. I felt the point loss from blowing up was poorly thought out, at least for an intro-type class. It means that if you don't do it perfectly in the first few attempts, there's actually a point where you can come out ahead by not finishing the assignment, which really bothered me, as that seems exactly against any intended learning outcome of the lab.
I wonder if anyone has any reading resources for reversing old DOS programs written in Fortran, namely SCORE, the music notation program, still unparalleled in productivity, beauty, and preciseness, but with a dead author and no source code.
Interesting... (I sometimes do reverse engineering for fun and I am always looking for new challenges.) I see there is a DOS version and a Windows version. I suppose you're talking about the DOS version (SCOR4)?
Yes, I'm talking about the DOS version, SCOR4.exe, as the Windows version is a buggy beta release; the author died before being able to get it stable.
I expect you'll have a tougher time there because NES games were written directly in assembler, so there won't be the characteristic patterns that a decompiler can use to recognize particular control flows. I believe C wasn't really standard until the N64 era, which was part of what made high-level emulation (see: UltraHLE) possible.
On the other hand, hand-authored assembly is easier to read directly, because, of course, it was written and read in that form originally anyway!
Ghidra didn't do a great job of 6502 when I tried using it a couple of years ago, due to poor support for a couple of the 6502's addressing modes, but maybe it's improved?
The version 9.2 release notes mention improvements to the 6502 processor specification:
> Many improvements and bug fixes have been made to existing processor specifications: ARM, AARCH64, AVR8, CRC16C, PIC24/30, SH2, SH4, TriCore, X86, XGATE, 6502, 68K, 6805, M6809, 8051, and others.
The Displaced Gamers' "Behind the Code" series does stuff like this for NES games. I don't know the exact tech they use. I think it's basically emulators with debugging features.
I wonder if Dolphin lets you load a "cheatsheet" or something where you can tell it to modify certain values in memory after loading from the ISO - so you don't have to modify the original source ISO at all.
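For what it's worth, Dolphin does support Action Replay and Gecko cheat codes, which work roughly along these lines. The underlying idea can be sketched as a list of address/value writes applied to the emulated RAM after the game has loaded, so the disc image itself is never touched. This is a minimal illustration only, not Dolphin's implementation or its code format; the patch address is hypothetical.

```python
import struct

RAM_BASE = 0x80000000   # start of GameCube main RAM in the cached address space
RAM_SIZE = 0x01800000   # 24 MiB of main RAM


def apply_patches(ram: bytearray, patches) -> None:
    """Write each 32-bit big-endian value at its address in emulated RAM."""
    for address, value in patches:
        offset = address - RAM_BASE
        if not 0 <= offset <= len(ram) - 4:
            raise ValueError(f"address {address:#010x} is outside main RAM")
        struct.pack_into(">I", ram, offset, value)   # PowerPC is big-endian


# Stand-in for the emulator's RAM after the game has been loaded from the ISO.
ram = bytearray(RAM_SIZE)

# Hypothetical patch: overwrite one instruction with `li r3, 99` (0x38600063).
patches = [(0x80123450, 0x38600063)]
apply_patches(ram, patches)

print(ram[0x123450:0x123454].hex())
```

Because the writes happen in memory at run time, switching the patch list off gives you the original game back without re-extracting anything.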
I find it funny that their goal is to make the game technically harder in order to make it a better game in general: reducing the points you get incentivizes you to explore the other levels more. That's the opposite of your typical game hack, which makes things easier or adds cool things. I do get that this is just a tutorial, though.
Nice reversing article for GameCube, but I think the original scoring is much better.
It prevents you from playing the same easy levels over and over again, and brings you to your skill level as fast as possible. It's very good adaptive difficulty: it actually punishes players for choosing easy levels below their skill level, keeping the game a constant challenge.
PS: haven’t played the game, assuming later levels are more difficult
As an example, here's a full decompilation of the camera code which runs after the ball passes through the goal: https://cdn.discordapp.com/attachments/463221047471374337/79...