I’m not personally a programmer, so that’s half the reason I look into stuff like this, but I keep looking into alternatives to traditional ROM hacking for translating games. The following links are part of my inspiration for what I’m considering:
Right. What if, rather than trying to hack into the game, you make a wrapper that reads what is already being output in the RAM and overlays English over the text? Now, this would probably take some programming wizardry (especially when it comes to getting text boxes to appear where they should, figuring out solutions for cutscenes and voiced dialogue, and the insane amount of work it would take to create a program that allows users to do all of this), but I can’t help but stay interested in the idea. With enough programming knowledge, I bet you could even make this work on real hardware in the case of newer consoles. However, one kicker I don’t know enough about is this: could you make something generalized enough to cover a variety of games, or would this need to be done per individual game, defeating the purpose?
Another option is an OCR solution; Ztranslate is one I’ve used. It works decently, though the true potential comes from “packages”. Packages are hand-edited versions of the translations wrapped into a file that allow most of the kind of stuff I’m talking about, even cutscene dialogue. However, the developer hasn’t made much progress in the last two years and I’ve seen little interest from outside programmers. This would not be a real hardware solution at all.
Just a bit of rambling. This is an idea I’ve become a bit obsessed with, as there are more translators out there than hackers, and there’s already a large number of games that have script translations somewhere but no way to apply them in-game.
So obviously if you want it to work on a repro cart you need to do it the old-fashioned way, by hacking. But let's say you're working on the "translation" emulator, where you're building in this sort of intercept capability. There is already some precedent for interception; that's how the Game Genie works. It sits between the game and the console, and you set it up with one or more alternate values that "overwrite" what is actually in the game, whether it be RAM or ROM. So presumably you could do something similar: when the game wants to read a particular address for the text to output, you supply your own text instead.
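To make the interception idea a bit more concrete, here is a minimal Python sketch of a read hook. Everything in it is an assumption for illustration: `PatchedBus` is a made-up class, the addresses are arbitrary, and the ASCII bytes stand in for whatever text encoding a real game would use. A real emulator would need this wired into its memory-read path.

```python
class PatchedBus:
    """Hypothetical memory bus that substitutes bytes on read."""

    def __init__(self, rom: bytes):
        self.rom = rom                      # original game data, never modified
        self.patches: dict[int, int] = {}   # address -> replacement byte

    def patch_bytes(self, address: int, data: bytes) -> None:
        # Register one replacement byte per address, starting at `address`.
        for offset, value in enumerate(data):
            self.patches[address + offset] = value

    def read(self, address: int) -> int:
        # Return the patched byte if one exists, else the original byte.
        return self.patches.get(address, self.rom[address])


# Fake 64-byte "ROM" with a placeholder string at address 0x10.
image = bytearray(64)
image[0x10:0x15] = b"MOSHI"
bus = PatchedBus(bytes(image))

# Supply replacement text for the same address range.
bus.patch_bytes(0x10, b"HELLO")
text = bytes(bus.read(0x10 + i) for i in range(5))
print(text)
```

Note that the replacement here is exactly as long as the original, which is the easy case.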
The thing is, there is a lot more to it than that for any old game. First, you need to figure out where the text read occurs. If you're lucky there are a couple of levels of indirection, so you only need to modify one spot rather than needing N different handlers for the different instances of text. Then you might also need to hack in the glyphs needed to render your translated text; it's not a sure thing that a game in a different language has your character set as part of the game (or if it does, it may be on a different bank, so you still need to either trigger a bank swap or update the bank). Beyond that is how well or poorly the game handles the fact that your text strings are almost invariably going to be of different length from the game's native strings. If you're interested in all the technical challenges of game translation I recommend checking out the Mother 3 translation blog; it was written by the translator for that project, so he gives a much more layman-friendly view of some of the specific technical challenges. One example is when the text box supports three lines but your translated text requires four. Was the game originally programmed assuming there was no need to scroll? Does it handle line breaks automatically, or do those need to be formatted into the string? You end up needing to build up a lot of knowledge about how the game was programmed, and it's going to be different from game to game. The biggest benefit of this sort of overlay system would be not having to worry about the ROM size and getting everything into the right banks, so you don't have to interface with how the game handles bank swapping. But if you're already doing a fair amount of hacking, that isn't really any different.
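As a rough illustration of the "three lines but you need four" problem, here is a small Python sketch that wraps translated text to a box width and splits the overflow into extra boxes. The function name `paginate` and the box dimensions are assumptions of mine, and it assumes a fixed-width font; a real game with variable-width glyphs would have to measure pixel widths instead.

```python
import textwrap


def paginate(text: str, chars_per_line: int, lines_per_box: int) -> list[list[str]]:
    """Wrap text to the box width, then group the lines into boxes."""
    lines = textwrap.wrap(text, width=chars_per_line)
    return [
        lines[i:i + lines_per_box]
        for i in range(0, len(lines), lines_per_box)
    ]


# Hypothetical translated line that no longer fits in one 18x3 text box.
translated = "The old man says the cave to the north is dangerous at night."
boxes = paginate(translated, chars_per_line=18, lines_per_box=3)

for number, box in enumerate(boxes, start=1):
    print(f"--- box {number} ---")
    for line in box:
        print(line)
```

The overlay still has to know when the game advances to the next box, which is exactly the kind of per-game knowledge described above.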
Now, games that are recent enough are potentially making use of modern localization and string handling technology, where the strings are externalized and the system handles strings dynamically with the UI elements without assuming anything about them (like how a Japanese game can assume fixed width characters and that the lengths of character names are always going to be N or less). But even that isn't a sure thing.
Blizzard Entertainment Software Developer - All comments and views are my own and not representative of the company.
Essentially what I feared: with the amount of effort involved, you might as well just make a traditional hack. Disappointing but not unexpected.
@Maru-
Yeah, and that’s what I’m looking at. There’s a translation program called “Sugoi” that is pretty accurate for Japanese that can be combined with an OCR (Visual Novel OCR is recommended, made by the same person). Sugoi also doesn’t require internet connectivity, which is great. Big issue though is that it uses a ton of RAM and you essentially have to have 3 programs open at once. I’m in the Discord for Ztranslate, I guess I could check in with that dev and see how simplifying his package idea to be useable to non-programmers has gone.
I highly recommend this book if you want to see the pitfalls of machine translation and the commentary of a professional translator.
I agree with you about the pitfalls of machine translation, but OCR does not necessarily have to be paired with machine translation. I also don't think true OCR would even be necessary for this application if you allowed for some game-specific knowledge.
An older game is going to have a knowable tileset that could be extracted without a lot of low-level knowledge. Some subset of these tiles will be characters.
If you have a tileset of characters you don't have to do true OCR to detect the presence of characters on the screen because there won't typically be things like odd rotations or handwriting variance, etc. It is a much more constrained problem.
You could then detect strings of tiles that you know correspond to words.
If the game has a translation dictionary that includes full phrases or passages in the native character set and corresponding translations in the target language, it becomes a lot easier to translate on the fly.
Attempting to superimpose this text over the on-screen text is its own problem, but if the player was willing to see text displayed in a corresponding location in a second window, that greatly simplifies the problem.
This obviously would only work in an emulation setting, but could allow for fan game translation without requiring low-level programming knowledge.
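Here is a minimal Python sketch of the tile-matching idea, under some loud assumptions: that the emulator can hand you the on-screen tilemap as a grid of tile IDs, and that someone has already worked out which tile IDs are which characters. The tile IDs, `TILE_TO_CHAR`, `PHRASES`, and the function names are all made up for illustration.

```python
# Hypothetical mapping from tile IDs to the characters they draw.
TILE_TO_CHAR = {0x01: "こ", 0x02: "ん", 0x03: "に", 0x04: "ち", 0x05: "は"}

# Hypothetical per-game dictionary: native phrase -> translation.
PHRASES = {"こんにちは": "Hello"}


def read_rows(tilemap: list[list[int]]) -> list[str]:
    """Turn each row of tile IDs into a string; unknown tiles become spaces."""
    return ["".join(TILE_TO_CHAR.get(tile, " ") for tile in row) for row in tilemap]


def translate_screen(tilemap: list[list[int]]) -> list[tuple[int, int, str]]:
    """Return (column, row, translation) for every known phrase on screen."""
    hits = []
    for y, text in enumerate(read_rows(tilemap)):
        for phrase, english in PHRASES.items():
            x = text.find(phrase)
            if x != -1:
                hits.append((x, y, english))
    return hits


# Fake two-row screen: tile 0 is background, row 1 contains the phrase.
screen = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 2, 3, 4, 5, 0],
]
hits = translate_screen(screen)
print(hits)
```

The column and row in each hit are what you would use to place the translation, either over the original text or at the matching spot in a second window.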
It was a horrible experience, but Ross Scott played through an entire game using screen capture "real time" translation. It was taxing and draining to watch, but if you didn't understand his pain you might, through the editing, think it was actually acceptable and playable. I was brought to the verge of tears when things went wrong before the apex. The software he used got better, and for most games there are other alternatives.
I bought Tunic on launch day. I have been following this game for years. It's been tough playing it though with the 55-hour work week I've been doing. Still, I am having a blast playing it. I just made it to the swamp. Game reminds me of Dark Souls fused with Link's Awakening. It's on Xbox (Gamepass as well), PC stores (Steam, GOG). Worth a buy in my opinion, plus there's no way I'm getting everything in one playthru.
"Challenging my unit was both foolish and reckless! You are nothing more than my prey... one that is soon to be retired!"