Thursday, August 30, 2018

What About Localizing My Game?

This started as a Reddit post in reply to a question about how to handle localization, but it got quite long so I figured I'd post it here.  Hopefully this is useful to someone.  For the record, what we're talking about here is making your game available in multiple languages.

There's not a ton of work required, on a technical level, to build yourself a functioning localization solution.  Someone experienced can probably build what's necessary in an afternoon, and none of the work is exotic.  Unity doesn't have one in place by default, however, so I think a lot of folks who are just learning tend to gloss over the infrastructure, or don't realize how much difficulty they could have down the line trying to retrofit support.  Luckily it's pretty easy, and I encourage preparing any serious project for it from the beginning.  Here's roughly what we use.

Firstly, we have a TextManager that is a mostly-static singleton class.  It gets initialized on first use if needed, but we also initialize it manually in the game's startup scene.  Initialization mostly consists of setting the desired language, then reading in the actual strings and sticking them in a dictionary.  We only read in a single language at a time, and the languages exist in a set of parallel files that use the Android XML format.  If you set TextManager to German, it loads the german.xml file; if you set it to English, it loads the english.xml file, and so on.  You can then use a static method like TextManager.GetString("level_readout") to retrieve an entry.  It searches for the "level_readout" key and returns the actual text for your loaded language.
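
To make that concrete, here's a minimal sketch of the idea.  It's Python rather than Unity C#, and it is not our actual code: the method names load_xml and get_string, the sample keys, and passing the XML in as a string (instead of reading german.xml or english.xml from disk) are all assumptions made to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


class TextManager:
    """Mostly-static string table: one language loaded at a time."""

    _strings = {}
    _language = None

    @classmethod
    def load_xml(cls, language, xml_text):
        # Parse Android-style <resources><string name="key">text</string></resources>.
        root = ET.fromstring(xml_text)
        cls._strings = {
            node.get("name"): (node.text or "")
            for node in root.iter("string")
        }
        cls._language = language

    @classmethod
    def get_string(cls, key):
        # Unknown keys come back loudly marked instead of raising.
        return cls._strings.get(key, "MISSING: " + key)


# Hypothetical contents of an english.xml file.
ENGLISH_XML = """<resources>
    <string name="level_readout">Level: {0} of {1}</string>
    <string name="menu_start">Start Game</string>
</resources>"""

TextManager.load_xml("english", ENGLISH_XML)
print(TextManager.get_string("level_readout"))  # Level: {0} of {1}
```

Loading a different language just replaces the whole dictionary, which is why only one language is ever in memory.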

We've got a lengthy history of Android dev, which is really the only reason we use that file format.  A CSV or set of CSVs is probably the best option in my opinion, since when you get to the point of actual localization you'll very commonly end up sending spreadsheets to people.  A format that makes that simple is useful.  Alternatively, iOS- or Android-style files are useful since a lot of services can work with them directly.  Remember you will very likely have to merge things eventually, so whatever you do, don't use a binary file for this.  UTF-8 text is ideal.

[Image: CSV file with key in first column, languages in following columns]
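
If you go the CSV route, reading one language out of a sheet laid out like that might look something like this.  It's a rough Python sketch; the column names, the sample rows, and the load_language helper are made up for illustration.

```python
import csv
import io

# Hypothetical sheet: key in the first column, one column per language.
SHEET = """key,english,german
menu_start,Start Game,Spiel starten
menu_quit,Quit,Beenden
"""


def load_language(csv_text, language):
    """Return {key: text} for one language column of the sheet."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    column = header.index(language)  # raises ValueError for an unknown language
    return {row[0]: row[column] for row in rows if row}


strings = load_language(SHEET, "german")
print(strings["menu_quit"])  # Beenden
```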

Requirements for the entries are quite simple, whatever your format.  You have a dictionary-style key, which is what you'll use to refer to the strings in code, and then the actual text entries, which are what will ultimately end up on-screen.  No matter how good the idea sounds at the time, do not use your English string as the key.  Strings change, your first draft is often temporary, and you don't want to have to edit your code in order to fix grammar or spelling or whatever.  In fact, I encourage forcing your keys to lowercase, and disallowing things like spaces as well.  Treat your string keys like variable names.
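
One way to enforce the "keys are like variable names" rule is a small check you run over the string file.  The exact pattern below (lowercase letters, digits, and underscores, starting with a letter) is my assumption about what a reasonable rule looks like, not a standard, and it's written in Python purely as a sketch.

```python
import re

# Assumed rule: a lowercase letter first, then lowercase letters, digits, underscores.
KEY_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_key(key):
    return KEY_PATTERN.fullmatch(key) is not None


for candidate in ["level_readout", "Level Readout", "You win!"]:
    print(candidate, is_valid_key(candidate))
```

Only the first candidate passes; the other two are exactly the kind of "English string as key" entries you want to catch early.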

Now here's the important part: make sure you use it.  Whenever you put text on the screen, call TextManager.GetString() and use the result.  That real-text string is what you put on screen.  Directly putting user-visible strings in your code is now verboten, and you should feel dirty for doing it.  This means that in addition to lengthy strings like dialogue or item descriptions, you'll end up with short strings that are simply labels, or that have replacement fields in them like "Level: {0} of {1}".  Replacements like that are better than assembling a few tiny strings, because some languages may prefer a different word order.
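
Here's a small illustration of why the replacement fields matter, sketched in Python (whose str.format happens to accept the same {0}/{1} indexed placeholders).  The second table is placeholder text standing in for a language that wants the total first; it is not a real translation, and the helper name is hypothetical.

```python
# Hypothetical entries for the same key in two string tables.
english = {"level_readout": "Level: {0} of {1}"}
reordered = {"level_readout": "{1} levels, currently on {0}"}


def level_readout(table, current, total):
    # The calling code is identical for every language; only the entry differs.
    return table["level_readout"].format(current, total)


print(level_readout(english, 3, 10))    # Level: 3 of 10
print(level_readout(reordered, 3, 10))  # 10 levels, currently on 3
```

If the code had glued together "Level: ", a number, " of ", and another number, the second ordering would need a code change instead of a string change.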

This sounds like a hassle, but it quickly becomes not a big deal -- you're probably only making a couple of entries at a time, and you'll mostly only end up dealing with English (or whatever your native language is) during development.  I assure you, doing it as you go is far, far less painful than tracking down hundreds of strings scattered across your codebase later down the road.

You'll probably end up with some convenience functions for common types of usage.  If you're using Unity, a component that automatically fills in a text component on the same GameObject is an obvious addition as well.  Whether you want to deal with switching languages on the fly is something to consider; that gets more complicated, since it means menus and the like will need to be notified so they can update all their text fields.
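
If you do want on-the-fly switching, the usual shape is some kind of subscribe-and-notify arrangement.  This is only a sketch of that pattern in Python, not how we do it; LanguageNotifier, MenuLabel, and the sample tables are all invented names, and in Unity the label would be a component rather than a plain class.

```python
class LanguageNotifier:
    """Keeps a list of callbacks to run after the language changes."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def set_language(self, language):
        # A real version would reload the string table here before notifying.
        for callback in self._listeners:
            callback(language)


class MenuLabel:
    """Stand-in for a UI text field that refreshes itself on a switch."""

    def __init__(self, key, tables, notifier):
        self.key = key
        self.tables = tables
        self.text = ""
        notifier.subscribe(self.refresh)

    def refresh(self, language):
        self.text = self.tables[language][self.key]


TABLES = {
    "english": {"menu_quit": "Quit"},
    "german": {"menu_quit": "Beenden"},
}

notifier = LanguageNotifier()
label = MenuLabel("menu_quit", TABLES, notifier)
notifier.set_language("english")
notifier.set_language("german")
print(label.text)  # Beenden
```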

If you really want to help your translators, add a column to your CSV (or an equivalent field in whatever format you use) that lets you fill in context for the string.  You don't need to read this in yourself, but remember your translator will often only have the list of strings to refer to and won't have the time to find that text in the actual game.  A note that says "Refers to the player's current experience level" can be hugely helpful, since without it the translator can't tell whether that string is talking about experience levels, unlockable areas, or something else.  Things like that often come back as questions if the context isn't obvious.
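
As a sketch of how that can sit in the file without the game caring about it, here's a Python example that reads columns by name, so the context notes simply never get loaded.  The column names and the second context note are assumptions for illustration.

```python
import csv
import io

# Hypothetical sheet with a translator-facing "context" column.
SHEET = """key,context,english
level_readout,Refers to the player's current experience level,Level: {0} of {1}
menu_quit,Button on the pause menu,Quit
"""


def load_language(csv_text, language):
    """Read only the key and the chosen language; other columns are ignored."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["key"]: row[language] for row in reader}


strings = load_language(SHEET, "english")
print(strings["level_readout"])  # Level: {0} of {1}
```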

If a key is not found upon attempting to retrieve a string, make it very obvious.  We literally return something like "MISSING: level_readout" so it screams incorrect to the viewer.  During development I could see having your non-native languages fall back to the native string that (almost certainly) exists, but once you get into a QA finishing-things stage you should remove that so nothing gets missed.
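
Roughly, the lookup could behave like this.  It's a Python sketch under my own assumptions: the fallback parameter is a hypothetical way to express the development-only behaviour, and you'd stop passing it once you reach the QA stage.

```python
def get_string(key, strings, fallback=None):
    """Look up key; optionally fall back to the native-language table."""
    if key in strings:
        return strings[key]
    if fallback is not None and key in fallback:
        return fallback[key]  # development-only convenience
    return "MISSING: " + key


english = {"level_readout": "Level: {0} of {1}", "menu_quit": "Quit"}
german = {"menu_quit": "Beenden"}  # level_readout not translated yet

print(get_string("level_readout", german, fallback=english))  # Level: {0} of {1}
print(get_string("level_readout", german))                    # MISSING: level_readout
```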

There's a whole pile of features you could add to a system like this, but for my money they pretty quickly start seeming not worth the trouble.  We have some Python scripts to convert our XML files into a couple of other formats like CSV, but have never had much call to do something like upload them dynamically to Google Drive or whatever.  You could invent your own super-easy-to-parse format if you wanted to, too.  So long as what you have is consistent and you use it, that's the big thing.
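
For a sense of scale, an XML-to-CSV conversion can be about this small.  This is not our script, just a sketch of the idea; the function name and sample data are invented, and it ignores the escaping and nested markup that real Android string files can contain.

```python
import csv
import io
import xml.etree.ElementTree as ET


def android_xml_to_csv(xml_text, language):
    """Convert Android-style string resources to a two-column CSV string."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["key", language])
    for node in root.iter("string"):
        writer.writerow([node.get("name"), node.text or ""])
    return out.getvalue()


# Hypothetical contents of an english.xml file.
ENGLISH_XML = """<resources>
    <string name="menu_start">Start Game</string>
    <string name="menu_quit">Quit</string>
</resources>"""

print(android_xml_to_csv(ENGLISH_XML, "english"))
```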
