Brood War Korean Translations (blog.sourcedive.net)
Submitted by todsacerdoti 6 hours ago
  • dfan 6 minutes ago

    Even when Google Translate got pretty good I was not really able to effectively translate Chinese or Japanese text about Go (the game). I had similar issues to the ones mentioned in this post. Many Chinese and Japanese words (e.g., "ko") have a very specific meaning in the context of Go, but they also have regular meanings (e.g., "robbery") in more normal contexts, so Google Translate would translate text in a generic way, which made everything unintelligible. With modern LLMs, I can now preface my translation requests with instructions such as "I am going to ask you to translate some Chinese text accompanying weiqi diagrams. Your translations should be idiomatic and not shy away from Go jargon. For example, 拆 = extension, 夹 = pincer, 刺 and 觑 = peep.", and it does a fantastic job, enough for me to basically read anything I want. It was lucky for me that evidently enough Go material already existed in the training set that I didn't have to do anything more special.
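
A glossary-prefaced prompt like the one quoted above can be assembled programmatically. The following is only a sketch: `build_translation_prompt` and its parameters are hypothetical names, and the glossary holds just the four terms given in the comment.

```python
def build_translation_prompt(glossary, source_language="Chinese", domain="weiqi"):
    """Build a preamble that pins domain jargon before a translation request.

    `glossary` maps a source-language term to the preferred English jargon.
    The function name and parameters are illustrative, not from any library.
    """
    terms = ", ".join(f"{src} = {dst}" for src, dst in glossary.items())
    return (
        f"I am going to ask you to translate some {source_language} text "
        f"accompanying {domain} diagrams. Your translations should be idiomatic "
        f"and not shy away from Go jargon. For example, {terms}."
    )


# Terms taken from the comment above; extend with whatever jargon you need.
glossary = {"拆": "extension", "夹": "pincer", "刺": "peep", "觑": "peep"}
prompt = build_translation_prompt(glossary)
print(prompt)
```

The resulting string would be sent ahead of the text to translate; how it is passed to a model depends on the client being used, which is deliberately left out here.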

    (Some chess corrections, in case the author is reading: the moves at the start of chess games are called openings in English, not openers; there are not distinct white-piece and black-piece openings, although of course an individual player will probably study a given opening from the point of view of one side or the other; their study is considered fundamental all the way up to the highest level, in fact more so as you increase in skill; and the Sicilian variation in question is the Najdorf, not Najdork.)

    • jaeyounkg 5 hours ago

      This was a fun read, as someone who's both a Korean BW player and a speech recognition researcher.

      It's interesting to note that the original Korean transcription already has many errors, seemingly (and impressively) corrected by LLMs later on. For example, 12 안마당 빌드 (12 courtyard build) is actually 12 앞마당 빌드 (12 frontyard build), which might have been more understandable to BW players. Similarly 투에처리 빌드 (processing-at-two build? makes no sense lol) should have been transcribed 투해처리 빌드 (two-Hatchery build).

      Therefore it may also be helpful to directly feed the slang dictionary into Whisper's inference process using contextual biasing. There are lots of ways to do this, but the simplest would be to increase the probability of slang words in the dictionary in the final prediction layer of Whisper by a constant factor. This is fairly easy to implement, for example by using HuggingFace's library: https://huggingface.co/docs/transformers/en/internal/generat...
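
As a rough illustration of the constant-factor boost described above, here is a plain-Python sketch that does not touch Whisper or any real library API. The token ids and logits are made up, and `bias_logits` is a hypothetical helper. It relies on one fact: adding log(factor) to a logit multiplies that token's unnormalized probability by `factor`.

```python
import math


def bias_logits(logits, boosted_ids, factor):
    """Boost selected tokens by a constant multiplicative factor.

    Adding log(factor) to a logit multiplies its softmax numerator by
    `factor`; the softmax then renormalizes everything to sum to 1.
    """
    bonus = math.log(factor)
    return [x + bonus if i in boosted_ids else x for i, x in enumerate(logits)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy final-layer scores for three hypothetical token ids (0, 1, 2).
logits = [2.0, 1.5, 0.1]
before = softmax(logits)
# Pretend token 1 belongs to a slang word from the dictionary.
after = softmax(bias_logits(logits, boosted_ids={1}, factor=4.0))
print(before, after)
```

Real slang words usually span several tokens, so a practical version would have to bias token sequences rather than single ids; this sketch only shows the single-token arithmetic.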

      • chongli 2 hours ago

        I am a StarCraft fan and I have no idea what a courtyard or a frontyard is supposed to be! However I do know that the names of buildings, units, technologies, and strategies are usually heavily abbreviated in English. Perhaps the same is true in Korean? A 12 barracks build would usually just be called "12 rax", a two hatchery mutalisk build would be called "2 hatch muta", and a three hatchery hydralisk timing attack / all-in would be called "3 hatch hydra bust".

        • rcthompson 2 hours ago

          I believe the equivalent term used in English (exhibited in the new translation) is "natural", short for "natural expansion", which refers to the obvious location where the player should build their first expansion. It sounds like the term used in Korean for this concept literally means "front yard" rather than matching the English term.

          • Reason077 an hour ago

            Makes sense. And presumably the 12 means that you expand to your natural ("courtyard") with your 12th worker unit (probe, in the case of protoss).

          • starcraftgamer 2 hours ago

            A lot of Korean slang is a little different. Source: not Korean but have been in the English community a long time and picked some stuff up.

            "1rax double" is equivalent to "1rax expand" or "1rax CC". They use multi or double to mean expand in the early game. Instead of "cheese" or "all-in" they use "pil-sal-gi" which means ace/joker card or "han-bang" which means an army or attack on few resources.

            I am not sure what short-hand they use for barracks, gateway, etc.

            • chongli an hour ago

              > Instead of "cheese" or "all-in" they use "pil-sal-gi" which means ace/joker card

              That’s a really interesting one to me! One thing I’ve noticed is that Koreans do not seem to have the same hangups / negative attitude towards cheese strategies as westerners do!

          • bee_rider 5 hours ago

            Do they actually use the Korean word for, like, tossing something to refer to the Protoss? That’s a pretty funny cross-language pun if so.

            • asdasdsddd 4 hours ago

              Half of the words in the Korean blurb are just romanizations. Even build is just bil-deu.

              • jaeyounkg 5 hours ago

                Haha, no, I actually never associated this with the English word toss lol.

            • jaimebuelta 2 hours ago

              LOL, as a non-native English speaker, reading this reminds me of EXACTLY the same problem of translating many things, but more precisely, computer articles and software development.

              There’s a huge number of terms that are difficult to translate (sharding? Hash?). The only real solution is to adopt them into your language, more or less adapted, which is what happens over time. But it requires a community that, to some degree, is able to cross the gap between the languages. In this case, learning English.

              Talking about software development in Spanish (my native language) is a succession of imported terms from English.

              I don’t think there’s a good way of doing that, and I’m interested to see how automatic translations deal with it, because the only way this can work is through a process of mixing both languages in a social way and seeing what terms evolve from that process.

              And you need, in the terms the post describes, people that know Korean at least in a non-fluent way. And the game itself, of course.

              • leshokunin 5 hours ago

                Don’t let the title fool you: this is an extremely thorough and creative take on translating the commentary of StarCraft and making it more approachable.

                As the author rightly points out, in its 27 years of existence, commentary around the game has become a domain specific language. Not just Korean or English.

                This approach of automated scripting and using AI to understand roughly what was said and then make it coherent is really cool.

                • navane 12 minutes ago

                  How far off are we from local immediate voice translation? Something on my computer that translates all words spoken by my computer, keeping tone and intonation?

                  • _dark_matter_ 4 hours ago

                    Kinda funny that in an article about translations, the author gets signal-to-noise completely backwards. A high signal to noise (over 9000) is very good. It means you are getting a lot of signal with very little noise. Decreasing signal to noise means getting more noise.
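
For reference, the direction of the ratio can be checked with a couple of made-up numbers. This is a minimal sketch using the usual decibel definition, 10 * log10(P_signal / P_noise); the power values are invented for illustration.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(signal_power / noise_power)


# Same signal power, different amounts of noise (illustrative values only).
clean = snr_db(signal_power=100.0, noise_power=1.0)
noisy = snr_db(signal_power=100.0, noise_power=50.0)
print(clean, noisy)
```

With the signal held fixed, adding noise lowers the ratio, so a higher value always means a cleaner channel.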

                    • jfim 2 hours ago

                      The author also used domain specific language when they meant jargon.

                      • flerchin 4 hours ago

                        Yes, I had to read that a few times to understand it. Much like the Google translations.

                      • jboggan 21 minutes ago

                        This is amazing. I'm getting chills!

                        As a side note, I have gotten into watching a "foreigner" BW channel every day called ArtosisCasts. The videos are very strategic and high level commentaries on games as they are watched for the first time, with some after-match highlights for especially interesting maneuvers. It has really made me appreciate the depth of the game, as well as explain how I was so bad at it in high school. It has actually made me think a lot about startups, economic optimization, and how you approach the "meta" of any activity you're undertaking.

                        • bee_rider 5 hours ago

                          Dumb question from someone who only played money-maps as a kid:

                          What do the numbers in front of the building mean? 12 Hatcheries seems like… well, 12 seems like a possible but implausible number of hatcheries to build (hypothetically it is possible of course). And 12 spawning pools is obviously not useful. So that makes me think it is the position in the build order list. But, they list other builds, like:

                          > The second is the 12 Hatch, 12 Pool, 12 Gas

                          Which doesn’t make a ton of sense with that parsing. I mean it must not be a straight list. Maybe it is a tree, and 12 is the depth for this building? But that seems late, I can’t think of 11 buildings to build before gas. Maybe they include units too? Or maybe just drones/overlords?

                          • stackghost 4 hours ago

                            IIRC it started with "4-pooling" which is when, as Zerg, you build a spawning pool while only having 4 workers (it's been years, forget what they're called), rebuild your 4th worker and then start making zerglings to achieve a super-early attack (a "rush").

                            Then your opponent calls you all sorts of vile names and questions your sexuality, etc.

                            • TeMPOraL 2 hours ago

                              That's only if you manage to get the first two zerglings out faster than it takes for opponent's SCVs to arrive at your base and kill your drones (that's the name of Zerg workers) :).

                            • SynasterBeiter 5 hours ago

                              It denotes how much supply you should have when you start the building. All of your supply at this stage comes from workers, so it's also an indication how many workers you should train.

                              • TulliusCicero 2 hours ago

                                People already explained that it's how much supply you have.

                                In practice this is easier for people to use than actual clock timings, because it's more robust to delays or interference. If you remember "third rax at 30 supply" then even if you're playing a little slow, you will still know roughly when to build that. But if you memorized exact clock timings and now you might be 20+ seconds behind, it's hard to know when you should fit in the new building.

                                It's not perfect of course, and if you get cheesed and the game goes weird then you'll have to start improvising rather than relying on just supply timings. A lot of times after a cheese where neither side definitely wins, the balance between tech and economy is now very non-standard and you can't rely on conventional rules of thumb anymore.

                                • gs17 3 hours ago

                                  > And 12 spawning pools is obviously not useful.

                                  I vaguely remember a Husky video where he actually did a "9 pool" with building 9 spawning pools.

                                  • moefh 18 minutes ago

                                    There might be a video where this happens, but I think it's more likely that you're misremembering it; there was a somewhat famous game cast by Husky where Cella[1] (a professional player) was joking around on the Internet playing 2v2 (2 teams of 2 players each).

                                    He asked his partner what strategy he should use, the person responded with "13 gate" (meaning: keep building probes until you have 13, then build your first warpgate). Cella pretended to misunderstand and instead built 13 warpgates, which is a horrible strategy, but they still won the game. They only won because his partner could barely defend him in early game while he was building the warpgates. After surviving early game, it wasn't a fair fight even with a horrible strategy, because it's a professional against "normal" people on the Internet.

                                    I don't think the video exists anymore, Husky famously removed his whole channel with a lot of StarCraft 2 early history, but I found this Reddit thread[2] talking about the game (WeRRa was Cella's team at the time, that's why they call him CellaWeRRa).

                                    [1] https://liquipedia.net/starcraft2/Cella

                                    [2] https://www.reddit.com/r/starcraft/comments/dyjk9/cellawerra...

                                    • bee_rider 2 hours ago

                                      That would be just a flex or a joke or something, right?

                                  • narcindin 5 hours ago

                                    Is it how many workers you have when you build that building?

                                    • Reason077 an hour ago

                                      Doesn't it mean you build your expansion once you have 12 worker units?

                                      • zzlk 5 hours ago

                                        It's the supply of when you should build it. In the early game it's essentially how many workers you have.

                                        • LegitShady 4 hours ago

                                          In the game you build buildings and units. The units take up "supply" which there is a limit on. At the beginning of the game you are mostly just building workers (unless you scout that your opponent is going for an extremely early attack), who mine resources and construct buildings.

                                          The numbers indicate the supply you should be at when you build the structure.

                                          So a 12 hatch 12 pool 12 gas means you get to 12 workers and then build those 3 buildings in that order as soon as you have the resources for those.

                                          For zerg the workers actually become the building, so I assume you hit 12, build the hatchery, build another worker, build the spawning pool, build another worker, and then build your gas refinery.

                                          • starcraftgamer an hour ago

                                            Yes as zerg the lost supply is counted, so you can either go 12 hatch 11 pool 10 gas or 12-12-12 if you want to be a little bit more economically greedy at the expense of making it much harder to hold 8rax in ZvT as an example.

                                            As you get later into the game people who play more seriously also use the in-game clock, or timing a building placement relative to how complete a different building is to determine building timing. This helps with subtleties like whether you lost your scouting worker or not (-1 supply), if the early game got really weird because you had to build more units to hold some aggression, etc.
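
The Zerg supply bookkeeping described above (each building consumes the drone that morphs into it) can be sketched in a few lines of Python. This is a simplification that ignores resources, timing, and overlords; `zerg_supply_labels` is a hypothetical helper, not part of any tool.

```python
def zerg_supply_labels(start_supply, buildings, rebuild_drone):
    """Return (supply, building) labels for a Zerg opening.

    Each Zerg building consumes the drone that morphs into it, so used
    supply drops by one per building. If `rebuild_drone` is true, a
    replacement drone is made before the next building, which restores
    the supply count.
    """
    supply = start_supply
    labels = []
    for name in buildings:
        labels.append((supply, name))
        supply -= 1  # the morphing drone no longer counts toward supply
        if rebuild_drone:
            supply += 1  # replacement drone brings supply back up
    return labels


buildings = ["hatch", "pool", "gas"]
lean = zerg_supply_labels(12, buildings, rebuild_drone=False)
greedy = zerg_supply_labels(12, buildings, rebuild_drone=True)
print(lean)    # [(12, 'hatch'), (11, 'pool'), (10, 'gas')]
print(greedy)  # [(12, 'hatch'), (12, 'pool'), (12, 'gas')]
```

The first result matches the "12 hatch 11 pool 10 gas" naming and the second the greedier "12-12-12", under the assumption that a drone is rebuilt between each building.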

                                        • amatecha 5 hours ago

                                          If the author sees this: with yt-dlp you can download lower quality versions of videos to save bandwidth, like so:

                                            yt-dlp -f "bv[height<=720]" <url>
                                          
                                          (where <url> is your URL or video ID)

                                          That will download up to 720p quality.

                                          • doctor_phil 4 hours ago

                                            The author mentions just downloading the audio track. That's a lot less data than downloading any video at all. ;)

                                          • allcentury 4 hours ago

                                            I loved this article, thanks for writing it.

                                            I attempted playing a few World Cyber Games US regional matches and I was always amazed how much faster everyone else was. Then I remember when they live streamed it from Korea and I saw how fast they played and I was blown away. From a strategy point of view, something so basic about the game that I missed was when a blog introduced me to some math for a protoss zealot power up that defeated a zergling in 2 hits rather than 3. That's when I realized this is a chess game and I got hooked.
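
The zealot arithmetic mentioned above can be worked through in a short Python sketch. The unit numbers are recalled Brood War stats and are assumptions, not taken from this thread: a zergling with 35 hit points and no armor, and a zealot attack that lands two hits of 8 damage, each raised by 1 per attack upgrade. Regeneration is ignored.

```python
import math


def attacks_to_kill(target_hp, damage_per_hit, hits_per_attack, target_armor=0):
    """Number of attacks needed to reduce target_hp to zero or below.

    Armor is subtracted from each individual hit. The sketch assumes
    damage_per_hit is greater than target_armor.
    """
    per_attack = (damage_per_hit - target_armor) * hits_per_attack
    return math.ceil(target_hp / per_attack)


# Assumed stats (see the note above): zergling 35 HP, zealot 2 hits of 8.
ZERGLING_HP = 35
ZEALOT_HIT = 8
ZEALOT_HITS_PER_ATTACK = 2

base = attacks_to_kill(ZERGLING_HP, ZEALOT_HIT, ZEALOT_HITS_PER_ATTACK)
upgraded = attacks_to_kill(ZERGLING_HP, ZEALOT_HIT + 1, ZEALOT_HITS_PER_ATTACK)
print(base, upgraded)  # 3 2
```

Under those assumed stats, 16 damage per attack needs three attacks to exceed 35, while 18 per attack needs only two, which is the breakpoint the comment refers to.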

                                            • sharkjacobs 5 hours ago

                                              I get that it's "wrong" but I really like the translation of "natural expansion" to "courtyard"

                                              • TheAceOfHearts 4 hours ago

                                                I really wish someone with the resources and connections could get in touch with South Korean broadcasters in order to get access to their archives so that more historical games could get uploaded and re-commentated for a western audience.

                                                My favorite Brood War slang term is Ee Han Timing [0]: basically when you take a risky build that has to do damage in a small timing window. A ton of exciting Brood War moments come from exploiting tiny timing windows.

                                                [0] https://liquipedia.net/starcraft/Ee_Han_Timing

                                                • karmakaze 3 hours ago

                                                  That's much of the game in a nutshell, in varying degrees. If one player makes many units and the other is instead collecting resources, the first player has to do damage to equalize or else the other player got away with it and the first is way behind. At the top levels, smaller actions like making one or two early attack units follow the same logic to a smaller degree, but early-game differences compound and thus matter more than the same differences later on.

                                                  • debo_ 4 hours ago

                                                    Artosis was doing this with historical games and then he started getting DMCAed by the copyright holders of the original broadcasts.

                                                    • TheAceOfHearts 3 hours ago

                                                      Yeah, I heard about that. Someone with enough resources and connections could probably license the rights, but I doubt it would ever be profitable. Still, I really hope that this key part of gaming history doesn't end up lost.

                                                  • Baeocystin an hour ago

                                                    I was a reasonably competitive BW player until the Korean teams arrived on the scene. I'll always appreciate how they elevated the level of gameplay. Really nice guys, too, I learned a lot from playing with them, and it was fun talking strategy via chat to the best of our mutual linguistic abilities. Good memories. I would have absolutely loved something like this project back in the day.

                                                    • starcraftgamer 2 hours ago

                                                      I've been watching BW out of Korea since 2007. Previously also played but it's been many years. This is really cool, thanks for sharing!

                                                      There are two YouTube channels I wanted to take the opportunity to shout out: the first one does English translations of Korean BW content, and the second one provides commentary on recent tournaments like the ASL with a little bit more depth than Tasteless and Artosis (no hate but to me their commentary is too off topic and they miss basic things about build orders).

                                                      https://www.youtube.com/@jinjinBW

                                                      https://m.youtube.com/@StarCastTVENG

                                                      • xedrac 39 minutes ago

                                                        > Najdork variation

                                                        I think OP doesn't like the Najdorf Sicilian... or is this some meme opening I don't know about?

                                                        • Unearned5161 3 hours ago

                                                          for any of the lucky 10000, like me, who were left wanting to see what this game looked like:

                                                          https://youtu.be/Nm-PXmOELAw?si=Z-RXbdqNzkSF3cqx

                                                          my brief search didn't show me any more obscure Korean only strategy videos, so maybe this one is just for the lowly foreigners :(

                                                          • nicois 2 hours ago

                                                            Nit-picking, but a high signal to noise ratio is desirable, indicating low levels of noise compared to signal, not the reverse.

                                                            • spongebobism an hour ago

                                                              Impressive project, and I always love reading about the communities that form around competitive games.

                                                              It feels kind of sad to admit as a chess player, but "Najdork variation" is one of the funniest typos I've seen.

                                                              • maeil 4 hours ago

                                                                Reminds me, many years ago someone paid me to translate a Korean wiki article about some League of Legends pro player to English. No idea why, most of it was random trivia, it didn't contain any notable insights. But it was decent money as a side job so I didn't bother asking. Possibly similar motives to this article?

                                                                > Very few of members of the foreigner community are fluent in Korean. Foreigner access to Korean BW discourse is a contradicting concept: if you speak Korean fluently, you have no reason to be in the foreigner community, as it only has access to material that is strictly inferior and more limited. For this reason, Korean-speaking members in the foreigner community are exceedingly rare.

                                                                I can vouch for this in general - after becoming fluent I've stopped looking up anything related to Korea in English because the quality of information is much worse. I'm sure the same holds for other languages and places.

                                                                • egurns 2 hours ago

                                                                  Warms my heart to see effort put into a beloved game. Just this week I watched a YT video of a SC custom game where the players were discussing whether it's worth the effort to translate in-game Korean-language content. It's an old game that is played by a niche community in North America. The majority of custom games are created in Korean and never get translated for the small number of North American players that would be interested.

                                                                  • ZeWaka 5 hours ago

                                                                    I wonder how well AI audio generation would work here, to produce a voiceover video like the original input.

                                                                    • nfRfqX5n 5 hours ago

                                                                      Pretty cool, and it shows a clear issue I’ve seen across every LLM: the language and grammar are so formal/robotic.

                                                                      • jayd16 5 hours ago

                                                                        Neat. I wonder what Google Translate uses these days and if it's gotten or will receive an update to a new LLM.

                                                                        • mock-possum 5 hours ago

                                                                          As the author points out, this does seem like exactly the kind of language problem that LLMs ought to excel at, and I love that moment of discovery when the testers were so busy discussing the content that they forgot to focus on the accuracy of the translation!

                                                                          • vippy 4 hours ago

                                                                            ok but where are the vods?