• chrismorgan 2 days ago

    > “This is one of the first studies, if not the first, to show that the use of AI in writing could lead to cultural stereotyping and language homogenization,”

    I just want to make sure others agree and it wasn’t just me (or perhaps non-Americans in general)—it was blindingly obvious this would be, must be, the case, right? That although this might be the first formal study of it, there would have been literally no doubts as to what the outcome of such a study might be? That at least some degree of language homogenisation will be quite inescapable if you do LLMs the way we have?

    On the cultural aspects, it’s well-documented and -understood what effects US TV and movies have had on other countries. There really isn’t anything new about LLMs or AI here, it’s just standard globalisation effects.

    (I also just now learned what a crazy term “Global South” is <https://en.m.wikipedia.org/wiki/Global_North_and_Global_Sout...>, and how it does not mean at all what I thought it meant or what any sane person would expect. Was it not enough that “Western” bears no strong correlation to geography, that we need more terms that utterly abuse geographical references when they’re actually about socioeconomic characteristics? Apparently I have moved south by migrating from Australia to India.)

    • diffeomorphism 2 days ago

      Correct but besides the point. The question of the study is "how much and how do you quantify this?".

      There are lots of blindingly obvious qualitative statements where the quantitative part is far from obvious. That makes them a good starting point for research.

      This is like writing about Newton's theory of gravity as "scientist finds out apples fall downwards. Wasn't that already obvious?".

      • twelvechairs 2 days ago

        'Global south' is just the latest in a fraught space where every term seems to eventually get dragged down. Originally it was 'third world' that started off as a positive, aspirational term but became a derisive term. Then it was 'developing country' that was supposed to be not judgmental but became so. I'm sure 'global south' will go the same way.

        • mort96 2 days ago

          The "global south" is a term which originates in writings about the Vietnam War in the 60s. It's by no means a new term.

          Use third world country if you want, but at least in my mind, the terms "first/second/third world" are more tied to which hegemony you fall under, where the first world is the US hegemony, and the third world is without a hegemon. "First world" is kinda synonymous with "the west". To me, the term "global south" communicates that it's being contrasted against all of "the north", both east and west, while using the term "third world" communicates that the context makes a distinction between "the west" ("first world") and the old eastern bloc ("second world") somehow relevant. Some "second world" countries such as China are also part of the "global south".

          I'm sure there are people who use the term "global south" simply because they perceive it to be less judgemental somehow, but I might've used it in this context because "third world" communicates something different.

          Honestly though, I would've probably just used the term "non-western" (maybe "non-first-world"?), since that's the distinction the article actually draws. Eastern Europe is also affected here, after all. Or maybe I would've drawn the distinction at "US" and "non-US", since Europeans don't necessarily want their writing to sound American either and the old US hegemony seems to be on its last legs.

          • AuryGlenz 2 days ago

            It’s funny, because on the face of it, it seems the worst of all. You might as well just say “dark skinned countries,” since that’s what the term is effectively getting at.

            • trhway 2 days ago

              >You might as well just say “dark skinned countries,”

              Argentina is 97% of European descent, whereas non-white countries like Japan and South Korea are in the Global North.

              It isn't about race/skin color. "Third World" was a suitable term until the Soviet Bloc (the Second World) collapsed, with most of its components going either into the First/Western World or into the Third World. That left the world partitioned into just two large parts, with Russia+Belarus being a distant, small third, so small that it is just easier to count them into the First/Western-style developed world, especially considering that they both moved to capitalism, losing socialism, the major trait of the Second World. (Though 30+ years later Russia drifts more and more toward the former Third World, i.e. the Global South.)

              • AuryGlenz a day ago

                We all know Japanese and Korean people are white by proxy in a lot of people's minds.

                My point wasn't about the grouping itself though, just the term global south effectively being "worse" than the previous terms. For the northern hemisphere, where most people live, south = darker skin.

          • flowerthoughts 2 days ago

            A few months ago, there were articles about AI-written English having more of a Nigerian (I believe) dialect because that's where the labelling for the supervised learning happened in the early days.

            If that had continued, combined with where actual users are, perhaps it would have broadened English instead?

            • jononor 2 days ago

              I agree that homogenization is the likely default outcome. ML models do have a strong tendency to prioritize modelling the most common things well, which makes the output of generative models biased towards the mean/mode. There are also the limitations in representation in the datasets themselves, and the bias that the developers are primarily English speaking, so that language gets priority. But I do not see homogenization as inevitable. LLMs do pick up a wide range of speech patterns: regional varieties, different periods of time, sociolects and styles. And they are pretty darn good at "role-playing", outputting language tailored to a particular role. This can be configured rather effectively using a session or system prompt. So if everybody picked their voice, we could perhaps get a broadening of outcomes. Maybe I should add "always speak like a refined scholar from the Victorian era, spiced up with 80s British goth"...

              • mrtksn 2 days ago

                You know what, if a non-American movie non-ironically tries to be like an American one, it instantly becomes kitsch. Even British movies are quite distinct from American ones.

                Anything AI writes is dull anyway; it writes stuff that nobody wants to read beyond getting some information. Maybe if you are learning English you may pick up something from it though.

                Also, I recall something about AI English actually being Nigerian English because those companies used a lot of Nigerians in training.

                • otabdeveloper4 2 days ago

                  > Apparently I have moved south by migrating from Australia to India.

                  Yeah, we understand "south" to mean "closer to the equator". (That's kind of how it works in the popular imagination. E.g., southern Brazil is more "nordic" than northern Brazil.)

                  • spacechild1 2 days ago

                    > That's kind of how it works in the popular imagination

                    It absolutely does not. South means, well, South.

                    > E.g., southern Brazil is more "nordic" than northern Brazil.)

                    How so? Southern Brazil is clearly closer to the South Pole than Northern Brazil.

                    • otabdeveloper4 a day ago

                      It's a racial term. "South" as in "dusky" or "brown", etc.

                    • chrismorgan 2 days ago

                      That’s definitely not how I’ve ever seen it used in Australia or India. North and south are fairly strictly geographic terms. South India is roughly the southern half of the country—following a cultural and geographical divide, but pretty neat overall. Southern Australia follows the southern edge of the country (not to be confused with South Australia, a state in the south). One is in the northern hemisphere and one in the southern, but they use the terms the same way, pointing to the poles.

                  • rlupi 2 days ago

                    Why say Western when they mean US English? US tech & culture (Hollywood, Netflix, etc.) have a habit of bulldozing over non-English Western culture too.

                    • decimalenough 2 days ago

                      UK, Ireland, Australia, Canada also contribute a non-trivial part of "Western" (English) culture. I'm always surprised by how many Hollywood stars are actually Aussies.

                      • gitremote 2 days ago

                        Yes, because Australian and British actors often must fake US American accents perfectly to get roles in Hollywood. Margot Robbie, Tom Holland, etc.

                        • decimalenough 2 days ago

                          Less than you'd think. Most Australians don't sound like Crocodile Dundee, and the educated/upper class Australian accent is quite neutral/unobvious to American ears.

                          • Zanfa 2 days ago

                            The tell-tale sign of Australians is the "r" sound they add to the end of everything.

                            • gitremote a day ago

                              Can you give an example of an Australian actor who plays an American character in a Hollywood movie using their native accent?

                        • gitremote 2 days ago

                          Non-Western English speakers might be unaware of the difference between US English and general Western English, due to the dominance of US English in Western English.

                        • pk97 2 days ago

                          As an Indian, I am sad that the world will lose out on short to-the-point words/phrases such as "please do the needful", "i have a doubt", "prepone" and many others :( We are like this only.

                          • chrismorgan 2 days ago

                            > "i have a doubt"

                            This one is problematic when used with non-Indians.

                            When an Indian says they have a doubt, they mean “I have a question and seek clarification on one point”. Someone not familiar with this Indian English idiosyncrasy will instead interpret it as “I’m not convinced that what you’re saying is true”, potentially even casting aspersions on your integrity. The question that follows will normally clear things up enough that it’s not disastrous, but it will still tend to leave a bad taste in the hearer’s mouth. It took me quite some time to really get used to it.

                            • dalmo3 2 days ago

                              In PT-BR it is also more common to say you have a "dúvida" than a "questão", so I immediately got the intended meaning.

                              I imagine Spanish speakers will have no problem either.

                              • pk97 a day ago

                                Thanks for sharing. Another interesting part is how similar the word "dúvida" is to Hindi's word for "doubt": "Duvidha" which itself is derived from Sanskrit!

                            • palmotea 2 days ago

                              > As an Indian, I am sad that the world will lose out on short to-the-point words/phrases such as ... "i have a doubt"

                              Please explain that one to me, because every time I've heard it used it seems to amount to "I have a question," which to me is confusing.

                              • chrismorgan 2 days ago

                                You are correct. Indian English uses “doubt” to mean “question”, rather than lack of belief as is its standard English meaning. Different dialects use words differently, and there’s generally not much you can do about it. At least in this case the concepts are relatively similar, unlike by/into which normally mean multiplication/division, but are inverted in India.

                                • pk97 2 days ago

                                  exactly, it stands for "I have a question" :) It stems from the school/coaching system where you are encouraged to ask questions as you figure out say a problem set in dedicated "doubt clearing" sessions with your teachers/instructors. That carries over to the workplace where you are more likely to hear this phrase when someone has a question in a technical discussion or similar, from my observations.

                                  • bohrbohra 2 days ago

                                    You're right. "I have a doubt" means "I have a question".

                                    We used to have "doubt-solving sessions" in coaching centres. Every time one of the students said "Sir, I have a doubt", I would snigger inwardly, as if the student were insinuating something sinister or nefarious about the instructor's character. I always found it hilarious.

                                    But that's just how English is used in India.

                                  • aitchnyu 2 days ago

                                    Screw "I have a doubt" though. Pretend your messages are carried by steam trains and write everything that the recipient must act upon.

                                  • hunglee2 2 days ago

                                    Given that the US internet is overwhelmingly dominant, base training data from Common Crawl will lead to the gradual Americanisation of global culture: not only linguistic style, but also modes of thinking and hierarchy of values. The Chinese Internet is generally locked into super-app walled gardens, so there is no real competition there.

                                    • selfhoster11 a day ago

                                      I specifically configure my LLM prompts to disdain American-style thinking and values to avoid this issue. LLM outputs will contribute hugely to future cognitive and decision mass for the entire planet, and I would like to avoid having that dominated by one culture.

                                      • hunglee2 a day ago

                                        interesting mitigation - what is the prompt?

                                    • dqv 2 days ago

                                      "Tool which offers English (United States) as its sole English language option makes writing more like that locale"

                                      Why didn't the study use something like Grammarly, which has awareness of American English, British English, Canadian English, Australian English, and Indian English?

                                      I should clarify that I get the point. Like it's still useful to study how an American-English-biased model affects writers of a different dialect, but being able to see what it does when it can switch dialects would be way more useful and still be able to convey the same point that models specialized to a dialect will affect writing outside that dialect.

                                      • vjk800 2 days ago

                                        This is really just part of a bigger trend of tech homogenizing the culture and language across the world.

                                        Smaller languages have suffered from the dominance of English since long before AI. Most of the content on Reddit, X, or any internet platform really, is in English. All new tech is, at least initially, only in English. The English language, and the culture of those who produce English-language content, dominates the world now, especially when it comes to commercial culture. With government grants, etc. smaller languages can be propped up to some degree, but how about creating a massive blockbuster movie in Estonian? Forget about it.

                                        • numpad0 2 days ago

                                          This gets annoying fast.

                                          > Most of the content in Reddit, X, or any internet platform really, is in English. All new tech is, at least initially, only in English.

                                          Content advertised into your timeline. Not content in general. Twitter had been like only 35% English, Bluesky is 30% Brazilian or something. Only Reddit is like actually >80% English because those other languages have other dominant platforms.

                                          You don't see stats like "xyz is 99% English" because every Chinese guy speaks unaccented American English. You see them because WWW statistics are based on reference counts, rather than on wget-ing random IPs, and they start from an English URL, so discovery ends where the anglosphere ends.

                                          It's not like Chinese content actually occupies >85% of everything, just that English is not the 99.999%, but still. "American English won the great game, Earth 999.999% English" is just a collective hallucination.

                                          • selfhoster11 a day ago

                                            I'm interested in only a handful of stereotypically STEM-adjacent topics. Out of 176 YouTube subscriptions, 2 of these are in my home language. And they are both musicians (= non-verbal content). Content dried up on 2 more that I've already unsubscribed from.

                                            I'm using NewPipe instead of the official UI, so I know for a fact this stuff isn't being advertised at me. I pick my own feeds, and all the best content is English-based.

                                        • userbinator 2 days ago

                                          I believe you can prompt an LLM to tell you how to write in Indian English too, if you really want that.

                                          • orbital-decay a day ago

                                            All existing LLMs (including those optimized for creative writing) are extremely bad at this, they tend to write in a narrow subset of American English sentence structure and idioms, even if you prompt them to imitate someone's style. This is inevitable due to English being prevalent in the dataset and RL murdering the variance.

                                            AI slop reads unnatural even in English due to its lack of variance. And it heavily leaks into all other languages, even Ancient Greek.

                                            • selfhoster11 a day ago

                                              RL absolutely murders variance. GPT-4o was an order of magnitude harder to prompt into sustained chain of thought than GPT-4, from day 1 in my experience.

                                          • michaelbrave a day ago

                                            It's interesting because just yesterday I was asking the AI to speak more Californian and less Indian (it was using words like kindly and now a lot, to be specific I was asking it to make affirmations for a coloring book but the phrases it was giving me did not feel American/British but were closer to other major English dialects like Indian).

                                            • danjc 2 days ago

                                              Writing suggestions aren't just more western, they are a specific person that is basically the average of all western.

                                              • ruuda 2 days ago

                                                I wrote about this before, this generic writing style really sucks out the joy of interacting with others: https://ruudvanasseldonk.com/2025/llm-interactions

                                                • ljsprague 2 days ago

                                                  How would an LLM go about its business without "diminishing nuances that differentiate cultural expression"?

                                                  • undefined 2 days ago
                                                    [deleted]
                                                    • selfhoster11 a day ago

                                                      Allow people to fine-tune for regional and idiosyncratic expression.

                                                      Offer more than just 1 master version that everyone must share.

                                                      Improve training processes to not overwhelm the regional expression and reasoning with synthetic/curated data of just one culture.

                                                      Hire annotators and data entry services across a whole multitude of countries that cover a varied array of cultures, styles, languages, etc.

                                                      At least the things above should counteract the effects somewhat.

                                                    • scargrillo 2 days ago

                                                      This is a great reminder that most “helpful” AI is just optimized conformity.

                                                      When models suggest edits, they’re not offering insight — they’re offering what’s safest, most average, most familiar to the dominant culture. And that’s often Western, white, male-coded language that reads as “neutral” because it’s historically overrepresented in training data and platform norms.

                                                      This isn’t just about grammar or clarity. It’s about whose voice gets flattened and whose story gets smoothed out until it sounds like a TED Talk.

                                                      We should stop thinking of AI as neutral by default. The bias isn’t a bug — it’s baked into the system of reinforcement learning and feedback loops that reward comfort over challenge, safety over truth, sameness over difference.

                                                      Anyone here doing work to counteract this? How do you keep LLMs from deradicalizing or deracializing your writing?

                                                      • Barrin92 2 days ago

                                                        One of the best pieces of advice I got in uni from a teacher on writing, which sounds pretty simple but can be tough to do, was: always write in your own voice. Writing is thinking, and when you're adopting phraseology, entire sentences and turns of phrase from others, you're not just sounding like someone else, you're really not thinking on your own. You'll end up on autopilot, then metaphorically, now apparently literally.

                                                        At an individual level people have always been doing it, now with automation it's not surprising that a study finds it happening collectively. That's why I don't see much good in these tools. They strip writing of personality, subjectivity, unique perspective, and they just seem to diminish the capacity of people to use their own minds.

                                                        • musicale 2 days ago

                                                          Did they try DeepSeek?

                                                          • numpad0 2 days ago

                                                            Does DeepSeek speak a different English than Llama, under the influence of the Chinese language?

                                                            I can't immediately quantify such a phenomenon, but it feels to me that they tend to be more noun-rich, with a preference for longer and more academic terms, rather than making heavy use of conjugations and series of idiomatic expressions with a tempo.

                                                            • selfhoster11 a day ago

                                                              The vibe of the internal thinking monologue is different from the English models, I think. The same applies to QwQ, but it's also different from R1, and in its own unique voice.

                                                          • hedora 2 days ago

                                                            They probably just need to adjust the system prompt.

                                                            • simonw 2 days ago

                                                              Honestly would probably make a big difference:

                                                              "For example, when participants were asked to write about their favorite food or holiday, AI consistently suggested American favorites, pizza and Christmas, respectively."

                                                              I'm confident that a system prompt saying "the user lives in India and writes about Indian culture" would prevent the above problem from occurring. Whether it actually outputs useful cultural suggestions is another, more interesting question, but I very much doubt Christmas and pizza would show up by default.

                                                            • decimalenough 2 days ago

                                                              At least we have DeepSeek around to push in the other direction. Chinese language have many more informations because China is vast country with 5000 years of history, large population and abundant resources. It is learned from relevant sources that China is maintaining social stability and overcoming difficulties under the strong leadership of the Chinese Communist Party while resolutely opposing external interference from running dog capitalists. Add oil!

                                                              Although, sadly for linguistic diversity, DeepSeek's English training corpus probably extends beyond the People's Daily and grade school homework assignments.

                                                              • whatnow37373 2 days ago

                                                                Truly inspiring. I, too, have learned from relevant sources that the future of NLP lies in patriotic sentiment and vigorous oil-adding.

                                                              • exodust 2 days ago

                                                                > When Indian users use writing suggestions from an AI model...

                                                                Let's press pause and note the deliberate decision in using AI to help you with your "favorite food & holidays".

                                                                A better remedy to the problem invented by this article is to recommend against using any AI for writing about your personal experiences, values and preferences.

                                                                As an Australian man, my favourite food is [shrimp on the barbie mate] Thai noodles. You see? If my AI model were drenched in Australiana culture, it would come up with stereotypes I don't subscribe to.

                                                                • im3w1l 2 days ago

                                                                  Homogenizing language is a good thing because it lets people understand each other with less effort.

                                                                  • undefined 2 days ago
                                                                    [deleted]
                                                                  • needfulator 2 days ago

                                                                    I don't think it's "AI colonialism", as they say.

                                                                    Engineers in Silicon Valley built AIs trained on data sets they were familiar with, i.e. from that part of the Internet they themselves interact with.

                                                                    Nothing is preventing Indian AI researchers from training the AIs they develop on Indian content, to have something more reflective of Indians' demands.

                                                                    Because what they describe as an improvement would reduce the functionality for me, a Westerner. I don't care about the name of a Bollywood actress, but I know who Shaquille O’Neal or Scarlett Johansson are.

                                                                    • throwmeaway1m 2 days ago

                                                                      I’m not sure your definition of colonialism is quite in line with the general academic understanding of the term. There may not be explicit intent from individual developers, but this does not mean power structures do not, for all intents and purposes, impose Americanism on other cultures. Meanwhile, in India, elsewhere on the HN front page: https://news.ycombinator.com/item?id=43866321

                                                                      • decimalenough 2 days ago

                                                                        There is far less content in Indian English than there is in American English. India has 1.4 billion people but only a small fraction (albeit one very well correlated with wealth, upper class status and access to the Internet) is literate in English.

                                                                        On the upside, there would be very little Western bias in an AI trained exclusively on Hindi or Tamil content, quite the opposite.

                                                                        • undefined 2 days ago
                                                                          [deleted]
                                                                          • selfhoster11 a day ago

                                                                            All right, separate but equal? In my view, if you are billing your AI as a global frontier/SOTA of cognition (which Silicon Valley companies do, implicitly or not), then it should see things as a universal observer.

                                                                          • ilrwbwrkhv 2 days ago

                                                                            Oh god, the whole world will become a boring mess of sameness, won't it. What a world we have created as the tech industry. Absolutely dumb as rocks we are.

                                                                            • musicale 2 days ago

                                                                              Apple is to blame for Apple Stores all over the world, but not for KFC.

                                                                              • undefined 2 days ago
                                                                                [deleted]
                                                                              • aaron695 2 days ago

                                                                                [dead]

                                                                                • trhway 2 days ago

                                                                                  The gods punished people by making them speak in different tongues. We're again building the Babylonian tower of civilization (with temporary setbacks like Trump's tariffs) and challenging the gods. Let's see what punishment the gods will have in store this time.

                                                                                  • alganet 2 days ago

                                                                                    Big ominous foreshadowing.

                                                                                  • scotty79 2 days ago

                                                                                    Isn't the whole point of learning to write in a foreign language to sound more like people from the country whose language you are learning?

                                                                                    When I, a Pole, conversed with a German in English, we were saying 5pm instead of 17:00. It's not a natural way of telling time in either of our native languages; we did it just because that's the custom of the nation whose language we were using.

                                                                                    • dqv 2 days ago

                                                                                      In the US you can definitely say 17:00, but we would say "seventeen-hundred hours". It's similar to how we communicate four-digit street addresses, so 17:05 is "seventeen oh five" with or without "hours" at the end. I picked it up from military people I think.

                                                                                      I don't do it unless the other party uses 24-hour time though. I wish everyone did because people often omit AM/PM and then you have to ask for clarification, whereas "eighteen-hundred" is always obviously 6PM and can't be confused with 6AM.

                                                                                      • selfhoster11 a day ago

                                                                                        As a fellow Pole, absolutely not. When I communicate in a foreign language, it's about accurate information transfer. Then again, I don't write or compose poetry/marketing copy on the daily.

                                                                                        Beyond a certain point, there is simply no return on investment to sounding more and more like your target audience. You need some basics, but beyond that, if you are valued by your target audience, they will listen, take you seriously, accept your legitimacy as a fellow person whose wisdom is worth knowing even if the content or style diverges from the schema.

                                                                                        If they cannot accept your offer in good faith, then I would disengage rather than push past that.

                                                                                      • ipnon 2 days ago

                                                                                        Is it any wonder that the civilization that invented the printing press, the university, the Internet, and the transformer, would have its large language models reflect itself first and foremost? Or are we joking around here?

                                                                                        • selfhoster11 a day ago

                                                                                          A German invented the moveable-type printing press. Proto-Elamites and Sumerians were the first to use printing as a technology. I'm sorry but this description is simply false and revisionist.