• mise_en_place 2 hours ago

    It's closer to Dutch or German. I had no issue understanding transit signage in Amsterdam. The difficult words are calques and idioms that don't directly translate, e.g. opponthoud. Exit is uitgang, literally an "outgoing". Entrance is ingang, ingoing.

    • geocar an hour ago

      Once upon a time I was walking with a friend around Amsterdam and a man walked up to me and asked me where something was (a nearby restaurant), I answered in English with directions and he thanked me, then a moment later this exchange happened:

      - Wait! You can speak Netherlands?

      - No. Sorry.

      - But you can! I am speaking it!

      - No you're not! You're speaking English!

      He just blinked and walked off, probably assuming I was playing a trick, but my friend asked what that was all about, and had to convince me that she didn't understand anything he was saying, and wouldn't have said anything to me at all about it except that she heard me insist he was speaking English!

      She asked me to try and remember what exactly I was hearing, and I realised I couldn't! I could only remember something like "do you know where [placename that sounded familiar] is?" and maybe that there was a heavy accent, but not exactly how it sounded.

      Every now and then I think of this and I wonder if I actually understand English at all, or if I'm just so used to saying that I understand it (in response to something that sounds something like a question) that I started to believe it.

      • inanutshellus an hour ago

        Very much reminds me of a Kids in the Hall skit from ages ago. Two guys are at a party, one says "hello", the other says -- in perfect English -- "Sorry I don't understand you, I don't speak English. please don't beat me up." The first guy assumes it's a joke and plays along until he gets annoyed at the perfect accent and responses ("no really, I don't understand what you're saying, nor what I'm saying, I've just practiced this conversation a lot. please don't beat me up.")

        Eventually the first guy tires of being mocked by this guy claiming not to speak English and beats the guy up.

        Totally hilarious example of what you experienced taken to an extreme.

        • larkost 16 minutes ago

          When I was a kid my boychoir hosted a choir from Estonia (this was before the fall of the Iron Curtain, so this was a big deal), and I remember when I was trying to help out back-stage at their concert and one of the Estonian boys needed help, but the only thing he could say was "I don't speak English", in a perfect midwestern accent (the Estonian choir director had drilled his entire choir on that one phrase). The cognitive dissonance was hard to deal with.

      • blueyes an hour ago

        And it's closer to Frisian than Dutch. If the Battle of Hastings hadn't happened, these languages would be easily and mutually comprehensible. But that just shows that English descends from a branch of West Germanic. It changed gradually over time, kept a core Germanic vocabulary for its basic terms, and borrowed a lot from other languages as it needed more complex concepts, more so than Modern High German, which allows for easy word invention through combination.

        Creoles are defined as languages emerging from pidgins, which themselves are often simplified forms of speech that allow linguistically separate groups to communicate. English is not that. It has an unbroken line of native speakers whose usage gradually evolved, while maintaining a system of complex syntactic rules as well as a literature.

        • indigo945 an hour ago

          This is very much debatable. Only around one in four English words is Germanic in origin, whereas half of English words are Latinic or Romanic. [1]

          Some properties of English's grammar are also more similar to Romance languages than Germanic ones. For example, whereas Germanic languages only mark tense on verbs, English verbs (like Spanish verbs) are marked with aspect in addition to tense: there is a distinction between "I played tennis"/"I have played tennis"/"I was playing tennis".

          I'm not saying English is a Romance language. But it's a much more interesting problem to figure out why it's not considered a Romance language than you imply.

          [1]: https://en.wikipedia.org/wiki/Latin_influence_in_English

          • enragedcacti 3 minutes ago

            > This is very much debatable. Only around one in four English words is Germanic in origin, whereas half of English words are Latinic or Romanic. [1]

            I think that's complicated by the fact that many english words of latin origin were inherited from a germanic source and thus from an intelligibility standpoint the overlap between english and dutch or german is partially additive with latin. e.g. [1]

            Unfortunately the source is dead, but at one point a study ranked english/german as 60% lexical similarity compared to english/french at 27% [2]. Anecdotally, I think dutch and german are easier for an english speaker because of comparatively more overlap in simple words. Heavy use of compound words in dutch/german also gives an english listener a better chance of recognizing part of the word and inferring the meaning from context.

            [1] https://en.wikipedia.org/wiki/List_of_English_Latinates_of_G...

            [2] https://en.wikipedia.org/wiki/Lexical_similarity

            • geocar 17 minutes ago

              > Germanic languages only mark tense on verbs, English verbs (like Spanish verbs) are marked with aspect in addition to tense:

              What exactly do you mean by "aspect"?

              > there is a distinction between "I played tennis"/"I have played tennis"/"I was playing tennis".

              Sure:

              -- (I) played (Tennis) ~~ jugué

              -- (When I) played (Tennis) ~~ jugaba

              -- (When I've) played (Tennis) ~~ jugado

              English (like German) might say they need auxiliary words to explain what's going on, and Spanish people might say they're just conjugating the verb differently, but I think this is just because things are written down;

              Because, if I say any of these sentences in Spanish or English rapidly enough, someone who does not know the language will not know where each word begins and ends, and it may just sound like English "conjugates" at the beginning of words and Spanish "conjugates" at the end. What's the real difference here?

              > But it's a much more interesting problem to figure out why [English is] not considered a Romance language than you imply.

              I think most of the time people talk about the "rules" of a language, we're not really talking about something that will help acquire the language (or be better understood): we're really just yapping about geography and time.

              So what do you think would be different if English was more widely considered a Romance language?

              • munificent 36 minutes ago

                > Only around one in four English words is Germanic in origin, whereas half of English words are Latinic or Romanic.

                Are you measuring by fraction of words in a word list, or fraction of words in a typical utterance? Word usage is highly non-uniform, so you'll get very different ratios between the former and latter.
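
The gap between the two ways of counting can be made concrete with a toy calculation. This is only a sketch: the sentence and the hand-labelled `ORIGIN` table are made up for the example (they are not corpus data), and `origin_shares` is a hypothetical helper name.

```python
from collections import Counter

# Hand-labelled origins for a toy sentence (an assumption for illustration,
# not data from any corpus or from the Wikipedia article linked upthread).
ORIGIN = {
    "the": "germanic", "of": "germanic", "word": "germanic",
    "is": "germanic", "in": "germanic", "and": "germanic",
    "origin": "latinate", "language": "latinate", "vocabulary": "latinate",
}

def origin_shares(words):
    """Return {origin: share of the given words} for a sequence of words."""
    counts = Counter(ORIGIN[w] for w in words)
    total = sum(counts.values())
    return {origin: n / total for origin, n in counts.items()}

text = "the origin of the word is in the language and the vocabulary of the language"
tokens = text.split()        # every occurrence counts (a "typical utterance")
types = sorted(set(tokens))  # each distinct word counts once (a "word list")

by_type = origin_shares(types)
by_token = origin_shares(tokens)
print("by type: ", by_type)
print("by token:", by_token)
```

In this toy sentence the Latinate share is 3 of 9 distinct words but only 4 of 15 running words, because "the" and "of" repeat. A real measurement would need an etymological dictionary and a real corpus.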

                • patall 32 minutes ago

                  Which 'aspect' is supposed to be missing in Germanic language verbs? Because I can say those three example forms (aspect tenses?) in both Swedish and German without any limitation.

                  • secretmark 35 minutes ago

                    The most frequently used words are almost all Germanic. There are a lot of technical words with Latin or Greek etymologies that are hardly used in quotidian conversation.

                  • cml123 2 hours ago

                    incidentally, outgang is technically a modern English word, but I've never heard it spoken in real life. You can see literary uses of its predecessor "utgang" in Old English here [0]

                    [0]https://bosworthtoller.com/search?q=utgang

                    • kragen 35 minutes ago

                      the question is not really whether english is germanic (it's clearly germanic! nobody disagrees!) but whether it's a creole

                      • BtM909 2 hours ago

                        opponthoud -> oponthoud

                        Although, because I'm looking at it, I get semantic satiation.

                      • wrp 8 hours ago

                        You can argue either way, because "creole" is too vaguely defined. The term was invented to cover a set of languages that had interesting historical and structural similarities, but it turns out to be hard to establish the boundaries of the set. I think it's unfair to the researchers to politicize it as some "post-colonial" writers have been doing.

                        • teleforce 8 hours ago

                          Is English just badly pronounced French [video]:

                          https://news.ycombinator.com/item?id=40495393

                          • cryptonector an hour ago

                            French is just badly pronounced Latin/Italian.

                            Seriously, French is just a very crassified Romance language. As a near-native French speaker I was shocked one time in Paris at a restaurant where our server was from Italy and pronounced every letter in every French word and I still understood him.

                            Dropping 's's (and leaving behind a circumflex to remind one of the dropped 's')? That's a pretty crass evolution (but see the note at the bottom). Dropping trailing letters in words? Same thing. I understand that "oc" is really the Latin "hoc", meaning "this", and that "oui" is just an evolution -a shortening- of "hoc hic" ("this that"). "Oi" (pronounced "wah" in English) is just a vowel shift.

                            French is often treated as a high-class language, so I say 'crass' mainly to remind people that the evolution of French was really the result of everyday people not treating it like a dead language to be preserved.

                            • LeonB 7 hours ago

                              Here is a different video by the same person (the `robWords` channel)

                              “How to translate French words WITHOUT KNOWING FRENCH (3 clever tricks)”

                              https://m.youtube.com/watch?v=3BGaA3PC9tQ

                              I learned a lot from this and continue to find words that these lessons apply to.

                              • bryanrasmussen 8 hours ago
                                • xanderlewis 7 hours ago

                                  *spelt.

                                  • ndileas 4 minutes ago

                                    personally I prefer wheat.

                                    • philshem 2 hours ago
                                      • whatshisface an hour ago

                                        Long after the sword was smelt

                                        In foreign lands did wielder welt.

                                        The script upon the scabbard belt

                                        Thereupon became misspelt.

                                        • 73kl4453dz 39 minutes ago

                                          A missing n?

                                  • kreyenborgi 6 hours ago

                                    "that strategy of borrowing a word but giving it a slightly narrower meaning" – I think this happens with most borrowing, into any language. E.g. "opsjon" in Norwegian is from "option" but only means stock/legal options, never used in the common meaning of "choices". (And the video's own example with "les peoples".)

                                    But my favorite example is when going into English from Norse, "fjord" has a quite narrow meaning in English, but must have been much broader in Norse when you look at Limfjorden in flat Denmark (more of a sound than a fjord) and Tunhovdfjorden (even Norwegians would think of this as a lake these days)

                                    • agumonkey 6 hours ago

                                      this kind of crippled borrowing seems universal though, you only complement your own language where you have a hole

                                    • agumonkey 6 hours ago

                                      The bit about non rhyming poetry forms makes me wonder how brit-pop would have sounded without those "french" structures.

                                      • mauvehaus 37 minutes ago

                                        Old English poetry tends to have commonality at the beginnings of words (e.g. alliteration) rather than rhyming at the ends of words. Or so I remember from The History of English Podcast.

                                        It's now 180 episodes in and talking about Shakespeare. Old English was some episodes back, as you might imagine.

                                        As with many others posting here IANALinguist.

                                      • theodric 8 hours ago

                                        With 40%+ Germanic vocabulary (per Braudel), is French rather a Germanic creole?

                                        • bonzini 8 hours ago

                                          Is that the vocabulary of Germanic origin and imported from Germanic, or is it just the part of PIE that has not diverged between Latin and Germanic languages?

                                          • biorach 7 hours ago

                                            Mostly the former. Much of what is now France was dominated by speakers of Germanic languages after the fall of the Roman empire. They were eventually assimilated but left a linguistic legacy.

                                      • 39896880 a day ago

                                        Creole is both a sociohistoric and linguistic term. The fact that the linguistic attributes are difficult to put boundaries on is extremely common for linguists: we won’t even claim to tell you what the definition of “word” is!

                                        As for whether English is a Creole, it’s important to understand the motivation for Creolistics at its origin. It started, and continues, as an effort to legitimize a set of languages that before were considered illegitimate, uninteresting and utterly lacking in features worth studying. Part of that reputation is inextricably bound up in colonialism and racism.

                                        Could you squint and call English a Creole? Sure. But you’d be doing the same disservice to “Creole” as you would by saying “technically we’re all [in the US] African American because all humans originated in Africa.” It’s a disingenuous point, and one that could easily be mistaken for trying to reverse the legitimization efforts that brought the term into existence to begin with.

                                        • BurningFrog 2 hours ago

                                          > It started, and continues, as an effort to legitimize a set of languages

                                          OK, but this is not a scientific definition.

                                          Nothing wrong with such words, but trying to logically reason about them is not going to be fruitful.

                                          • 39896880 an hour ago

                                            The subject of the quote is Creolistics, not the definition of Creole. If you are looking for scientific definitions of linguistic categories you are going to be very disappointed. Language involves humans. Language is messy.

                                            As well, any question that involves “meaning” must itself answer the question “to whom?” I answered from the perspective of (some) linguists, but as another user pointed out, non-linguists might very well and with no ill intention consider it an appropriate term.

                                          • nkrisc 7 hours ago

                                            My personal definition of “word” is “a unit of semantic meaning”. Don’t ask me to define that.

                                            Are numbers words? Are abbreviations words? Can a word contain other words? Can the same sentence have a different number of words when spoken versus written? Yes to all, and more.

                                            • throwup238 9 hours ago

                                              > we won’t even claim to tell you what the definition of “word” is!

                                              What's your best attempt? :-)

                                              • Xophmeister 9 hours ago

                                                It's a bit like the old saying, "All models are wrong, some models are useful." The concept of a "word" is useful in everyday language, particularly in English -- its host language, unsurprisingly enough -- and (probably) other Indo-European languages. However, in a more precise context, it breaks down because of edge cases.

                                                For example, "words" in agglutinative languages[1] (e.g., Turkish) act very differently from "words" in English. It's hard (impossible?) to capture all that variety in a pithy way. "A string of morphemes" might work, but that's hardly a satisfactory definition!

                                                Maybe a good analogy for the HN crowd would be like asking, "How many characters are in a string?"

                                                [1]: https://en.wikipedia.org/wiki/Agglutinative_language
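
For anyone who has not run into the string question, a minimal Python sketch of it follows. The sample text is an arbitrary choice; the only point is that "length" depends on which unit you decide to count.

```python
import unicodedata

# "é" written as a base letter plus a combining acute accent (U+0301).
decomposed = "e\u0301"
# The same text after NFC normalization: one precomposed code point (U+00E9).
composed = unicodedata.normalize("NFC", decomposed)

# len() on a str counts code points, not what a reader sees as characters.
code_points = (len(decomposed), len(composed))
# Encoding to UTF-8 gives yet another answer, in bytes.
utf8_bytes = (len(decomposed.encode("utf-8")), len(composed.encode("utf-8")))

print(code_points)  # (2, 1)
print(utf8_bytes)   # (3, 2)
```

Both strings render as the same single letter, yet they report two different lengths in code points and two more in bytes.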

                                                • 3np 2 hours ago

                                                  (IANALinguist).

                                                  I think it's obvious that there might not be one unifying definition that spans all human languages. "Word" is an English word referring to a specific of the English language. A class in JS is not the same as class in C++, big deal? ;)

                                                  As for English, I'm happy defining it through written language and spacing. "Can't", "unmarried", and "walkie-talkie" all one word each.

                                                  We might as well think of "word" in foreign enough languages as separate concepts. Doesn't seem meaningful to try to fit fundamentally different structures into the same conceptual molds. Which I guess ties back to your original point regarding if it's a useful exercise taxonomizing English as creole.

                                                  • DiogenesKynikos 8 hours ago

                                                    In Chinese, it's very difficult to define word boundaries.

                                                    Each character is a syllable, with a particular pronunciation and a constellation of meanings, usually closely related to one another. There are very common combinations of characters that appear together, which one could define as words. However, often, you could just as easily view the individual characters as words, or the combination as a word.

                                                    In some cases, the combination of two characters means something totally different from what the characters alone would mean (e.g., 东西, where the characters literally mean "East-West," but the combination means "thing"), so the combination is clearly a word. But sometimes, the meaning of the combination is basically a combination of the words' meanings (e.g., 吃饭, where the characters literally mean "eat-food," and the combination means "to eat, have a meal").

                                                    Because written Chinese doesn't use spaces, I guess it doesn't really matter what one defines as a word. The issue just doesn't come up, practically speaking.

                                                    • tsimionescu 8 hours ago

                                                      I should note that typically discussions of language are more fruitful around spoken (or signed) languages rather than writing. Writing is an artificial formal system, and as such sometimes has aspects which are much more sociologically determined than spoken language, which tends to evolve more freely.

                                                      Also, the problems you raise here are mostly just as applicable to English, though perhaps for somewhat fewer words. Is a "walkie-talkie" one word, or two? How about "unmarried"? The un- prefix has a distinct meaning on its own, even if it never appears alone, after all. Or how about "can't"? Technically it's a contraction of "can not", and those words do sometimes appear separately as well, even in this same meaning.

                                                      • DiogenesKynikos 3 hours ago

                                                        The issues I raise are somewhat applicable to English, but to a far lesser extent.

                                                        In Chinese, just about every syllable has its own set of meanings. In English, there are compound words, and some words have prefixes or suffixes, but you can't just arbitrarily break a word into its syllables and assign a meaning to every syllable. Imagine if the word "syllable" could be broken down as syl-la-ble, and every person who spoke English could tell you what "syl," "la" and "ble" individually meant. That's the situation in Chinese, for almost every polysyllabic word you can utter. They can almost all be decomposed into syllables that have individual meanings. It's a very different paradigm from English.

                                                        • tsimionescu 2 hours ago

                                                          Yes, I understand there is a huge difference of the degree to which this applies to the language. I was just pointing out that none of these things should be alien to an English speaker, and even a linguist who only knew English would have had similar struggles to define "word" because of these problems (though of course they could have decided to file them under "exceptions", which doesn't work for Chinese).

                                                    • mr_toad 7 hours ago

                                                      Sometimes the boundaries between words in English are clear, and sometimes they are a bit blurry.

                                                    • nivertech 7 hours ago

                                                      There is no generally accepted definition in linguistics, but AI researchers have come to the consensus that a word is one or more LLM tokens ;)

                                                      I’m not a linguist, but I would define a word as a part of a sentence composed out of one or more syllables, with word boundaries either implicitly or explicitly specified by different methods in different languages, e.g. by using pauses, longer or shorter phonemes, by using accents, rhythm, or intonation, or simply by remembering words as part of learning a lexical vocabulary.

                                                      A word is something that can be categorized as to which part of speech it belongs (noun, verb, adjective, adverb, etc.)

                                                      Depending on the language, it’s not always clear whether prefixes and/or suffixes are part of the word or separate words.

                                                      Similarly with compound words - do they count as a single or multiple words?

                                                      A short sentence in one language may enter another language as a single opaque word.

                                                      • throw__away7391 7 hours ago

                                                        It depends on the architecture of your CPU.

                                                        • flir 9 hours ago

                                                          \s\w+\s

                                                          ;)

                                                          • koito17 8 hours ago

                                                              cljs.user> (re-seq #"\s\w+\s" "やり直して")
                                                              nil
                                                            
                                                            Joking aside, is it common to use regular expressions? Seems like the method only works for languages with spaces. I think a more sophisticated lexer may be necessary, but are there non-regex, "fast approximations" that work across most languages? This is a problem that I have not tried solving before.
                                                            • nine_k 7 hours ago

                                                              すみません! It's because you have no space around. A more correct regexp would be \b\w+\b, with zero-width "word boundary" patterns instead of spaces.

                                                              • Liquid_Fire 2 hours ago

                                                                That's just passing the problem onto how you define \b. Since Japanese uses no spaces, it would match entire phrases or sentences as "words", treating only punctuation as word boundaries.
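
The same behaviour can be reproduced in Python, where `\w` is Unicode-aware by default for `str` patterns. This is a sketch using Python's `re` module rather than the ClojureScript REPL shown above, and the second sample string is made up for the example.

```python
import re

phrase = "やり直して"
two_phrases = "やり直して。もう一度"

# Whitespace-delimited pattern: the input has no spaces, so nothing matches.
print(re.findall(r"\s\w+\s", phrase))       # []

# Word-boundary pattern: the whole phrase comes back as a single "word".
print(re.findall(r"\b\w+\b", phrase))       # ['やり直して']

# Only the punctuation mark 。 splits the text into two matches.
print(re.findall(r"\b\w+\b", two_phrases))  # ['やり直して', 'もう一度']
```

Neither pattern finds the boundaries a Japanese reader would recognise inside the phrase; that takes a dictionary or a trained segmenter rather than a character-class rule.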

                                                            • eesmith 9 hours ago

                                                              "Ain't ain't a word 'cause it don't match the regex."

                                                              BTW, should be "\b\w+\b". The \b is a zero-width match for the start or end of a word. Your pattern requires a space before and after:

                                                                >>> import re
                                                                >>> re.compile(r"\b\w+\b").findall("What's the problem?")
                                                                ['What', 's', 'the', 'problem']
                                                                >>> re.compile(r"\s\w+\s").findall("What's the problem?")
                                                                [' the ']
                                                              • flir 9 hours ago

                                                                That makes sense. I was trying to translate "A word is a group of letters with whitespace on either side" into regex.

                                                                I was being facetious, obviously, but you've already identified a serious problem with that definition (is "what's" two words or one?)
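
One way to keep contractions in one piece, as a rough sketch: allow an apostrophe inside a run of word characters. The pattern and the name `WORD` are illustrative assumptions, not a standard definition of a word, and a leading apostrophe (as in 'cause) is still dropped.

```python
import re

# A run of word characters, optionally continued by apostrophe + more word characters.
WORD = re.compile(r"\w+(?:'\w+)*")

print(WORD.findall("What's the problem?"))
# ["What's", 'the', 'problem']

print(WORD.findall("Ain't ain't a word 'cause it don't match the regex."))
# ["Ain't", "ain't", 'a', 'word', 'cause', 'it', "don't", 'match', 'the', 'regex']
```

This answers "one word" for "what's" by fiat, which is the point made above: the regex encodes a decision about what a word is, it does not discover one.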

                                                          • ycombinete 9 hours ago

                                                            It's more about how people define the term. I don't think I've seen the position made disingenuously.

                                                            Most people I've spoken to think of creole as a mixed language that becomes its own language. To them that describes English and how it came to be.

                                                            Even if you require colonisation to be part of it the position can stand. The Anglo-Saxon's colonisation by the Romans and the Normans are a big part of how the English mixture was formed.

                                                            If your definition requires an indigenous, non-European language being modified by contact with a European coloniser's language, then sure, English isn't a creole language. But I don't think that's how most people use the term creole colloquially.

                                                            • defrost 8 hours ago

                                                              You're going to have to explain how fifth century Germanic settlers in Britain were colonised by the Romans who arrived in Britain four centuries earlier and were largely a spent force by the time the Germans rolled in ...

                                                              • ycombinete 3 hours ago

                                                                Thanks, I'd have a hard time doing that. I got the history reversed.

                                                                Dang, and it's too late for me to add an edit now as well.

                                                                I think the broad point still stands, in spite of that error.

                                                              • thaumasiotes 9 hours ago

                                                                > The Anglo-Saxon's colonisation by the Romans

                                                                That never happened; it was the other way around.

                                                                • ithkuil 8 hours ago

                                                                  Germanic conquerors may be first partially reverse-colonized by the culture they conquered (which was heavily romanized, with a Celtic substrate) and then later further colonized by Norman conquerors who were themselves carriers of the remnants of the Roman cultural heritage.

                                                                  Germanic people (franks) conquered Gaul and you wouldn't call modern french a Germanic language.

                                                                  Linguistic dynamics are utterly fascinating and complex

                                                                  • ycombinete 3 hours ago

                                                                    I've been listening to The Rest is History podcast lately, and a lot of this happened in Islam via converts.

                                                                    Tom Holland was saying that many now-fundamental Islamic practices were imported into the faith via converts. For example, praying 5 times a day was apparently a Zoroastrian practice.

                                                              • k__ 9 hours ago

                                                                Why not just define a new word?

                                                                I mean you're linguists! :D

                                                                Pseudo-Creole, a language that's technically a Creole, but doesn't fit the sociohistoric context.

                                                                I think, it would be funny if a word for giving specific languages legitimacy is used to define a language that had more legitimacy to begin with.

                                                                • ithkuil 8 hours ago

                                                                  Post-creole

                                                                  I expect that a creole language will evolve over centuries into something that is less and less considered as a creole when compared with more recent creole languages.

                                                                  I think it's more useful to consider "creolization events" in the history of a language rather than a blanket "creole/non-creole" attribute

                                                                  • 082349872349872 7 hours ago

                                                                    Do pseudocreoles include concreoles, eg https://en.wikipedia.org/wiki/Belter_Creole ?

                                                                    Oye Hakalowda! Kowmang showxa lang belta hiya ke?

                                                                  • pessimizer 2 hours ago

                                                                    I don't understand this comment. It makes absolutely no claims about the definition of a creole, and it makes no claims about how English doesn't conform to that definition. It just talks about "squinting" at languages, mentions black Americans for no particular reason, and accuses anyone who would argue that English is a creole of being "disingenuous" for trying to reverse the "legitimization."

                                                                    It's like the perfect troll comment. Makes no argument, implies anyone who would disagree is probably racist, and uses black Americans as a comparison for no particular reason.*

                                                                    Here's my opinion: English is the result of a simplification of grammar caused by Old Norse and Old English crashing into each other with the same words but different grammar; then the French forced French usage in commerce, so most people started using a ton of French words with this English grammar. That last part sounds exactly like a creole.

                                                                    Is your argument that considering English a creole lets English people off the hook for something? Are things that happen to non-white people somehow delegitimized when they are also noted in white people? Are white people so special and unique that everything else has to be defined by its non-whiteness? If I'm missing your point, what was it?

                                                                    [*] As a black American, I'm starting to recognize this as a long-standing feature of western rhetoric. A lot of modern argument seems to boil down to which side is more like black Americans. Humorously, actual black Americans are never considered to be like metaphorical black Americans; we're actually spoiled whiners with a sense of entitlement.

                                                                    • imbnwa 2 hours ago

                                                                      >Here's my opinion: English is the result of a simplification of grammar caused by Old Norse and Old English crashing into each other with the same words but different grammar; then the French forced French usage in commerce, so most people started using a ton of French words with this English grammar. That last part sounds exactly like a creole.

                                                                      Grandparent's point is that you're stretching the meaning of the term to a level of generality that applies to everything, therefore meaning nothing in particular.

                                                                      Your point applies to French just as much as English: Gaulish Celtic speakers poorly adopt Latin after getting bull rushed by Romans, then mix in Frankish terms when the Romans' former allies take over; or how about Sicilian: a mash-up of the languages of Latin, Greek, Germanic, and Arabic conquerors.

                                                                      You have to go pretty far to find a language which isn't a fusion of other languages that results from different peoples entering into extended contact.

                                                                      'Creole' is a word that only appears with its current meaning and history in the last 100 or so years; grandparent is simply preserving that particularity. Nothing more.

                                                                      No one in 1200s Europe thought English or Sicilian were languages not worth engaging with as actual languages owing to modern racial connotations.

                                                                    • thaumasiotes 9 hours ago

                                                                      > The fact that the linguistic attributes are difficult to put boundaries on is extremely common for linguists: we won’t even claim to tell you what the definition of “word” is!

                                                                      The definition of a "word" is always straightforward: a word is an atomic unit of language.

                                                                      However, which units are or aren't atomic varies according to what it is you're measuring.

                                                                      Lexically, "catch fire" is an atomic entity, which cannot be understood as the sum of its parts. It's just one part, and it needs its own dictionary entry, separate from "catch" and from "fire".

                                                                      Syntactically, "catch fire" is definitely not atomic, because the past tense is "caught fire". From this perspective, it's enough to know "catch" and "fire".

                                                                      Syntactically again, we can see that "an elephant" is in variation with "two elephants" / "my elephant" / "every elephant" / etc., and it's clear that "an elephant" is not atomic, but is understood as the composition of "a(n)" with "elephant".

                                                                      Phonologically, as the citation-form spelling above hinted, "an elephant" is atomic; the article cannot exist independently and must attach to another word. Without knowing what that word is, you won't know how the article is pronounced.

                                                                      Specialized terms for both of these types of phenomena exist - lexical words that are too large to be syntactic words are called idioms; syntactic words that are too small to be phonological words are called clitics. But the general lesson is that, despite the definition of "word" being clear, membership in the category varies according to what aspect of the language you're looking at.

                                                                      • tsimionescu 7 hours ago

                                                                        This doesn't even begin to cover things, even for English. First of all, "catch fire" is at least partly understandable from its constituent parts - "catch" has a great variety of related meanings, and they all have to do with something taking hold of something else; I'm sure any English speaker who has encountered both words would intuit the meaning of "catch fire" without any problem, especially if they also encountered "to catch a cold". Of course, the meaning is slightly different, and it is quite invariant.

                                                                        Your analysis of the phonetic atomicity is also unsatisfactory. First, the article can very well be pronounced independently - I can say "English has a single indefinite article, with two forms: 'a' or 'an'". Secondly, the 'a' form can be pronounced in two different ways, depending on how you want to highlight it within a sentence: "he ate a piece" could use the schwa, or the "long a" if you want to highlight the article itself "he ate [ay] piece, not your piece". So the article's pronunciation can change independently of the word it is applying to. Finally, in at least some English accents, many words can be pronounced differently in certain sequences than others - for example, in modern Southern English, an "r" sound is introduced almost always in speech when a word that ends in a vowel sound is followed by another word that starts with a vowel sound, e.g. "I saw-R-it". By your description, neither "saw" nor "it" are individual words phonologically, since you don't know how they will be pronounced unless we know the following word.

                                                                        Overall, the atomicity of a linguistic construct is highly debatable, even in a particular context.

                                                                        • thaumasiotes 7 hours ago

                                                                          > First of all, "catch fire" is at least partly understandable from its constituent parts - "catch" has a great variety of related meanings, and they all have to do with something taking hold of something else

                                                                          If you want to analyze it that way, you'll find that the semantics are the reverse of what you predict: when you catch fire, it's the fire that takes hold of you.

                                                                          > First, the article can very well be pronounced independently - I can say "English has a single indefinite article, with two forms: 'a' or 'an'".

                                                                          This argument is predicated on forgetting the difference between use and mention. What part of speech would you say an is in that sentence? Is it an article?

                                                                          > Secondly, the 'a' form can be pronounced in two different ways, depending on how you want to highlight it within a sentence

                                                                          Yes, problems arise when you need to place sentence-level stress on a feature that is too weak to bear stress. The same problem occurs for any English clitic, including 's, which in the general case doesn't even include a vowel. Most notably here, there's nothing about this specific to a before consonants; the rules for placing stress don't know what word you're following a with. If you need to stress an, the usual choice is /æ/. But also notably, when native speakers do this, they recognize it as a problem - it's just one they may not be able to work around.

                                                                          > Finally, in at least some English accents, many words can be pronounced differently in certain sequences than others - for example, in modern Southern English, an "r" sound is introduced almost always in speech when a word that ends in a vowel sound is followed by another word that starts with a vowel sound, e.g. "I saw-R-it". By your description, neither "saw" nor "it" are individual words phonologically

                                                                          This is not a word-level phenomenon in any way; intrusive R also occurs between syllables of a single word, as long as there's an appropriate vowel-vowel sequence. Placing one between saw and it would not normally be viewed as altering the pronunciation of either word (Which one do you think is altered? I guess by nonrhotic standards it would have to be it), but as the application of a general rule.

                                                                          Placing /n/ between a and elephant is not the application of a general rule, it's the application of a rule specific to a.

                                                                          > Overall, the atomicity of a linguistic construct is highly debatable, even in a particular context.

                                                                          You're saying that people argue over which items count as words, not that they argue over what it means to count as a word.

                                                                          • tsimionescu 5 hours ago

                                                                            > If you want to analyze it that way, you'll find that the semantics are the reverse of what you predict: when you catch fire, it's the fire that takes hold of you.

                                                                            That is one of the meanings of catch, just like when you catch a cold, the cold takes hold of you, or when you catch your foot on something, that thing took hold of your foot.

                                                                            > This argument is predicated on forgetting the difference between use and mention. What part of speech would you say an is in that sentence? Is it an article?

                                                                            Fair enough, though I would still argue that being able to make a noun out of the article in this way relies on them having a stable, recognizable, individual pronunciation.

                                                                            > This is not a word-level phenomenon in any way; intrusive R also occurs between syllables of a single word, as long as there's an appropriate vowel-vowel sequence.

                                                                            Well, we are trying to define what a "word" even is, so you can't bring this distinction in. A priori, "saw it" could be a word, just as my whole comment could be a single word. We are trying to come up with a formal definition of what it means to be a word; if we want "an elephant" to be a single word and "saw-r-it" to be two words, we need to come up with a distinction between these that doesn't presuppose that "saw" and "it" are separate words.

                                                                            > Placing one between saw and it would not normally be viewed as altering the pronunciation of either word (Which one do you think is altered? I guess by nonrhotic standards it would have to be it), but as the application of a general rule.

                                                                            Depending on the exact accent, not all words follow this rule. In certain accents, at least, it is quite specific to words that have an 'r' in their spelling (well, to words that historically had an r sound that was lost, and is usually preserved in the spelling), so "four o'clock" would get a linking R, but "saw it" would not. So at least in these cases, by your definition, we'd have to say that "four" is not an individual word, phonetically speaking. Also note that linking/intrusive R doesn't appear inside morphemes, normally, only between morphemes, or between morphemes and suffixes. So you get [Kafka-r-esque] in certain accents, but never inside, say, "dais".

                                                                            > Placing /n/ between a and elephant is not the application of a general rule, it's the application of a rule specific to a.

                                                                            This could also have been a general rule, that happens to apply to a single word in modern English. Regardless, as I have mentioned before, if you want to define "phonological word" as a unit whose exact pronunciation is only knowable when you have all parts present, then lots of phrases are phonological words in English, unless you add a lot of exceptions to your definition.

                                                                            > You're saying that people argue over which items count as words, not that they argue over what it means to count as a word.

                                                                            These are not that different in practice. If you have a good formal definition, then you should be able to say for any object whether it is a word, a part of a word, or a sequence of words. If you can't do that, you don't really have a definition. The definition you gave basically equates "word" with "atomic", but then if "atomic" is not well defined, or even definable, then we're back to square one of not knowing what "word" actually means.

                                                                        • barrucadu 8 hours ago

                                                                          By god you've done it, you've solved linguistics!

                                                                          • gvx 8 hours ago

                                                                            Relevant xkcd: https://xkcd.com/793/

                                                                        • sparsely 7 hours ago

                                                                          There's a good comment on the blog:

                                                                          "This seems to me a rather feckless argument over how people want to define a term. Indisputably, modern English results from a blend of French and old English, in the process of which what we today call English absorbed a lot of French vocabulary, lost most of its inflection, and its verb conjugation was greatly simplified. Whether you want to call it a creole or not is pointless, what I just stated is still true."

                                                                          • ChrisMarshallNY 8 hours ago

                                                                            The SMBC from a couple of days ago sort of spoke to that: https://www.smbc-comics.com/comic/arthur

                                                                            • talideon 8 hours ago

                                                                              That's the cartoon the Language Log post is commenting on. It's even included at the top of the entry.

                                                                              • ChrisMarshallNY 7 hours ago

                                                                                Yeah, I figured that (after the fact). I figured that it wouldn't be harmful to post it. I am deeply apologetic if it caused trauma. That was not the intent.

                                                                                I read the headline on my iPad, just after I woke up. Just got back from my run, and I'll read the article after my shower.

                                                                                I figured some helpful soul would do an RTFA, if that was the case, and was not disappointed.