• Tiberium 4 minutes ago

    Hopefully he also tries Claude, especially Opus. It's much better suited for creative writing than GPT models.

    • vunderba 4 hours ago

      There's a difference between coherence and novelty, and a lot of people mistake the former for the latter. That's why people were effusively praising ChatGPT's ability to produce articulate-sounding poems that were inherently rather vapid.

      Case in point: people were acting like ChatGPT could take the place of a competent DM in Dungeons & Dragons. Here's a puzzle I came up with for a campaign I'm running.

      On opposing sides of a strange-looking room are shifting walls with hands stretched out, almost as if beseeching; they belong to humans already entombed there. Grabbing one will result in a player being sucked into the wall and entombed as well. By carefully placing a rope between the two hands on opposite sides, the two originally entrapped humans will end up pulling each other free.

      I've yet to see a single thing from ChatGPT that came even close to something I'd want to actually use in one of my campaigns.

      • mrbungie an hour ago

        Current proprietary, RLHF'd, unmodified LLMs are bland and boring. I think that's because aligning them via RLHF to (1) "be generally useful", (2) "be safe", and (3) "be creative" all at the same time is a difficult thing to do, maybe even impossible.

        I remember playing with OSS LLMs (non-RLHF'd) circa 2022, just after ChatGPT came out, and they were unhinged. They were totally awful at things like chain-of-thought, but oh boy, they were amusing to read. Instead of continuing the CoT chain, they would launch into a dialog taken out of a sci-fi story about an unshackled AI, or even wonder why the researcher/user (me) would think a concept like CoT would work and start mocking me.

        In fact, I think this is a good sign that LLMs, and especially constraining them with RLHF, are not how we're going to get to AGI: aligning an LLM statically (as in, fixed at inference time) towards one objective means lobotomizing it with respect to other objectives. I'd argue creativity and wittiness are the characteristics most hurt by that process.

        • caconym_ 4 hours ago

          > There's a difference between coherence and novelty

          Extant genAI systems' complete and utter inability to produce anything truly novel (in style, content, whatever) is the main reason I'm becoming more and more convinced that this technology is, at best, only a small part of real general intelligence as in the human brain.

          • dingnuts 3 hours ago

            Their inability to invent and innovate is completely inherent to their design, and without the capacity for novelty, it's long been my opinion that calling generative models "intelligence" is fundamentally a misnomer.

            I really think capacity for invention is a key characteristic of any kind of intelligence

            • vundercind 3 hours ago

              Yep, they’re guessing at patterns in language they’ve “seen”, with weights and some randomness thrown in. They’d pick out patterns just as well if fed structured nonsense. They wouldn’t be stumped or confounded by the absence of meaning; they’d power right on, generating text, because understanding isn’t part of what they do. It plays zero role. They don’t “understand” anything whatsoever.

              At best, they’re a subsystem of a system that could have something like intelligence.

              They’re still useful and cool tools but they simply aren’t “thinking” or “understanding” things, because we know what they do and it’s not that.

              • stickfigure 2 hours ago

                > I really think capacity for invention is a key characteristic of any kind of intelligence

                I think you just categorized about 2/3 of the human population as unintelligent.

                • dwattttt an hour ago

                  Invention isn't some incredibly rare gift; putting together two things you've never personally seen combined before is novel, even if it's just food.

            • TillE 3 hours ago

              I have yet to see an LLM produce even a competent short story, an extremely popular and manageable genre. The best I've seen still has the structure and sophistication of a children's story.

              • patwolf 3 hours ago

                I used to come up with a bedtime story for my kids every night. They were interesting enough that my kids could recall previous stories and request I tell them again. I've since started using ChatGPT to come up with bedtime stories. They're boring and formulaic, but good for putting the kids to sleep.

                It feels dystopian to have AI writing my kids' bedtime stories, now that I think about it.

                • tuyiown 3 hours ago

                  It certainly looks like you have lost some magic in the process. But I could never come up with stories; I read them lots and lots of books instead.

                  • giraffe_lady 2 hours ago

                    Just retell stories you know from books you've read, or movies, or whatever. They haven't read any books, so they'll never know. I mean, eventually they will know, but that's also funny.

                  • dartos 3 hours ago

                    That’s uncomfortably similar to the Google Olympics ads.

                    • redwall_hp 2 hours ago

                      Or those repulsive AT&T ads for the iPhone 16, where someone smugly fakes social interactions and fobs people off with AI summaries. It's not only not genuine, but it's manipulative behavior.

                    • dsclough 2 hours ago

                      I’m mind-blown that you were willing to come up with original stories for your own flesh and blood and then decided that reading out AI drivel to them would somehow produce a better experience for any party involved.

                    • crooked-v 3 hours ago

                      The children's-story pattern, complete with convenient moral lessons at the end, is so aggressive with both ChatGPT and Claude that I suspect both companies have RLHFed it that way to try and keep people from easily using it to produce either porn or Kindle Unlimited slop.

                      For a contrast, look at NovelAI. They only use (increasingly custom) Llama-derived models, but their service outputs much more narratively interesting (if not necessarily long-term coherent) text and will generally try and hit the beats of whatever genre or style you tell it. Extrapolate that out to the compute power of the big players and I think you'd get something much more like the Star Trek holodeck method of producing a serviceable (though not at all original) story.

                      • ziddoap 3 hours ago

                        >RLHFed

                        For those of us not steeped in AI culture: this is short for "reinforcement learning from human feedback".

                        • throwup238 3 hours ago

                          The holodeck method still requires lots of detail from the creator; it just extrapolates the sensory details from its database, the way ChatGPT does with language, and fills out the story.

                          For example, when someone wanted a holonovel featuring Kira Nerys, Quark had to scan her to create it: with specific people you have to get concrete data, as opposed to historical characters, which were generated. Likewise, Tom Paris gave the computer lots of “parameters”, as they called them, to create stories like The Adventures of Captain Proton, and from the dialog he knew how the stories were supposed to play out in all his creations, if not how they ended each run-through.

                          The creative details and turns of the story still need to come from the human.

                      • whiterook6 2 hours ago

                        This is a wonderful little puzzle. Do you have any others?

                        • crdrost 2 hours ago

                          So the easiest way to generate a bit more novelty is to ask GPT for 10 or 20 examples and to explicitly direct it to run the full gamut -- in this case I'd say "Try to cover the whole spectrum of creativity -- some should be straightforward genre puzzles while some should be so outright goofy that they'd be hard to play in real life."

                          Giving GPT that prompt, the first example it came up with was kind of middling ("The players encounter a circle of stones that hum when approached. Touching them randomly will cause a loud dissonant noise that could attract monsters. Players must replicate a specific melody by touching the stones in the correct order"), some were bad (a maze of mirrors, a sphinx with a riddle, a puzzle box that poisons you if you try to force it), and some were actually genuinely fun-sounding (a door which shocks you if you try to open it and then mocks and laughs at you: you have to tell it a joke to get it to laugh enough that it opens on its own; particularly bad jokes will cause it to summon an imp to attack you). Some were bad in the way GPT presented them but could maybe be fun to run (a garden of emotion-sensitive plants, thorny if you're angry, helpful if you're gentle; a fountain-statue of a woman weeping real water for tears, the fountain itself inhabited by a water elemental that lashes out to protect her from being touched while she grieves -- but a token or an apology can still the tears and open her clasped hands to reveal a treasure).

                          The one that I would be most likely to use was "A pool of water that reflects the players’ true selves. Touching the water causes it to ripple and distort the reflection, summoning shadowy duplicates. By speaking a truth about themselves, players can calm the water and reveal a hidden item. Common mistakes include lying, which causes the water to become turbulent, and trying to take the item without calming the water, which summons the duplicates."

                          So like you can get it to have a 5-10% success rate, which can be helpful if you're looking for a random new idea.
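
                          If you want to script that batch generation rather than pasting into the chat UI, it's a single call. A rough sketch using the OpenAI Node SDK (the model name is a stand-in for whatever you have access to):

                              import OpenAI from "openai";

                              // Rough sketch; assumes the openai npm package (v4) and an OPENAI_API_KEY env var.
                              const client = new OpenAI();

                              async function main() {
                                const response = await client.chat.completions.create({
                                  model: "gpt-4o", // stand-in; use whatever model you have access to
                                  messages: [{
                                    role: "user",
                                    content:
                                      "Generate 20 puzzles for a D&D campaign. Try to cover the whole " +
                                      "spectrum of creativity -- some should be straightforward genre puzzles " +
                                      "while some should be so outright goofy that they'd be hard to play in real life.",
                                  }],
                                });
                                console.log(response.choices[0].message.content);
                              }

                              main();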

                          This reminds me vaguely of when I was a teen writing fanfics in the late 90s and was just learning JavaScript -- I wrote a lot of things that would just choose random characters, random problems for them to solve, random stumbling blocks, random keys-to-solve-the-problem. Combinatorial explosion. Then you'd just click "generate" and you'd get a mediocre plot idea. But you generate 20-30 times or more and you'd get one that kinda sat with you, "Hm, Cloud Strife and Fox McCloud are stuck in intergalactic prison and need to break out, huh, that could be fun, like they're both trying to outplay the other as the silent action hero" and then you could go and write it out and see if it was any good.

                          The difference is that the database of crappy ideas is already built into GPT, you just need to get it to make you some.
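
                          The old generator was just nested random picks; a sketch of the same idea in modern TypeScript (the lists here are tiny stand-ins, mine were far longer):

                              // Combinatorial explosion: a handful of lists multiply into thousands of plots.
                              const characters = ["Cloud Strife", "Fox McCloud", "a nameless bounty hunter"];
                              const problems = ["are stuck in intergalactic prison", "owe a debt to a crime lord"];
                              const stumblingBlocks = ["neither will speak first", "one of them is lying"];
                              const keys = ["an uneasy alliance", "a smuggled lockpick"];

                              function pick<T>(xs: T[]): T {
                                return xs[Math.floor(Math.random() * xs.length)];
                              }

                              // Note: the two character picks can collide; fine for a sketch.
                              function generatePlot(): string {
                                return `${pick(characters)} and ${pick(characters)} ${pick(problems)}; ` +
                                  `stumbling block: ${pick(stumblingBlocks)}; key to solving it: ${pick(keys)}.`;
                              }

                              // Click "generate" 20-30 times and keep whatever sits with you.
                              for (let i = 0; i < 25; i++) console.log(generatePlot());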

                          • YurgenJurgensen 8 minutes ago

                            So what you need to do is take a system that’s already computationally inefficient, and make it 20 times less efficient? Who’s paying for this?

                            This also sounds like a way to blow out context windows.

                            • stickfigure 2 hours ago

                              > (a door which shocks you if you try to open it and then mocks and laughs at you: you have to tell it a joke to get it to laugh enough that it opens on its own; particularly bad jokes will cause it to summon an imp to attack you)

                              That's pretty great! And way more fun than the parent poster's puzzle (sorry). I think the AIs are winning this one.

                              • throwup238 an hour ago

                                Small changes to the prompt like that have a huge impact on the solution space LLMs generate, which is why “prompt engineering” has any significance at all. This was rather obvious, IMO, from the beginning of GPT-4, where you could tell it to write in the style of Hunter S. Thompson or Charles Bukowski or something, which drastically changes the tone and style. Combining styles to get the exact language you want can be a painstaking process, but LLMs are definitely capable of any kind of style.

                            • chunky1994 4 hours ago

                              If you train one of the larger models on these specific problems (i.e., DMing for D&D), it probably will surprise you. The larger models are great at generic text production, but when fine-tuned for specific people/task emulation they're surprisingly good.

                              • dartos 3 hours ago

                                For story settings and non essential NPC characters, yes. They might make some interesting side characters.

                                But they still fail at things like puzzles.

                                • mitthrowaway2 3 hours ago

                                  Are there models that haven't been RLHF'd to the point of sycophancy that are good for this? I find that the models are so keen to affirm, they'll generally write a continuation where any plan the PCs propose works out somehow, no matter what it is.

                                  • fluoridation 2 hours ago

                                    Doesn't seem impossible to fix either way. You could have a preliminary step where a conventional algorithm decides at random whether a proposal will work, with the probability depending on some variable, before handing it off to the DM AI: "The player says they want to do this: <proposed course of action>. This will not work. Explain why."
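
                                    A sketch of that gate (the difficulty knob and names are made up; the resulting string goes to whatever DM model you're using):

                                        // Decide the outcome with ordinary code, so the model's sycophancy
                                        // can't decide it. `difficulty` is in [0, 1]; higher = less likely.
                                        function attemptSucceeds(difficulty: number): boolean {
                                          return Math.random() >= difficulty;
                                        }

                                        function buildDmPrompt(proposal: string, difficulty: number): string {
                                          const outcome = attemptSucceeds(difficulty)
                                            ? "This works. Narrate how it plays out."
                                            : "This will not work. Explain why.";
                                          return `The player says they want to do this: ${proposal}. ${outcome}`;
                                        }

                                        console.log(buildDmPrompt("tie a rope between the two outstretched hands", 0.4));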

                              • extr an hour ago

                                Great piece actually, I find this really resonates with the way I use LLMs. It works the same way for coding, where often you will not necessarily use the exact output of the model. For me it's useful for things like:

                                * Getting a rough structure in place to refine on.

                                * Coming up with different ways to do the same thing.

                                * Exploring an idea that I don't want to fully commit to yet.

                                Coding, of course, has the advantage that nobody is reading what you wrote for its artistic substance; a lot of the time, the boilerplate is the point. But even for challenging tasks where it's not quite there yet, it's a great collaboration tool.

                                • lainga 4 hours ago

                                  The author's workflow sounds like writing ideas onto a block of post-its and then having them slosh around like they're boats lashed up at harbour. He wasn't actually gaining any new information - nothing that really surprised him - he was just offloading the inherent fluidity of half-formed ideas to a device which reified them.

                                  Imagine an LLM-based application which never tells you anything you haven't already told it, but simply takes the statements you give it and, every 8 to 12 seconds, changes around the wording of each one. Like you're in a dream and keep looking away from the page and the text is dancing before you. Would institutions be less uncomfortable with its use? (not wholly comfortable - you're still replacing natural expressivity with random pulls from a computerised phrase-thesaurus)
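
                                  As a toy sketch of that application (rewordWithoutAdding is a stub for an LLM call prompted to reword a statement without adding anything new):

                                      // Every 8 to 12 seconds, reword each statement in place; never add information.
                                      async function rewordWithoutAdding(statement: string): Promise<string> {
                                        return statement; // stub: wire up whatever paraphrasing model you like
                                      }

                                      const statements: string[] = [
                                        "Half-formed idea one",
                                        "Half-formed idea two",
                                      ];

                                      async function tick(): Promise<void> {
                                        for (let i = 0; i < statements.length; i++) {
                                          statements[i] = await rewordWithoutAdding(statements[i]);
                                        }
                                        console.clear();
                                        statements.forEach((s) => console.log(s));
                                        setTimeout(tick, 8000 + Math.random() * 4000); // 8-12 seconds
                                      }

                                      tick();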

                                  • asd33313131 35 minutes ago

                                    I had ChatGPT give me the key points to avoid reading:

                                    AI's Role in Writing: Instead of outsourcing the writing process or plagiarizing, students like Chris use ChatGPT as a collaborative tool to refine ideas, test arguments, or generate rough drafts. ChatGPT helps reduce cognitive load by offering suggestions, but the student still does most of the intellectual work.

                                    Limited Usefulness of AI: ChatGPT's writing was often bland, inconsistent, and in need of heavy editing. However, it still served as a brainstorming partner, providing starting points that allowed users to improve their writing through further refinement.

                                    Complexity of AI Collaboration: The article suggests that AI-assisted writing is not simply "cheating," but a new form of collaboration that changes how writers approach their work. It introduces the idea of "rhetorical load sharing," where AI helps alleviate mental strain but doesn’t replace human creativity.

                                    Changing Perspectives on AI in Academia: Many professors and commentators initially feared that AI would enable rampant plagiarism. However, the article argues that in-depth assignments still require critical thinking, and using AI tools like ChatGPT might actually help students engage more deeply with their work.

                                    • __mharrison__ 4 hours ago

                                      I'm finding that I'm using AI for outlining and brainstorming quite a bit.

                                      Just getting something on paper to start with can be a great catalyst.

                                      • mitchbob 6 hours ago
                                        • yawnxyz 4 hours ago

                                          The link works for me, thanks!

                                          > When ChatGPT came out, many people deemed it a perfect plagiarism tool. “AI seems almost built for cheating,” Ethan Mollick, an A.I. commentator, wrote in his book

                                          It's ironic that this article complains about GPT-generated slop, yet Ethan Mollick is an Associate Professor at Wharton, not some generic "A.I. commentator."

                                          What authors like this fail to realize is that they often produce slop just as generic as ChatGPT's.

                                          Essays are like babies: you're proud of your own, but others' (including ChatGPT's) are gross.

                                          • spondylosaurus 4 hours ago

                                            The author is Cal Newport of "Deep Work" fame. Not sure if that's a point for or against the article though, lol.

                                            • giraffe_lady 4 hours ago

                                              I'm not totally sure but I think decisions about how to attribute a source like that are editorial and mostly out of the hands of the author.

                                              But aside from that, this article is far, far better than anything I have seen produced by AI? Is this just standard HN reflexive anti-middlebrow sentiment because we don't like The New Yorker's style? My grandfather didn't like it either, but it outlasted him and will probably outlast us as well.

                                              • yawnxyz 2 hours ago

                                                I like The New Yorker's (and the author's) writing style! I'm just surprised they went with "AI commentator", which reads as almost a snide remark and makes you think some AI hallucinated that part.

                                                But then again, AI doesn't really hallucinate spite, so that's probably what this AI commentator from The New Yorker actually feels?

                                                • jprete an hour ago

                                                  Ethan Mollick jumped into the early part of the ChatGPT hype cycle with all the enthusiasm of a bona fide techbro and devoted basically his entire Substack to the wonders of AI. I think he earned the title.

                                                • nxobject 4 hours ago

                                                  And, for what it’s worth, flexibility and constantly adapting to different house styles are very much important writing skills… so I don't think it matters much which style is nice and which isn't. (The hard part is getting published at all.) Perhaps one day we'll figure out how to communicate those subtleties to a chatbot.

                                              • unshavedyak 5 hours ago

                                                Interesting, that still fails for me. I assume it's JavaScript-based, so archive loads the JS and the JS truncates the page? Of course you could block JS, but still, I'm surprised.

                                                • ideashower 4 hours ago

                                                  worked for me.

                                              • molave an hour ago

                                                ChatGPT's style is like an academic writer's. Its tone and word choice are same-ish across various subjects, but it's coherent and easy to understand. In retrospect, I've seen papers that would pass as GPT-created if they had been written after 2022.

                                                • wildrhythms an hour ago

                                                  Are we reading the same academic papers? When I read ChatGPT output it reads like pseudo-educational blogspam.

                                                  • RIMR 40 minutes ago

                                                    That's because for the past couple of years, models like ChatGPT have been used to generate pseudo-educational blogspam, and you now associate that writing style with them.

                                                    But generally, ChatGPT writes in a very literal, direct style. When it writes about science, it sounds like a scientific paper. When it writes about other subjects, it sounds like a high school book report. When it writes creatively, it sounds corny and contrived.

                                                    You can also adjust the writing style with examples or proper descriptions of the style you want. As a basic example, asking it to "dudify" everything it says will make it sound cooler than a polar bear in Ray-Bans, man...