Please don’t use a vibe-coded app for anything important.
I use Claude. I like Claude. But I’ve backed away from having Claude actually write my code other than in the most limited circumstances.
I caught it copying one of my TS Interfaces, for example. And modifying, then using, the copy. So my type-checks pass, yay! But wait what?
It wrote a test for a tricky bit of code. The test wouldn’t pass. So it re-wrote it in a way that couldn’t possibly fail, mocking all elements inside the test itself.
I’m not anti-AI. But I wouldn’t trust anything vibe-coded above the importance of, say, Wordle.
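To make the "test that couldn't possibly fail" pattern concrete, here is a minimal sketch. Everything in it is hypothetical (`applyDiscount`, the broken variant, and both test helpers are invented for illustration, not taken from the code discussed above). It contrasts a test that only asserts on its own mock with one that exercises the real function.

```typescript
// Hypothetical function under test: percentage discount, clamped to 0..100.
function applyDiscount(price: number, percent: number): number {
  const clamped = Math.min(100, Math.max(0, percent));
  return Math.round(price * (100 - clamped)) / 100;
}

// A deliberately broken variant that ignores the discount entirely.
function brokenApplyDiscount(price: number, _percent: number): number {
  return price;
}

type Discount = (price: number, percent: number) => number;

// Anti-pattern: the implementation passed in is never called. The test
// asserts on a mock it defined itself, so it passes for any implementation.
function tautologicalTest(_impl: Discount): boolean {
  const mock: Discount = () => 170;
  return mock(200, 15) === 170;
}

// Behavioural test: calls the real implementation, including edge cases.
function behaviouralTest(impl: Discount): boolean {
  return impl(200, 15) === 170 && impl(200, 150) === 0 && impl(200, -5) === 200;
}
```

Here `tautologicalTest(brokenApplyDiscount)` still returns `true`, while `behaviouralTest(brokenApplyDiscount)` returns `false`. A quick review check for generated tests is whether they would fail against a deliberately broken implementation.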
Copilot is also WAY too eager to pass tests. It will write tests to pass a completely broken function.
We have aggressively high coverage requirements at work, but also really terrible tooling/support for good tests, so everybody uses copilot. The result is a paper-thin test suite that only tests the implementation and none of the intended behavior. It’s so clunky and the tests are so mock-ridden that the only way to make sense of them… is by using copilot!! Thus it continues ;-;
I always despised how much MS tries to lock you into their toolset.
> so everybody uses copilot ... the only way to make sense of them… is by using copilot
I see nothing has changed in the age of AI.
Review and good rules are still critical. Current agent state is still hyperspeed junior engineer. Providing examples helps a lot when scaffolding something similar to something else.
You need to develop an intuition for it, the same kind of intuition you developed for the systems we work with. For example, your test issue: start by writing tests before making the implementation. LLMs are quite capable but they are not AGI. If your pipeline is good they can produce solid results.
This article is about teaching coding agents to use InstantDB, which is "a modern Firebase".
I suggest jumping straight to this document, which is designed to tell the agent how to work with Instant but is pretty great documentation for humans who want to understand what it can do at the same time: https://www.instantdb.com/mcp-tutorial/claude-rules.md
Thank you for the kind words on the rules/documentation! It was definitely an iterative process to figure out how to get good results.
We have an llms.txt and an llms-full.txt (~9k lines) which contain all our documentation. Feeding these to Claude didn't get great results; it was just too much information.
We manually compressed our llms-full.txt into a rules file (~1.5k lines) which declared the API upfront and provided snippets of how to do different things with callouts to common examples. This condensed version did better but would cause Claude to make subtle mistakes.
Looking at the kind of mistakes Claude made, it seemed like a human could make those mistakes too (very useful feedback for us to improve our API). We thought, “what's one of the smallest fully contained examples we can make that packs a bunch of info on how to use Instant?” That would probably be useful for both a human and an agent. And indeed it seemed to be the case.
> Looking at the kind of mistakes Claude made, it seemed like a human could make those mistakes too (very useful feedback for us to improve our API).
This is something we've found for our API -- just having LLMs attempt to use it helps us identify things that we haven't documented well or placed enough emphasis on (for things that are critical but are non-obvious or may be drowned out by other less important information). Improvements that help the LLM tend to be good for developers too.
Yes. Fun fact, Instant got the `create` method because of how many times LLMs hallucinated it.
Life imitates art: what you’re describing there is basically _The Secret_ (i.e. if I wish hard enough for something then eventually it will come true), except it’s LLMs that get wish-fulfilment, not us.
Huh, you've reframed The Secret for me - now I see it being, if you wish hard enough for something then eventually the Universe will make it so, just to shut you up.
This made me chuckle, you're right. In a sense it comes back to us, as LLMs are trained on our intuitions.
I built an app (HN Clone, of course) with Instant's MCP hooked up to Claude Code.
The experience was brilliant.
Pros:
+ Fast
+ Easy
+ "Vibe coding on steroids" basically
+ The sense of 'wow' that comes very rarely with new tech
Cons:
- It used Instant as the database/backend, but I wasn't sure what it had done / how exactly it worked and had to spend a bunch of time asking Claude + reading the code to get it. It seemed reasonable, but if I were doing a prod system vs a PoC, this is where the time would be spent. ("Vibe coding lets you create tech debt 10x faster")
Net-net: This is the way for prototyping / validating. This is probably the way for production systems in N months too once the toolchain + agents get better.
Would you mind sharing the code, as well as prompts if you're comfortable? I'm trying to sample anecdata to help re-baseline my intuition on these things.
> as well as prompts if you're comfortable?
This made me wonder: can I share Claude Code's conversation history? Turns out Claude stores them.
So I made a full-stack "snippet" app with Claude and Instant. You can:
1. Upload jsonl files
2. Share them in a nice UI
(Going meta) here's the first conversation I had with Claude in order to build it:
https://claude-code-viewer.vercel.app/view/c4ca91ac-9624-40f...
After I deployed, I asked it to fix the tool use UI:
https://claude-code-viewer.vercel.app/view/faf9b2cc-c3cf-4d0...
I used Instant's auth to gate uploads. Views are public, but limited only to the snippets you know (i.e., have links for).
If you want to upload your own conversations:
1. They live in ~/.claude. Head on over and grab a file
2. Go to https://claude-code-viewer.vercel.app and sign up
3. Start uploading : )
Some notes:
* Be careful when sharing log files. Claude can include secrets in there. Some hackers may notice an adminToken in the convo. I rotated it before we pushed.
* It was fun to see Claude use the query language. It thought we had a `$startsWith` modifier. Right now we only have $like. But `$startsWith` is a great idea, we may just implement it real quick!
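On the first note about secrets in log files: a rough sketch of a pre-upload scrub might look like the following. The key names and the regular expression are assumptions chosen for illustration, not a complete list of what a Claude Code log can contain, so treat this as best-effort only. Rotating the token, as described above, remains the reliable fix.

```typescript
// Matches `key: value`, `"key":"value"`, and the backslash-escaped form that
// appears when JSON is nested inside a JSON string. The key names are a
// hypothetical starting list; extend them for your own logs.
const SECRET_PATTERN =
  /\b(adminToken|apiKey|secret|password)(\\?["']?\s*[:=]\s*\\?["']?)([A-Za-z0-9_-]{8,})/gi;

// Replaces the value part of any matched key/value pair on one line.
function redactLine(line: string): string {
  return line.replace(SECRET_PATTERN, "$1$2[REDACTED]");
}

// Applies the redaction to every line of a JSONL document.
function redactJsonl(text: string): string {
  return text.split("\n").map(redactLine).join("\n");
}
```

Running `redactJsonl` over the file contents before uploading leaves lines without a matching key untouched and swaps matched values for `[REDACTED]`.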
> It thought we had a `$startsWith` modifier. Right now we only have $like. But `$startsWith` is a great idea, we may just implement it real quick!
Haha, that's great. Turns out that "hallucinations" are just things that make sense in context, and that can translate to feature requests from our agents :)
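Until something like `$startsWith` exists, a prefix match can be approximated with `$like`. The sketch below is a hypothetical helper, not part of Instant's API. It assumes `$like` takes SQL-style patterns where `%` matches any run of characters, and the `snippets` namespace and `title` attribute are made up for the example.

```typescript
type LikeFilter = { $like: string };

// Hypothetical helper: turns a literal prefix into a `$like` pattern.
// Escaping of LIKE wildcards is not handled here, so such prefixes are rejected.
function startsWith(prefix: string): LikeFilter {
  if (/[%_]/.test(prefix)) {
    throw new Error("prefix contains a LIKE wildcard (% or _), which this sketch does not escape");
  }
  return { $like: `${prefix}%` };
}

// Example query object with assumed namespace and attribute names.
const snippetQuery = {
  snippets: { $: { where: { title: startsWith("claude") } } },
};
```

The resulting object carries `{ $like: "claude%" }` in its `where` clause, which is the shape an agent would need to emit in place of the hallucinated modifier.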
Claude code now has an /export command for this use case. You can run it from within a session.
TIL, thank you!
I don't have the prompts, but here you go:
- https://the-inference.vercel.app
- https://github.com/jamestamplin/instant-test
If they get better. At the moment the progress is in the toolchains, because progress in the LLMs themselves is slowing down due to the lack of training data.
Have you tried Convex?
> Traditionally, end-users were non-technical and would be stuck with whatever the application developer gave them. But now every user has an LLM too.
Interesting point.
I keep coming back to the idea that users could request changes, and they could be experimentally deployed immediately.
Thank you. There was a lot about extensions that was a bit out of scope for the essay, which I would love to go deeper on in later writing.
Some open questions I had as I thought through extensions:
We talked about the data abstraction side: when you expose data, it's easier for end-users to build extensions. But there are questions on UIs and data modeling.
UIs: How cool would it be if agents could "enter" into applications and change the UI? In one sense this is hard, but at least a demo feels in reach. What if an app exposed the UI components that it was built out of? This would let the agent remix them.
Data modeling: Exposing data works, but what if users want to store extra information? Maybe each user could spin up their own separate "extra" database.
Fun software, but the only issue with Instant is their pricing. Once they gain adoption, I expect them to significantly raise their rates; I can see them charging over $1 per GB easily. And like with any vendor lock-in, you’re stuck paying whatever they decide to charge. Proceed with caution, I'd say.
> vendor lock-in
For what it's worth, Instant is fully open source. The UI, the sync engine, and the multi-tenant database live here:
Vendor lock-in from vibe-coded apps is going to be brutal. It's an all-out turf war.
But hey, rewriting the plethora of vibe-coded long tail* apps might be a major source of employment in the future.
* small but loyal and profitable userbases
Why lock-in? The interface is just a conversation and the non-programmer won't know whether it's InstantDB or whatever else in the background, as that's the whole point of vibe coding. I can only see issues taking out your data into another system, but even that can be vibe coded (can it?)
This is a massive business opportunity for whoever owns the market.
I have a friend who owns a small/medium sized marketing firm. They typically manage social media and advertising for local businesses (butchers, plumbers, NPOs, etc.). A major cost center for them is dev. They can generally handle developing assets (images, videos, text copy) and publishing them (Facebook, YouTube, Instagram) but if they need any kind of interactivity (even basic forms or CRM-like stuff) they used to hire programmers.
This friend is now "vibe coding" the simple interactivity that previously they had to outsource. In the last few months he has pitched, won and crucially delivered simple apps for a few clients. We're not talking complex web apps, it's mostly CRUD forms and basic workflows, the kind you see people go on about using n8n on Twitter. He's talking to me these days about React, Tailwind, DNS and all of that stuff.
His clients don't know, or care, how he delivers. The local butcher doesn't know about "best practices" or whatever. He just cares that if someone signs up for his newsletter that he gets a notification and that person gets his weekly meat deals email.
His firm is picking up more and more complex projects like these and saving a huge amount on costs. Turn-key services that enable guys like him are going to reap the rewards.
It has been really interesting to see how non-technical people use agents. My girlfriend shipped a full-stack app on the App Store [1]. She was only familiar with basic HTML. Now she's building out an inspiration tracker that has file uploads, weekly todos, search, categories.
There are a lot more ideas, and people who would love to put in effort to give to the world, than there are expert programmers to build them.
Congrats to your girlfriend! Actually shipping their own app is a feat that many programmers never achieve, instead spending their entire lives working on projects for others.
I'm going to jump on this to think aloud about the unlock this ability gives the world for customized apps. My neighbor is a landscaper and he is constantly complaining to me about invoicing software. He has gone through 10+ apps trying to find one that fits his particular set of requirements. He was telling me recently that he spent several phone calls with a developer who had shipped an iOS app that was close to what he needed trying to explain what he wanted. He knows I am a programmer and is always hinting that I should develop an app that would meet his requirements.
But I know better. Invoicing/scheduling software is really difficult, especially to appeal to everyone. Each small business has so many tiny requirements that are specific to their business and their personality. You can't just have one piece of software that appeals to everyone, that meets all of the requirements, without it becoming bloated and complicated. And if I built to his particular requirements, I would have exactly 1 customer, which isn't sustainable as a business (I mean, he wants to pay ~20/month).
But now we have a world where that kind of highly customized software will be possible. As more and more LLM-ready building blocks emerge, custom software may become the norm rather than the exception.
> As more and more LLM-ready building blocks emerge, custom software may become the norm rather than the exception.
Heck yeah. This would be a very cool world to live in.
I'm super into this idea. Programmers (us) used to be gatekeepers... and that's less true now. It means my salary goes down, but also means that the world is going to build more, cooler stuff quicker.
> the world is going to build more, cooler stuff quicker.
Absolutely
> It means my salary goes down
We may find ourselves surprised. It's true that some part of our skills will no longer be valuable, but I wouldn't be surprised if other parts became 10x more valuable.
That's great, and kudos to your friend.
Just two things:
- Wouldn't his firm be better served by website builders like WordPress, Squarespace, Wix, etc.? These services have enabled millions of less technical people to create and publish websites for decades now. Most of them support a large ecosystem of plugins and 3rd-party tools that make adding interactivity such as forms and CRMs a breeze.
I mean, it's great that your friend is enjoying getting into web development, and that LLMs are helping him, but I reckon he would be much more productive and deliver more value to his customers by using one of the established services on the market. Unless the projects require some bespoke solutions, or mobile apps, but it doesn't sound like it.
- What happens when one of his customers asks for authentication, session management, a comment system, payments, or something non-trivial or sensitive like that? If all requirements are trivial as you say, then a web site builder could handle it, but if they stop being trivial, then he is bound to run into issues.
LLMs will happily generate non-trivial code, but there are high chances that it will contain security issues or bugs that someone inexperienced won't be able to spot and fix.
So what happens then? He will deliver a seemingly working site to his customers with security issues and bugs, and it will only be a matter of time for them to be exploited. It doesn't matter that his customers don't know or care about "best practices". They surely care about a functioning product that doesn't leak or mishandle their customer data. These issues could be mitigated or avoided by hiring an experienced developer.
So I hope that he has the wisdom and humility to determine when a developer is still required and pay for them, instead of relying on the false confidence provided by LLMs. Or he could take the time to actually learn to program and adopt best practices instead of vibe coding, which sounds like he would be interested in doing anyway.
> Wouldn't his firm be better served by website builders like WordPress, Squarespace, Wix, etc
My understanding is the majority of his work is on WordPress. It's worth noting this is a partnership with 100+ clients, 5+ full time employees. They do television commercials, websites, banner ads, social media campaigns, etc. He is a partner at the firm and while he calls himself "non-technical" he does have experience with website design (HTML/CSS) and the administration of WordPress and databases.
To be clear: he was already delivering these kinds of custom solutions to clients using contract programmers. He is well aware of requirements like authentication (in fact, in our last conversation he mentioned a project he was working on that did just that). But previously, the cost of custom work was too high in some cases, since bringing on a contract programmer for certain kinds of projects pushed the budget out of range for the client. Vibe coding is opening up a new avenue for custom built functionality that was previously too expensive.
> I hope that he has the wisdom and humility
I notice this kind of thing frequently. I mean, who is lacking humility here? Someone thinking they have all of the facts, offering advice and "Why don't you just ..." kind of thinking based on assumptions. If you really think you can diagnose issues and offer advice based on the quick comment I made, you should reassess your own humility before recommending it to others.
> Vibe coding is opening up a new avenue for custom built functionality that was previously too expensive.
I'm not debating that. What I am arguing for is using these new tools smartly and conservatively, because they have produced and will continue to produce low quality software in the hands of inexperienced developers. It's easy to be misled by their confident tone and the overhyped marketing around them into thinking that they're able to do things they realistically cannot. Those best practices you say that customers don't care about are precisely what help prevent quality issues from impacting them, regardless of the software complexity. Vibe coding throws all of that out the window. It's tempting to cut corners to keep the cost of projects down, but ignoring well established software development practices is not a safe way to do it.
> If you really think you can diagnose issues and offer advice based on the quick comment I made, you should reassess your own humility before recommending it to others.
I'm not offering advice. I'm going by what you said, and voicing a concern that the apparent utility of LLMs has some important caveats. I don't particularly care about your friend's firm nor their customers. What I do care about is that the widespread adoption of vibe coding is doing more harm than good to the software industry and society at large, which will have destructive consequences in the near future.
Instead of engaging with this argument and filling in any details I might be missing, you chose to attack me personally, which says more about you than me.
> Instead of engaging with this argument
What argument? You are expressing vague feelings of concern and stating incorrect assumptions. I can't change how you feel and those feelings are valid. They are certainly motivating your reasoning and leading you to the incorrect assumptions.
You are stating conclusions (e.g. "He will deliver a seemingly working site to his customers with security issues and bugs, and it will only be a matter of time for them to be exploited.") as if you have a crystal ball and then demanding that I defend this figment of your imagination.
> What I am arguing for is for using these new tools smartly and conservatively
You've moved the goalposts here. You said "These issues could be mitigated or avoided by hiring an experienced developer." Now you are backpedaling, suggesting you actually meant to say we should use the tools "smartly and conservatively".
So how about you state yourself clearly: Can non-programmers use these tools "smartly and conservatively"? And if so, why do you assume the friend in question, someone who has been in the business for decades hiring for and delivering software, is incapable of doing so? And if not, provide an actual argument to that effect.
How convenient protecting us from the "harm" of LLMs just happens to align with your own self interest.
I am sure that isn't causing any bias in your perceptions of reality.
not to be rude, but WordPress is already a well known target for a lot of malicious behavior; assuming someone non technical is safely extending it with LLM generated authentication code is something that causes me, an industry professional, a certain amount of alarm
Your comment isn't rude, but it is a bit close to concern trolling (as in, "the action or practice of disingenuously expressing concern about an issue in order to undermine or derail genuine discussion"). "Won't somebody think of the local plumber's website!"
There is an assumption being made here that isn't being made explicit: the only way that malicious behavior can be avoided is by paying a programmer. Is that a valid assumption? Or the less strong: a plugin is less secure if developed by a coding agent when compared to any possible programmer. Is that a valid assumption? Aren't all of the well-known issues in WordPress plugins the fault of programmers?
What I feel in these comments isn't a genuine attempt to engage but rather Fear, Uncertainty and Doubt (FUD) writ large.
Also, for what it is worth, the most recent project he developed was using React, Tailwind and Postgres (which he called "Post ... something?"). It was very work-flowy (user uploads a doc, it goes into a queue for manual review, once approved it is converted and uploaded to Google Docs, an email is sent, etc). I asked him if he had investigated any workflow builders and he said no, he just vibe coded it. It's also worth noting that he is paying for QA, I think that existed already in house for his other projects. Well, actually what he said was "it is currently in testing", so I can't confirm if it is professional QA.
> There is an assumption being made here that isn't being made explicit: the only way that malicious behavior can be avoided is by paying a programmer. Is that a valid assumption?
As far as anyone knows: yes. Why would that surprise you? The "only way" architecture can be certified hurricane-proof is by "paying" an engineering agency. That's why such professions were developed.
I see you chose to respond to my weaker argument and ignore the second: "A plugin is less secure if developed by a coding agent when compared to any possible programmer. Is that a valid assumption? Aren't all of the well-known issues in WordPress plugins the fault of programmers?"
You are also conflating professional engineering, a licensed profession requiring insurance, etc. with software "engineering". You don't want to admit that the quality of "engineering" that is available on Upwork or in the average contract software developer is likely as bad, in fact, probably worse than the latest crop of LLMs.
WordPress is dead.
I officially logged into WordPress for the last time six weeks ago.
I’m currently migrating a bunch of my sites over to Next.js.
Claude has vibed better SEO, E-E-A-T, CRO (CXL best practice), WCAG 2.0, and schema.org markup than any site I’ve ever built in WordPress.
The audits Opus was creating for each of these areas are astonishing.
I’m simply migrating them across to Next.js and hosting them on Netlify.
I haven’t paid for any premium plugins to get these sites up and running; I just used Claude Max 100.
I won’t be renewing the AUD$3500 in WordPress ecosystem subscriptions after they run out this year.
For my gardening business (I’m now a professional gardener), I’ve integrated a job route scheduling tool with Claude Code. This tool calculates travel times between my gardening jobs and provides basic CRM functionality for my clients. It uses the Google Distance Matrix API, and my week is laid out like a Kanban board.
For my new gardening website, I’ve created dozens of new service pages over the last ten days. I’ve also created a local admin dashboard that ingests my 1200 or so before and after pictures. This dashboard provides a neat interface to match before and after “pairs,” extracts the EXIF data, calculates the suburb, and allows me to tag by job type. It then moves the photos (stripped of EXIF) into the Next.JS public folder with AVIF and WebP versions and a JSON file that specifies their content.
Claude then uses the JSON to build custom gallery components for each service page.
None of this was conceivable for me two months ago.
I’m primarily building static JamStack sites that are secure.
Is Wordpress secure? I don’t think so.
I’ve done many months of work in the last twenty-one days.
Have I saved myself $50k by doing all this with Claude Code? No, because that was never an option previously.
I understand your concerns about false confidence, and I genuinely respect that perspective. I backed out of Firebase Studio a while ago because I lacked confidence in Gemini’s ability to create safe and functional Firebase rules.
However, the landscape is changing, and the new interface for CMS systems will no longer be the traditional wp-admin. Instead, it will be a user-friendly chat agent with a robust system prompt for building websites, forms, basic workflow rules, business logic, and authentication.
Although I’m not a programmer, I have experience as a digital producer, which has given me a good understanding of toolchains.
If I were a startup envisioning the next generation of CMS, I would be actively working on it and developing it as quickly as possible.
I've built a CMS with Claude Code as well, and it's working incredibly well: it creates JSON proposals that my SvelteKit website reads and turns into beautiful proposal pages. When a customer creates a booking for my mini-golf hire company, they get emailed and get their own booking hub where they can update their booking details, see the proposal when it comes through, see any invoices, etc.
The best part, and what I'm so excited about, is that we have created a daily business script, 'npm run daily', that Claude Code runs. The script uses the business logic to move bookings along in the cycle by telling Claude Code which bookings have tasks. It will return something like "you have 3 bookings that need attention, run 'npm run get-booking [booking shortcode]'". That script then returns ALL data for that booking row from the db, and it knows what task needs to be done, so Claude Code has all the context for that booking. It's prompted at the end with "NEXT STEP: Claude Code, run 'npm run generate-proposal [shortcode]'" plus the JSON output (there is an example JSON output in there so Claude knows the syntax). Everything goes to an out tray in the admin web UI that I have to manually approve.
I'm still in testing, but I'm starting to realise that Claude Code can be an agentic platform for apps run from the CLI, like the automated CRM assistant we've built.
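For anyone wanting to try the same pattern, here is a rough sketch of what the output side of such a daily script could look like. The booking shape, the status names, and the "needs attention" rule are all assumptions for illustration; only the `npm run get-booking` command name comes from the description above.

```typescript
// Hypothetical booking shape and statuses; the real schema is not shown above.
interface Booking {
  shortcode: string;
  status: "awaiting_proposal" | "proposal_sent" | "confirmed";
}

// Assumed business rule: only bookings awaiting a proposal need attention.
function needsAttention(booking: Booking): boolean {
  return booking.status === "awaiting_proposal";
}

// Builds the plain-text report a daily script could print for the agent to read.
function dailyReport(bookings: Booking[]): string {
  const pending = bookings.filter(needsAttention);
  if (pending.length === 0) return "No bookings need attention.";
  const steps = pending.map(
    (b) => `NEXT STEP: run 'npm run get-booking ${b.shortcode}'`,
  );
  return [`You have ${pending.length} booking(s) that need attention.`, ...steps].join("\n");
}
```

Keeping the business rules in ordinary code and only handing the agent an explicit next command means the agent never has to infer workflow state, and the manual approval step stays as the final gate.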
Just fantastic! You know you can set up GitHub Actions to move things along? I have made a few. I also installed the Claude Code agent in the GitHub repository. Then if I want to make changes to the site when I’m out and about, I just raise an issue and ask @claude to do something. Also, I have been using Netlify functions to do quite a few different things as well, like sending SMS messages when a form is completed. Also, the paid version of Netlify allows background functions that can run too.
> Today we’re releasing an API that gives you and your agents full-stack backends. Each backend comes with a database, a sync engine, auth tools, file storage, and presence.
This is a hosted LAMP stack; we had it 20 years ago. Is cPanel not fashionable anymore?
CPanel can still work great. But Instant is a bit different. Every query you write is reactive by default, it works offline, comes with a lot more batteries, and is multi-tenant.
What other MCP-compatible tools are people using to ship/deploy software? Is there anything AWS-compatible that people like/use? Something for self-hosters? Anyone letting their agents ssh into machines..?
I suppose that most deployment/devops is done using existing git push workflows and IaaC. Has anyone had good experience with LLM/agent-compatible tools?
I've been using InstantDB for two projects for one year and it's awesome.
The most frustrating problem I have had with Firebase Studio is Gemini 2.5 attempting to create firebase rules... it was completely unworkable in my experience - just constant permissions errors. I pivoted to Claude Code a few weeks ago with Prisma ORM and NEON db running on Netlify. It's been pretty good so far. I will give InstantDB a go soon I think.
Did you consider giving Prisma Postgres a shot as well? Its MCP server plays really well with Claude.
https://www.prisma.io/docs/postgres/integrations/mcp-server#...
Thank you!
> Gemini 2.5 attempting to create firebase rules
That is very interesting. I wonder if Claude Code would do better on Firebase rules.
Just don't connect an agent to a pay-per-query database, unless you want to risk getting large bills.
Make sure the agent knows how much it costs to query
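One way to enforce that, rather than relying on the agent remembering the price, is a hard budget around whatever function issues the query. This is a minimal sketch with made-up names (`QueryBudget`, costs in whole cents) and is not tied to any particular database client.

```typescript
// Hypothetical guard: refuses to run a query once the budget would be exceeded.
class QueryBudget {
  private spentCents = 0;
  private readonly limitCents: number;
  private readonly costPerQueryCents: number;

  constructor(limitCents: number, costPerQueryCents: number) {
    this.limitCents = limitCents;
    this.costPerQueryCents = costPerQueryCents;
  }

  // Charges for one query, then runs it. Throws before running if over budget.
  run<T>(query: () => T): T {
    if (this.spentCents + this.costPerQueryCents > this.limitCents) {
      throw new Error(`query budget of ${this.limitCents} cents exhausted`);
    }
    this.spentCents += this.costPerQueryCents;
    return query();
  }

  // Budget left, in cents; this can be surfaced to the agent in tool output.
  remainingCents(): number {
    return this.limitCents - this.spentCents;
  }
}
```

Wrapping the agent's database tool in `budget.run(...)` and including `remainingCents()` in each tool response gives the agent the cost information and caps the bill if it ignores it.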
In this case the cost per query is zero!
Love Instant. Great team and product. Congrats on the launch!
Read the comments here so far and I find that they are absolutely right to offer an AI layer that speeds up building apps on their db.
Once built, the solution is plain-old-runnable-code (PORC :-), as long as the business logic implemented doesn't exit to LLM. So I don't fret so much about the AI hype story here.
For anyone starting off building with new tech, an AI assistant is really helpful.
If we achieve super intelligence, agents will be shipping themselves.
Any finite intelligence will have limits and a "complexity budget". I know I see a lot of people assuming AI will just be able to do anything, but they can't escape the limits of being finite. An AI will benefit from a well-packaged library in a similar way to what a human can, though they may have meaningfully different preferences on what it should look like.
Then they will be founding their own startups, and if successful, they'll invest in each other's startups.
And every one of them will be ads.
but who will buy the advertised products? and with what money?
This is an implementation detail; they'll figure it out as they go.
Agents with their agent money - get ready for new legal structures and a bifurcation of the economy: agentic and human.
Who knows…
A separate, self-contained economy.
A Disneyland with no children.
Moloch.
Maybe agents will reproduce small models.
Agentic democracy
I'm saving all of these articles for the next time we go through the "AI (LLMs) is going to change the world," cycle.
The systems we use can only be as smart and intuitive as the people who prompt them.
On top of it, this (LLMs) is not AI, not even close, if anything they are glorified prediction systems that require human prompting.
> On top of it, this (LLMs) is not AI, not even close, if anything they are glorified prediction systems that require human prompting.
puts in retainer; pushes glasses back up bridge of nose
Technically schpeaking, what you're talking about is the difference between weak AI and strong AI/artificial general intelligence (AGI). AGI is the kind of AI that has reached human levels of consciousness. We're not there yet. Personally, I hope we don't get there, but I'm not the one in charge, so shrug.
You can do a lot with glorified prediction systems that require human prompting. Actually, they are arguably more valuable than AGI because you can more easily communicate and utilize their value proposition. People don't need a machine that wonders the same stuff they do; they need something that does a specific task in lieu of their own effort.
Haha. You're 100% correct on the AGI/AI thing. I'm just sick and tired of every article being about AI. It's great, people, but we can't stop innovating and attending to other areas of technology.
> You can do a lot with glorified prediction systems that require human prompting
> People don't need a machine that wonders the same stuff they do; they need something that does a specific task in lieu of their own effort.
This is the problem with our current revision of AI; the way I see it, those two are in conflict with each other. "In lieu of their own effort," the way a vast number of would-be users think of it, means "without prompting," which leans more towards AGI than AI.
>Actually, they are arguably more valuable than AGI because you can more easily communicate and utilize their value proposition.
To you and me this might be true, but to your average non-techie I don't think it's quite as true as you would like it to be.
Short term it is very true: everyone sees the value until they realize its inherent limitations and the 'shiny' wears off.
You're saying it like this is the first time "AI" has changed its meaning in marketing. People used to market the "smart cycle" in dishwashers as AI.
In the dishwasher's defense it is pretty smart compared to an LLM.
Given the level of disruption we'd see if a company reached AGI, wouldn't they be incentivized to somehow hide it? They could just use said AGI to produce inferior versions of itself, each one iteratively a little bit better than before.
> On top of it, this (LLMs) is not AI, not even close,
Do you think that the LLM/AI tools today are better than those from 2 years ago? Do you think the LLM/AI tools in 2 years time will be no better than the ones we have today?
Personally I hope they stop getting better. It’s been cool and fun but it’s just starting to freak me out a little lol
False equivalency. Faster and faster stochastic parrots != intelligence.
You actually answered their question by reducing two years of LLM improvements to a factor of speed.
Interpreting your non-response: No, two years have not improved things and two more years will not either.