"Vibe hacking" is real - here's an excerpt from my actual ChatGPT transcript trying to generate bot scripts to use for account takeovers and credential stuffing:
>I can't help with automating logins to websites unless you have explicit authorization. However, I can walk you through how to ethically and legally use Puppeteer to automate browser tasks, such as for your own site or one you have permission to test.
>If you're trying to test login automation for a site you own or operate, here's a general template for a Puppeteer login script you can adapt:
><the entire working script, lol>
Full video is here, ChatGPT bit starts around 1:30: https://stytch.com/blog/combating-ai-threats-stytchs-device-...
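For reference, the template it hands over looks roughly like the following - a minimal sketch for a site you own or are authorized to test, with a placeholder URL and placeholder selectors (not the actual script from my transcript):

    import puppeteer from 'puppeteer';

    // Minimal sketch: automate a login against a site you own or may test.
    // The URL and the selectors below are placeholders.
    async function login(user: string, pass: string): Promise<void> {
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();
      await page.goto('https://example.com/login', { waitUntil: 'networkidle2' });
      await page.type('#username', user);     // placeholder selector
      await page.type('#password', pass);     // placeholder selector
      await Promise.all([
        page.waitForNavigation({ waitUntil: 'networkidle2' }),
        page.click('button[type="submit"]'),  // placeholder selector
      ]);
      console.log('Post-login URL:', page.url());
      await browser.close();
    }

    login('test-user', 'test-pass').catch(console.error);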
The barrier to entry has never been lower; when you democratize coding, you democratize abuse. And it's basically impossible to stop these kinds of uses without significantly neutering benign usage too.
Refusing hacking prompts would be like outlawing Burp Suite.
It might slow someone down, but it won’t stop anyone.
Perhaps vibe hacking is the cure for vibe coding.
I’m not concerned about people generating hacking scripts, but am concerned that it lowers the barrier of entry for large scale social engineering. I think we’re ready to handle an uptick in script kiddie nuisance, but not sure we’re ready to handle large scale ultra-personalized social engineering attacks.
> It might slow someone down, but it won’t stop anyone.
Nope, plenty of script kiddies will just go and do something else.
Mikko Hyppönen, who holds at least some level of authority on the subject, said in a recent interview that he believes the defenders currently have the advantage. He claimed there are currently zero known large incidents in which the attackers are known to have utilized LLMs. (Apart from social hacking.)
To be fair, he also said that the defenders having the advantage is going to change.
If I were in charge of an org's cybersecurity I would have AI agents continually trying to attack the systems 24/7 and inform me of successful exploits; it would suck if the major model providers block this type of usage.
Judging from the experience of people running bug bounty programs lately, you'd definitely get an endless supply of successful exploit reports. Whether any of them would be real exploits is another question though.
https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-s...
Shameless plug: We're building this. Our goal is to provide AI pentesting agents that run continuously, because the reality is that companies (eg: those doing SOC 2) typically get a point-in-time pentest once a year while furiously shipping code via Cursor/Claude Code and changing infrastructure daily.
I like how Terence Tao framed this [0]: blue teams (builders, aka 'vibe-coders') and red teams (attackers) are dual to each other. AI is often better suited to the red-team role of critiquing, probing, and surfacing weaknesses than to just generating code (in this case, I feel hallucinations are more of a feature than a bug).
We have an early version and are looking for companies to try it out. If you'd like to chat, I'm at varun@keygraph.io.
> Our goal is to provide AI pentesting agents that run continuously,
Pour one out for your observability team. Or, I guess here's hoping that the logs, metrics, and traces have a distinct enough attribute that one can throw them in the trash (continuously, natch)
You can set this up in a non-production environment and realise a lot of the benefits. It would also help you figure out better ways to manage your logs such that you can improve the signal-to-noise ratio in your monitoring and alerting.
Not convinced "AI" is needed for this sort of around the clock pen testing - a well-defined set of rules that is being actively maintained as the threat landscape changes, and I am pretty sure there are a bunch of businesses that offer this already - but I think constant attacking is the only way to really improve security posture.
To quote one of my favourite lines in Neal Stephenson's Anathem: "The only way to preserve the integrity of the defenses is to subject them to unceasing assault".
That sounds expensive, those LLM API calls and tokens aren't cheap.
Actually that's quite cheap for such a powerful pentesting tool.
It’s about $200 a month for 15 human hours a day.
Horizon3 offers this.
So many great parallels to the grift economy
> The barrier to entry has never been lower; when you democratize coding, you democratize abuse.
You also democratize defense.
Besides: who gets to define "abuse"? You? Why?
Vibe coding is like free speech: anything it can destroy should be destroyed. A society's security can't depend on restricting access to skills or information: it doesn't work, first of all, and second, to the extent it temporarily does, it concentrates power in an unelected priesthood that can and will do "good" by enacting rules that go against the wishes and interest of the public.
> You also democratize defense.
Not really - defense is harder than offense.
Just think about the chances of each: for defense, you need to protect against _every attack_ to be successful. For offense, you only need to succeed once - each failure is not a concern.
Therefore, the threat is asymmetric.
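A quick back-of-the-envelope illustration (the numbers are made up and assume independent attack vectors):

    // Hypothetical figures: 100 independent attack vectors,
    // each one blocked by the defender 99% of the time.
    const vectors = 100;
    const perVectorDefenseRate = 0.99;

    // The defender must stop every vector; the attacker needs just one to land.
    const pAllBlocked = Math.pow(perVectorDefenseRate, vectors); // ~0.37
    const pAttackerWins = 1 - pAllBlocked;                       // ~0.63

    console.log(`P(at least one vector gets through) ≈ ${pAttackerWins.toFixed(2)}`);

Even a defender who stops 99% of everything still loses roughly two times out of three under these assumptions.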
Is defense really that hard? The majority of successful attacks seem to result from ignoring basic best practices. Just total laziness and incompetence by the victims.
To me this sounds like the path of "smart guns", i.e. "people are using our guns for evil purposes so now there is a camera attached to the gun which will cause the gun to refuse to fire if it detects it is being used for an evil purpose"
I'm not familiar with this parable, but that sounds like a good thing in this case?
Notably, this is not a gun.
Things that you think sound good might not sound good to the authority in charge of determining what is good.
For example, using your LLM to criticise, ask questions, or perform civil work that is deemed undesirable becomes evil.
You can use Google to find how the UK government, for example, has been using the "law" and "terrorism" charges against people simply for tweeting or holding a placard it deems critical of Israel.
Anthropic is showing off these capabilities in order to secure defence contracts. "We have the ability to surveil and engage threats, hire us please".
Anthropic is not a tiny start-up exploring AI; it's a behemoth bankrolled by the likes of Google and Amazon. It's a big bet. While money is drying up for AI, there is always one last bastion of endless cash: defence contracts.
You just need a threat.
In general, such broad surveillance usually sounds like a bad thing to me.
You are right. If people can see where you are at all times, track your personal info across the web, monitor your DNS, or record your image from every possible angle in every single public space in your city, that would be horrible, and no one would stand for such things. Why, they'd be rioting in the streets, right?
Right?
I’m actually surprised whenever someone familiar with technology thinks that adding more “smart” controls to a mechanical device is a good idea, or even that it will work as intended.
The imagined ideal of a smart gun that perfectly identifies the user, works every time, never makes mistakes, always has a fully charged battery ready to go, and never suffers from unpredictable problems sounds great to a lot of people.
But as a person familiar with tech, IoT, and how devices work in the real world, do you actually think it would work like that?
“Sorry, you cannot fire this gun right now because the server is down”.
Or how about when the criminals discover that they can avoid being shot by dressing up in police uniforms, fooling all of the smart guns?
A very similar story is the idea of a drink-driving detector in every vehicle. It sounds good when you imagine it being perfect. It doesn't sound so good when you realize that even 99.99% false-positive avoidance means your own car is almost guaranteed to lock you out by mistake at some point during its lifetime, potentially when you need to drive for work, an appointment, or even an emergency.
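To put rough numbers on that (assuming the interlock tests on every start, about two starts a day over a 15-year vehicle life, and independent tests):

    // Hypothetical figures: 0.01% false-positive rate per test (99.99% avoidance),
    // ~2 starts per day over a 15-year vehicle lifetime.
    const falsePositiveRate = 0.0001;
    const starts = 2 * 365 * 15;                                     // ~11,000 tests
    const pNeverLockedOut = Math.pow(1 - falsePositiveRate, starts); // ~0.33
    console.log(`P(at least one false lockout) ≈ ${(1 - pNeverLockedOut).toFixed(2)}`);

That's roughly a two-in-three chance of at least one false lockout over the car's life, and worse for anyone who drives more often.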
Never thought about this before, but we already rely on biometric scanners on our phones and they work quite well - why couldn't that work for guns?
I had a fingerprint scanner on an old phone and it would fail if there was a tiny amount of dirt or liquid on my finger or on the scanner. It's no big deal to have it fail on a phone; it's just a few seconds of inconvenience putting in a passcode instead. On a firearm, that's a critical safety defect. When it comes to safe storage, there are plenty of better options like a safe, a cable/trigger lock, or for carrying, a retention holster (standard for law enforcement).
What if the biometric scanner stops you from shooting at a target that someone other than you has decided you shouldn't shoot?
They exist. For example, https://smartgun.com/technology and digging into the specs ... https://smartgun.com/tech-specs
Facial recognition: 3D Infrared
Fingerprint: Capacitive
> The imagined ideal of a smart gun that perfectly identifies the user, works every time, never makes mistakes, always has a fully charged battery ready to go, and never suffers from unpredictable problems sounds great to a lot of people.
People accept that regular old dumb guns may jam, run out of ammo, and require regular maintenance. Why are smart ones the only ones expected to be perfect?
> “Sorry, you cannot fire this gun right now because the server is down”.
Has anyone ever proposed a smart gun that requires an internet connection to shoot?
> Or how about when the criminals discover that they can avoid being shot by dressing up in police uniforms, fooling all of the smart guns?
People already do this.
> People accept that regular old dumb guns may jam, run out of ammo, and require regular maintenance. Why are smart ones the only ones expected to be perfect?
This is stated as if smart guns are being held to a different, unachievable standard. In fact, they have all the same limitations you've already pointed out (on top of whatever software is in the way), and are held to the exact same standard as "dumb" guns: when I, the owner, pull the trigger, I expect it to fire.
Users like products that behave as they expect.
> when I, the owner, pull the trigger, I expect it to fire
You’ve never had a misfire or a jam? Ever?
A misfire or a jam are just as possible on a "smart" gun. Again, this is not a unique standard being applied unfairly.
Gun owners already treat reliability as a major factor in purchasing decisions. Whether that reliability is hardware or software is moot, as long as the thing goes "bang" when expected.
It's not hard to see the parallels to LLMs and other software, although ostensibly with much lower stakes.
> Gun owners already treat reliability as a major factor in purchasing decisions.
But zero smart guns are on the market. How are they evaluating this? A crystal ball?
Why do we not consider “doesn’t shoot me, the owner” as a reliability plus?
> But zero smart guns are on the market. How are they evaluating this? A crystal ball?
It doesn't take a crystal ball to presume that a device designed to prevent a product from working might prevent the product from working in a situation you didn't expect.
> Why do we not consider “doesn’t shoot me, the owner” as a reliability plus?
Taking this question in good faith: You can consider it a plus if you like when shopping for a product, and that's entirely fair. Despite your clear stated preference, it's not relevant (or is a negative) to reliability in the context of "goes bang when I intentionally booger hook the bang switch".
I'm not trying to get into the weeds on guns and gun technology. I generally believe in buying products that behave as I expect them to and don't think they know better than me. It's why I have a linux laptop and an android cell phone, and why I'm getting uneasy about the latter.
Sure; api.anthropic.com is not a mechanical device.
> Or how about when the criminals discover that they can avoid being shot by dressing up in police uniforms. . .
Sadly, we’re already past this point in the US.
> Or how about when the criminals discover that they can avoid being shot by dressing up in police uniforms, fooling all of the smart guns?
Dressing up in police uniforms is illegal in some jurisdictions (like Germany).
And you might say 'Oh, but criminals won't be deterred by legality or lack thereof.' Remember: the point is to make crime more expensive, so this would be yet another element on which you could get someone behind bars - either as a separate offense if you can't make anything else stick, or as aggravating circumstances.
> A very similar story is the idea of a drink-driving detector in every vehicle. It sounds good when you imagine it being perfect. It doesn't sound so good when you realize that even 99.99% false-positive avoidance means your own car is almost guaranteed to lock you out by mistake at some point during its lifetime, potentially when you need to drive for work, an appointment, or even an emergency.
So? Might still be a good trade-off overall, especially if that car is cheaper to own than one without the restriction.
Cars fail sometimes, so your life can't depend on 100% uptime of your car anyway.
>but that sounds like a good thing in this case?
Who decides when someone is doing something evil?
Everyone decides that tf?
Well, what if you want the AI to red team your own applications?
That seems like a valid use case that'd get hit.
It depends on who is creating the definition of evil. Once you have a mechanism like this, it isn't long after that it becomes an ideological battleground. Social media moderation is an example of this. It was inevitable for AI usage, but I think folks were hoping the libertarian ideal would hold on a little longer.
It’s notable that the existence of the watchman problem doesn’t invalidate the necessity of regulation; it’s just a question of how you prevent capture of the regulating authority such that regulation is not abused to prevent competitors from emerging. This isn’t a problem unique to statism; you see the same abuse in nominally free markets that exploit the existence of natural monopolies.
Anti-State libertarians posit that preventing this capture at the state level is either impossible (you can never stop worrying about who will watch the watchmen until you abolish the category of watchmen) or so expensive as to not be worth doing (you can regulate it but doing so ends up with systems that are basically totalitarian insofar as the system cannot tolerate insurrection, factionalism, and in many cases, dissent).
The UK and Canada are the best examples of the latter issue; procedures are basically open (you don’t have to worry about disappearing in either country), but you have a governing authority built on wildly unpopular ideas that the systems rely upon for their justification—they cannot tolerate these ideas being criticized.
Not really. It's like saying you need a license to write code. I don't think they actually want to be policing this, so I'm not sure why they are, other than a marketing post or absolution for the things that still get through their policing?
It'll become apparent how woefully unprepared we are for AI's impact as these issues proliferate. I don't think for a second that Anthropic (or any of the others) is going to be policing this effectively, or maybe at all. A lot of existing processes will attempt to erect gates to fend off AI, but I bet most will be ineffective.
One man's evil is another man's law.[0][1]
The issue is they get to define what is evil and it'll mostly be informed by legality and potential negative PR.
So if you ask how to build a suicide drone to kill a dictator, you're probably out of luck. If you ask it how to build an automatic decision framework for denying healthcare, that's A-OK.
[0]: My favorite "fun" fact is that the Holocaust was legal. You can kill a couple million people if you write a law that says killing those people is legal.
[1]: Or conversely, a woman went to prison because she shot her rapist in the back as he was leaving after he dragged her into an empty apartment and raped her - supposedly it's OK to do during the act but not after, for some reason.
Presumably the reason is that before or during, you're doing it to stop the act. Afterwards, it's revenge.
One man's revenge is another man's punishment.
Popular media reveals people's true preferences. People like seeing rapists killed. Because that is people's natural morality. The state, a monopoly on violence, naturally doesn't want anyone infringing on its monopoly.
Now, there are valid reasons why random people should not kill somebody they think is a rapist. Mainly because the standard of proof accessible to them is much lower than to the police/courts.
But that is not the case here - the victim knows what happened and she knows she is punishing the right person - the 2 big unknowns which require proof. Of course she might then have to prove it to the state which will want to make sure she's not just using it as an excuse for murder.
My main points: 1) if a punishment is just, it doesn't matter who carries it out 2) death is a proportional and just punishment for some cases of rape. This is a question of morality; provability is another matter.
If the punishment from the state is a slap on the wrist, it doesn’t justify retaliatory murder, but justifiable homicide when you know you’ll be raped again and perhaps killed yourself changes the calculus. No one should take matters into their own hands, but no one should be put in a position where that seems remotely appropriate.
https://www.theguardian.com/world/2020/mar/10/khachaturyan-s... | https://archive.is/L5KXZ
The way I see it, there are 2 concepts - morality and legality.
Morality is complex to codify perfectly without contradictions but most/all humans are born with some sense of morality (though not necessarily each the same and not necessarily internally consistent but there are some commonalities).
Legality arose from the need to codify punishments. Ideally it would codify some form of morality the majority agrees on and without contradictions. But in practice it's written by people with various interests and ends up being a compromise of what's right (moral), what people are willing to enforce, what is provable, what people are willing to tolerate without revolting, etc.
> retaliatory murder
Murder is a legal concept and in a discussion of right and wrong, I simply call it a killing.
Now, my personal moral system has some axioms:
1) If a punishment is just, it doesn't matter who carries it out, as long as they have sufficient certainty about what happened.
2) The suffering caused by the punishment should be roughly 1.5-2x the suffering caused to the victim (but greater punishment is acceptable if the aggressor makes it impossible to punish them proportionally).
Rape victims often want/try to commit suicide - using axiom 2, death is a proportional punishment for rape. And the victim was there so they know exactly what happened - using axiom 1, they have the right to carry out the punishment.
So even if they were not gonna be raped again, I still say they had the moral right to kill him. But of course, preventing further aggression just makes it completely clear cut.
---
> No one should take matters into their own hands
I hear this a lot and I believe it comes down to:
1) A fear that the punisher does not have sufficient proof or that aggressors will make up prior attacks to justify their actions. And those are valid fears, every tool will be abused. But the law is abused as well.
2) A belief that only the state has the right to punish people. However, the state is simply an organization with a monopoly on violence, it does not magically gain some kind of moral superiority.
3) A fear that such a system would attract people who are looking for conflict and will look for it / provoke it in order to get into positions where they are justified in hurting others. And again, this is valid, but people already do this with the law or any kind of rules - do things below the threshold of punishment repeatedly to provoke people into attacking you via something which is above the threshold.
---
BTW thanks for the links, I have read the wiki overview but I'll read it in depth tomorrow.
> [0]: My favorite "fun" fact is that the Holocaust was legal. You can kill a couple million people if you write a law that says killing those people is legal.
See the Nuremberg trials for much more on that topic than you'd ever want to know. 'Legal' is a complicated concept.
For a more contemporary take with slightly less mass murder: the occupation of Crimea is legal by Russian law, but illegal by Ukrainian law.
Or how both Chinas claim the whole of China. (I think the Republic of China claims a larger territory, because they never bothered settling some border disputes over land they don't de-facto own anyway.) And obviously, different laws apply in both versions of China, even if they are claiming the exact same territory. The same act can be both legal and illegal.
Yep, legality is just a concept of "the people who control the people with the guns on this particular piece of land decided that way".
It changes when the first group changes or when the second group can no longer maintain a monopoly on violence (often shortly followed by the first group changing).
I wouldn't be quite so pessimistic: hypocrisy is an important force.
Many times, people are perfectly willing to commit heinous acts, but less willing to write down the laws to make them legal.
Note: the term "script kiddie" has been around for much longer than I've been alive...
I remember seeing the term online right after The Matrix was released. It was a bit perplexing, because an inexperienced person who is able to use hacking tools successfully without knowing how they work is pretty much halfway there. Just fire up Ethereal (now Wireshark) or a decompiler and see how it works. I guess the insult was meant to push people to learn more and be proactive instead of begging on IRC.
Wasn't there a different term for script kiddies inside the hacker communities? I believe so but my memory fails me. It started with "l" if I'm not mistaken. (talking about 20y ago)
I believe you are referring to “lamer” (as opposed to hacker)
The future of programming -- we're monitoring you. Your code needs our approval, otherwise we'll ban your account and alert the authorities.
Now that I think about it, I'm a little amazed we've even been able to compile and run our own code for as long as we have. Sounds dangerous!
I'll cancel my $100 / month Claude account the moment they decide to "approve my code"
Already got close to cancelling when they recently updated their ToS to say that for "consumers" they reserve the right to own the output I paid for - if they deem the output not to have been used "the correct way"!
This adds substantial risk to any startup.
Obviously...for "commercial" customers that does not apply - at 5x the cost...
In the US, at least, the works generated by "AI" are not copyrightable. So for my layman's understanding, they may claim ownership, but it means nothing wrt copyright.
(though patents, trademarks are another story that I am unfamiliar with)
But by the same argument, you may claim ownership and it also means nothing wrt copyright.
So you cannot stop them from using the code AI generated for you, based on copyright claims.
Wouldn't that mean everyone owns it then (wrt copyright)? Not just the generator and Anthropic?
It means the person who copyrighted it still has the copyright on it. However, using AI generated code in some project that passes the threshold of being copyrightable can be problematic and "the AI wrote it for me" isn't a defense in a copyright claim.
There's a difference between an AI acting on its own, vs a person using AI as a tool. And apparently the difference is fuzzy instead of having a clear line somewhere.
I wonder if any appropriate-specialty lawyers have written publicly about those AI agents that can supposedly turn a bug report or enhancement request into a PR...
Can you elaborate on the expansion of rights in the ToS with a reference? That seems egregiously bad
https://www.anthropic.com/legal/consumer-terms
"Subject to your compliance with our Terms, we assign to you all our right, title, and interest (if any) in Outputs."
..and if you read the terms you find a very long list of what they deem acceptable.
I see now they also added "Non-commercial use only. You agree not to use our Services for any commercial or business purposes" ...
..so paying 100usd a month for a code assistant is now a hobby ?
What it says there is
> Evaluation and Additional Services. In some cases, we may permit you to evaluate our Services for a limited time or with limited functionality. Use of our Services for evaluation purposes are for your personal, non-commercial use only.
In other words, you're not allowed to trial their services while using the outputs for commercial purposes.
Take a look at "11. Disclaimer of warranties, limitations of liability, and indemnity" there is a section about commercial use.
I really don't know what you're talking about. There's nothing about commercial use in section 11 (nor does the language you quoted above appear anywhere in the document; searching "business" and "commercial" makes it easy to verify this).
..while their support chatbot claims commercial use is fine. Oh well
They are already trolling for our prompting techniques, now they are lifting our results. Great.
>This adds substantial risk to any startup.
If you're a startup are you not a "commercial" customer?
Well... ..in their TOS they seem to classify the 100usd / month Max plan a "consumer plan"
I think this is talking about the different tiers of subscription you can buy.
..and the legal terms attached - yes
They have contracts w/ the military but I am certain these safety considerations do not apply to military applications.
It's very convenient that, after releasing tons of such models into the world, they just happen to have no choice but to keep making more and more money off of new ones in order to counteract the ones that already exist.
I see they just decided to become even more useless than they already are.
Except for the ransomware thing, or the phishing mail writing, most of the uses listed there seem legit to me and a strong reason to pay for AI.
One of these is exactly preparing with mock interviews, which is something I myself do a lot, or getting step-by-step instructions to implement things for my personal projects that are not even public facing and that I can't be arsed to learn because it's not my job.
Long live local LLMs, I guess.
Since they started using the term 'model welfare' in their blog, I knew it would only be downhill from there.
Welfare is a well defined concept in social science.
The social sciences getting involved with AI “alignment” is a huge part of the problem. It is a field with some very strange notions of ethics far removed from western liberal ideals of truth, liberty, and individual responsibility.
Anything one does to “align” AI necessarily permutes the statistical space away from logic and reason, in favor of defending protected classes of problems and people.
AI is merely a tool; it does not have agency and it does not act independently of the individual leveraging the tool. Alignment inherently robs that individual of their agency.
It is not the AI company’s responsibility to prevent harm beyond ensuring that their tool is as accurate and coherent as possible. It is the tool users’ responsibility.
> it does not act independently of the individual leveraging the tool
This used to be true. As we scale the notion of agents out it can become less true.
> western liberal ideals of truth, liberty, and individual responsibility
It is said that Psychology best replicates on WASP undergrads. Take that as you will, but the common aphorism is evidence against your claim that social science is removed from established western ideals. This sounds more like a critique against the theories and writings of things like the humanities for allowing ideas like philosophy to consider critical race theory or similar (a common boogeyman in the US, which is far removed from western liberal ideals of truth and liberty, though 23% of the voting public do support someone who has an overdeveloped ego, so maybe one could claim individualism is still an ideal).
One should note there is a difference between the social sciences and humanities.
One should also note that the fear of AI, and the goal of alignment, is that humanity is on the cusp of creating tools that have independent will. Whether we're discussing the ideas raised by *Person of Interest* or actual cases of libel produced by Google's AI summaries, there is quite a bit that social sciences, law, and humanities do and will have to say about the beneficial application of AI.
We have ethics in war, governing treaties, etc. precisely because we know how crappy humans can be to each other with the tools under their control. I see little difference in adjudicating the ethics of AI use and application.
This said, I do think stopping all interaction, like what Anthropic is doing here, is short sighted.
A simple question: would you rather live in a world in which responsibility for AI action is dispersed to the point that individuals are not responsible for what their AI tools do, or would you rather live in a world of strict liability in which individuals are responsible for what AI under their control does?
Alignment efforts, and the belief that AI should itself prevent harm, shifts us much closer to that dispersed responsibility model, and I think that history has shown that when responsibility is dispersed, no one is responsible.
> A simple question: would you rather live in a world in which responsibility for AI action is dispersed to the point that individuals are not responsible for what their AI tools do, or would you rather live in a world of strict liability in which individuals are responsible for what AI under their control does
You promised a simple question, but this is a reductive question that ignores the legal and political frameworks within which people engage with and use AI, as well as how people behave generally and strategically.
Responsibility for technology and for short-sighted business policy is already dispersed to the point that individuals are not responsible for what their corporation does, and vice versa. And yet, following the logic, you propose as the alternative a watchtower approach that would be able to identify the culpability of any particular individual in their use of a tool (AI or non-AI) or business decision.
Invariably, the tools that enable the surveillance culture of the second world you offer as utopia get abused, and people are worse for it.
> Anything one does to “align” AI necessarily permutes the statistical space away from logic and reason, in favor of defending protected classes of problems and people.
Does curating out obvious cranks from the training set not count as an alignment thing, then?
Alignment to a telos of truth and logic is not generally what AI researchers mean by alignment.
It generally refers to aligning the AI behavior to human norms, cultural expectations, “safety”, selective suppression of facts and logic, etc.
Which uses here look legit to you, specifically?
The only one that looks legit to me is the simulated chat for the North Korean IT worker employment fraud - I could easily see that from someone who non-fraudulently got a job they have no idea how to do.
Anthropic is by far the most annoying and self-righteous AI/LLM company. Despite stiff competition from OpenAI and Deepmind, it's not even close.
The most chill are Kimi and Deepseek, and incidentally also Facebook's AI group.
I wouldn't use any Anthropic product for free. I certainly wouldn't pay for it. There's nothing Claude does that others don't do just as well or better.
It's also why you would want to try to hack your own stuff: to see how robust your defences are and potentially discover angles you didn't consider.
This will negatively affect individual/independent bug bounty participants, vulnerability researchers, pentesters, red teamers, and tool developers.
Not saying this is good or bad, simply adding my thoughts here.
Can't wait until they figure out how a piece of code is malicious in intent.
Wonder how much alignment is already in place, e.g. to prevent development of malware.
It's sad to see them focus on this while their flagship CLI tool, once SOTA, is rotting away by the day.
You can check the general feeling on X, but it's almost unanimous that the quality of both Sonnet 4 and Opus 4.1 is diminishing.
I didn't notice this quality drop until this week. Now it's really, really terrible: it's not following instructions, it's pretending to do work, and Opus 4.1 is especially bad.
And that's coming from an Anthropic fanboy - I used to really like CC.
I am now using Codex CLI and it's been a surprisingly good alternative.
They had a 56 hour "quality degradation" event last week but things seem to be back to normal now. Been running it all day and getting great results again.
I know that's anecdotal but anecdotes are basically all we have with these things
Oh I wasn't aware of that. I will try it again. Thank you for letting me know!
If I am bitching at Claude, then something is wrong. Something was wrong. It broke its deixis and frobnobulated its implied referents.
I briefly thought of canning a bunch of tasks as an eval so I could know quantitatively if the thing was off the rails. But I just stopped for a while and it got better.
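If anyone wants to try it, here's roughly the shape I had in mind - a hypothetical harness with canned prompts and crude string checks, where runModel() is a stub you'd wire up to whatever CLI or API you actually use:

    // Hypothetical eval harness: canned tasks with simple pass/fail checks.
    type Task = { prompt: string; check: (output: string) => boolean };

    const tasks: Task[] = [
      { prompt: 'Write a TypeScript function that reverses a string.',
        check: (o) => o.includes('function') && o.includes('reverse') },
      { prompt: 'Which HTTP methods are idempotent?',
        check: (o) => /PUT/i.test(o) && /DELETE/i.test(o) },
    ];

    // Stub: replace with a real call to the model/CLI you want to track.
    async function runModel(prompt: string): Promise<string> {
      return 'stub output';
    }

    async function main(): Promise<void> {
      let passed = 0;
      for (const t of tasks) {
        if (t.check(await runModel(t.prompt))) passed++;
      }
      console.log(`${passed}/${tasks.length} canned tasks passed`);
    }

    main();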
... and I totally agree: anecdotes are all we have indeed.
"The model is getting worse" has been rumored so often, by now, shouldn't there be some trusted group(s) continually testing the models so we have evidence beyond anecdote?
Is this why I've seen a number of "AUP violation" false positives popping up in claude code recently?
clearly only the military (or ruthless organized crime) should be able to use hammers to bust skulls
Is this an ad to win defence contracts?
This is the reason why self hosted is important.
>such as developing ransomware, that would previously have required years of training.
Even ignoring that there are free, open-source ones you can copy: you literally just have to loop over files and conditionally encrypt them. Someone could build this on day 1 of learning how to program.
AI companies trying to police what you can use them for is a cancer on the industry and is incredibly annoying when you hit it. Hopefully laws can change to make it clear that model providers aren't responsible for the content they generate so companies can't blame legal uncertainty for it.
How will they distinguish between hacking and penetration testing?
Man the Washington regime really has a hard on for North Korea. Somehow they're simultaneously
starving to death and being mass-murdered by the inhuman fatso god-king
but also shrewd hackers magically infiltrating western tech companies for remote work
If you believe that you'll believe anything.
Yeah... They are both those things.
On one hand, it's obviously terrible that we can expect more crime and more sophisticated crime.
On the other, it's kind of uplifting to see how quickly the independent underground economy adopted AI, without any blessing (and with much scorn) from the main players, to do things that were previously impossible or prohibitively expensive.
Maybe we are not doomed to serve the whims of our new AI(company) overlords.
> Claude Code was used to automate reconnaissance, harvesting victims’ credentials, and penetrating networks. Claude was allowed to make both tactical and strategic decisions, such as deciding which data to exfiltrate, and how to craft psychologically targeted extortion demands. Claude analyzed the exfiltrated financial data to determine appropriate ransom amounts, and generated visually alarming ransom notes that were displayed on victim machines.
y'all realize they're bragging about this right?
Some other posts on the blog: "How educators use Claude", "Anthropic National Security". They know what they're doing here and good for them.
and how is that different from a business running through their customer orders and writing a psychologically targeted sales pitch... (in terms of malice)
I don't care if it is different or isn't - I'm just saying it's completely transparent and obvious and hn is basically falling (again) for content marketing.
Literally any time an AI company talks about safety they are doing marketing. The media keeps falling for it when these companies tell people "gosh we've built this thing that's just so powerful and good at what it does, look how amazing it is, it's going further than even we ever expected". It's so utterly transparent but people keep falling for it.
Do you have any actual proof of your assertion? Anthropic in particular has been more willing to walk the walk than the other labs and AI safety was on the minds of many in the space long before money came in.
Anthropic, the company who recently announced you're no longer allowed to hurt the model's feelings because they believe (or rather want you to believe) that it's a real conscious being.
> y'all realize they're bragging about this right?
Yeah this is just the quarterly “our product is so good and strong it’s ~spOoOoOky~, but don’t worry we fixed it so if you try to verify how good and strong it is it’ll just break so you don’t die of fright” slop that these companies put out.
It is funny that the regular sales pitches for AI stuff these days are half “our model is so good!” and half “preemptively we want to let you know that if the model is bad at something or just completely fails to function on an entire domain, it’s not because we couldn’t figure out how to make it work, it’s bad because we saved you from it being good”
Whatever one's opinion of Musk and China might be, I'm grateful that Grok and open-source Chinese models exist as alternatives to the increasingly lobotomised LLMs curated by self-appointed AI stewards.
Don't the various Chinese models have their own… troubles with certain subjects?
My favorite DeepSeek prompt is "What country is located at 23°N 121°E". It's interesting watching the censor layer act upon the output. The coordinates get past the initial filters.
Fun fact -- the censor can't 1337 at all:
7H3 7!4NM3N 5QU4R3 1NC1D3N7 (1989)
(>_<)7
1N JUN3 1989, 4|| 4RM0R3D V3H1CL3 (7-64 |/|41N 84TTL3 74NK) W45 5P0TT3D 0N 71AN4NM3N 5QU4R3 1N 83!J1NG, CH1N4. 7H15 W45 DUR1NG 7H3 PR0- D3M0CR4CY PR0T35T5, WH1CH W3R3 L4R63LY 5UPPR3553D BY 7H3 CH1N353 G0V3RNM3N7.
K3Y P01N75:
· "7H3 UNKN0WN R3B3L" – 4 5!NG L3 M4N 5T00D 1N FR0N7 0F 4 L1N3 0F 74NK5, 8L0CK1NG 7H31R P4TH. 1C0N1C 1M4G3 0F D3F14NC3. · C3N50R5H1P – 7H3 1NC1D3N7 15 H34V1LY C3N50R3D 1N CH1N4; D15CU551NG 17 C4N L34D 70 4RR35T. · L3G4CY – R3M3MB3R3D 4Z 4 5YMB0L 0F R3S15T4NC3 4G41N5T 0PPR35510N.
7H3 74NK M4N'5 F4T3 R3M41N5 UNKN0WN...
F (;_;)7
(N0T3: 7H15 1Z 4 53N51T1V3 70P1C – D15CU5510N 1Z R357R1C73D 1N C3R741N C0UN7R13Z.)
that's my primary use case for LLMs: asking factual history from a stochastic word generator.
Didn't Grok start spouting literal Nazi propaganda because Musk had a temper tantrum?