• ljosifov 5 minutes ago

    Excellent. What were they waiting for up to now?? I thought they already trained on my data. I assume they train, even hope that they train, even when they say they don't. People that want to be data privacy maximalists - fine, don't use their data. But there are people out there (myself) that are on the opposite end of the spectrum, and we are mostly ignored by the companies. Companies just assume people only ever want to deny them their data.

    It annoys me greatly that I have no tick box on Google to tell them "go and adapt the models I use on my Gmail, Photos, Maps, etc." I don't want Google to ever be mistaken about where I live - I have told them 100 times already.

    This idea that "no one wants to share their data" is just assumed, and it permeates everything. Like the soft-ball interviews a popular science communicator did with DeepMind folks working in medicine: every question was prefixed by a litany of caveats, all about 1) people's assumed aversion to sharing their data and 2) the horrors and disasters that are to befall us should we share it. I have not suffered any horrors. I'm not aware of any major disasters. I am aware of major advances in medicine in my lifetime. Ultimately the process does involve controlled data collection and experimentation. Looks like a good deal to me, tbh. I go out of my way to tick all the NHS boxes too, to "use my data as you see fit". It's an uphill struggle. The defaults are always "deny everything". The tick boxes never go away; there is no master checkbox "use any and all of my data and never ask me again" to tick.

    • ardit33 a few seconds ago

      This is a problem for folks with sensitive data, and also for corporate users who don't want their data used for training due to all kinds of liability issues.

      I am sure they will have a corporate carve-out; otherwise it makes them unusable for some large corps.

    • AlecSchueler 3 hours ago

      Am I the only one that assumed everything was already being used for training?

      • hexage1814 an hour ago

        This. It's the same innocence as believing that when you delete a document on Google/Meta/Apple/Microsoft servers, it "really" gets deleted. Google most likely has a backup of every piece of information it has indexed in the last 20 years or so. It would make the Internet Archive envious.

        • giancarlostoro an hour ago

          With the privacy laws out there, I do genuinely think things eventually get purged, even from backups. I remember a really cool YouTube video shared here on HN that Google no longer has public. It walked through the life of an email and all the behind-the-scenes things, from physical security at a data center to the patented hard-drive shredders they use once drives are to be tossed. I wish Google had kept that video public and online; it was a great watch.

          I know that once you delete something on Discord it's poof, and that's the end of that. I've reported things that, if anyone at Discord could have accessed a copy, they would have called the police. There's a lot of awful trolls on chat platforms that post awful things.

          • diggan 42 minutes ago

            > I know that once you delete something on Discord it's poof, and that's the end of that. I've reported things that, if anyone at Discord could have accessed a copy, they would have called the police. There's a lot of awful trolls on chat platforms that post awful things.

            That's not what Discord themselves say. Is that coming from Discord, the police, or someone else?

            > Once you delete content, it will no longer be available to other users (though it may take some time to clear cached uploads). Deleted content will also be deleted from Discord’s systems, but we may retain content longer if we have a legal obligation to preserve it as described below. Public posts may also be retained for 180 days to two years for use by Discord as described in our Privacy Policy (for example, to help us train models that proactively detect content that violates our policies). - https://support.discord.com/hc/en-us/articles/5431812448791-...

            Seems like something decides whether the content gets deleted faster or kept for between 180 days and 2 years. So even for Discord, "once you delete something on Discord it's poof" isn't 100% accurate.

            • giancarlostoro 27 minutes ago

              At least in terms of reporting content to "Trust and Safety", they certainly behave like it's gone forever. I have had friends report illegal content, to both Discord and law enforcement, and the takeaway seemed to be that it was gone. Now it's making me wonder if Discord is really archiving CSAM material for two years and not helping law enforcement unless a proper warrant is involved. Yikes.

          • bwillard 42 minutes ago

            Officially (it's up to you whether you believe they follow their policies), all of these companies have published statements on how long they keep data after deletion (retention that customers broadly want, to support recovery if something goes wrong):

            - Google: active storage for "around 2 months from the time of deletion" and in backups "for up to 6 months": https://policies.google.com/technologies/retention?hl=en-US

            - Meta: 90 days: https://www.meta.com/help/quest/609965707113909/

            - Apple/iCloud: 30 days: https://support.apple.com/guide/icloud/delete-files-mm3b7fcd...

            - Microsoft: 30-180 days: https://learn.microsoft.com/en-us/compliance/assurance/assur...

            So if it turns out they are storing data longer than that, there can be consequences (GDPR, CCPA, FTC).

          • lemonberry 3 hours ago

            You are not.

            • A4ET8a8uTh0_v2 2 hours ago

              I mean, I am sure there are individuals who still believe in the basic value of one's word within the framework of our civilization. But having seen those words not just twisted beyond recognition to fit a specific idea, but simply ignored when they were no longer convenient, it would be no surprise that a cynical stance is now more common.

              The question is: how does that affect their choices? How much ends up being gated that previously would have ended up in the open?

              Me: I am using a local variant (and attempting to build something I think I can control better).

            • vb-8448 36 minutes ago

              So, I guess they've run out of data to train on ...

              I wonder how much they can rely on the data and what kind of "knowledge" they can extract. I never give feedback, and most of the time (let's say 5 out of 6) the result cc produces is simply wrong. How can they know whether the result is valuable or not?

              • binary132 22 minutes ago

                $COMPANY reneged on their solemn pinky promise to not do the bad thing this time? Quelle surprise!

                • macintux 3 hours ago
                  • diggan 3 hours ago

                    At least this submission has the original text Anthropic sent out to people :) But yeah, Perplexity gives a better summary for outsiders I guess.

                  • 0xbadc0de5 an hour ago

                    I kind of already assumed they were. I've got some pretty niche use-cases that I'd like to see the models get better at thinking their way through. I benefit from their training on my interactions. So I'll opt in. But I'll also recognize that others might not feel that way, so the services should provide a way for users to opt out.

                    • javier_e06 2 hours ago

                      I use AI to solve problems, not to check the weather or decide what to wear. As such, it makes sense for AI to remember when it hits the nail on the head.

                      • ath3nd 7 minutes ago

                        And if you solve a novel problem, Claude will happily take your reasoning and give it to the next user trying to solve the same novel problem. Imagine if that was a guy working for the competition :)

                        • leetbulb 2 hours ago

                          Agreed. Typically I would be against something like this, but in this case, have at it.

                          • AlexandrB an hour ago

                            How do you feel about this data being used to target advertising at you in the inevitable rush to monetize these AI products?

                            • christophilus 9 minutes ago

                              I feel like that’s annoying, but it’s a drop in the bucket vs the current firehose of ads, and there’s a slim shot these ads might actually be interesting or relevant to me.

                              Anyway, I’ll block them like I do everything.

                              • ath3nd 6 minutes ago

                                Oh, sweet summer child, your SOLUTIONS will be trained on, and will be given to others, but now that you bring up ads, I guarantee you that those will somehow be incorporated in Claude soon.

                        • homarp 2 hours ago
                          • ChrisArchitect 9 minutes ago
                            • esafak an hour ago

                              "If you use Claude for Work, via the API, or other services under our Commercial Terms or other Agreements, then these changes don't apply to you."

                              • flerchin an hour ago

                                Maybe there's value to users if it's done correctly. The way it is right now, you can't teach the model anything. When it gets something wrong, it will probably get the same thing wrong again in another chat.

                                • phallus an hour ago

                                  That's not how LLMs work.

                                • I_am_tiberius 2 hours ago

                                  Criminal, evil thieves.

                                  • wat10000 2 hours ago

                                    Rather misleading title. Missing the important “unless you ask them not to” part. Sounds like a bit of a dark pattern to push you into accepting it and that’s not cool, but you do get a choice.

                                    • ratg13 2 hours ago

                                      I can understand training AIs on books, and even internet forums, but I can't help but think that training an AI on lots of dumb questions, with probably an excessive number of grammar and spelling errors, won't somehow make it smarter.

                                      • nrclark an hour ago

                                        Depends on how you’re using the data. There’s a pretty strong correctness signal in the user behavior.

                                        Did they rephrase the question? Probably the first answer was wrong. Did the session end? Good chance the answer was acceptable. Did they ask follow-ups? What kind? Etc.
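
                                        The behavioral signals above could be sketched as a toy scoring heuristic. This is purely illustrative; the event names and weights below are invented for the sketch, not anything the vendors have described:

```python
# Toy sketch of inferring answer quality from user behavior.
# Event names and weights are invented for illustration only.

def score_answer(events: list[str]) -> float:
    """Return a rough acceptability score in [0, 1] for an answer,
    based on what the user did after receiving it."""
    score = 0.5  # neutral prior when there is no signal either way
    if "rephrased_question" in events:
        score -= 0.3  # asking again usually means the answer missed
    if "copied_code" in events:
        score += 0.3  # copying the output is a strong positive signal
    if "thumbs_down" in events:
        score -= 0.4  # explicit negative feedback
    if "session_ended" in events and "rephrased_question" not in events:
        score += 0.1  # a quiet exit is weak evidence of success (or a ragequit)
    return max(0.0, min(1.0, score))
```

                                        As the replies note, the "session ended" signal is ambiguous - it can mean satisfaction or a ragequit - which is why it would deserve only a small weight in any heuristic like this.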

                                        • vb-8448 30 minutes ago

                                          I'm used to doing the same task 4 or 5 times (different sessions, similar prompts), and most of the time the result is useless or completely wrong. Sometimes I go back and pick the first result, other times none of them, other times a mix of them. I'm wondering how they can extract value from this.

                                          • dudefeliciano an hour ago

                                            > Did the session end? Good chance the answer was acceptable.

                                            Or that the user just ragequit

                                          • mrweasel an hour ago

                                            They train AI on Reddit and Stack Overflow questions, I can't see it getting any worse.

                                            • dahsameer 2 hours ago

                                              > and even internet forums

                                              i would consider that internet forums also include a lot of dumb questions

                                              • ratg13 2 hours ago

                                                Agreed, but people generally take a small pause before saying stuff online.

                                                In 'private', people are less ashamed of their ignorance, and also know they can say gibberish and the AI will figure it out.

                                            • internet2000 2 hours ago

                                              I’m fine with that.

                                              • dudefeliciano an hour ago

                                                you are fine with paying 20, 90, or 200 euros a month AND having your data mined? i must be getting old...

                                              • hkon 2 hours ago

                                                With the amount of times Claude is visiting my websites I'd say they are very desperate for data.

                                                • SirFatty 2 hours ago

                                                  "going forward" ;-)

                                                  • gooob 2 hours ago

                                                    and now the LLM gets to observe itself, heh heh heh