• toddmorey 4 hours ago

    No way OpenAI will ever “good citizen” this. Tools to opt out of training sets will only come if they are legally compelled. Governments will have to make respecting some sort of training-preference header on public content mandatory, I think.
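
    For what it’s worth, the closest thing that exists today is a crawler directive rather than a header. OpenAI documents that its GPTBot crawler can be blocked via robots.txt, so a site can at least ask not to be scraped going forward:

        User-agent: GPTBot
        Disallow: /

    That does nothing for content already ingested, which is exactly the problem.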

    The fact that photographers have to individually submit each piece of work they want excluded, along with detailed descriptions, just shows how much they DON’T want anyone excluding content from their training data.

    • recursivecaveat an hour ago

      100%. Like most opt-outs, this exists as a checklist feature that proponents can point to in the hope of convincing bystanders. You muddy the waters by letting someone, with great effort, technically achieve the thing they want, maybe, for now, until you close it in 2 years and everyone says "well, that makes sense, nobody used that feature anyway".

      • dylan604 4 hours ago

        > The fact that photographers have to individually submit each piece of work they want excluded, along with detailed descriptions, just shows how much they DON’T want anyone excluding content from their training data.

        That's bloody brilliant. If you don't want us to scrape your content, please send us your content with all of the training data already provided so we will know not to scrape it if we come across it in the wild. FFS

        • nicbou an hour ago

          The tech industry’s understanding of consent is terrifying.

          • dylan604 an hour ago

            Understanding is a curious choice of words. I’d have gone with “total disregard”.

          • stonogo 24 minutes ago

            They learned from Google, which to this day requires you to suffix your Wi-Fi network name with _NOMAP if you do not want it to be used by their mapping services.

          • echelon 4 hours ago

            Insofar as data for diffusion / image / video models are concerned, the rise of synthetic data and data efficiency will mean that none of this really matters anyway. We were just in the bootstrapping phase.

            You can bolt on new functional modules and train them with very limited data you acquire from Unreal Engine or in the field.
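
            Very roughly, the pattern is: freeze a pretrained backbone and train only the small bolt-on module on the new data. A hand-wavy PyTorch sketch, where the backbone choice, the shapes, and the random stand-in dataset are all invented for illustration:

                import torch
                import torch.nn as nn
                from torch.utils.data import DataLoader, TensorDataset
                from torchvision.models import resnet18, ResNet18_Weights

                # Frozen pretrained backbone; only the bolt-on module learns.
                backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
                backbone.fc = nn.Identity()  # expose the 512-d features
                backbone.eval()              # freeze batch-norm statistics too
                for p in backbone.parameters():
                    p.requires_grad = False

                # The new "functional module", here a 10-way classifier head.
                adapter = nn.Sequential(
                    nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 10))

                # Stand-in for a tiny, high-quality dataset (e.g. UE renders).
                data = TensorDataset(
                    torch.randn(64, 3, 224, 224), torch.randint(0, 10, (64,)))

                opt = torch.optim.Adam(adapter.parameters(), lr=1e-3)
                loss_fn = nn.CrossEntropyLoss()
                for images, labels in DataLoader(data, batch_size=16):
                    loss = loss_fn(adapter(backbone(images)), labels)
                    opt.zero_grad()
                    loss.backward()
                    opt.step()

            The expensive, data-hungry part is already paid for; the new module needs comparatively little.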

            • toddmorey 4 hours ago

              I don’t entirely agree. For example, it’s a very popular scheme on Etsy right now to use image-generation models to churn out posters in the style of popular artists. Any artist should be able to say: hey, I don’t want my works to be part of your training set to power derivative generations.

              And I think it should even apply retroactively, so that they have to retrain models that are already generating works from training data consumed without permission. Of course, OpenAI would fight that tooth and nail, but they put themselves in this position with a clear “take first, ask permission later” mentality.

              • pj_mukh 2 hours ago

                Dumb question: why does Etsy allow clearly reproduced/copied works, AI or not?

                Like selling it for money seems like a clear line crossed, and Etsy is the perfect gatekeeper here.

                • jsheard an hour ago

                  Etsy stopped caring a while ago. It was supposed to be a marketplace specifically for selling handmade items, but they allowed it to be overrun with mass-produced tat dropshipped straight from the factory. Turning a blind eye to plagiarism, with or without AI, is just the next logical step from there.

                • tomrod 2 hours ago

                  Impossible to put the toothpaste back in the tube.

                  • forgetfreeman an hour ago

                    A motivated legislature with skilled enforcement personnel could get the toothpaste back in the tube in short order provided they displayed an anomalous insensitivity to making the money sad.

                    • botanical76 an hour ago

                      When has something like this ever happened? It feels like legislature exists to make money happy.

                      • Terr_ 37 minutes ago

                        One big example involves making a lot of plantation money sad, but I suppose that also involved a civil war.

                        • maeil 30 minutes ago

                          > When has something like this ever happened?

                          Anything that used to be freely available but no longer is. Once upon a time, laudanum (tincture of opium) was the OTC painkiller of choice. In slightly more recent times, there's asbestos. In certain locales, gambling. There are countries that have reined in lootboxes.

                          > It feels like legislature exists to make money happy.

                          Come on now, it doesn't just "feel" that way, you know for a fact that is indeed the purpose of the modern US legislature.

                    • llm_trw 2 hours ago

                      Should any artist be able to tell another artist: hey don't copy my work when you're learning, I don't want competition?

                      It seems like they are deeply upset someone has figured out a way for a machine to do what artists have been doing since time immemorial.

                      • aithrowawaycomm an hour ago

                        There are two major differences between art generators and human artists:

                        1) human artists are legal persons and capable of being held liable in civil court for copyright infringement; having a machine with no legal standing do the copyright infringement should be forbidden because it is difficult to detect, impossible to avoid, and a legal nightmare to unravel.

                        2) human artists are capable of understanding what flowers, Jesus on the cross, waterfalls, etc. actually are, whereas DALL-E is much dumber than a lizard and not capable of understanding these things, so using the verb "learning" to describe both is extremely misleading. Compared to a human brain, DALL-E is a statistical process barely more sophisticated than linear regression. It is plain wrong to say stuff like this:

                        > It seems like they are deeply upset someone has figured out a way for a machine to do what artists have been doing since time immemorial.

                        when nobody has even come close to figuring that out! If DALL-E worked like a human artist it would know what a bicycle is: https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_pr... But it doesn't. It is a plagiarism machine that knows how to match "bicycle" with millions of images carrying a "bicycle" tag, and uses statistics to smooth things together.

                        • zdragnar 2 hours ago

                          My mom said the teachers in her painting classes would have the students recreate works and were very clear on which artists had given permission for those derivative works to be sold. Others they could only admire at home.

                          The problem is not "when learning", the problem is "when distributing". Courts will determine whether or not disseminating or giving access to a model trained on protected works counts as distributing protected derivative works or not.

                          • Terr_ 34 minutes ago

                            > The problem is not "when learning", the problem is "when distributing".

                            Technically making a copy to bring home for your own use is also problematic, just much less likely to get you into trouble. (Still a step removed from learning the skills and technique of making a copy, however.)

                          • mitthrowaway2 2 hours ago

                            This analogy seems to be made every time this comes up on HN, but I don't think it really holds water. First of all, when a human artist learns from another, it's inherently a level playing field for competition; the junior and the senior are both human, and neither is going to be 1,000,000x more productive than the other. So the senior artist really doesn't have that much to worry about. And the senior artist recognizes that they themselves were once a junior and had to learn from their seniors, so it's a debt paid forward which results in more art. And becoming a master artist, or even an imitator, takes dedication and hard work even with lots of artwork to learn from, so that keeps competition to a certain level.

                            When it takes decades to develop an art style that a machine can copy in days, and then churn out derivative variations in seconds, it's no longer a level playing field. The machine can dramatically under-cut the artist who developed their style, much more than a copycat human artist could. This does become not just a threat to the livelihoods of artists, but also a disincentive to the development of new art styles.

                            In this case, patent law may be an apt comparison for the world we're entering. Patent law was developed with the idea in mind that it is a problem if a human competitor could simply take an invention, learn how it works, and then mass produce copies of it. There are several reasons for this, including creating an incentive for technology development, and also expediently transitioning IP to the public domain. But patents were added to the legal system basically because otherwise an inventor would not be on a level playing field with the competition, because it takes so many more resources to develop a new invention than to produce clones.

                            Existing IP law was built in a world where it was believed that machines were inherently incapable of learning and mass-producing new artistic works in styles learned from artists. It was not necessary to protect artists from junior artists learning how to work in their style, as long as it wasn't a forgery. But in a world of machine learning, perhaps we will decide it's reasonable to protect artists from machine copycats, just like we decided it was reasonable to protect technology inventors from human copycats.

                            The patent system is not the right implementation; it's expensive to file a patent, and you need skilled lawyers to determine novelty, infringement, and so on. But for art and machine learning, it might be much simpler: a mandatory compensation for artists' work used as training data. Something like this is sometimes used in the music industry to determine royalties for radio broadcasting, or to account for copies spread by file sharing.
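
                            To make that last idea concrete, here is a toy sketch of a levy pool split pro rata by training-set contribution, loosely the way performance-rights organizations split radio royalties. Every number and the allocation rule itself are invented for illustration:

                                # Hypothetical levy pool split by training-set contribution.
                                pool = 1_000_000.00  # total license fee collected from a model operator
                                works_used = {"artist_a": 1200, "artist_b": 300, "artist_c": 4500}

                                total = sum(works_used.values())
                                payouts = {artist: pool * n / total for artist, n in works_used.items()}
                                # artist_c contributed 4500 of 6000 works, so receives $750,000.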

                            • richardw an hour ago

                              Surely all this applies to code and the written word in general?

                              People allowed (and encouraged) read access to websites so Google would index and link. Now Google et al. summarise and even generate. All of that is built on our collective output. Surely everyone deserves a cut? The free-sharing licenses that were added to repos didn’t account for LLMs, so we should revisit them so that all creators get their dues, not just those who traditionally got paid.

                              • mitthrowaway2 24 minutes ago

                                (Yes, for what it's worth I agree with this!)

                            • JambalayaJimbo 2 hours ago

                              LLMs are not humans and shouldn’t be anthropomorphized as a strategy to get around copyright infringement.

                              • llm_trw 2 hours ago

                                That's an opinion that will be tested in a court soon enough.

                              • ncallaway 2 hours ago

                                When a human does it, it's fine.

                                When OpenAI's servers do it, it's copyright infringement.

                                We don't apply copyrights to human brains, but we do apply copyright to computer memory.

                            • simonw 3 hours ago

                              Has synthetic data become a big part of image/video models?

                              I understand why it's useful and popular for training LLMs, but I didn't think it was applicable to generative image/video work.

                              • llm_trw 2 hours ago

                                I haven't had the chance to train diffusion models, but for detection models synthetic data is absolutely how you get state-of-the-art performance now. You just need a relatively tiny, extremely high-quality dataset to bootstrap from.
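
                                The bootstrapping itself is often just cut-and-paste compositing: paste a small set of high-quality object cutouts onto varied backgrounds and you get effectively unlimited labeled boxes. A minimal sketch, where the paths and the single-class YOLO labels are invented for illustration:

                                    import random
                                    from pathlib import Path
                                    from PIL import Image

                                    OBJECTS = list(Path("cutouts").glob("*.png"))        # RGBA object crops
                                    BACKGROUNDS = list(Path("backgrounds").glob("*.jpg"))
                                    OUT = Path("synthetic")
                                    OUT.mkdir(exist_ok=True)

                                    for i in range(1000):
                                        bg = Image.open(random.choice(BACKGROUNDS)).convert("RGB")
                                        obj = Image.open(random.choice(OBJECTS)).convert("RGBA")
                                        scale = random.uniform(0.2, 0.6)  # random object size
                                        w, h = int(obj.width * scale), int(obj.height * scale)
                                        obj = obj.resize((w, h))
                                        # Assumes cutouts are smaller than backgrounds.
                                        x = random.randint(0, bg.width - w)
                                        y = random.randint(0, bg.height - h)
                                        bg.paste(obj, (x, y), obj)  # alpha-composite the cutout
                                        bg.save(OUT / f"{i}.jpg")
                                        # YOLO label: class x_center y_center width height, normalized.
                                        cx, cy = (x + w / 2) / bg.width, (y + h / 2) / bg.height
                                        (OUT / f"{i}.txt").write_text(
                                            f"0 {cx:.6f} {cy:.6f} {w / bg.width:.6f} {h / bg.height:.6f}\n")

                                A detector pretrained on composites like these then needs only a small amount of real data to close the domain gap.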

                              • toddmorey 4 hours ago

                                For clarity, I do agree that synthetic data is huge for training AI to do certain tasks or skills. But I don’t think creative work generation is powered by synthetic data, and it may not be for quite a while.

                                • numpad0 2 hours ago

                                  Isn't that just weird cope? I mean, why not just have an LLM automate Unreal Engine if that's the goal, and how isn't that itself going to get torpedoed by Epic?

                              • oraphalous 4 hours ago

                                I don't even understand why it's everyone else's problem to opt out.

                                Eventually, how many of these AI companies would a person have to track down opt-out processes for, just to protect their work from AI? That's crazy.

                                OpenAI should be contacting every single one and asking for permission - like everyone has to in order to use a person's work. How they are getting away with this is beyond me.

                                • munchler an hour ago

                                  Copyright doesn't prevent anyone from "using" a person's work. You can use copyrighted material all day long without a license or penalty. In particular, anyone is allowed to learn from copyrighted material by reading, hearing, or seeing it.

                                  Copyright is intended to prevent everyone from copying a person's work. That's a very different thing.

                                  • soared an hour ago

                                    There is an argument to be made that ChatGPT mildly rewording/misquoting info directly from my blog is copying.

                                    • amiantos 3 minutes ago

                                      I think to make that argument you would need evidence that someone prompted ChatGPT to reword/misquote info directly from your blog, at which point the argument would be that that person is rewording/misquoting info directly from your blog, not ChatGPT.

                                      • Aeolun 44 minutes ago

                                        And it is. And you can sue them for that. What you can’t do is get upset that they (or their AI) read it.

                                        • munchler an hour ago

                                          Sure, but that's a different claim and a different argument.

                                        • 23B1 33 minutes ago

                                          > Copyright doesn't prevent anyone from "using" a person's work.

                                          It should. The 'free and open internet' is finished because nobody is going to want to subject their IP to rampant laundering that makes someone else rich.

                                          Tragedy of the commons.

                                          • amiantos 2 minutes ago

                                            Under this mentality, every search engine index would be shut down.

                                            • munchler 9 minutes ago

                                              I can see this both ways. For the sake of argument, please explain why using IP to train an AI is evil, but using the same IP to train a human is good.

                                              Note that humans use someone else's IP to get rich all the time. E.g. Doctors reading medical textbooks.

                                          • griomnib 3 hours ago

                                            Napster had a moment too, but then they got steamrolled in court.

                                            Courts are slow, so it seems like nothing is happening, but there’s tons of cases in the pipeline.

                                            The media industry has forced many tech firms to bend the knee, OpenAI will follow suit. Nobody rips off Disney IP and lives to tell the tale.

                                            • tiahura 2 hours ago

                                              If your business model depends on the Roberts court kneecapping AI, pivot. Training does not constitute "copying" under copyright law because it involves the creation of intermediate, non-expressive data abstractions that do not reproduce or communicate the copyrighted work's original expression. This process aligns with fair use principles, as it is transformative, serves a distinct purpose (machine learning innovation), and does not usurp the market for the original work.

                                              • paranoidrobot an hour ago

                                                I believe there are some other issues other than just "is it transformative".

                                                I can't take an Andy Warhol painting, modify it in some way and then claim it's my own original work. I have some obligation to say "Yeah, I used a Warhol painting as the basis for it".

                                                Similarly, I can't take a sample of a Taylor Swift song and use it myself in my own music - I have to give Taylor credit, and probably some portion of the revenue too.

                                                There's also still the issue that some LLMs and (I believe) image-generation AI models have regurgitated works from their training data, in whole or in part.

                                                • mitthrowaway2 an hour ago

                                                  There was a time when it did not usurp the market for the original work, but as the technology improves and becomes more accessible, that seems to be changing.

                                                  • toddmorey 2 hours ago

                                                    In my experience, when existing laws allow an outcome that causes significant enough harm to groups with influence, the laws get changed.

                                                    • 23B1 32 minutes ago

                                                      > Training does not constitute "copying" under copyright law

                                                      It should.

                                                    • llm_trw 2 hours ago

                                                      And yet Mickey Mouse is in the public domain. Something those of us who remember the 90s thought would never happen.

                                                      • timcobb 2 hours ago

                                                        Just the oldest Mickey. They gave up on it because the cost/benefit wasn't deemed worth it anymore.

                                                    • paulcole 3 hours ago

                                                      > OpenAI should be contacting every single one and asking for permission - like everyone has to in order to use a person's work

                                                      This is the problem of thinking that everyone “has” to do something.

                                                      I assure you that I (and you) can use someone else’s work without asking for permission.

                                                      Will there be consequences? Perhaps.

                                                      Is the risk of the consequences enough to get me to ask for permission? Perhaps.

                                                      Am I a nice enough guy to feel like I should do the right thing and ask for permission? Perhaps.

                                                      Is everyone like me? No.

                                                      > How they are getting away with this is beyond me.

                                                      Is it really beyond you?

                                                      I think it’s pretty clear.

                                                      They’re powerful enough that the political will to hold them accountable is nonexistent.

                                                      • CamperBob2 3 hours ago

                                                        > I don't even understand why it's everyone else's problem to opt out.

                                                        Because the work being done, from the point of view of people who believe they are on the verge of creating AGI, is arguably more important than copyright.

                                                        Less controversially: if the courts determine that training an ML model is not fair use, then anyone who respects copyright law will end up with an uncompetitive model. As will anyone operating in a country where the laws force them to do so. So don't expect the large players to walk away without putting up a massive fight.

                                                        • SketchySeaBeast 3 hours ago

                                                          Of note here: the reason it's "important" is that it will make a shit-ton of money.

                                                          • CamperBob2 2 hours ago

                                                            That, coupled with the obvious ideological motivations. Success could alter the course of human history, maybe even for the better.

                                                            If you feel that what you're doing is that important, you're not going to let copyright law get in the way, and it would be silly to expect you to.

                                                            • SketchySeaBeast an hour ago

                                                              I can't say I believe that. If that were the case, they'd focus more on results and less on hyping up the next underwhelming generation.

                                                              • CamperBob2 an hour ago

                                                                For one thing, they are focused on money because they need lots of it to do what they're doing.

                                                                For another, the o1-pro (and presumably o3) models are not "underwhelming" except to those who haven't tried them, or those who have an axe to grind. Serious progress is being made at an impressive pace... but again, it isn't coming for free.

                                                              • 2muchcoffeeman 2 hours ago

                                                                Oh please. OpenAI and I guess every other AI company are for-profit.

                                                                The only change they are motivated by is their bank balances. If this were a less useful tool they’d still be motivated to ignore laws and exploit others.

                                                                • CamperBob2 an hour ago

                                                                  Hard to say what motivates them, from the outside looking in. There have been signs of cultlike behavior before, such as the way the rank and file instantly lined up behind Altman when he was fired. You don't see that at Boeing or Microsoft.

                                                                  Obviously it's a highly-commercial endeavor, which is why they are trying so hard to back away from the whole non-profit concept. But that's largely orthogonal to the question of whether they feel they are doing things for the benefit of humanity that are profound enough to justify blowing off copyright law.

                                                                  Especially given that only HN'ers are 100% certain that training a model is infringement. In the real world, this is not a settled question. Why worry about obeying laws that don't even exist yet?

                                                                  • 2muchcoffeeman 17 minutes ago

                                                                    >Especially given that only HN'ers are 100% certain that training a model is infringement. In the real world, this is not a settled question. Why worry about obeying laws that don't even exist yet?

                                                                    This is exactly why people are against it.

                                                                    Your argument is that there is no definitive law, and that therefore the wishes of the creators whose data you scrape for training are irrelevant.

                                                                    If the motivation were to help humanity, they’d think twice about stepping on the toes of the humanity they want to save, and we’d hear more about nontrivial uses.

                                                                    • maeil 27 minutes ago

                                                                      > Hard to say what motivates them, from the outside looking in.

                                                                      It isn't.

                                                                      > There have been signs of cultlike behavior before, such as the way the rank and file instantly lined up behind Altman when he was fired.

                                                                      This only reinforces that the real drive is money.

                                                          • griomnib 3 hours ago

                                                            I think it’s safe to assume anything Sam A says is an outright lie by now.

                                                            • maeil 23 minutes ago

                                                              It's depressing that this understanding hasn't been the status quo for years now. It's not like this is his first gig, it's been publicly verifiable what kind of person he is for ages, long before GPT became famous. You don't need to be part of some insider Silicon Valley cabal to find out.

                                                            • hnburnsy 4 hours ago

                                                              Maybe the task to implement it was scheduled by ChatGPT...

                                                              https://news.ycombinator.com/item?id=42716744

                                                              • Bilal_io 4 hours ago

                                                                Sorry the task failed for unknown reasons.

                                                              • econ an hour ago

                                                                In my mental imagery this is a situation that any advancing civilization in the universe should eventually run into. There will be all kinds of materials from the laborious and expensive to the effortless and "I was the first" or some other entitlement. It all boils down to having or not having such automatons. I'm sure there have been plenty who, like us with our books, have successfully denied progress. I'm also sure there have been plenty where it was completely obvious to upload the entire database of ET knowledge.

                                                                It is equally obvious what the later gained and the former lost in the process.

                                                                We, with our books, have successfully prevented people from educating themselves with amazing implications. Now the challenge is to create equally impotent machines!

                                                                You have no further questions:)

                                                                • dgfitz 4 hours ago

                                                                  Eventually the headline will be the first 2 words.

                                                                  The tech is neat, there is value in it in a sense, and LLMs are fun tech. But they are not going to invent AGI with LLMs.

                                                                  • wilg 4 hours ago

                                                                    who cares if they do it with LLMs or not? how do you define agi?

                                                                    • portaouflop 3 hours ago

                                                                      We have this discussion every minute -.-

                                                                      • mschuster91 3 hours ago

                                                                        > how do you define agi?

                                                                        An AI that has enough sense of self-awareness to not hallucinate and to recognize the borders of its knowledge on its own. That is fundamentally impossible with LLMs because, in the end, they are all next-token predictors, while humans are capable of a much more complex model of storing and associating information and context and, most importantly, of developing "mental models" from that information and context.

                                                                        And anyway, there are other tasks than text generation. Take autonomous driving for example - a driver of a car sees a person attempting to cross a street in front of them. A human can decide to slam the brake or the gas depending on the context - is the person crossing in front of the car some old granny on a walker or a soccer player? Or a human sees a ball being kicked into the air on the sidewalk behind some cars, with no humans visible. The human can infer "whoops, there might be children playing here, better slow down and be prepared for a child to suddenly step out onto the street from between the cars", but an object detection/classification model lacks the ability to even recognize the ball as a potentially relevant piece of context.

                                                                        • og_kalu 3 hours ago

                                                                          > Take autonomous driving for example - a driver of a car sees a person attempting to cross a street in front of them. A human can decide to slam the brake or the gas depending on the context - is the person crossing in front of the car some old granny on a walker or a soccer player? Or a human sees a ball being kicked into the air on the sidewalk behind some cars, with no humans visible. The human can infer "whoops, there might be children playing here, better slow down and be prepared for a child to suddenly step out onto the street from between the cars"

                                                                          These are just post-hoc rationalizations. No-one making those split-second decisions under those circumstances has those chains-of-thoughts. The brain doesn't 'think' that fast.

                                                                          > but an object detection/classification model lacks the ability to even recognize the ball as a potentially relevant piece of context.

                                                                          We're talking about LLMs right ? They can make these sort of inferences.

                                                                          https://wayve.ai/thinking/lingo-2-driving-with-language/

                                                                          • onemoresoop 2 hours ago

                                                                            You’re one of those who think the human brain is just an LLM?

                                                                            It could be possible to use LLMs to build a Rube Goldberg type of brain, or something that will mimic a human brain, but it will have the same flaws LLMs have and will never reach parity with humans. I think AGI is possible, but we’re too focused on LLMs to get there yet.

                                                                          • wilg 3 hours ago

                                                                            again i don't care whether its done with an LLM or not. there's no reason to think openai will only build LLMs. recognizing borders of its knowledge is a reasonable thing to include in an agi definition i suppose, but does not seem intractable.

                                                                            for the second one, ai drivers like tesla's current version are already skipping object detection/classification and instead use deep learning on the entire video frame, and could absolutely use the ball or any other context to change behavior, even without the particular internal monologue described there.

                                                                            • onemoresoop an hour ago

                                                                              I haven’t seen any new sparks of intelligence, but it remains to be seen what OpenAI does. So far I haven’t seen any paradigm shifts or indications that they’re not just scaling up and making their training corpus more vast. I could be wrong, but if they had something we’d know by now. Every ChatGPT release has been hyped up but somewhat disappointing to many. But what do I know...

                                                                            • PittleyDunkin 2 hours ago

                                                                              > An AI that has enough sense of self-awareness to not hallucinate

                                                                              It's not entirely clear that this is meaningful. Humans engage in confabulation, too.

                                                                              • onemoresoop 2 hours ago

                                                                                Humans engage in confabulation, but they’re mostly aware of it. In some mental disorders they may not be aware, though statistically that is not too significant. And no, we normally don’t confabulate as much as the current crop of AI, a.k.a. LLMs.

                                                                                As tools, LLMs are fantastic, and I am glad to look at them solely as powerful tools. AGI is not here yet, and maybe that’s a good thing. Who would want some kind of artificial intelligence that is capable of understanding us, capable of using psychological tricks on people, that could have goals different from ours, and so on?

                                                                            • dgfitz 4 hours ago

                                                                              … very carefully?

                                                                              • goatlover 4 hours ago

                                                                                Whatever makes OpenAI enough money?

                                                                            • DidYaWipe 3 hours ago

                                                                              Shocking news about the company that fraudulently left "open" in its name after ripping off donors.

                                                                              I think the headline is too generous here. More accurate would be "OpenAI neglects to deliver opt-out system..."

                                                                              • HeatrayEnjoyer 2 hours ago

                                                                                Sorry, who did they rip off?

                                                                                All their investors stand to profit handsomely (if they live).

                                                                                • hansvm 2 hours ago

                                                                                  They ripped off everyone they lied to. They took money under the premise that they'd put humanity first as this AI transition happened (both in safety and in knowledge sharing), and they instead used that money to build a moat that'll make it harder for anyone else to accomplish those same goals. Investors in the original vision would have been better off had they not contributed any funds, and the monetary profit they're receiving in exchange won't be enough to offset those damages (in the sense that it's not enough to fund somebody attempting to execute the same mission now that OpenAI exists, at least not with the same chance of success they anticipated when OpenAI was younger).

                                                                              • devit 3 hours ago

                                                                                Aren't lawsuits the proper way to address this?

                                                                                Seems like there's an argument that model weights are a derivative work of the training data, at least if the model is capable of producing output that would be ruled to be such a derivative work given minimal prompting.

                                                                                Although it may not work with photography, since the model might almost exclusively learn how the subject of a photo looks in general and how photos work in general, rather than memorizing anything about specific photos.

                                                                                • monomyth an hour ago

                                                                                  this is as retarded as asking someone to forget a picture they have seen

                                                                                  • Terr_ 4 hours ago

                                                                                    "By continuing, you agree that using any content from this site in training Generative AI grants the site-owner a perpetual, irrevocable, and royalty-free license to use and re-license any and all output created by that Generative AI system, including but not limited to derivative works based on that output."

                                                                                    Just a GPL-esque idea I've been musing on lately [0]; I'd appreciate any feedback from actual IP lawyers. The idea is to add a poison pill: if a company "steals" your content for its own profit, you can strike back by making it very hard for them to actually monetize the results. Since it's a kind of contract, it doesn't rely on how much of your work seems to be surfacing in a particular output.

                                                                                    So supposing ArtTheft Inc. snarfs up Jane Doe's paintings from her blog, she, or any similar victim, can declare that they grant the world an almost-public license to anything ArtTheft Inc. has made based on that model. If this happens, ArtTheft Inc. could still make some money selling physical prints, but anyone else could undercut them with free or cheaper copies.

                                                                                    [0] https://news.ycombinator.com/item?id=42582615

                                                                                    • allsummer 4 hours ago

                                                                                      One should be able to opt out of AI training, but then testing the AI on that content should also become impossible. Otherwise you are freeloading just as much as you accuse the AI companies of doing.

                                                                                      • thrance 4 hours ago

                                                                                        Another of these daily reminders that we live in a two-tiered justice system: everything you ever created is fair game to them, but don't you dare use a leak of their weights lest you be thrown in jail.

                                                                                        • jsheard 4 hours ago

                                                                                          According to OpenAI you're not even allowed to use GPT output to train a competing model, so they believe that AI models are the only thing worthy of protection from being trained on. Llama used to have a similar clause, which was partially walked back to "you must credit us if you train on Llama output" in later versions, but that's still a double standard, since they don't credit anything that Llama was trained on. And of course we now know that Zuck personally greenlit feeding it pirated books.

                                                                                          • umeshunni 3 hours ago

                                                                                            Well, that hasn't stopped DeepSeek.

                                                                                            • pton_xd 24 minutes ago

                                                                                              Honestly, good for them. This whole "we can use your output for our input, but don't even think about doing the same" is just absurd.

                                                                                        • testfrequency 3 hours ago

                                                                                          Probably means nothing, but the people I know who went to OpenAI and are still there are all people who made very poor business decisions and were hated at multiple companies I worked for.

                                                                                          High doubt any of them will be good stewards of anything but selfishness.

                                                                                          As for the others, they were all smart, passionate, dedicated folks who knew Sam was a complete narcissist and left to start their own AI startups.

                                                                                          (sorry mods, I’m upset and I’m annoyed OpenAI is getting away with murder of society in plain view)

                                                                                          • dadbod 4 hours ago

                                                                                            The tool was called "Media Manager" LMFAO. A name so uninspired it perfectly reflected how little they cared.

                                                                                            • grajaganDev 4 hours ago

                                                                                              LOL - it sounds like something from 90's era Microsoft.

                                                                                            • 9283409232 4 hours ago

                                                                                              People need to understand that these companies are not good actors and will not let you opt out unless forced. I have a $20 bet with a friend that Trump's admin will get training data classified as fair use, and the whole issue will be done away with anyway.

                                                                                              • dylan604 4 hours ago

                                                                                                Apparently, Trump has a lot of training data stored in a bathroom, so there's that

                                                                                              • Der_Einzige 2 hours ago

                                                                                                Good.

                                                                                                Everyone gets big mad when someone with money acts like Aaron Swartz did. The only bad thing about OpenAI is that they're not actually open-sourcing or open-accessing their stuff. Mistral or Llama "training on pirated material" is literally a feature, not a bug, and the tears from all the artists and others who get mad are delicious. These same artists would profess literal radical Marxism but become capitalist Luddite copyright trolls the moment the means of intellectual production are democratized against their will.

                                                                                                If you posted something on the internet, I can and will put it into ipadapter and take your style and use it for my own interests. You cannot stop me except by not posting it where I can access it. That is the burden of posting anything on the public internet. You opt out by not doing it.

                                                                                                • passwordoops 5 hours ago

                                                                                                  Give it a month and they should have no problem deploying their inevitable AGI to deliver the opt-out system, right? /S