• ErikBjare 16 hours ago

    I wonder if copyright holders would be okay if Meta had legitimately acquired a copy of each copyrighted work they trained on. (not saying the issue ends there)

    • catlikesshrimp a day ago

      Clickbait. Real title:

      "Meta Secretly Trained Its AI on a Notorious Piracy Database, Newly Unredacted Court Docs Reveal"

      Which is "old news"

      Disable JavaScript to read, or use: https://archive.ph/Kp29q

      • jazzyjackson a day ago

        Title of the article vs a salacious quote (newly revealed in court documents) doesn’t reach the threshold of clickbait IMO. Meta engineers really were torrenting libgen with the CEO’s approval (this is the part that’s not old news).

        Previously it was just “books3 was part of the training data”; now it’s “MZ was made aware of pirated materials, gave the go-ahead, and by way of torrenting, Meta engineers redistributed the copyrighted materials”, which is outlawed whether or not you’re training a superintelligence.

        Personally I’m not a fan of enforcing copyright law in general, but I’m especially not a fan of corporations getting to skirt laws that the little people are made to obey. If Meta wants to train on libgen, they should have partnered with the Internet Archive and provided them better lawyers.

        • throw5959 a day ago

          You don't know whether they seeded anything. Or anything recognizably copyrightable.

          • jazzyjackson a day ago

            Fair, when someone says torrenting I assume bidirectionality, but they may have blocked outgoing packets in order to comply with some interpretation of the law.

            Given that LLaMA was originally “leaked” via torrent, I have this assumption that Meta folks are pirates in spirit tho, and wouldn’t leech without being told explicitly. But then, being told not to upload would be legally perilous too, since it would hint that they were aware of the illegality. Meta’s defense here seems to be “Officer I swear I didn’t know that wasn’t allowed”, testing the legal theory of transforming copyrighted work.

        • ilrwbwrkhv a day ago

          I thought it was common knowledge that every single AI company trained on pirated datasets.

          • magic_smoke_ee 20 hours ago

            I suspect MAANG + OpenAI will simply pay lobbyists to make it legal. They have huge piles of money and money makes the law in America.

            • hdjjhhvvhga 13 hours ago

              Common knowledge - yes, but not proved in court.

            • metalman a day ago

              getting to watch someone steal a cake and eat it too