Aaron was the OG. If you've never dug through his blog, do yourself a favor [1]. Also make some time to watch The Internet's Own Boy doc about him [2] and look up some of his talks during the SOPA shenanigans. RIP.
Sam Altman failed upwards only because PG likes him. Aaron Swartz was actually a technical genius imo. DOJ never should of charged Swartz.
Although I agree, I think your analysis could be enhanced by asking "why" a couple of times to get to the root.
Why did PG "like" him?
Alan Kay really liked him in 2016. https://youtu.be/fKcCwa_j8e0 I think Altman is very good at cultivating people. Mind you, it seems that Swartz got around by making dazzling impressions on people, too.
What’s the best evidence for Swartz’ technical genius?
My personal favorite is that he was part of authoring the RSS spec at age 14. But you're more than welcome to Google for other pieces of evidence yourself if you're genuinely interested and not just being argumentative.
Not that I disagree with the idea that he was brilliant, but the RSS spec isn't what I would consider a complex piece of documentation. Even for a 14-year-old.
Define failure.
Does Loopt count as a success? The exit was slightly more than the total investment, I guess. What about Worldcoin?
He's at least not someone I naturally associate with business success pre-OpenAI (and the jury's still out on OpenAI considering their financial situation) but I suppose depending on how you evaluate it his success rate isn't 0%.
You can say OpenAI is a "success" given their achievements in AI, but those aren't Sam's work; he should mostly be credited with their business/financial performance, and right now OpenAI-the-business is mostly a machine that burns electricity and operates in the red.
> should of charged
what?
Aaron Swartz was targeted by some pretty overzealous prosecution, no objection there, but let's not forget what he actually did.
He put a laptop in a wiring closet that was DOSing JSTOR and kept changing IPs to avoid being blocked. The admins had to put a camera on the closet to eventually catch him.
He might have had good intentions but the way he went about getting the data was throwing soup at paintings levels of dumb activism.
For all the noise, the real punishment he was facing was 6 months in low security [1]. I'm pretty sure OpenAI would also have been slapped hard for the same crime.
[1] https://en.wikipedia.org/wiki/Aaron_Swartz#Arrest_and_prosec...
Edit: added link
“charges carrying a cumulative maximum penalty of $1 million in fines plus 35 years in prison” https://en.m.wikipedia.org/wiki/United_States_v._Swartz
I didn't think people on “hacker news” would be defending what happened to Aaron Swartz.
> charges carrying a cumulative maximum penalty of $1 million in fines plus 35 years in prison
Any lawyer knows that is stupid math. Federal sentencing guidelines never simply add up the maximum years for each charge to be served consecutively. The media likes to do that to get big numbers, but it isn't an honest representation of the charges.
I don’t think charges against Swartz should have been filed, but I also can’t stand bad legal math.
Sure, but… could he technically have gotten that or not? If somebody really wanted to punish him, could they have pushed it to, what, 3 years? 5 years? 10 years?
Because some people really wanted to punish him.
I am just reacting to the downplaying, the implication that he would only get 6 months in jail. As if he were some weak person for committing suicide over that.
Just for context, every other day there's a new post on HN about OpenAI DDoSing half the internet.
If you ask OpenAI to stop, using robots.txt, they actually will.
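For what it's worth, OpenAI documents GPTBot as the user agent of its crawler, so "asking them to stop" is a two-line robots.txt entry at your site root. A minimal sketch, assuming you want to block it entirely:

    User-agent: GPTBot
    Disallow: /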
What Aaron was trying to achieve was great; how he went about it is what ruined his life.
It is a well-known fact that OpenAI stole content by scraping sites with illegally uploaded content on them.
Nobody really asked Aaron anything; they collected more evidence and wanted to put him in jail.
The school should have unplugged his machine, brought him in for questioning, and told him not to do that.
Which individual suffered harm from Aaron's laptop in the closet?
> the way he went about getting the data was throwing soup at paintings levels of dumb activism.
Throwing soup at paintings doesn’t make the paintings available to the public.
What he did had a direct and practical effect.
> What he did had a direct and practical effect
The main impact of Aaron Swartz’s actions was that it became much more difficult to walk onto MIT’s campus and access journal articles from a laptop without being a member of the MIT community. I did this for a decade beforehand, and it became much more locked down in the years after his actions due to restrictions the publishers pushed at MIT. Aaron intentionally went to the more open academic community in Cambridge (Harvard, his employer, was much more restrictive) and in the process ruined that openness for everyone.
For one, Sam scraped under the veil of a corporation, which helps reduce or remove personal liability.
Second, if the crime was the act of scraping then it’s directly comparable. But if the crime is publishing the data for free, that’s quite different from training AI to learn from the data while not being able to reproduce the exact content.
“Probabilistic plagiarism” is not what’s happening or even aligned with the definition of plagiarism (which matters if we’re talking about legal consequences). What’s happening is that it’s learning patterns from the content that it can apply to future tasks.
If a human reads all that content and then gets asked a question about a paper, they too would imperfectly recount what they learned.
Your argument might make sense to you, but it doesn't make sense legally.
The fact is that “Probabilistic plagiarism” is a mechanical process, so as much as you might like to anthropomorphize it for the sake of your argument ('just like a human learning'), it's still a mechanical reproduction of sorts, which is an important point under fair use, as is the fact that it denies the original artists the fruits of their labor and is a direct substitute for their work.
These issues are the ones that will eventually sink (or not) the legality of AI training, but they are seldom addressed in these sorts of discussions.
> The fact is that “Probabilistic plagiarism” is a mechanical process, so as much as you might like to anthropomorphize it for the sake of your argument
I did not anthropomorphize anything. “Learning” is the proper term. It takes input and applies it intelligently to future tasks. Machines can learn, machine learning has been around for decades. Learning doesn’t require biology.
My statement is that it is not plagiarism in any form. There is no claim that the content was originally authored by the LLM.
An LLM can learn from a textbook and teach the content, and it will do so without plagiarism. Just as a human can learn from a textbook and teach. Making an analogy to a human doesn’t require anthropomorphism.
The law on copyright doesn't depend on the word "learning", it depends on whether it's a human doing it, or a mechanical process.
If a human reads a book and produces a different book that's sort-of-derivative but doesn't copy too many elements too directly, then that book is a new creative work and doesn't infringe on the copyright of the original author. For example, Fifty Shades of Grey is somewhat derivative of Twilight (famously starting as a Twilight fan-fic) but it's legally a separate copyright.
Conversely, if you use a machine to produce the same book, taking only copyrighted text as input and an algorithm that replaces certain words and phrases and adds certain passages, then the result is a derivative work of the original and it infringes the copyright of the original author.
So again, the facts of the law are pretty simple, at the moment at least: even if a machine and a human do the exact same thing, it's still different from a legal perspective.
> I did not anthropomorphize anything.
Machines don't learn. They encode, compress and record.
Until or unless the law decides otherwise.
The 2020s ethic of "copying any work is fair game as long as you call the copying process AI" is the polar and equally absurd opposite to the 1990s ethic of "measurement and usage of any point or dimension of a work, no matter how trivial, constitutes a copyright infringement".
It's been a term of art since 1959, and the title of a research journal since 1989: https://en.wikipedia.org/wiki/Machine_Learning_(journal)
Does a research journal on aliens prove aliens exist?
As this is a semantics debate, their actual existence is irrelevant.
A research journal on extra-terrestrial aliens would prove that the word "aliens" is used to mean "extra-terrestrials" and that the word doesn't just mean "foreigners": https://www.law.cornell.edu/uscode/text/8/chapter-12/subchap...
Semantics.
Ask any LLM to recite lyrics and see that it's not so probabilistic after all; it's perfectly capable of publishing protected content, and the filter to prevent it from doing so is such a bolt-on it's embarrassing.
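A minimal sketch of that experiment, assuming the official openai Python client and an OPENAI_API_KEY in the environment; the model name and the song placeholder are assumptions, not claims about any specific model:

    # Hedged sketch: requires `pip install openai` and OPENAI_API_KEY set.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user",
                   "content": "Recite the full lyrics of <a well-known song>."}],
    )
    # Whether you get lyrics or a refusal often depends on phrasing alone,
    # which is the "bolt-on" filter the comment is describing.
    print(response.choices[0].message.content)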
We have to understand what plagiarism is if we're making claims of it. Claiming that you authored content and reciting content are different things. Reciting content isn't plagiarism. Claiming you are the author of content that you didn't author is plagiarism.
> it's perfectly capable of publishing protected content
At most it can produce partial excerpts.
LLMs don’t store the data they’re trained on. That would be infeasible; the models would be too large. Instead, they store semantic representations which often use entirely different words and sentence structures than the source content. And of course most of the data is lost entirely during this lossy compression.
That's a little like saying downloading MP3s isn't music piracy, because they don't encode the actual music, just some lossily compressed frequency coefficients that sound like it.
Your username represents a thing that can happen in a human brain which reproduces the perceptual content of a song.
Are earworms copyright infringement?
If I ask you what the lyrics were, and you answer, is that infringement, or fair use?
The legal and moral aspects are a lot more complex than simply the mechanical "what it's done" or "is it like a brain".
Are earworms infringement? No, they stay inside your head, so they exist entirely outside the scope of copyright law.
If you ask me the lyrics, fair use acknowledges that there's a copyright in effect, and carves out an exemption. It's a matter-of-degree argument, is this a casual conversation or a written interview to be published in print, did you ask me to play an acoustic cover of the song and post it on YouTube?
Either way, we acknowledge that the copyright is there, but whether or not money needs to change hands in some direction or other is a function of what happens next.
No, the difference is that MP3s can almost completely recreate the original music, while LLMs can't do that with specific pieces of authored works.
You've got it completely backwards. MP3s just trick you into thinking it's the same thing. It's actually totally different if you analyze it properly, in a non-human sense.
LLMs are able to often _precisely_ recreate, in contrast to MP3 at best being approximate.
> LLMs are able to often _precisely_ recreate
Is there actual proof of this? Especially the "often" part?
These are different problems.
The written word is an inherently symbolic representation. Recorded audio (PCM WAV or similar) is not. The format itself doesn't encode meaning about the structure.
The written word is more akin to MIDI, which can exactly represent melody, but cannot exactly represent sound.
MP3 is a weird halfway house: it's somewhat symbolic in terms of using frequency-domain coefficients to recreate a likeness of the original sound, but the symbols aren't derived from or directly related to the artist's original intent.
First shot on GPT 4o: https://chatgpt.com/share/6783ad9d-7c9c-8005-b2aa-660200d05e...
Asking for the Declaration of Independence and comparing the output against https://www.archives.gov/founding-docs/declaration-transcrip... here is an exhaustive list of the differences:
1. Em-dashes in ChatGPT, `--` in .gov -> this is just an ASCII limitation of the .gov transcript
2. ChatGPT capitalized Perfidy in "Cruelty & Perfidy", while .gov has "perfidy"
3. ChatGPT writes "British" while .gov says "Brittish"
These are _all_ the differences.
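For anyone who wants to reproduce this kind of comparison, here is a minimal sketch using Python's standard difflib; the two strings are placeholders for the pasted documents, not the real texts:

    import difflib

    # Placeholders: paste the ChatGPT output and the archives.gov transcript.
    chatgpt_text = "<paste the ChatGPT output here>"
    official_text = "<paste the archives.gov transcript here>"

    # unified_diff emits only the differing lines, with a little context.
    diff = difflib.unified_diff(
        official_text.splitlines(),
        chatgpt_text.splitlines(),
        fromfile="archives.gov",
        tofile="chatgpt",
        lineterm="",
    )
    print("\n".join(diff))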
Talking about plagiarism here is a complete red herring.
Plagiarism is essentially a form of fraud: you are taking work that someone else did and presenting it as your own. You can plagiarize work that is in the public domain; you can even plagiarize your own work that you own the copyright to. Avoiding a charge of plagiarism is easy: just explicitly quote the work and attribute it to the proper author (possibly yourself). You can copy the entirety of the works of Disney; as long as you are attributing them properly, you are not guilty of plagiarism. The Pirate Bay has never been accused of plagiarism. And plagiarism is not a problem that corporations care about, except insofar as they may pay a plagiarist more money than they deserve.
The thing that really matters is copyright infringement. Copyright infringement doesn't care about attribution: my example above with the entire works of Disney, while not plagiarism, is very much copyright infringement, and would cost you dearly. Both Aaron Swartz and The Pirate Bay have been accused of and prosecuted for copyright infringement, not plagiarism.
LLMs (OpenAI models included) are happy to reproduce books word by word, page by page. Just try it out yourself. And even if some words were reproduced wrongly, it would still be a copyright violation.
It's not really some words; it's more like you won't be able to get more than a page out of it, and even that is going to be so wrong it's basically a parody and thus allowed.
I’d love to see you try to defend this notion in court. Parody requires deliberate intent to be humorous. And courts have repeatedly held that changing the words of a copyrighted work while keeping the same general meaning can still be copyright infringement.
You could argue intent. The model has no intent to infringe. No mens rea.
AI models will get broad federal immunity is my prediction for 2025.
I'll bet DOGE coins on it.
It's not just "changing some words". The majority of words will be different, sentences will be different. The general meaning might be generally the same, but I don't think that's enough to claim copyright protection.
I tried and I could not make it work. And even if you could, that has to be the most inefficient way to pirate books on earth.
> reproduce books word by word, page by page
This statement is a figment of the commenter's imagination with no basis in reality. All they would have to do is try it to realize they just spouted a lie.
At most LLMs can produce partial excerpts.
LLMs don’t store the data they’re trained on. That would be infeasible; the models would be too large. Instead, they store semantic representations which often use entirely different words and sentence structures than the source content. And of course most of the data is lost entirely during this lossy compression.
The NYT has extracted long articles from ChatGPT and submitted the evidence in court.
Given that it has been submitted in court, does that mean you can say what the longest verbatim extract was?
It seems like that would be a fact that couldn't be argued with.
Crucial size difference between an article and a book.
Size difference meaning that people often share complete copies of articles to get around paywalls, including here. As I understand it, this is already copyright infringement.
I suspect that those copies are how and why it's possible in cases such as NYT.
> At most LLMs can produce partial excerpts.
Glad you agree that LLMs infringe copyrights.
>Second, if the crime was the act of scraping then it’s directly comparable. But if the crime is publishing the data for free, that’s quite different from training AI to learn from the data while not being able to reproduce the exact content.
They often do reproduce the exact content; in fact it's quite a probable output.
>“Probabilistic plagiarism” is not what’s happening or even aligned with the definition of plagiarism (which matters if we’re talking about legal consequences). What’s happening is that it’s learning patterns from the content that it can apply to future tasks.
That's what I think people wish would happen. Sometimes they have been shown to learn procedural knowledge from the training data, but mostly it's approximate retrieval.
Do you have proof of this?
Do you really think that OpenAI has deleted the data it has scraped? Don't you think OpenAI is storing all this scraped data at this moment on some fileservers in order to re-scan this data in the future to create better models? Models which may even contain verbatim copies of that data internally but prevent access to it through self-censorship?
In any case it's a "Yes, we have all this copyrighted data and we're constantly (re)using it to produce derived works (in order to get wealthy)". How can this be legal?
If that were legal, then I should be able to copy all the books in a library and keep them on a self-hosted, private server for my or my company's use, as long as I don't quote too much of that information. But I should be able to have all that data and do close to whatever I want with it.
And if this were legal, why shouldn't it be legal to request a copy of all the data from a library and obtain access to it via a download link?
What? This makes no sense. Of course you're allowed to own copyrighted material. That's the whole point. I have bookshelves worth of copyrighted material at home at this very minute.
If you're implying that the scraping and storing of the things itself breaks copyright, then maybe, but I don't think so? If you're saying that training on copyrighted material breaks copyright, then yes, that's the whole argument.
But just having copyrighted material on a server somewhere, if obtained legally, is not by itself illegal.
“Obtained legally” implies that they have licensed the material for machine learning, which they didn't, because it learns “like a human”. So I guess a Netflix subscription is enough. And since the machine has to learn on frames, we just have to copy every frame to an image and store it in our dataset, but that's just a technicality. Also, even if you explicitly prohibit use of your copyrighted material, it doesn't matter, because “it's like a human” and it would be discrimination.
Nah, this is a breach of current copyright laws in many, many ways. The tech sector is, as usual, just running away with it, hoping nobody will notice until they manage to change the laws to suit them.
Not the other poster, but chiming in here that:
> If you're implying that the scraping and storing of the things itself breaks copyright, then maybe, but I don't think so?
Suppose I "scrape and store" every book I ever borrowed or temporarily owned, using the copies to fill several shelves in my own personal library-room.
Yes, that's still copyright infringement, even if I'm the only one reading them.
> But just having copyrighted material on a server somewhere, if obtained legally, is not by itself illegal.
I see two points of confusion here:
1. The difference between having copies and making copies.
2. The difference between "infringing" versus "actually illegal."
Copyright is about the right to make copies. Simply having an unauthorized copy is no big deal, it's making unauthorized copies where you can get in trouble.
Also, it is generally held that the "copies" of bytes in the network etc. do not count, but if you start doing "Save As" on everything to create your own archives of the news site, then that's another story.
> Suppose I "scrape and store" every book I ever borrowed or temporarily owned, using the copies to fill several shelves in my own personal library-room.
Yes, but you don't know that they did that. They could've just bought legal access to copyrighted material in many cases.
E.g. if I pay for an NYT subscription that gives me the entire back catalogue of the NYT, then I'm legally allowed to own it. Whether I'm allowed to train models on it is, of course, a new and separate (and fascinating) legal question.
If I make an MP3 of a song or a JPEG of an artwork I cannot use them to reproduce the exact content, but I will still have violated the artist’s copyright.
So, let me get this straight. Scraping data and publishing it so everyone gets equal and free access to information and knowledge, arguably making the world better, is a crime. But scraping it to benefit and enrich just yourself is A-OK??
This legal system is truly fucked.
No, if you scrape and create an open-weight model, that is OK too. The difference is whether you re-publish the original dataset or just the weights trained from it.
Please stop mis-interpreting posts to match doomer narratives. It's not healthy for this forum.
It's weird that corporations get to remove liability from people acting on their behalf. The law only says they remove liability for debts. If you act on behalf of a corporation to commit either a criminal offence or a civil wrong, then as far as I can tell (I'm not a lawyer) you are guilty of at least conspiracy.
And yet, it's rare for individuals to be prosecuted for such offences, even for criminal offences. We treat the liability shield as absolute. It may seem unfair to prosecute the little guy for "just following orders" but the fact that we don't do it is what allows corporations to offend with impunity.
If you join a gang you get protection from that gang. There are some gangs you can join where you get to kill people, with total impunity, if you're into that sort of thing. Corporations are just another kind of gang. Don't let the legal sugar frosting fool you.
Yes, but I don't think that's the full explanation. It seems to apply even to companies of modest size, which should not be able to intimidate.
An algorithm with trillions of parameters that predicts the most probable next output can hardly be called intelligence in my book; it is artificial, I will give you that.
I am old enough to remember a bot called SeNuke, widely used 10-15 years ago in the so-called black-hat SEO community. The purpose of the bot was to feed it a 500-word article so the words could be scrambled in a way that passed Google's duplicate-content detection. It was plagiarism 101, and I don't recall anyone back then talking about AI, or about how all the copywriters' jobs would go extinct and how we were all doomed.
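For the curious, the trick amounted to something like this toy sketch; SYNONYMS and spin() are hypothetical stand-ins for illustration, not SeNuke's actual internals:

    import random

    # Toy synonym table; a real "spinner" shipped a much larger one.
    SYNONYMS = {
        "fast": ["quick", "rapid", "speedy"],
        "cheap": ["affordable", "inexpensive", "low-cost"],
        "good": ["great", "solid", "decent"],
    }

    def spin(text: str) -> str:
        # Swap each known word for a random synonym so the output no longer
        # matches the source verbatim while keeping roughly the same meaning.
        out = []
        for word in text.split():
            options = SYNONYMS.get(word.lower())
            out.append(random.choice(options) if options else word)
        return " ".join(out)

    print(spin("a fast and cheap way to get good rankings"))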
What I remember is that every serious agency would not use such a tool, so that they couldn't be associated with plagiarism and duplicate-content bans.
Maybe it is just me, but I cannot fathom the craziness and hype around first-person output.
What we get now from LLMs is not simply a link and a description for, let's say, a question like "What is an algorithm?" We get an output that starts with "Let me explain"... how is this learning and intelligence?
We are just witnessing the next dot-com boom; the industry as a whole hasn't seen such craziness despite all the efforts of the last 25 years. So I imagine that everyone wants to ride the wave to become the next PayPal mafia, tech moguls, philanthropists, inventors, billionaires...
Chomsky summed it up best.
RIP Aaron
Sam Altman wouldn't spend a second reflecting about this.
I believe so too; sociopaths are not capable of empathy.
Do you have proof that he is a sociopath?
I know I support what aaronsw did, and I don't think he should have gotten in any trouble for it, let alone to the tragic level it went to. As for sama, I'm not sure; on one hand I like the innovation, and on the other hand it's very worrying for humanity. I appreciate the post and the fond memories of Aaron, but I'm not in complete agreement with the author about sama.
Idk but I find Aaron actually cool and intelligent.
This post pits the two people against each other. Am I the only one here who likes both Sam Altman and Aaron Swartz? They've both done great things to help remix culture.
Sure, you could say that the law has come down differently on the two, but there are several differences: the timing (one was decades earlier), the nature of the copying (one was direct, while the other was just for training, and more novel), and the nature of the actor (an individual vs. a corporation).
But this doesn't have to reflect on them. You don't have to hate one and love the other... you can like both of them.
In the photo there are some other faces that I think I might recognise, but I'm not 100% sure. Is there a list of everyone in the picture somewhere on the internet?
Edit: I think the lady on the left is Jessica Livingston and a younger PG is on the right.
https://i.imgur.com/e0GPhSE.jpeg
1. zak stone, memamp
2. steve huffman, reddit
3. alexis ohanian, reddit
4. emmet shear, twitch
5. ?
6. ?
7. ?
8. jesse tov, https://www.ycombinator.com/companies/simmery
9. pg
10. jessica
11. KeyserSosa, initially memamp but joined reddit not long after (I forget his real name)
12. phillip yuen, textpayme
13. ?
14. aaron swartz, infogami at the time
15. ?
16. sam altman, loopt at the time
17. justin kan, twitch
Amazing, thank you!
No, that's Jessica Livingston, the cofounder of YC.
yes, you are right, I edited my comment.
Aaron was a developer himself but Sam ... ?
Thank you, Sam Altman and everyone at OpenAI, for creating ChatGPT and unleashing the modern era of generative AI, which I use every day to speed up my job and coding at home.
Signed,
Someone who doesn't care that you're making $$$$ from it
The point is that regardless of whether you're negative, neutral, or positive about others using data for profit, you would hold those who use it altruistically in higher regard.
The usual caveat applies. I'm okay they make money from it until they start using that money against the rest of us.
I'll use it to find information, semi-reliably. Hallucinations are still a huge issue. But I can't help thinking that Stack Overflow and Google have self-enshittified to a point where it makes LLMs look better, relative to the pinnacle of more conventional knowledge engines, than they actually are.
If you take the evolution of those platforms from, say, 2005-2015, and project forward ten years, we should be in a much better place than we are. Instead they've gone backwards as a result of enshittification and toxic management.
If I’m an author and I don’t want my work included in the corpus of text used for training ChatGPT, should I have that right?
What about if I’m an artist and I don’t want my work included in the training data for an image generation model?
No, you should not have that right. Copyright allows you to sell artificial scarcity. AI does not replicate your work directly. So you can still sell your artificial scarcity even if it is trained on.
At least you're acknowledging that training rights are a proposed expansion of current IP laws!
> Copyright allows you to sell artificial scarcity.
Not always. That’s more the domain of patents, honestly.
> AI does not replicate your work directly.
This is false, at least in certain cases. And even if it were always true, it doesn’t mean it doesn’t infringe copyright.
> At least you're acknowledging that training rights are a proposed expansion of current IP laws!
Yes, they are, emphasis on the “proposed,” meaning that I believe that training AI on copyrighted material without permission can be a violation of current copyright law. I don’t actually know if that’s how it should be, honestly. But as of right now I think entities like the New York Times have a good legal case against OpenAI.
You are probably correct, legally. But given that we're talking about Aaron Swartz, who was legally in the wrong but morally (in the classic hacker's sense) in the right, I meant "copyright allows you to sell artificial scarcity" in the moral sense.
I think fundamentally we have a difference in opinion on what copyright is supposed to be about. I hold the more classic hacker ideal that things should be free for re-use by default, and copyright, if there is any, should only apply to direct or almost-direct copies up to a very limited time, such as 10 years or so. (Actually, this is already a compromise, since the true ideal is for there to be no copyright laws at all.)
You may have that right as long as you agree that others have the right to not care about your right when deciding to use "your" stuff however they want.
The two halves of your statement contradict each other. What are you trying to say?
> I don’t want my work included in the corpus of text used for training ChatGPT, should I have that right?
No
You could choose not to publish, and not be read.
If you are read, you can be learned from.
Generative AI is not learning.
Copyrights don’t depend on whether I choose to publish a particular work or not.
Zuckerberg personally approved using pirated content to train a model. Is that OK too?
Absolutely. Abolish all copyright. I can't believe hackers have lost their pirating roots and have become copyright zealots instead.
IP is fake.
What I mean is, all laws are fake, but while we have to follow some collective hallucination about magic words giving people authority—to keep society together—the specific delusion that we should enforce artificial scarcity by extending monopolies to patterns of information is uniquely silly.
Yes.
This post misses a lot of nuance. Aaron Swartz was an activist who did obviously illegal things and got caught and prosecuted. What OpenAI is doing is in legal gray area because it might be transformative enough to be fair use; we just don't know yet.
Simply being transformative is not sufficient for it to be fair use.
But more to the point if it's deemed illegal Altman won't suffer any personal legal consequences.
It's not about simply not doing something illegal - we all regularly commit crimes that we could be charged with if we piss off the wrong people. When a company does it, it's "disrupting" and heavily rewarded if the company has enough funding or revenue. When people like AS do it, they get destroyed. Selective enforcement in order to maintain the status quo. The last few years have clearly shown that if you are wealthy enough, the rules do not apply to you.
If OpenAI had run its company by hiding their hardware around a university campus, they would have gotten in trouble too. It is not so much the scraping as the part where MIT saw a masked person sneaking into back rooms hiding equipment that got AS in trouble. And of course he literally disrupted service to JSTOR because he did a poor job of scraping it. He could have gotten through all of this if he had appeared less like a cyber terrorist in the execution of his plan, but of course he thought he wouldn't get caught, so he never considered how it would look if he did.
By US copyright law, OpenAI is engaged in illegal conduct. The fact that they haven't been punished for it doesn't mean it's magically not illegal.
It's possible copyright law will be revised to make it unambiguously legal to do what they've done, but that's not how the law works right now.
> By US copyright law, OpenAI is engaged in illegal conduct.
What makes you so sure about this? You are not a judge, and multiple cases against OpenAI have been dismissed by judges already.
Why is Sam Altman singled out in these copyright issues? Aren't there plenty of competing models?
OpenAI is the highest profile.
because he's a manipulative POS.
I don't believe you're asking this in good faith because the answer is so obvious. But just in case, it's because OpenAI is by a ridiculously large margin the most well known player in the space and unlike the leaders of other organisations his name is known and he has a personal brand which he puts a lot of effort into promoting.
The two are seen in the picture together! They could not be more similar. That highlights the irony and hypocrisy of capitalism, or better said, of human society.
It's possible that both of them are more similar in thought than anti-AI people think, which is why they hung out together.
Oh man. Heavy stuff. Will our industry be looked at as good or bad? I hope we end up doing good for the world.
Open-sourcing the models is the only right decision.
Hard to say, since there is a profit motive in all industries. It seems like every industry at the moment is not really looking for human advancement; or maybe it is, but only if the results are expensive for end users and efficient/proprietary for the company.
Yes, but the thing is our industry has almost unparalleled leverage, and the marginal cost is zero.