I think AI-"upscaled" videos are as jarring to look at as a newly bought TV before frame smoothing has been disabled. Who seriously thinks this looks better, even if the original is a slightly grainy recording from the 90's?
I was recently sent a link to this recording of a David Bowie & Nine Inch Nails concert, and I got a serious uneasy feeling as if I was on a psychedelic and couldn't quite trust my perception, especially at the 2:00 mark: https://www.youtube.com/watch?v=7Yyx31HPgfs&list=RD7Yyx31HPg...
It turned out that the video was "AI-upscaled" from an original which is really blurry and sometimes has a low frame rate. These are artistic choices, and I think the original, despite being low resolution, captures the intended atmosphere much better: https://www.youtube.com/watch?v=1X6KF1IkkIc&list=RD1X6KF1Ikk...
We have pretty good cameras and lenses now. We don't need AI to "improve" the quality.
The weird thing is that people are seemingly enjoying this.
Yesterday we went to a store to have a look at a few smartphones for my partner. She primarily wants a good camera above any other parameter. I watched her prefer the ones that counterfeited reality the most: she was like, "look, I can zoom and it is still sharp", while there was obviously a delay between zooming and the end result, which was a reconstructed, liquid-like, distorted version similar to the upscaling filters people use on 8/16-bit game console emulators. I was cringing at seeing the person I love the most prefer selfies and pictures of us with smoothed faces and a terrible fake bokeh in the background over something closer to reality.
I’m a photographer, and am on a bunch of beginner photography groups.
These groups used to be a mix of people being confused at how their camera worked and wanting help, people wanting tips on how to take better pictures, and sometimes there were requests for editing pictures on their behalf (eg “I found this old black and white faded picture of my great grandparents, can anyone help restore it?”)
These days, 99.9% of the posts are requests that involve synthesizing an entirely new picture out of one or more other pictures. Examples: “can someone bring in my grandpa from this picture into this other family picture?”. Or “I love this photo of me with my kids, but I hate how I look. Can someone take the me from this other picture and put it in there? Also please remove the cups from our hands and the trees in the background, and this is my daughter’s ex boyfriend please also remove him”.
What’s even crazier is that the replies of those threads are filled with dozens of people who evidently just copy pasted the prompt + picture into ChatGPT. The results look terrible… but the OP is always pleased as punch!
People don’t care about “reality”. Pictures have lost their status of “visual record of a past event”* and become “visual interpretation of whatever this person happens to want”.
There’s no putting the genie back in the bottle.
*: yes, you can argue they were never 100% that, but still, that’s effectively what they were.
But people have been editing photos like that since before AI and even before Photoshop, I don't see the big deal. What I've seen recently is synthesizing whole new pictures with AI, by training a LoRA on their face and body and asking the AI to create themselves with a specific setting or background.
The motivation behind taking pictures has definitely changed over time. People used to keep them mainly for themselves and their close family. Then they started to share with close and not so close friends. Now they use it to boost their "personal online brand". Yes, it was possible to heavily manipulate pictures with Photoshop, or even in analog photography, but it wouldn't make any sense for most people.
People were pirating before Napster, but Napster made it easy and accessible, and let people do it with little to no barrier.
It's the same with this. Yes, photo editing could always be done, but it's far easier now to get better results. Its accessibility changes the game.
I'm specifically responding to their point about how "these days" people want different things and I'm saying that they always wanted those things, nothing new about it.
On the contrary, there is plenty new about it. People’s perception of how much you can change influences how much they ask. Seeing new possibilities gives you new ideas.
> But people have been editing photos like that before AI and even before Photoshop
Very few people who had the skill, time or money. I think we are now discovering that everybody wants to edit the photos, they just couldn't do it before in what they consider a reasonable amount of effort.
Yes, I agree, but I am specifically looking to understand the above photographer's point. They said the requests they used to get versus what they get today have changed, but I argue that that doesn't make any sense, people have always wanted to edit their photos in the "now" example even back then.
It totally makes sense that people don't request things they don't expect to be possible.
I generally love AI.
But I lament these blurred lines of reality. Is this photo real? Was this reply ChatGPT or did they actually write it?
It makes me feel uneasy.
I feel the same way. Thankfully there are still obvious signs when LLMs are used, but it is not always so obvious. I think we may be better off assuming X is fake, and going from there. Sad, but what can we do? There are websites that tell you (with a %) whether or not something has been written by an LLM. Unfortunately, however, some of my own writing comes out as a false positive. We may need to make improvements on this front, and I believe we will.
> The weird thing is that people are seemingly enjoying this.
I often think of my taste as something shaped by randomness and by the stuff I have previously consumed. I don't like the linked YT video, but it is probably just unfamiliarity, and I would like it if I consumed more of it.
I don't think that there is anything inherently bad about it. Especially since the things you'd prefer over it (CRT and in-air artifacts, 240p youtube compression artifacts) are also just effects, just different from this sharpening. Therefore I am not surprised that people like it.
Yes, this is the exact same reason that frame smoothing exists. When you walk into a store, all the TVs are lined up showing some random nature show or sports event, and frame smoothing will make your TV look a little more smooth than the others, even though it completely ruins the content.
It's made for making sales, not for making things actually look good.
At some point it became unacceptably rude to gatekeep, king-make, or be otherwise judgemental of taste. It was at around the same time that subcultures and counterculture melted into a homogeneous mass.
I think we lost something in that. Embarrassment can be useful for moving us out of our comfort zones.
Yes, 2:07 is just ridiculous. That's more Matt Damon than David Bowie. To be fair, though, this upscale was not committed by Youtube.
> Who seriously thinks this looks better
I don’t think people notice. I don’t own a TV, but twice now I’ve been to some friend’s house and I immediately noticed it on theirs. Both times I explained the Soap Opera effect and suggested disabling the feature. They both agreed, let me do it, and haven’t turned it on again. But I also think that is a mix of trusting me and not caring, I’m not convinced they could really tell the difference.
Tip for those aiming to do the same: Search online for “<tv brand> soap opera effect” and you’re bound to find a website telling you the whereabouts of how to reach the setting. It may not be 100% correct, so be on the lookout for whatever dumb name the manufacturer gave the “feature” (usually described in the same guide you would have found online).
> I got a serious uneasy feeling as if I was on a psychedelic and couldn't quite trust my perception, especially at the 2:00 mark
You weren’t kidding. That bit at 02:06 really makes you start to blink and look closer. The face morphs entirely.
https://youtu.be/7Yyx31HPgfs?t=126s
Looking at the original, it’s obvious why: that section was really blurry. The AI version doesn’t understand camera effects.
https://youtu.be/1X6KF1IkkIc?t=126
Thank you for providing both links, it made the comparison really simple.
I remember watching an episode of one of my favorite shows on my parents’ brand new TV, and thought to myself something about this episode is off, like the production is cheap, the acting feels worse, even the dialog is bad.
Over time I noticed everything looks cheaper on their TV.
It was the auto-smoothing.
It is especially bad for animated shows that have made an explicit artistic choice to let (parts of) the animation progress at a lower frame rate. My kids watched "Spider-Man: Across the Spider-Verse" at a friend's place where smoothing was not turned off, and it completely ruined the artistic feel and made the movie feel like a stuttering video game.
I found those spiderverse movies really hard to watch because of the low frame rate. I don't think it was artistic, it was cheap.
> I found those spiderverse movies really hard to watch because of the low frame rate. I don't think it was artistic, it was cheap.
It was absolutely an artistic choice - Sony spent more per frame on those movies than any previous animated film & the directors knew exactly what they were doing when they chose to animate some parts on every second (or even third) frame.
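To make "animating on every second (or even third) frame" concrete, here is a rough sketch. It assumes a 24 fps playback rate, which is my assumption and not something stated above, and the function name is made up for illustration. Each drawing is simply held for several consecutive frames, so the effective drawing rate drops while the projection rate stays the same.

```python
def drawing_schedule(n_frames, hold):
    """Return, for each projected frame, the index of the drawing shown.

    hold=1 is "on ones" (a new drawing every frame),
    hold=2 is "on twos", hold=3 is "on threes".
    """
    return [frame // hold for frame in range(n_frames)]


PLAYBACK_FPS = 24  # assumed cinema frame rate, for illustration only

on_twos = drawing_schedule(PLAYBACK_FPS, hold=2)
on_threes = drawing_schedule(PLAYBACK_FPS, hold=3)

# Number of distinct drawings needed for one second of footage
print(len(set(on_twos)))    # 12 drawings per second on twos
print(len(set(on_threes)))  # 8 drawings per second on threes
print(on_twos[:6])          # [0, 0, 1, 1, 2, 2]
```

A motion-smoothing TV sees those repeated frames as judder to be "fixed" and invents in-betweens the animators deliberately left out.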
This is an artistic choice with a variety of film precedents. It's not exactly the same thing, but if you watch this GDC talk about the way that Arc System Works uses 3d to simulate 2d animation, it gets some of the ideas across:
https://www.youtube.com/watch?v=yhGjCzxJV3E
Artists might want to produce a lower framerate just to make something look filmic (eg, 25 frames per second) or hand animated, but it can also be a deliberate stylistic choice for other reasons. Eg, the recentish Mad Max films used subtle undercranking to make action scenes feel more intense, and part of that effect is more noticeable frames, and I think there is a bit of that in the Spiderverse films too.
I think that's a matter of taste, but it definitely doesn't make it easier to watch them when the frame rate randomly switches between low and high all the time :).
I had the exact same experience watching Goodfellas on my parents' TV. It felt like a cheap soap opera and I was thoroughly confused about what's happening. Afterwards I did some research and learned about motion interpolation in modern TVs.
Back when there was a lot of 4x3 on TV, 20 years ago, my parents had their TV set to auto stretch. Why?
Because they felt they were being ripped off, with all that unused space. They paid for widescreen!
Didn't matter that people looked all fat in the face, or that the effect was logarithmic near the edges. A car driving by got wider as it neared the edge of screen!
Nope, only mattered it was widescreen now.
And until I mentioned it, they did not even notice.
When I thought of it, I realised this sort of matches everything. Whether food, or especially politics, nuance is entirely lost on the average person.
I feel, as a place for tech startups, we should realise this. If you plan to market to the public, just drop the nuance. You'll save, be more competitive, and win.
Thing is, to some population this is seen as better. While to me it feels like a journalist's camera, too real to pass as a story.
It also has to do with how basically everything is filmed for Netflix/streaming nowadays.
Not really sure I get why that would be a factor
I hope you waited until they were out of the room and turned it off in settings?
That's hilarious https://i.imgur.com/TVfncya.png
That's insane. Here's the same-ish frame from the original: https://imgur.com/a/dWS20oP
The extreme blur here was obviously a creative choice by the director/editor, the rest of the video has lower resolution but it's not nearly that bad (which is why Bowie still looks like himself in other parts of the upscaled video).
The process used to upscale the video has no subtlety, it's just "make everything look crisp, even if you have to create entirely made-up faces".
Seems like they ignored the non-square pixel aspect ratio as well for the upscaling, which may have changed face shapes as well.
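For anyone wondering how much that could matter, here is a back-of-the-envelope sketch. The 720x480 stored size and the 4:3 display aspect ratio are my assumptions (a common standard-definition layout), not something known about this particular source, and the function name is hypothetical.

```python
from fractions import Fraction


def display_size(stored_w, stored_h, display_aspect):
    """Square-pixel size that shows a stored frame at its intended shape.

    Returns (width, height, pixel_aspect_ratio).
    """
    storage_aspect = Fraction(stored_w, stored_h)
    pixel_aspect = display_aspect / storage_aspect  # width/height of one pixel
    return round(stored_w * pixel_aspect), stored_h, pixel_aspect


# Assumed example: SD frame stored as 720x480, meant to be shown at 4:3
width, height, par = display_size(720, 480, Fraction(4, 3))
print(width, height, par)  # 640 480 8/9

# If the upscaler treats the stored pixels as square, everything ends up
# this much too wide compared to the intended picture:
stretch = Fraction(720, width)
print(float(stretch))  # 1.125
```

Under those assumptions, faces would come out about 12.5% too wide before the upscaler has even started inventing details.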
Between 2:07 and 2:08, the guy on the right loses his glasses. Over the course of a couple frames, they just fade into nothingness.
This phenomenon of pushing technology that end consumers don't want seems to be driven by a simple sequence of incentives: pressure from shareholders to maintain/increase stock price -> pressure on business to increase market share, raise prices, or at least showcase promising future tech -> pressure on PMs to build new features -> combined with developers' desire to try out new technologies -> result: AI chatbots/summaries on things we didn't ask for, touchscreens on car dashboards, AI upscaling etc.
> pushing technology that end consumers don't want
Flashback to when every TV at CES had 3D functionality. Turns out nobody really wanted that. What an immense waste of resources that was.
Failure of 3D TVs was one unprecedented glorious victory of a customer, where customers not buying it indeed led to its disappearance. Otherwise I'm furiously not buying other ridiculous stuff but my consumer decision does nothing.
When stock price has to grow 8% more than gdp this is inevitable.
> Who seriously thinks this looks better, even if the original is a slightly grainy recording from the 90's?
Whatever you had as a kid feels "natural", these things feel "natural" for new generations.
Same thing for a proper file system vs "apps": a teenager on an iPad will do things you didn't know were possible, but put them on Windows XP and they won't be able to create a file or a folder; they don't even know what these words mean in the context of computers.
What makes it uneasy is not only upscaling but they are generating new frames to make it 60fps. 60fps by itself feels fake (check some footage of The Hobbit that tried 48fps). It feels like video games.
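A rough sketch of what "generating new frames" means in numbers. It assumes a 24 fps source resampled to 60 fps on an even time grid; both the source rate and the method are assumptions, since nobody here knows what this uploader actually used. For every output frame it computes which source frame it falls after and how far it sits towards the next one.

```python
from fractions import Fraction


def resample_plan(src_fps, dst_fps, n_out):
    """For each output frame, return (source_index, blend_weight).

    blend_weight == 0 means the output frame coincides exactly with a
    source frame; anything else has to be synthesized from its neighbours.
    """
    plan = []
    for k in range(n_out):
        pos = Fraction(k * src_fps, dst_fps)  # position in source-frame units
        base = pos.numerator // pos.denominator
        plan.append((base, pos - base))
    return plan


plan = resample_plan(24, 60, 60)  # one second of output
exact = sum(1 for _, weight in plan if weight == 0)
print(plan[:6])            # positions 0, 2/5, 4/5, 1 1/5, 1 3/5, 2
print(exact, 60 - exact)   # 12 exact frames, 48 synthesized
```

So under these assumptions, four out of every five frames on screen never existed in the recording, which goes some way towards explaining the uneasy feeling.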
It's kinda funny to aim for 60fps because modern video productions will often have 60fps footage that's too sharp and clean. So they heavily post process the videos. You add the film grain and lower the fps to 30 or even 24 (cinema) so it looks much more natural.
The question is if this is just habitual / taste thing. We most likely wouldn't prefer 24fps if the movie industry started with 50fps.
It is just habitual and I feel it's making movies look terrible, especially panning shots look like a stuttery mess that is almost unwatchable for me at 24 FPS.
The AI upscaling makes it look like NIN are playing with late-1980s era Rick Astley. Hilarious.
Rick Astley is so ubiquitous now, thanks to Memes, the AI is never gonna give him up.
That is terrible.
I see this upscaling a lot in Youtube videos about WWII that use very grainy B+W film sources (which themselves aren't taken from the best sources) and it just turns the footage into some weird flat paneled cartoonish mess. It's not video anymore, it's an animated approximation.
I also think it looks like garbage, but I wonder if maybe it looks better on small mobile screens - where you can't actually see the mangled details, but can perceive that it "looks sharper"
I think it's preferred to get this kind of smooth unreal effect for services like youtube, not because it looks better, but rather because it compresses better for storage. Less fine detail overall helps video compression.
I like upscaling and frame interpolation but as always, the TV does not have the hardware to do a good job. If you use neural network models, it works and looks a lot better without looking plastic-like.
The closeups of the bass player are like 6 slowmotion frames in the original and look like an interpolated mess with unhuman body joints upscaled.
This reminds me of colorized black and white movies from the 90s, although I can now imagine AI being used to do that and upscale the past, creating new hyper-real versions of the past.
Holy... wtf...
At 2:04 the original deliberately has everyone on stage way out of focus, and the AI upscaler (or the person operating it) decided to just replace it with an in-focus version sporting what looks like late 90s video game characters. That is terrible.
Also, David Bowie looks like a 20-something-year old man in this shot.
Wow, that is horrible! The 2:07 mark where AI put in some generic Rick Astley-alike for Bowie, just made me feel sick
The most upvoted comment is "Thank you so much for preserving this!!"
> I got a serious uneasy feeling as if I was on a psychedelic and couldn't quite trust my perception
When I took LSD for the first time, I realised it was hitting when everything started looking like stable diffusion
The first video induced actual physical nausea.
I had to stop playback or I’m sure I would have thrown up. And I don’t suffer from motion sickness etc.
There’s definitely something “uncanny valley” about it.
Wow, you're not kidding. In some shots David Bowie barely looks like David Bowie because the algorithm's taken such liberties with the original image to try and make it look sharp.
Two root comments (so far) are focusing on YouTube, but the article claims most of the AI was done by Will’s team, using AI to convert stills to video:
> The video features real performances and real audiences, but I believe they were manipulated on two levels:
1. Will Smith’s team generated several short AI image-to-video clips from professionally-shot audience photos
2. YouTube post-processed the resulting Shorts montage, making everything look so much worse
You can see the side-by-side [1] of the YouTube post-processing, and, while it definitely alters the original, it isn’t what’s causing most of the really bad AI artifacts.
Most of what YouTube appears to be doing is making it less blurry, sometimes successfully, and sometimes not. And, even with that, it is only done on Shorts.
My hardware/software or my eyes are borked because I cannot tell much of a difference between YouTube vs Instagram side-by-side. Gosh. If it is not my eyes, what are the recommendations? What are the top 1 (or 5) reasons I cannot see it if it is not my eyes? Do I need to upgrade my monitor? I have a relatively recent GPU but it is not a beast and I use an HDMI -> VGA converter.
The pictures, however, look god-awful! I presume the video is filled with stuff like these.
The most incredible part about this story is that Will Smith is still a performing (and touring???) musician with any audience at all, AI or otherwise. I thought he was an actor now. Wut happened?
Whenever one gets cancelled, it opens up new opportunities for them.
Will Smith has been both a successful musician and actor since the Fresh Prince days. People do both. I don't know why this would be confusing to you.
Some PM in Youtube: “ yes let’s make it harder to tell real videos from AI to make people who don’t know better more susceptible and accepting of it”
> Some PM in Youtube: “ yes let’s make it harder to tell real videos from AI to make people who don’t know better more susceptible and accepting of it”
This can backfire, perhaps making people believe that real, important news is in reality AI-generated to brainwash them, thus making people less susceptible, and more disbelieving.
The whole point is to make people believe in nothing. When all news is fake, no news matter.
That's the future. Kids aren't going to have the same mental history as older generations.
I'm not sure about this specific instance, but AI generated movies will absolutely be the future, when you can create the exact shots you want with stability of the foreground, background, and characters, and edit it all together, it'll be an explosion of creativity just as with image generation currently.
To be clear, I don't think it'll be telling an AI to "create me a movie with X, Y, and Z" because AI reasoning is not there yet, but for the raw video generation, it's progressing steadily, as seen in r/aivideo.
I don't exactly disagree, but I do suggest reading "Trickster Makes This World: Mischief, Myth, and Art" by Lewis Hyde.
There is a reasonable argument to be made that a lot of art is enlivened by the cantankerous, unpredictable and unyielding nature of the media we use to create art. I don't think this is a necessary feature of art per se, but I do think limitations often help humans create good art and that eliminating them often produces things which feel tossed off, trivial, thoughtless.
I think for commercial productions creating "the exact shot you want" might be what shareholders demand of you. But many artists don't set out to create "the exact shot they want," they set out to collaborate with the world to create an impression that captures both their intent and the unpredictable substance of the situation in whatever sense that might mean.
> […] it'll be an explosion of creativity just as with image generation currently.
I'm mostly seeing people who lack the skills or means to create their own works go nuts with prompting gen-AI tools, but it rarely strikes me as creative in either the 'having the ability to create' sense — they've outsourced that — or the 'original, expressive, imaginative' sense.
They don't have the mechanical means, yes, but they decide what to create so it'd be the latter, not sure why you think it's not; the AI isn't independently coming up with ideas and generating the media. Plus with ComfyUI, I'd say there's some of the former too, similar to how music producers aren't literally playing each instrument that's simulated in their software, but they do assemble it together.
The line between movie and games will blur. Once you can do generative movies, you can do games, and vice versa, there's no obvious delineation, and the technical problem is heavily overlapping. Games just has some scoped control inputs, like this: https://demo.dynamicslab.ai/chaos
> Once you can do generative movies, you can do games
No you can't, these are completely different mediums.
There are world models such as Genie [0] which show that they can be constrained to games too.
[0] https://deepmind.google/discover/blog/genie-3-a-new-frontier...
Nope. Limitations feed creativity. When you have unlimited power/resources, you end up with unlimited slop. One of the reasons why old movies were better on average - now we get so many average movies with no lasting effect. Another one, slightly orthogonal - a golden ring or Rolex in a neatly designed photo shoot vs a middle eastern head of state's "throne room". When you have something in limited quantities, you get the best out of it - when it's unlimited you go crazy.
> it'll be an explosion of creativity just as with image generation currently.
I haven't seen anything breathtaking yet, just a tsunami of slop. Arguably we already had a video tsunami of slop, you just have to log in to Netflix to witness it.
For a long time I disliked the term "content" to describe photos/movies/art/&c. but now I feel it's a very appropriate term, an infinite amount of meaningless "content" to fill bottomless "containers"
On this episode of "Trying to make AI useful"...
Seriously, whose idea was this? It can't be a money saving feature; surely it costs more to upscale all these videos than to just host the HD version.
And even if you argue it can be used only on low res videos to provide a "better experience", the resulting distortion of reality should be very concerning.
If I were a marketing person I would also make genuine images look AI generated for the free publicity. Nothing gets attention like mistakes or fakes. The fact that they aren't actually fake means there is no downside for WS and team. I once spoke to a social media manager for a large brand and he said they intentionally put typos in posts on a semi regular basis and it always results in more post engagement (people correcting the typo).
It's called ragebait and it's pretty common in marketing already.
Like the new Naked Gun movie adding extra floating fingers to their poster, heh.
https://www.theverge.com/youtube/765485/is-youtubes-shorts-e...
Today on The Verge, GenAI upscaling in YT shorts. Yes, AI is here to stay, but I do hope the icky parts go away soon.
> GenAI upscaling in YT shorts
I cannot watch the linked video, but its description quotes “not generative AI”; is The Verge or someone else showing something different?
This is being unnecessarily pedantic. They're saying "yes we're doing post-processing, but that's not technically generative AI."
Personally I couldn't care less about what they call it, I care that it makes the same video look more artificial on YouTube than it looks elsewhere.
> Hi! I'm a tech nerd and I try to be precise about the terminology I use
> GenAI typically refers to technologies like transformers and large language models, which are relatively new
> Upscaling typically refers to taking one resolution (like SD/480p) and making it look good at a higher resolution (like HD/1080p)
> This isn't using GenAI or doing any upscaling
> It's using the kind of machine learning you experience with computational photography on smartphones, for example, and it's not changing the resolution
> And sincerely appreciate the feedback!
There's no strict definition of these things, the quote is from YT and they're saying something like this isn't that _bad_ generative AI that people are worried about, this is just good old fashioned wholesome machine learning techniques.
I fear the icky parts might be the only ones staying.
Open a company that sells t-shirts with "AI glitched" text on them so people can make every photo of this kind illegitimate.
Oh those already exist, they just print out whatever ChatGPT gives them without double-checking.
In a couple of years, they can go into the same junk drawer as sixth-finger prosthetics (generative AI problem) and 5-eyes masks (face recognition problem).
I wonder if the fact that the original video was AI generated made the upscaling look worse than it would on a real video? Not that it can certainly be detected, but an actual video is likely different from an AI generated in ways that it seems like could lead astray their "computational photography" processing.
Everything is a copy of a copy of a copy
From a PR perspective, I wonder why YouTube is at the same time forcing unwanted AI features down people's throats[1], a move that many companies now do to drum up their perceived AI competence, but THEN at the same time, when asked, also downplaying this use of AI by splitting words.
The combination of the two confuses me. If this was about shareholders, they'd hype up the use of AI, not downplay it. And if this was about users, they'd simply disable this shit.
[1] I mean, they're sacrificing Google Search of all things to push their AI crap. Also, as a bilingual YouTube user, AI-translated titles and descriptions make the site almost unusable now. In addition to some moronic PM forcing this feature onto users, they somehow also seem to have implemented the worst translation model in the industry. The results are often utterly incomprehensible, and there's no way to turn this off.
So... the videos showing the difference between the AI-tainted youtube version and the supposedly untainted instagram version are hosted on... youtube?
apparently the sharpening algorithm is only applied on youtube shorts, not on regular youtube videos.
the experimental post processing was only applied to shorts, according to the post.
There's not a single person I know in my life who will want this as a consumer. WHY does the world keep doing things that are so complicated and unnecessary?
Because necessary things are equally complicated but less profitable.
I find it hilarious that the Youtube spokespersons go out of their way to clarify that this is "not the bad GenAI shit that we know everyone hates but the good kind, you know, machine learning and stuff, you know, trust us"
What's the point of using videos like this if it's a risk to reputation just to use them?
The people with whom this is a reputational risk were not going to buy Will Smith concert tickets anyway.
The same reason you don't feed prime vegetables and fruits to pigs.
There's no risk to reputation. You get a massive rage boost then reveal that "a social media contractor used authentic crowd photographs in an unauthorized manner and is no longer employed by the company". You reveal the photos, everyone either celebrates the contractor getting canned or that this wasn't AI and you get huge lift.
People are suckers. You can tell them you are going to do this, do it, and they'll still fall for it. Don't tell them and they'll think better of themselves and of you for obeying them (cf fictional firing) and you're done!
Saving money of course! That’s the sad truth
What's the point of saving money if it's a risk to reputation?
Will Smith punched a dude on stage, a comedian. I think you are putting a lot on a cage concept with a scatter plot of outliers.
You actually need a reputation of merit for there to be risk. He's a rapper, not a saint or ethicist.
“AI” editing is one thing
The upscaling seems to be Google doing it without permission of the original uploader. Google however are unaccountable, you can’t complain otherwise you’ll be exiled
Vote with wallet etc /s
There is no risk to reputation until you use it. Furthermore, even within creative professions, using Gen AI is already acceptable to some degree.
The entire point of this article is that the "make it look worse" filter is being applied by YouTube automatically, whether creators want it or not.
I’m wondering at what point the minority are going to finally accept ai is here to stay.
What happens when AI gets trained on AI slop?
If there's code to stop AI from being trained on AI, I would like to have it to stop me from seeing it.
Google have the untainted video, so it's a competitive advantage
Better question is what happens when a generation knows nothing but slop
From now on, I am going to bring an unreadable AI looking sign to every concert I go to.
I hate how everyone thinks we have to use AI now. I wish this trend would end already.
Who is this "everyone" who thinks we "have to" use AI? From my experience it's very split with people on both sides.
>Who is this "everyone" who thinks we "have to" use AI?
Upper management. Seriously. The push to use AI in everything is very real.
TLDR: The video got VAEed, and we are the discriminators being fooled.
doesn't seem very blurry - don't generate video of a crowd? seems like an easy rule?
Who the hell is going to Will Smith concerts in Europe? He’s been dead to me since the slap, I cannot imagine going to a concert of his with a clean conscience.
I thought Willy had disappeared into the shadows after his wife embarrassed him to the point he prob won’t step foot in public again…