The most fun thing about this demo (and the demo is 100% worth trying out, you'll need to use Chrome for it) is that it shows how there's no persistent memory at all. If you get bored of the area you are in, look straight up at the sky and look down again. If you see something interesting - like a fence post - keep that in your field of vision and you should start to see more similarly interesting things.
This gesture would often work for me in lucid dreams: look away to refresh content. And when I played video games more regularly, my dreams would take on a procedurally generated quality that always struck me as more than the typical abstract dream imagery. It was as if my mind had learned a basic game engine with mechanics, physics, and various rendering techniques.
This also works in Safari on my iPhone.
Also, because of this model's limitations, achieving things in AI Minecraft works completely differently than in the standard game. For example, to get to the Nether it's easier to find a red mushroom texture and use it to confuse the AI into thinking it's netherrack; the same trick works for the End using sand blocks or other similarly colored blocks.
See previous discussion: https://news.ycombinator.com/item?id=42014650 (81 comments, 241 points).
This is such a cool experiment, and those comments basically boil down to "Lol, it's just Minecraft".
Welcome to Hacker News. Home of the shallow dismissal.
I could knock up a clone of these shallow dismissals in a weekend.
A while back "deformable terrain" and walls you could destroy and such was a big buzzword in games. AFAICT, though, it's rarely been used in truly open-ended ways, vs specific "there are things behind some of these walls, or buried under this mound" type of stuff. Generally there are still certain types of environment objects that let you do certain things, and many more that let you do nothing.
Generative AI could be an interesting approach to the "what happens if you destroy [any particular element]" problem (sketched below).
For a lot of games you'd probably still want specific destinations set in the map; maybe now it's just much more open-ended as far as how you get there. Think of the ascend-through-matter stuff in Tears of the Kingdom, but more open-ended in a "just start trying to dig anywhere" way, where gen AI figures out exactly how much dirt or other material gets piled up by digging in a specific place?
Or for games with more of an emphasis on random drops, or random maps, you could leverage some of the randomness more directly. Could be really cool for a roguelike.
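A rough sketch of what I mean, in Python; every name here is hypothetical (there is no real engine or model API behind it). The idea is to generate the contents of a destroyed region on first dig and cache it by position, so the world stays consistent if you come back:

    import hashlib

    # Hypothetical sketch: when the player digs out a region, ask a
    # generative model what should be revealed there, and cache the result
    # so re-digging the same spot gives the same answer.

    _fill_cache: dict[str, list] = {}

    def region_key(x: int, y: int, z: int, size: int) -> str:
        """Deterministic key: the same region always regenerates identically."""
        return hashlib.sha256(f"{x},{y},{z},{size}".encode()).hexdigest()

    def destroy_region(x, y, z, size, sample_neighborhood, generate_fill):
        key = region_key(x, y, z, size)
        if key not in _fill_cache:
            # Condition the model on the surrounding material so the fill
            # is plausible (dirt under dirt, ore veins that continue), and
            # seed it with the key so generation is deterministic.
            context = sample_neighborhood(x, y, z, size)
            _fill_cache[key] = generate_fill(context, seed=key)
        return _fill_cache[key]

    # Trivial stubs standing in for a real engine and model:
    voxels = destroy_region(
        10, 4, -3, 8,
        sample_neighborhood=lambda x, y, z, s: {"material": "dirt"},
        generate_fill=lambda ctx, seed: [ctx["material"]] * 8,
    )
    print(voxels[:3])  # ['dirt', 'dirt', 'dirt']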
You can still scale the Powder Game/Noita mindset up to larger games, with pretty good results. Teardown is a very fun current-gen title that combines physics and voxels to create a fully physical world to interact with: https://youtu.be/SDfWDB1hGWc
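If you haven't played with the falling-sand genre, the core trick is tiny: every cell is a material, and simple local rules do the rest. Here's a toy version of the classic rule in Python (not anything from Teardown or Noita, just the textbook cellular automaton they build on):

    # Toy falling-sand step: 0 = empty, 1 = sand. Grains fall straight down
    # if they can, otherwise slide diagonally, and pile up on the floor.

    def step(grid):
        h, w = len(grid), len(grid[0])
        new = [row[:] for row in grid]
        # Scan bottom-up so each grain falls at most one cell per step.
        for y in range(h - 2, -1, -1):
            for x in range(w):
                if new[y][x] != 1:
                    continue
                for dx in (0, -1, 1):  # prefer straight down, then diagonals
                    nx = x + dx
                    if 0 <= nx < w and new[y + 1][nx] == 0:
                        new[y][x], new[y + 1][nx] = 0, 1
                        break
        return new

    # Drop two grains of sand and watch them settle into a pile.
    grid = [[0] * 7 for _ in range(5)]
    grid[0][3] = grid[1][3] = 1
    for _ in range(6):
        grid = step(grid)
    for row in grid:
        print("".join(".#"[c] for c in row))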
There's still a ways to go, but I don't really think AI will be required if game engine pipelines extend PBR to destruction physics. The biggest bottleneck is frankly the performance hit it would entail, and the dynamic lighting that is inherently required.
This looks like a very similar project to "Diffusion Models Are Real-Time Game Engines"[1] that circulated on HN a few months ago [2], which was playing DOOM. There's some pretty interesting commentary on that post that might also apply to this.
I'd like to do a deeper dive into the two approaches, but on a surface level, one interesting note is that Oasis specifically mentions a purpose-built ASIC (presumably for inference?):
> When Etched's transformer ASIC, Sohu, is released, we can run models like Oasis in 4K.
Playable demo: https://oasis.decart.ai/starting-point
It looks like noise with hardly any permanence. Seems like an electricity-heavy way of solving "gameplay", if you can call it that.
This is a very impressive demo. It seems very early in what might, in hindsight, be an "obvious" direction to take transformers.
How are state changes consistently tracked?
The context window seems to be about one second long at ca. 20 FPS, so you get enough state change to do realistic things like accelerated falling.
What you see is what you get. Build a house, but turn away from it, and it may disappear. Everything seems to be tracked from a few previous frames.
They are not, and probably never will be. Looking up at the sky regenerates your surroundings. Inventory is less tricky, since you can just feed it back as input, but it is very easy to lose the landscape.
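That matches how frame-conditioned world models are usually described: the only "state" is a rolling window of recent frames plus the action stream. A minimal sketch of that loop in Python, with predict_next_frame as a placeholder rather than Oasis's actual interface:

    from collections import deque

    # The model's only "memory" is a short rolling window of frames and
    # actions; at ~20 FPS with a ~1 s context that's about 20 frames, so
    # anything that scrolls out of the window (the house behind you) can
    # simply stop existing.

    CONTEXT_FRAMES = 20  # roughly 1 second of context at 20 FPS

    def run(model, first_frame, actions):
        frames = deque([first_frame], maxlen=CONTEXT_FRAMES)
        acts = deque(maxlen=CONTEXT_FRAMES)
        for action in actions:
            acts.append(action)
            # Everything the model "knows" about the world is in these two
            # deques: no map, no inventory store, no persistent scene graph.
            frame = model.predict_next_frame(list(frames), list(acts))
            frames.append(frame)  # the oldest frame silently falls out
            yield frame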
Everyone will shit on this while the authors get a $20M valuation and walk away with their lives set.
Most people are praising it.