Hey HN, this project started when I was exploring real-time rendering of radiance fields for standalone VR and AR headsets. I was frustrated by the lack of performant non-CUDA implementations, and it also seemed like a good excuse to learn about compute pipelines in Vulkan.
Right now, the renderer runs on Windows, Linux, macOS, iOS, and visionOS (as an iPad app). OpenXR support and an immersive visionOS app are coming soon. Training is also WIP. With the industry adopting Gaussian Splatting research at a fast pace, I hope this makes it easier for folks to implement Gaussian Splatting or its variants in their products.
Would love to hear your feedback!
For more context, see previous HN discussions on Gaussian Splatting:
Thank you so much for open sourcing this!
I've started a project that initializes each pixel of an image as a Gaussian, using a depth map to unproject them into space.
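Conceptually it's just a pinhole back-projection per pixel. A minimal C++ sketch of what I mean (illustrative names, assuming known intrinsics and metric depth; not my actual code):

    #include <cstddef>
    #include <vector>

    // One Gaussian centre per valid depth pixel, in camera space.
    struct Point3 { float x, y, z; };

    // fx, fy, cx, cy are the pinhole intrinsics; depth is in metres.
    std::vector<Point3> unprojectDepthMap(const std::vector<float>& depth,
                                          int width, int height,
                                          float fx, float fy, float cx, float cy) {
        std::vector<Point3> points;
        points.reserve(static_cast<std::size_t>(width) * height);
        for (int v = 0; v < height; ++v) {
            for (int u = 0; u < width; ++u) {
                float z = depth[static_cast<std::size_t>(v) * width + u];
                if (z <= 0.0f) continue;           // skip invalid depth
                float x = (u - cx) * z / fx;       // back-project to camera space
                float y = (v - cy) * z / fy;
                points.push_back({x, y, z});       // becomes a Gaussian centre
            }
        }
        return points;
    }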
I've been very curious what a few training iterations would do to optimize my scenes. The original 3DGS implementation is not accessible on my hardware at the moment! I really look forward to your training implementation!
That's super interesting. Are you trying to do single-view image to 3D?
Kinda! My ultimate goal is to join multiple 3D reconstructions from a set of iPhones placed at different positions, like a hemisphere rig. The single-view image-to-3D is pretty accurate on its own, but I want to combine it across lots of iPhones.
Does colmap fail on your dataset?
I honestly haven't tried! I'm trying to do this per frame of a live video. I had not considered COLMAP for performance reasons.
For training you can also try https://github.com/pierotofy/opensplat
Great project; it's nice to see so many people building stuff with Gaussian Splatting.
Just the other day I went through the whole list of known viewers on MrNeRF's awesome 3DGS resources to find one that runs on a MacBook. I'm working on compressing 3D scenes by sorting Gaussians into 2D grids [0], and I wanted a native viewer that I could use for experiments on the go, perhaps as an alternative backend to the CUDA one in my colleague's exploratory Python viewer [1].
VulkanSplatting was the only one I could get to compile and run on my Intel MacBook. Unfortunately the feeble Intel GPU isn't able to display even the Lego scene at an interactive framerate. Do you think there's performance headroom, and that it will become possible in the future, or should I give up trying to run this on an Intel MBP?
[0]: https://fraunhoferhhi.github.io/Self-Organizing-Gaussians/
Thanks for trying it out! I haven't had the opportunity to benchmark this on an Intel Macbook. Were you able to see which kernel takes the most time? There should be a performance graph if you have the GUI enabled.
For my Apple Silicon benchmarks, the main bottleneck is the parallel radix sort that sorts the Gaussians by tile and depth. I used some shaders from a sorting library, but they lag behind SOTA parallel sort algorithms. I think fixing this would give a 1.5x overall performance boost, and maybe 3x on MacBooks. The wave size also isn't tuned for different GPUs.
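For context, the sort works on combined tile+depth keys, same idea as the reference implementation: tile ID in the high bits, depth bits in the low bits, so one sort orders everything by tile and then front to back within each tile. A rough CPU sketch of that key layout (std::sort standing in for the GPU radix sort; names are illustrative, not the actual shader code):

    #include <algorithm>
    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Pack (tileId, depth) into one 64-bit key: tile in the high 32 bits,
    // the IEEE-754 bits of the depth in the low 32 bits. Sorting the keys
    // groups Gaussians by tile and orders them front to back within a tile.
    // (Assumes depth >= 0, so the float bit pattern sorts monotonically.)
    uint64_t makeSortKey(uint32_t tileId, float depth) {
        uint32_t depthBits;
        std::memcpy(&depthBits, &depth, sizeof(depthBits));
        return (static_cast<uint64_t>(tileId) << 32) | depthBits;
    }

    void sortByTileThenDepth(std::vector<uint64_t>& keys) {
        // On the GPU this is a multi-pass parallel radix sort over the keys;
        // std::sort is just the serial reference.
        std::sort(keys.begin(), keys.end());
    }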
Another area for improvement is better management of shared memory. Right now we just let the driver use it as L1 cache, but we could manage it manually and group Gaussian fetches for the same tile into batches. This is what the official implementation does.
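In CPU terms, that access pattern looks roughly like this (a sketch only; the batch size and the blending are illustrative placeholders, not the real kernel):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Illustrative stand-in for a projected Gaussian; the real struct also
    // carries the 2D mean, conic matrix, and colour.
    struct Gaussian2D { float alpha; };

    constexpr std::size_t kBatchSize = 256;  // would match the workgroup size on the GPU

    // Every pixel of a tile reuses the same staged batch instead of
    // re-reading those Gaussians from global memory.
    void renderTile(const std::vector<Gaussian2D>& tileGaussians,
                    std::vector<float>& tileAlpha /* per-pixel accumulated opacity */) {
        std::vector<Gaussian2D> staged(kBatchSize);  // stands in for shared memory
        for (std::size_t base = 0; base < tileGaussians.size(); base += kBatchSize) {
            std::size_t count = std::min(kBatchSize, tileGaussians.size() - base);
            // Cooperative load: on the GPU each thread copies one Gaussian,
            // then the workgroup hits a barrier before anyone reads the batch.
            std::copy_n(tileGaussians.begin() + base, count, staged.begin());
            // Blend: every pixel in the tile walks the same staged batch.
            for (float& a : tileAlpha) {
                for (std::size_t i = 0; i < count; ++i) {
                    // Front-to-back opacity compositing, ignoring the Gaussian
                    // falloff term for brevity.
                    a += staged[i].alpha * (1.0f - a);
                }
            }
        }
    }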
Although 3DGS is the first radiance field with SOTA quality that runs in real-time, I think it's still quite heavy. Due to the explicit representation of the scene, a lot of operations are memory bound. If you can't get an interactive frame rate right now, it's unlikely the improvements will make a material difference.
Hopefully that's where your work on compression comes in and solves the problem :)
For some reason the GUI is not showing up for me in 3DGS.cpp. (I checked out the repo, made a build folder and built with Ninja, then launched ./apps/viewer/vulkan_splatting_viewer).
I still have a VulkanSplatting build from this Wednesday, though. In VulkanSplatting, when looking at the Lego scene from above with the default window size, I'm getting just below 1 ms for the sorting kernel and just above 1 ms for "render"; everything else is too small to register in the graph. But it only displays at a handful of fps, so quite a bit of time seems to go unaccounted for.
Maybe spinning up Instruments could give some more insight into what's happening? I tried `cmake -G Xcode` to set that up easily, but the Xcode CMake generation fails with:
    CMake Error in src/shaders/CMakeLists.txt:
      The custom command generating
        3DGS.cpp/build-xcode/shaders/shaders.h
      is attached to multiple targets:
        shaders
        xcode_shaders
      but none of these is a common dependency of the other(s). This is not
      allowed by the Xcode "new build system".
Yeah, I gave it a try and the timings seem to be very wrong. I'll fix that soon.
I haven't tried benchmarking SPIR-V shaders on macOS. Since they're translated into Metal shaders anyway, it should theoretically be possible.
Also, for the command line viewer in the new version, I've only tested Make and Ninja. I'll take a look at Xcode when I get a chance.
Update: I just gave Instruments a try and it seems like the Metal compiler grouped all of the compute and copy operations together and just left the timestamp operations to run back to back. Since MoltenVK isn't a conformant implementation, I'm guessing the synchronization dependencies weren't respected.
However, I'm still getting 200 ms frame times on the Garden scene at 4K on an M1 Pro. The Lego scene shouldn't be too bad even on an Intel MacBook.
This looks interesting. A little sad that even though it supports iOS, the license is LGPL 2.1, which seems like a dealbreaker for anyone who would like to release a project based on it on the App Store. Unless the EU uses a bigger stick on Apple and we finally get an alternative to the App Store, there is no easy way for users to swap dynamically linked libraries.
? There is already plenty of software using the (L)GPL on the App Store. https://en.m.wikipedia.org/wiki/List_of_free_and_open-source...
Hell, WebKit, which is the basis of all browsing on iOS, is LGPL.
I wonder if this technique could be used to bring back PS1-esque games with pre-rendered backgrounds, but with actual depth.
The original paper doesn't work well in few-shot settings, and I'm assuming there is only one camera angle for each pre-rendered background. For single-image-to-3D, check out DreamGaussian [1].
Pre-rendered in this context means you still have access to the full 3D scene, so you can generate the full Gaussian Splatting model from that. The benefit here would be to lower the cost of rendering that complicated scene. There is a game called Fantasian (by the former FF dev) that uses real-life dioramas as backgrounds; I bet this tech would've been a perfect fit for that too.
Oh I see. That's a pretty cool use case. Not sure how GS would perform on non-photorealistic scenarios, but certainly worth a try.
Probably. I saw one where they took scenes from movies and recreated 3D versions of them.
I'd love to try it out, but where can I get the scene files?
The original 3DGS paper offers a download of a few scenes (14 GB): https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/dat...
The link is taken from the project's GitHub page: https://github.com/graphdeco-inria/gaussian-splatting