• Abishek_Muthian a day ago

    I'm looking at my Jetson Nano in the corner, which is fulfilling its post-retirement role as a paperweight because Nvidia abandoned it within 4 years.

    The Jetson Nano, an SBC for "AI", debuted with an already-aging custom Ubuntu 18.04, and when 18.04 went EOL, Nvidia abandoned it completely without any further updates to its proprietary JetPack or drivers; without them, the whole machine-learning stack (CUDA, PyTorch, etc.) became useless.

    I'll never buy an SBC from Nvidia unless all the software support is upstreamed to the Linux kernel.

    • lolinder a day ago

      This is a very important point.

      In general, Nvidia's relationship with Linux has been... complicated. On the one hand, at least they offer drivers for it. On the other, I have found few more reliable ways to irreparably break a Linux installation than trying to install or upgrade those drivers. They don't seem to prioritize it as a first-class citizen, more just tolerate it to the bare minimum required to claim it works.

      • dotancohen 21 hours ago

          > Nvidia's relationship with Linux has been... complicated.
        
        For those unfamiliar with Linus Torvalds' two-word opinion of Nvidia:

        https://youtube.com/watch?v=OF_5EKNX0Eg

        • stabbles a day ago

          Now that the majority of their revenue comes from data centers instead of Windows gaming PCs, you'd think their relationship with Linux would improve, or already has.

          • sangnoir 21 hours ago

            Nvidia segments its big-iron AI hardware from the consumer/prosumer segment. They do this by forbidding the use of GeForce drivers in datacenters[1]. All that to say, it is possible for the H100 to have excellent Linux support while support for the 4090 is awful.

            1. https://www.datacenterdynamics.com/en/news/nvidia-updates-ge...

            • robhlt 20 hours ago

              They have been making real improvements over the last few years. Most of their proprietary driver code is in firmware now, and the kernel driver is open source[1] (the userland side is still closed, though).

              They've also significantly improved support for Wayland and stopped trying to force EGLStreams on the community. Wayland + Nvidia works quite well now, especially after they added explicit sync support.

              1. https://github.com/NVIDIA/open-gpu-kernel-modules/

            • lolinder a day ago

              It's possible. I haven't had a system completely destroyed by Nvidia in the last few years, but I've been assuming that's because I've gotten in the habit of just not touching it once I get it working...

              • godelski 21 hours ago

                I update drivers regularly. I've only had one display failure, and it was solved by a simple rollback. To be a bit fair (:/), it was specifically a combination of a new beta driver and a newer kernel. It's definitely improved a ton; 10 years ago I just would not update them except very carefully.

                • lolinder 18 hours ago

                  I've bricked multiple systems just by running apt install on the Nvidia drivers. I have no idea how, but I'd run the installation, everything would work fine, and then after a reboot the system wouldn't even boot.

                  That was years ago, but it happened multiple times and I've been very cautious ever since.

                • pplonski86 21 hours ago

                  I've had a similar experience. I'd rather switch CUDA versions along with the whole machine. What's more, the speed and memory of the hardware improve quickly over time as well.

                  • KerrAvon 20 hours ago

                    I have been having a fine time with a 3080 on recent Arch, FWIW.

                    HDR support is still painful, but that seems to be a Linux problem, not specific to Nvidia.

                • FuriouslyAdrift 20 hours ago

                  The Digits device runs the same Nvidia DGX OS (Nvidia's custom Ubuntu distro) that they run on their cloud infra.

                • vladslav a day ago

                  I've had a similar experience, my Xavier NX stopped working after the last update and now it's just collecting dust. To be honest, I've found the Nvidia SBC to be more of a hassle than it's worth.

                  • busterarm a day ago

                    Xavier AGX owner here to report the same.

                    • justin66 a day ago

                      My Jetson TX2 developer kit didn't stop working, but it's on a very out of date Linux distribution.

                      Maybe if Nvidia makes it to four trillion in market cap they'll have enough spare change to keep these older boards properly supported, or at least upstream all the needed support.

                      • lexszero_ 13 hours ago

                        Back in 2018 I was involved in developing a product based on the TX2. I had to untangle the entire nasty mess of Bash and Python spaghetti that is the JetPack SDK to get everything sensibly integrated into our custom firmware build system and workflow (no, copying your application files over a prebaked rootfs on a running board is absolutely NOT how it's normally done). You basically need a few deb packages with the Nvidia libs for your userspace, plus a few binaries swiped from JetPack that have to be run with like 20 undocumented arguments in the right order to do the rest (image assembly, flashing, signing, secure boot stuff, etc.); the rest of the system could be anything. Right when I was finished, a 3rd-party Yocto layer appeared implementing essentially the same stuff I had come up with, and the world could finally forget about the horrors of JetPack for good. I also heard that it improved somewhat later on, but I have not touched any Nvidia SoCs since (due to both trauma and moving to a different field).

                        • aleden 21 hours ago

                          Are you aware that mainline linux runs on these Jetson devices? It's a bit of annoying work, but you can be running ArchLinuxARM.

                          https://github.com/archlinuxarm/PKGBUILDs/pull/1580

                          Edit: It's been a while since I did this, but I had to manually build the kernel, overwrite a dtb file maybe (and Linux_for_Tegra/bootloader/l4t_initrd.img) and run something like this (for xavier)

                            sudo ./flash.sh -N 128.30.84.100:/srv/arch -K /home/aeden/out/Image -d /home/aeden/out/tegra194-p2972-0000.dtb jetson-xavier eth0
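                            # (flags from memory: -N = NFS rootfs to use as /, -K = kernel Image,
                            #  -d = device tree blob; the last two args are the board config and root device)
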
                          • justin66 20 hours ago

                            How close does any of that get a person to having Ubuntu 24.04 running on their board?

                            (I guess we can put aside the issue of Nvidia's closed source graphics drivers for the moment)

                            • aleden 18 hours ago

                              You could install Ubuntu 24.04 using debootstrap. That would just get you the userspace, though; you'd still have to build your own kernel image.
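
                              Roughly something like this (a sketch from memory; release name, mirror, and target path are placeholders), run on an arm64 host or under qemu-user-static:

                                sudo debootstrap --arch=arm64 noble /mnt/rootfs http://ports.ubuntu.com/ubuntu-ports
                                # then chroot in, set a password, write an fstab, and point your
                                # self-built kernel at that rootfs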

                              • nightski 19 hours ago

                                Isn't the Jetson line more of an embedded line and not an end-user desktop? Why would you run Ubuntu?

                                • justin66 18 hours ago

                                  The Jetson TX2 developer kit makes a very nice developer machine - an ARM64 machine with good graphics acceleration, CUDA, etc.

                                  In any case, Ubuntu is what it comes with.

                                  • aleden 12 hours ago

                                      If you spent enough time and energy on it... I'm fairly confident you could get the newest Ubuntu running. You'd have to build your own kernel, manually generate the initramfs, and figure out how to flash it all. You'd probably run into stupid little problems, like the partition table the flash script makes not allocating enough space for the kernel you've built. I'm sure there would be hiccups, at the very least, but everything's out there to do it.
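
                                      The kernel side would look roughly like this (a sketch; assumes a cross toolchain and whichever Tegra-enabled source tree you end up using):

                                        make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig
                                        make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j"$(nproc)" Image dtbs modules
                                        # install the modules into the rootfs, build an initramfs against them,
                                        # then flash the kernel + dtb with flash.sh as in my other comment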

                                  • verall 18 hours ago

                                  Jetsons are embedded devices that run Ubuntu. Ubuntu is the OS they ship with.

                            • smallmancontrov a day ago

                              Wait, my AGX is still working, but I have kept it offline and away from updates. Do the updates kill it? Or is it a case of not supporting newer pytorch or something else you need?

                              • moondev 14 hours ago

                                Xavier AGX is awesome for running ESXi aarch64 edition, including aarch64 Windows vms

                            • aseipp a day ago

                              The Orin series and later use UEFI and you can apparently run upstream, non-GPU enabled kernels on them. There's a user guide page documenting it. So I think it's gotten a lot better, but it's sort of moot because the non-GPU thing is because the JetPack Linux fork has a specific 'nvgpu' driver used for Tegra devices that hasn't been unforked from that tree. So, you can buy better alternatives unless you're explicitly doing the robotics+AI inference edge stuff.

                              But the impression I get from this device is that it's closer in spirit to the Grace Hopper/datacenter designs than it is the Tegra designs, due to both the naming, design (DGX style) and the software (DGX OS?) which goes on their workstation/server designs. They are also UEFI, and in those scenarios, you can (I believe?) use the upstream Linux kernel with the open source nvidia driver using whatever distro you like. In that case, this would be a much more "familiar" machine with a much more ordinary Linux experience. But who knows. Maybe GH200/GB200 need custom patches, too.

                                Time will tell, but if this is a good GPU paired with a good ARM Cortex design, and it works more like a traditional Linux box than the Jetson series does, it may be a great local AI inference machine.

                              • moondev 14 hours ago

                                AGX also has UEFI firmware which allows you to install ESXi. Then you can install any generic EFI arm64 iso in a VM with no problems, including windows.

                              • halJordan a day ago

                                  It runs their DGX OS, and Jensen specifically said it would be a full part of their HW stack.

                                • startupsfail a day ago

                                    If this is DGX OS, then yes, this is what you’ll find installed on their 4-card workstations.

                                    This is more like a micro-DGX, then, for $3k.

                                • yoyohello13 a day ago

                                  And unless there is some expanded maintenance going on, 22.04 is EOL in 2 years. In my experience, vendors are not as on top of security patches as upstream. We will see, but given NVIDIA's closed ecosystem, I don't have high hopes that this will be supported long term.

                                  • saidinesh5 a day ago

                                    Is there any recent, powerful SBC with fully upstream kernel support?

                                    I can only think of raspberry pi...

                                    • sliken 19 hours ago

                                        rk3588 is pretty close. I believe it's usable today, just missing a few corner cases with HDMI or some such. I believe the last patches are either pending or already applied to an RC.

                                      • msh a day ago

                                          The Odroid H series. But that packs an x86 CPU.

                                        • shadowpho 21 hours ago

                                          Radha but that’s n100 aka x64

                                        • nickpsecurity a day ago

                                          If its stack still works, you might be able to sell or donate it to a student experimenting. They can still learn quite a few things with it. Maybe even use it for something.

                                          • sangnoir 21 hours ago

                                              Using outdated TensorFlow (v1 from 2018) or outdated PyTorch makes learning harder than it needs to be, considering most resources online use much newer versions of the frameworks. If you're learning the fundamentals, working from first principles and creating the building blocks yourself, then it adds to the experience. However, most people just want to build different types of nets, and that's hard to do when the code won't work for you.

                                          • tcdent a day ago

                                            If you're expecting this device to stay relevant for 4 years you are not the target demographic.

                                            Compute is evolving way too rapidly to be setting-and-forgetting anything at the moment.

                                            • tempoponet a day ago

                                                Today I'm using 2x 3090s, which are over 4 years old at this point and still very usable. To get 48GB of VRAM I would need 3x 5070 Ti - still over $2k.

                                              In 4 years, you'll be able to combine 2 of these to get 256gb unified memory. I expect that to have many uses and still be in a favorable form factor and price.

                                              • mrybczyn a day ago

                                                Eh? By all indications compute is now evolving SLOWER than ever. Moore's Law is dead, Dennard scaling is over, the latest fab nodes are evolutionary rather than revolutionary.

                                                This isn't the 80s when compute doubled every 9 months, mostly on clock scaling.

                                                • sliken 19 hours ago

                                                  Indeed, generational improvements are at an all time low. Most of the "revolutionary" AI and/or GPU improvements are less precision (fp32 -> fp16 -> fp8 -> fp4) or adding ever more fake pixels, fake frames, and now in the most recent iteration multiple fake frames per computed frame.

                                                    I believe Nvidia briefly published some numbers for the 5000 series that showed DLSS-off performance, which allowed a fair comparison to the previous generation (on the order of 25%), then removed them.

                                                  Thankfully the 3rd party benchmarks that use the same settings on old and new hardware should be out soon.

                                                  • tcdent a day ago

                                                    Fab node size is not the only factor in performance. Physical limits were reached, and we're pulling back from the extremely small stuff for the time being. That is the evolutionary part.

                                                      Revolutionary developments are: multi-layer wafer bonding, chiplets (collections of interconnected dies), and backside power delivery. We don't need the transistors to keep getting physically smaller; we need more of them, at increased efficiency, and that's exactly what's happening.

                                                    • dotancohen 21 hours ago

                                                      All that comes with linear increases of heat, and exponential difficulty of heat dissipation (square-cube law).

                                                      There is still progress being made in hardware, but for most critical components it's looking far more logarithmic now as we're approaching the physical material limits.

                                              • Karupan a day ago

                                                I feel this is bigger than the 5x series GPUs. Given the craze around AI/LLMs, this can also potentially eat into Apple’s slice of the enthusiast AI dev segment once the M4 Max/Ultra Mac minis are released. I sure wish I'd held some Nvidia stock; they seem to have been doing everything right over the last few years!

                                                • rbanffy a day ago

                                                  This is something every company should make sure they have: an onboarding path.

                                                  Xeon Phi failed for a number of reasons, but one where it didn't need to fail was the availability of software optimised for it. Now we have Xeons and EPYCs, and MI300Cs with lots of efficient cores, but we could have been writing software tailored for those for 10 years now. Extracting performance from them would be a solved problem at this point. The same applies to Itanium - the very first thing Intel should have made sure it had was good Linux support. They could have had it before the first silicon was released. Itanium was well supported for a while, but it's long dead by now.

                                                  Similarly, Sun failed with SPARC, which also didn't have an easy onboarding path after they gave up on workstations. They did some things right: OpenSolaris ensured the OS remained relevant (it still is, even if a bit niche), and looking the other way on x86 Solaris helped people learn and train on it. Oracle Cloud could, at least, offer it on cloud instances. That would be nice.

                                                  Now we see IBM doing the same - there is no reasonable entry level POWER machine that can compete in performance with a workstation-class x86. There is a small half-rack machine that can be mounted on a deskside case, and that's it. I don't know of any company that's planning to deploy new systems on AIX (much less IBMi, which is also POWER), or even for Linux on POWER, because it's just too easy to build it on other, competing platforms. You can get AIX, IBMi and even IBMz cloud instances from IBM cloud, but it's not easy (and I never found a "from-zero-to-ssh-or-5250-or-3270" tutorial for them). I wonder if it's even possible. You can get Linux on Z instances, but there doesn't seem to be a way to get Linux on POWER. At least not from them (several HPC research labs still offer those).

                                                  • nimish a day ago

                                                    1000% all these ai hardware companies will fail if they don't have this. You must have a cheap way to experiment and develop. Even if you want to only sell a $30000 datacenter card you still need a very low cost way to play.

                                                    Sad to see big companies like intel and amd don't understand this but they've never come to terms with the fact that software killed the hardware star

                                                    • theptip a day ago

                                                      Isn’t the cloud GPU market covering this? I can run a model for $2/hr, or get an 8xH100 if I need to play with something bigger.

                                                      • rbanffy a day ago

                                                        People tend to limit their usage when it's time-billed. You need some sort of desktop computer anyway, so if you spend the $3K this one costs, you get unlimited time with Nvidia's cloud software. When you need to run on bigger metal, then you pay the $2/hour.

                                                        • bmicraft 16 hours ago

                                                          $3k is still very steep for anyone not on a Silicon Valley-like salary.

                                                        • johndough a day ago

                                                          I have the skills to write efficient CUDA kernels, but $2/hr is 10% of my salary, so no way I'm renting any H100s. The electricity price for my computer is already painful enough as is. I am sure there are many eastern European developers who are more skilled and get paid even less. This is a huge waste of resources all due to NVIDIA's artificial market segmentation. Or maybe I am just cranky because I want more VRAM for cheap.

                                                          • rbanffy 19 hours ago

                                                            This has 128GB of unified memory. A similarly configured Mac Studio costs almost twice as much, and I'm not sure the GPU is in the same league (software-support-wise, it isn't, but that's fixable).

                                                            A real shame it's not running mainline Linux - I don't like their distro based on Ubuntu LTS.

                                                        • rbanffy a day ago

                                                          > Sad to see big companies like intel and amd don't understand this

                                                          And it's not like they've never been bitten by this before (Intel has).

                                                          • nimish a day ago

                                                            Well, Intel management is very good at snatching defeat from the jaws of victory

                                                            • the_panopticon 20 hours ago
                                                              • rbanffy 20 hours ago

                                                                At least they don’t suffer from a lack of onboarding paths for x86, and it seems they are doing a nice job with their dGPUs.

                                                                Still unforgivable that their new CPUs hit the market without excellent Linux support.

                                                          • p_ing a day ago

                                                            Raptor Computing provides POWER9 workstations. They're not cheap, still use last-gen hardware (DDR4/PCIe 4 ... and POWER9 itself) but they're out there.

                                                            https://www.raptorcs.com/content/base/products.html

                                                            • rbanffy 19 hours ago

                                                              It kind of defeats the purpose of an onboarding platform if it’s more expensive than the one you think of moving away from.

                                                              IBM should see some entry-level products as loss leaders.

                                                              • throwaway48476 18 hours ago

                                                                They're not offering POWER10 either because IBM closed the firmware again. Stupid move.

                                                              • UncleOxidant 18 hours ago

                                                                There were Phi cards, but they were pricey and power hungry (at the time, now current GPU cards probably meet or exceed the Phi card's power consumption) for plugging into your home PC. A few years back there was a big fire sale on Phi cards - you could pick one up for like $200. But by then nobody cared.

                                                                • rbanffy 4 hours ago

                                                                  Imagine if they were sold at cost in the beginning. Also, think about having one as the only CPU rather than a card.

                                                                • AtlasBarfed a day ago

                                                                  It really mystifies me that Intel, AMD, and other hardware companies (obviously Nvidia in this case) don't either have a consortium or each have their own in-house Linux distribution with excellent support.

                                                                  Windows has always been a barrier to hardware feature adoption for Intel. You had to wait 2 to 3 years, sometimes longer, for Windows to get around to providing hardware support.

                                                                  Any OS optimizations in Windows had to go through Microsoft. So say you added some instructions, custom silicon, or whatever to speed up enterprise databases, or to provide high-speed networking that needed some special kernel features, etc.; there was always Microsoft in the way.

                                                                  Not just in the foot-dragging communication, but in the problem of even getting a line to their tech people.

                                                                  Microsoft would look at every single change and judge it as to whether or not it would challenge their monopoly, whether or not it was in their business interest, and whether or not it kept you, the hardware maker, in a subservient role.

                                                                  • p_ing a day ago

                                                                    From the consumer perspective, it seems that MSFT has provided scheduler changes fairly rapidly for CPU changes, like X3D, P/e cores, etc. At least within a couple of months, if not at release.

                                                                    Amd/Intel work directly with Microsoft for shipping new silicon that would otherwise require it.

                                                                    • rbanffy 19 hours ago

                                                                      > From the consumer perspective, it seems that MSFT has provided scheduler changes fairly rapidly

                                                                      Now they have some competition. This is relatively new, and Satya Nadella reshaped the company because of that.

                                                                • sheepscreek a day ago

                                                                  The developers they are referring to aren’t just enthusiasts; they are also developers who were purchasing SuperMicro and Lambda PCs to develop models for their employers. Many enterprises will buy these for local development because it frees up the highly expensive enterprise-level chip for commercial use.

                                                                  This is a genius move. I am more baffled by the insane form factor that can pack this much power inside a Mac Mini-esque body. For just $6000, two of these can run 400B+ models locally. That is absolutely bonkers. Imagine running ChatGPT on your desktop. You couldn’t dream about this stuff even 1 year ago. What a time to be alive!

                                                                  • HarHarVeryFunny a day ago

                                                                    The 1 PetaFLOP and 200GB model capacity specs are for FP4 (4-bit floating point), which means inference, not training/development. It'd still be a decent personal development machine, but not for models of that size.
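
                                                                    Back-of-the-envelope, counting weights only (no KV cache or activations) and assuming 4-bit quantization:

                                                                      python3 -c 'print(405e9 * 0.5 / 1e9)'   # ~202 GB of FP4 weights for a 405B model
                                                                      # which is why 405B needs two 128GB units linked together, while ~200B fits in one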

                                                                    • numba888 21 hours ago

                                                                      This looks like a bigger brother of Orin AGX, which has 64GB of RAM and runs smaller LLMs. The question will be power and performance vs 5090. We know price is 1.5x

                                                                      • stogot a day ago

                                                                        How does it run 400B models across two? I didn’t see that in the article

                                                                        • tempay a day ago

                                                                          > Nvidia says that two Project Digits machines can be linked together to run up to 405-billion-parameter models, if a job calls for it. Project Digits can deliver a standalone experience, as alluded to earlier, or connect to a primary Windows or Mac PC.

                                                                          • FuriouslyAdrift 20 hours ago

                                                                            Point to point ConnectX connection (RDMA with GPUDirect)

                                                                            • sliken 19 hours ago

                                                                            Not sure exactly, but they mentioned linking two together with ConnectX, which could be Ethernet or IB. No idea on the speed, though.

                                                                          • dagmx a day ago

                                                                            I think the enthusiast side of things is a negligible part of the market.

                                                                            That said, enthusiasts do help drive a lot of the improvements to the tech stack so if they start using this, it’ll entrench NVIDIA even more.

                                                                            • Karupan a day ago

                                                                          I’m not so sure it's negligible. My anecdotal experience is that since Apple Silicon chips were found to be "ok" enough to run inference with MLX, more non-technical people in my circle have asked me how they can run LLMs on their Macs.

                                                                          A smaller market than gamers or datacenters, for sure.

                                                                              • dagmx a day ago

                                                                                I mean negligible to their bottom line. There may be tons of units bought or not, but the margin on a single datacenter system would buy tens of these.

                                                                                It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.

                                                                                • htrp a day ago

                                                                                  >It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.

                                                                                  100%

                                                                                  The people who prototype on a 3k workstation will also be the people who decide how to architect for a 3k GPU buildout for model training.

                                                                                  • mrlongroots a day ago

                                                                                    > It’s purely an ecosystem play imho. It benefits the kind of people who will go on to make potentially cool things and will stay loyal.

                                                                                    It will be massive for research labs. Most academics have to jump through a lot of hoops to get to play with not just CUDA, but also GPUDirect/RDMA/Infiniband etc. If you get older/donated hardware, you may have a large cluster but not newer features.

                                                                                    • ckemere a day ago

                                                                                      Academic minimal-bureaucracy purchasing card limit is about $4k, so pricing is convenient*2.

                                                                                    • bwfan123 a day ago

                                                                                Developers developers developers - Ballmer monkey dance - the key to being entrenched is the platform ecosystem.

                                                                                Also why AWS is giving Trainium credits away for free.

                                                                                    • stuaxo a day ago

                                                                                  It's annoying: I do LLMs for work, have a bit of an interest in them, and like doing stuff with GANs etc.

                                                                                      I have a bit of an interest in games too.

                                                                                      If I could get one platform for both, I could justify 2k maybe a bit more.

                                                                                      I can't justify that for just one half: running games on Mac, right now via Linux: no thanks.

                                                                                  And on the PC side, Nvidia consumer cards only go to 24GB, which is a bit limiting for LLMs, while being very expensive - and I only play games every few months.

                                                                                      • WaxProlix a day ago

                                                                                        The new $2k card from Nvidia will be 32GB but your point stands. AMD is planning a unified chiplet based GPU architecture (AI/data center/workstation/gaming) called UDNA, which might alleviate some of these issues. It's been delayed and delayed though - hence the lackluster GPU offerings from team Red this cycle - so I haven't been getting my hopes up.

                                                                                        Maybe (LP)CAMM2 memory will make model usage just cheap enough that I can have a hosting server for it and do my usual midrange gaming GPU thing before then.

                                                                                        • sliken 19 hours ago

                                                                                      Grace + Hopper, Grace + Blackwell, and the GB10 discussed here are much like the currently shipping AMD MI300A.

                                                                                          I do hope that a AMD Strix Halo ships with 2 LPCAMM2 slots for a total width of 256 bits.

                                                                                          • FuriouslyAdrift 20 hours ago

                                                                                            Unified architecture is still on track for 2026-ish.

                                                                                          • wkat4242 a day ago

                                                                                            32gb as of last night :)

                                                                                          • moralestapia a day ago

                                                                                          Yes, but people already had their Macs for other reasons.

                                                                                            No one goes to an Apple store thinking "I'll get a laptop to do AI inference".

                                                                                            • JohnBooty a day ago

                                                                                            They have: until now, Apple Silicon was the only practical way for many to work with larger models at home, since those machines can be configured with 64-192GB of unified memory. Even the laptops can be configured with up to 128GB of unified memory.

                                                                                              Performance is not amazing (roughly 4060 level, I think?) but in many ways it was the only game in town unless you were willing and able to build a multi-3090/4090 rig.

                                                                                              • moralestapia a day ago

                                                                                                I would bet that people running LLMs on their Macs, today, is <0.1% of their user base.

                                                                                                • sroussey a day ago

                                                                                                  People buying Macs for LLMs—sure I agree.

                                                                                                Since the current macOS ships with small LLMs built in, that number might be closer to 50%, not 0.1%.

                                                                                                  • moralestapia 19 hours ago

                                                                                                  I'm not arguing about whether or not Macs are capable of doing it, but about whether it's a material force that drives people to buy Macs; it's not.

                                                                                                  • justincormack 20 hours ago

                                                                                                  The share is higher than that among buyers of the top-end machines though, which are very high margin.

                                                                                                    • throwaway48476 18 hours ago

                                                                                                      All macs? Yes. But of 192GB mac configs? Probably >50%

                                                                                                  • the_other a day ago

                                                                                                    I'm currently wondering how likely it is I'll get into deeper LLM usage, and therefore how much Apple Silicon I need (because I'm addicted to macOS). So I'm some way closer to your steel man than you'd expect. But I'm probably a niche within a niche.

                                                                                                    • com2kid a day ago

                                                                                                Tons of people do; my next machine will likely be a Mac - 60% for this reason and 40% because Windows is so user-hostile now.

                                                                                                      • kelsey98765431 a day ago

                                                                                                        my $5k m3 max 128gb disagrees

                                                                                                        • moralestapia a day ago

                                                                                                    Doubt it; a year ago useful local LLMs on a Mac (via something like ollama) were barely taking off.

                                                                                                    If what you say is true, you were among the first 100 people on the planet doing this; which, btw, further supports my argument about how extremely rare that use case is for Mac users.

                                                                                                          • sroussey a day ago

                                                                                                      No, I got a MacBook Pro 14” with M2 Max and 64GB for LLMs, and that was two generations back.

                                                                                                            • kgwgk 17 hours ago

                                                                                                              People were running llama.cpp on Mac laptops in March 2023 and Llama2 was released in July 2023. People were buying Macs to run LLMs months before M3 machines became available in November 2023.

                                                                                                      • qwertox a day ago

                                                                                                        You could have said the same about gamers buying expensive hardware in the 00's. It's what made Nvidia big.

                                                                                                        • spaceman_2020 a day ago

                                                                                                          I keep thinking about stocks that have 100xd, and most seemed like obscure names to me as a layman. But man, Nvidia was a household name to anyone that ever played any game. And still so many of us never bothered buying the stock

                                                                                                          Incredible fumble for me personally as an investor

                                                                                                          • motoxpro a day ago

                                                                                                    Unless you predicted AI and crypto, it was just really good, not 100x. It 20x'd from 2005-2020 but ~500x'd from 2005-2025.

                                                                                                    And if you truly did predict that Nvidia would own those markets and that those markets would be massive, you could have also bought Amazon, Google, or heck, even Bitcoin. Anything you touched in tech would have made you a millionaire, really.

                                                                                                            • fragmede a day ago

                                                                                                      Survivorship bias though. It's hard to name all the companies that failed in the dot com bust, but even among the ones that made it through, because they're not around any more, they're harder to remember than the winners. But MCI, Palm, RIM, Nortel, Compaq, Pets.com, and Webvan all failed and went to zero. There's an uncountable number of ICOs and NFTs that ended up nowhere. SVB isn't exactly a tech stock but they were strongly connected to it and they failed.

                                                                                                              • adolph a day ago

                                                                                                                It is interesting to think about crypto as a stairstep that Nvidia used to get to its current position in AI. It wasn't games > ai, but games > crypto > ai.

                                                                                                              • robohoe a day ago

                                                                                                                Nvidia joined S&P500 in 2001 so if you've been doing passive index fund investing, you probably got a little bit of it in your funds. So there was some upside to it.

                                                                                                              • Cumpiler69 a day ago

                                                                                                      There are a lot more gamers than there are people wanting to play with LLMs at home.

                                                                                                                • anonylizard a day ago

                                                                                                        There's a titanic market of people wanting an uncensored local LLM/image/video generation model. This market heavily overlaps with gamers today, but will grow exponentially every year.

                                                                                                                  • JohnBooty a day ago

                                                                                                                    I'm sure a lot of people see "uncensored" and think "porn" but there's a lot of stuff that e.g. Dall-E won't let you do.

                                                                                                                    Suppose you're a content creator and you need an image of a real person or something copyrighted like a lot of sports logos for your latest YouTube video's thumbnail. That kind of thing.

                                                                                                                    I'm not getting into how good or bad that is; I'm just saying I think it's a pretty common use case.

                                                                                                                    • stuaxo a day ago

                                                                                                                      Apart from the uncensored bit, I'm in this small market.

                                                                                                            Do I buy a MacBook with a silly amount of RAM when I only want to mess with images occasionally?

                                                                                                            Do I get a big Nvidia card, topping out at 24GB - still small for some LLMs, but at least I could occasionally play games with it?

                                                                                                                      • Cumpiler69 a day ago

                                                                                                              How big is that market you claim? Local LLM image generation already exists out of the box on the latest Samsung flagship phones, and it's mostly a gimmick that gets old pretty quickly. Hardly comparable to gaming in terms of market size and profitability.

                                                                                                              Plus, YouTube and Google Images are already full of AI-generated slop, and people are already tired of it. "AI fatigue" amongst the majority of general consumers is a documented thing. Gaming fatigue is not.

                                                                                                                        • TeMPOraL a day ago

                                                                                                                          > Gaming fatigues is not.

                                                                                                                          It is. You may know it as the "I prefer to play board games (and feel smugly superior about it) because they're ${more social, require imagination, $whatever}" crowd.

                                                                                                                          • Cumpiler69 a day ago

                                                                                                                            The market heavily disagrees with you.

                                                                                                                            "The global gaming market size was valued at approximately USD 221.24 billion in 2024. It is forecasted to reach USD 424.23 billion by 2033, growing at a CAGR of around 6.50% during the forecast period (2025-2033)"

                                                                                                                            • com2kid a day ago

                                                                                                                              Farmville style games underwent similar explosive estimates of growth, up until they collapsed.

                                                                                                                              Much of the growth in gaming of late has come from exploitive dark patterns, and those dark patterns eventually stop working because users become immune to them.

                                                                                                                              • mrguyorama 21 hours ago

                                                                                                                                >Farmville style games underwent similar explosive estimates of growth, up until they collapsed.

                                                                                                                                They did not collapse, they moved to smartphones. The "free"-to-play gacha portion of the gaming market is so successful it is most of the market. "Live service" games are literally traditional game makers trying to grab a tiny slice of that market, because it's infinitely more profitable than making actual games.

                                                                                                                                >those dark patterns eventually stop working because users become immune to them.

                                                                                                                                Really? Slot machines have been around for generations and have not become any less effective. Gambling of all forms has relied on the exact same physiological response for millennia. None of this is going away without legislation.

                                                                                                                                • com2kid 20 hours ago

                                                                                                                                  > Slot machines have been around for generations and have not become any less effective.

                                                                                                                        Slot machines are not a growth market. The majority of people wised up to them literal generations ago, although enough people remain susceptible to maintain a handful of city economies.

                                                                                                                                  > They did not collapse, they moved to smartphones

                                                                                                                                  Agreed, but the dark patterns being used are different. The previous dark patterns became ineffective. The level of sophistication of psychological trickery in modern f2p games is far beyond anything Farmville ever attempted.

                                                                                                                        The rise of live-service games also does not bode well for infinite growth in the industry, as there are only so many hours to go around each day for playing games, and even the evilest of player-manipulation techniques can only squeeze so much blood from a stone.

                                                                                                                                  The industry is already seeing the failure of new live service games to launch, possibly analogous to what happened in the MMO market when there was a rush of releases after WoW. With the exception of addicts, most people can only spend so many hours a day playing games.

                                                                                                                          • madwolf a day ago

                                                                                                                I think he implied AI-generated porn. Perhaps also other kinds of images that are at odds with morality and/or the law. I'm not sure, but Samsung phones probably don't let you do that.

                                                                                                                          • weregiraffe a day ago

                                                                                                                            >There's a titanic market

                                                                                                                            Titanic - so about to hit an iceberg and sink?

                                                                                                                            • otabdeveloper4 a day ago

                                                                                                                              > There's a titanic market with people wanting some uncensored local LLM/image/video generation model.

                                                                                                                              No. There's already too much porn on the internet, and AI porn is cringe and will get old very fast.

                                                                                                                              • ceejayoz a day ago

                                                                                                                                AI porn is currently cringe, just like Eliza for conversations was cringe.

                                                                                                                                The cutting edge will advance, and convincing bespoke porn of people's crushes/coworkers/bosses/enemies/toddlers will become a thing. With all the mayhem that results.

                                                                                                                                • otabdeveloper4 a day ago

                                                                                                                        It will always be cringe due to how so-called "AI" works. Since it's fundamentally just log-likelihood optimization under the hood, it will always produce the statistically most average image. Which means it will always have that characteristic "plastic" and overdone look.

                                                                                                                                  • ceejayoz 21 hours ago

                                                                                                                                    The current state of the art in AI image generation was unimaginable a few years back. The idea that it'll stay as-is for the next century seems... silly.

                                                                                                                                    • otabdeveloper4 7 hours ago

                                                                                                                                      If you're talking about some sort of non-existent sci-fi future "AI" that isn't just log-likelihood optimization, then most likely such a fantastical thing wouldn't be using NVidia's GPU with CUDA.

                                                                                                                                      This hardware is only good for current-generation "AI".

                                                                                                                                • JohnBooty a day ago

                                                                                                                                  I think there are a lot of non-porn uses. I see a lot of YouTube thumbnails that seem AI generated, but feature copyrighted stuff.

                                                                                                                                  (example: a thumbnail for a YT video about a video game, featuring AI-generated art based on that game. because copyright reasons, in my very limited experience Dall-E won't let you do that)

                                                                                                                                  I agree that AI porn doesn't seem a real market driver. With 8 billion people on Earth I know it has its fans I guess, but people barely pay for porn in the first place so I reallllly dunno how many people are paying for AI porn either directly or indirectly.

                                                                                                                                  It's unclear to me if AI generated video will ever really cross the "uncanny valley." Of course, people betting against AI have lost those bets again and again but I don't know.

                                                                                                                                  • Filligree a day ago

                                                                                                                                    > No. There's already too much porn on the internet, and AI porn is cringe and will get old very fast.

                                                                                                                                    I needed an uncensored model in order to, guess what, make an AI draw my niece snowboarding down a waterfall. All the online services refuse on the basis that the picture contains -- oh horrors -- a child.

                                                                                                                                    "Uncensored" absolutely does not imply NSFW.

                                                                                                                                    • otabdeveloper4 a day ago

                                                                                                                                      Yeah, and there's that story about "private window" mode in browsers because you were shopping for birthday gifts that one time. You know what I mean though.

                                                                                                                                      • Filligree a day ago

                                                                                                                                        I really don't. Censored models are so censored they're practically useless for anything but landscapes. Half of them refuse to put humans in the pictures at all.

                                                                                                                                    • Paradigma11 a day ago

                                                                                                                                      I think scams will create far more demand. Spear-phishing targets by creating persistent, elaborate online environments is going to be big.

                                                                                                                                    • itsoktocry a day ago

                                                                                                                                      >There's a titantic market

                                                                                                                                      How so?

                                                                                                                                      Only 40% of gamers use a PC, a portion of those use AI in any meaningful way, and a fraction of those want to set up a local AI instance.

                                                                                                                                      Then someone releases an uncensored, cloud based AI and takes your market?

                                                                                                                                    • estebarb a day ago

                                                                                                                                      Sure, but those developers will create functionality that requires advanced GPUs, and people will want that functionality. Eventually OSes will expect it and it will become the default everywhere. So it is an important step that will push Nvidia's growth in the following years.

                                                                                                                                  • gr3ml1n a day ago

                                                                                                                                    AMD thought the enthusiast side of things was a negligible side of the market.

                                                                                                                                    • dagmx a day ago

                                                                                                                                      That’s not what I’m saying. I’m saying that the people buying this aren’t going to shift their bottom line in any kind of noticeable way. They’re already sold out of their money makers. This is just an entrenchment opportunity.

                                                                                                                                    • epolanski a day ago

                                                                                                                                      If this is going to be widely used by ML engineers, in biopharma, etc., and they land $1,000 margins at half a million sales, that's half a billion in revenue, with potential to grow.

                                                                                                                                      • VikingCoder a day ago

                                                                                                                                        If I were NVidia, I would be throwing everything I could at making entertainment experiences that need one of these to run...

                                                                                                                                        I mean, this is awfully close to being "Her" in a box, right?

                                                                                                                                        • dagmx a day ago

                                                                                                                                          I feel like a lot of people miss that Her was a dystopian future, not an ideal to hit.

                                                                                                                                          Also, it’s $3000. For that you could buy subscriptions to OpenAI etc and have the dystopian partner everywhere you go.

                                                                                                                                          • VikingCoder a day ago

                                                                                                                                            We already live in dystopian hell and I'd like to have Scarlett Johansson whispering in my ear, thanks.

                                                                                                                                            Also, I don't particularly want my data to be processed by anyone else.

                                                                                                                                            • nostromo a day ago

                                                                                                                                              Fun fact: Her was set in the year 2025.

                                                                                                                                              • swat535 a day ago

                                                                                                                                                Boring fact: The underlying theme of the movie Her is actually divorce and the destructive impact it has on people, the futuristic AI stuff is just for stuffing!

                                                                                                                                                • AnonymousPlanet a day ago

                                                                                                                                                  The overall theme of Her was human relationships. It was not about AI, and not just about divorce in particular. The AI was just a plot device to include a bodiless person in the equation. Watch it again with this in mind and you will see what I mean.

                                                                                                                                                  • adolph a day ago

                                                                                                                                                    The universal theme of Her was the set of harmonics that define what something is, and the thresholds, boundaries, and windows onto what is not that thing but some other thing, even if the thing perceived is a mirror; not just about human relationships in particular. The relationship was just a plot device to make a work of deep philosophy into a marketable romantic comedy.

                                                                                                                                              • int_19h a day ago

                                                                                                                                                This is exactly the scenario where you don't want "the cloud" anywhere.

                                                                                                                                                • croes a day ago

                                                                                                                                                  OpenAI doesn't make any profit. So either it dies or prices go up. Not to mention the privacy aspect of your own machine and the freedom to choose which models to run.

                                                                                                                                                  • blackoil a day ago

                                                                                                                                                    > So either it dies or prices go up.

                                                                                                                                                    Or efficiency gains in hardware and software catch up, making the current price point profitable.

                                                                                                                                                    • croes 19 hours ago

                                                                                                                                                      Training data gets more and more expensive, and they need constant input, otherwise the AI's knowledge becomes outdated.

                                                                                                                                                    • com2kid a day ago

                                                                                                                                                      OpenAI built a 3 billion dollar business in less than 3 years of a commercial offering.

                                                                                                                                                      • croes 19 hours ago

                                                                                                                                                        3 billion revenue and 5 billion loss doesn’t sound like a sustainable business model.

                                                                                                                                                        • com2kid 6 hours ago

                                                                                                                                                          Rumor has it they run queries at a profit, and most of the cost is in training and staff.

                                                                                                                                                          If that is true, their path to profitability isn't super rocky. Their path to achieving their current valuation may end up being trickier though!

                                                                                                                                                          • VikingCoder 14 hours ago

                                                                                                                                                            The real question is what the next 3 years look like. If it's another 5 billion burned for 3 billion or less in revenue, that's one thing... But...

                                                                                                                                                            • croes 8 hours ago

                                                                                                                                                              How...

                                                                                                                                                              • menaerus 7 hours ago

                                                                                                                                                                A recent report says there are 1M paying customers. At ~30 USD for 12 months this is ~3.6B of revenue, which kinda matches their reported figures. So to break even at their ~5B costs, assuming they need no further major investment in infrastructure, they only need to increase paying subscriptions from 1M to 2M. Since there are ~250M people who have engaged with the OpenAI free tier, a 2x projection doesn't sound too surreal.

                                                                                                                                                      • vasco 19 hours ago

                                                                                                                                                        One man's dystopia is another man's dream. There's no "missing" in the moral of a movie, you make whatever you want out of it.

                                                                                                                                                        • smt88 a day ago

                                                                                                                                                          If Silicon Valley could tell the difference between utopias and dystopias, we wouldn't have companies named Soylent or iRobot, and the recently announced Anduril/Palantir/OpenAI partnership to hasten the creation of either SkyNet or Big Brother wouldn't have happened at all.

                                                                                                                                                          • VikingCoder 14 hours ago

                                                                                                                                                            I mean, we still act like a "wild goose chase" is a bad thing.

                                                                                                                                                            We still schedule "bi-weekly" meetings.

                                                                                                                                                            We can't agree on which way charge goes in a wire.

                                                                                                                                                            Have you seen the y-axis on an economists chart?

                                                                                                                                                          • t0lo a day ago

                                                                                                                                                            The dystopian overton window has shifted, didn't you know, moral ambiguity is a win now? :) Tesla was right.

                                                                                                                                                            • tacticus a day ago

                                                                                                                                                              they don't miss that part. they just want to be the evil character.

                                                                                                                                                              • dnissley 21 hours ago

                                                                                                                                                                Please name the dystopian elements of Her.

                                                                                                                                                              • int_19h a day ago

                                                                                                                                                                The real interesting stuff will happen when we get multimodal LMs that can do VR output.

                                                                                                                                                              • computably a day ago

                                                                                                                                                                Yeah, it's more about preempting competitors from attracting any ecosystem development than the revenue itself.

                                                                                                                                                                • option a day ago

                                                                                                                                                                  today’s enthusiast, grad student, hacker is tomorrow’s startup founder, CEO, CTO or 10x contributor in large tech company

                                                                                                                                                                  • Mistletoe a day ago

                                                                                                                                                                    > tomorrow’s startup founder, CEO, CTO or 10x contributor in large tech company

                                                                                                                                                                    Do we need more of those? We need plumbers and people that know how to build houses. We are completely full on founders and executives.

                                                                                                                                                                    • hatboat 4 hours ago

                                                                                                                                                                      If they're already an "enthusiast, grad student, hacker", are they likely to choose the "plumbers and people that know how to build houses" career track?

                                                                                                                                                                      True passion for one's career is rare, despite the clichéd platitudes encouraging otherwise. That's something we should encourage and invest in regardless of the field.

                                                                                                                                                                      • davrosthedalek a day ago

                                                                                                                                                                        We might not, but Nvidia would certainly like it.

                                                                                                                                                                  • bloomingkales a day ago

                                                                                                                                                                    Jensen did say in a recent interview, paraphrasing, “they are trying to kill my company”.

                                                                                                                                                                    Those Macs with unified memory are a threat he is immediately addressing. Jensen is a wartime CEO from the looks of it; he's not joking.

                                                                                                                                                                    No wonder AMD is staying out of the high-end space, since Nvidia is going head to head with Apple (and AMD is not in the business of competing with Apple).

                                                                                                                                                                    • T-A a day ago

                                                                                                                                                                      From https://www.tomshardware.com/pc-components/cpus/amds-beastly...

                                                                                                                                                                      The fire-breathing 120W Zen 5-powered flagship Ryzen AI Max+ 395 comes packing 16 CPU cores and 32 threads paired with 40 RDNA 3.5 (Radeon 8060S) integrated graphics cores (CUs), but perhaps more importantly, it supports up to 128GB of memory that is shared among the CPU, GPU, and XDNA 2 NPU AI engines. The memory can also be carved up to a distinct pool dedicated to the GPU only, thus delivering an astounding 256 GB/s of memory throughput that unlocks incredible performance in memory capacity-constrained AI workloads (details below). AMD says this delivers groundbreaking capabilities for thin-and-light laptops and mini workstations, particularly in AI workloads. The company also shared plenty of gaming and content creation benchmarks.

                                                                                                                                                                      [...]

                                                                                                                                                                      AMD also shared some rather impressive results showing a Llama 70B Nemotron LLM AI model running on both the Ryzen AI Max+ 395 with 128GB of total system RAM (32GB for the CPU, 96GB allocated to the GPU) and a desktop Nvidia GeForce RTX 4090 with 24GB of VRAM (details of the setups in the slide below). AMD says the AI Max+ 395 delivers up to 2.2X the tokens/second performance of the desktop RTX 4090 card, but the company didn’t share time-to-first-token benchmarks.

                                                                                                                                                                      Perhaps more importantly, AMD claims to do this at an 87% lower TDP than the 450W RTX 4090, with the AI Max+ running at a mere 55W. That implies that systems built on this platform will have exceptional power efficiency metrics in AI workloads.

                                                                                                                                                                      • adrian_b a day ago

                                                                                                                                                                        "Fire breathing" is completely inappropriate.

                                                                                                                                                                        Strix Halo is a replacement for the high-power laptop CPUs from the HX series of Intel and AMD, together with a discrete GPU.

                                                                                                                                                                        The thermal design power of a laptop CPU-dGPU combo is normally much higher than 120 W, which is the maximum TDP recommended for Strix Halo. The faster laptop dGPUs want more than 120 W only for themselves, not counting the CPU.

                                                                                                                                                                        So any claims of being surprised that the TDP range for Strix Halo is 45 W to 120 W are weird, like the commenter has never seen a gaming laptop or a mobile workstation laptop.

                                                                                                                                                                        • bmicraft 16 hours ago

                                                                                                                                                                          > The thermal design power of a laptop CPU-dGPU combo is normally much higher than 120 W

                                                                                                                                                                          Normally? Much higher than 120W? Those are some pretty abnormal (and dare I say niche?) laptops you're talking about there. Remember, that's not peak power - thermal design power is what the laptop should be able to power and cool pretty much continuously.

                                                                                                                                                                          At those power levels, they're usually called DTR: desktop replacement. You certainly can't call it "just a laptop" anymore once we're in needs-two-power-supplies territory.

                                                                                                                                                                          • adrian_b 8 hours ago

                                                                                                                                                                            Any laptop that is marketed as a "gaming laptop" or "mobile workstation" belongs to this category.

                                                                                                                                                                            I do not know what the proportion of gaming laptops and mobile workstations vs. thin-and-light laptops is. While obviously there must be many more light laptops, gaming laptops cannot be a niche product, because there are too many models offered by a lot of vendors.

                                                                                                                                                                            My own laptop is a Dell Precision, so it belongs to this class. I would not call Dell Precision laptops a niche product, even if they are typically used only by professionals.

                                                                                                                                                                            My previous laptop was some Lenovo Yoga that also belonged to this class, having a discrete NVIDIA GPU. In general, any laptop having a discrete GPU belongs to this class, because the laptop CPUs intended to be paired with discrete GPUs have a default TDP of 45 W or 55 W, while the smallest laptop discrete GPUs may have TDPs of 55 W to 75 W, but the faster laptop GPUs have TDPs between 100 W and 150 W, so the combo with CPU reaches a TDP around 200 W for the biggest laptops.

                                                                                                                                                                            • bloomingkales 3 hours ago

                                                                                                                                                                              People are very unaware of just how much better a gaming laptop from 3 years ago is compared to a Copilot laptop. These laptops are sub-$500 on eBay, and Best Buy won't give you more than $150 for one as a trade-in (almost as if they won't admit that those laptops outclass the new category of AI PCs).

                                                                                                                                                                      • nomel 17 hours ago

                                                                                                                                                                        > since NVIDIA is going head on with Apple

                                                                                                                                                                        I think this is a race that Apple doesn't know it's part of. Apple has something that happens to work well for AI, as a side effect of having a nice GPU with lots of fast shared memory. It's not marketed for inference.

                                                                                                                                                                        • JoshTko 21 hours ago

                                                                                                                                                                          Which interview was this?

                                                                                                                                                                        • hkgjjgjfjfjfjf a day ago

                                                                                                                                                                          You missed the Ryzen hx ai pro 395 product announcement

                                                                                                                                                                        • llm_trw a day ago

                                                                                                                                                                          From the people I talk to, the enthusiast market is Nvidia 4090/3090 saturated, because people want to do their fine-tunes, and also porn, in their off time. The Venn diagram of users who post about diffusion models and LLMs running at home is pretty much a circle.

                                                                                                                                                                          • dist-epoch a day ago

                                                                                                                                                                            Not your weights, not your waifu

                                                                                                                                                                            • Tostino a day ago

                                                                                                                                                                              Yeah, I really don't think the overlap is as much as you imagine. At least in /r/localllama and the discord servers I frequent, the vast majority of users are interested in one or the other primarily, and may just dabble with other things. Obviously this is just my observations...I could be totally misreading things.

                                                                                                                                                                            • numba888 a day ago

                                                                                                                                                                              > I sure wished I held some Nvidia stocks, they seem to be doing everything right in the last few years!

                                                                                                                                                                              They were propelled by the unexpected LLM boom. But plan 'A' was robotics, in which Nvidia has invested a lot for decades. I think their time is about to come, with Tesla's humanoids at $20-30k and the Chinese already selling theirs for $16k.

                                                                                                                                                                              • qwertox a day ago

                                                                                                                                                                                This is somewhat similar to what GeForce was to gamers back in the day, but for AI enthusiasts. Sure, the price is much higher, but at least it's a completely integrated solution.

                                                                                                                                                                                • Karupan a day ago

                                                                                                                                                                                  Yep that's what I'm thinking as well. I was going to buy a 5090 mainly to play around with LLM code generation, but this is a worthy option for roughly the same price as building a new PC with a 5090.

                                                                                                                                                                                  • qwertox a day ago

                                                                                                                                                                                    It has 128 GB of unified RAM. It will not be as fast as the 32 GB of VRAM on the 5090, but what gamer cards have always lacked is memory.

                                                                                                                                                                                    Plus you have fast interconnects, if you want to stack them.

                                                                                                                                                                                    I was somewhat attracted by the Jetson AGX Orin with 64 GB RAM, but this one is a no-brainer for me, as long as idle power is reasonable.
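
                                                                                                                                                                                    Rough napkin math on the memory point (a minimal Python sketch; the model sizes and quantization levels are just illustrative assumptions, and it counts weights only, ignoring KV cache and runtime overhead):

                                                                                                                                                                                      def weight_gb(params_billion, bits_per_weight):
                                                                                                                                                                                          """Approximate memory needed just for the weights of a dense model, in GB."""
                                                                                                                                                                                          return params_billion * 1e9 * bits_per_weight / 8 / 1e9

                                                                                                                                                                                      for params in (8, 70, 200):      # model size in billions of parameters
                                                                                                                                                                                          for bits in (16, 8, 4):      # fp16, int8, int4 quantization
                                                                                                                                                                                              gb = weight_gb(params, bits)
                                                                                                                                                                                              # smallest memory pool the weights fit in
                                                                                                                                                                                              fits = "5090 (32GB)" if gb <= 32 else ("128GB unified" if gb <= 128 else "neither")
                                                                                                                                                                                              print(f"{params}B @ {bits}-bit ~ {gb:.0f} GB -> fits: {fits}")

                                                                                                                                                                                    A 70B model at 4-bit is ~35 GB, so it already spills out of a 32 GB card but sits comfortably in 128 GB; at 200B and 4-bit you are around ~100 GB.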

                                                                                                                                                                                    • moffkalast a day ago

                                                                                                                                                                                      Having your main PC as an LLM rig also really sucks for multitasking: if you want to keep a model loaded to use when needed, you have zero resources left to do anything else. GPU memory maxed out, most of the RAM used. Having a dedicated machine, even if it's slower, is a lot more practical imo, since you can actually do other things while it generates instead of having to sit there and wait, unable to do anything else.

                                                                                                                                                                                • tarsinge a day ago

                                                                                                                                                                                  > I sure wished I held some Nvidia stocks

                                                                                                                                                                                  I'm so tired of this recent obsession with the stock market. Now that retail is deeply invested, it is tainting everything, even here on a technology forum. I don't remember people mentioning Apple stock every time Steve Jobs made an announcement in past decades. Nowadays it seems everyone is invested in Nvidia and just wants the stock to go up, and every product announcement is a means to that end. I really hope we get a crash so that we can get back to a saner relationship with companies and their products.

                                                                                                                                                                                  • lioeters a day ago

                                                                                                                                                                                    > hope we get a crash

                                                                                                                                                                                    That's the best time to buy. ;)

                                                                                                                                                                                  • axegon_ a day ago

                                                                                                                                                                                    > they seem to be doing everything right in the last few years

                                                                                                                                                                                    About that... Not like there isn't a lot to be desired from the linux drivers: I'm running a K80 and M40 in a workstation at home and the thought of having to ever touch the drivers, now that the system is operational, terrifies me. It is by far the biggest "don't fix it if it ain't broke" thing in my life.

                                                                                                                                                                                    • sliken 19 hours ago

                                                                                                                                                                                      Use a filesystem that snapshots AND do a complete backup.

                                                                                                                                                                                      • mycall a day ago

                                                                                                                                                                                        Buy a second system which you can touch?

                                                                                                                                                                                        • axegon_ a day ago

                                                                                                                                                                                          That IS the second system (my AI home rig). I've given up on Nvidia for using it on my main computer because of their horrid drivers. I switched to Intel ARC about a month ago and I love it. The only downside is that I have a xeon on my main computer and Intel never really bothered to make ARC compatible with xeons so I had to hack my bios around, hoping I don't mess everything up. Luckily for me, it all went well so now I'm probably one of a dozen or so people worldwide to be running xeons + arc on linux. That said, the fact that I don't have to deal with nvidia's wretched linux drivers does bring a smile to my face.

                                                                                                                                                                                      • paxys a day ago

                                                                                                                                                                                        “Bigger” in what sense? For AI? Sure, because this an AI product. 5x series are gaming cards.

                                                                                                                                                                                        • a________d a day ago

                                                                                                                                                                                          Not expecting this to compete with the 5x series in terms of gaming; but it's interesting to note that the increase in gaming performance Jensen was speaking about with Blackwell was largely related to inferenced frames generated by the tensor cores.

                                                                                                                                                                                          I wonder how it would go as a productivity/tinkering/gaming rig? Could a GPU potentially be stacked in the same way an additional Digit can?

                                                                                                                                                                                          • wpwpwpw a day ago

                                                                                                                                                                                            It would, had Nvidia not crippled NVLink on GeForce.

                                                                                                                                                                                          • Karupan a day ago

                                                                                                                                                                                            Bigger in the sense of the announcements.

                                                                                                                                                                                            • AuryGlenz a day ago

                                                                                                                                                                                              Eh. Gaming cards, but also significantly faster. If the model fits in the VRAM the 5090 is a much better buy.

                                                                                                                                                                                            • technofiend a day ago

                                                                                                                                                                                            Will there really be a Mac mini with Max or Ultra CPUs? This feels like somewhat of an overlap with the Mac Studio.

                                                                                                                                                                                            • croes 19 hours ago

                                                                                                                                                                                              Did they say anything about power consumption?

                                                                                                                                                                                              Apple M chips are pretty efficient.

                                                                                                                                                                                              • iKevinShah a day ago

                                                                                                                                                                                                I can confirm this is the case (for me).

                                                                                                                                                                                                • puppymaster a day ago

                                                                                                                                                                                                  It eats into all of NVDA's consumer-facing clients, no? I can see why OpenAI and others are looking for alternative hardware solutions to train their next model.

                                                                                                                                                                                                  • GaryNumanVevo a day ago

                                                                                                                                                                                                    I bet $100k on NVIDIA stocks ~7 years ago, just recently closed out a bunch of them

                                                                                                                                                                                                    • informal007 a day ago

                                                                                                                                                                                                      I would like to have a Mac as my personal computer and DIGITS as a service to run LLMs.

                                                                                                                                                                                                      • behringer a day ago

                                                                                                                                                                                                        Not only that, but it should help free up the gpus for the gamers.

                                                                                                                                                                                                        • wslh 19 hours ago

                                                                                                                                                                                                          The Nvidia price (USD 3k) is close to that of a top Mac mini, but I trust Apple more than Nvidia for end-to-end support from hardware to apps. Not an Apple fanboy, just a user/dev, and I don't think we realize what Apple really achieved, industrially speaking. The M1 was launched in late 2020.

                                                                                                                                                                                                          • trhway a day ago

                                                                                                                                                                                                            >enthusiast AI dev segment

                                                                                                                                                                                                            I think it isn't about enthusiasts. To me it looks like Huang/NVDA is using the opening provided by the AI wave to push a small revolution further: up until now the GPU was an add-on to the general computing core, onto which that core offloaded some computing. With AI, that offloaded computing becomes de facto the main computing, and Huang/NVDA is turning the tables by making the CPU just a small add-on to the GPU, with some general computing offloaded to that CPU.

                                                                                                                                                                                                            The CPU being located that "close", with unified memory, would stimulate the parallelization of a lot of general computing so that it gets executed on the GPU, much faster, instead of on the CPU. Take the classic of enterprise computing, the SQL databases: a lot in these databases, if not (with some work) everything, can be executed on the GPU with a significant performance gain vs. the CPU. Why isn't it happening today? Load/unload onto the GPU eats into performance, the complexity of having only some operations offloaded to the GPU is very high in dev effort, etc. Streamlined development on a platform with unified memory will change that. That way Huang/NVDA may pull the rug out from under CPU-first platforms like AMD/INTC and own both the new AI computing and a significant share of the classic enterprise side.

                                                                                                                                                                                                            • tatersolid a day ago

                                                                                                                                                                                                              > these databases can be executed on GPU with a significant performance gain vs. CPU

                                                                                                                                                                                                              No, they can’t. GPU databases are niche products with severe limitations.

                                                                                                                                                                                                              GPUs are fast at massively parallel math problems; they aren't useful for all tasks.

                                                                                                                                                                                                              • trhway a day ago

                                                                                                                                                                                                                >GPU databases are niche products with severe limitations.

                                                                                                                                                                                                                Today. For reasons like the ones I mentioned.

                                                                                                                                                                                                                >GPUs are fast at massively parallel math problems; they aren't useful for all tasks.

                                                                                                                                                                                                                GPUs are fast at massively parallel tasks. Their memory bandwidth is 10x that of a CPU, for example. So typical database operations that are massively parallel in nature, like join or filter, would run about that much faster.

                                                                                                                                                                                                                The majority of computing can be parallelized and would thus benefit from being executed on the GPU (given unified memory of a practically usable size for enterprise workloads, like 128GB).
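
                                                                                                                                                                                                                To make "massively parallel" concrete, here is a minimal sketch of a columnar filter plus aggregate (assuming CuPy and an NVIDIA GPU; swap the import for numpy and the same logic runs on the CPU):

                                                                                                                                                                                                                  import cupy as cp  # assumption: CuPy installed and an NVIDIA GPU available

                                                                                                                                                                                                                  # A toy columnar "table": 10M rows of (price, quantity).
                                                                                                                                                                                                                  n = 10_000_000
                                                                                                                                                                                                                  price = cp.random.uniform(0, 1000, n, dtype=cp.float32)
                                                                                                                                                                                                                  qty = cp.random.randint(1, 100, n)

                                                                                                                                                                                                                  # Roughly: SELECT SUM(price * qty) WHERE price > 500 AND qty < 10
                                                                                                                                                                                                                  # Both the mask and the reduction are embarrassingly data-parallel.
                                                                                                                                                                                                                  mask = (price > 500) & (qty < 10)
                                                                                                                                                                                                                  revenue = cp.sum(price[mask] * qty[mask])
                                                                                                                                                                                                                  print(float(revenue))

                                                                                                                                                                                                                Whether joins and a real storage engine map as cleanly is a fair question, but the per-column operations themselves are a natural fit for the hardware.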

                                                                                                                                                                                                                • menaerus 7 hours ago

                                                                                                                                                                                                                  > So, typical database operations, massively parallel in nature like join or filter, would run about that faster.

                                                                                                                                                                                                                  Given workload A, how much of the total runtime would JOIN or FILTER take in contrast to, say, the storage engine layer? My gut feeling tells me not much, since to see an actual gain you'd need to be able to parallelize everything, including the storage engine challenges.

                                                                                                                                                                                                                  IIRC all the startups building databases around GPUs failed to deliver in the last ~10 years. All of them are shut down if I am not mistaken.

                                                                                                                                                                                                                  • trhway 4 hours ago

                                                                                                                                                                                                                    With cheap large RAMs and the SSD the storage has already became much less of an issue, especially when the database is primarily in-memory one.

                                                                                                                                                                                                                    How about attaching SSD based storage to NVLink? :) Nvidia does have the direct to memory tech and uses wide buses, so i don't see any issue for them to direct attach arrays of SSD if they feel like it.

                                                                                                                                                                                                                    >IIRC all the startups building databases around GPUs failed to deliver in the last ~10 years. All of them are shut down if I am not mistaken.

As I already said, the model of a database offloading some ops to a GPU with its own separate memory isn't feasible, and those startups confirmed it. Especially when the GPU had 8-16GB while the main RAM could easily be 1-2TB with 100-200 CPU cores. With 128GB of unified memory like on GB10, the situation looks completely different (that Nvidia allows only two to be connected by NVLink is just market segmentation, not a real technical limitation).

                                                                                                                                                                                                                  • justincormack 20 hours ago

The unified memory is no faster for the GPU than for the CPU, so it's not 10x the CPU. HBM on a GPU is much faster.

                                                                                                                                                                                                                    • trhway 16 hours ago

No. The unified memory on GB10 is much faster than the regular RAM in a CPU system:

                                                                                                                                                                                                                      https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

                                                                                                                                                                                                                      "The GB10 Superchip enables Project DIGITS to deliver powerful performance using only a standard electrical outlet. Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage. With the supercomputer, developers can run up to 200-billion-parameter large language models to supercharge AI innovation."

                                                                                                                                                                                                                      https://www.nvidia.com/en-us/data-center/grace-cpu-superchip...

                                                                                                                                                                                                                      "Grace is the first data center CPU to utilize server-class high-speed LPDDR5X memory with a wide memory subsystem that delivers up to 500GB/s of bandwidth "

As far as I can see, that is about 4x that of Zen 5.
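
A quick back-of-the-envelope comparison of those figures (the 500GB/s number is the one quoted above for Grace; the desktop DDR5 speed is an assumption, and the exact ratio depends on it):

    # Rough bandwidth comparison; numbers are assumptions, not measurements.
    grace_lpddr5x_gbps = 500                 # "up to 500GB/s" from the Grace CPU page quoted above

    # Typical desktop Zen 5 setup: 2 channels x 8 bytes x DDR5 transfer rate
    ddr5_mt_s = 6000                         # assumed DDR5-6000
    desktop_gbps = 2 * 8 * ddr5_mt_s / 1000  # -> 96.0 GB/s

    print(desktop_gbps)
    print(grace_lpddr5x_gbps / desktop_gbps) # ~5x with these assumptions, in the ballpark of the "about 4x" claim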

                                                                                                                                                                                                              • csomar a day ago

Am I the only one disappointed by these? They cost roughly half the price of a MacBook Pro and offer, hmm... half the capacity in RAM. Sure, speed matters in AI, but what do I do with speed when I can't load a 70B model?

On the other hand, with a $5,000 MacBook Pro, I can easily load a 70B model and have a "full" MacBook Pro as a plus. I am not sure I fully understand the value of these cards for someone who wants to run personal AI models.

                                                                                                                                                                                                                • gnabgib a day ago

Are you, perhaps, commenting on the wrong thread? Project Digits is a $3k 128GB computer... the best your $5K MBP can have for RAM is... 128GB.

                                                                                                                                                                                                                  • rictic a day ago

                                                                                                                                                                                                                    Hm? They have 128GB of RAM. Macbook Pros cap out at 128GB as well. Will be interesting to see how a Project Digits machine performs in terms of inference speed.

                                                                                                                                                                                                                    • macawfish a day ago

                                                                                                                                                                                                                      Then buy two and stack them!

Also, I'm unfamiliar with Macs; is there really a MacBook Pro with 256GB of RAM?

                                                                                                                                                                                                                      • csomar a day ago

No, MacBook Pros cap at 128GB. But, still, they're laptops. It'll be interesting to see if Apple can offer a good counter for the desktop. The Mac Pro can go to 192GB, which is closer to the 128GB Digits plus your desktop machine. At a $9,299 price tag, it's not too competitive, but close.

                                                                                                                                                                                                                        • lr1970 a day ago

                                                                                                                                                                                                                          > It'll be interesting to see if Apple can offer a good counter for the desktop.

                                                                                                                                                                                                                          Mac Pro [0] is a desktop with M2 Ultra and up to 192GB of unified memory.

                                                                                                                                                                                                                          [0] https://www.apple.com/mac-pro/

                                                                                                                                                                                                                      • maniroo 8 hours ago

Bro, we can connect two Project Digits as well. I was only looking at the M4 MacBook because of the 128GB unified memory. Now this beast can run bigger LLMs at just $3K, with a 4TB SSD too. An M4 Max MacBook (128GB unified RAM and 4TB storage) is $5,999. So, no more Apple for me. I will just get the Digits, and I can build a workstation as well.

                                                                                                                                                                                                                      • doctorpangloss a day ago

                                                                                                                                                                                                                        What slice?

                                                                                                                                                                                                                        Also, macOS devices are not very good inference solutions. They are just believed to be by diehards.

                                                                                                                                                                                                                        I don't think Digits will perform well either.

                                                                                                                                                                                                                        If NVIDIA wanted you to have good performance on a budget, it would ship NVLink on the 5090.

                                                                                                                                                                                                                        • Karupan a day ago

                                                                                                                                                                                                                          They are perfectly fine for certain people. I can run Qwen-2.5-coder 14B on my M2 Max MacBook Pro with 32gb at ~16 tok/sec. At least in my circle, people are budget conscious and would prefer using existing devices rather than pay for subscriptions where possible.

                                                                                                                                                                                                                          And we know why they won't ship NVLink anymore on prosumer GPUs: they control almost the entire segment and why give more away for free? Good for the company and investors, bad for us consumers.

                                                                                                                                                                                                                          • acchow a day ago

                                                                                                                                                                                                                            > I can run Qwen-2.5-coder 14B on my M2 Max MacBook Pro with 32gb at ~16 tok/sec. At least in my circle, people are budget conscious

                                                                                                                                                                                                                            Qwen 2.5 32B on openrouter is $0.16/million output tokens. At your 16 tokens per second, 1 million tokens is 17 continuous hours of output.

                                                                                                                                                                                                                            Openrouter will charge you 16 cents for that.

                                                                                                                                                                                                                            I think you may want to reevaluate which is the real budget choice here

Edit: elaborating, that extra 16GB of RAM on the Mac to hold the Qwen model costs $400, or equivalently about 1,770 days of continuous output. All assuming electricity is free.
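
The arithmetic behind that estimate, using the prices and token rate stated in the comment:

    # Reproducing the back-of-the-envelope math above (prices/speeds as quoted).
    tok_per_sec = 16
    price_per_million = 0.16                    # $ per 1M output tokens on OpenRouter, as quoted

    hours_per_million = 1_000_000 / tok_per_sec / 3600
    print(round(hours_per_million, 1))          # ~17.4 hours of continuous local output per $0.16

    ram_upgrade_cost = 400                      # $ for the extra 16GB on the Mac, as quoted
    millions_of_tokens = ram_upgrade_cost / price_per_million
    days_equivalent = millions_of_tokens * hours_per_million / 24
    print(round(days_equivalent))               # ~1800 days, close to the ~1,770 figure above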

                                                                                                                                                                                                                            • Karupan a day ago

It's a no-brainer for me because I already own the MacBook and I don't mind waiting a few extra seconds. Also, I didn't buy the Mac for this purpose; it's just my daily device. So yes, I'm sure OpenRouter is cheaper, but I just don't have to think about using it as long as the open models are reasonably good for my use. Of course, your needs may be quite different.

                                                                                                                                                                                                                              • oarsinsync a day ago

                                                                                                                                                                                                                                > Openrouter will charge you 16 cents for that

                                                                                                                                                                                                                                And log everything too?

                                                                                                                                                                                                                                • moffkalast a day ago

                                                                                                                                                                                                                                  It's a great option if you want to leak your entire internal codebase to 3rd parties.

                                                                                                                                                                                                                              • YetAnotherNick a day ago

                                                                                                                                                                                                                                > Also, macOS devices are not very good inference solutions

They are good for single-batch inference and have very good tok/sec/user. ollama works perfectly on a Mac.

                                                                                                                                                                                                                            • derbaum a day ago

                                                                                                                                                                                                                              I'm a bit surprised by the amount of comments comparing the cost to (often cheap) cloud solutions. Nvidia's value proposition is completely different in my opinion. Say I have a startup in the EU that handles personal data or some company secrets and wants to use an LLM to analyse it (like using RAG). Having that data never leave your basement sure can be worth more than $3000 if performance is not a bottleneck.

                                                                                                                                                                                                                              • lolinder a day ago

                                                                                                                                                                                                                                Heck, I'm willing to pay $3000 for one of these to get a good model that runs my requests locally. It's probably just my stupid ape brain trying to do finance, but I'm infinitely more likely to run dumb experiments with LLMs on hardware I own than I am while paying per token (to the point where I currently spend way more time with small local llamas than with Claude), and even though I don't do anything sensitive I'm still leery of shipping all my data to one of these companies.

                                                                                                                                                                                                                                This isn't competing with cloud, it's competing with Mac Minis and beefy GPUs. And $3000 is a very attractive price point in that market.

                                                                                                                                                                                                                                • logankeenan a day ago

                                                                                                                                                                                                                                  Have you been to the localLlama subreddit? It’s a great resource for running models locally. It’s what got me started.

                                                                                                                                                                                                                                  https://www.reddit.com/r/LocalLLaMA/

                                                                                                                                                                                                                                  • lolinder a day ago

                                                                                                                                                                                                                                    Yep! I don't spend much time there because I got pretty comfortable with llama before that subreddit really got started, but it's definitely turned up some helpful answers about parameter tuning from time to time!

                                                                                                                                                                                                                                  • ynniv a day ago

                                                                                                                                                                                                                                    I'm pretty frugal, but my first thought is to get two to run 405B models. Building out 128GB of VRAM isn't easy, and will likely cost twice this.
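
Rough memory math for why a 405B model is a two-box proposition (quantization sizes are approximations that ignore KV cache and runtime overhead):

    # Approximate weight size of a 405B-parameter model at different precisions.
    params = 405e9
    bytes_per_param = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

    for fmt, b in bytes_per_param.items():
        gib = params * b / 1024**3
        print(f"{fmt}: ~{gib:,.0f} GiB")

    # fp16: ~754 GiB, int8: ~377 GiB, int4: ~189 GiB
    # Even a 4-bit quant exceeds one 128GB box, but fits (with headroom) across two.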

                                                                                                                                                                                                                                    • rsanek a day ago

You can get an M4 Max MBP with 128GB for $1k less than two of these single-use devices.

                                                                                                                                                                                                                                      • ynniv 21 hours ago

These are 128GB each. Also, Nvidia's inference speed is much higher than Apple's.

                                                                                                                                                                                                                                        I do appreciate that my MBP can run models though!

                                                                                                                                                                                                                                        • ganoushoreilly 19 hours ago

I read the Nvidia units are 250 TFLOPS vs. the M4 Pro's 27 TFLOPS. If they perform as advertised, I'm in for two.

                                                                                                                                                                                                                                          • lolinder 21 hours ago

                                                                                                                                                                                                                                            Don't these devices provide 128GB each? So you'd need to price in two Macs to be a fair comparison to two Digits.

                                                                                                                                                                                                                                            • layer8 20 hours ago

                                                                                                                                                                                                                                              But then you have to use macOS.

                                                                                                                                                                                                                                        • originalvichy a day ago

                                                                                                                                                                                                                                          Even for established companies this is great. A tech company can have a few of these locally hosted and users can poll the company LLM with sensitive data.

                                                                                                                                                                                                                                          • sensesp a day ago

100%. I see many SMEs unwilling to send their data to some cloud black box.

                                                                                                                                                                                                                                            • jckahn a day ago

                                                                                                                                                                                                                                              Exactly this. I would happily give $3k to NVIDIA to avoid giving 1 cent to OpenAI/Anthropic.

                                                                                                                                                                                                                                            • diggan a day ago

The price seems relatively competitive even compared to other local alternatives like "build your own PC". I'd definitely buy one of these (or even two if it works really well) for developing/training/using models that currently run on cobbled-together hardware left over from upgrading my desktop.

                                                                                                                                                                                                                                              • 627467 a day ago

                                                                                                                                                                                                                                                > Having that data never leave your basement sure can be worth more than $3000 if performance is not a bottleneck

I get what you're saying, but there are also regulations (and your own business interest) that expect data redundancy/protection, which keeping everything on-site doesn't seem to cover.

                                                                                                                                                                                                                                                • btbuildem a day ago

                                                                                                                                                                                                                                                  Yeah that's cheaper than many prosumer GPUs on the market right now

                                                                                                                                                                                                                                                • narrator a day ago

                                                                                                                                                                                                                                                  Nvidia releases a Linux desktop supercomputer that's better price/performance wise than anything Wintel is doing and their whole new software stack will only run on WSL2. They aren't porting to Win32. Wow, it may actually be the year of Linux on the Desktop.

                                                                                                                                                                                                                                                  • sliken a day ago

Not sure how to judge better price/perf. I wouldn't expect 20 Neoverse N2 cores to do particularly well vs. 16 Zen 5 cores. The GPU side looks promising, but they aren't mentioning memory bandwidth, configuration, spec, or performance.

Did see vague claims of "starting at $3k", max 4TB NVMe, and max 128GB RAM.

I'd expect AMD Strix Halo (AI Max+ 395) to be reasonably competitive.

                                                                                                                                                                                                                                                    • skavi a day ago

                                                                                                                                                                                                                                                      It’s actually “10 Arm Cortex-X925 and 10 Cortex-A725” [0]. These are much newer cores and have a reasonable chance of being competitive.

                                                                                                                                                                                                                                                      [0]: https://newsroom.arm.com/blog/arm-nvidia-project-digits-high...

                                                                                                                                                                                                                                                      • adrian_b a day ago

For programs dominated by iterations over arrays, these 10 Arm Cortex-X925 + 10 Cortex-A725, all 20 together, should have a throughput similar to only 10 of the 16 cores of Strix Halo (assuming that Strix Halo has full Zen 5 cores, which has not been confirmed yet).

For programs dominated by irregular integer and pointer operations, like software project compilation, 10 Arm Cortex-X925 + 10 Cortex-A725 should have a throughput similar to a 16-core Strix Halo, but which is faster would depend on cooling (i.e. a Strix Halo configured for a high power consumption will be faster).

                                                                                                                                                                                                                                                        There is not enough information to compare the performance of the GPUs from this NVIDIA Digits and from Strix Halo. However, it can be assumed that NVIDIA Digits will be better for ML/AI inference. Whether it can also be competitive for training or for graphics remains to be seen.

                                                                                                                                                                                                                                                        • skavi a day ago

                                                                                                                                                                                                                                                          How did you come up with these numbers? There don't seem to be many shipping products with these cores. In fact, the only one I could find was the Dimensity 9400 with a single X925 and older generation A720s. And of course the Dimensity is a mobile SoC, so clocks will be low.

                                                                                                                                                                                                                                                          Are you projecting based on Arm's stated improvements from their last gen? In that case, what numbers are you using as your baseline?

                                                                                                                                                                                                                                                          • adrian_b 21 hours ago

                                                                                                                                                                                                                                                            For programs rich in array operations, which can be accelerated by SVE or AVX-512, Cortex-X925 has 6 x 128-bit execution pipelines, Cortex-A725 has 2 pipelines, Snapdragon Oryon has 4 pipelines, while a Zen 5 core has the equivalent of 8 Arm execution pipelines (i.e. 2 x 512-bit pipelines equivalent with 8 x 128-bit) + other 8 execution pipelines that can do only a subset of the operations.

That means a total of 80 execution pipelines for NVIDIA Digits, 48 execution pipelines for Snapdragon Elite and 128 equivalent execution pipelines for Strix Halo, counting only the complete execution pipelines. For operations like FP addition, which can be done in any pipeline, there would be 256 equivalent execution pipelines for Strix Halo.

Because the clock frequencies for multithreaded applications should be similar, if not better for Strix Halo, there is little doubt that the throughput for applications dominated by array operations should be at least 128/80 in favor of Strix Halo vs. NVIDIA Digits, if not much better, because for many instructions even more execution pipelines are available, and Zen 5 also has a higher IPC when executing irregular code, especially vs. the smaller Cortex-A725 cores. Therefore the throughput of NVIDIA Digits is smaller than or at most equal to the throughput of 10 cores of Strix Halo.

On the other hand, for integer/pointer processing code, the number of execution units in a Cortex-X925 + a Cortex-A725 is about the same as in 2 Zen 5 cores, so the 20 Arm cores of NVIDIA Digits have about the same number of execution units as 20 Zen 5 cores. Nevertheless, the occupancy of the Zen 5 execution units will be higher for most programs than for the Arm cores, especially because of the bigger and better cache memories, and also because of the lower IPC of Cortex-A725. Therefore the 20 Arm cores must be slower than 20 Zen 5 cores, probably only equivalent to about 15 Zen 5 cores, but the exact equivalence is hard to predict, because it depends on the NVIDIA implementation of things like the cache memories and the memory controller.
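
A small tally of the pipeline counts used in the comparison above (the per-core figures are the comment's estimates, not vendor specs, and the 12-core Snapdragon Elite decomposition is an assumption implied by the 48 total):

    # 128-bit-equivalent SIMD pipeline counts from the argument above.
    digits = 10 * 6 + 10 * 2           # 10x Cortex-X925 (6 pipes) + 10x Cortex-A725 (2 pipes)
    snapdragon_elite = 12 * 4          # assumed 12x Oryon cores, 4 pipes each
    strix_halo = 16 * 8                # 16x Zen 5, 2x512-bit ~= 8x128-bit equivalent per core

    print(digits, snapdragon_elite, strix_halo)   # 80, 48, 128
    print(strix_halo / digits)                    # 1.6x in favor of Strix Halo on this metric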

                                                                                                                                                                                                                                                        • ksec a day ago

For context, the X925 is what used to be called Cortex-X5, and it is now shipping in the MediaTek Dimensity 9400. It has roughly the same performance per clock as a Snapdragon 8 Elite, or roughly 5% lower performance per clock compared to Apple M3 on Geekbench 6.

                                                                                                                                                                                                                                                          Assuming they are not limited by power or heat dissipation I would say that is about as good as it gets.

                                                                                                                                                                                                                                                          The hardware is pretty damn good. I am only worried about the software.

                                                                                                                                                                                                                                                          • sliken a day ago

                                                                                                                                                                                                                                                            Good catch, they called it "Grace Blackwell". Changing the CPU cores completely and calling it Grace seems weird. Maybe it was just a mistake during the keynote.

                                                                                                                                                                                                                                                            • wmf a day ago

                                                                                                                                                                                                                                                              I don't think it was a mistake; maybe they intend Grace to be a broader brand like Ryzen not one particular model.

                                                                                                                                                                                                                                                              • kristopolous a day ago

It's an interesting idea. I mean, Grace Hopper was an actual person, but Nvidia can have whatever arbitrary naming rules they'd like.

                                                                                                                                                                                                                                                          • z4y5f3 a day ago

NVIDIA is likely citing 1 PFLOPS at FP4 sparse (they did this for GB200), so that is about 128 TFLOPS BF16 dense, or 2/3 of what an RTX 4090 is capable of. I would put the memory bandwidth at 546 GB/s, using the same 512-bit LPDDR5X at 8533 MT/s as the Apple M4 Max.
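
Working through that estimate (the repeated halving for sparsity and precision is the usual rule of thumb; the bus width and transfer rate are the assumptions stated in the comment):

    # From the marketed 1 PFLOPS FP4 sparse figure down to BF16 dense.
    tflops_fp4_sparse = 1000                    # 1 PFLOPS, in TFLOPS
    tflops_fp4_dense = tflops_fp4_sparse / 2    # drop the 2:4 sparsity doubling
    tflops_fp8_dense = tflops_fp4_dense / 2     # FP4 -> FP8
    tflops_bf16_dense = tflops_fp8_dense / 2    # FP8 -> BF16
    print(tflops_bf16_dense)                    # 125 TFLOPS, i.e. roughly the ~128 figure above

    # Memory bandwidth: 512-bit bus at 8533 MT/s
    bus_bytes = 512 // 8
    bandwidth_gbps = bus_bytes * 8533 / 1000
    print(round(bandwidth_gbps))                # ~546 GB/s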

                                                                                                                                                                                                                                                            • gardnr a day ago

Based on your evaluation, it sounds like it will run inference at a speed similar to an M4 Max and also allow "startups" to experiment with fine-tuning larger models or larger context windows.

It's the best "dev board" setup I've seen so far. It might be part of their larger commercial plan, but it definitely hits the sweet spot for home enthusiasts who have been pleading for more VRAM.

                                                                                                                                                                                                                                                          • bee_rider a day ago

Seems more like a workstation. So that's just a continuation of the last couple of decades of Unix on the workstation, right?

                                                                                                                                                                                                                                                            • throw310822 a day ago

They should write an AI-centered OS for it, allowing people to easily write AI-heavy applications. Then you'd have the Amiga of 2025.

                                                                                                                                                                                                                                                            • pjmlp a day ago

                                                                                                                                                                                                                                                              Because NVidia naturally doesn't want to pay for Windows licenses.

                                                                                                                                                                                                                                                              NVidia works closely with Microsoft to develop their cards, all major features come first in DirectX, before landing on Vulkan and OpenGL as NVidia extensions, and eventually become standard after other vendors follow up with similar extensions.

                                                                                                                                                                                                                                                              • diggan a day ago

                                                                                                                                                                                                                                                                > their whole new software stack will only run on WSL2. They aren't porting to Win32

                                                                                                                                                                                                                                                                Wait, what do you mean exactly? Isn't WSL2 just a VM essentially? Don't you mean it'll run on Linux (which you also can run on WSL2)?

                                                                                                                                                                                                                                                                Or will it really only work with WSL2? I was excited as I thought it was just a Linux Workstation, but if WSL2 gets involved/is required somehow, then I need to run the other direction.

                                                                                                                                                                                                                                                                • awestroke a day ago

No, nobody will run Windows on this. It's meant to run NVIDIA's own flavor of Ubuntu with a patched kernel.

                                                                                                                                                                                                                                                                  • hx8 a day ago

                                                                                                                                                                                                                                                                    Yes, WSL2 is essentially a highly integrated VM. I think it's a bit of a joke to call Ubuntu WSL2, because it seems like most Ubuntu installs are either VMs for Windows PCs or on Azure Cloud.

                                                                                                                                                                                                                                                                  • CamperBob2 a day ago

                                                                                                                                                                                                                                                                    Where does it say they won't be supporting Win32?

                                                                                                                                                                                                                                                                  • rvz a day ago

                                                                                                                                                                                                                                                                    > Wow, it may actually be the year of Linux on the Desktop.

                                                                                                                                                                                                                                                                    ?

                                                                                                                                                                                                                                                                    Yeah starting at $3,000. Surely a cheap desktop computer to buy for someone who just wants to surf the web and send email /s.

                                                                                                                                                                                                                                                                    There is a reason why it is for "enthusiasts" and not for the general wider consumer or typical PC buyer.

                                                                                                                                                                                                                                                                    • Topfi a day ago

                                                                                                                                                                                                                                                                      I see the most direct competitor in the Mac Studio, though of course we will have to wait for reviews to gauge how fair that comparison is. The Studio does have a fairly large niche as a solid workstation, though, so I could see this being successful.

                                                                                                                                                                                                                                                                      For general desktop use, as you described, nearly any piece of modern hardware, from a RasPI, to most modern smartphones with a dock, could realistically serve most people well.

The thing is, you need to serve both low-end use cases like browsing and high-end dev work via workstations, because even for the "average user" there is often one specific program on which they need to rely and which has limited support outside the OS they have grown up with. Of course, there will be some programs, like desktop Microsoft Office, that will never be ported, but still, Digits could open the door to some devs working natively on Linux.

                                                                                                                                                                                                                                                                      A solid, compact, high-performance, yet low power workstation with a fully supported Linux desktop out of the box could bridge that gap, similar to how I have seen some developers adopt macOS over Linux and Windows since the release of the Studio and Max MacBooks.

                                                                                                                                                                                                                                                                      Again, we have yet to see independent testing, but I would be surprised if anything of this size, simplicity, efficiency and performance was possible in any hardware configuration currently on the market.

                                                                                                                                                                                                                                                                      • sliken a day ago

I did want an M2 Max Studio but ended up with a 12-core Zen 4 + Radeon 7800 XT for about half the money.

An Nvidia Project Digits/GB10 for $3k with 128GB RAM does sound tempting, especially since it's very likely to have standard NVMe storage that I can expand or replace as needed, unlike the Apple solution. Decent Linux support is welcome as well.

Here's hoping; if not, I can fall back to a 128GB RAM AMD Strix Halo/AI Max+ 395. CPU perf should be in the same ballpark; it's not likely to come anywhere close on GPU performance, but still likely to have decent tokens/sec for casual home tinkering.

                                                                                                                                                                                                                                                                      • yjftsjthsd-h a day ago

                                                                                                                                                                                                                                                                        > Surely a cheap desktop computer to buy for someone who just wants to surf the web and send email /s.

                                                                                                                                                                                                                                                                        That end of the market is occupied by Chromebooks... AKA a different GNU/Linux.

                                                                                                                                                                                                                                                                        • fooker a day ago

                                                                                                                                                                                                                                                                          The typical PC buyer is an enthusiast now.

                                                                                                                                                                                                                                                                        • immibis a day ago

                                                                                                                                                                                                                                                                          Never underestimate the open source world's power to create a crappy desktop experience.

                                                                                                                                                                                                                                                                          • tokai a day ago

                                                                                                                                                                                                                                                                            You're like 15 years out of date.

                                                                                                                                                                                                                                                                        • neom a day ago

                                                                                                                                                                                                                                                                          In case you're curious, I googled. It runs this thing called "DGX OS":

                                                                                                                                                                                                                                                                          "DGX OS 6 Features The following are the key features of DGX OS Release 6:

                                                                                                                                                                                                                                                                          Based on Ubuntu 22.04 with the latest long-term Linux kernel version 5.15 for the recent hardware and security updates and updates to software packages, such as Python and GCC.

                                                                                                                                                                                                                                                                          Includes the NVIDIA-optimized Linux kernel, which supports GPU Direct Storage (GDS) without additional patches.

                                                                                                                                                                                                                                                                          Provides access to all NVIDIA GPU driver branches and CUDA toolkit versions.

                                                                                                                                                                                                                                                                          Uses the Ubuntu OFED by default with the option to install NVIDIA OFED for additional features.

                                                                                                                                                                                                                                                                          Supports Secure Boot (requires Ubuntu OFED).

                                                                                                                                                                                                                                                                          Supports DGX H100/H200."

                                                                                                                                                                                                                                                                          • AtlasBarfed a day ago

Nvidia-optimized, meaning non-public patches? A non-upgradable operating system, like what happens if you upgrade with a binary-blob Nvidia driver?

                                                                                                                                                                                                                                                                            • wmf 19 hours ago

                                                                                                                                                                                                                                                                              You can upgrade to a newer release of DGX OS.

                                                                                                                                                                                                                                                                            • yoyohello13 a day ago

                                                                                                                                                                                                                                                                              I wonder what kind of spyware is loaded onto DGX OS. Oh, sorry I mean telemetry.

                                                                                                                                                                                                                                                                              • ZeroTalent 21 hours ago

                                                                                                                                                                                                                                                                                Cybersecurity analysts check and monitor these things daily, and they are pretty easy to catch. Likely nothing malicious, as history shows.

                                                                                                                                                                                                                                                                                • thunkshift1 9 hours ago

Correct, highly concerning... this is totally not the case with existing OSes and products

                                                                                                                                                                                                                                                                              • a_bonobo a day ago

                                                                                                                                                                                                                                                                                There's a market not described here: bioinformatics.

                                                                                                                                                                                                                                                                                The owner of the market, Illumina, already ships their own bespoke hardware chips in servers called DRAGEN for faster analysis of thousands of genomes. Their main market for this product is in personalised medicine, as genome sequencing in humans is becoming common.

                                                                                                                                                                                                                                                                                Other companies like Oxford Nanopore use on-board GPUs to call bases (i.e., from raw electric signal coming off the sequencer to A, T, G, C) but it's not working as well as it could due to size and power constraints. I feel like this could be a huge game changer for someone like ONT, especially with cooler stuff like adaptive sequencing.

Other avenues of bioinformatics, such as most day-to-day analysis software, are still very CPU- and RAM-heavy.

                                                                                                                                                                                                                                                                                • evandijk70 a day ago

                                                                                                                                                                                                                                                                                  This is, at least for now, a relatively small market. Illumina acquired the company manufacturing these chips for $100M. Analysis of a genome in the cloud generally costs below $10 on general purpose hardware.

It is of course possible that these chips enable analyses that are currently not possible or are cost-prohibitive, but at least for now the limiting factor for genomics will not be compute; it will be the cost of sequencing (currently $400-500 per genome).

                                                                                                                                                                                                                                                                                  • mfld 6 hours ago

Small nitpick: Illumina is the owner of the sequencing market, but not really of the bioinformatics market.

                                                                                                                                                                                                                                                                                    • mocheeze a day ago

                                                                                                                                                                                                                                                                                      Doesn't seem like Illumina actually cares much about security: https://arstechnica.com/security/2025/01/widely-used-dna-seq...

                                                                                                                                                                                                                                                                                      • mycall a day ago

The bigger picture is that OpenAI o3/o4 plus specialized models will blow open the doors to genome tagging and discovery, but that is still 1 to 3 years away, waiting for ASI to kick in.

                                                                                                                                                                                                                                                                                        • nzach a day ago

While I kinda agree with you, I don't think we will ever find a meaningful way to throw genome sequencing data at LLMs. It's simply too much data.

I worked on a project some years ago where we were using genome sequencing data from a bacterium. Every sequenced sample was around 3GB of data, and the sample size was pretty small, with only about 100 samples to study.

                                                                                                                                                                                                                                                                                          I think the real revolution will happen because code generation through LLMs will allow biologists to write 'good enough' code to transform, process and analyze data. Today to do any meaningful work with genome data you need a pretty competent bioinformatician, and they are a rare breed. Removing this bottleneck is what will allow us to move faster in this field.
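To put rough numbers on "too much data", here is a back-of-the-envelope sketch. The per-sample size and sample count are taken from the comment above; the bytes-per-base, tokens-per-base, and context-window figures are illustrative assumptions, not measurements from that project:

    # Rough scale check: raw sequencing data vs. a typical LLM context window.
    # The 3 GB/sample and 100 samples come from the comment above; everything
    # else is an illustrative assumption.
    sample_bytes    = 3 * 10**9       # ~3 GB per sequenced sample
    n_samples       = 100             # small study
    bases_per_byte  = 1               # FASTQ-ish: roughly one base (plus metadata) per byte
    bases_per_token = 4               # optimistic tokenizer packing ~4 bases per token

    total_bases  = sample_bytes * n_samples * bases_per_byte   # ~3e11 bases
    total_tokens = total_bases / bases_per_token                # ~7.5e10 tokens

    context_window = 1_000_000        # a generous 1M-token context
    print(f"{total_tokens:.1e} tokens vs a {context_window:.0e}-token window "
          f"-> ~{total_tokens / context_window:,.0f}x too large to fit")

Even with generous assumptions, a single small study is tens of thousands of times larger than any current context window, which is why specialized genomic models or heavy preprocessing are needed rather than raw-sequence prompting.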

• amelie-iska 19 hours ago

                                                                                                                                                                                                                                                                                              Just use a DNA/genomic language model like gLM2 or Evo and cross-attention that with o3 and you’re golden imo.

                                                                                                                                                                                                                                                                                          • newsclues a day ago

                                                                                                                                                                                                                                                                                            Is this for research labs, health clinics, or peoples homes?

                                                                                                                                                                                                                                                                                            • a_bonobo a day ago

ONT sells its smallest MinION to regular people, too. But Illumina's and ONT's main market is universities, followed by large hospitals.

                                                                                                                                                                                                                                                                                          • treprinum a day ago

Nvidia just did what Intel/AMD should have done to threaten the CUDA ecosystem - release a "cheap" 128GB local inference appliance/GPU. Well done, Nvidia; it looks bleak for any Intel/AMD AI efforts in the future.

                                                                                                                                                                                                                                                                                            • mft_ a day ago

I think you nailed it. Any basic SWOT analysis of Nvidia's position would surely have to consider something like this coming from a competitor - either Apple, which is already nibbling around the edges of this space, or AMD/Intel, who could (and arguably should) be.

                                                                                                                                                                                                                                                                                              It’s obviously not guaranteed to go this route, but an LLM (or similar) on every desk and in every home is a plausible vision of the future.

                                                                                                                                                                                                                                                                                              • iszomer a day ago

Nvidia also brought MediaTek into the spotlight.

                                                                                                                                                                                                                                                                                            • gnatman a day ago

                                                                                                                                                                                                                                                                                              >> The IBM Roadrunner was the first supercomputer to reach one petaflop (1 quadrillion floating point operations per second, or FLOPS) on May 25, 2008.

                                                                                                                                                                                                                                                                                              $100M, 2.35MW, 6000 ft^2

                                                                                                                                                                                                                                                                                              >>Designed for AI researchers, data scientists, and students, Project Digits packs Nvidia’s new GB10 Grace Blackwell Superchip, which delivers up to a petaflop of computing performance for prototyping, fine-tuning, and running AI models.

                                                                                                                                                                                                                                                                                              $3000, 1kW, 0.5 ft^2

                                                                                                                                                                                                                                                                                              • DannyBee 21 hours ago

Digits is a petaflop of FP4; Roadrunner was a petaflop of FP32. So at least a factor of 8 difference, but in practice much more (i.e. I strongly doubt Digits can do 1/8th of a petaflop of FP32).

Beyond that, the factors seem reasonable for nearly two decades of progress?
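A back-of-the-envelope sketch of that normalization, using the figures quoted earlier in the thread (the $3,000/1kW numbers are the earlier commenter's estimates, not confirmed specs) and the factor-of-8 FP4-to-FP32 discount as a crude lower bound:

    # Compare Roadrunner (2008) and Project Digits (2025) per the figures quoted above,
    # discounting Digits' FP4 petaflop by 8x to put it on a rough FP32 footing.
    roadrunner = {"cost_usd": 100e6, "power_w": 2.35e6, "pflops_fp32": 1.0}
    digits     = {"cost_usd": 3e3,   "power_w": 1e3,    "pflops_fp4": 1.0}

    digits_fp32_equiv = digits["pflops_fp4"] / 8                  # crude lower-bound conversion

    cost_ratio  = roadrunner["cost_usd"] / digits["cost_usd"]     # ~33,000x cheaper
    power_ratio = roadrunner["power_w"]  / digits["power_w"]      # ~2,350x less power
    perf_ratio  = digits_fp32_equiv / roadrunner["pflops_fp32"]   # ~1/8 the (normalized) throughput

    print(f"cost: {cost_ratio:,.0f}x cheaper, power: {power_ratio:,.0f}x lower, "
          f"perf: {perf_ratio:.3f}x of Roadrunner after the FP4->FP32 discount")

Even after the precision discount, roughly 1/8 of a 2008 supercomputer for 1/33,000th of the price and 1/2,350th of the power is the point being made.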

                                                                                                                                                                                                                                                                                                • dotancohen 21 hours ago

Why even use floating point if you have only 4 bits? Models with INT8 features are not unheard of.

                                                                                                                                                                                                                                                                                                  • cjbgkagh 20 hours ago

Nvidia's FP4 is E2M1: 1 sign bit, 2 exponent bits and 1 mantissa bit. AFAIK at this small number of bits it's basically a teeny tiny lookup table, so you can precompute the table to be whatever math you want. Having exponent bits just means that the values that can be expressed are not linearly spaced.
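A minimal sketch of that lookup-table idea, assuming the E2M1 layout described above (the decode helper and the exact value set are illustrative, not taken from Nvidia documentation):

    # Sketch: decode 4-bit E2M1 "FP4" codes via a 16-entry lookup table.
    # Assumes 1 sign, 2 exponent, 1 mantissa bit with an exponent bias of 1.
    def decode_e2m1(code: int) -> float:
        sign = -1.0 if (code >> 3) & 1 else 1.0
        exp = (code >> 1) & 0b11
        man = code & 1
        if exp == 0:                      # subnormal: no implicit leading 1
            mag = man * 0.5
        else:                             # normal: implicit 1, bias of 1
            mag = (1.0 + 0.5 * man) * 2.0 ** (exp - 1)
        return sign * mag

    LUT = [decode_e2m1(c) for c in range(16)]
    print(LUT)  # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0, -0.0, -0.5, ...]

Note how the positive values (0, 0.5, 1, 1.5, 2, 3, 4, 6) are not evenly spaced, which is the practical difference from a 4-bit integer format.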

                                                                                                                                                                                                                                                                                                  • stassats 19 hours ago

                                                                                                                                                                                                                                                                                                    > roadrunner is petaflops of FP32

                                                                                                                                                                                                                                                                                                    Isn't it actually FP64?

                                                                                                                                                                                                                                                                                                    • DannyBee 18 hours ago

So I can find sources that claim both ;) I wasn't sure what to believe, and didn't spend more than 5 minutes digging for the real results, so I went with the conservative one.

                                                                                                                                                                                                                                                                                                • mrtksn a day ago

Okay, so this is not a peripheral that you connect to your computer to run specialized tasks; this is a full computer running Linux.

It's a garden hermit. Imagine a future where everyone has one of these (not exactly this version, but some future version): it lives with you, it learns with you, and unlike cloud-based SaaS AI, you can teach it things immediately and diverge from the average to your advantage.

                                                                                                                                                                                                                                                                                                  • Topfi a day ago

I'd love to own one, but I doubt this will go beyond a very specific niche. Despite the advantages, very few people still run their own Plex server rather than subscribing to streaming services, and on the local front I suspect the progress of hardware, alongside findings that smaller models can handle a variety of tasks quite well, will mean a high-performance local workstation of this type has niche appeal at most.

                                                                                                                                                                                                                                                                                                    • mrtksn a day ago

I have this feeling that at some point it will be very advantageous to have a personal AI, because when you use something that everyone can use, the output of that something becomes very low value.

Maybe it will still make sense to have your personal AI in some data center, but on the other hand, there is the trend of governments and megacorps regulating what you can do with your computer. Try going beyond the basics, try to do something fun and edge-case - it is very likely that your general-availability AI will refuse to help you.

When it is your own property, you get the chance to overcome restrictions and develop the thing beyond the average.

                                                                                                                                                                                                                                                                                                      As a result, having something that can do things that no other else can do and not having restrictions on what you can do with this thing can become the ultimate superpower.

                                                                                                                                                                                                                                                                                                    • noduerme a day ago

                                                                                                                                                                                                                                                                                                      "garden hermit" is a very interesting and evocative phrase. Where is that from?

                                                                                                                                                                                                                                                                                                      • mrtksn a day ago

                                                                                                                                                                                                                                                                                                        It's a real thing: https://en.wikipedia.org/wiki/Garden_hermit

In the past, in Europe, some wealthy people would keep a scholar living on their premises so they could ask them questions, etc.

                                                                                                                                                                                                                                                                                                        • noduerme a day ago

                                                                                                                                                                                                                                                                                                          aha, this is really something. I just got around to watching "Furiosa" last night. So something like having a personal "history man" (although, my take on the whole Mad Max series is that it's just bottled up fear-porn about white settlers going uncivilized and becoming "tribal" - a colonial horror tale, "The Heart of Darkness" with motorcycles - common anywhere a relatively small group spread themselves out on a lot of ill-gotten land, did some nasty deeds and lost touch with the mothership).

                                                                                                                                                                                                                                                                                                          In the Furiosa context, it's a bit like a medicine man or shaman, then. A private, unreliable source of verbal hand me downs, whose main utility is to make elites feel like they have access to knowledge without needing to acquire it for themselves or question its veracity.

                                                                                                                                                                                                                                                                                                          We really are entering a new dark age.

                                                                                                                                                                                                                                                                                                          • mycall a day ago

                                                                                                                                                                                                                                                                                                            > We really are entering a new dark age.

                                                                                                                                                                                                                                                                                                            All the indicators are there:

Instead of leaders like Charlemagne, who unified the Frankish domain, stabilized society, and promoted education and culture, we now have leaders who want to dismantle society and education and use culture for wars.

                                                                                                                                                                                                                                                                                                            Long-distance ocean trade routes since the 1950s have taken international commerce to another level for humans, but this is being challenged now by aging/leaking tankers, unruly piracy at transit choke points, communication cable destruction, etc.

Loss of interest in classical learning and the arts, with dystopian, murder, and horror movies, music, and books now the best sellers, as WW3 seems to be on many people's minds.

                                                                                                                                                                                                                                                                                                            While innovations are still occurring for improved navigation and agricultural productivity, the Earth's ecosystem collapse is in full effect.

I wish it could be reversed somehow.

                                                                                                                                                                                                                                                                                                          • rsynnott a day ago

                                                                                                                                                                                                                                                                                                            > The one at Painshill, hired by The Hon. Charles Hamilton for a seven-year term under strict conditions, lasted three weeks until he was sacked after being discovered in a local pub

                                                                                                                                                                                                                                                                                                            I mean, fair. Very bad hermit-ing.

                                                                                                                                                                                                                                                                                                            (Terry Pratchett has a fun parody of this in one of the Discworld books; the garden hermit gets two weeks' holidays a year, which he spends in a large city.)

                                                                                                                                                                                                                                                                                                            • Mistletoe a day ago

                                                                                                                                                                                                                                                                                                              This is so strange, my girlfriend was just telling me about those yesterday. The word “ornamental hermit” fills me with about as much disgust as I can experience.

                                                                                                                                                                                                                                                                                                              > Later, suggestions of hermits were replaced with actual hermits – men hired for the sole purpose of inhabiting a small structure and functioning as any other garden ornament.

                                                                                                                                                                                                                                                                                                        • ryao a day ago

                                                                                                                                                                                                                                                                                                          This looks like a successor to the Nvidia Jetson AGX Orin 64GB Developer Kit:

                                                                                                                                                                                                                                                                                                          https://www.okdo.com/wp-content/uploads/2023/03/jetson-agx-o...

                                                                                                                                                                                                                                                                                                          I wonder what the specifications are in terms of memory bandwidth and computational capability.

                                                                                                                                                                                                                                                                                                          • kcb a day ago

Hopefully, the OS support isn't as awful as it usually is on the Jetson platforms. Unless they change, you'll get 1 or 2 major kernel updates ever, and you'll have to do bizarre stuff like install a 6-year-old Ubuntu on your x86 PC to run the utility that flashes the OS.

                                                                                                                                                                                                                                                                                                            • ryao a day ago

                                                                                                                                                                                                                                                                                                              The community likely will make instructions for installing mainstream Linux distributions on it.

                                                                                                                                                                                                                                                                                                              • kcb a day ago

Doesn't really help, though, if it requires an Nvidia kernel.

                                                                                                                                                                                                                                                                                                                • snerbles a day ago

                                                                                                                                                                                                                                                                                                                  The official Linux kernel driver for Blackwell is GPL/MIT licensed: https://developer.nvidia.com/blog/nvidia-transitions-fully-t...

                                                                                                                                                                                                                                                                                                                  • sliken a day ago

                                                                                                                                                                                                                                                                                                                    Keep in mind that a kernel module != driver. It's just doing initialization and passing data to/from the driver, which is closed source and in user space.

                                                                                                                                                                                                                                                                                                                  • ryao a day ago

The Linux kernel license requires Nvidia to disclose their Linux kernel sources, and Nvidia has open-sourced its kernel driver.

                                                                                                                                                                                                                                                                                                                    That said, you can probably boot a Debian or Gentoo system using the Nvidia provided kernel if need be.

                                                                                                                                                                                                                                                                                                                    • bionade24 a day ago

It has always been the userspace of the Jetsons that was closed source and tied to Nvidia's custom kernel. I have not heard of people running JetPack on a different userland than the one provided by Nvidia. Companies/labs that update the OS don't care about CUDA; Nvidia contributes to Mesa support for the Jetsons, and some only need a bit more GPU power than a RasPi.

                                                                                                                                                                                                                                                                                                              • zamadatix a day ago

The Jetson Orin Dev Kit is squarely aimed at being a dev kit for those using the Jetson module in production edge compute (robotic vision and the like). The only reason it's so well known in tech circles is "SBC syndrome", where people get excited about what they think they could do with it, and then 95% end up in a drawer a year later because what it's actually good at is unrelated to why they bought it.

                                                                                                                                                                                                                                                                                                                This is more accurately a descendant of the HPC variants like the article talks about - intentionally meant to actually be a useful entry level for those wanting to do or run general AI work better than a random PC would have anyways.

                                                                                                                                                                                                                                                                                                                • moffkalast a day ago

The AGX Orin was only 64GB of LPDDR5 and priced at $5k, so this does seem like a bargain in comparison, with 128GB of (presumably) HBM. But Nvidia never lowers their prices, so there's a caveat somewhere.

                                                                                                                                                                                                                                                                                                                  • fulafel a day ago

The memory is LPDDR according to the specs graphic on the NV product page: https://www.nvidia.com/en-us/project-digits/

                                                                                                                                                                                                                                                                                                                    Anyone willing to guess how wide?

                                                                                                                                                                                                                                                                                                                    • moffkalast a day ago

I've seen some claims on Reddit that it can do 512 GB/s (not sure where they got that from), which would imply roughly a 384- to 480-bit bus with LPDDR5X, depending on the frequency.
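The arithmetic behind that bus-width estimate, as a sketch (the 512 GB/s figure is the unconfirmed Reddit claim, and the LPDDR5X data rates are common speed grades, not anything Nvidia has stated for GB10):

    # Bus width needed for a target bandwidth: width_bits = bytes_per_sec * 8 / transfers_per_sec
    target_bw = 512e9                            # 512 GB/s, the figure claimed on Reddit

    for mt_per_s in (8533e6, 9600e6, 10700e6):   # common LPDDR5X data rates (MT/s)
        width_bits = target_bw * 8 / mt_per_s
        print(f"LPDDR5X-{mt_per_s / 1e6:.0f}: ~{width_bits:.0f}-bit bus")
    # -> ~480-bit, ~427-bit and ~383-bit respectively

If the real bus turns out narrower (say 256-bit), the achievable bandwidth would scale down proportionally, which is why the width matters so much for local-inference tokens/sec.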

                                                                                                                                                                                                                                                                                                                      • pella a day ago

                                                                                                                                                                                                                                                                                                                        probably:

                                                                                                                                                                                                                                                                                                                        "According to the Grace Blackwell's datasheet- Up to 480 gigabytes (GB) of LPDDR5X memory with up to 512GB/s of memory bandwidth. It also says it comes in a 120 gb config that does have the full fat 512 GB/s."

                                                                                                                                                                                                                                                                                                                        via https://www.reddit.com/r/LocalLLaMA/comments/1hvj1f4/comment...

                                                                                                                                                                                                                                                                                                                        "up to 512GB/s of memory bandwidth per Grace CPU"

                                                                                                                                                                                                                                                                                                                        https://resources.nvidia.com/en-us-data-center-overview/hpc-...

                                                                                                                                                                                                                                                                                                                        • sliken 21 hours ago

Keep in mind the "full" Grace is a completely different beast with Neoverse cores. This new GB10 uses different cores and might well have a different memory interface. I believe the "120 GB" config includes ECC overhead (which is inline on Nvidia GPUs), and the Neoverse cores have various tweaks for larger configurations that are absent in the Cortex-X925.

I'd be happy to be wrong, but I don't see anything from Nvidia that implies a 512-bit-wide memory interface on the Nvidia Project Digits.

                                                                                                                                                                                                                                                                                                                          • moffkalast a day ago

Yep, I think that's it. So it's referencing the GB200, which could have absolutely nothing in common with this low-power version.

                                                                                                                                                                                                                                                                                                                  • tim333 a day ago

I've followed progress since Moravec's "When will computer hardware match the human brain?" came out in 1997. It starts:

                                                                                                                                                                                                                                                                                                                    >This paper describes how the performance of AI machines tends to improve at the same pace that AI researchers get access to faster hardware. The processing power and memory capacity necessary to match general intellectual performance of the human brain are estimated. Based on extrapolation of past trends and on examination of technologies under development, it is predicted that the required hardware will be available in cheap machines in the 2020s.

and this is about the first personal unit that seems well ahead of his proposed specs. (He estimated 0.1 petaflops. The Nvidia thing is "1 petaflop of AI performance at FP4 precision".)

                                                                                                                                                                                                                                                                                                                    (paper https://jetpress.org/volume1/moravec.pdf)

                                                                                                                                                                                                                                                                                                                    • gavi a day ago
                                                                                                                                                                                                                                                                                                                      • paxys a day ago

                                                                                                                                                                                                                                                                                                                        The text on the screen is an obvious giveaway.

                                                                                                                                                                                                                                                                                                                        • diggan a day ago

Damn, you're right. I didn't even consider looking at the monitor itself, figuring "they can't be so lazy they don't even use a real screenshot". Faking the rest kind of makes sense though, since otherwise you'd need a studio setup.

                                                                                                                                                                                                                                                                                                                          Never underestimate how lazy companies with a ~$3 trillion market cap can be.

                                                                                                                                                                                                                                                                                                                          • sipjca a day ago

I mean, the whole company is betting on AI, so why wouldn't they use AI to generate the image? Fundamentally it doesn't matter whether it was AI-generated or not; most people don't care, and the people who do won't impact their bottom line.

                                                                                                                                                                                                                                                                                                                            • adolph a day ago

                                                                                                                                                                                                                                                                                                                              Lazy? This is Nvidia eating their own dogfood. They put in lots of work to get to the point where someone can call it "lazy."

                                                                                                                                                                                                                                                                                                                              • diggan 21 hours ago

                                                                                                                                                                                                                                                                                                                                > Lazy? This is Nvidia eating their own dogfood

                                                                                                                                                                                                                                                                                                                                Absolutely, I'm all for dogfooding! But when you do, make sure you get and use good results, not something that looks like it was generated by someone who just learned about Stable Diffusion :)

                                                                                                                                                                                                                                                                                                                          • diggan a day ago

                                                                                                                                                                                                                                                                                                                            Agree, unless I've missed some recent invention where keyboards now have two of either Enter/Backspace/Shift keys on the right side.

Not sure that isn't expected though; likely most people wouldn't even notice, and the company can say they're dogfooding some product, I guess.

                                                                                                                                                                                                                                                                                                                            • tsimionescu a day ago

                                                                                                                                                                                                                                                                                                                              The keyboard layout seems perfectly reasonable, and rather common: from top to bottom, the rightmost column of keys after the letters would be backspace, |\, enter, shift, ctrl. On the left, mirrored, you have ~`, tab, caps lock, shift, ctrl. The sizes and shapes match many common keyboard layouts I've seen.

                                                                                                                                                                                                                                                                                                                              • patrulek a day ago

                                                                                                                                                                                                                                                                                                                                > unless I've missed some recent invention where keyboards now have two of either Enter/Backspace/Shift keys on the right side

It doesn't have to be two Enter/Backspace/Shift keys. The keyboard layout seems to be almost identical to the Azio L70 keyboard (at least the keys).

                                                                                                                                                                                                                                                                                                                              • throw310822 a day ago

                                                                                                                                                                                                                                                                                                                                Prompt: something with some splashy graph on screen.

                                                                                                                                                                                                                                                                                                                              • modeless a day ago

                                                                                                                                                                                                                                                                                                                                Finally a real ARM workstation from Nvidia! This will be much faster than Apple's offerings for AI work. And at $3000 it is much cheaper than any Mac with 128 GB RAM.

                                                                                                                                                                                                                                                                                                                                • sliken a day ago

On the CPU side the Neoverse N2 doesn't compete particularly well with Apple's M4, or Zen 5 for that matter.

A bit hard to tell what's on offer on the GPU side; I wouldn't be surprised if it lands somewhere in the RTX 4070 to 5070 range.

If the price/perf is good enough, $3k wouldn't be a bad deal, but I suspect a Strix Halo (better CPU cores, 256GB/s memory interface, likely slower GPU cores) will be better price/perf, with the same max unified RAM, and cheaper.

                                                                                                                                                                                                                                                                                                                                  • skavi a day ago

                                                                                                                                                                                                                                                                                                                                    It’s actually “10 Arm Cortex-X925 and 10 Cortex-A725” [0]. These are much newer cores and have a reasonable chance of being competitive.

                                                                                                                                                                                                                                                                                                                                    [0]: https://newsroom.arm.com/blog/arm-nvidia-project-digits-high...

                                                                                                                                                                                                                                                                                                                                    • modeless a day ago

                                                                                                                                                                                                                                                                                                                                      AI work happens predominantly on the GPU, not the CPU. This GPU with CUDA will run rings around M4 with MLX. And with much more RAM than you can get in a Mac for $3k.

A lot of people have been justifying their Mac Studio or Mac Pro purchases by the potential for running large AI models locally. Project Digits will be much better at that for cheaper. Maybe it won't compile Chromium as fast, but that's not what it's for.

                                                                                                                                                                                                                                                                                                                                      • gardnr a day ago
                                                                                                                                                                                                                                                                                                                                        • sliken a day ago

The quotes I've seen mention the maximum config (128GB RAM and 4TB of storage) and the minimum price. Nothing says $3k for 128GB RAM and 4TB of storage. I hope I'm wrong, but I'm betting the max price is at least twice the minimum price.

                                                                                                                                                                                                                                                                                                                                          • rfoo a day ago

                                                                                                                                                                                                                                                                                                                                            This is NVIDIA, not Apple. They don't charge you a RAM tax (at least for this product). There is only one config for RAM: 128GB.

                                                                                                                                                                                                                                                                                                                                            • sliken a day ago

                                                                                                                                                                                                                                                                                                                                              It's been far from clear what config options are going to be available, and the $3,000 price is the "starting at" price. Not sure what the options will be, but people have collectively found statements that imply all configs will have 128GB ram. Sounds good, I hope it's true.

                                                                                                                                                                                                                                                                                                                                              Seems like the storage will have options, because it's "up to 4TB". Unsure if there will be differently binned CPUs (clock or number of cores). Or connectX optional or at different speeds.

                                                                                                                                                                                                                                                                                                                                              • rfoo a day ago

                                                                                                                                                                                                                                                                                                                                                Doubt it. I don't remember NVIDIA ever doing binning and having different SKUs for their Jetson Developer Kit line, which is similar to this Project Digits thing.

                                                                                                                                                                                                                                                                                                                                                • sliken a day ago

                                                                                                                                                                                                                                                                                                                                                  Well presumably there's some different configurations supported, otherwise they would say $2,999 instead of starting at $2,999.

                                                                                                                                                                                                                                                                                                                                            • gnabgib a day ago

Nvidia says 128GB RAM at $3k[0]; it looks like the 4TB storage might be variable (and possibly CPU or GPU cores?). This article says 128GB too... but it uses "up to" twice in a row with different meanings, which doesn't help.

                                                                                                                                                                                                                                                                                                                                              [0]: https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

                                                                                                                                                                                                                                                                                                                                              • p_l a day ago

It's a GB200 in a desktop-compatible enclosure: the RAM is fixed, the SSDs are not, and the network ports are fixed too.

                                                                                                                                                                                                                                                                                                                                                • KeplerBoy a day ago

It's the GB10, a much cut-down version to fit the price point and the space, weight, and power requirements.

                                                                                                                                                                                                                                                                                                                                                  • sliken a day ago

                                                                                                                                                                                                                                                                                                                                                    The CPU is apparently the result of a "secret" project that wasn't on published roadmaps. It's called the GB110. So maybe they will offer differently binned CPU/GPUs with a different fraction of cores disabled and you can pick your SSD.

                                                                                                                                                                                                                                                                                                                                                    • dagmx a day ago

                                                                                                                                                                                                                                                                                                                                                      It’s most definitely not a GB200 in a desktop enclosure.

The processor is using completely different cores, and the GPU is somewhere around a 5070 for TOPS.

                                                                                                                                                                                                                                                                                                                                            • magicalhippo a day ago

Not much was unveiled, but it showed a Blackwell GPU with 1 PFLOP of FP4 compute, 128GB of unified LPDDR5X memory, 20 Arm cores, and ConnectX powering two QSFP slots so one can stack multiple of them.

                                                                                                                                                                                                                                                                                                                                              edit: While the title says "personal", Jensen did say this was aimed at startups and similar, so not your living room necessarily.

                                                                                                                                                                                                                                                                                                                                              • computably a day ago

                                                                                                                                                                                                                                                                                                                                                From the size and pricing ($3000) alone, it's safe to conclude it has less raw FLOPs than a 5090. Since it uses LPDDR5X, almost certainly less memory bandwidth too (5090 @ 1.8 TB/s, M4 Max w/ 128GB LPDDR5X @ 546 GB/s). Basically the only advantage is how much VRAM it packs in a small form factor, and presumably greater power efficiency at its smaller scale.

                                                                                                                                                                                                                                                                                                                                                The only thing it really competes with is the Mac Studio for LocalLlama-type enthusiasts and devs. It isn't cheap enough to dent the used market, nor powerful enough to stand in for bigger cards.

                                                                                                                                                                                                                                                                                                                                                • llm_nerd a day ago

                                                                                                                                                                                                                                                                                                                                                  The product isn't even finalized. It might never come to fruition, and I cannot fathom how they will make the power profile fit. I am skeptical that a $3000 device with 128GB of RAM and a 4TB SSD with the specs provided will even see reality any time within the next year, but let's pretend it will.

However, we do know that it offers 1/4 the TOPS of the new 5090, which means it will be less powerful than the $600 5070. Which of course it will be, given the power limitations.

The only really compelling value is that Nvidia memory-starves its desktop cards so severely. It's the small opening that Apple found, even though Apple's FP4/FP8 performance is a world below what Nvidia is offering. So purely from that perspective this is a winning product, as 128GB opens up a lot of possibilities. But from a raw performance perspective, it's actually going to pale compared to other Nvidia products.

                                                                                                                                                                                                                                                                                                                                                  • computably 8 hours ago

AI TOPS numbers for Blackwell/5090 are probably for a niche numeric type like INT8 or INT4.

                                                                                                                                                                                                                                                                                                                                                    At FP32 (and FP16, assuming the consumer cards are still neutered), the 5090 apparently does ~105-107 TFLOPS, and the full GB202 ~125 TFLOPS. That means a non-neutered GB202-based card could hit ~250 TFLOPS of FP16, which lines up neatly with 1 PFLOP of FP4.
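Back-of-envelope, in Python, using the rough figures above (not official specs) and assuming throughput simply doubles with each halving of precision:

    full_gb202_fp32_tflops = 125                 # ~125 TFLOPS FP32 for a full GB202, per the estimate above
    fp16_tflops = 2 * full_gb202_fp32_tflops     # one halving -> ~250 TFLOPS FP16
    fp4_tflops = 4 * fp16_tflops                 # two more halvings (FP8, then FP4) -> ~1000 TFLOPS
    print(fp16_tflops, "TFLOPS FP16 ->", fp4_tflops / 1000, "PFLOP FP4")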

                                                                                                                                                                                                                                                                                                                                                    In reality, FP4 is more-than-linearly efficient relative to FP32. They quoted FP4 and not FP8 / FP16 for a reason. I wouldn't be too surprised if it doesn't even support FP32, maybe even FP16. Plus, they likely cut RT cores and other graphics-related features, making for a smaller and therefore more power efficient chip, because they're positioning this as an "AI supercomputer" and this hardware doesn't make sense for most graphical applications.

                                                                                                                                                                                                                                                                                                                                                    I see no reason this product wouldn't come to market - besides the usual supply/demand. There's value for a small niche and particular price bracket: enthusiasts running large q4 models, cheaper but slower vs. dedicated cards (3x-10x price/VRAM) and price-competitive but much faster vs. Apple silicon. It's a good strategic move for maintaining Nvidia's hold on the ecosystem regardless of the sales revenue.

                                                                                                                                                                                                                                                                                                                                                  • sliken a day ago

I believe $3,000 is for an unmentioned minimum config; no idea what the 4TB storage / 128GB RAM version mentioned will cost.

Running a model that needs ~96GB of RAM isn't cheap (with unified memory, often 25% is reserved for the CPU), so maybe it will win there.
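Quick sketch of the arithmetic (the 25% reservation and the bytes-per-parameter are rough assumptions, ignoring KV cache and other overhead):

    total_gb = 128
    usable_gb = total_gb * 0.75            # ~96 GB left if 25% is reserved for the CPU

    def model_gb(params_billion, bits):    # rough weight size only
        return params_billion * bits / 8

    print(usable_gb, model_gb(70, 8), model_gb(180, 4))   # 96.0 70.0 90.0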

                                                                                                                                                                                                                                                                                                                                                    • ac29 a day ago

                                                                                                                                                                                                                                                                                                                                                      The NVIDIA press release [0] says "Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage", which makes it sound like the RAM is fixed size.

                                                                                                                                                                                                                                                                                                                                                      [0] https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

                                                                                                                                                                                                                                                                                                                                                      • sliken a day ago

                                                                                                                                                                                                                                                                                                                                                        Awesome.

Maybe there will be storage options of 1, 2, and 4TB and optional 25/100/200/400 Gbit interfaces. Or maybe everything except the CPU/GPU is constant, with 50%, 75%, or 100% of the CPU/GPU cores enabled so they can bin their chips.

                                                                                                                                                                                                                                                                                                                                                    • KeplerBoy a day ago

Of course. It has far fewer FLOPS than the 5090; after all, this will have a TDP of ~50W and run off a regular USB-PD power supply.

                                                                                                                                                                                                                                                                                                                                                      It's basically the successor to the AGX Orin and in line with its pricing (considering it comes with a fast NIC). The AGX Orin had RTX 3050 levels of performance.

                                                                                                                                                                                                                                                                                                                                                      • adrian_b a day ago

                                                                                                                                                                                                                                                                                                                                                        The successor of NVIDIA Orin is named Thor and it is expected to be launched later this year.

It uses different Arm processor cores than Digits, namely the Neoverse V3AE, the automotive-enhanced version of the Neoverse V3 (which is the server version of the Cortex-X4). According to rumors, NVIDIA Thor might have 14 Neoverse V3AE cores in the base version, and there is also a double-die version.

                                                                                                                                                                                                                                                                                                                                                        The GPU of NVIDIA Thor is also a Blackwell, but probably with a very different configuration than in NVIDIA Digits.

                                                                                                                                                                                                                                                                                                                                                        NVIDIA Thor, like Orin, is intended for high reliability applications, like in automotive or industrial environments, unlike NVIDIA Digits, which is made with consumer-level technology.

                                                                                                                                                                                                                                                                                                                                                        • krasin a day ago

Yes and no. The Jetson line (which the Jetson AGX Orin is part of) also provides multi-camera support (with MIPI CSI-2 connectors) and other real-time / microcontroller features, as well as rugged options via partners.

                                                                                                                                                                                                                                                                                                                                                          I hope to see new Jetsons based on Blackwell sometime in 2026 (they tend to be slow to release those).

                                                                                                                                                                                                                                                                                                                                                          • KeplerBoy a day ago

Yeah, I guess it's more a branch off the Jetson line. Or a midpoint between the Jetsons, the IGX Orin (not a typo), and the data center offerings.

                                                                                                                                                                                                                                                                                                                                                        • kcb a day ago

                                                                                                                                                                                                                                                                                                                                                          Making comparisons to the 5090 is silly. That thing draws 500W+ and will require a boat anchor of metal to keep it cool. The device they showed is something more along the lines of a mobile dev kit.

                                                                                                                                                                                                                                                                                                                                                          • computably 8 hours ago

                                                                                                                                                                                                                                                                                                                                                            I agree they're not products that compete against each other. Unfortunately, the silly comparison has to be made, as less informed consumers are already claiming that the 128 GB RAM of Project Digits will obsolete workstation/server-class GPUs.

                                                                                                                                                                                                                                                                                                                                                      • theptip a day ago

                                                                                                                                                                                                                                                                                                                                                        $3k for a 128GB standalone is quite favorable pricing considering the next best option at home is going to be a 32GB 5090 at $2k for the card alone, so probably $3k when you’re done building a rig around it.

                                                                                                                                                                                                                                                                                                                                                        • egorfine a day ago

The press release says "up to 128GB" while the price is a single figure of $3,000, so it wouldn't be out of the realm of possibility for the 128GB version to cost quite a bit more.

                                                                                                                                                                                                                                                                                                                                                          • mysteria a day ago

                                                                                                                                                                                                                                                                                                                                                            From what I've seen the general consensus is that the 128GB of memory is standard across all models, and that the price would vary for different storage and networking configurations. Their marketing materials say that "Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage."

                                                                                                                                                                                                                                                                                                                                                            https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

                                                                                                                                                                                                                                                                                                                                                            • egorfine a day ago

                                                                                                                                                                                                                                                                                                                                                              Indeed!

                                                                                                                                                                                                                                                                                                                                                          • lhl a day ago

                                                                                                                                                                                                                                                                                                                                                            The memory bandwidth has not been announced for this device. It's probably going to be more appropriate to compare vs a 128GB M4 Max (410-546GB/s MBW) or an AMD Ryzen AI Max+ 395 (yes, that's its real name) at 256GB/s of MBW.

                                                                                                                                                                                                                                                                                                                                                            The 5090 has 1.8TB/s of MBW and is in a whole different class performance-wise.

                                                                                                                                                                                                                                                                                                                                                            The real question is how big of a model will you actually want to run based on how slowly tokens generate.
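As a crude upper bound: single-stream decoding is usually memory-bandwidth bound, so tokens/sec can't exceed roughly bandwidth divided by the bytes read per token (about the weight size for a dense model). A sketch with guessed numbers, since Digits' bandwidth is exactly the unknown here:

    def max_tokens_per_sec(bandwidth_gb_s, model_size_gb):
        return bandwidth_gb_s / model_size_gb       # bandwidth-bound ceiling, ignores compute and batching

    model_size_gb = 40  # ~70B parameters at Q4, roughly
    for name, bw in [("Strix Halo", 256), ("M4 Max", 546), ("RTX 5090", 1792)]:
        print(f"{name}: ~{max_tokens_per_sec(bw, model_size_gb):.0f} tok/s ceiling")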

                                                                                                                                                                                                                                                                                                                                                            • elorant 21 hours ago

Well, obviously it has to be low, otherwise they would cannibalize their high-end GPUs.

                                                                                                                                                                                                                                                                                                                                                            • sliken 19 hours ago

Agreed. I care more about LLM size than tokens/sec, so the GB10 or Strix Halo with 128GB are my leading choices. Both look to be cheaper than a comparable Mac Studio with 128GB (minimum $4,800 currently). Will have to wait on final config, pricing, and performance.

                                                                                                                                                                                                                                                                                                                                                            • Tepix a day ago

With more and more personal AI, I think having a truly private device that can run large LLMs (remember: larger is better) is fantastic!

                                                                                                                                                                                                                                                                                                                                                              Ideally we can configure things like Apple Intelligence to use this instead of OpenAI and Apple's cloud.

                                                                                                                                                                                                                                                                                                                                                              • blackoil a day ago

Is there any effort toward local cloud computing? I can't justify $3000 for a fun device. But if all the devices in my home (6 phones, 2 iPads, a desktop, and 2 laptops) could leverage it for fast LLMs, gaming, and photo/video editing, it would make so much more sense.

                                                                                                                                                                                                                                                                                                                                                                • reissbaker a day ago

                                                                                                                                                                                                                                                                                                                                                                  Open WebUI, SillyTavern, and other frontends can access any OpenAI-compatible server, and on Nvidia cards you have a wealth of options that will run one of those servers for you: llama.cpp (or the Ollama wrapper), of course, but also the faster vLLM and SGLang inference engines. Buy one of these, slap SGLang or vLLM on it, and point your devices at your machine's local IP address.
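
Concretely, here's a minimal sketch of that setup, assuming vLLM on the box; the LAN address (192.168.1.50) and model name are placeholders, and llama.cpp's server works the same way:

    # On the box, start an OpenAI-compatible server, e.g. with vLLM:
    #   python -m vllm.entrypoints.openai.api_server \
    #       --model meta-llama/Llama-3.1-70B-Instruct --port 8000
    #
    # Then any device on the LAN can talk to it with a stock OpenAI client:
    from openai import OpenAI

    client = OpenAI(
        base_url="http://192.168.1.50:8000/v1",  # the box's LAN address (assumption)
        api_key="not-needed",                     # local servers usually ignore the key
    )

    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.1-70B-Instruct",
        messages=[{"role": "user", "content": "Summarize this paragraph: ..."}],
    )
    print(resp.choices[0].message.content)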

                                                                                                                                                                                                                                                                                                                                                                  I'm mildly skeptical about performance here: they aren't saying what the memory bandwidth is, and that'll have a major impact on tokens-per-second. If it's anywhere close to the 4090, or even the M2 Ultra, 128GB of Nvidia is a steal at $3k. Getting that amount of VRAM on anything non-Apple used to be tens of thousands of dollars.

(They're also mentioning running the large models at Q4, which will definitely hurt the model's intelligence vs FP8 or BF16. But most people running models on Macs run them at Q4, so I guess it's a valid comparison. You can at least run a 70B at FP8 on one of these, even with a fairly large context size, which I think will be the sweet spot.)
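
Rough memory math for that sweet spot, assuming Llama-3-70B-like shapes (80 layers, 8 KV heads, head dim 128) and ignoring activations and runtime overhead:

    # Back-of-envelope memory use for a 70B model on a 128 GB box (rough figures)
    params = 70e9
    weights_fp8 = params * 1.0        # ~70 GB at 1 byte/param
    weights_q4  = params * 0.5        # ~35 GB at ~4 bits/param

    # KV cache per token: 80 layers * 8 KV heads * 128 head_dim * 2 (K and V) * 1 byte (FP8)
    kv_per_token = 80 * 8 * 128 * 2 * 1          # ~164 KB/token
    kv_32k = kv_per_token * 32768 / 1e9          # ~5.4 GB for a 32k-token context

    print(weights_fp8 / 1e9, weights_q4 / 1e9, kv_32k)
    # FP8 weights plus a 32k KV cache fit comfortably under 128 GB;
    # Q4 leaves even more headroom for longer contexts or concurrent requests.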

                                                                                                                                                                                                                                                                                                                                                                  • KeplerBoy a day ago

You can just set up your own local OpenAI-like API endpoints for LLMs. Most devices and apps won't be able to use them, because consumers don't run self-hosted apps, but for a simple ChatGPT-style app this is totally viable. Today.

                                                                                                                                                                                                                                                                                                                                                                    • papichulo2023 a day ago

Most tools expose OpenAI-like APIs that you can easily integrate with.

                                                                                                                                                                                                                                                                                                                                                                      • TiredOfLife a day ago

That is literally how it was announced: an AI cloud in a box that can also be used as a Linux desktop.

                                                                                                                                                                                                                                                                                                                                                                        • aa-jv a day ago

For companies interested in integrating machine learning into their products, both software and hardware - essentially training models specific to a particular use case - this could be quite a useful tool to add to the kit.

                                                                                                                                                                                                                                                                                                                                                                        • quick_brown_fox a day ago

                                                                                                                                                                                                                                                                                                                                                                          How about “We sell a computer called the tinybox. It comes in two colors + pro.

                                                                                                                                                                                                                                                                                                                                                                          tinybox red and green are for people looking for a quiet home/office machine. tinybox pro is for people looking for a loud compact rack machine.” [0]

                                                                                                                                                                                                                                                                                                                                                                          [0] https://tinygrad.org/#tinybox

                                                                                                                                                                                                                                                                                                                                                                          • mft_ a day ago

                                                                                                                                                                                                                                                                                                                                                                            This was my second thought; while we don’t have full performance data, it’s probably a bad day for tinybox.

                                                                                                                                                                                                                                                                                                                                                                            • kkzz99 a day ago

                                                                                                                                                                                                                                                                                                                                                                              These look terrible. For 5 times the price you get worse performance.

                                                                                                                                                                                                                                                                                                                                                                              • nilstycho a day ago

                                                                                                                                                                                                                                                                                                                                                                                Are you comparing tinybox red with 738 FP16 TFLOPS at $15K to Project Digits with 1 FP4 PFLOP at $3K? Or did they announce the Project Digits FP16 performance somewhere?

                                                                                                                                                                                                                                                                                                                                                                              • loudmax a day ago

                                                                                                                                                                                                                                                                                                                                                                                Going by the specs, this pretty much blows Tinybox out of the water.

                                                                                                                                                                                                                                                                                                                                                                                For $40,000, a Tinybox pro is advertised as offering 1.36 petaflops processing and 192 GB VRAM.

                                                                                                                                                                                                                                                                                                                                                                                For about $6,000 a pair of Nvidia Project Digits offer about a combined 2 petaflops processing and 256 GB VRAM.

                                                                                                                                                                                                                                                                                                                                                                                The market segment for Tinybox always seemed to be people that were somewhat price-insensitive, but unless Nvidia completely fumbles on execution, I struggle to think of any benefits of a Tinygrad Tinybox over an Nvidia Digits. Maybe if you absolutely, positively, need to run your OS on x86.

                                                                                                                                                                                                                                                                                                                                                                                I'd love to see if AMD or Intel has a response to these. I'm not holding my breath.

                                                                                                                                                                                                                                                                                                                                                                                • nilstycho a day ago

                                                                                                                                                                                                                                                                                                                                                                                  > For about $6,000 a pair of Nvidia Project Digits offer about a combined 2 petaflops processing and 256 GB VRAM.

                                                                                                                                                                                                                                                                                                                                                                                  2 PFLOPS at FP4.

                                                                                                                                                                                                                                                                                                                                                                                  256 GB RAM, not VRAM. I think they haven't specified the memory bandwidth.
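
To compare like with like you have to normalize for precision, and probably for sparsity too. A rough sketch, assuming the headline 1 PFLOP per unit is a sparse FP4 number and that throughput halves with each step up in precision (both assumptions, not announced specs):

    digits_fp4_sparse = 1000                      # TFLOPS, Nvidia's headline number per unit
    digits_fp4_dense  = digits_fp4_sparse / 2     # assuming a 2:1 sparsity factor
    digits_fp8_dense  = digits_fp4_dense / 2      # ~250 TFLOPS
    digits_fp16_dense = digits_fp8_dense / 2      # ~125 TFLOPS

    pair_fp16 = 2 * digits_fp16_dense             # ~250 TFLOPS for two units
    tinybox_pro_fp16 = 1360                       # TFLOPS, as advertised
    print(pair_fp16, tinybox_pro_fp16)            # ~250 vs 1360 TFLOPS at FP16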

                                                                                                                                                                                                                                                                                                                                                                                  • loudmax a day ago

You're right. Tinybox's 1.36 petaflops is FP16, so that is a significant difference.

                                                                                                                                                                                                                                                                                                                                                                                    Also, the Tinybox's memory bandwidth is 8064 GB/s, while the Digits seems to be around 512 GB/s, according to speculation on Reddit.

Moreover, Nvidia announced the RTX 5090 at $2k, which could put downward pressure on the price of the Tinybox's 4090s. So the Tinybox green or pro models might get cheaper, or they might come out with a 5090-based model.

                                                                                                                                                                                                                                                                                                                                                                                    If you're the kind of person that's ready to spend $40k on a beastly ML workstation, there's still some upside to Tinybox.

                                                                                                                                                                                                                                                                                                                                                                                  • elorant 21 hours ago

                                                                                                                                                                                                                                                                                                                                                                                    You're missing the most critical part though. Memory bandwidth. It hasn't been announced yet for Digits and it probably won't be comparable to that of dedicated GPUs.
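
A quick way to see why it matters: in single-user decoding, each generated token has to stream essentially the whole weight set through memory once, so tokens/sec is roughly bandwidth divided by model size. A sketch using the rumored 512 GB/s figure (speculative) and published numbers for the others:

    def tokens_per_sec(bandwidth_gb_s, model_gb):
        # Optimistic upper bound: every token reads the full weight set once;
        # ignores KV-cache traffic, compute time, and batching.
        return bandwidth_gb_s / model_gb

    model_gb = 70                            # 70B model at FP8 (~1 byte/param)
    print(tokens_per_sec(512, model_gb))     # ~7 tok/s  (rumored Digits bandwidth)
    print(tokens_per_sec(800, model_gb))     # ~11 tok/s (M2 Ultra, 800 GB/s)
    print(tokens_per_sec(1008, model_gb))    # ~14 tok/s (one RTX 4090, if it could hold the model)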

                                                                                                                                                                                                                                                                                                                                                                                  • moffkalast a day ago

                                                                                                                                                                                                                                                                                                                                                                                    >tinybox

                                                                                                                                                                                                                                                                                                                                                                                    >the size of several ATX desktops

                                                                                                                                                                                                                                                                                                                                                                                  • timmg a day ago

                                                                                                                                                                                                                                                                                                                                                                                    One thing I didn't see mentioned: this would be a good motivation for Nvidia to release "open weights" models.

Just like macOS is free when you buy a Mac, having the latest high-quality LLM for free that just happens to run well on this box is a very interesting value prop. And Nvidia definitely has the compute to make it happen.

                                                                                                                                                                                                                                                                                                                                                                                    • swalsh a day ago

They already do release open-weight models; in this very keynote he released some of the biggest open-weight models yet: https://huggingface.co/collections/nvidia/cosmos-6751e884dc1...

                                                                                                                                                                                                                                                                                                                                                                                      • logicchains a day ago

They did exactly this, announcing new upcoming Nemo models (fine-tunes of Llama) at the same event.

                                                                                                                                                                                                                                                                                                                                                                                      • ttul a day ago

I bought a Lambda Labs workstation with a 4090 last year. I guess I’m buying one of these things now, because the Lambda workstation just became a relic…

                                                                                                                                                                                                                                                                                                                                                                                        • nabla9 a day ago

Amortized cost is about 10 cents per petaflop-hour if you run it 24/7 for 5-6 years, including the cost of electricity.
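
The arithmetic roughly checks out; a sketch with assumed figures for power draw and electricity price:

    price_usd      = 3000
    years          = 5.5
    hours          = years * 365 * 24           # ~48,000 hours
    hardware_per_h = price_usd / hours          # ~$0.06/hour

    power_kw       = 0.2                        # assumed ~200 W average draw
    kwh_price      = 0.15                       # assumed $/kWh
    power_per_h    = power_kw * kwh_price       # ~$0.03/hour

    pflops = 1.0                                # Nvidia's FP4 headline figure
    cost_per_pflop_hour = (hardware_per_h + power_per_h) / pflops
    print(round(cost_per_pflop_hour, 3))        # ~0.09, i.e. roughly 10 cents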

This is really a game changer.

                                                                                                                                                                                                                                                                                                                                                                                          They should make a deal with Valve to turn this into 'superconsole' that can run Half Life 3 (to be announced) :)

                                                                                                                                                                                                                                                                                                                                                                                          • erikvanoosten 6 hours ago

                                                                                                                                                                                                                                                                                                                                                                                            > It’s a cloud computing platform that sits on your desk …

This goes against every definition of cloud that I know of. Again proving that 'cloud' means whatever you want it to mean.

                                                                                                                                                                                                                                                                                                                                                                                            • fweimer a day ago

                                                                                                                                                                                                                                                                                                                                                                                              If they end up actually shipping this, lots of people will buy these machines to get an AArch64 Linux workstation—even if they are not interested in AI or Nvidia GPUs.

                                                                                                                                                                                                                                                                                                                                                                                              At $3,000, it will be considerably cheaper than alternatives available today (except for SoC boards with extremely poor performance, obviously). I also expect that Nvidia will use its existing distribution channels for this, giving consumers a shot at buying the hardware (without first creating a company and losing consumer protections along the way).

                                                                                                                                                                                                                                                                                                                                                                                              • kllrnohj a day ago

                                                                                                                                                                                                                                                                                                                                                                                                > At $3,000, it will be considerably cheaper than alternatives available today

                                                                                                                                                                                                                                                                                                                                                                                                $3000 gets me a 64-core Altra Q64-22 from a major-enough SI today: https://system76.com/desktops/thelio-astra-a1-n1/configure

                                                                                                                                                                                                                                                                                                                                                                                                And of course if you don't care about the SI part, then you can just buy that motherboard & CPU directly for $1400 https://www.newegg.com/asrock-rack-altrad8ud-1l2t-q64-22-amp... with the 128-core variant being $2400 https://www.newegg.com/asrock-rack-altrad8ud-1l2t-q64-22-amp...

                                                                                                                                                                                                                                                                                                                                                                                                • adrian_b a day ago

                                                                                                                                                                                                                                                                                                                                                                                                  That Altra may be a good choice for certain server applications, like a Web server, but when used as a workstation it will be sluggish, because it uses weak cores, with much lower single-threaded performance than the Arm cores used in NVIDIA Digits.

For certain applications, e.g. those with many array operations, the 20 cores of Digits might match 40 Altra cores at equal clock frequency. The Digits cores are also likely to run at a higher clock frequency, so for some applications the 20 Arm cores of Digits may provide higher throughput than 64 Altra cores, while also having much higher single-thread performance, perhaps about double.

So at equal price, NVIDIA Digits is certainly preferable to a 64-core Altra as a workstation. As a server, the latter should be better.

                                                                                                                                                                                                                                                                                                                                                                                                  • kllrnohj a day ago

                                                                                                                                                                                                                                                                                                                                                                                                    I mean I can get a Snapdragon X Elite laptop for $1200 that'll have a faster CPU than the one in the Digits, too...

                                                                                                                                                                                                                                                                                                                                                                                                    • adrian_b a day ago

                                                                                                                                                                                                                                                                                                                                                                                                      Their speeds should be very similar, but which is faster is uncertain.

                                                                                                                                                                                                                                                                                                                                                                                                      There have not been any published benchmarks demonstrating the speed of Cortex-X925 in a laptop/mini-PC environment.

                                                                                                                                                                                                                                                                                                                                                                                                      In smartphones, Cortex-X925 and Snapdragon Elite have very similar speeds in single thread.

For multithreaded applications, 10 big + 10 medium Arm cores should be somewhat faster than 12 Snapdragon Elite cores.

                                                                                                                                                                                                                                                                                                                                                                                                      The fact that NVIDIA Digits has a wider memory interface should give it even more advantages in some applications.

                                                                                                                                                                                                                                                                                                                                                                                                      The Blackwell GPU should have much better software support in graphics applications, not only in ML/AI, in comparison with the Qualcomm GPU.

So NVIDIA Digits should be faster than a Qualcomm laptop, but unless one is interested in ML/AI applications, the speed difference should not be worth NVIDIA's more-than-double price.

                                                                                                                                                                                                                                                                                                                                                                                                      • fweimer a day ago

                                                                                                                                                                                                                                                                                                                                                                                                        If the Nvidia system runs reasonably well with stock distribution kernels, it may well be worth the extra price. Usually, an optimized, custom kernel is a warning sign, but maybe they have upstreaming plans, and support for other distributions is planned.

                                                                                                                                                                                                                                                                                                                                                                                                      • sliken 18 hours ago

Sure, but it has less than half the memory bandwidth, and inference is largely bandwidth-bound.

                                                                                                                                                                                                                                                                                                                                                                                                        • kllrnohj 17 hours ago

The context was people who don't care about AI or Nvidia GPUs and just want an AArch64 system. So inference performance is irrelevant here.

                                                                                                                                                                                                                                                                                                                                                                                                          > lots of people will buy these machines to get an AArch64 Linux workstation—even if they are not interested in AI or Nvidia GPUs.

                                                                                                                                                                                                                                                                                                                                                                                                          • sliken 17 hours ago

Ah, sure, the Oryon cores are decent, but the GPU is a bit weak, even compared to normal cheap/low-end APUs.

                                                                                                                                                                                                                                                                                                                                                                                                    • fweimer a day ago

I had not seen the System76 systems before. They don't have distributors, and unlike the major OEMs, they don't take care of the customs details for international shipments. Prices for systems built with these older Ampere CPUs have come down at the local SIs as well (those that refuse to sell to consumers), which I had not noticed before. Still, the workstation form factor seems to be somewhat unique to System76 (unless, as you said, you build your own).

Still, I expect the Nvidia systems will be easier to get, especially for (de jure) consumers.

                                                                                                                                                                                                                                                                                                                                                                                                  • delegate a day ago

                                                                                                                                                                                                                                                                                                                                                                                                    I think this is version 1 of what's going to become the new 'PC'.

Future versions will get more capable, smaller, and portable.

It can be used to train new types of models (not just LLMs).

                                                                                                                                                                                                                                                                                                                                                                                                    I assume the GPU can do 3D graphics.

Several of these in a cluster could run multiple powerful models in real time (vision, LLM, OCR, 3D navigation, etc.).

                                                                                                                                                                                                                                                                                                                                                                                                    If successful, millions of such units will be distributed around the world within 1-2 years.

                                                                                                                                                                                                                                                                                                                                                                                                    A p2p network of millions of such devices would be a very powerful thing indeed.

                                                                                                                                                                                                                                                                                                                                                                                                    • mycall a day ago

                                                                                                                                                                                                                                                                                                                                                                                                      > A p2p network of millions of such devices would be a very powerful thing indeed.

If you think RAM speeds are slow for transformer inference, imagine what 100 Mbps would be like.

                                                                                                                                                                                                                                                                                                                                                                                                      • ben_w a day ago

                                                                                                                                                                                                                                                                                                                                                                                                        Depends on the details, as always.

                                                                                                                                                                                                                                                                                                                                                                                                        If this hypothetical future is one where mixtures of experts is predominant, where each expert fits on a node, then the nodes only need the bandwidth to accept inputs and give responses — they won't need the much higher bandwidth required to spread a single model over the planet.
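
The orders of magnitude bear this out; a rough sketch, with the per-request payload size and the size of a resident expert as assumptions:

    link_mbps        = 100
    link_bytes_per_s = link_mbps * 1e6 / 8            # 12.5 MB/s

    request_kb     = 4                                 # prompt + response text, assumed a few KB
    requests_per_s = link_bytes_per_s / (request_kb * 1e3)
    print(requests_per_s)                              # ~3,000 request/response round-trips per second

    expert_gb = 20                                     # a single expert kept resident on the node, assumed
    seconds_to_ship_expert = expert_gb * 1e9 / link_bytes_per_s
    print(seconds_to_ship_expert / 60)                 # ~27 minutes just to move one expert's weights

So the link is fine for shuttling inputs and outputs, but hopeless for moving weights or activations between nodes.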

                                                                                                                                                                                                                                                                                                                                                                                                    • Havoc a day ago

                                                                                                                                                                                                                                                                                                                                                                                                      Can one game on it?

If one can skip buying a gaming rig with a 5090 at its likely absurd price, then this $3k becomes a lot easier for dual-use hobbyists to swallow.

Edit: the 5090 is $2k.

                                                                                                                                                                                                                                                                                                                                                                                                      • sliken 18 hours ago

I'm not sure it has a video out. There is an AI-produced image of the Digits next to a monitor, though.

                                                                                                                                                                                                                                                                                                                                                                                                        • Havoc 3 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                          hmm...interesting point.

Putting a screen next to your product that doesn't have video out would be quite disingenuous, though. So I'd be surprised if it has no output at all.

...I guess there is a risk that it has an output, but it's more of a basic iGPU-style output rather than one driven by the main GPU.

                                                                                                                                                                                                                                                                                                                                                                                                        • ThatMedicIsASpy a day ago

It is not made for gaming, and the form factor says a lot. With this being an option for AI, there is less of a need to buy tons of Nvidia RTX GPUs.

                                                                                                                                                                                                                                                                                                                                                                                                          The 5090 surprised me with the two slot height design while having a 575W power budget.

                                                                                                                                                                                                                                                                                                                                                                                                          • Havoc a day ago

It’s certainly not its primary use, but it may still work well - it's a powerful modern GPU anyway. And that may be enough that some people want to use it for gaming as a secondary use.

                                                                                                                                                                                                                                                                                                                                                                                                        • macawfish a day ago

                                                                                                                                                                                                                                                                                                                                                                                                          Is this going to make up for the lack of VRAM in the new consumer GPUs?

                                                                                                                                                                                                                                                                                                                                                                                                          • openrisk a day ago

Will there be a healthy "personal AI supercomputer" economy to generate demand for this? (NB: spam generators are only a parasite on any digital economy, viable to the extent they don't kill the host.)

                                                                                                                                                                                                                                                                                                                                                                                                            One can only wish for this, but Nvidia would be going against the decades-long trend to emaciate local computing in favor of concentrating all compute on somebody else's linux (aka: cloud).

                                                                                                                                                                                                                                                                                                                                                                                                            • pimeys 16 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                              NVIDIA has a robotics department where I could see this tech fitting pretty nicely.

                                                                                                                                                                                                                                                                                                                                                                                                              Also I consider this a dev board. Soon this tech will be everywhere, in our phones, computers...

You could already plug that into your home assistant and have your own Star Trek computer you can ask questions of. NVIDIA seems to know this is the future, and they were first to market.
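As a concrete illustration of that home-assistant angle, here is a minimal, hypothetical sketch of asking a question of an LLM served from a box like this on your LAN. It assumes an OpenAI-compatible endpoint (the kind exposed by llama.cpp's server, Ollama, or vLLM); the hostname, port and model name below are placeholders, not anything NVIDIA has announced.

    import requests

    resp = requests.post(
        "http://digits.local:8000/v1/chat/completions",   # placeholder host/port
        json={
            "model": "local-model",                        # placeholder model name
            "messages": [{"role": "user",
                          "content": "Computer, summarise today's sensor logs."}],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])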

                                                                                                                                                                                                                                                                                                                                                                                                            • bionade24 a day ago

Smart move for Nvidia to subsidise their ARM CPU and platform business by selling a big GPU packaged with a CPU that most users don't really care about. Even if the margin is less than selling the raw GPU power would be (which I doubt), it'll look good at the shareholders' conference if other business segments go up steeply, too.

                                                                                                                                                                                                                                                                                                                                                                                                              • sabareesh 20 hours ago

I am pretty sure memory bandwidth will be kept low so it doesn't eat into their enterprise lineup. If we are lucky we might get 512 GB/s, which is still half of a 4090.
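For a rough sense of why that bandwidth figure matters: single-stream LLM decoding is largely memory-bandwidth bound, so an upper bound on decode speed is roughly bandwidth divided by the bytes of weights read per token. A minimal sketch, with purely speculative bandwidth numbers:

    def tokens_per_sec(bandwidth_gb_s, params_billion, bytes_per_param):
        # Each generated token reads roughly all model weights once,
        # so decode speed is approximately bandwidth / weight size.
        weight_bytes = params_billion * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / weight_bytes

    # 70B-parameter model quantized to ~4 bits (~0.5 bytes/param)
    for label, bw in [("256 GB/s (guess)", 256),
                      ("512 GB/s (hope)", 512),
                      ("RTX 4090 ~1008 GB/s", 1008)]:
        print(f"{label}: ~{tokens_per_sec(bw, 70, 0.5):.0f} tok/s upper bound")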

                                                                                                                                                                                                                                                                                                                                                                                                                • rapatel0 a day ago

I'm buying one. It's cheaper than my RTX 4090 + 192GB of RAM, with more performance and model training headroom. It's also probably a beast for data science workloads.

                                                                                                                                                                                                                                                                                                                                                                                                                  • friend_Fernando a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                    Little by little, we're getting an answer to the question: "What kind of investment does an outrageous influx of capitalization spur?" One might think it would be an AI silicon-moat, and it might yet be some of that.

But it's clear that everyone's favorite goal is keiretsu-ification. If you're looking for abnormal profits, you can't do better than to add a letter to FAANG. Nvidia already got into the cloud business, and now it's making workstations.

The era of specialists doing specialist things is not really behind us. They're just not making automatic money, nor most of it. Nvidia excelled in that pool, but it too can't wait to leave it. It knows it can always fail as a specialist, but not as a keiretsu.

                                                                                                                                                                                                                                                                                                                                                                                                                    • tkanarsky a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                      This seems surprisingly cheap for what you get! Excited to see what people cook with this

                                                                                                                                                                                                                                                                                                                                                                                                                      • haunter 20 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                        The monitor is AI generated in the product photo.... Nvidia please

                                                                                                                                                                                                                                                                                                                                                                                                                        https://s3.amazonaws.com/cms.ipressroom.com/219/files/20250/...

                                                                                                                                                                                                                                                                                                                                                                                                                        • adam_arthur 19 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                          Finally!

                                                                                                                                                                                                                                                                                                                                                                                                                          First product that directly competes on price with Macs for local inferencing of large LLMs (higher RAM). And likely outperforms them substantially.

                                                                                                                                                                                                                                                                                                                                                                                                                          Definitely will upgrade my home LLM server if specs bear out.

                                                                                                                                                                                                                                                                                                                                                                                                                          • gigatexal a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                            What are the CPU specs? Idk about the GPU but a really fast ARM cpu and a ton of ram and it already runs Linux?!! If it’s competitive with the M chips from Apple this might be my next box.

                                                                                                                                                                                                                                                                                                                                                                                                                          • pizza a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                            Does this also answer the question "What am I supposed to do with my old 4090 and my old 3090 once I get a 5090?" ie can we attach them as PCIe hardware to Digits?

                                                                                                                                                                                                                                                                                                                                                                                                                            • sfrules 16 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                              Newbie question: can you connect a RTX 5090 to the GB10 Digits computer?

                                                                                                                                                                                                                                                                                                                                                                                                                              • tobyhinloopen a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                $3000 seems incredibly good value

                                                                                                                                                                                                                                                                                                                                                                                                                                • palmfacehn a day ago

Would love to see something like this with an ATX form factor, a socketed CPU, a socketed GPU, and non-soldered RAM.

                                                                                                                                                                                                                                                                                                                                                                                                                                  • blackoil a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                    Isn't that just a regular PC with one or more 5090 or equivalent workstation GPU?

                                                                                                                                                                                                                                                                                                                                                                                                                                    • palmfacehn a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                      There would be unified GPU/CPU memory and an ARM processor that isn't soldered to the board.

                                                                                                                                                                                                                                                                                                                                                                                                                                      • pbalcer a day ago

One of the reasons they can do unified memory efficiently is that the CPU and GPU are a single SoC. If you separate them, you end up with a normal PC architecture, with memory having to go through a PCIe bus. This is possible to do with reasonable latency and bandwidth (thanks to CXL), but we haven't seen that in consumer hardware. Even in server space I think only MI300 supports CXL, and even then I don't think it's something AMD particularly promotes.

Personally I think Strix Halo workstations may come with expandable memory, storage and free PCIe slots. But then you have to deal with ROCm... (rough bandwidth numbers sketched below).
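To put rough numbers on the PCIe-vs-unified-memory point above, here is a minimal sketch; both bandwidth figures are illustrative assumptions, not specs of this product.

    PCIE5_X16_GB_S = 64     # assumed practical PCIe 5.0 x16 throughput
    UNIFIED_GB_S = 400      # assumed on-package LPDDR5X-class bandwidth

    def full_weight_pass_ms(model_gb, bandwidth_gb_s):
        # Time to stream the whole weight set once at a given bandwidth.
        return model_gb / bandwidth_gb_s * 1000

    model_gb = 60  # e.g. a ~120B-parameter model at ~4-bit
    print(f"Over PCIe 5.0 x16:   {full_weight_pass_ms(model_gb, PCIE5_X16_GB_S):.0f} ms")
    print(f"From unified memory: {full_weight_pass_ms(model_gb, UNIFIED_GB_S):.0f} ms")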

                                                                                                                                                                                                                                                                                                                                                                                                                                  • prollyjethi a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                    Nvidia could potentially bring us all the year of Linux Desktop.

                                                                                                                                                                                                                                                                                                                                                                                                                                    • smcl a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                      Do I need a "personal AI supercomputer"?

                                                                                                                                                                                                                                                                                                                                                                                                                                      • tmoneymoney a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                        No but you want one regardless

                                                                                                                                                                                                                                                                                                                                                                                                                                      • sam_goody a day ago

So, a company that doesn't feel like sharing all their secret sauce with Anthropic can run DeepSeek Coder on three of these for $9K, and it should be more or less the same experience.

Do I understand that right? It seems way too cheap.

                                                                                                                                                                                                                                                                                                                                                                                                                                        • Havoc 14 hours ago

Such a setup would likely work - assuming DeepSeek support in the software stack - but it wouldn't be able to serve as many requests in parallel as a classic GPU setup.

The main issue is that the RAM they're using here isn't the same as what's in GPUs.
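For the capacity side of the "three boxes for $9K" question, a quick back-of-the-envelope, assuming a DeepSeek-V3-scale model of ~671B parameters at roughly 4-bit quantization (illustrative numbers only, ignoring KV cache and runtime overhead):

    params_billion = 671      # assumed DeepSeek-V3-scale parameter count
    bytes_per_param = 0.5     # ~4-bit quantization
    boxes, per_box_gb = 3, 128

    weights_gb = params_billion * bytes_per_param   # ~335 GB of weights
    capacity_gb = boxes * per_box_gb                # 384 GB across three boxes
    print(f"weights ~{weights_gb:.0f} GB, capacity {capacity_gb} GB, "
          f"headroom ~{capacity_gb - weights_gb:.0f} GB for KV cache and runtime")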

                                                                                                                                                                                                                                                                                                                                                                                                                                        • stuaxo a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                          Of course no chance of this with x86 because of market segmentation.

                                                                                                                                                                                                                                                                                                                                                                                                                                          • sliken 18 hours ago

This Nvidia box is quite a bit like the AMD Strix Halo, which has the same 128GB max memory and unified memory, but is x86 with a 256-bit-wide bus. I've seen much hopeful speculation that the Nvidia part is 512 bits wide, but I'm dubious.

                                                                                                                                                                                                                                                                                                                                                                                                                                            • TiredOfLife a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                              No chance because they don't have x86 license.

                                                                                                                                                                                                                                                                                                                                                                                                                                            • gigel82 a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                              What we need is more diversity in the space. Something between the Jetson and this thing, at under $1000 that can run in a LAN to do LLM, STT, TTS, etc. would be an awesome (if niche) device to enable truly local / private AI scenarios for privacy-sensitive folks.

                                                                                                                                                                                                                                                                                                                                                                                                                                              • rafaelmn a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                Feels like data center AI HW demand peak is over now that these things are trickling down to consumers and they are diversifying customers. Also going lower than expected on gaming HW, seems like they have enough fab capacity.

                                                                                                                                                                                                                                                                                                                                                                                                                                                • bushbaba a day ago

Their DC sales likely aren’t growing at the rates seen before. Law of large numbers. They gotta diversify to satisfy growth & profit expectations.

                                                                                                                                                                                                                                                                                                                                                                                                                                                • thntk a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                  Anyone know if it can run training/fine-tuning and not just 4-bit inference? Does it support mixed precision training with either BF16 or FP16?
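For reference, the workload being asked about is a standard mixed-precision fine-tuning loop like the PyTorch sketch below; whether the DIGITS stack supports it is exactly the open question. The model and data here are toy placeholders.

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).cuda()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(32, 1024, device="cuda")
    y = torch.randint(0, 10, (32,), device="cuda")

    for step in range(10):
        optimizer.zero_grad()
        # BF16 autocast needs no loss scaling; FP16 would pair with a GradScaler
        with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
            loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()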

                                                                                                                                                                                                                                                                                                                                                                                                                                                  • anigbrowl a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                    Honestly surprised at how affordable this is, I was expecting $5-6k as I scanned the opening paragraphs.

                                                                                                                                                                                                                                                                                                                                                                                                                                                    • sliken a day ago

Might well be. From what I can tell the price "starts at $3k", which might well be configured like the minimum Mac Studio for RAM and storage. Mac Studios easily hit $5k-$6k or more.

                                                                                                                                                                                                                                                                                                                                                                                                                                                    • henearkr a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                      It's based on Mediatek CPU cores, so I am really pessimistic about their open source support...

I'm bracing for a whole new era of insufferable binary blobs for Linux users, and my condolences if you run a non-ultra-mainstream distro.

                                                                                                                                                                                                                                                                                                                                                                                                                                                      • magicalhippo a day ago

The press release[1] says it's using NVIDIA's Grace CPU, as well as this:

                                                                                                                                                                                                                                                                                                                                                                                                                                                        MediaTek, a market leader in Arm-based SoC designs, collaborated on the design of GB10, contributing to its best-in-class power efficiency, performance and connectivity.

I assume that means USB and similar peripherals are MediaTek IP, while the Blackwell GPU and Grace CPU are entirely NVIDIA IP.

                                                                                                                                                                                                                                                                                                                                                                                                                                                        That said, NVIDIA hasn't been super-great with the Jetson series, so yeah, will be interesting to see what kind of upstream support this gets.

                                                                                                                                                                                                                                                                                                                                                                                                                                                        [1]: https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

                                                                                                                                                                                                                                                                                                                                                                                                                                                      • bobheadmaker a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                        Pricing seems off!

                                                                                                                                                                                                                                                                                                                                                                                                                                                        • cess11 a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                          With a bit of luck it'll mean some of the Jetson series will get cheaper.

While I'm quite the "AI" sceptic, I think it might be interesting to have a node in my home network capable of a bit of this and that in this area - some text-to-speech, speech-to-text, object identification - which, to be decent, needs a bit more than the usual IoT and ESP chips can manage.
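As one example of the kind of local workload meant here, a minimal speech-to-text sketch with the openai-whisper package; the model size and audio file are placeholders, and any comparable local STT stack would do just as well.

    import whisper  # pip install openai-whisper

    model = whisper.load_model("small")             # runs entirely on the local node
    result = model.transcribe("doorbell_clip.wav")  # placeholder audio file
    print(result["text"])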

                                                                                                                                                                                                                                                                                                                                                                                                                                                          • YetAnotherNick a day ago

I highly doubt it's half or even a quarter of a GB200, unless they have hidden water cooling or something external. The GB200 is 1200 watts. Digits doesn't look like it would be above 200W, and even cooling 200W in that enclosure would be impressive.

                                                                                                                                                                                                                                                                                                                                                                                                                                                            • sliken 18 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                              I think it's 1/40th of the GB200, but I think that's the one with two blackwells, so 1/20th of a full blackwell.

                                                                                                                                                                                                                                                                                                                                                                                                                                                              • blackoil a day ago

GB200 is ~USD 60,000. So it should be like a 20th of that.

                                                                                                                                                                                                                                                                                                                                                                                                                                                              • trhway a day ago

$3000? The GB10 inside it seems to be half of a GB200, which is something like $60K. One can wonder about availability at that $3K price.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                • kcb a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                  No way is the GPU half a GB200. I'd expect something much lower end and power conscious.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                  They mention 1 PFLOP for FP4, GB200 is 40 PFLOP.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • rs38 8 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    at least the naming implies factor ~20 less :)

                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • sliken a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    TheRegister mentions:

   Specs we’ve seen suggest the GB10 features a 20-core Grace CPU and a GPU that manages a 40th of the performance of the twin Blackwell GPUs used in Nvidia’s GB200 AI server.

• poisonborz a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Welcome to tomorrow's "personal" computer, a single unmodifiable SoC with closed source software stack.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • blackoil a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                      That is the PC of today and yesterday and yesteryears.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • papichulo2023 a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        This would be closer to a personal cloud. It's meant for other devices to connect with.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • otabdeveloper4 a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          So exactly like the PC of yesterday then?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • vegabook 19 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Like a mac?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • jeleh a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ...but will it run DOOM?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • giacomoforte a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              It costs the equivalent of 2 years of cloud GPU H100s at current prices.

Edit: Sorry, fucked up my math. I wanted to do 40x52x4, with $4/hr being the cloud compute price, but that is actually about $8,300 a year, so it is actually equivalent to about 4.5 months of cloud compute. 40 hours because I presume this will only be used for prototyping and debugging, i.e. during office hours.
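
A quick worked version of that math (my sketch; the $3,000 price, $4/hr rate, and 40-hour weeks are this comment's assumptions, not official figures):

    # Back-of-the-envelope break-even, using the figures assumed above
    # (hypothetical: $3,000 box, $4/hr cloud H100, 40 hours of use per week).
    device_price = 3_000        # USD, rumored list price
    cloud_rate = 4.0            # USD per GPU-hour (assumed)
    hours_per_week = 40         # office-hours usage (assumed)

    weekly = cloud_rate * hours_per_week        # $160 per week
    yearly = weekly * 52                        # $8,320 per year
    breakeven_weeks = device_price / weekly     # 18.75 weeks
    print(f"cloud cost per year: ${yearly:,.0f}")
    print(f"break-even: ~{breakeven_weeks:.1f} weeks (~{breakeven_weeks / 4.33:.1f} months)")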

                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • sabareesh a day ago

What are current H100 prices? The lowest I have seen is $0.99 per hour.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                • billconan a day ago

and 2 years is 17,520 hours, so even at $0.99/hr that's over $17,000 of cloud spend.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • YetAnotherNick a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Where can you find 99c/hour? Cheapest I can find is double that.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • sabareesh a day ago
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • YetAnotherNick a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        > "Starts from"

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        This is a marketplace, not cloud pricing.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • saagarjha a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Lambda has on-demand GH200 right now for $1.49. There might be a cheaper deal elsewhere for a contract.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • YetAnotherNick a day ago

Weird, given that their cheapest H100 is $2.49. It should either be a shared GH200 or it could just be a promotional price.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • saagarjha 13 hours ago

It’s a temporary thing, I think, because nobody wants to use ARM.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  • rubatuga a day ago

Would consider it at a lower price like $500 USD; it's way too expensive for what it brings.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    • lz400 a day ago

I think it's cheap at $3000. 128GB RAM, top-of-the-line GPU capabilities, 4TB storage... it's much better than what a top-shelf MBP can do, and much cheaper.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      • saagarjha a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        This definitely loses on CPU performance.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • sliken a day ago

I originally thought so, since the previous Grace CPUs used Neoverse V2 cores, which lose to Apple's M4 cores.

However, apparently 10 of the cores are Cortex-X925s, which are a serious upgrade: basically 10 performance cores and 10 efficiency cores, which together should be pretty competitive with any current Apple CPU.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • saagarjha a day ago

M4 Max has 12 performance cores and 4 efficiency cores; the former offer basically the fastest single-core performance you can buy right now, and the latter are unmatched by any other architecture in their energy class. It seems highly unlikely that what you say is the case.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            • sliken a day ago

The Cortex-X925 is pretty competitive and seems similarly aggressive to the Apple M4: both have 10-wide decode and dispatch, pretty exceptional single-thread performance, and good overall performance.

Seems close enough that it might well come down to whether your application uses SVE (which the X925 has) or SME (which Apple has). I believe SVE is generally much easier to use without relying on Apple's proprietary libraries (a quick runtime check for SVE is sketched at the end of this comment).

Or if you need significant memory bandwidth: the Apple M4's CPU cores peak at around 200GB/sec or so; the other 300GB/sec or so is available to the GPU.

Seems quite plausible that 10 Cortex-X925s plus 10 Cortex-A725s might well deliver more collective performance than Apple's 12 P-cores + 4 E-cores. But sure, it's a bit early to tell, and things like OS, kernel, compiler, thermal management, libraries, etc. will affect actual real-world performance.

Generally I'd expect the Nvidia Project Digits box, with 10 P-cores + 10 E-cores and a healthy memory system, to be in the same ballpark as the Apple M4 Max.
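
A minimal sketch of the SVE point above (my assumption: the box runs Linux on aarch64, so the kernel exposes the usual feature flags in /proc/cpuinfo):

    # Check which Arm vector extensions the kernel reports by reading the
    # "Features" line of /proc/cpuinfo (Linux/aarch64 assumed; "sve", "sve2"
    # and "sme" are the standard kernel hwcap spellings).
    def cpu_features() -> set[str]:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.lower().startswith("features"):
                    return set(line.split(":", 1)[1].split())
        return set()

    feats = cpu_features()
    for ext in ("asimd", "sve", "sve2", "sme"):
        print(f"{ext:>5}: {'yes' if ext in feats else 'no'}")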

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              • saagarjha 6 hours ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                I guess you are talking about something different; I'm not really treating the cores as ML accelerators.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        • sergiotapia a day ago

Correct, I just spent $4k on an "AI" machine to do stuff: 96GB RAM, Ryzen 9 9950X (16 cores), 4TB NVMe, 24TB HDD, RTX 4090.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          If this thing was available six months ago I would have bought it instead!

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • Mistletoe a day ago

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            What do you do on it?

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • lispm a day ago

Up to 4TB storage.

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          • jerryspringster a day ago

$500? That won't even buy you a decent new graphics card; anybody claiming this is overpriced doesn't have a clue.