NVIDIA is pretty established, but there are also Intel, AMD, and Google to contend with. Sure, Cerebras is unique in that they make one large chip out of the entire wafer, but nothing prevents these other companies from doing the same thing. Currently they are choosing not to because of wafer economics, but if they chose to, Cerebras would pretty much lose their advantage. https://www.servethehome.com/cerebras-wse-3-ai-chip-launched... 56x the size of an H100 but only 8x the performance improvement isn't something I would brag about. I expected much higher performance since all processing is on one wafer. Something doesn't add up (I'm no system designer). Also, at $3.13 million per node, one could buy 100 H100s at $30k each (not including system, cooling, cluster, etc). Based on price/performance, Cerebras loses IMO.
I think the wafer itself isn't the whole deal. If you watch their videos and read the link you posted, the wafer size allows them to stack the chips in a block with integrated power and cooling at a higher density than blades, and to attach enormous amounts of memory. Not including the system, cooling, cluster, etc. seems like a relatively unfair comparison too, given the node includes all of those things - which are very expensive when considering enterprise-grade data center hardware.
I don't think their value add is simply "single wafer" with all other variables the same. In fact I think the block and system that gets the most out of that form factor is the secret sauce and not as easily replicated - especially since the innovations are almost certainly protected by an enormous moat of patents and guarded by a legion of lawyers.
At the end of the day, Cerebras has not submitted any MLPerf results (of which I am aware). That means they are hiding something. Something not very competitive.
So, performance is iffy. Density for density's sake doesn't matter since clusters are power limited.
Nothing for the training part of MLPerf's benchmark. If they're competing just on inference, then they have stiff competition: specialized NPU-for-inference makers like Hailo (it's even part of the official Raspberry Pi AI kit), Qualcomm, and tons of other players; some players using optics instead of electrons for inference, such as Lightmatter; and SIMD on highly abundant CPU servers, which are never in shortage unlike GPUs (and which have recently gained specialized inference instructions beyond plain SIMD).
This isn't a benchmark, it's a press release. MLPerf has an inference component so they could have released numbers, but they chose not to.
At the end of the day it's all about performance per dollar/TCO, too, not just raw perf. A standardized benchmark helps to evaluate that.
My guess is that they neglected the software component (hardware guys always disdain software) and have to bend over backwards to get their hardware to run specific models (and only those specific models) well. Or potentially common models don't run well because their cross-chip interconnect is too slow.
MLPerf brings in exactly zero revenue. If they have sold every chip they can make for the next 2+ years, why would they be diverting resources to MLPerf benchmarking?
Artificial Analysis does good API provider inference benchmarking and has evaluated Cerebras, Groq, SambaNova, the many Nvidia-based solutions, etc. IMO it makes way more sense to benchmark actual usable endpoints rather than submit closed and modified implementations to MLCommons. Graphcore had the fastest BERT submission at one point (when BERT was relevant lol) and it didn't really move the needle at all.
With Artificial Analysis I wonder if model tweaks are detectable. That's the benefit of a standardized benchmark: you're testing the hardware. If some inference vendor changes Llama under the hood, the changes are known. And of course if you don't include precise reproduction instructions in your standardized benchmark, nobody can tell how much money you're losing (that is, how many chips are serving your requests).
I guess it's a software problem.
Without optimized implementations their performance will look like shit, even if their chip were years ahead of the competition.
Building efficient implementations with an immature ecosystem and toolchain doesn't sound like a good time. But yeah, huge red flag. If they can't get their chip to perform there's no hope for customers.
This hypothesis is an eerily exact instance of the tinygrad (tinycorp) thesis, along the lines of
“nvidia’s chip is better than yours. If you can’t make your software run well on nvidia’s chip, you have no hope of making it run well on your chip, least of all the first version of your chip.”
That's why tinycorp is betting on a simple ML framework (tinygrad, which they develop and make available open source) whose promise is that, because the framework needs only a few operations, it'll be very easy to get it running on a new chip (e.g. yours), and then you can run ML workloads.
I'm not a (real) expert in the field but find the reasoning compelling. And it might be a good explanation for why competition for Nvidia exists in hardware, but seemingly not in practice (i.e. including software that actually does something with the hardware).
> That's why tinycorp is betting on a simple ML framework (tinygrad, which they develop and make available open source) whose promise is that, because the framework needs only a few operations, it'll be very easy to get it running on a new chip (e.g. yours), and then you can run ML workloads.
This sounds easy in theory, but in reality, implementations of current models are often hand-tuned to run fast on a given chip. As an engineer in the ML compiler space, I think this idea of just using small primitives, which comes from the compiler / bytecode world, is not going to yield acceptable performance.
Often enough, hardware-specific optimizations can be performed automatically by the compiler. On the flip side, depending on a small set of general-purpose primitives makes it easier to apply hardware-agnostic optimization passes to the model architecture. There are many efforts that are ultimately going in this direction, from Google's Tensorflow to the community project Aesara/PyTensor (née Theano) to the MLIR intermediate representation from the LLVM folks.
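To make the "small set of primitives" idea concrete, here's a toy sketch of my own (not tinygrad's or any framework's actual op set): a softmax expressed entirely in terms of a few elementwise and reduce primitives, which is all a new backend would have to implement.

```python
import numpy as np

# Toy sketch of the "few primitives" idea -- NOT tinygrad's actual op set.
# A backend that implements just these elementwise/reduce primitives gets
# every composite op built from them for free, and hardware-agnostic
# rewrite passes only ever have to reason about this small vocabulary.

def prim_exp(x):            # elementwise unary
    return np.exp(x)

def prim_mul(x, y):         # elementwise binary
    return x * y

def prim_max(x, axis):      # reduction
    return np.max(x, axis=axis, keepdims=True)

def prim_sum(x, axis):      # reduction
    return np.sum(x, axis=axis, keepdims=True)

def softmax(x, axis=-1):
    # Composite op expressed purely in terms of the primitives above.
    e = prim_exp(x - prim_max(x, axis))
    return prim_mul(e, 1.0 / prim_sum(e, axis))

x = np.random.randn(4, 8).astype(np.float32)
print(softmax(x).sum(axis=-1))   # each row sums to ~1.0
```

The open question from the comment above is whether a compiler can fuse and schedule chains of such primitives well enough to match hand-tuned kernels.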
I'm a compiler engineer at a GPU company, and while tinygrad kernels might be made more performant by the JIT compiler underlying every GPU chip's stack, oftentimes a much bigger picture is needed to properly optimize all the chip's resources. The direction that companies like NVIDIA et al. are going in involves whole-model optimization, so I really don't see how tinygrad can be competitive here. I see it most useful in embedded, but Hotz is trying to make it a thing for training. Good luck.
> There are many efforts that are ultimately going in this direction, from Google's Tensorflow to the community project Aesara/PyTensor (née Theano) to the MLIR intermediate representation from the LLVM folks.
The various GPU companies (AMD, NVIDIA, Intel) are some of the largest contributors to MLIR, so saying that they're going in the direction of standardization is not wholly true. They're using MLIR as a way to share optimizations (really to stay at the cutting edge), but, unlike tinygrad, MLIR has a much higher-level overview of the whole computation, and the companies' backends will thus be able to optimize over the whole model.
If tinygrad were focused on MLIR's ecosystem I'd say they had a fighting chance of getting NVIDIA-like performance, but they're off doing their own thing.
Yes, sure. I'm occasionally reading up on what George Hotz is doing with tinygrad and him ranting about AMD hardware certainly has influenced my opinion on non-Nvidia hardware to some degree - even though I take his opinion with a grain of salt, he and his team are clearly encountering some non-trivial issues.
I would love to try some of the stuff I do with CUDA on AMD hardware to get some first-hand experience, but it's a tough sell: they are not as widely available to rent, and telling my boss to order a few GPUs so we can inspect that potential mess for ourselves is not convincing either.
Can their system attach memory? From what I read, it doesn't seem to be able to: https://www.reddit.com/r/mlscaling/comments/1csquky/with_waf...
I think they do have external memory that they use for training.
Former Cerebras engineer. At the time I was there, it could not.
Surprising. DRAM (and more importantly high-bandwidth DRAM) seems to be scaling significantly better than SRAM -- and I'm not sure if that could be seriously expected to shift.
Correction: it's 8x the TFLOPS of a DGX (8 H100s), not of a single H100. But it's true that if it stays at $3M it's probably too much, and I don't think the memory bottleneck on GPUs is large enough to justify this price/performance.
So, the corrected statement is:
"56x the size of H100 but only 64x the performance improvement"
Doesn't sound too shabby.
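Spelling out the arithmetic behind the correction (treating a DGX H100 as 8 H100s; rough numbers):

```python
# Rough arithmetic behind the correction.
dgx_speedup = 8                  # claimed WSE-3 speedup vs. a DGX (8 H100s)
h100_equiv = dgx_speedup * 8     # => ~64x a single H100
area_ratio = 56                  # WSE-3 area vs. a single H100 die

print(f"~{h100_equiv}x one H100 at {area_ratio}x the area -> "
      f"~{h100_equiv / area_ratio:.2f}x performance per unit of silicon")
# ~64x one H100 at 56x the area -> ~1.14x performance per unit of silicon
```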
The company started in 2015 so I think they are (were?) banking on SRAM scaling better than it has in recent years.
If you have a problem that you can’t easily split up into 64 chunks, I guess it makes more sense, right?
> 56x the size of H100 but only 8x the performance improvement isn't something I would brag about.
It doesn't sound like it's too bad for a 9 year old company. Nvidia had a 20-year head start. I would expect that they will continue to shrink it and increase performance. At some point, that might become compelling?
Nvidia is also going to keep improving, so it will be a moving target.
That's true, but the advantage of having a head start does eventually diminish. They won't catch up to Nvidia in the next couple of years, but they could eventually be a real competitor.
Comparing a WSE-3 to an H100 without considering the systems they go in, or the cooling, networking, etc. that supports them, means little when doing cost analysis, be it CapEx or TCO. A better (but still flawed) comparison would be a DGX H200 (a cluster of H100s and their essential supporting infra) to a CS-3 (a cluster of WSE-3s and their essential supporting infra in a similar form factor/volume to a DGX H200).
Now, is Cerebras going to eventually beat Nvidia, or at least compete healthily with Nvidia and other tech titans in the general market or a given lucrative niche of it? No idea. That'd be a cool plot twist, but hard to say. But it's worth acknowledging that investing in a company and buying their products are two entirely separate decisions. Many of Silicon Valley's success stories are a result of people investing in the potential of what companies could become, not because they were already the best on the market, and if nothing else, Cerebras' approach is certainly novel and promising.
> wafer economics
What are they?
Is this related to defects? Can't they disable parts of a defective chip just like other CPUs do? Sounds cheaper than cutting up and packaging chips individually!
Process development, feature size, and ultimate yield are probably what they're after. Yes, for the past 30+ years everyone has used a combination of disabling ("fusing") unused/unreliable logic on the die. In addition, everyone also "bins" the chips from the same wafer into different SKUs based on stable clock speed, available/fused components, test results, etc. This can be very effective in increasing yield and salable parts.
My recollection is that there's speculation Cerebras is building in significant duplicate features to account for defects. They can't "bin" their wafers in the same way as packaged chips. That will reduce total yield/utilization of the surface area.
The actual packaging steps are relatively low tech/cost compared to the semiconductor manufacturing. They're commonly outsourced somewhere like Malaysia or Thailand.
Agreed, it just seems like Nvidia chips are going to be easier to produce at scale. Cerebras will be limited to a few niche use-cases, like HFT where hedge funds are using LLMs to analyze SEC filings as fast as possible.
Where/how did you learn of the hedge fund usages?
If the poster never comes back, I think it is fair to assume it is just a reasonable guess, right?
they don’t need an advantage, they just need orders and inventory
get extorted by nvidia sales people for a 2026 delivery date that gets pushed out if you say anything about it or decline cloud services
or another provider delivering earlier
that's what the market wants, and even then, who cares? this company is trying to IPO at what valuation? this article didn't say but the last valuation was like $1.5bn? so you mean a 300x delta between this and Nvidia's valuation if these guys get a handful of orders? ok
At the end of the day it's all made in the same factory. If nVidia have problems delivering then so do Cerebras.
> Sure Cerebras is unique in that they make one large chip out of the entire wafer
I'm sure they test it thoroughly. /s
On the one hand, the financials are terrible for an IPO in this market.
On the other, Nvidia is worth 3trn so they can sell a pretty good dream of what success looks like to investors.
Personally I would expect them to get a valuation well above the $4bn from the 2021 round, despite the financials not coming close to justifying it.
Saying the financials are terrible is a bit of a stretch. Rapidly growing revenue, decreasing loss/share and a loss/share similar to other companies that IPO'ed this year.
The more concerning thing is just not having diversity of revenue, since most of it comes from G42.
Has G42 shipped any working AI models?
afaik they have the current SOTA language models for Arabic
IPOs are coming back. Expect pretty big ones in 2025.
It'll pop. Then it'll rot.
Rev for last 2 years:
$24.6M, $78.7M, and roughly $270M annualized ($136.4M in the first half of 2024)
Sounds like a rocketship. You also get a better Sharpe if you take some money off the table in the form of leverage and put it in other firms within the industry. E.G. Leveraging your NVDA shares and buying Cerebras.
> take some money off the table in the form of leverage and put it in other firms within the industry. E.G. Leveraging your NVDA shares and buying Cerebras
Please don't do this. Sell your Nvidia shares and rebalance to Cerebras, whatever. But financially leveraging a high-multiple play to buy a correlated asset (which is also high multiple) is begging for a margin call. You may wind up having been right. But leverage forces you to be right and on time.
You are so on point! A huge number of amateur investors get obliterated on this. Your call may be right, but that's no help if you don't survive to see it realized.
You may have a hugely profitable idea that could realize crazy gains over a 5 year horizon, but if you get margin called and liquidated in year 3, you'll end up with nothing.
The magic of investment is compound returns, not crazy leverage. Take some of the crazy Nvidia profits and reinvest it elsewhere where you expect geometric growth. Keep things decently diversified.
Cerebras is well-known in the AI chip market. They make chips that are an entire wafer.
Cerebras made a great (now deleted) video on the whole computer hosting the wafer: https://web.archive.org/web/20230812020202/https://www.youtu...
It’s fascinating.
This is a great video, thank you for sharing. My favorite part:
"...next we have this rubber sheet, which is very clever, and very patented!"
TIL - web archive saves youtube videos
Wow 200k amps in a chip. Whole thing looks like an early computer from 50s.
Yep! Them, SambaNova, and Groq are super exciting mid-late stage startups imo.
Shhhhh, stop telling the normies about the future!
And especially don't tell them to start looking into who "sovereign clouds" actually are!
Interesting that they’ve scaled on-chip memory sublinearly with the growth of transistors between their generations, I would’ve thought they would try to bump that number up. Maybe it’s not a major bottleneck for their training runs?
SRAM is scaling significantly more slowly than logic in recent process nodes.
Ahh that explains it, thanks. Seems like a potentially large problem given their strategy.
They could use something like GCRAM[1] to double capacity if they had to...but it's not clear how much worse performance would be.
The performance doesn't look great (yet). See Fig. 7
https://www.eng.biu.ac.il/fishale/files/2020/12/A-1-Mbit-Ful...
Cerebras runs at 1.1 GHz[1], and this was a much earlier design on 16nm so it might be a good fit by now. Their TSMC 5 nm version is scheduled for early 2025.[2]
[1]https://cerebras.ai/blog/cerebras-architecture-deep-dive-fir...
[2]https://www.eenewseurope.com/en/raaam-signs-lead-licensee-fo...
They'd have to quadruple their performance to be relevant in the market generally, here's to hoping.
Yeah...but double density and 1/10 power consumption would be outstanding for ML type loads...hopefully they can get performance to at least the 1.x GHz range! I'm keeping an eye out...dollars to donuts Cerebras will be using this within a year, once it's qualified for TSMC 5nm.
Not so sure myself. eDRAM has been tried many times and yet here we all are, banging our heads against the SRAM scaling limits. It's such a huge risk for cerebras, they absolutely cannot afford to miss. But then you could say that about a wafer-scale chip...!
I'd bet that making a chip the size of the wafer has the benefit of not losing any silicon to dicing the wafer up, like desktop or GPU chips coming from a wafer do. The major downside is you need to either have a massive X and Y exposure size or break the wafer into smaller exposures, which means you're still needing to focus on alignment between the steps - and if a defect can't be corrected, then is that wafer just scrap?
They fuse off sections of the wafer with defects just like other manufacturers do in monolithic CPUs (as opposed to chiplets like AMD).
Making monolithic silicon 2x as large doesn't just make it 2x as expensive - bigger silicon is massively more expensive. I'm not sure that making each piece require a large chunk of perfect wafer is a fantastic idea, especially when you're looking to unseat juggernauts who have a great deal of experience making high-quality product already.
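A rough back-of-the-envelope of why, using the classic Poisson yield model (the defect density and wafer cost below are illustrative assumptions, not TSMC figures):

```python
import math

# Why cost per *good* die grows faster than die area when there is no
# redundancy: Poisson yield model. D0 and the wafer cost are illustrative
# assumptions, not TSMC figures.
D0 = 0.1                                  # defects per cm^2 (assumed)
wafer_cost = 15_000                       # USD per 300 mm wafer (assumed)
wafer_area = math.pi * (30.0 / 2) ** 2    # ~707 cm^2, ignoring edge loss

for die_area in (1, 2, 4, 8):             # cm^2
    yield_frac = math.exp(-D0 * die_area)     # dies with zero defects
    dies_per_wafer = wafer_area / die_area    # ignoring dicing/edge effects
    cost_per_good_die = wafer_cost / (dies_per_wafer * yield_frac)
    print(f"{die_area} cm^2: yield {yield_frac:.0%}, "
          f"~${cost_per_good_die:,.0f} per good die")
# Doubling the area more than doubles the cost per good die.
```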
It is designed to handle defects.
https://www.servethehome.com/cerebras-wafer-scale-engine-ai-....
0.5% overheads for defects. You are not correct.
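Toy math for why a sub-1% spare budget can work when you can route around bad cores (the core count and area are roughly the published WSE-3 numbers; the defect density is an assumption):

```python
# Toy spare-core math: with ~900k tiny cores and the ability to route
# around bad ones, a small redundancy budget covers the expected defect
# count. Core count and area are roughly the published WSE-3 numbers;
# the defect density is an assumption.
wafer_area = 462.0        # cm^2, roughly the WSE-3 active silicon
D0 = 0.1                  # defects per cm^2 (assumed)
cores = 900_000           # order of magnitude of the WSE-3 core count

expected_defects = D0 * wafer_area          # ~46 killer defects expected
lost_fraction = expected_defects / cores    # assume one dead core per defect
print(f"~{expected_defects:.0f} expected defects vs {cores:,} cores "
      f"-> ~{lost_fraction:.3%} of cores lost on average")
# A ~0.5% spare-core budget sits far above that expected loss.
```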
How does one cool that!? Heck power it...
The only way for Cerebras to actually succeed in the market is to raise funds. They need better software, better developer relations, and better hardware too. It's a gamble, but if they can raise enough money then there's a chance of success, whereas if they can't, it's pretty hopeless.
Time (and the market?) will tell whether all the people clamoring for NVIDIA alternatives actually put their money on it (understanding that NVIDIA's head start is a long-term heavy investment in software too: compilers, libraries, and of course hardware/software co-design). I still can't fathom how Intel thought Arc and/or Ponte Vecchio would pay for themselves on day one.
> I still can't fathom how Intel thought Arc and/or Ponte Vecchio would pay for themselves on day one.
I don't think they did expect that.
Then I don't understand why they stopped before they even released a second gen, and instead went for nebulous promises of Falcon Shores, weird archs, or maybe some sprinkled Gaudi, or maybe something in some 'soon' future. The Arc boards weren't bad, and at least you could start porting to SYCL/oneAPI for a good reason (i.e. there's hardware available...).
They haven't stopped? Or if they have stopped they aren't saying so.
The second generation of Arc is called Battlemage and the successor to Ponte Vecchio is Falcon Shores, and both are promised in 2025.
This is later than originally expected by a few months but no reason to think they've abandoned it.
If they do ever decide to do that, it wouldn't be because the first gen didn't make money, but because they no longer think they can subsidize it long enough for it to become profitable - either because they run out of money or because it isn't gaining market share at the rate they expected.
They have a cloud platform. I just ran a test query on their version of Llama 3.1 70B and got 566 tokens/sec.
Is that a lot? Do they have MLPerf submissions?
Yes, that's very fast. The same query on Groq, which is known for its fast AI inference, got 249 tokens/s, and 25 tokens/s on Together.ai. However, it's unclear what (if any) quantization was used and it's just a spot check, not a true benchmark.
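For anyone wanting to reproduce this kind of spot check, here's roughly what I mean - a sketch against an OpenAI-compatible chat endpoint (the base URL, model name, and key are placeholders, and this measures end-to-end throughput including queueing and prefill, so it's still a spot check rather than a benchmark):

```python
import time
import requests

# Rough tokens/sec spot check against an OpenAI-compatible chat endpoint.
# BASE_URL, MODEL and the key are placeholders -- point them at whichever
# provider you're testing.
BASE_URL = "https://api.example.com/v1"   # placeholder
MODEL = "llama-3.1-70b"                   # placeholder model name
API_KEY = "..."                           # your key here

t0 = time.time()
resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": MODEL,
        "messages": [{"role": "user",
                      "content": "Write a few hundred words about wafer-scale chips."}],
        "max_tokens": 1000,
    },
    timeout=120,
)
elapsed = time.time() - t0
tokens = resp.json()["usage"]["completion_tokens"]
print(f"{tokens} tokens in {elapsed:.1f}s -> {tokens / elapsed:.0f} tokens/s")
```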
https://www.zdnet.com/article/cerebras-did-not-spend-one-min...
Met them at an MIT event last week, they don't quantize any models.
The real winner in the chip war is TSMC. Everyone is using them to make chips.
Yeah I also have a feeling more value will gravitate towards the really hard stuff once we’ve got the NN architectures fairly worked out and stable.
To put my money where my mouth is I’m long TSMC and ASML among others, and (moderately) short NVidia. Very long the industry as a whole though.
If Cerebras keeps improving, it will be a decent contender to Nvidia. On Nvidia hardware, the VRAM-to-SRAM path is a bottleneck: for pure inference, the GPU needs to read the entire model from memory at least once per token (amortized across the batch size). The bottleneck is not the Tensor Cores but memory transfers - they say so themselves. Cerebras fixes that (at a cost of software complexity and a narrower target solution).
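To put rough numbers on the memory-bound point (illustrative figures, not exact vendor specs):

```python
# Back-of-the-envelope for memory-bandwidth-bound decoding: at batch size 1,
# every generated token requires reading all the weights from memory.
# Figures are rough/illustrative, not exact vendor specs.
params = 70e9                 # a Llama-3.1-70B-class model
bytes_per_param = 2           # fp16/bf16 weights
hbm_bandwidth = 3.35e12       # ~H100 SXM HBM3 bandwidth, bytes/s (approx.)

weight_bytes = params * bytes_per_param
tokens_per_s = hbm_bandwidth / weight_bytes
print(f"upper bound ~{tokens_per_s:.0f} tokens/s per sequence at batch 1")
# Bigger batches amortize the weight reads; keeping weights in on-wafer SRAM
# sidesteps the reads entirely, which is the Cerebras (and Groq) pitch.
```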
"the filing puts the spotlight on the company’s financials and its relationship with its biggest customer, Abu Dhabi-based G42, which is also a partner and an investor in the company."
"The documents also say that a single customer, G42, accounted for 83% of revenue in 2023 and 87% in the first half of 2024."
https://www.eetimes.com/cerebras-ipo-paperwork-sheds-light-o...
Kind of vaguely reminds me of Transmeta vs Intel/AMD back in ~2000.
Cerebras has a real technical advantage in development of wafer scale.
They use the whole wafer for a chip (wafer scale). The WSE-3 chip is optimized for sparse linear algebra ops and uses TSMC's 5nm process.
Their idea is to have 44 GB of SRAM per chip. SRAM is _very_ expensive compared to DRAM (about two orders of magnitude).
It's easy to design a larger chip. What determines the price/performance ratio are things like
- performance per chip area.
- yield per chip area.
Wafer scale integration has been a thing since wafers. Yet, I almost never read of anyone taking it the full distance to a product. I don't know if it turns out the yield per die per wafer or the associated technology problems were the glitch, but it feels like a good idea which never quite makes it out the door.
They don't give yield numbers but this says that they get acceptable yields by putting extra cores on the silicon and then routing around the defective ones. https://cerebras.ai/blog/wafer-scale-processors-the-time-has...
I found this bit interesting: they worked with TSMC to ensure the off-die areas used for test and other foundry purposes are clearly circumscribed, so they can use the blanks between the chips for the inter-chip connects. The distances are kept short, and they can avoid a lot of the encode/decode logic costs associated with how people used to do this:
"The cross scribe line wiring has been developed by Cerebras in partnership with TSMC. TSMC allowed us to use the scribe lines for tens of thousands of wires. We were also allowed to create certain keep-out zones with no TSCM test structures where we could embed Cerebras technology. The short wires (inter-die spacing is less than a millimeter) enable ultra-high bandwidth with low latency. The wire pitch is also comparable to on-die, so we can run the inter-die wires at the same clock as the normal wires, with no expensive serialization/deserialization. The overheads and performance of this homogeneous communication are far more attractive than those of multi-chip systems that involve communication through package boundaries, transceivers, connecters or cables, and communication software interfaces."
I believe I’ve heard them say they have 100% yield. They haven’t made very many yet though, on the order of 100.
Concerning in terms of hype bubble now having even more exposure to the stock market. Perhaps less concerning since it's a hardware startup? Nah, nvm, I think this will end up cratered within 3 years.
I'm going to go ahead and predict this flubs long term. Not only is what they are doing very challenging, I've had some random brokerage house reach out to me multiple times about investing in this IPO. When your IPO process resorts to cold calling, I don't think it's a good sign. Granted, I have some associations with AI startups, but I don't think that had anything to do with the outreach from the firm.
Agreed, it seems like NVIDIA would be happy to make whole-wafer chips if it seemed like a good play.
My guess is there are a lot of bespoke limitations that the software has to work around to run on a "whole wafer" chip, and even companies that have 99% similar designs to Nvidia already are struggling to deal with software incompatibilities, even with such a tiny difference.
You do realize that brokerages earn commissions on selling shares, so why wouldn't they contact people who may be interested?
The point is, there are IPO shares available to sell, even to people who have never expressed any interest in the company. That never happens if there's genuine demand for an IPO.
1. Have you purchased other companies' stock from the aforementioned broker?
2. The price of the shares in private markets has been steadily climbing, so I think there is demand.
I don't know enough to say they'll fail or be successful, but I am wondering who will underwrite this IPO — they must have balls of steel and confidence galore.
Is it a good idea to go IPO when the balance sheet looks terrible?
Does cerebras make gaming GPUs, or is it enterprise-only?
Very solidly enterprise-only. They make single chips that take an entire wafer, use something like 10 kilowatts, and have liquid cooling channels that go through the chip. Systems are >$1M.
It's the return of the supercomputer! I really didn't think the supercomputer would come back as a thing, for so long it seemed stuck as a weird research project that only made sense for a tiny set of workloads... but it does make sense now
Guess a room full of PS2s can only take you so far...
The chip is huge. It wouldn't fit in any conceivable PC form factor.
They sound more like NPUs or TPUs than GPUs. Though that doesn't answer the question about the market they are targeting.
How does Cerebras compare to D-Matrix?
They have zero moat
So many things here smell funny...
I have never heard of any models trained on this hardware. How does a company IPO on the basis of having the "best tech" in this industry, when all the top models are trained on other hardware.
It just doesn't add up.
Plenty of companies IPO before releasing anything, or before building a large audience. That's how lots of things that require a long lead time and a large initial investment get made. It's just a bigger risk for the investors.
Tesla IPOed in 2010 after selling only a few hundred Roadsters.
Seems like they support training on a bunch of industry-standard models. I think most of the customers in the training space tend to be doing fine-tuning, right? The 'P' in GPT stands for pre-trained - you then tune for your actual application. I don't think they will take over the insane computational effort of training Llama or GPT from scratch - those companies are using clusters that cost more than Cerebras' last valuation.
I thought they were for inference, not training... Either way, it's kind of concerning that I've heard about them plenty from the hype bubble but apparently still don't really understand what they do.
This is the first I've heard of Cerebras Systems.
From the article
>Cerebras had a net loss of $66.6 million in the first six months of 2024 on $136.4 million in sales, according to the filing.
That doesn't sound very good.
What makes them think they can compete with Nvidia, and why IPO right now?
Are they trying to get government money to make chip fabs like Intel or something?
You seemed surprised that this company is having an IPO to actually raise funds for operations and expansion, rather than just as an "exit" where VCs and other insiders can dump their shares onto the broader public.
I might be a bit suspicious if a company in some low-capital-intensive industry was IPOing while unprofitable, but this is chip making. Even if they're not making their own fabs this is still an industry with high capital requirements.
We should be thrilled at a company actually using an IPO for its original intended purpose as opposed to some financialization scheme.
they don't make chips. they design and contract TSMC to fab the chips. The high capital is in design tools and engineers.
Thanks - I said that in my comment, but then just realized I had a typo of "fans" where it should have said "Even if they're not making their own fabs..."
Does this mean that they couldn't find VCs to raise more cash?
Cerebras is currently heavily backed by the Emirati government's sovereign wealth fund.
VCs offer cash on different terms than the public does. This just means Cerebras believes it can get capital more cheaply (or on otherwise better terms) than it can from VCs.
That might mean VCs are turning them down, yeah, but that’s just one of many possible factors into “where do we raise money”
I don't think that's been a problem. I've been following the pre-IPO market on them for a while, and pretty much any shares at a $7bn valuation have been snapped up same day.
Nvidia's moat is real but not big enough that one can't surpass it with a lot of engineering. It's not the only company making AI accelerators, and this has been the case for many years already. The first TPU was introduced in 2015. Nvidia has just managed to get a leader position in the race.
Saying it's "just" a lot of engineering effort to catch up isn't wrong, but it understates the reality. There are very few organizations on earth that have the technical and financial resources to meaningfully compete with even small parts of Nvidia's portfolio. Nvidia's products benefit from that breadth of strengths and the volumes they ship.
They don't just make accelerators, they'll sell you the hardware too (unlike TPUs). They don't just sell you the hardware, the software ecosystem will work too (unlike AMD or Intel). That hardware won't just do a lot of computations, it'll also have a lot of off-chip memory bandwidth (vs Cerebras or others). Need to embed those capabilities in a device that can't fit a wafer cabinet or a server rack's worth of compute? Nvidia will sell you similar hardware that uses a similar stack, certified for your industry (e.g. automotive). Take any of that away and you're left with a significantly weaker offering.
Also they benefit from the priority of paying fabs a lot of money and placing a lot of orders.
If anything, Nvidia is less dominant than they should be because they've managed to ensure absolutely no one wants to buy from them when there are viable alternatives.
People said the same about Cisco, Intel, IBM etc. It will only be a matter of time for companies to eat into the high margin stuff for specific use-cases and grow from there.
There's something weird about the market right now in that all the AI budgets being used to buy GPUs are loss-leading. Orgs are treating the spend as a waste anyway, so I suspect they aren't going to be looking to cut costs. Makes Cerebras a hard sell imo.
> Nvidia's moat is real but not big enough that one can't surpass it with a lot of engineering.
Yes, but you also need a lot of capital if you want node parity with them. Nvidia (supposedly) spent an estimated $9 billion getting onto TSMC's 4nm node. https://www.techspot.com/news/93490-nvidia-reportedly-spent-...
> Taiwan Semiconductor Manufacturing Company makes the Cerebras chips. Cerebrus warned investors that any possible supply chain disruptions may hurt the company.
They get their chips from the same company that Nvidia does.
Virtually any competitors to Nvidia would be in the same position.
It's not necessarily to TSMC's advantage for Nvidia to become a monopolist either, although they wouldn't be totally dependent on Nvidia even if they did because TSMC serves every chip market.
They both contract TSMC to fabricate their chips.
The actual design and R&D is still done by Nvidia, Cerebras, AMD, Groq, etc.
Think of TSMC like Kinko's - they do printing and fabrication which is very low margins.
The main PMF for Cerebras is in simulations, drug discovery, and ofc ML.
As I've mentioned before on HN, Public-Private Drug Discovery and NatLab research has been a major driver for HPC over the past 20 years.
TSMC has a market cap of 0.9T USD. It would be the 7th largest US company by market cap if it were one. Manufacturing chips is extremely profitable, at least in the current climate. It used to be that software was more profitable than hardware, which is more commoditized, but AI gave hardware companies a renaissance of sorts.
It's not a simple process at all but requires a lot of engineering and engineers to do it.
https://companiesmarketcap.com/usa/largest-companies-in-the-... https://companiesmarketcap.com/tsmc/marketcap/
> Manufacturing chips is extremely profitable
It only became profitable NOW in the last 2-3 years.
Before that, foundry after foundry was shutting down or merging.
TSMC, UMC, Samsung, Intel Foundry Services, and GloFo are the last men standing after the severe contraction in the foundry model in the 2000s-2010s due to its extremely high upfront costs and lack of moat to prevent commoditization.
TSMC margins are over 30% and growing [1] - that's very far from "low".
[1] https://www.macrotrends.net/stocks/charts/TSM/taiwan-semicon...
30% net due to a near monopoly and a recent upswing due to Nvidia.
Almost every other foundry system died because of low net margins.
Software (and fabless hardware like chip design) is expected to have 60-70% gross margins or the ability to reach that.
Semiconductors is part of TMT just like Software or Telecom, and this has an impact on available liquidity.
This is why TSMC is heavily subsidized by the Taiwanese government.
TSMC is neither software nor fabless. I'm not sure we are talking about the same company, there seems to be some disconnect here. For hardware business 30% margins are high, Apple is one of the most famous exceptions.
> For hardware business
When a foundry wishes to raise capital from the private or public markets, it's bucketed under TMT - which includes software and fabless hardware as well.
This means it's almost impossible to raise capital without a near monopoly and/or government support and intervention - which is what Taiwan did for TSMC and UMC - because the upfront costs are too high and the margins are much lower compared to other subsegments in the same sector.
This is why industrial subsidies like the CHIPS Act are enacted - to minimize the upfront cost of some very CapEx-heavy projects (which almost everything foundry-related is).
Kinko's is not the pinnacle of human engineering - TSMC is. A slight difference there.
> Think of TSMC like Kinko’s
What an amazingly reductive analogy :)
Compare it to the same period last year ($8.7M in sales). That’s a pretty solid growth rate.
Their tech is very impressive, look it up.
It's a dead end. SRAM doesn't scale on advanced nodes.
Similar to Tenstorrent, who chose GDDR instead of HBM: they thought production AI models wouldn't get bigger than GPT-3.5 due to cost.
I don't think they rely on SRAM very much for training. https://cerebras.ai/blog/the-complete-guide-to-scale-out-on-... outlines the memory architecture but it seems like they are able to keep most of the storage off wafer which is how they scale to 100s of GB of parameters with "only" 10s of GB of SRAM.
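Rough numbers for why keeping the weights off-wafer can still work (all figures here are my own illustrative assumptions, not Cerebras specs):

```python
# Rough numbers for "weights off-wafer, activations on-wafer" training.
# All figures are illustrative assumptions, not Cerebras specs.
params = 70e9                 # parameter count of a largish model
bytes_per_param = 2           # bf16
sram_on_wafer = 44e9          # ~44 GB of on-wafer SRAM
layers = 80                   # e.g. a 70B-class transformer

weight_bytes = params * bytes_per_param
print(f"weights: {weight_bytes / 1e9:.0f} GB vs {sram_on_wafer / 1e9:.0f} GB SRAM "
      f"-> weights have to live off-wafer")
per_layer = weight_bytes / layers
print(f"streaming one layer at a time keeps the working set to "
      f"~{per_layer / 1e9:.1f} GB of weights per step")
```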