Comments Page - Netflix's Key-Value Data Abstraction Layer

« Back Netflix's Key-Value Data Abstraction Layernetflixtechblog.comSubmitted by gslin 20 hours ago

ericmcer 7 hours ago
Can anyone explain why Netflix is considered to have such high tier engineering? Just from a super high level view they store and serve ~5000 videos saved at a few different qualities (4?) so lets say a total of 20,000 videos. Those files only change when specific privileged users update them.
Compare that with Youtube where ~5,000 videos are uploaded, processed into different formats/qualities every minute, and can be added by anyone with an email. It seems like Netflix has a fairly trivial problem when compared with video sharing or content sharing sites.
- jolynch 6 hours ago
  My experience has been that the talent density is the main difference. Netflix tackles huge problems with a small number of engineers. I think one angle of complexity you may be missing is efficiency - both in engineering cost and infrastructure cost.
  Also YouTube has _excellent_ engineering (e.g. Vitess in the data space), and they are building atop an excellent infrastructure (e.g. Borg and the godly Google network). It's worth noting though that the whole Netflix infrastructure team is probably smaller than a small to medium satellite org at Google.
- thecosmicfrog 5 hours ago
  As soon as a streaming service starts having availability issues, it will garner a reputation very quickly and lose customers just as quickly. Being able to serve N amount of content reliably and consistently (even if less than M amount) is still a strong demonstration of good engineering practice in my opinion.
  On that point, I can't honestly recall a time I had Netflix streaming issues that weren't because of a problem on my side. Maybe I've just been lucky though, so ymmv.
- ianbutler 5 hours ago
  Netflix still has to serve 20k videos to 300million people. That's about a 750million hours of streamed content. Serving that content is challenging.
  Then they have their ad network on top of it. Then they have their analytics apparatus. Then they probably have a whole suite of tools for content producers. Then they probably have a bunch of janky tools for things that didn't exist as products 15 years ago.
  Seems reasonable to me if you put in a little more thought about the problem and scale.
- NBJack 6 hours ago
  Hype for the engineering culture? Helps attract the right talent. It is a relatively small team that is...ah, heavily motivated to come up with good solutions around the clock. And they maintain an excellent tech blog.
  Don't get me wrong; serving the level of traffic they handle isn't easy to scale or do cost-effectively around the globe. They are also considered by some to be pioneers in chaos engineering, and made headlines years ago making a competition to find the "best" suggestion algorithm.
- loire280 6 hours ago
  You're probably right, but Netflix does a good job building their engineering brand by writing up and sharing their technical work publicly.
snicker7 10 hours ago
This API is very similar to DynamoDB, which is basically a hash table of B-trees.
My experience is that this architecture can lead to very chatty applications if you have a rich data model (eg a graph).
- jolynch 7 hours ago
  (post author)
  It is indeed similar to DynamoDB as well as the original Cassandra Thrift API! This is intentional since those are both targeted backends and we need to be able to migrate customers between Cassandra Thrift, Cassandra CQL and DynamoDB. One of the most important things we use this abstraction for is seamless migration [1] as use cases and offerings evolve. Rather than think of KeyValue as the only database you ever need, think of it like your language's Map interface, and depending on the problem you are solving you need different implementations of that interface (different backing databases).
  Graphs are indeed a challenge (and Relational is completely out of scope), but the high-scale Netflix graph abstraction is actually built atop KV just like a Graph library might be built on top of a language's built in Map type.
  [1] https://www.youtube.com/watch?v=3bjnm1SXLlo
jerf 7 hours ago
For anyone looking for a TL;DR, I'd suggest starting at https://netflixtechblog.com/introducing-netflixs-key-value-d... , which HN is truncating so you can't see it but I've directly linked to a later section in the post with a #. Up to that point it's basically "a networked HashMap<String, SortedMap<Bytes, Bytes>>". But the ability to return partial results based on a timeout with a pagination token is somewhat unusual and the next section called "Signaling" is at least worth a look.