• CSDude 2 hours ago

    I feel like there is great potential to be explored here by adjusting cgroups dynamically: not in a machine learning way, but by allowing bursts, finding good request/limit ratios to pick (1s/10s or 0.1s/1s?), and voluntarily evicting stateless workloads.
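
    To make the idea concrete, here is a minimal sketch of what "dynamically" could mean, assuming cgroup v2, a kernel with cpu.max.burst support, and a hypothetical cgroup path: a small controller rewriting a workload's quota, period, and burst on the fly.

        package main

        import (
            "fmt"
            "os"
        )

        // setCPUMax writes "<quota> <period>" in microseconds to cpu.max;
        // e.g. 100000 1000000 is the 0.1s-per-1s ratio mentioned above.
        func setCPUMax(cg string, quotaUs, periodUs int) error {
            v := fmt.Sprintf("%d %d", quotaUs, periodUs)
            return os.WriteFile(cg+"/cpu.max", []byte(v), 0o644)
        }

        // setCPUBurst lets the group borrow up to burstUs of unused quota
        // from earlier periods, i.e. the "allowing bursts" part.
        func setCPUBurst(cg string, burstUs int) error {
            v := fmt.Sprintf("%d", burstUs)
            return os.WriteFile(cg+"/cpu.max.burst", []byte(v), 0o644)
        }

        func main() {
            cg := "/sys/fs/cgroup/kubepods.slice/example" // hypothetical path
            if err := setCPUMax(cg, 100000, 1000000); err != nil {
                panic(err)
            }
            if err := setCPUBurst(cg, 50000); err != nil {
                panic(err)
            }
        }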

    I even pursued my PhD on it until I quit (for unrelated reasons). There was a startup doing this with ML, but I forget their name.

    • jeffbee 2 hours ago

      I would say that this has relatively little to do with Kubernetes in the end. The Kubelet just turns the knobs and pulls the levers that Linux offers. If you understand how Linux runs your program, then what K8s does will seem obvious.

      A detail I would like to quibble about: GOMAXPROCS is not by default the number of CPUs "on the node" as the article states. It is the number of set bits in the task's CPU mask at startup. This will not generally be the number of CPUs on the node, since that mask is determined by the number of other tenants and their resource configurations. "Other tenants" includes the kubelet and whatever other system containers are present.
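
      For anyone who wants to check this on their own nodes, a minimal sketch (Linux only, using golang.org/x/sys/unix) that compares the value Go latched at startup with the live affinity mask:

          package main

          import (
              "fmt"
              "runtime"

              "golang.org/x/sys/unix"
          )

          func main() {
              // GOMAXPROCS(0) reports the current value without changing it;
              // by default it is derived from the CPU affinity mask at startup.
              fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
              fmt.Println("NumCPU:    ", runtime.NumCPU())

              // The live mask can differ from what was observed at startup.
              var set unix.CPUSet
              if err := unix.SchedGetaffinity(0, &set); err == nil {
                  fmt.Println("affinity bits now:", set.Count())
              }
          }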

      The problem with this default scheme is that GOMAXPROCS is latched in once at startup, but the actual CPU mask may change while the task is running. And if you start 100 replicas of something on 100 different nodes, they may each end up with a different GOMAXPROCS, which will affect the capacity of each replica. So it is better to explicitly set GOMAXPROCS to something reasonable.
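
      One way to set it explicitly, sketched under the assumption that cgroup v2 is mounted at /sys/fs/cgroup inside the container (a library like go.uber.org/automaxprocs covers more cases), is to derive GOMAXPROCS from the container's CPU quota instead of the host CPU count:

          package main

          import (
              "os"
              "runtime"
              "strconv"
              "strings"
          )

          // quotaProcs reads cpu.max ("<quota> <period>" in microseconds) and
          // returns ceil(quota/period), or 0 if the quota is "max" or unreadable.
          func quotaProcs() int {
              data, err := os.ReadFile("/sys/fs/cgroup/cpu.max")
              if err != nil {
                  return 0
              }
              fields := strings.Fields(string(data))
              if len(fields) != 2 || fields[0] == "max" {
                  return 0
              }
              quota, err1 := strconv.Atoi(fields[0])
              period, err2 := strconv.Atoi(fields[1])
              if err1 != nil || err2 != nil || period == 0 {
                  return 0
              }
              return (quota + period - 1) / period
          }

          func main() {
              if n := quotaProcs(); n > 0 {
                  runtime.GOMAXPROCS(n)
              }
              // ... rest of the program runs with a quota-aware setting.
          }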

      • Groxx 18 minutes ago

        Or just update it at runtime every minute or something.
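
        A rough sketch of that approach (Linux only, using golang.org/x/sys/unix): a background ticker that re-reads the affinity mask and applies it when it changes. runtime.GOMAXPROCS is safe to call while the program runs, though each change briefly stops the world.

            package main

            import (
                "log"
                "runtime"
                "time"

                "golang.org/x/sys/unix"
            )

            // refreshGOMAXPROCS re-reads the current CPU affinity mask and
            // applies it if it differs from what Go is currently using.
            func refreshGOMAXPROCS() {
                var set unix.CPUSet
                if err := unix.SchedGetaffinity(0, &set); err != nil {
                    return
                }
                if n := set.Count(); n > 0 && n != runtime.GOMAXPROCS(0) {
                    log.Printf("adjusting GOMAXPROCS to %d", n)
                    runtime.GOMAXPROCS(n)
                }
            }

            func main() {
                go func() {
                    for range time.Tick(time.Minute) {
                        refreshGOMAXPROCS()
                    }
                }()
                select {} // stand-in for the real workload
            }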