Linux has introduced Pressure Stall Information (PSI)[1], which has completely replaced the classic CPU/memory percentage metrics for me. With PSI you see up front where the bottleneck is, and it inherently accounts for factors like CPU frequency scaling or the disk cache.
The PSI data also has a value (total=) that is not averaged, so you can now see cpu jitter and spikes.
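For anyone who hasn't played with it, a minimal sketch of reading the CPU pressure file (assuming a kernel with PSI enabled, so /proc/pressure/cpu exists):

    # Read Linux PSI for the CPU. Each line looks like:
    #   some avg10=0.12 avg60=0.08 avg300=0.05 total=123456
    # "total" is cumulative stall time in microseconds and is never averaged,
    # so sampling it twice exposes short spikes that avg10/avg60/avg300 smooth over.
    import time

    def read_cpu_pressure(path="/proc/pressure/cpu"):
        stats = {}
        with open(path) as f:
            for line in f:
                kind, rest = line.split(None, 1)
                stats[kind] = {k: float(v) for k, v in
                               (kv.split("=") for kv in rest.split())}
        return stats

    before = read_cpu_pressure()
    time.sleep(1)
    after = read_cpu_pressure()
    # CPU stall time accumulated over the last second, in microseconds:
    print(after["some"]["total"] - before["some"]["total"])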
> What Activity Monitor actually shows as % CPU or “percentage of CPU capability that’s being used” is what’s better known as active residency of each core, that’s the percentage of processor cycles that aren’t idle, but actively processing threads owned by a given process. But it doesn’t take into account the frequency or clock speed of the core at that time, nor the difference in core throughput between P and E cores.
Does "%CPU" need to take into account these things?
I think it would be useful in many scenarios. For example, if I am running at near cores*100% I may think my system is fully loaded (or overloaded). But if those cores are running at low frequencies because there isn't actually any overload, I would want to know that. Because in that case, if I spawn twice as many tasks and get twice as much throughput while the CPU% doesn't change, that seems confusing.
I think if I had to pick a single number for reporting CPU usage it would be percent of available throughput. This has complications (the maximum frequency depends on external conditions like temperature), but 0% would mean "no runnable tasks" and 100% would mean that the CPUs are running at the maximum currently available speed. The values in between would be some sort of approximation, but I think that is fine.
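To illustrate the idea, here's a back-of-the-envelope sketch of that number on Linux (assuming cpufreq sysfs is exposed; cpuinfo_max_freq is only a stand-in for "maximum currently available speed", since boost and thermal limits make the real ceiling a moving target):

    # Rough "percent of available throughput": non-idle fraction of all CPUs,
    # weighted by current frequency relative to each core's advertised maximum.
    import glob, time

    def busy_fraction(interval=1.0):
        def snap():
            with open("/proc/stat") as f:
                vals = list(map(int, f.readline().split()[1:]))
            idle = vals[3] + vals[4]        # idle + iowait
            return idle, sum(vals[:8])      # skip guest fields (already folded into user/nice)
        i0, t0 = snap(); time.sleep(interval); i1, t1 = snap()
        return 1.0 - (i1 - i0) / (t1 - t0)

    def freq_ratio():
        ratios = []
        for cur in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq"):
            mx = cur.replace("scaling_cur_freq", "cpuinfo_max_freq")
            ratios.append(int(open(cur).read()) / int(open(mx).read()))
        return sum(ratios) / len(ratios)

    print(f"~{100 * busy_fraction() * freq_ratio():.1f}% of available throughput")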
It would be valuable to know that, but that also means knowing the power profile, cooling capability etc.
In general, understanding how much "compute capacity" you still have left is very much an experimental science with modern CPUs.
As an example of what I mean, MacBook Airs had exactly the same cores as MacBook Pros (up to 13" Pros iirc), but while you could max out one or two cores on both (and get "200% usage" that is comparable), going past that would mean entirely different things due to one being passively cooled.
With x64 platforms, the TDP discussion is even more relevant, even before getting to ARM's big.SMALL or Intel's P/E core distinction.
I knew something was off with "big.SMALL" and it's because it's really "big.LITTLE" (though not used very much nowadays as a moniker, it was prominent around a decade ago when it was being introduced into the market).
From an intuitive point of view, if you want to use %CPU as "how much of the total available processing power is this process using", it's potentially valuable. With the status quo, a process might appear to be a few multiples more CPU intensive than it really is, if the system happens to be relatively idle.
That said, it's not particularly easy to apply these corrections, especially because the available maximum clock speed depends on variables like the ambient temperature, how busy all the cores are, and how long the CPU has been boosting for. So if you were to apply these corrections, either you report that a fully loaded system is using less than 100% of possible available CPU power in a lot of cases, or your correction factors vary over time and are difficult to calculate.
> if you want to use %CPU as "how much of the total available processing power is this process using", it's potentially valuable ... the available maximum clock speed depends on variables like the ambient temperature, how busy all the cores are, and how long the CPU has been boosting for
I don't think the theoretical maximum clock speed is relevant to a %CPU measurement, for two reasons. First, as you note, it's subject to a lot of things that the user generally has no control over. Second, I suppose %CPU is meant to be a measure of current/real/instantaneous load -- what is, rather than what could be.
Raymond Chen has an article discussing whether or not this should be taken into account:
https://devblogs.microsoft.com/oldnewthing/20210629-00/?p=10...
Need? No, but it could be useful. Not a requirement, but a very “nice to have” property. It would reduce certain confusions for some end users, as well as being handy for us techie types.
Often you don't care that the current batch of processes is using 100% of what the CPU cores they are assigned to can do at current clock rates; what you want to know is how much is left available, so you can add more work without slowing any existing tasks down much.
It used to be, back when CPUs didn't have low & high power cores and always ran what they had at the same speed, that %CPU shown in various OS displays was a reasonably accurate measure of the impact of a process that could easily be used to judge optimisation success (getting the same done in less hardware effort) and scaling success (getting the same done in less wall-clock time by giving more hardware to the problem or improving parallelism to make better use of what you already have).
These days it is more complicated than most assume at face value, and you have to be a lot more careful when assessing such things to avoid incorrect assumptions leading to wrong decisions. It would be nice to get back to the previous state of affairs, in terms of a given % value meaning something more fixed. Of course that is not as practical to achieve as naively stating the problem might suggest: for a start you can't really state what 100% is because in many cases the maximum clock might only be achievable for very short periods before thermal throttling kicks in. Maybe if there is a “minimum maximum”, below which we know the throttle won't go, we could state that as 100% and display more when the heat limit is not taking effect, but I expect that really would confuse end users (I have memories of confused conversations when multi-core CPUs became common, when people saw displays of processes using ~200%, with that meaning ~100% of ~2 cores).
I guess not. I think the problem here is a bit more fundamental: people (read: at least 1 from a sample of 1 - me) think that the '% CPU' column in Activity Monitor shows how much of the computer's total processing power is being used by the process, when actually it's a much more complicated story. I don't think it's a bad thing that people learn more about what the metric actually means.
I at least found the article interesting, and learned something useful from it.
Yes. If my 10GHz CPU core says it's running at 100%, but is scaled down to 1Hz, I'll be really confused about how much work it is doing, and look in all the wrong places to find out why my process is taking forever to run.
(extreme numbers to highlight the point)
If your potentially-10GHz CPU core is being 100% utilized (i.e., is 0% idle), and yet is still scaled down to a sleepy 1Hz when it should be clocked faster than that, then: Your rig is broken.
(Is it the hardware that is broken? Software? Firmware? User configuration? Cooling? Power? Is it beyond your paygrade to make this determination?
In any or all cases, and at any clock speed: At 100% utilization, it is doing as much work as it can within the constraints of the operating environment in which it exists; one can't squeeze an extra instruction in edgewise without it getting in the way of existing work.
And no, perhaps displaying utilization as 100% won't help you at all with identifying and correcting the issue.
But the alternative of displaying utilization as 0.00000001% won't help you with that, either.)
I'd like it to show how much work it does, relative to how much work it could theoretically do in the current state, and/or how much more work I could give it. Not just "how utilized is it at the current clock speed, which could change at any moment".
That is: if we simplify the case to a single-core laptop which has 2 power states: plugged into the wall, it has a min freq of 1GHz and a max freq of 2GHz; on battery, it has a min freq of 500MHz and a max freq of 1GHz. It will scale this frequency if it can, but the external constraint of whether the power cord is connected is still a hard limit.
Now I want to know "how much work is being done" and "how much computing headroom do I have to add more work".
So in the case of the laptop being plugged in, if it's fully utilized and running at 1GHz then it should show 50%, even though it's fully utilized at the throttled clock speed, because the current clock speed is just half of the current theoretical max for the power state. As you say, it being fully utilized should very quickly cause it to throttle up to 2GHz or something is broken. But when it does throttle up to 2GHz, hopefully in a short amount of time, it will now have half the occupancy at 2GHz so it will keep showing 50%. And that's consistent with both 1) how much work it's actually performing, and 2) how much more work I can give it.
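To make the arithmetic concrete, a tiny sketch of that hypothetical laptop (all numbers made up for the example):

    # Utilization scaled by the current frequency relative to the max of the
    # *current* power state, so 100% means "no headroom left in this state".
    POWER_STATE_MAX_GHZ = {"plugged_in": 2.0, "battery": 1.0}

    def displayed_percent(busy_fraction, current_ghz, power_state):
        return 100.0 * busy_fraction * current_ghz / POWER_STATE_MAX_GHZ[power_state]

    # Fully busy but still throttled to 1GHz while plugged in -> 50%
    print(displayed_percent(1.0, 1.0, "plugged_in"))
    # After clocking up to 2GHz the same work occupies half the cycles -> still 50%
    print(displayed_percent(0.5, 2.0, "plugged_in"))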
Edit: Lol, Raymond Chen agrees and basically said exactly this in his blogpost. My comment was just a new old thing.
You can want whatever you want, including a new metric that measures and portrays things however you wish.
The existing metric is not what you want. And that's perfectly OK: There's absolutely no damned reason why there can only be one. This isn't Highlander.
Edit: Lol, the King of Egypt rang and basically said this while we were chatting. Also [additional appeal to authority] and [someone's mom].
What I meant was, I thought I had a proposal that was impossible, fringe, and at least somewhat novel. Seems it was none of those.
honestly if you want load avg why not just use load avg
I can't see a way to measure load average on a process by process basis as you can for % CPU.
I'm confused, isn't this exactly the same as for intel? Intel processors can turbo-boost, and you can manually cap the frequency by setting the package power-limit register (or well, you used to before some firmware update). That obviously doesn't change the % CPU reported either.
Yes, it's the same for Intel.
One quantitative difference is that tasks assigned to the E cores may run at a sustained frequency much lower than the maximum (3.7x here), while on Intel any sustained load generally results in the frequency scaling up, over a few hundred milliseconds, to a value which is much closer to the absolute max.
Yeah. When the article compares what Apple is doing to Intel, I think they mean “classic Intel processors of years ago” that didn’t frequency scale. Like the numbered Pentiums, IIRC. That did stand out to me a little bit.
You’re right that everyone has been doing this for quite a while, it’s certainly not an Apple invention.
The Eclectic Light Company often covers Mac stuff and Activity Monitor is the tool you use to view this stuff in macOS, so it’s just the domain this article is focusing on.
It's not as if they don't know that this is a false comparison, especially because the generations of Intel CPUs used in Macs also did frequency throttling.
"Unlike traditional Intel CPUs, CPU cores in Apple silicon chips can be run at a wide range of frequencies, as set by macOS."
The use of the word 'traditional' here is essentially gaslighting.
I don’t think so.
Like I said, I think it’s referring to much older generations.
It just seems like awkward phrasing.
Is there an alternative app that shows the better numbers?
I read the article and I still don't know the answer.
[stub for offtopicness]
> What Activity Monitor actually shows as % CPU or “percentage of CPU capability that’s being used” is what’s better known as active residency of each core, that’s the percentage of processor cycles that aren’t idle, but actively processing threads owned by a given process.
That's exactly what I thought it was. Where do I sign up for the refund?
I really hate this click-bait trend to assume what I do or do not know.
Yes, we've taken the linkbait 'you' out of the title now, and replaced it with a representative phrase from the article body.
Pretty sure it’s just scheduled CPU time / wall clock time. If you have multiple cores then scheduled CPU time can be greater than wall clock time.
Also, scheduled CPU time doesn’t take into account frequency scaling or core type, as explained in the article. It's just how much time the OS scheduler has allocated on a core to run the process's tasks.
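On Linux that definition is easy to reproduce by hand (macOS derives the same kind of number from different APIs); a rough sketch sampling utime+stime from /proc/<pid>/stat:

    # %CPU as "scheduled CPU time / wall clock time". A process saturating
    # two cores comes out near 200%, and nothing here knows about clock speed.
    import os, time

    CLK_TCK = os.sysconf("SC_CLK_TCK")   # scheduler ticks per second

    def cpu_seconds(pid):
        with open(f"/proc/{pid}/stat") as f:
            fields = f.read().rsplit(")", 1)[1].split()
        utime, stime = int(fields[11]), int(fields[12])
        return (utime + stime) / CLK_TCK

    def percent_cpu(pid, interval=1.0):
        t0 = cpu_seconds(pid); time.sleep(interval); t1 = cpu_seconds(pid)
        return 100.0 * (t1 - t0) / interval

    print(percent_cpu(os.getpid()))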
Why even comment if you're not going to read TFA which actively disputes your "isn't it just" assertion in its title?
There are too many comments parading "you haven't read the article" lately, even when replying to comments that reference parts of the article. It's one thing to complain about someone's takeaway, but another to only discuss an assumption about how they came to a different conclusion rather than just discussing the topic. The article title is also nothing to be proud of: it's 0% actionable information and 100% clickbait assumption - the person who wrote Activity Monitor gets assigned the same level of knowledge as someone who's never even thought about it before.
I'd be curious what GP means by the difference between allocated scheduler time vs the way the article describes it (non-idle process assignment time), though. It feels like those are 2 ways of saying the same thing? I agree with GP that the frequency part seems off, and it reads as a summary of the point made later in the article anyway. Particularly, the opening of that section:
> Unlike traditional Intel CPUs, CPU cores in Apple silicon chips can be run at a wide range of frequencies, as set by macOS.
Is already setting off alarms - in what way is that different from traditional Intel CPUs from the last 2 decades? A long enough time ago there weren't enough (or any) cores to put in clustered groups, but beyond that, frequency scaling and boosting per core or core group is just as common.
The author doesn't know what I know, yet the title suggests they do. That's the only assertion I see, and at least for me it isn't even a valid one.