I find this a highly interesting problem. How do you measure how much CPU time a program needs? The OS has ceded its control of the CPU to the program. Does it just look at the clock after it’s in charge again to derive a program’s load?
While we’re at it: How does the OS even yank the CPU away from the currently running process?
Well behaving programs give control back to the kernel as soon as they are done with what they are doing. If they don’t the control is forcefully taken away after some assigned time.
It looks something like this:
Something happens – e.g. a key is pressed – a process waiting for this event is woken up and gets e.g. 100ms to do it stuff. If it can handle the key press in 50ms, kernel notes it used 50 ms of CPU time and can give control to another process waiting for an event or busy with other work. If the key press triggered long computation the process won’t be done in 100ms, the kernel notes it used 100ms of CPU time and gives control to other processes with pending events or busy with other work.
After one second the kernel may have noted:Process A: used 50ms, then nothing, then 100ms, another 100ms and another 100ms
Process B: was constantly busy doing something, so it got allocated 6 * 100ms in that one second
Process C: just got one event and handled it in 50ms
Process D: was not waken at allSo total of 1000ms was used – the CPU was 100% busy
Of that 60% was process B, 35% process A and 5% process C.And then that information is read from the kernel by top and displayed.
How does the OS even yank the CPU away from the currently running process?
Interrupts. CPU has means triggering and interrupt at a specific time. Interrupt means that CPU stops what it is doing and runs selected piece of kernel code. This piece of kernel code can save the current state of user process execution and do something else or restore saved execution of another process.
Timer based interrupts are the foundation of pre-emptive multitasking operating systems.
You set up a timer to run every N milliseconds and generate an interrupt. The interrupt handler, the scheduler, decides what process will run during the next time slice (the time between these interrupts), and handles the task of saving the current process’ state and restoring the next process’ state.
To do that it saves all the CPU registers (incl stack pointer, instruction pointer, etc), updates the state of the process (runnable, running, blocked), and restores the registers for the next process, changes it’s state to running, then exits and the CPU resumes where the next process left off last time it was in a running state.
While it does that switcheroo, it can add how long the previous process was running.
The other thing that can cause a process to change state is when it asks for a resource that will take a while to access. Like waiting for keyboard input. Or reading from the disk. Or waiting for a tcp connection. Long and short of it is the kernel puts the process in a blocked state and waits for the appropriate I/O interrupt to put the process in a runnable state.
Or something along those lines. It’s been ages since I took an OS class and maybe I don’t have the details perfect but hopefully that gives you the gist of it.
as a non computer scientist who is programming multitasking applications, you did a good job at explaining context switching :)
I think a lot of modern kernels are “tickless” - they don’t use a timeslice timer, and only context switch on IO interrupt, process yield, or when timeouts are specifically requested (including capping cpu-bound processes). https://en.m.wikipedia.org/wiki/Tickless_kernel
Tickless means it’s not based on the computer frequency and idle CPUs can stay idle rather than being annoyingly brought into high power mode ever 100 Hz, but it’s still firing interrupts based on scaling timed variables.
They’re now called “Dynticks”
SUSE wrote the vaguely more understandable write up that Linux foundation links to: https://www.suse.com/c/cpu-isolation-full-dynticks-part2/
BTW, the Linux RCU code is evil but interesting: https://www.p99conf.io/session/how-to-avoid-learning-the-linux-kernel-memory-model/
Fascinating stuff. Obviously a lot has changed since I took an undergrad OS class lol. Hell, Linux didn’t even exist back then.
deleted by creator
changes its* state to running
changes its* state to running
Oh, because its* a pointer to state-change. Good catch!
How do you measure how much CPU time a program needs?
While I have no specific examples, that is the task of scheduling algorithms. The kernel is responsible for looking at running processes and figuring out how to assign it CPU time efficiently. That can include a variety of metrics, such as past behavior, if it is IO blocked, process priority, etc.
There is no perfect scheduling algorithm. Each one has tradeoffs depending on what the priority of the system is.
Also, you don’t have to relinquish all control to the process if you have multiple cores. If you do, I believe that the process is interrupted after some time to allow the kernel to always be able to check in, but again, it depends on implementation.
How does the OS even yank the CPU away from the currently running process?
That is called context switching. Simplified, a process is a list of instructions and a bundle of memory. The memory is composed of RAM and CPU registers (again, simplified). The process memory can stay in the same spot in RAM, but the registers need to move out of the way for another process to take its spot. When a process is yanked away, the state of the registers for that process is snapshotted and stored in RAM managed by the kernel. This allows another process to be allocated to that core without deleting important process data. To resume the paused process, you just need to restore the registers back to the snapshotted state and have the core execute the next instruction for that process, and the process would be none the wiser.
Of course, there’s a lot more that happens internally, but that’s the main gist.
Is there a perfect scheduler that is non-optimal in the Big(O) sense but is optimal if you’re looking at maximizing hardware utilization? In other words, scheduler that takes a long time to determine CPU utilization for each process, but provides an optimal total CPU utilization? I realize that it would not be ideal since we’d essentially have these “sudden stops” as it recalculates the schedule. I’m just more interested in the theory.
If you have a fixed collection of processes to run on a single processor and unlimited time to schedule them in, you can always brute force all permutations of the processes and then pick whichever permutation maximizes and/or minimizes whatever property you like. The problem with this approach is that it has awful time complexity.
How would you deal with iowaits in a system like that? I can perfectly burn 100% of CPU time running a poll(), but that’s not useful work…
¯\_(ツ)_/¯
I guess that’s why I asked. I’m just curious if it’s even possible.
An awesome find. Thank you very much!
yes, it looks at a fine-grained clock, usually a cycle counter provided by the CPU for this purpose, to aggregate total on-cpu time for each process.
Look up preemptive task scheduling. Basic processes will get interrupted after using a too long time slice.
How do you measure how much memory a process is using?
Shared libraries. Shared memory. Virtual. Physical. Memory mapped files. Some applications depend heavily on the availability of the file system cache for good performance, so should you consider that memory technically in use by that process? What if a process is providing a service to other applications and allocates memory on behalf of those apps to provide those services? If I ask the kernel to allocate lots of things for me to do my job should that really be my memory?
There are popular measures to get an idea of the memory consumed by a process, but none can tell the whole truth and nothing but.
I.e. could be the dress.
Hmm. It’s been a hot minute (ok 30 years) since I learned about this stuff and I don’t know how it works in Linux in excruciating detail. Just the general idea.
So I would be curious to know if I am off base… but I would guess that, since the kernel is in charge of memory allocation, memory mapping, shared libraries, shared memory, and virtualization, that it simply keeps track of all the associated info.
Info includes the virtual memory pages it allocates to a process, which of those pages is mapped to physical memory vs swap, the working set size, mapping of shared memory into process virtual memory, and the memory the kernel has reserved for shared libraries.
I don’t think it is necessary to find out how much is “technically in use” by a process. That seems philosophical.
The job of the OS is all just resource management and whether the systems available physical memory is used up or not. Because if the system spends too much time swapping memory pages in and out, it slows down everything. Shared memory is accounted for correctly in managing all that.
Maybe I am not understanding your point, but the file system is a different resource than memory so a file system cache (like a browser used) has zero impact on or relevance to memory usage as far as resource management goes.
sudo kill 4346
Literally
“sudo kill -9 4346” for additional brutality
221% cpu usage?
100% equals one full core. Higher numbers are possible for multithreaded processes.
Brother, let me show you this: https://i.imgur.com/PbU8U5n.png
Is that from one of those crazy lads who dared open more than one tab on Chrome?
That’d be the RAM column.
Lmao at RAM chips and CPU cola
Heh, freshly started Java. Hasn’t allocated all your ram and swap yet.
Missing the zombie process
Try btop. It looks nice…
I was wondering how to do this last night
How Nice