The CPU doesn't do the usage calculations by itself. It may have hardware features that make the task easier, but it's mostly the job of the operating system, so the implementation details vary (especially on multicore systems).
The general idea is to see how long the queue of things the CPU needs to do is. The operating system can look at the scheduler periodically to determine how many things are waiting to run.
As for the second part of your question, most modern operating systems are multitasking. That means the OS is not going to let programs take up all the processing time and leave none for itself (unless you make it do that). In other words, even if an application appears hung, the OS can still steal some time away for its own work.
loop that the operating system spins up. Your process is managed from within that loop. It allows external code to be executed directly on the processor in chunks. This is, of course, a gross oversimplification of what is actually going on.
This is my basic understanding from having a little exposure to similar code. Programs like Task Manager or your widget use system calls like NtQuerySystemInformation() and take the information gathered from the OS to calculate the percentage of time the CPU was idle or in use over a fixed interval. The CPU knows when it is idle, so it can also determine when it's not idle. These programs can indeed get clogged up. My crummy laptop's Task Manager freezes all the time while calculating CPU usage when it's topping out at 100%.
To get CPU usage, periodically sample the total process time, and find the difference.
For example, if these are the CPU times for process 1:
kernel: 1:00:00.0000
user: 9:00:00.0000
And then you obtain them again two seconds later, and they are:
kernel: 1:00:00.0300
user: 9:00:00.6100
You subtract the kernel times (for a difference of 0.03) and the user times (0.61), add them together (0.64), and divide by the sample time of 2 seconds (0.32).
So over the past two seconds, the process used an average of 32% CPU time.
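The arithmetic above can be written out as a short Python sketch. The function name cpu_percent and its parameters are made up for illustration; the sample values are the ones from the example.

```python
from datetime import timedelta

def cpu_percent(kernel_before, user_before, kernel_after, user_after, interval):
    """Average CPU usage (in percent) of one process between two samples."""
    # CPU time consumed in the interval = kernel delta + user delta.
    used = (kernel_after - kernel_before) + (user_after - user_before)
    # Dividing one timedelta by another yields a plain float ratio.
    return 100.0 * (used / interval)

# First sample: kernel 1:00:00.00, user 9:00:00.00
kernel_1 = timedelta(hours=1)
user_1 = timedelta(hours=9)
# Second sample, two seconds later: kernel +0.03 s, user +0.61 s
kernel_2 = timedelta(hours=1, seconds=0.03)
user_2 = timedelta(hours=9, seconds=0.61)

usage = cpu_percent(kernel_1, user_1, kernel_2, user_2, timedelta(seconds=2))
print(f"{usage:.0f}%")  # prints "32%"
```

Note that on a multicore machine a multithreaded process can exceed 100% with this formula, since it can accumulate more than one second of CPU time per second of wall-clock time.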
The specific system calls needed to get this info are (obviously) different on every platform. On Windows, you can use GetProcessTimes, or GetSystemTimes if you want a shortcut to total used or idle CPU time.
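For the system-wide shortcut, here is a rough sketch of the calculation on the deltas between two GetSystemTimes samples. It does not call the Windows API itself. The function name and the sample numbers are hypothetical. One detail worth knowing is that the kernel time reported by GetSystemTimes includes idle time, so idle has to be subtracted rather than added.

```python
def system_busy_percent(idle_delta, kernel_delta, user_delta):
    """Percent of total CPU time spent doing work between two samples.

    All arguments are differences between two samples, in the same unit.
    Assumption: kernel_delta already includes idle_delta, as it does for
    the values returned by GetSystemTimes on Windows.
    """
    total = kernel_delta + user_delta   # all CPU time in the interval
    if total <= 0:
        return 0.0                      # no time elapsed, nothing to report
    busy = total - idle_delta           # strip the idle part out of the total
    return 100.0 * busy / total

# Hypothetical deltas (seconds, summed over all cores):
# 1.5 s idle, 1.7 s kernel (idle included), 0.3 s user.
print(system_busy_percent(1.5, 1.7, 0.3))  # prints 25.0
```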
There's a special task called the idle task that runs when no other task can be run. The % usage is just the percentage of the time we're not running the idle task. The OS will keep a running total of the time spent running the idle task:
when we switch to the idle task, set t = current time
when we switch away from the idle task, add (current time - t) to the running total
If we take two samples of the running total n seconds apart, we can calculate the fraction of those n seconds spent running the idle task as (second sample - first sample)/n. The usage is whatever is left over: one minus that fraction.
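The bookkeeping described above can be simulated in a few lines of Python. This is only a toy model of what a scheduler might do, not real kernel code; the class and method names are invented, and the timestamps are passed in by hand instead of being read from a clock.

```python
class IdleAccounting:
    """Toy model of the running idle-time total kept by the OS."""

    def __init__(self):
        self.total_idle = 0.0     # running total of time spent in the idle task
        self._entered_at = None   # the "t" recorded when the idle task starts

    def switch_to_idle(self, now):
        # When we switch to the idle task, set t = current time.
        self._entered_at = now

    def switch_from_idle(self, now):
        # When we switch away, add (current time - t) to the running total.
        if self._entered_at is not None:
            self.total_idle += now - self._entered_at
            self._entered_at = None

def usage_percent(first_sample, second_sample, n):
    """CPU usage over n seconds, from two samples of the idle total."""
    idle_fraction = (second_sample - first_sample) / n
    return 100.0 * (1.0 - idle_fraction)

acct = IdleAccounting()
first = acct.total_idle        # sample taken at t = 0
acct.switch_to_idle(1.0)       # idle from t = 1 ...
acct.switch_from_idle(3.0)     # ... to t = 3 (2 seconds)
acct.switch_to_idle(6.0)       # idle from t = 6 ...
acct.switch_from_idle(9.0)     # ... to t = 9 (3 seconds)
second = acct.total_idle       # sample taken at t = 10

print(usage_percent(first, second, 10.0))  # prints 50.0
```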
Note that this is something the OS does, not the CPU. The concept of a task doesn't exist at the CPU level! (In practice, the idle task will put the processor to sleep with a HLT instruction, so the CPU does know when it isn't being used)
As for the second question, modern operating systems are preemptively multi-tasked, which means the OS can switch away from your task at any time. How does the OS actually steal the CPU away from your task? Interrupts: http://en.wikipedia.org/wiki/Interrupt
Pick a sampling interval, say every 5 min (300 seconds) of real elapsed time. You can get this from gettimeofday.
Get the process time you've used in those 300 seconds. You can use the times() call to get this. That would be new_process_time - old_process_time, where old_process_time is the process time you saved at the end of the previous interval.
Your cpu percentage is then (process_time/elapsed_time)*100.0
You can set an alarm to signal you every 300 seconds to make these calculations.
I have a process that I do not want to use more than a certain target CPU percentage. This method works pretty well, and agrees closely with my system monitor. If we're using too much CPU, we usleep for a little.
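Here is roughly what that looks like in Python, as a sketch rather than the exact code I run. It assumes os.times() as the stand-in for times(), time.monotonic() in place of gettimeofday for the elapsed time, and time.sleep() in place of usleep. The class and function names are made up, and there is no alarm signal here: you call sample() yourself whenever the interval is up.

```python
import os
import time

def process_cpu_seconds():
    """User + system CPU time consumed by this process so far."""
    t = os.times()
    return t.user + t.system

class CpuMonitor:
    """Remembers the previous sample so each call covers one interval."""

    def __init__(self):
        self.last_wall = time.monotonic()
        self.last_cpu = process_cpu_seconds()

    def sample(self):
        now_wall = time.monotonic()
        now_cpu = process_cpu_seconds()
        elapsed = now_wall - self.last_wall   # real elapsed time
        used = now_cpu - self.last_cpu        # new_process_time - old_process_time
        self.last_wall, self.last_cpu = now_wall, now_cpu
        # (process_time / elapsed_time) * 100.0, guarding against a zero interval
        return 100.0 * used / elapsed if elapsed > 0 else 0.0

def throttle(percent, target_percent, nap_seconds=0.05):
    """Sleep briefly if we are over the target; return True if we slept."""
    if percent > target_percent:
        time.sleep(nap_seconds)
        return True
    return False

monitor = CpuMonitor()
sum(i * i for i in range(200_000))   # some busy work to measure
percent = monitor.sample()
throttle(percent, target_percent=50.0)
print(f"CPU usage over the last interval: {percent:.1f}%")
```

With a short interval like this the number is noisy, because the CPU-time counters have limited resolution. A longer interval such as the 300 seconds above gives a steadier figure.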
The CPU does not get 'hung up'; it is simply operating at peak capacity, meaning it is processing as many instructions as it physically can every second.
The process that calculates CPU usage accounts for some of those instructions. If applications ask for more work than the CPU can do, they are simply delayed, which is what makes them look 'hung up'.
The calculation of CPU utilization is based on the total available utilization. So if a CPU has two cores, and one core has 30% usage, and the other is 60%, the overall utilization is 45%. You can also see the usage of each individual core.
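As a quick illustration, the overall figure is just the mean of the per-core figures. The function name is made up for this sketch, and it assumes every core is weighted equally.

```python
def overall_utilization(per_core_percent):
    """Overall utilization as the plain average of per-core percentages."""
    if not per_core_percent:
        raise ValueError("need at least one core")
    return sum(per_core_percent) / len(per_core_percent)

# Two cores, one at 30% and the other at 60%.
print(overall_utilization([30.0, 60.0]))  # prints 45.0
```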