Context:
I'm academically interested in tracking/identifying UNIX processes in a way that is proof against PID wraparound. To start tracking a process by PID, I need to be able to conclusively identify it on the system.
Thus, I need a function, get_identity, that takes a PID, and only returns once it has determined a system-wide unique identity for that PID. The function should work on all or most POSIX-compliant systems.
The only immutable values in the process table that I know of are PID and start time. However, the following scenario poses a problem:
- User calls get_identity(pid).
- get_identity reads the start time, in seconds since the epoch, of pid, if it exists, and returns the hopefully-unique tuple [pid, starttime] (this is what the excellent psutil Python library considers "unique enough", so it should be pretty robust).
- Within a second of that call, PID wraparound occurs on the system, and pid is recycled.
- The [pid, starttime] tuple now refers to a different process than was present at the call to get_identity.
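To make the collision concrete, here is a minimal sketch in Python. The helper name coarse_identity and the PID and timestamp values are made up for illustration; the only assumption is that the start time is truncated to whole seconds before it goes into the tuple.

```python
def coarse_identity(pid: int, start_time: float) -> tuple:
    """Hypothetical identity: the PID plus the start time truncated to whole seconds."""
    return (pid, int(start_time))


# Two different processes that happen to receive the same PID value and
# start within the same wall-clock second (timestamps are invented).
original = coarse_identity(4242, 1_700_000_000.10)
recycled = coarse_identity(4242, 1_700_000_000.90)

# The tuples compare equal, so the two processes cannot be told apart.
print(original == recycled)
```

With sub-second start times the two tuples would differ, which is why the resolution of the start time matters here.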
While it is extremely improbable for PID wraparound to occur and re-use the selected PID within a second of its being identified, it is not impossible . . . right?
Questions:
- Is there a guarantee on UNIX/POSIX-compliant systems that the start time of a PID will be different between wraparound-caused re-uses of that same PID value?
- If not, how can I uniquely identify a process on a wraparound-prone system?
What I've Tried:
- I can simply sleep for a second after examining the target process. If the start-time-in-seconds is the same after the sleep, then it's either the same process that I started watching, or the PID has wrapped around to a different one but the system cannot tell the difference. If the start time has changed, I can return an error, or start over. However, this requires my identification function to wait for up to 1 second before returning, which is not ideal.
- times() returns values in clock ticks, which I can convert to seconds. Assuming that the starttime-in-seconds of a process is based on the same clock that times uses, and assuming that all UNIXes use the same rounding logic to convert from clock ticks -> fractional seconds -> whole seconds, I could theoretically use this information to reduce the duration of the sleep in the above workaround to the time until the next "full second boundary according to the process table". However, the worst-case sleep time would still be nearly 1 second, so this is not ideal.
- On Linux, I can get the starttime in clock ticks (or jiffies, on kernels before 2.6) from the /proc/$pid/stat file. With that information, my program could wait one tick, check the starttime again, and, if it was the same, determine identity. This correctly solves my problem (1 tick + overhead is a fast enough runtime), but only on Linux; other UNIX platforms may not have /proc. On BSD, that information is available via the kvm subsystem or via sysctls. On other Unixes . . . who knows? I'd need to develop multiple platform-specific implementations to gather this data, which is something I'd prefer to avoid.
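For reference, the Linux-only variant from the last bullet might look roughly like the sketch below. The function names and the read_start/sleep parameters are my own (hypothetical, added so the logic can be exercised without a real process). It assumes that starttime is field 22 of /proc/$pid/stat and is expressed in units of 1/sysconf(SC_CLK_TCK) seconds, as documented in proc(5). It also does not handle the target exiting mid-check, in which case open() simply raises.

```python
import os
import time


def parse_starttime_ticks(stat_text: str) -> int:
    """Return field 22 (starttime) from the contents of /proc/<pid>/stat."""
    # Field 2 (the command name) is parenthesised and may itself contain
    # spaces or ')', so cut after the last ')' instead of naively splitting.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
    return int(rest[19])


def read_starttime_ticks(pid: int) -> int:
    """Linux only: read the start time of pid, in clock ticks since boot."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_starttime_ticks(f.read())


def get_identity(pid, read_start=read_starttime_ticks, sleep=time.sleep):
    """Return (pid, starttime) once two reads taken one tick apart agree."""
    tick_seconds = 1.0 / os.sysconf("SC_CLK_TCK")
    while True:
        first = read_start(pid)
        # Wait out the tick in which the PID could still be recycled
        # with an identical starttime value.
        sleep(tick_seconds)
        if read_start(pid) == first:
            return (pid, first)
        # The PID was recycled between the two reads; start over.
```

Calling get_identity(os.getpid()) on a Linux box should return within roughly one tick plus overhead. Everything outside read_starttime_ticks is platform-neutral, so a BSD port would in principle only need a different reader.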