
Note: Windows is a registered trademark of Microsoft Corporation in the United States and other countries. The Windows Timestamp Project is an independent publication and is not affiliated with, nor has it been authorized, sponsored, or otherwise approved by Microsoft Corporation.

Microsecond Resolution Time Services for Windows

Arno Lentfer, June 2012

1.  Abstract

Various methods for obtaining high resolution time stamping on Windows have been described. The most promising implementations have been proposed by W. Nathaniel Mills: "When microseconds matter" (2002) and Johan Nilsson: "Implement a Continuously Updating, High-Resolution Time Provider for Windows" (2004).

Suggested auxiliary initial reading: Keith Wansbrough: "Obtaining Accurate Timestamps under Windows XP" (2003), MSDN: "Guidelines for Providing Multimedia Timer Support", and Chuck Walbourn: "Game Timing and Multicore Processors" (2005).

A substantial amount of time and effort has been spent on the attempt to get a proper high resolution time service implemented for Windows. However, the performance of these implementations is still not satisfactory. The complexity arises from the variety of Windows versions running on an even greater variety of hardware platforms.

Proper implementation of an accurate time service for Windows will be discussed and diagnosed within the Windows timestamp project. Test code will be released to prove functionality on a broader range of hardware platforms. Besides the timestamp functionality, high resolution (microsecond) timer functions are also discussed.

2.  Resources

Time resources on Windows are mostly interrupt-controlled entities and therefore show a certain granularity. Typical interrupt periods are 10 ms to 20 ms. The interrupt period can be set to 1 ms, or even a little below 1 ms, by using API calls to NtSetTimerResolution or timeBeginPeriod. For several reasons, however, it cannot and shall never be set to anything near the 1 μs regime. The best resolution to observe by means of Windows time services is therefore in the 1 ms regime.

The best resource for retrieving the system time is the GetSystemTimeAsFileTime API. It is a fast access API that is able to hold sufficiently accurate (100 ns units) values in its arguments. The alternative API is GetSystemTime, which is 20 times slower, has double the structure size, and does not provide a well-suited data format.

An interrupt-independent system resource, the performance counter, is used to extend the accuracy into the microsecond regime. The performance counter API provides the asynchronous calls QueryPerformanceCounter and QueryPerformanceFrequency. A virtual counter delivers a performance counter value, which increases at the performance counter frequency. The frequency is typically a few MHz and can therefore open the microsecond regime. The counter parameters are typically backed by a physical counter, but they are not necessarily independent of the version of the operating system. A hardware platform can deliver different performance counter frequencies when running Windows 7 or Windows Vista, for example.

The Sleep() API and the WaitableTimer API are further timing resources in the context of this project. Their functionality and their behavior also need to be looked at.

2.1.  GetSystemTimeAsFileTime API

The GetSystemTimeAsFileTime API provides access to the system's real-time clock (RTC). It is declared as

void WINAPI GetSystemTimeAsFileTime(OUT LPFILETIME lpSystemTimeAsFileTime);

with its argument of type

typedef struct _FILETIME {
  DWORD dwLowDateTime;
  DWORD dwHighDateTime;
} FILETIME;

A 64-bit FILETIME structure receives the system time as the number of 100 ns units that have elapsed since Jan 1, 1601. After some 400 years, about 1.28×10^10 seconds or 1.28×10^17 100 ns slices have accumulated. The 64-bit value can hold almost 2×10^19 100 ns time slices. The remaining time before this scheme wraps would be about 58,000 years from now. The call to GetSystemTimeAsFileTime typically requires 10 ns to 15 ns.
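The two 32-bit members are usually combined into one 64-bit integer before any arithmetic is done. The following is a minimal portable sketch of that step, not code from the project: it takes the two halves as plain integers instead of a FILETIME, and the helper names filetime_to_u64 and filetime_to_unix_seconds are hypothetical. The offset constant is the number of 100 ns units between Jan 1, 1601 and Jan 1, 1970.

```c
#include <stdint.h>

/* 100 ns units between 1601-01-01 and 1970-01-01 (UTC):
   134,774 days x 86,400 s x 10,000,000. */
#define FILETIME_UNIX_EPOCH_OFFSET 116444736000000000ULL

/* Combine dwLowDateTime and dwHighDateTime into one 64-bit count of
   100 ns units since January 1, 1601. (Hypothetical helper name.) */
static uint64_t filetime_to_u64(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/* Convert a 64-bit file time to whole seconds since January 1, 1970.
   Assumes the file time is not earlier than 1970. */
static uint64_t filetime_to_unix_seconds(uint64_t file_time)
{
    return (file_time - FILETIME_UNIX_EPOCH_OFFSET) / 10000000ULL;
}
```

On Windows the arguments would be the dwLowDateTime and dwHighDateTime members filled in by GetSystemTimeAsFileTime.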

In order to investigate the real accuracy of the system time provided by this API, the granularity that comes along with the time values needs to be discussed. In other words: How often is the system time updated? A first estimate is provided by the hidden API call:

NTSTATUS NtQueryTimerResolution(OUT PULONG MinimumResolution,
  OUT PULONG MaximumResolution,
  OUT PULONG ActualResolution);

NtQueryTimerResolution is exported by the native Windows NT library NTDLL.DLL. The ActualResolution reported by this call represents the update period of the system time in 100 ns units, which obviously does not necessarily match the interrupt period. The value depends on the hardware platform. Common hardware platforms report 156,250 or 100,144 for ActualResolution; older platforms may report even larger numbers. This is one of the heartbeats controlling the system. The MinimumResolution and the ActualResolution are relevant for the multimedia timer configuration. Two common hardware platform configurations are discussed here to highlight the details to be dealt with:

Platform configuration A

- MinimumResolution:156,250
- MaximumResolution:10,000
- ActualResolution:156,250

Platform configuration B

- MinimumResolution:100,144
- MaximumResolution:10,032
- ActualResolution:100,144

Platform A simply has 64 timer interrupts per second (64 x 156,250 x 100 ns = 1 s), but when looking at platform B the difficulties become more obvious: 99.856 interrupts per second? Answer: The full second interrupt is not available on all platforms.
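The interrupt rates quoted for the two platforms follow directly from the reported resolution values, which are periods in 100 ns units. A small sketch of the conversion is given here for illustration; the function names period_ms and interrupts_per_second are hypothetical and not part of any Windows API.

```c
#include <stdint.h>

/* Period in milliseconds for a resolution given in 100 ns units
   (10,000 units per millisecond). */
static double period_ms(uint32_t resolution_100ns)
{
    return (double)resolution_100ns / 10000.0;
}

/* Interrupts per second for a resolution given in 100 ns units
   (10,000,000 units per second). */
static double interrupts_per_second(uint32_t resolution_100ns)
{
    return 10000000.0 / (double)resolution_100ns;
}
```

For platform A, 156,250 gives a period of 15.625 ms and exactly 64 interrupts per second. For platform B, 100,144 gives roughly 99.856 interrupts per second, which is why no interrupt falls exactly on the full second.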

However, the system time may be updated at these interrupt events. An API call to

BOOL WINAPI GetSystemTimeAdjustment(OUT PDWORD lpTimeAdjustment,
  OUT PDWORD lpTimeIncrement,
  OUT PBOOL lpTimeAdjustmentDisabled);

will disclose the time adjustment and time increment values. The actual purpose of this call is to query the status of the system time correction, which is active when TimeAdjustmentDisabled is FALSE. When TimeAdjustmentDisabled is TRUE, no adjustment takes place; TimeAdjustment and TimeIncrement are equal and report exactly what was read as ActualResolution before. For a platform A type system, the call will report that the system time is incremented by 156,250 100 ns units every 156,250 100 ns units. Within this description, this is considered the granularity of the system time.

Knowing the system time granularity raises doubts about its accuracy. Certainly, the TimeIncrement will be applied, thus changes of the system time will always be one TimeIncrement, but does the interrupt period or any multiple of it always match the time increment?

Even when the standard setting of ActualResolution corresponds to the MinimumResolution, the ActualResolution may have a setting different from MinimumResolution (see table below). In fact it may be configured to values in the range from MinimumResolution to MaximumResolution. The ActualResolution determines the interrupt period of the system. That is the period after which the timer generates an interrupt to let the system react. The ActualResolution can be set by using the API call

NTSTATUS NtSetTimerResolution(IN ULONG RequestedResolution,
  IN BOOLEAN Set,
  OUT PULONG ActualResolution);

or via the multimedia timer interface

MMRESULT timeBeginPeriod(UINT uPeriod);

with the value of uPeriod derived from the range allowed by

MMRESULT timeGetDevCaps(LPTIMECAPS ptc, UINT cbtc );

which fills the structure

typedef struct {
  UINT wPeriodMin;
  UINT wPeriodMax;
  } TIMECAPS;

Typical values are 1 ms for wPeriodMin and 1,000,000 ms for wPeriodMax. The 1,000 s period for wPeriodMax is somewhat meaningless within the context of this description. However, the possibility of setting the timer resolution to 1 ms requires a more detailed investigation. When the multimedia timer interface is used to set the multimedia timer to wPeriodMin, the ActualResolution received by a call to NtQueryTimerResolution will show a new value. For the two platform configurations discussed, the examples are as follows:

Platform configuration    A          B
MinimumResolution         156,250    100,144
MaximumResolution          10,000     10,032
ActualResolution          156,250    100,144

ActualResolution varies according to the varying multimedia timer periods uPeriod applied by the timeBeginPeriod() API:

Platform configuration    A          B          uPeriod
ActualResolution            9,766     10,032      1 ms
ActualResolution           19,532     20,064      2 ms
ActualResolution           19,532     30,096      3 ms
ActualResolution           39,063     39,952      4 ms
ActualResolution           39,063     49,984      5 ms
ActualResolution           39,063     60,016      6 ms
ActualResolution           39,063     70,048      7 ms
ActualResolution          156,250     80,080      8 ms
ActualResolution          156,250     89,936      9 ms
ActualResolution          156,250    100,144     10 ms
ActualResolution          156,250    100,144     11 ms
ActualResolution          156,250    100,144     12 ms
ActualResolution          156,250    100,144    100 ms

This list shows the supported interrupt periods for platforms of type A and B in 100 ns units. Platform A only supports four different interrupt heartbeat frequencies, while platform B has a better approximation to the desired period. The specific numbers are relevant for the procedures described here and thus need a detailed interpretation.

Note: TimeIncrement provided by GetSystemTimeAdjustment and ActualResolution provided by NtQueryTimerResolution are not necessarily identical.

2.1.1.  ActualResolution on Platform Type A

The timer intervals are given with 100 ns accuracy in the last digit. Since the true ActualResolution cannot be expressed correctly in these units, the call to NtQueryTimerResolution reports the rounded value of 0.9766 ms rather than the true ActualResolution of 0.9765625 ms. The other values are also rounded (they should be 1.953125 ms and 3.90625 ms respectively).

A quick test using the Sleep(dwMilliseconds) API confirms this assumption:

Sleep(1) = 1.9531 ms = 2 x 0.9765625 ms

Sleep(2) = 2.9295 ms = 3 x 0.9765625 ms

Sleep(3) = 3.9062 ms = 4 x 0.9765625 ms

Sleep() will only return when n x ActualResolution exceeds the desired duration. The required accuracy for the interval specification would have to extend to 0.5 ns, in other words show the 100 ps digit. The number would be 156,250,000 for the MinimumResolution and 9,765,625 for the MaximumResolution (in 100 ps or 10^-10 s units).

Note: Sleep(1) measurements (10,000, with 100 ahead) result in a mean delay of 1953.163824 μs. This is 2.0000397 times the interrupt time slice (should have been 1953.125 μs, so the measurement was off by 0.04 μs).
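The Sleep() results above can be reproduced with integer arithmetic in 100 ps units. The sketch below is an illustration only, under two assumptions: the call starts exactly at an interrupt, and the text is read literally so that n x ActualResolution must strictly exceed the requested delay. The names sleep_interrupts and sleep_duration are hypothetical.

```c
#include <stdint.h>

/* Durations are in 100 ps units (10^-10 s), fine enough to express
   0.9765625 ms exactly as 9,765,625. One millisecond is 10,000,000. */
#define UNITS_PER_MS 10000000ULL

/* Smallest number of interrupt periods n with n * period > requested.
   Assumption: the sleep starts exactly at an interrupt. */
static uint64_t sleep_interrupts(uint64_t requested, uint64_t period)
{
    return requested / period + 1;
}

/* Resulting delay in 100 ps units. */
static uint64_t sleep_duration(uint64_t requested, uint64_t period)
{
    return sleep_interrupts(requested, period) * period;
}
```

With a period of 9,765,625 units, requests of 1 ms, 2 ms and 3 ms yield 2, 3 and 4 interrupt periods, matching the measured 1.9531 ms, 2.9295 ms and 3.9062 ms.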

2.1.2.  ActualResolution on Platform Type B

An interrupt timer period of 1.0032 ms will accumulate 10.032 ms after 10 interrupts and change the system time by 10.0144 ms. A time change of 10.0144 ms after 10.032 ms means that the time is behind by 17.6 μs. At the 57th of such periods, the deviation has accumulated to 1.0032 ms, which is exactly one timer interrupt period, and the time will be updated after just 9 interrupts (9.0288 ms). This way the time is updated by 10.0144 ms 56 times after 10.032 ms and one time after 9.0288 ms, which is a total elapsed time of 570.8208 ms with an adjustment of 57*10.0144 ms = 570.8208 ms. This corresponds to a total number of interrupts of 569 (57*100,144 = 569*10,032). As a result, the time will lose 17.6 μs for each of the 56 consecutive system time updates and then gain 985.6 μs in the 57th interrupt interval.
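The update pattern described for platform B can be checked with a short simulation. This is a sketch under an assumed rule, namely that the system time advances by one increment at the first interrupt at which the elapsed time has reached the next update value; the structure UpdateStats and the function simulate_updates are hypothetical names. All values are in 100 ns units.

```c
#include <stdint.h>

/* Summary of a simulated run (all times in 100 ns units). */
typedef struct {
    uint32_t updates;        /* number of system time updates */
    uint32_t short_updates;  /* updates after fewer interrupts than the first one */
    int64_t  max_lag;        /* largest lag left right after an update */
    int64_t  final_lag;      /* elapsed time minus system time at the end */
} UpdateStats;

/* Assumed rule: at each interrupt, the system time is advanced by
   `increment` if the elapsed time has reached the next update value. */
static UpdateStats simulate_updates(int64_t period, int64_t increment,
                                    uint32_t interrupts)
{
    UpdateStats s = {0, 0, 0, 0};
    int64_t elapsed = 0, system_time = 0;
    uint32_t since_update = 0, first_interval = 0;

    for (uint32_t i = 0; i < interrupts; i++) {
        elapsed += period;
        since_update++;
        if (elapsed - system_time >= increment) {
            system_time += increment;
            s.updates++;
            if (first_interval == 0)
                first_interval = since_update;
            else if (since_update < first_interval)
                s.short_updates++;
            if (elapsed - system_time > s.max_lag)
                s.max_lag = elapsed - system_time;
            since_update = 0;
        }
    }
    s.final_lag = elapsed - system_time;
    return s;
}
```

Running simulate_updates(10032, 100144, 569) gives 57 updates, exactly one of which arrives after 9 instead of 10 interrupts, a maximum lag of 9,856 units of 100 ns after the 56th update, and a final lag of zero.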

2.1.3.  Changes of System File Time

The system time changes according to the described mechanisms after a certain period of time. Additional time changes happen when periodic time corrections are applied continuously to the system time over a longer period in order to adjust it to an external time reference. The occurrence and the parameters of this adjustment can be obtained by a call to GetSystemTimeAdjustment. Sudden time changes, for example those introduced by using the clock GUI or SetSystemTime(…), are not announced or predictable; they happen spontaneously.

Changes of the system time will have no influence on the expiration of Sleep periods or waitable timer periods. The actual change will be taken over by the routines here. Nevertheless, system time changes are discontinuities in time, whether they are sudden or spread over a longer period of time. What is an accurate time stamp supposed to deliver when the system inserts several hundred seconds at an interval of 1.0000032 s? The system will assume that the seconds are that long (elongated) for the time being. This can be accomplished by the temporary adaptation of the performance counter frequency to the applied granular time correction.

2.2.  The Sleep API

The Sleep function suspends the execution of the current thread for a specified interval.

VOID Sleep(DWORD dwMilliseconds);

This would indeed be a very useful function if it were doing what it is supposed to do. Unfortunately, a detailed view discloses some artifacts, some of which are helpful and others that are not. The Sleep() function is backed by the system's interrupt services. As described in section 2.1, the interrupt period can be configured to some extent. This has a direct impact on Sleep(). The call to Sleep() passes the parameter dwMilliseconds to the system and expects the function to return after dwMilliseconds. In practice, Sleep() only returns when two conditions are met: firstly, the requested delay must have expired, and secondly, an interrupt must have occurred (the test to see whether the requested delay has expired is only done at an interrupt). A simple Sleep(1) call may therefore have a number of different results. The results also depend on the time at which the call was made with respect to the interrupt period phase.

Say the ActualResolution is set to 156,250, the interrupt heartbeat of the system will run at 15.625 ms periods or 64 Hz and a call to Sleep is made with a desired delay of 1 ms. Two scenarios are to be looked at:

The observed delay heavily depends on the time at which the call was made. This matters particularly when the desired delay is shorter than the ActualResolution. However, when the ActualResolution is set to MaximumResolution, the system runs at its maximum interrupt frequency and the deviations are in the order of one interrupt period.

This behavior can be used to synchronize code with the interrupt period in an easy way by simply calling two or more consecutive sleeps. Regardless of what ΔT is, the first will end at the time of an interrupt. Consequently the following sleep call will start at the interrupt time (or at least so close to it that the system will assume that it happened at the same time). As a result a ΔT = 0 applies and the sleep will return when N x ActualResolution becomes larger than the desired period. Right after the return of a sleep, the system has just processed an interrupt. Conditional latency may be on board due to a priority and/or task/process switching delay or due to interrupt handler CPU capture reasons. Typical latencies of a few μs can be observed with very little implementation effort.

A special case is the call Sleep(0). It looks meaningless, but it is a very powerful tool since it relinquishes the remainder of the thread's time slice. That means that other threads of equal priority level will take over when ready to run. When a number of threads are running at the same priority level and all of them are very responsive, all of them will make frequent calls to Sleep(0) whenever they can afford it. As a result, a task switch can be forced to happen in just a few μs.

2.3.  The WaitableTimer API

Another important mechanism for performing timed operations is provided by the waitable timer interface:

HANDLE WINAPI CreateWaitableTimer(IN LPSECURITY_ATTRIBUTES lpTimerAttributes,
  IN BOOL bManualReset,
  IN LPCTSTR lpTimerName);

The returned handle is used to setup a timer function:

BOOL WINAPI SetWaitableTimer(IN HANDLE hTimer,
  IN const LARGE_INTEGER* pDueTime,
  IN LONG lPeriod,
  IN PTIMERAPCROUTINE pfnCompletionRoutine,
  IN LPVOID lpArgToCompletionRoutine,
  IN BOOL fResume);

This tool can be used in a variety of ways. Below are just a few things that need to be mentioned within the scope of this description:

The expired (signaled) timer can be handled by means of an asynchronous procedure call (APC) or by means of a call to WaitForSingleObject, for example. According to the last point above, the former is useless when high accuracy is required. The latter suits the needs of the mechanisms described here much better. The API needs the handle to the object to wait for and allows specifying a timeout dwMilliseconds, which can optionally be set to INFINITE.

DWORD WINAPI WaitForSingleObject(IN HANDLE hHandle,IN DWORD dwMilliseconds);

Waitable timers synchronize to the rhythm of the system's interrupt period (ActualResolution). This has to be kept in mind because it has severe implications for the system's overall performance. All of the tasks waiting for a Sleep() or a timer to reach a signaled state will continue after the interrupt has occurred. The system's load tends to reach peaks at interrupts.

2.4.  The QueryPerformanceCounter and QueryPerformanceFrequency API

This API is backed by a virtual counter running at a "fixed" frequency started at boot time. The following two basic calls are used to explore the microsecond regime: QueryPerformanceCounter() and QueryPerformanceFrequency(). The counter values are derived from some hardware counter, which is platform dependent. However, the Windows version also influences the results by handling the counter in a version-specific manner. Windows 7, in particular, has introduced a new way of supplying performance counter values.

2.4.1.  QueryPerformanceCounter

The call to

BOOL QueryPerformanceCounter(OUT LARGE_INTEGER *lpPerformanceCount);

will update the content of the LARGE_INTEGER structure PerformanceCount with a count value. The count value is initialized to zero at boot time.

2.4.2.  QueryPerformanceFrequency

The call to

BOOL QueryPerformanceFrequency(OUT LARGE_INTEGER *lpFrequency);

will update the content of the LARGE_INTEGER structure PerformanceFrequency with a frequency value. The frequency is treated by the system as a constant.

2.4.3.  Performance of the Performance Counter

The range in time that can be held by the LARGE_INTEGER structure PerformanceCount depends on the update rate or the frequency at which the count is incremented. Depending on the hardware platform, the counter may be an Intel 8254 at about 1,193,182 Hz, an ACPI Power Management Timer chip with an update frequency of 3,579,545 Hz, or even another source. A number of platforms do not have these timers at all; they mimic the timer by providing the CPU clock. As a result of the latter, the frequency can get into the GHz range. PerformanceCount.QuadPart (signed) will change sign after 2^63 increments. At a frequency of, say, 1 GHz (10^9 s^-1), such a system can run for about 290 years without reaching the sign bit. Even for multi-GHz platforms, there does not seem to be a serious limit.

However, apart from the system's treatment, the frequency cannot be considered constant. Firstly, the frequency generating hardware will deviate from the specified value by an offset, and secondly, the frequency may vary (e.g., due to thermal drift). The impact of these deviations is not negligible. Oscillators have tolerances in the range of a few ppm and would consequently introduce errors of a few μs/s in the measured time period. Within this description the performance counter will be used to predict time intervals over a few seconds at accuracies better than 1 μs. If an accuracy of 0.1 μs is to be reached after 10 s, the frequency needs to be known to 0.01 ppm, which corresponds to 0.035 Hz at a nominal frequency of 3,579,545 Hz. Obviously, that value is not provided by the system and needs to be calibrated. A first estimate of the true frequency can be obtained by querying two counter values a certain (known) time apart from each other. The code snippet uses the API call

DWORD timeGetTime(VOID);

and could look like this:

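DWORD ms_begin, ms_end;
LARGE_INTEGER count_begin, count_end;
double ticks_per_second;
/* Take a millisecond time and a counter value, about one second apart. */
ms_begin = timeGetTime();
QueryPerformanceCounter(&count_begin);
Sleep(1000);
ms_end = timeGetTime();
QueryPerformanceCounter(&count_end);
/* Counter ticks per elapsed millisecond, scaled to ticks per second. */
ticks_per_second = 1000.0 * (double)(count_end.QuadPart - count_begin.QuadPart) / (double)(ms_end - ms_begin);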

However, due to the artifacts described in 2.2, timeGetTime() is accompanied by an inaccuracy of up to 2 ms, thus a Sleep(1000) would give an accuracy for ticks_per_second of 0.002 (2,000 ppm) at best. An accuracy of 2 ppm would be achievable when the Sleep extends to 1,000,000 ms or 1,000 s. In order to obtain 0.01 ppm, the Sleep would have to cover more than 55 hours. This is obviously a hopeless approach. It also averages out temporary changes of the frequency and therefore cannot follow frequency changes due to thermal drift. The thermal drift of the performance counter frequency can be severe:


[Figure thermal_drift.png: measured performance counter frequency over time]

This graph shows an older system with heavy thermal drift. At boot time (~8:00) the measured performance counter frequency is off by about 60 Hz. The system reports the performance counter frequency as 3,579,545 Hz. In fact, it is already at 3,579,605 Hz when it is "cold". After many hours of doing nothing, the system seems to reach thermal equilibrium. At ~14:00 (six hours after boot), the system was heavily loaded for about 45 minutes and consequently warmed up. The load increased the main board temperature by only 5 °C, but the influence on the measured performance counter frequency is quite considerable. It rose to an offset of almost 100 Hz or a true performance counter frequency of 3,579,645 Hz. A 100 Hz offset at a base frequency of 3,579,545 Hz is a deviation of about 28 ppm or an error in time of 28 μs/s.
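The ppm figures used in this section are plain ratio arithmetic. The following sketch, with the hypothetical helper names deviation_ppm and required_interval_s, shows how the 28 ppm drift and the measurement times of 1,000 s and more than 55 hours can be reproduced. It assumes a fixed timing uncertainty (2 ms for timeGetTime()) that is independent of the measurement length.

```c
/* Relative deviation of a frequency from its nominal value, in ppm. */
static double deviation_ppm(double offset_hz, double nominal_hz)
{
    return offset_hz / nominal_hz * 1.0e6;
}

/* Measurement time in seconds needed so that a fixed timing uncertainty
   (in seconds) corresponds to the target relative accuracy (in ppm). */
static double required_interval_s(double uncertainty_s, double target_ppm)
{
    return uncertainty_s / (target_ppm * 1.0e-6);
}
```

For example, deviation_ppm(100.0, 3579545.0) is a little under 28 ppm, required_interval_s(0.002, 2.0) is about 1,000 s, and required_interval_s(0.002, 0.01) is about 200,000 s, which is more than 55 hours.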

The calibration procedure used for the time stamp mechanism described here uses a repeated averaging period evaluation and reaches an accuracy of better than 0.05 ppm after about 100s. Thermal drifts can be captured reasonably well and can be applied without much delay. (Note: The declaration of ticks_per_second as a 64-bit float in the code snippet above enables the ticks_per_second to hold a number with an accuracy of 15 digits. A value of 3,579,545.12 Hz shows the 0.01 ppm accuracy in the last digit.)

The use of QueryPerformanceCounter on multi-processor platforms implies that the call is made on the same processor all the time. The SetThreadAffinityMask API and its associated calls are used to ensure this.

2.4.4.  Is the CPU Time Stamp Counter an Alternative?

The RDTSC instruction queries the time stamp counter of the CPU. With the advent of multi-processor platforms and multi-core processors, using RDTSC calls is strongly discouraged. Newer processors also support adaptive CPU frequency adjustments. This is just another reason not to use RDTSC calls for the purpose discussed here.

2.5.  Discussion of Resources

Some of the resources discussed show platform-specific behavior. They may deliver results depending on the hardware and/or on the Windows version. The precision time functions developed within the Windows Timestamp Project mainly rely on four function suites provided by the operating system:

The complexity of the system time update with respect to the interrupt settings was explained and is understood. A complex automatic diagnosis of the system has to establish proper settings in order to obtain the desired accuracy. In particular, the continuous calibration of the performance counter frequency described in 2.4.3 is of utmost importance to obtain high accuracy. In addition, the proper interrupt period setting to obtain truly cyclic timer behavior (as described, for example, in 2.1) is very important. Another set of APIs is used to establish functionality:

The description of these functions falls outside the scope of this description.

3.  Goals

The Windows Timestamp Project provides the tools to enable access to time at microsecond resolution and accuracy. Furthermore, it provides timer functions at the same resolution and accuracy. The high accuracy and microsecond resolution are achieved by synchronizing the system time with the performance counter. In fact, the performance counter is phase locked to the system time. A diagnosis determines the system's specific parameters and establishes a "truly cyclic" timer interval for updating the phase of the performance counter value. The drift of the performance counter is permanently evaluated and taken into account while the system is running.

Initial code runs in a real-time priority process providing time information. An auxiliary IO process builds the interface to an optional graphical user interface. Nonblocking IO enables proper performance testing and debugging.

3.1.  Time Support

Any time providing mechanism needs time for its internals. Thus, the following question arises with respect to time: Is the time requested at the time the call is made or shall the time be reported at the time in which the call returns? This may sound strange, but considering the level of resolution and accuracy aimed for here, it matters.

Example:

Two time functions are implemented to fulfill these two needs:

3.1.1.  GetTimeStamp

The function GetTimeStamp, declared as

void GetTimeStamp(TimeStamp_TYPE * TimeStamp);

fills the argument pointed to by TimeStamp with numbers according to the TimeStamp structure definition:

typedef struct {
  long long Time;
  long long ScheduledTime;
  long Accuracy;
} TimeStamp_TYPE;

The 64-bit value Time represents the number of 100-nanosecond intervals elapsed since January 1, 1601. ScheduledTime reports the system file time at which the next reference time is scheduled for an attempt to update the phase. This value should primarily be used to verify the operation of the precision time mechanism. If ScheduledTime is noticeably behind the current system file time, the scheduled update of the time reference must have failed for a number of consecutive attempts.
Finally, the 32-bit value of Accuracy gives an estimate of the assumed accuracy (rms) of the time stamp in 1 ns units (error in ns/s).

The call to GetTimeStamp is fast (a few thousand CPU cycles max.) and it reports the time at the time it is called.

3.1.2.  Time

A simple function is stated as

long long Time(void);

The function is as fast as GetTimeStamp and it returns the time at the time the call returns. Needing a few thousand CPU cycles, the call will require very few μs on current hardware. Time() can be used to compare times or to wait until a certain time is observed. The 64-bit return value represents the number of elapsed 100-nanosecond intervals since January 1, 1601.

3.2.  Timer Support

A set of timer functions:

HANDLE CreateTimedEvent(BOOL bManualReset,
  LPCTSTR lpTimerName);

bManualReset [in]

If this parameter is TRUE, the function creates a manual reset event object, which requires the use of the ResetEvent function to set the event state to nonsignaled. If this parameter is FALSE, the function creates an auto reset event object, and the system automatically resets the event state to nonsignaled after a single waiting thread has been released.

lpTimerName [in, optional]

The name of the event object. The name is limited to MAX_PATH characters. Name comparison is case sensitive. If lpTimerName matches the name of an existing named event object, this function will fail. If lpTimerName is NULL, the event object is created without a name. If lpTimerName matches the name of another kind of object in the same namespace (such as an existing semaphore, mutex, waitable timer, job, or file-mapping object), the function fails and the GetLastError function returns ERROR_INVALID_HANDLE. This occurs because these objects share the same namespace. The name can have a "Global\" or "Local\" prefix to explicitly create the object in the global or session namespace. The remainder of the name can contain any character except the backslash character (\). For more information, see Kernel Object Namespaces. Fast user switching is implemented using Terminal Services sessions. Kernel object names must follow the guidelines outlined for Terminal Services so that applications can support multiple users. The object can be created in a private namespace. For more information, see Object Namespaces.

Return value

If the function succeeds, the return value is a handle to the event object. If the named event object existed before the function call, the function returns NULL and GetLastError returns ERROR_ALREADY_EXISTS. If the function fails, the return value is NULL. To get extended error information, call GetLastError.


int SetTimedEvent(HANDLE hTimerEvent,
  long long TimerDueTime,
  long long TimerPeriod);

hTimerEvent [in]

A handle to a named timed event. The CreateTimedEvent() function returns this value.

TimerDueTime [in]

The time after which the state of the timer is to be set to signaled, in 100 nanosecond intervals. Positive values indicate absolute time. Be sure to use a UTC-based absolute time, since the system uses UTC-based time internally. Negative values indicate relative time.

TimerPeriod [in]

The period of the timer in 100 ns intervals. If TimerPeriod is zero, the timer is signaled once. If TimerPeriod is greater than zero, the timer is periodic. A periodic timer automatically reactivates each time the period elapses, until the timer is canceled using the CancelTimedEvent function or reset using SetTimedEvent. If TimerPeriod is less than zero, the function fails.

Return value

If the function succeeds, the return value is nonzero. If the function fails, the return value is zero. To get extended error information, call GetLastError.


int CancelTimedEvent(HANDLE hTimerEvent);

hTimerEvent [in]

A handle to a named timed event. The CreateTimedEvent() function returns this value.

Return value

If the function succeeds, the return value is nonzero. If the function fails, the return value is zero. To get extended error information, call GetLastError.


HANDLE OpenTimedEvent(LPCTSTR lpTimerName);

lpTimerName [in]

The timed event name used when the timed event was created.

Return value

If the function succeeds, the return value is the handle to the named timed event. If the function fails, the return value is NULL. To get extended error information, call GetLastError.


int DeleteTimedEvent(HANDLE hTimerEvent);

hTimerEvent [in]

A handle to a named timed event. The CreateTimedEvent() function returns this value.

Return value

If the function succeeds, the return value is nonzero. If the function fails, the return value is zero. To get extended error information, call GetLastError.


These timer functions are based on timed events. The handle returned by CreateTimedEvent() is in fact a handle to a named event whose signaled state is supervised by a time service routine. Standard wait functions like WaitForSingleObject or WaitForMultipleObjects can be used to wait for the high resolution timer events.
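Since TimerDueTime and TimerPeriod are given in 100 ns units and the sign of TimerDueTime selects relative or absolute time, small conversion helpers can make calling code less error prone. The sketch below is only an illustration; the helper names relative_due_time_us and absolute_due_time_us are hypothetical and not part of the timer functions listed above.

```c
#include <stdint.h>

#define UNITS_PER_MICROSECOND 10LL  /* 100 ns units per microsecond */

/* Relative due time: a negative count of 100 ns units from now. */
static int64_t relative_due_time_us(int64_t microseconds)
{
    return -(microseconds * UNITS_PER_MICROSECOND);
}

/* Absolute due time: a UTC file time (100 ns units since 1601) that lies
   the given number of microseconds after the file time `now`. */
static int64_t absolute_due_time_us(int64_t now, int64_t microseconds)
{
    return now + microseconds * UNITS_PER_MICROSECOND;
}
```

A timer that fires 250 μs from now would use relative_due_time_us(250), that is -2,500, as TimerDueTime; for an absolute due time the value of now could be taken from Time().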

4.  Implementation

Only two hardware platforms were described here to highlight some of the problems to bear in mind when implementing reliable time services for Windows. Many more configurations need to be diagnosed to ensure platform-independent functionality to a large extent. However, a flexible and automatic evaluation of hardware-specific behavior may result in hardware independence.

The implementation of all the above into a time service is done by careful separation into different processes and threads. The time critical parts are hosted by a process running at real-time priority class. Some of the threads inside this process are even running at time-critical priority level. In the case of a multi-processor or multi-core system, certain threads are assigned to a specific CPU/core. This is the Kernel and hosts the time service routines. For testing and debugging the Kernel process has some IO capabilities shared with the IO process. A later version may not need this additional functionality. The high priority class requires the Kernel process to run with administrator privileges.

A second process hosts all kinds of less time critical service threads. It shares some IO service with the Kernel process by means of piped IO between these two processes. Furthermore it provides pipe services to the graphical user interface (GUI).

The third process is a graphical user interface (GUI), which runs optionally and helps in the current stage of the development to get an insight into what is going on.

The GUI and the IO process are development tools only. The only process that needs to run to access the time functions discussed here is the Kernel process.

4.1.  The Real-time Priority Class Process: Kernel

The Kernel is the heart of the time service described here. It provides the important link between the system file time and the performance counter value. The idea is to provide data triplets of system file time, performance counter value, and performance counter frequency. Knowing the performance counter value at a certain system file time allows the system file time to be extrapolated to the actual time by applying the current performance counter value and the performance counter frequency. As discussed, the performance counter frequency is of insufficient accuracy; a refined performance counter frequency is therefore supplied as a double (64-bit float). There is also some internal information which allows a refinement of the performance counter value itself (as a result of some self-calibration). Thus, it is also represented as a double (64-bit float).

A typical result of such a data triplet could be:

This information is sufficient to establish time services. Querying the current performance counter value gives the difference to the value calculated to match the last captured file time. This difference is divided by the performance counter frequency and the result is the elapsed time since the last file time capture. This data triplet is, besides other parameters, written to a mutex protected shared memory section. Other processes/threads have access to this data triplet.

As described in sections 2.1 and 2.3, the important part is to get the file time updated correctly. It proves best to gather the data triplet exactly when the file time transits or has just transited. Difficulties in achieving this have been described for platform examples A and B. At startup, a complex diagnosis of the interrupt timing structure and the file time update/transition structure is performed. This results in a timing scheme for updating the data triplet. The desired update period is in the range of 1 to 10 seconds. As discussed, the period duration influences the accuracy. Algorithms look for patterns and beat frequencies in the file time update and interrupt timing structure. As a result, a periodic timer is set up to run the data triplet generation and the calibration in parallel. Once exact ∆T file time periods occur, the true performance counter frequency can be measured and averaged over a number of consecutive measurements. A running average over the last n captures is maintained at all times to provide information about the true (calibrated) performance counter frequency. When the accuracy of the average reaches a certain quality, the phase locking of file time change and performance counter is considered established, and timestamp requests are accompanied by information about their accuracy.
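The frequency measurement and the running average over the last n captures might look like the following sketch. The window length, the type, and the function names are assumptions made for illustration; the pattern and beat frequency detection and the quality criterion for phase locking are not shown.

```c
#include <assert.h>
#include <stddef.h>

#define FREQ_WINDOW 4  /* hypothetical window length n */

/* Ring buffer holding the last FREQ_WINDOW frequency measurements. */
typedef struct {
    double samples[FREQ_WINDOW];
    size_t count;  /* number of valid samples, at most FREQ_WINDOW */
    size_t next;   /* slot that the next sample overwrites */
} FreqAverage;

/* Counter frequency in Hz from the counter increment observed over one
   exact file time period, given in 100 ns units. */
double measure_frequency(double counter_delta, double file_time_delta)
{
    assert(file_time_delta > 0.0);
    return counter_delta * 1.0e7 / file_time_delta;
}

/* Store a new measurement and return the average over the window. */
double freq_average_add(FreqAverage *a, double freq)
{
    assert(a != NULL);
    a->samples[a->next] = freq;
    a->next = (a->next + 1) % FREQ_WINDOW;
    if (a->count < FREQ_WINDOW)
        a->count++;

    /* Recompute the sum each time to avoid accumulating rounding drift. */
    double sum = 0.0;
    for (size_t i = 0; i < a->count; i++)
        sum += a->samples[i];
    return sum / (double)a->count;
}
```

Recomputing the sum on every call costs a few additions but keeps the average free of the drift that a running sum with add and subtract would accumulate over long uptimes.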

Running all of this at utmost priority ensures that there is very little overhead after an interrupt. Remember: Many processes/threads are waiting for interrupts. Therefore, systems do have a workload peak at the occurrence of an interrupt. Even running at such priority settings, it is unavoidable to be influenced by the load of other processes. However, the accuracy of this scheme easily stays below a few microseconds, even with heavy load on the system.

The routines Time() and GetTimeStamp() apply the extrapolation scheme described here. Both calls complete in far less than 5 μs, even on older systems.

The functionality of the timer routines listed in 3.2 is handled in this real-time process as well. Timed events are registered in a timer event queue. They are monitored with respect to their due time/period. When there is less than one interrupt period left before the due time expires, the timer service polls the timed event queue for the precise time to set the event. This may happen for a number of timed events, even within the same interrupt period. However, it should be noted that the time service thread is running at a high priority level and the signaled event may not be accessible to other processes/threads when there is just one CPU. A single CPU/core system simply cannot cope with multiple timed events set up to signal within the same interrupt period.
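The decision between waiting for the next interrupt and polling for the precise due time can be expressed as a small helper. This is a sketch under assumptions: the enum, the function name, and the use of 100 ns units for all three arguments are illustrative and not taken from the time service itself.

```c
#include <assert.h>
#include <stdint.h>

/* What the timer service should do with one queued timed event. */
typedef enum { TIMER_WAIT, TIMER_POLL, TIMER_SIGNAL } TimerAction;

/* now, due, and interrupt_period share one unit, e.g. 100 ns. */
TimerAction timer_action(uint64_t now, uint64_t due, uint64_t interrupt_period)
{
    assert(interrupt_period > 0);

    if (now >= due)
        return TIMER_SIGNAL;   /* due time reached: set the event */
    if (due - now < interrupt_period)
        return TIMER_POLL;     /* less than one period left: poll */
    return TIMER_WAIT;         /* enough time left: wait for an interrupt */
}
```

Polling only during the last interrupt period keeps the CPU cost bounded to at most one period per timed event, which is also why several events falling into the same period compete for a single core.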

4.2.  Less critical services: The IO-Process

To keep the kernel as small as possible, much of the functionality is performed by a second process. The IO, in particular, matters. The IO process establishes pipe services to relieve the kernel from blocking IO. All IO done by the kernel is queued into the IO process's pipe service. These operations are nonblocking. A complex fprintf() can be queued in just a few microseconds. This allows extensive output for diagnosis. Furthermore, output is logged into a file.
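The idea of queuing formatted output instead of writing it directly can be illustrated with a small in-memory queue. This is only a sketch of the principle: the names, the fixed slot sizes, and the single-threaded ring buffer are assumptions, and the actual service hands messages to the IO process through a pipe, which is not shown here. A real version shared between threads would also need synchronization.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MSG_SLOTS 4    /* hypothetical queue depth */
#define MSG_LEN   128  /* hypothetical maximum message length */

typedef struct {
    char   text[MSG_SLOTS][MSG_LEN];
    size_t head;   /* oldest queued message */
    size_t count;  /* number of queued messages */
} MsgQueue;

/* Format a message into the queue. Returns 0 on success or -1 when the
   queue is full; the caller never waits for the consumer. */
int queue_printf(MsgQueue *q, const char *fmt, ...)
{
    assert(q != NULL && fmt != NULL);
    if (q->count == MSG_SLOTS)
        return -1;

    size_t slot = (q->head + q->count) % MSG_SLOTS;
    va_list args;
    va_start(args, fmt);
    vsnprintf(q->text[slot], MSG_LEN, fmt, args);  /* truncates if too long */
    va_end(args);
    q->count++;
    return 0;
}

/* Consumer side: returns the oldest message, or NULL if the queue is
   empty. The pointer stays valid until the slot is reused. */
const char *queue_pop(MsgQueue *q)
{
    assert(q != NULL);
    if (q->count == 0)
        return NULL;
    const char *msg = q->text[q->head];
    q->head = (q->head + 1) % MSG_SLOTS;
    q->count--;
    return msg;
}
```

The producer only formats into memory and returns, so its cost is independent of how slow the file or console output on the consumer side is.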

4.3.  The Optional GUI-Process

The current GUI is mainly created for developing the time service. Meanwhile, it has become a valuable tool for diagnosing platforms. It runs optionally.


gui.png

The output is split into three tabs: the all output tab, the error messages tab, and the Refine Performance Counter Offset tab. The text output within the first two tabs is produced using the queued qfprintf(…) function. This function time stamps its message and also shows some other parameters of the output piping thread.

The output line format is:

yyyy-mm-dd hh:ii:ss.μμμμμμ.n (s/a) [PID.THID.Processor.Priority]: Message

As already mentioned, the GUI is optional, and any number of GUI instances can be started and closed at any time. Closing a GUI ends neither the Kernel process nor the IO process. To terminate the whole group of processes, the Kernel process has to be stopped. The Stop Kernel button (lower right corner) stops the Kernel. When it does, messages still queued for processing get stuck. A few message windows will pop up to show the unprocessed contents of the queues of all involved processes. These popup windows are not error messages; they just report what was in progress when the Kernel was stopped.

The plot in the lower left corner shows the history of the accuracy in μs/s during the last 600 seconds. The GUI produces this information by means of GetTimeStamp() imported from the time service DLL.

It also provides a tiny test of the timer functionality: a single shot timer can be set up. The due time setting here is absolute, thus the time has to be in the future. Hint: Use the Update Date/Time Fields button to get the actual time into the fields and then, for example, increment the minute field by 1. Press the Create Timed Event button quickly, before the due time expires. Progress of the timed event approaching its due time is shown next to the button, which has now turned into a Stop Timed Event button to allow cancellation of the timed event. A message window will pop up when the timed event has signaled. It shows the precise time at which the signaled state was detected and how much it deviates from the requested due time.

The output of the all output tab can be paused with the Hold Output button. While the output is held, all output is queued and the button turns into a Continue Output button; pressing it resumes the output. An optional auto cont. check box lets the GUI continue automatically when the queue buffer reaches a critical stage. The auto cont. check box can only be checked while the output is held.

Finally, the Refine Performance Counter Offset tab shows the offset of the calibrated performance counter frequency. The graph shown in 2.4.3 was created within this tab. The graph's context menu (right mouse button) allows saving the graph or clearing the graph's data. Clearing the data will not stop further recording; creation of the graph will continue.

4.4.  The Libraries

The functions described above are accessible to other processes/threads through a static library (LIB) or a dynamic link library (DLL).

5.  Results

Microsecond resolution time stamps are possible on Windows systems. Resolution in the microsecond regime can be observed at accuracies of a few microseconds without distracting the system too much. Timer functions at the same resolution and accuracy are implemented and tested. Handling many timed events created by those timer functions set up to fire within the same millisecond is tricky but possible.

The evaluation at the startup of the services may sometimes take a few seconds and needs all the CPU time. Doing this at utmost priority will freeze single core/processor systems for a moment.

6.  Directions

Much of the code has to be revised and optimized since not much effort has been spent on code optimization. The startup phase, in particular, is somewhat unsatisfactory because of the frozen phase. There could be room for improvement in some of the algorithms once more platforms have been diagnosed.

The DLL part has to be completed by some special functions and a static library will be provided to link with other projects.

There may be a chance to convert the Kernel process into a Windows kernel mode process. However, there are some drawbacks with KeQueryPerformanceCounter and other kernel services. For now, it seems like a better idea to shrink and optimize the current kernel process as much as possible to get a lightweight standard process running at real-time priority class.




