2009-05-28

ACE_High_Res_Timer runs too slow - loss of precision

This post is based on an ACE PRF, so it is formatted like it. Sorry :)

ACE VERSION: 5.6.9



HOST MACHINE and OPERATING SYSTEM:

x86 architecture (32-bit), Windows XP Service Pack 3, Winsock 2.2 or newer (WS2_32.DLL version is 5.1.2600.5512)



TARGET MACHINE and OPERATING SYSTEM, if different from HOST:

Windows 2000/XP/Vista/7



THE $ACE_ROOT/ace/config.h FILE [if you use a link to a platform-
specific file, simply state which one]:

config-win32.h



THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE [if you
use a link to a platform-specific file, simply state which one
(unless this isn't used in this case, e.g., with Microsoft Visual
C++)]:

Not used, MSVC8 and MSVC9.



CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features
(used by MPC when you generate your own makefiles):

Not used.



AREA/CLASS/EXAMPLE AFFECTED:

Any code reliant on both ACE_High_Res_Timer for timing precision, and
the system clock for standard timestamping and comparisons.



SYNOPSIS:

ACE_High_Res_Timer runs too slow, when compared to the standard system
time function: ACE_OS::gettimeofday().

About every 4 hours, my high performance timer seems to lose 1 second,
when compared to the standard system time. This can be clearly seen via timer
printouts every 15 seconds.

Unfortunately, this means that my code - which uses an
ACE_Timer_Queue_Thread_Adapter object - gets notified of events too early,
and since my code expects to be run *after* the event time, it doesn't work
properly.



DESCRIPTION:

ACE_High_Res_Timer gets its global scale factor from the host operating system.
This scale factor, as explained in the documentation, is used to convert the high
performance timer to seconds. This factor is stored in an ACE_UINT32 member of
the timer object, called global_scale_factor_.

Under Windows (and generally in most OSs) the amount of high resolution timer
ticks per second is returned as a 64-bit value, in my case - 2341160000. It is divided
by ACE_HR_SCALE_CONVERSION, which is defined as
ACE_ONE_SECOND_IN_USECS, which is defined as 1000000. This divided value
is then set to global_scale_factor_.

Note that 2341160000 / 1000000 = 2341.16, and so 2341 is set as the
global_scale_factor_. Thus - we lose 16 ticks of precision every 234116 ticks.

16 / 234116 = 0.0000683421893, which means that we lose 0.98412752592 seconds
in our high resolution timer every 4 hours. After 10 days of operation - we are a whole
minute earlier than the system clock.



REPEAT BY:

Use an ACE_Timer_Queue_Thread_Adapter, and set it to use ACE_High_Res_Timer instead of the standard system time. Set a timer event - every 15 seconds, for at least 4 hours. Use ACE_LOG((LM_DEBUG, "Timestamp:%T\n")) to print out the current system time when that timer event is triggered.

Sample output:

After 0 hours: Standard time: 18:19:41.093000, HiRes time: 04:56:08.082000
After 4 hours: Standard time: 22:19:39.437000, HiRes time: 08:56:08.082000
After 8 hours: Standard time: 02:19:38.406000, HiRes time: 12:56:08.082000



SAMPLE FIX/WORKAROUND:

A workaround would be to add the time difference (about a second every 4 hours)
manually by having a timer event that causes this value to be calculated. It is
quite trivial to calculate this, but changing the actual timers based on it can be
a hassle.

Another workaround would be to simply use the system time as the basis for the
ACE_Timer_Queue_Thread_Adapter, at the cost of losing precision - but at least
with the timer events executed *after* the system time, not *before* the system time.

A better fix would be to add an ACE_USE_64BIT_HIGH_RES_TIMER_CALCULATIONS
definition, that - if set - would allow the timers to use 64-bit division operations,
and would not require the division of the global_scale_factor_ by ACE_HR_SCALE_CONVERSION. Instead, the division would be done using 64-bit
integers, and the division by ACE_HR_SCALE_CONVERSION would be done
later on - on the final result.

Instead of calculating:

(QueryPerformanceCounter() / ACE_HR_SCALE_CONVERSION) /
(QueryPerformanceFrequency() / ACE_HR_SCALE_CONVERSION)

we could do:

(QueryPerformanceCounter() / QueryPerformanceFrequency()) /
ACE_HR_SCALE_CONVERSION

There'd be less loss of precision.

Does anyone have any other suggestions on how I should deal with this?

2009-05-26

Speeding up build times in Visual Studio 2008

Build times in my current project have been slowly ramping up.

Recently, it took me about 120 seconds to complete build the project from scratch - which can be quite annoying and anti-productive. This is in an early stage of development. I want to cut down on build times as soon as I can.

My first instinct was distributed building. I know of an Israeli solution called IncrediBuild which greatly speeds up build times in organization with many machines. Since my organization has many machines, when compared to the amount of developers, and since most of the machines are idle most of the time - this seemed like a good way to speed things up. However, the solution is very expensive.

I could not find any good free or open source solutions that support Visual Studio 2008. I did discover distcc, which is a good solution for non-Windows development. There is a project under development that aims to add support for Visual Studio 2008 to distcc - this might be a good solution in the future.

In the meanwhile, I decided to explore other ways to optimize the build time. I discovered this lovely article, which pointed my in several interesting directions.

Here are my conclusions.

  1. Use the full capabilities of your CPU! There is a new option in the Visual Studio 2008 C++ compiler named /MP ("Build with Multiple Processes"). Adding /MP to the compiler options cut down the build time on my machine by 60%! Note that it seems that this is only worthwhile on systems with multiple processors or multiple cores. To do this, go to the following property page:
    Project > "Project" Properties... > Configuration Properties > C/C++ > Command Line
    Under "Additional options", add:
    /MP
    Make sure you also go to
    Project > "Project" Properties... > Configuration Properties > C/C++ > Code Generation
    and disable the "Enable Minimal Rebuild" option, which is incompatible with the "Build with Multiple Processes" option. This might seem like a strange thing to do - disabling the option that is described as "Detect changes to C++ class definitions and recompile only affected source files" - but based on my experience, "Build with Multiple Processes" gives a much bigger performance boost.

  2. Use precompiled headers! They seem like a hassle, but using them correctly, as explained in this lovely article by Bruce Dawson cut compilation time from 53 seconds to 13 seconds in my current project! Obviously, you will benefit much more from precompiled headers if you use a "whale" include file that is static (non-changing), like "Windows.h", various STL headers and - in my case - the ACE headers. More about this issue, in a more user friendly way, can be found here.

  3. Employ smarter physical code design. This is based on ideas documented by John Lakos in his wonderful book Large Scale C++ Software Design, which - while outdated nowadays - still is a worthy read. You can learn a lot about this in these two articles in Games from Within. Remember, however, that using these techniques can sometimes impair the clarity and brevity of your code. I made some small changes in my code - and cut down the build time by another several seconds.

    All in all, after employing all of these techniques - I managed to cut down the build time from ~120 seconds to ~10 seconds. Pretty cool for 30 minutes of research and coding.

"If" conditional does not work properly

While debugging, I noticed that an "if" conditional displayed some pretty strange behavior.

I had code similar to this:

if (SomeFunctionReturningBoolean() == true)
{
// do something...
print "1";
}
else
{
// do something else.
print "2";
}

Strangely, even though SomeFunctionReturningBoolean() returned true (as the debugger helpfully noted) - the "else" clause was executed, rather than the "if" clause.

I started investigating this.

First, I suspected a mismatch between the code and the executable. I rebuilt from scratch, but no dice. I manually cleaned all of the intermediate files, nope - same strange behavior.

I then changed the code to the following:

bool fResult = SomeFunctionReturningBoolean();

if (fResult == true)
{
// do something...
print "1";
}
else
{
// do something else.
print "2";
}

fResult clearly held "true", but "2" was still printed.

Next, I changed the code to:
bool fResult = SomeFunctionReturningBoolean();

if (fResult == true)
{
// do something...
print "1";
}

if (fResult == false)
{
// do something else.
print "2";
}

Neither "1" nor "2" were printed. Strangely, this seemed to work:

bool fResult = SomeFunctionReturningBoolean();

if (fResult)
{
// do something...
print "1";
}
else
{
// do something else.
print "2";
}

it printed "1". I started wondering whether the function was indeed returning true. I changed to disassembly mode, and to my surprise - I noticed that the function was returning 0x6E, and that the equality check was:

cmp eax, 1

Going deeper into the code, I discovered that a function used by SomeFunctionReturningBoolean() as a basis for its return coded simply did not return any value. The programmer was annoyed by some of the warnings returned by an external include file, and disabled all of the warnings using a pragma - so the "Not all code paths return a value" warning simply did not appear when I compiled the code.

The Visual Studio 2008 debugger treats a bool variable with a value of 0 as false, and it displays "false" as its value in any variable windows. If the bool variable contains anything but 0 - the value displayed is "true". This is generally correct, since non-zero values are handled as "true" by most conditional assembly instructions. However, a comparison instruction like
if (variable == true)
is compiled to
cmp eax, 1
which does not check for "true"-ness, but rather for equality with 1 - which is the value that "true" signifies in the current Microsoft C++ world.

So - the actual source code compared the bool to a specific value - 1, and since 0x6E is not equal to 1 - the check failed and we executed the "else" clause.

So, what did we learn from this issue?
  1. Never disable all of the warnings using a pragma - disable specific warnings, otherwise you'll lose important warnings that might appear.
  2. The VS2008 debugger displays "true"/"false" values in the variables window based on a check of "true"-ness, not an equality to the "true" value.
  3. The "Go to assembly" can be very, very useful at times. You should familiarize yourself with it.

Goal of this blog

Hi,

The goal of this blog is to assist others in solving their own development problems encountered in their work as programmers.

The internet (as indexed by Google) has been an amazing aid to me, offering me solutions to many of the problems I've encountered which baffled me. Sometimes, however, I've had to read multiple sources to discover the real solution that works for me.

This blog is where I'll post the results of my research into the various new problems I'll encounter. I am hoping that Google will index it, and that programmers in need will find it useful.

Good luck :)