Seriously, back then writing fast code wasn't just a matter of choosing the right algorithm; you had to know your instruction times and you had to benchmark different sequences. Of course, because the system timer only ticked every fifty-five milliseconds, benchmarking meant either repeating your code hundreds of thousands of times or pulling hackerish tricks like reading the timer chip's internal registers to get the time in 838-nanosecond increments.
Nowadays, of course, compilers are so good and processors are so fast that it's gotten pretty hard to come up with a piece of 'brain-damaged' brute-force code that's going to noticeably slow your programs. It seems more than a little ironic, then, that Intel waited for the Pentium to introduce a benchmarking instruction. RDTSC - Read Time Stamp Counter - returns the number of clock cycles since the CPU was powered up or reset. Where was RDTSC back when we really needed it?
Still, better late than never. If you find yourself wondering something along the lines of 'Memory is so slow and the CPU so fast - might it actually be faster to calculate a square root twice than to calculate it once and write it to a temporary variable?' it's nice to be able to quickly do a few tests and find that it's not.
RDTSC is a two-byte instruction - 0F 31 - and it returns a 64-bit count in EDX:EAX. Since the 8087 "comp" datatype is a 64-bit integer, we can use the following Delphi code to read the current value:
    const
      D32 = $66;

    function RDTSC: comp;
    var
      TimeStamp: record
        case byte of
          1: (Whole: comp);
          2: (Lo, Hi: LongInt);
      end;
    begin
      asm
        db $0F; db $31; {BASM doesn't support RDTSC}
        {Pentium RDTSC - Read Time Stamp Counter - instruction}
    {$ifdef Cpu386}
        mov [TimeStamp.Lo],eax // the low dword
        mov [TimeStamp.Hi],edx // the high dword
    {$else}
        db D32
        mov word ptr TimeStamp.Lo,AX {mov [TimeStamp.Lo],eax - the low dword}
        db D32
        mov word ptr TimeStamp.Hi,DX {mov [TimeStamp.Hi],edx - the high dword}
    {$endif}
      end;
      Result := TimeStamp.Whole;
    end;

One problem you may run into when you use RDTSC is that both IntToStr and Format('%d') can only handle LongInts, not comps. While you can pass a comp value to one of these functions, it cannot be any larger than High(LongInt), or 2,147,483,647. While that would be a mighty satisfying number of dollars to see on your brokerage statement, it's only a little over 16 seconds of clock ticks to a 133 MHz Pentium. If you need to compare two long running processes, the difference between the start ticks and the stop ticks can easily exceed High(LongInt). In these cases, you can use this CompToStr function:
    uses SysUtils; {for ThousandSeparator}

    type
      CompStr = string[25]; {Comps have up to 18 digits, plus commas, and sign}

    function CompToStr(N: comp): CompStr;
    var
      Low3: string[3];
      N1: extended;
    begin
      if N < 0 then
        Result := '-' + CompToStr(-N)
      else
      begin
        N1 := N / 1000;
        Str(Round(Frac(N1) * 1000), Low3);
        N := Int(N1);
        if N > 0 then
        begin
          while Length(Low3) < 3 do
            Low3 := '0' + Low3;
          Result := CompToStr(N) + ThousandSeparator + Low3;
        end
        else
          Result := Low3
      end;
    end;

In the end, I guess you could say this is just another case of "The rich get richer" - we need to do a lot less benchmarking than we ever did before, and at the same time the RDTSC instruction makes benchmarking easy and reliable.
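For anyone who wants to check the "little over 16 seconds" figure quoted earlier, the arithmetic is just a tick count divided by the clock rate. The following is only a minimal sketch: the program name TickMath, the helper TicksToSeconds and the constant ClockHz are hypothetical names made up for illustration, and the 133 MHz clock is simply the example figure used in the text, so substitute your own CPU speed.

```pascal
program TickMath;

const
  ClockHz = 133000000.0; {assumption: the 133 MHz Pentium used as the example in the text}

{Hypothetical helper: converts a tick count to seconds at the assumed clock rate}
function TicksToSeconds(Ticks: Double): Double;
begin
  TicksToSeconds := Ticks / ClockHz;
end;

begin
  {High(LongInt) is the largest tick count IntToStr or Format('%d') can display}
  WriteLn(TicksToSeconds(High(LongInt)):0:2, ' seconds');
end.
```

The same division turns any difference between a start count and a stop count into seconds, provided the clock rate you plug in really is the rate of the machine that ran the test.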
This article first appeared in Visual Developer magazine
Copyright © 1996, Jon Shemitz - jon@midnightbeach.com - June 17, 1996