Profiling - speed check

Started by henrin, January 13, 2022, 10:32:32 PM

henrin

DEFINITION
   Include this at the top of your code:
   #include <intrin.h>   /* _rdtsc() */
   #include <conio.h>    /* _cprintf() */
   #define PROFILE 1     /* remove this line to make GCLK expand to the bare code */
   #ifdef PROFILE
      #define GCLK(m) { long long _t_t=_rdtsc() ; m ; _t_t=_rdtsc()-_t_t ; _cprintf(":CLK: %-50.50s :%14lld\n", #m, _t_t) ;}
   #else
      #define GCLK(m) m
   #endif


USE
   At the start of main(), call (declared in <windows.h>):
   AllocConsole() ;
   To profile a piece of code, wrap it in the GCLK macro; that's all there is to it!
   The console output shows the first 50 characters of the profiled code and the number of processor cycles (time-stamp counter ticks) it took.

EXAMPLE
   Profiled code:
   GCLK( test1(searchPath); )
   Console output:
   :CLK: test1(searchPath);                                :     673112146

TimoVJL

Some CPUs don't give good results.
#include <intrin.h>
#include <stdio.h>

int __cdecl main(int argc, char **argv)
{
    int regs[4];                 /* EAX, EBX, ECX, EDX returned by CPUID */
    int cnt = 0;
    unsigned long long tt;

    for (int i = 0; i < 20; i++) {
        tt = _rdtsc();           /* start timestamp */
        _cpuid(regs, 0);         /* CPUID is a serializing instruction */
        for (int j = 0; j < 1000; j++) cnt++;
        tt = _rdtsc() - tt;      /* elapsed ticks for this pass */
        printf("ticks: %llu %d\n", tt, cnt);
    }
    return 0;
}
May the source be with you

henrin

Thanks for testing.

Do you mean that the _rdtsc intrinsic does not work?
Compiled with Pelles C 9 in debug mode, I get about 7500 ticks for each pass of the outer for (i) loop,
and about 1300 ticks with speed optimisation.

Anyway, I forgot to mention a restriction.
There is a size limit for the code surrounded by GCLK. That a limit exists is easy to understand, but I do not know its value or where it comes from.
In any case, it allows several lines of code.
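
Separately from any size limit, a single-parameter macro such as GCLK(m) has one restriction that is certain: a comma outside parentheses splits the code into several macro arguments, so something like "int i = 1, acc = 0;" is rejected. The sketch below shows one possible way around that with a variadic macro. It is not henrin's macro: the names GCLKV, READ_TSC and sum_to are hypothetical, printf replaces _cprintf so no console has to be allocated, and every branch other than the Pelles C one is an assumption added so the sketch builds with other compilers.

```c
#include <stdio.h>

/* Cycle-counter read. _rdtsc() is the Pelles C intrinsic used in this thread;
   the other branches are assumptions so the sketch also builds elsewhere. */
#if defined(__POCC__)
  #include <intrin.h>
  #define READ_TSC() _rdtsc()
#elif defined(_MSC_VER)
  #include <intrin.h>
  #define READ_TSC() __rdtsc()
#elif defined(__x86_64__) || defined(__i386__)
  #include <x86intrin.h>
  #define READ_TSC() __rdtsc()
#else
  #include <time.h>
  #define READ_TSC() ((unsigned long long)clock())  /* coarse fallback, not cycles */
#endif

#define PROFILE 1
#ifdef PROFILE
  /* __VA_ARGS__ collects the whole argument list, top-level commas included. */
  #define GCLKV(...) { unsigned long long _t_t = READ_TSC(); \
        __VA_ARGS__ ; \
        _t_t = READ_TSC() - _t_t; \
        printf(":CLK: %-50.50s :%14llu\n", #__VA_ARGS__, _t_t); }
#else
  #define GCLKV(...) __VA_ARGS__
#endif

/* Hypothetical workload: the declaration "int i = 1, acc = 0;" puts a
   top-level comma inside the macro argument. */
int sum_to(int n)
{
    int result = 0;
    GCLKV( int i = 1, acc = 0; for (; i <= n; i++) acc += i; result = acc; )
    return result;
}
```

GCLKV is used exactly like GCLK, and the printed text is still cut to the first 50 characters of the profiled code.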

TimoVJL

From an old AMD:
ticks: 168 1000
ticks: 410 2000
ticks: 116 3000
ticks: 116 4000
ticks: 116 5000
ticks: 117 6000
ticks: 117 7000
ticks: 119 8000
ticks: 117 9000
ticks: 116 10000
ticks: 119 11000
ticks: 117 12000
ticks: 116 13000
ticks: 119 14000
ticks: 117 15000
ticks: 117 16000
ticks: 118 17000
ticks: 118 18000
ticks: 119 19000
ticks: 146 20000
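
The first two readings (168 and 410) and the last one (146) stand out from the rest, so a single reading can mislead. A minimal sketch of one way to summarise such a series, by minimum and median, is given below. The function names min_ticks and median_ticks are hypothetical, and amd_ticks is simply the tick column copied from the output above.

```c
#include <stdlib.h>
#include <string.h>

/* Tick values copied from the output posted above. */
const unsigned long long amd_ticks[20] = {
    168, 410, 116, 116, 116, 117, 117, 119, 117, 116,
    119, 117, 116, 119, 117, 117, 118, 118, 119, 146
};

/* qsort comparator for unsigned long long, ascending order. */
static int cmp_u64(const void *a, const void *b)
{
    unsigned long long x = *(const unsigned long long *)a;
    unsigned long long y = *(const unsigned long long *)b;
    return (x > y) - (x < y);
}

/* Smallest reading: the run least disturbed by interrupts or cache misses.
   Assumes n >= 1. */
unsigned long long min_ticks(const unsigned long long *v, size_t n)
{
    unsigned long long best = v[0];
    for (size_t i = 1; i < n; i++)
        if (v[i] < best)
            best = v[i];
    return best;
}

/* Median reading (lower middle element for an even count).
   Sorts a copy so the caller's array is left untouched.
   Returns 0 if the temporary buffer cannot be allocated. Assumes n >= 1. */
unsigned long long median_ticks(const unsigned long long *v, size_t n)
{
    unsigned long long *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return 0;
    memcpy(tmp, v, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_u64);
    unsigned long long mid = tmp[(n - 1) / 2];
    free(tmp);
    return mid;
}
```

For the twenty readings above this gives a minimum of 116 and a median of 117, so the outliers no longer dominate.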

frankie

Quote from: TimoVJL on January 15, 2022, 08:51:16 AM

It seems correct enough.
On an old CPU with only one or two cores, interrupt processing running on the same core may influence the counter.
One option to test is to set the processor affinity of the process so that it does not run on the core used by the system.
I don't remember the details anyway...  :P
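
On Windows, the affinity idea can be tried with GetProcessAffinityMask and SetProcessAffinityMask. The sketch below is only an illustration. It assumes the core to avoid is the lowest-numbered one in the process mask, which nothing in this thread confirms, and the helper names drop_lowest_core and avoid_lowest_core are hypothetical.

```c
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Return `mask` with its lowest set bit cleared.
   If that would leave no core at all, return `mask` unchanged. */
uintptr_t drop_lowest_core(uintptr_t mask)
{
    uintptr_t rest = mask & (mask - 1);   /* clears the lowest set bit */
    return rest ? rest : mask;
}

/* Restrict the current process to every allowed core except the lowest one.
   ASSUMPTION: the lowest-numbered core is the one busy with system work.
   Returns 1 on success, 0 on failure or when not built for Windows. */
int avoid_lowest_core(void)
{
#ifdef _WIN32
    DWORD_PTR proc_mask, sys_mask;
    HANDLE self = GetCurrentProcess();

    if (!GetProcessAffinityMask(self, &proc_mask, &sys_mask))
        return 0;
    return SetProcessAffinityMask(
               self, (DWORD_PTR)drop_lowest_core((uintptr_t)proc_mask)) != 0;
#else
    return 0;
#endif
}
```

Calling avoid_lowest_core() once at the start of main(), before the first measurement, would be enough. Whether it changes the readings is something to measure rather than assume.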
"It is better to be hated for what you are than to be loved for what you are not." - Andre Gide