Using tolower() made no significant difference with Pelles C, but with LCC-Win32, VC++ and others the run took more than 1 second. Makes you wonder what these other compilers have under the hood.
The default for Pelles C is to define a macro for tolower(), which will fetch the character from a lookup table (that can change with the locale) - just inline code, no function call.
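To illustrate the idea, here is a rough sketch of a table-driven macro next to an ordinary function. The names (sketch_lower_table, sketch_init_lower_table, SKETCH_TOLOWER, sketch_tolower_func) are made up for this example and are not the actual Pelles C internals. It assumes an ASCII character set and the plain "C" locale, and it skips the EOF handling a real tolower() needs.

```c
#include <limits.h>

/* Hypothetical lookup table, one entry per unsigned char value.
   A real library would rebuild it whenever the locale changes. */
static unsigned char sketch_lower_table[UCHAR_MAX + 1];

/* Fill the table for the "C" locale (assumes ASCII letter layout). */
static void sketch_init_lower_table(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++) {
        if (c >= 'A' && c <= 'Z')
            sketch_lower_table[c] = (unsigned char)(c - 'A' + 'a');
        else
            sketch_lower_table[c] = (unsigned char)c;
    }
}

/* Macro version: expands to an inline table lookup, no function call. */
#define SKETCH_TOLOWER(c) (sketch_lower_table[(unsigned char)(c)])

/* Function version: same lookup, but each use pays for a call
   unless the compiler decides to inline it. */
int sketch_tolower_func(int c)
{
    return sketch_lower_table[(unsigned char)c];
}
```

Call sketch_init_lower_table() once at startup; after that SKETCH_TOLOWER('A') and sketch_tolower_func('A') both give 'a', and the only difference is whether a call is made. In a tight loop over a large buffer that call overhead is the kind of thing that can add up.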
It looks like Microsoft, and probably LCC-Win32 too, is using a true function call. I think Microsoft is more ambitious about supporting things like Japanese, which might explain why they need to call a function (too much code to inline).
Pelle