I'm ignorant about OpenMP stuff, any good examples? How to deal with a while loop for example.
OpenMP tries to distribute different chunks of a loop (e.g. iterations 1-100, 101-200, ...) to different processors/threads, so it needs to know the number of iterations before the loop starts. A while loop, where it's hard to figure out the iteration count in advance, doesn't fit well in this model.
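If the while loop is really just a counter in disguise, one option is to rewrite it as a counted for loop, which OpenMP can split up. A minimal sketch, assuming the loop body only accumulates into one variable (the function names are made up for illustration):

```c
#include <assert.h>

/* Serial version: a while loop whose trip count is only implicit. */
double sum_squares_while(const double *a, int n)
{
    double sum = 0.0;
    int i = 0;
    while (i < n) {
        sum += a[i] * a[i];
        i++;
    }
    return sum;
}

/* Same computation as a counted for loop. The trip count (n) is known
   on entry, so the runtime can hand out chunks of iterations to threads.
   reduction(+:sum) gives each thread a private partial sum and adds the
   partial sums together when the loop ends. */
double sum_squares_parallel(const double *a, int n)
{
    double sum = 0.0;
    int i;

    assert(n >= 0);

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += a[i] * a[i];

    return sum;
}
```

If OpenMP is not enabled in the compiler, the pragma is ignored and the second function simply runs serially. A while loop that walks a linked list or waits for a condition can't be converted this way, since nobody knows the iteration count up front.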
Anyway, I've noticed that whatever runs here on my dual core seems to share the load nearly equally between the cores. Is that just luck, or do other computers behave differently?
Maybe a bit of luck. By default the OpenMP runtime typically creates a thread for each processor/core and asks the OS to be so kind as to run each thread on a separate core. The OS is free to totally ignore this request, because there are more important threads to be run, or simply because it likes "thread #2" better. This also makes it much harder to get accurate timings for multi-processor/multi-threaded code.
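To see what the runtime actually gives you, and to get somewhat more stable timings, something like the following might help. It is only a sketch: wall_seconds, threads_in_team, best_time and demo_work are made-up helper names, while omp_get_wtime and omp_get_num_threads are standard OpenMP calls. Repeating the run and taking the best time is just one common way to reduce scheduling noise, not a guarantee of accuracy.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Without OpenMP this falls back to clock(),
   which measures CPU time and is only a rough substitute. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Number of threads the runtime really used for a parallel region
   (1 when OpenMP is not enabled). */
int threads_in_team(void)
{
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        #pragma omp single
        n = omp_get_num_threads();
    }
#endif
    return n;
}

/* Placeholder workload, only here so there is something to time. */
void demo_work(void)
{
    volatile double x = 0.0;
    int i;
    for (i = 0; i < 100000; i++)
        x += i * 0.5;
}

/* Run work() several times and return the shortest elapsed time,
   so a run disturbed by other threads does not decide the result. */
double best_time(void (*work)(void), int runs)
{
    double best = -1.0;
    int i;

    assert(runs > 0);

    for (i = 0; i < runs; i++) {
        double t0 = wall_seconds();
        double dt;
        work();
        dt = wall_seconds() - t0;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}
```

Calling best_time(demo_work, 5) returns the shortest of five runs in seconds, and threads_in_team() tells you how many threads the runtime really started on your machine.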
I have started a new source code branch for OpenMP (that is pretty minimal now, but will grow over time):
http://www.smorgasbordet.com/pellesc/sourcecode.htm