If your monitor refreshes at, say, 60 Hz, then rendering at 120 FPS is wasted effort: half the frames stress the GPU but are never shown on the monitor. Limiting the frame rate to the display's refresh rate stresses the GPU only as much as necessary. This improves performance, reduces heat dissipation and also cuts power consumption.
As for why the core usage differs, I don't know. OpenGL itself is single-threaded. It does know the concept of sharing contexts across threads, but I would not recommend that road to anyone in their right mind: getting the locking right is the application's responsibility, and it's just not worth the trouble (side note: multi-threaded X11 suffers from the same problem). So it's the driver that is multi-process aware (that command-buffer concept).
As mentioned, I don't know why the 64-bit version behaves differently. What I could imagine, though, is that multiple threads are set up in a worker-thread arrangement. In such a scheme one thread processes one frame update. When you call SwapBuffers, OpenGL performs an implicit glFlush, which would tell that thread to finish its command buffer and transmit it to the card for rendering. I imagine the next free worker thread then gets activated to process the next frame's content (up to the next SwapBuffers). As mentioned, this is just my guess out into the blue.