Mac OS X 10.6, a.k.a. Snow Leopard, is slated for release at the end of this week. One of the most interesting features I have read about is support for OpenCL, a parallelization framework.
So how about it? What kind of possibilities does this hold for something like FFmpeg? The pedagogical example in the Wikipedia article demonstrates partitioning a fast Fourier transform so that it can be handled as separate work units, possibly by separate CPUs. I doubt that it would make a (positive) difference to split up, e.g., all of the inverse transforms during a video frame decode in the same way.
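To make the work-unit idea concrete, here is a toy OpenCL kernel of my own devising (much simpler than the FFT example; the kernel name and the scaling operation are invented purely for illustration). Each work-item independently processes one sample, and the OpenCL runtime decides how work-items are spread across CPU cores or GPU threads:

    /* Toy illustration, not the example from the article: each
     * work-item scales one complex sample independently. */
    __kernel void scale_samples(__global float2 *samples, const float gain)
    {
        size_t i = get_global_id(0);  /* this work-item's index */
        samples[i].x *= gain;         /* real part */
        samples[i].y *= gain;         /* imaginary part */
    }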
I really can’t judge the spec by the one example. Perhaps I should, at the very least, read the overview slides available here.
Sometimes I think that it doesn’t help my development as a programmer and computer scientist that I view every single technological development that comes down the pike through the fairly narrow lens of multimedia hacking.
The “problem” is that there are already multimedia acceleration APIs available, and those can also do bitstream processing, for which OpenCL (at least in its on-GPU implementations) is unsuitable.
Anyway, the biggest issue is generally latency: you can’t push one tiny DCT block onto the GPU, wait for it to finish, and then do the next one. You’d have to, e.g., queue them all up and only wait at the end.
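A rough host-side sketch of that batching strategy, assuming the kernel, command queue, and per-block buffers have already been created elsewhere (the names idct_frame, idct_kernel, and block_bufs are made up for illustration):

    #include <CL/cl.h>

    /* Enqueue one IDCT launch per 8x8 block, but synchronize only
     * once per frame instead of once per block. */
    void idct_frame(cl_command_queue queue, cl_kernel idct_kernel,
                    cl_mem *block_bufs, size_t num_blocks)
    {
        const size_t work_size = 64;  /* one work-item per coefficient */
        size_t i;
        for (i = 0; i < num_blocks; i++) {
            clSetKernelArg(idct_kernel, 0, sizeof(cl_mem), &block_bufs[i]);
            clEnqueueNDRangeKernel(queue, idct_kernel, 1, NULL,
                                   &work_size, NULL, 0, NULL, NULL);
            /* no wait here -- the launches just pile up in the queue */
        }
        clFinish(queue);  /* the single wait, at the end of the frame */
    }

(In practice you’d more likely issue one big launch covering all blocks at once, but the synchronization pattern is the point: wait once per frame, not once per block.)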
For some codecs it should be rather simple (and mostly within the existing dsputil framework), but you’d need some code changes (e.g. instead of idct_add use an enqueue_idct_add, and add something like an “enqueue_barrier” at some appropriate place).
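In dsputil terms, that might look roughly like the following sketch. The enqueue_* members are hypothetical and do not exist in the real DSPContext; the idct_add signature is quoted from memory:

    #include <stdint.h>

    typedef int16_t DCTELEM;

    typedef struct DSPContext {
        /* existing synchronous entry point (signature from memory) */
        void (*idct_add)(uint8_t *dest, int line_size, DCTELEM *block);

        /* hypothetical asynchronous variant: queues the block's IDCT
         * on the device and returns immediately */
        void (*enqueue_idct_add)(uint8_t *dest, int line_size,
                                 DCTELEM *block);

        /* hypothetical barrier: blocks until all queued work is done;
         * a decoder would call this once per frame or slice */
        void (*enqueue_barrier)(void);
    } DSPContext;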
None of this is really new; the same questions existed for CUDA, and nobody has done much about it so far.
CUDA and Stream are vendor-specific APIs, and relatively new ones at that, which may help explain why there isn’t widespread adoption. If OpenCL manages to generate initial traction, I expect usage will increase significantly and the two earlier APIs will wither away and die (like Glide or A3D).