- processing is several times faster (roughly 7x in the benchmark below)
- for fairly large selections and moderate L-R iteration counts, results are rendered immediately as you move the "sigma" slider
- no image refresh delay when scrolling at zoom levels other than 100%
- cubic interpolation does not cause a slowdown
https://www.youtube.com/watch?v=giq4jCnC6KM
Benchmark results on my system (CPU: Ryzen 2700, 8 cores, 16 threads, 3.2 GHz base; GPU: Radeon R370). The workload is a typical one: batch processing of 200 images, 1.2 Mpix each, with 50 iterations of L-R deconvolution, unsharp masking, and tone mapping. (Note that CPU mode uses all cores.)
- CPU mode: 2 min 20 s
- GPU mode: 19 s
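
To put the two timings in perspective, here is a small sketch of the arithmetic. It uses only the figures quoted above; the variable names are just illustrative, and the per-image numbers are simple averages over the batch, not separately measured values.

```python
# Timings and batch size copied from the benchmark above.
cpu_seconds = 2 * 60 + 20   # CPU mode: 2 min 20 s = 140 s
gpu_seconds = 19            # GPU mode: 19 s
image_count = 200           # images in the batch

# Overall speedup of GPU mode over CPU mode.
speedup = cpu_seconds / gpu_seconds

# Average wall-clock time per image in each mode.
cpu_per_image = cpu_seconds / image_count
gpu_per_image = gpu_seconds / image_count

print(f"speedup: {speedup:.1f}x")
print(f"CPU: {cpu_per_image:.2f} s/image, GPU: {gpu_per_image:.3f} s/image")
# prints:
# speedup: 7.4x
# CPU: 0.70 s/image, GPU: 0.095 s/image
```

These numbers apply to this one workload on this one machine; other hardware and settings may give a different ratio.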