$ ./meanfilter_test --width=34
Mean Filtering Tester
=====================
CPU Version (Xeon W5580 @ 3.20GHz)
Filtering complete in 3.69872e+06 ms
Writing output file: cpublur.out
$ ./meanfilter_test_cuda --width=34
Mean Filtering Tester
=====================
Acquiring CUDA device
Using default device
CUDA device: Tesla C1060
Filtering complete in 136.063 ms
Writing output file: gpublur.out
$ vol_diff cpublur.out gpublur.out
No difference
$
Quote: while his CPU takes a long time to render a specific blur function
I take offense at that statement! ;-)
Quote: Have you re-worked the bad implementation of the algorithm for the CPU as well to see how much improvement you'd get there?
No, although I have pointed out the problem in the bug tracking database (such as it is). Part of the problem is the arr[j][k] semantics which pervade the CPU code. For the rest... it's C code which is trying to be templated, which means that there are switch statements all over the place. I generally prefer to avoid touching that without a very long pole.
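For anyone wondering why arr[j][k] semantics would be a problem: the post doesn't spell it out, so this is only one plausible reading. If arr is a pointer-to-pointer array with each row allocated separately, every access costs an extra pointer load and the data is not one contiguous block that can be copied or handed to a GPU in a single transfer. A minimal sketch of the usual alternative, a flat row-major buffer, is below. The names Image2D, idx, image_alloc, image_free and image_from_rows are hypothetical and not taken from the codebase under discussion.

```c
#include <stdlib.h>

/* Hypothetical flat 2D array: one contiguous allocation, row-major. */
typedef struct {
    size_t rows, cols;
    float *data;
} Image2D;

/* Flat equivalent of arr[j][k]. */
static size_t idx(const Image2D *img, size_t j, size_t k) {
    return j * img->cols + k;
}

/* Allocate zero-initialised storage; returns 0 on success, -1 on failure. */
int image_alloc(Image2D *img, size_t rows, size_t cols) {
    img->rows = rows;
    img->cols = cols;
    img->data = calloc(rows * cols, sizeof *img->data);
    return img->data ? 0 : -1;
}

void image_free(Image2D *img) {
    free(img->data);
    img->data = NULL;
}

/* Copy a pointer-to-pointer array (arr[j][k] style) into the flat buffer. */
void image_from_rows(Image2D *img, float **arr) {
    for (size_t j = 0; j < img->rows; ++j)
        for (size_t k = 0; k < img->cols; ++k)
            img->data[idx(img, j, k)] = arr[j][k];
}
```

With this layout, element (j, k) lives at data[j * cols + k], and the whole array can be moved with a single copy of rows * cols floats.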
John Willoughby
Who writes this stuff?
Quote: I don't favor overloaded operators for anything but very obvious usages
But how are you going to confuse people if you do that? I'd like to submit another little gem from the codebase I'm working on right now:
/* compute the triple scalar product v1 x v2 . v3 */
float VectorTripleProduct( const VECTOR *v1, const VECTOR *v2, const VECTOR *v3)
Yes folks. The routine "VectorTripleProduct" does not, in fact, calculate the Vector Triple Product, but the Scalar Triple Product. Just a slight difference there....
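To make the difference concrete, here is a small sketch of both products side by side. The scalar triple product (v1 x v2) . v3 returns a single number, while the vector triple product v1 x (v2 x v3) returns a vector. The VECTOR layout and the helpers dot and cross are assumptions for illustration only; the real definitions in that codebase aren't shown in the post.

```c
/* Hypothetical stand-in for the codebase's VECTOR type (layout assumed). */
typedef struct { float x, y, z; } VECTOR;

/* Dot product a . b */
static float dot(const VECTOR *a, const VECTOR *b) {
    return a->x * b->x + a->y * b->y + a->z * b->z;
}

/* Cross product a x b */
static VECTOR cross(const VECTOR *a, const VECTOR *b) {
    VECTOR r = { a->y * b->z - a->z * b->y,
                 a->z * b->x - a->x * b->z,
                 a->x * b->y - a->y * b->x };
    return r;
}

/* Scalar triple product (v1 x v2) . v3: the result is a scalar. */
float ScalarTripleProduct(const VECTOR *v1, const VECTOR *v2, const VECTOR *v3) {
    VECTOR c = cross(v1, v2);
    return dot(&c, v3);
}

/* Vector triple product v1 x (v2 x v3): the result is a vector. */
VECTOR VectorTripleProduct(const VECTOR *v1, const VECTOR *v2, const VECTOR *v3) {
    VECTOR c = cross(v2, v3);
    return cross(v1, &c);
}
```

The return types alone give the game away: a routine that returns float cannot be computing a vector triple product.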
Hi,
Thanks a ton for your feedback and the great catch with the time lag. We have made the change. Please take a look and let us know if we're still not on track.
Thanks and regards,
Aarti Jesrani
Director - Content & Communications
burrp!
http://www.burrp.com/
