CUDA: the Signal in the Noise

NumPy: the old new thing. Once you've learned it, PyTorch, TensorFlow, and JAX come easily. They're blazingly fast, and they differentiate automatically. But CUDA C++ is still the performance ceiling.
Where's the sweet spot in the trade-off between application performance and development time?

The answer depends on how much time CUDA C++ will save per run compared to other frameworks, and on how often the application runs in deployment. Python-based applications have short development times but poor runtime performance, especially compared to C++.

In general, you should only rewrite a working program in a faster environment if the total time saved across all future runs exceeds the time it takes to do the rewrite. So if a rewrite saves only a few seconds per execution, but the program runs hundreds of times a day, a week's work can pay for itself. However, if the program runs infrequently and the per-run savings are small, don't rewrite.
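That break-even rule can be sketched as a quick calculation. The numbers below (a 40-hour rewrite, 5 seconds saved per run, 500 runs a day, a one-year horizon) are hypothetical, chosen only to illustrate the arithmetic:

```python
def rewrite_pays_off(rewrite_hours, seconds_saved_per_run, runs_per_day, horizon_days):
    """Return True if total runtime saved over the horizon exceeds the rewrite cost."""
    saved_hours = seconds_saved_per_run * runs_per_day * horizon_days / 3600
    return saved_hours > rewrite_hours

# A week's work (40 hours) vs. 5 seconds saved, 500 runs/day, for a year:
print(rewrite_pays_off(40, 5, 500, 365))  # True -- about 253 hours saved
# The same rewrite for a program that runs twice a day:
print(rewrite_pays_off(40, 5, 2, 365))    # False -- about 1 hour saved
```

The same formula works in reverse: given a run frequency, it tells you the maximum rewrite time worth spending.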

Relevant xkcd: Is It Worth the Time?