Many of today’s digital signal processing (DSP) applications are subject to real-time constraints, and many eventually grow to the point where they stress the available CPU and memory resources. It can feel like trying to fit ten pounds of algorithms into a five-pound sack. Understanding the architecture of the DSP, as well as the compiler, can speed up an application, sometimes by an order of magnitude. This article summarizes some of the techniques used in practice to achieve such speedups on high-performance DSPs.