Many of today’s DSP applications are subject to real-time constraints. This paper discusses techniques used in practice to optimize DSP applications for performance, memory, and power. Topics include using direct memory access (DMA) to improve performance, managing on-chip processor memory efficiently, mapping the application efficiently to the DSP architecture, loop unrolling, software pipelining fundamentals, a top-ten list of optimization techniques for DSPs, efficient use of the task stack, and the choice between a higher-order language and assembly language.