One of the key features of a high-performance embedded processor is a multistage instruction pipeline. As processor complexity increases, software scheduling and optimization become indispensable. Developing optimized code for pipelined processors requires a thorough understanding of the processor’s pipeline structure and the latencies associated with each instruction. This paper discusses various optimization techniques and
software scheduling schemes and illustrates them with examples on the TMS320C5510 DSP, ST Microelectronics’ ST10 microcontroller, and the LSI402ZX superscalar ZSP processor. It also suggests software development strategies that exploit the instruction pipeline to reduce processor idle time and maximize throughput per
instruction cycle.