Accurate measurement of processor utilization can be critical to the successful development of a complex real-time network application. A third-party tool for measuring software performance may not be available in some cases, for instance when no RTOS is running on the target system. One example of an application for which such a tool (often called a profiler) may not be available is an application running on a communication processor or a network processor. Such a processor may include one or several hardware-enabled packet processing engines. These engines run either a very thin OS or no OS at all, leaving full control to the user program.

This paper focuses on the case of a network packet processing application. The job of a packet processing application is to handle data packets arriving from one or several input interfaces. It is important for such an application to process the incoming data, on average, as fast as it arrives, thus sustaining the line rate. The application also needs some flexibility to handle occasional bursts of network traffic without dropping packets or, even worse, crashing.
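The line-rate requirement can be restated as a per-packet cycle budget: the processor clock rate divided by the packet arrival rate. The following is a minimal sketch of that arithmetic in C, not a method taken from this paper. It assumes an Ethernet link with 20 bytes of per-frame wire overhead (preamble plus inter-frame gap), and the function names `line_rate_pps`, `cycle_budget_per_packet`, and `sustains_line_rate` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Ethernet framing, where each frame occupies an extra
 * 8 bytes of preamble/SFD and a 12-byte inter-frame gap on the wire. */
#define WIRE_OVERHEAD_BYTES 20u

/* Packets per second arriving on a fully loaded link (hypothetical helper). */
static uint64_t line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
    assert(frame_bytes > 0);
    uint64_t bits_on_wire = ((uint64_t)frame_bytes + WIRE_OVERHEAD_BYTES) * 8u;
    return link_bps / bits_on_wire;
}

/* Average number of processor cycles available per packet at line rate. */
static uint64_t cycle_budget_per_packet(uint64_t cpu_hz, uint64_t link_bps,
                                        uint32_t frame_bytes)
{
    uint64_t pps = line_rate_pps(link_bps, frame_bytes);
    assert(pps > 0);
    return cpu_hz / pps;
}

/* Returns 1 if the measured average per-packet cost fits within the budget. */
static int sustains_line_rate(uint64_t measured_avg_cycles,
                              uint64_t budget_cycles)
{
    return measured_avg_cycles <= budget_cycles;
}
```

Under these assumptions, a 1 GHz engine serving a 1 Gbit/s link with minimum-size 64-byte frames has a budget of about 672 cycles per packet. A measured average cost above that figure means the application cannot keep up, and the margin below it is what remains for absorbing bursts.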