The goal of this paper is to describe a simple approach to structuring the software of complex DSP-based system-on-chip architectures, with special focus on scheduling and interprocessor communication. Real-time constraints and limited memory resources often require a customized, optimized platform rather than a general-purpose operating system. This paper is addressed to embedded software system designers and embedded software designers dealing with systems-on-chip for real-time, high-speed applications. The topics covered are: 1) how to structure the system; 2) how to deploy the functions in order to fulfil the requirements; 3) how to build the interface between application and platform software; 4) how to build prioritized task scheduling and fast intertask communication. The work will be presented with the aid of detailed examples taken from a tested, fully working system-on-chip with DSPs, memories and peripherals.