Multithreaded architectures exploit explicit parallelism to extract more throughput from a single processor. Embedded SoC designs can use that extra throughput for greater area efficiency or for better real-time responsiveness. Although the programming models of multithreaded and multiprocessor systems are fundamentally similar, the optimal choice among multithreaded CPUs, multiple independent cores, and multiple multithreaded cores is application-dependent. This class explores the trade-offs and discusses considerations in selecting the best solution, covering symmetric and distributed-memory multiprocessing as well as different approaches to multithreading.