Getting software up and running on a multi-core processor is, in many cases, fairly easy. The real challenge is getting the software to make full use of all the processor’s cores. This paper provides examples of multi-core optimization techniques and discusses how developers can use visualization tools to characterize multi-core behavior and measure performance improvements. It explores how developers can use threading models to split work into multiple concurrent tasks that run in parallel, and how to minimize lock contention by choosing an appropriate level of mutex lock granularity.
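To make the idea of lock granularity concrete, the following is a minimal sketch and is not taken from the paper. It assumes a shared table of counters updated by several threads. The names `StripedCounter` and `run_workers` are hypothetical, chosen for illustration. Instead of one global mutex guarding the whole table, the table is split into stripes, each with its own mutex, so threads touching different stripes do not contend for the same lock. In CPython the global interpreter lock limits true parallelism, so the sketch shows the locking structure rather than a measured speedup; the same pattern applies to mutexes in a natively threaded language.

```python
import threading


class StripedCounter:
    """Hypothetical counter table guarded by one mutex per stripe
    instead of a single global lock (finer lock granularity)."""

    def __init__(self, num_stripes=8):
        # One lock and one dictionary per stripe.
        self._locks = [threading.Lock() for _ in range(num_stripes)]
        self._tables = [{} for _ in range(num_stripes)]

    def _stripe(self, key):
        # Map a key to a stripe index; keys in different stripes
        # never compete for the same mutex.
        return hash(key) % len(self._locks)

    def increment(self, key, amount=1):
        i = self._stripe(key)
        with self._locks[i]:  # hold only this stripe's lock
            table = self._tables[i]
            table[key] = table.get(key, 0) + amount

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._tables[i].get(key, 0)


def run_workers(counter, keys, num_threads=4, increments_per_key=1000):
    """Start num_threads threads that each increment every key
    increments_per_key times, then wait for all of them to finish."""

    def worker():
        for _ in range(increments_per_key):
            for key in keys:
                counter.increment(key)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The number of stripes is the tuning knob. One stripe behaves like a single coarse lock, while more stripes reduce contention at the cost of more mutexes to manage. The right setting depends on the workload and is best confirmed by measurement.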