FPGA embedded processors offer many advantages, but some designers have found their performance disappointing. This paper investigates the performance and cost trade-offs of several FPGA embedded processor systems: a minimal system running from internal FPGA memory, a system running from external memory, a system caching external memory, and a system with many peripherals. Several case studies examine the effects of various embedded processor memory strategies and peripheral sets. By comparing the benchmark system with a real-world system, the paper shows techniques for optimizing performance in an FPGA embedded processor system.