Methodologies for synthesizing stand-alone hardware modules from C/C++-based languages have been gaining adoption in embedded system design as an essential means of staying ahead of increasing performance, complexity, and time-to-market demands. However, using C only to generate stand-alone blocks does not truly unify embedded software and hardware development flows. This paper describes a methodology for generating hardware accelerator modules that are tightly coupled with a soft RISC CPU, its tool chain, and its memory system.

This coupling enables several significant advances: a unified development environment with true pushbutton switching between the original software and hardware-accelerated implementations; direct access to memory from the accelerator module; full support for pointers and arrays; and latency-aware pipelining of memory transactions. The paper also presents results from our implementation, the C2H Compiler. Eight user test cases drawn from common embedded applications show speedup factors of 13X to 73X, achieved in under a few days.
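
To illustrate the kind of source such a flow targets, the following is a minimal sketch of an ordinary C function that reaches memory through pointers and array indexing. The function name `dot_product`, its signature, and the data types are hypothetical and chosen only for illustration; they are not taken from the test cases, and no tool-specific annotations are shown.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical acceleration candidate (illustrative only).
 * Multiply-accumulate over two arrays accessed through pointers.
 * Each iteration issues two independent memory reads, which is the
 * sort of access pattern that pipelined memory transactions can overlap.
 */
int32_t dot_product(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen before multiplying so the product does not overflow int16_t. */
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

Because the function is plain C, the same source can run unchanged on the CPU, which is what makes switching between the software and accelerated versions a build-time choice rather than a rewrite.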