A good bootloader is an essential component of any embedded system, and design deficiencies in it can make the system error-prone or slow to boot. This paper covers bootloader design, techniques for achieving fast boot times, and efficient coding practices. It also covers different bootloader architectures; cache fine-tuning; data structure alignment and padding for efficient burst access on cache-line boundaries; techniques for setting up the C environment; uploading firmware to Flash; debugging techniques during initial bootloader development; and endianness issues.