How to Choose – Embedded Computing


In most embedded projects, there needs to be some kind of intelligence – something to move data around and make some simple (or not so simple) decisions.

How do you choose what to use? If you just go on Digikey and search for microcontrollers/microprocessors, you’ll get a few thousand options, from a few dozen different families. This is not like the desktop world, where you can only choose between AMD and Intel, and the only difference between them is one is 5% faster and the other is 5% cheaper.

In the embedded world, you have everything from $1 4-bit microcontrollers that can blink a few LEDs, to $50 digital signal processors that can do real-time MPEG-4 decoding, to $2000 FPGAs that can do something like a 1024-element bubble sort in one clock cycle. That’s what makes embedded development so interesting. But how do we choose the most suitable option for a particular application?

In this post, I’ll attempt to compare the available technologies and parts, and give advice on when to use them. This is obviously based on my personal experience, and I’ve only used a handful of devices, so it will definitely be biased and have a lot of omissions. I will, though, attempt to cover the most popular devices, most of which I have researched or used at one time or another.

(Roughly) From low end to high end –

1. 8-bit Microcontrollers
Everyone should be familiar with these guys. They are cheap, simple, and usually fairly low-power 8-bit processors.

What does it mean that a processor is n-bit?

If you ask 10 people that question, you’ll get 10 different answers. IMNSHO, a processor is n-bit if it works most naturally with data that is n-bit wide.

Usually, what that means is, the data bus will be n-bit, all general purpose registers will be n-bit, and all instructions work on n bits at a time. None of these things are set in stone, and there are exceptions to every single item above. For example, many 8-bit processors can form 16-bit addresses from a pair of registers in some kind of “slow” mode, because 256 bytes of memory/addressing space is usually not enough. Most processors can also work efficiently with smaller data types. For example, most 32-bit processors can do 16-bit arithmetic just as fast as 32-bit arithmetic, but there are some exceptions. Some processors can only access memory in n-bit blocks and/or at n-bit-aligned addresses, and require masking and shifting to do calculations with smaller data types.
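To make that last point concrete, here is an illustrative sketch in plain C (little-endian layout assumed; in reality the compiler emits the equivalent machine instructions for you) of what a single byte read costs on a core that can only do aligned 32-bit loads:

```c
#include <stdint.h>

/* Illustrative only: fetch one byte on a core that can only perform
   aligned 32-bit loads. Memory is modeled as an array of 32-bit words,
   and little-endian byte order is assumed. */
uint8_t load_byte(const uint32_t *memory, uint32_t byte_addr)
{
    uint32_t word  = memory[byte_addr / 4];    /* the one allowed access: an aligned 32-bit load */
    uint32_t shift = (byte_addr % 4) * 8;      /* bit position of the wanted byte */
    return (uint8_t)((word >> shift) & 0xFFu); /* shift and mask it out */
}
```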

In C, the “int” data type is supposed to reflect the most natural size for the processor, which should be the processor bit width, but there are exceptions here too. For example, most compilers use 16-bit ints on 8-bit processors (the C standard requires int to be at least 16 bits), and compilers for x86-64 usually keep int at 32 bits for backward compatibility.

It doesn’t mean you can’t use 32-bit integers on 8-bit processors. It only means the processor doesn’t have instructions to work on them directly, so the compiler has to generate instructions to “emulate” them. That will be significantly slower than using data types smaller than or equal to the processor bit width. This is especially true for multiplications and divisions, which can take hundreds of cycles, and should be avoided as much as possible (use char or int8_t instead of int on 8-bit processors!!).
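As a quick illustration (a made-up snippet, not specific to any particular part), here is the kind of place where picking the right width pays off on an 8-bit MCU:

```c
#include <stdint.h>

uint8_t samples[100];   /* e.g. raw 8-bit ADC readings */

/* The sum can exceed 255, so the accumulator has to be 16-bit, but the
   loop counter fits in 8 bits. On an 8-bit core, a uint8_t index is a
   single-register counter; an int index would be emulated with a pair
   of registers and extra instructions on every iteration. */
uint16_t sum_samples(void)
{
    uint16_t sum = 0;
    for (uint8_t i = 0; i < 100; i++) {
        sum += samples[i];
    }
    return sum;
}
```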

Examples: Atmel AVR, Microchip PIC, 8051. They are all very popular. The 8051 is nice because it’s patent-free (it’s an ancient architecture), so there are many manufacturers making 8051-compatible microcontrollers. PICs are very popular because Microchip used to give out free samples generously to poor students, so many people grew up using them. AVRs are my personal favourite, because of the available high quality toolchains (avr-gcc and avr-gdb) with good Linux support, though the choice of devices is relatively limited. Performance-wise they are all pretty similar, usually running at 10-20MHz.

Advantages: easy to use, breadboard-compatible (most higher-end devices are not available in through-hole packages), and popular, so it’s easy to find help. They are also low power (good for battery life on devices that need to run for years).

Arduino is probably THE most widely known starter kit/dev board for this segment. It’s based on Atmel AVR, but with many convenience functions, etc, so that hobbyists and artists (Arduino was actually designed for artists to blink LEDs, etc) can get started easily. If you are a professional engineer, or on your way to becoming a professional engineer, I would recommend switching away from Arduino as soon as possible.

8-bit CPUs are also popular as soft cores to be synthesized onto FPGAs because they are usually tiny due to the narrow datapaths, and also because intensive calculations can sometimes be implemented in hardware on the FPGA. However, 8-bit memory buses may severely limit data transfer rate to the rest of the FPGA fabric.

There are many open source HDL implementations of 8-bit CPUs (mostly AVR and 8051) available on opencores.org for use as soft cores. Commercial ones are also available – Xilinx’s PicoBlaze and Lattice’s LatticeMico8.

Typical applications: LED animations, simple sensors/actuators, character LCDs, simple toys (RC cars, etc), very simple robots (no complex sensors, and definitely no vision or audio).

2. 16-bit Microcontrollers
I don’t really see much point in 16-bit processors: they give up most of the 8-bit advantages, yet are usually weaker than 32-bit microcontrollers. They aren’t very popular, probably for this reason.

Microchip has 16-bit PICs (the dsPIC line). I don’t know much about them.

The TI MSP430 is the only part I know worth mentioning. It’s a relatively new series from Texas Instruments, designed for low cost and ultra-low power. It has a 0.1uA sleep mode and draws about 4mA at 25MHz in active mode. It may be worth a look for very low-power designs (something that needs to run for years on a coin cell).
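To give a feel for how that power budget gets used, here is a minimal sketch of the classic sleep-mostly pattern (assuming an MSP430G2-class part with a 32kHz ACLK and TI’s compiler/header conventions – register names and interrupt syntax vary between MSP430 families and toolchains):

```c
#include <msp430.h>

int main(void)
{
    WDTCTL = WDTPW;             /* placeholder write; real config is next line */
    WDTCTL = WDT_ADLY_1000;     /* watchdog as a ~1s interval timer off ACLK */
    IE1 |= WDTIE;               /* enable the watchdog interval interrupt */
    P1DIR |= BIT0;              /* P1.0 as output (LED on many dev boards) */

    for (;;) {
        __bis_SR_register(LPM3_bits + GIE); /* sleep in LPM3 until an interrupt */
        P1OUT ^= BIT0;                      /* woken once a second: toggle LED */
    }
}

#pragma vector = WDT_VECTOR
__interrupt void wdt_isr(void)
{
    __bic_SR_register_on_exit(LPM3_bits);   /* return main() to active mode */
}
```

The CPU only runs for the handful of cycles it takes to toggle the pin; the rest of the time only the low-frequency clock and the watchdog counter are alive.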

3. 32-bit Microcontrollers/Microprocessors
This is a relatively new category that’s very quickly becoming popular. 32-bit microprocessors have been around for quite a while, but they have only been made into low power microcontrollers in the last decade or so.

If your application requires manipulating bigger numbers, 32-bit processors are much more efficient. They are also typically clocked much faster than 8-bit or 16-bit processors. There is no technical reason why someone couldn’t make a 100MHz 8-bit CPU, but no one does, because applications requiring that kind of performance would probably just use 32-bit processors instead (they usually need higher memory bandwidth, if nothing else). So manufacturers use older plants on older processes to produce 8-bit microcontrollers cheaply, and newer ones to make high-performance chips.

32-bit CPUs also have another nice advantage – a flat memory model. On smaller CPUs, it’s often necessary to have banked or segmented memory, due to the limited addressing space. A 32-bit addressing space is 4GB (for byte addressing), which is far more than embedded applications will need for a very long time, even with memory-mapped devices taking up some of the space. As a result, all 32-bit architectures I know of have a flat memory model, which is very nice, especially if you are programming in C/C++. Coupled with high memory bandwidth, this makes them very suitable for processing large amounts of data.
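As an illustration of what a flat memory map buys you (a hypothetical sketch – the address and register name below are made up), a memory-mapped peripheral register is just a pointer, with no bank switching or segment juggling:

```c
#include <stdint.h>

/* Hypothetical UART data register at a made-up address. Flash, RAM and
   peripherals all live in the same 32-bit address space. */
#define UART_DATA (*(volatile uint32_t *)0x40001000u)

void uart_send(uint8_t byte)
{
    UART_DATA = byte;   /* an ordinary store; no banking required */
}
```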

There are quite a few 32-bit microcontroller series – PIC32, AVR32, Renesas, etc, but ARM has the biggest market share by far.

Unlike other manufacturers, ARM doesn’t actually make chips themselves. They develop the CPU core and license it to manufacturers, who add peripherals and make the actual chips. I really like this system because it means we can learn one toolchain and one core architecture, and be able to use dozens of parts out there with different specializations (ADC, clock speed, etc). For going beyond prototyping into production, this system also affords some risk mitigation – if a chip becomes obsolete or too expensive, it won’t be difficult to find a near-compatible replacement, with minimal code rewriting required.

Another common 32-bit CPU in this segment is the Analog Devices Blackfin. It’s a very high performance (~500MHz) DSP, but I don’t like it because the toolchain is not very friendly ($$), and it also requires a lot more board design work, since it’s just a processor without peripherals, memory, etc. However, the Blackfin is EXTREMELY popular in video encoding/decoding and other multimedia applications.

As soft cores, unfortunately, 32-bit CPUs become less attractive because of their LUT (FPGA area) usage. There are a few popular open source ones, like the SPARC-compatible LEON3/4 developed by the European Space Agency, the OpenRISC processor, and the ZPU. I personally like the ZPU best, because the base configuration is tiny, and it’s very configurable (from small with most instructions emulated, to big and fast). Unfortunately, ARM is not freely available as a soft core. There is the ARM Cortex-M1, designed for FPGAs, but it costs a lot of $$.

There are also a few popular commercial ones, like MicroBlaze from Xilinx, Nios II from Altera, and LatticeMico32 from Lattice. They are highly efficient and convenient to use, because they were designed by the respective FPGA vendors, who know their own chips best. The downside is that they cost $$ and are tied to their respective FPGA families – with LatticeMico32 being the exception on both counts.

Typical applications: Simple image processing, complex robots with a lot of sensors, audio player/recorder, gadgets with bitmap LCDs.

Price-wise, they are still more expensive than 8-bit microcontrollers, but the gap is closing fast. For example, an STM32F407 (168MHz, 192KB RAM) goes for $12 on Digikey, whereas an Atmel ATmega328 (20MHz, 2KB RAM) goes for $3. The price difference can make or break a mass-produced product, but for personal projects and other not-very-price-sensitive projects, it’s a lot of extra power for the money. Personally, I have more or less switched to the STM32 as my “default” MCU of choice in personal projects, because the roughly $10 difference is insignificant in most of my projects, and the extra speed, features (timers, high speed ADCs, more interfacing options), and memory give much greater flexibility in designs.

4. Single Board Computers

They are essentially PCs on a small PCB.

Gumstix, Raspberry Pi, and BeagleBoard are the most popular options nowadays.

They are very powerful and reasonably cheap (especially the RPi), and they run Linux. Developing for them is just like developing for Linux on a PC.

Downsides are the long boot process (no simply plugging and unplugging power), harder interfacing with other hardware (I2C, SPI, etc – everything goes through kernel drivers rather than direct register access, as sketched below), and bulk. So I try not to use them in embedded projects.
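For example, here is a hedged sketch of reading two bytes from an I2C sensor through Linux’s standard i2c-dev interface (the bus number and the 0x48 device address are assumptions that vary by board and part):

```c
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

int main(void)
{
    int fd = open("/dev/i2c-1", O_RDWR);    /* bus number varies by board */
    if (fd < 0) { perror("open"); return 1; }

    if (ioctl(fd, I2C_SLAVE, 0x48) < 0) {   /* select the device address */
        perror("ioctl"); return 1;
    }

    uint8_t reg = 0x00;                     /* register index (made up) */
    uint8_t buf[2];
    if (write(fd, &reg, 1) != 1 || read(fd, buf, 2) != 2) {
        perror("i2c transfer"); return 1;
    }
    printf("raw reading: 0x%02X%02X\n", buf[0], buf[1]);

    close(fd);
    return 0;
}
```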

They are cool to play with, but I haven’t yet found a use case for them in an embedded project where they do a better job than the alternatives.

5. FPGAs

FPGAs are fundamentally different from everything above, so this comparison may not be very fair. However, in many embedded projects, FPGAs do compete with high end microcontrollers, so this section is included for completeness.

Of course, on FPGAs we wouldn’t be writing software anymore (unless a soft core is used). We would still be writing code, but in hardware description languages that, well, describe hardware.

In general, people go for FPGAs to get higher computational power. Specialized hardware (on an FPGA) can often outperform general purpose processors even at a much lower clock speed (which also means lower power consumption), because of its parallel nature. Unfortunately, it usually involves more design effort, and the performance gain varies extremely widely from case to case. Some problems are extremely parallelizable (like image/video processing, encryption, signal processing, and graphics), while some are hopeless (sorting, tree searching, etc). Most are somewhere between the two extremes.

In most designs, one or more soft cores are synthesized on the FPGA to handle non-speed-critical tasks like user interfaces, because it’s much easier to write software than hardware for that kind of thing.

Low end FPGAs are fairly cheap at around $20, while the expensive ones go well into the tens of thousands of dollars.

For some applications, like real time video encoding, FPGAs are more or less the only choice. It’s simply impossible to get the required performance from general purpose embedded processors (which usually run without a heatsink).

For applications that require very fast and deterministic data transfer (like reading from a CMOS image sensor), FPGAs are also a lot more suitable, though some high end microcontrollers with dedicated hardware interfaces can do it through DMA as well.

For hobbyist projects, Xilinx’s Spartan-3 and Spartan-6 and Altera’s Cyclone series are all relatively accessible (fairly cheap, available in TQFP packages). FPGA development boards tend to be very expensive for some reason, but it’s not all that difficult to design custom boards for them. They do usually require 4-layer boards because of the multiple supply rails, but otherwise they are not too bad.