I have to admit: until recently, udev completely intimidated me. I needed udev rules for changing device permissions or creating symlinks with special names, but usually I had to use Google to find somebody else’s rule. Sometimes a rule I found wouldn’t work because the udev syntax changed–for example, you need to use ATTR or ATTRS instead of SYSFS with recent versions. It still scares me a little bit, but at least I feel like I actually understand what’s going on with it now. I’d like to share a few tips and tricks I’ve picked up about making udev rules.

First of all, the man page for udev is extremely useful. I know man pages can be boring, but this one has just about everything you need to know, and it’s not too long.

Where udev rules are stored

There are two directories where you can find udev rules:

  • /etc/udev/rules.d
  • /lib/udev/rules.d

The rules in /lib/udev/rules.d are bundled rules that you probably shouldn’t be messing with. Instead, put your custom rules in /etc/udev/rules.d. You’ll notice they are named in the form “number-description.rules”. The number is used for ordering the rules, so you can pick an order that makes sense for your needs. It’s a good idea to use that same form so that you know the order in which the rules will be parsed. This can definitely matter: I have had trouble creating symlinks for USB serial devices when the number was too small, probably because some bundled rule overrode something my rule did.

If you want to override a bundled rule file that’s in /lib/udev/rules.d, you should create a file with the same name in /etc/udev/rules.d. This will cause the new file in /etc/udev/rules.d to be used instead of the file in /lib/udev/rules.d.
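
To make the override behavior concrete, here is a minimal sketch that prints which copy of each rule file would be used when the same name exists in both directories. The function name effective_rules and its two directory arguments are my own invention for illustration, not part of udev, and the loop assumes file names without spaces.

```shell
#!/bin/sh
# Sketch: for every *.rules name found in either directory, print the
# copy that takes priority. effective_rules is a hypothetical helper.
effective_rules() {
    etc_dir="$1"   # custom rules, e.g. /etc/udev/rules.d (wins on a name clash)
    lib_dir="$2"   # bundled rules, e.g. /lib/udev/rules.d
    # Merge the file names from both directories into one sorted list,
    # mirroring the number-prefix ordering described above.
    for name in $( { ls "$etc_dir"; ls "$lib_dir"; } 2>/dev/null \
                   | grep '\.rules$' | LC_ALL=C sort -u ); do
        if [ -e "$etc_dir/$name" ]; then
            echo "$etc_dir/$name"
        else
            echo "$lib_dir/$name"
        fi
    done
}

# Example: effective_rules /etc/udev/rules.d /lib/udev/rules.d
```

This only models the two directories mentioned in this post; it is a way to reason about ordering, not a replacement for checking what udev actually loads.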

The format of udev rules

udev rules are basically a comma-separated list of things–conditions and assignments. If all of the conditions in a rule are true, the rule is a match and all of the assignments are performed. Similar to many programming languages, a single equal sign (=) represents an assignment, while two equal signs (==) represent a comparison. != represents a “not equals” comparison, += adds a value to a list, and := assigns a value for the final time, disallowing any higher-numbered rules from modifying whatever you assigned. Every condition or assignment is a key-value pair separated by one of the operators listed above.

Keys you can match against

This section, I believe, is the most important part to understand. If you don’t understand the intricacies of how these matches work, especially with the keys ending in S, you will probably make mistaken assumptions about how they work (I did!). You can match against these keys:

  • ACTION — what happened to the device that caused udev to be invoked. Common actions checked against are “add” and “remove”.
  • KERNEL — the name of the device as given by the kernel
  • KERNELS — the name of the device or a parent device as given by the kernel
  • SUBSYSTEM — the kernel subsystem of the device (example: tty or usb)
  • SUBSYSTEMS — the subsystem of the device or a parent device
  • DRIVER — the name of the device’s driver
  • DRIVERS — the name of the device’s driver or the name of a parent device’s driver
  • ATTR{name} — a sysfs attribute of the device
  • ATTRS{name} — a sysfs attribute of the device or a parent device
  • IMPORTANT NOTE: If you are using any of the keys above that match against a parent device (KERNELS, SUBSYSTEMS, DRIVERS, ATTRS) in a rule, the parent device you’re matching against must be the same. For example, you can’t search for ATTRS of two different parent devices in the same rule. You also can’t search for the ATTRS of one parent device and the SUBSYSTEMS of a different parent device in the same rule. You can work around this limitation by creating multiple rules and using GOTO, but you can’t do it all in the same rule. This is actually a good thing because it allows you to do things like make sure you’re looking at a USB serial number instead of a PCI serial number for example.

This is very much an incomplete list, but these are some of the more important ones.

How to find values to match against

The info above is great, but it doesn’t really help much unless you know which driver, subsystem, and event attribute you’re trying to match against. That’s where a really handy command comes into play:

udevadm info --attribute-walk --name=ttyUSB0

ttyUSB0 is an example device name in this case: a USB serial port. This attribute walk command will find all of the keys/attributes for a particular device and all of its parent devices. If you run this command on a USB serial port, you will notice it finds the tty device at the top of the output, which is actually fairly boring as far as matching goes. Then as you scroll down, you will see udevadm walk all the way up the tree of parent devices. You will see it reach the actual USB device that provides the serial port, the USB hub it’s connected to, the USB host controller the hub is connected to, and then probably a PCI controller or something like that depending on your host system.

Here is some example output from a PL-2303-based USB to serial converter I have connected to a VMware virtual machine:

looking at device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2/2-2.1/2-2.1:1.0/ttyUSB0/tty/ttyUSB0':

looking at parent device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2/2-2.1/2-2.1:1.0/ttyUSB0':

looking at parent device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2/2-2.1/2-2.1:1.0':
 ATTRS{bAlternateSetting}==" 0"

looking at parent device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2/2-2.1':
 ATTRS{bNumInterfaces}==" 1"
 ATTRS{version}==" 1.10"
 ATTRS{manufacturer}=="Prolific Technology Inc."
 ATTRS{product}=="USB-Serial Controller"

looking at parent device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2':
 ATTRS{bNumInterfaces}==" 1"
 ATTRS{configuration}=="VMware Virtual USB Hub"
 ATTRS{version}==" 1.10"
 ATTRS{product}=="VMware Virtual USB Hub"

looking at parent device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2':
 ATTRS{bNumInterfaces}==" 1"
 ATTRS{version}==" 1.10"
 ATTRS{manufacturer}=="Linux 3.11.0-12-generic uhci_hcd"
 ATTRS{product}=="UHCI Host Controller"

looking at parent device '/devices/pci0000:00/0000:00:11.0/0000:02:00.0':

looking at parent device '/devices/pci0000:00/0000:00:11.0':

looking at parent device '/devices/pci0000:00':

Let’s pretend we want to match against this device in order to give the device node in /dev permissions that allow it to be read and written by all users on the computer. Unfortunately, this PL-2303-based serial adapter does not seem to provide a serial number, so we’ll have to match against every PL-2303 device that uses the product and vendor ID that this device uses. (Side note: FTDI USB to serial chipsets such as the FT232RL also include a unique serial number, so you can identify a particular USB to serial converter dongle by its serial number. The PL-2303 chipset, at least in this case, does not seem to provide that capability)

First of all, notice that one of the parent devices (/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2/2-2.1) contains the vendor and product IDs of the USB dongle:

  • ATTRS{idVendor}=="067b"
  • ATTRS{idProduct}=="2303"

That will be how we will identify this particular device. This is not the device we want to change the permissions of though; we want to change the permissions of the device at the top of the output that belongs to the “tty” subsystem. That is the actual character device shown in /dev for the port.
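
The attribute walk output gets long, so a small filter can help when hunting for IDs like these. This is only a sketch: walk_filter is a hypothetical helper name, and the set of attributes it keeps (idVendor, idProduct, serial) is my choice for this example.

```shell
#!/bin/sh
# Sketch: reduce attribute-walk output to the "looking at" headers plus
# the attributes of interest. walk_filter is a hypothetical helper.
walk_filter() {
    # Reads attribute-walk output on stdin and keeps the device headers
    # and any ATTR/ATTRS line for idVendor, idProduct or serial.
    grep -E 'looking at|ATTRS?\{(idVendor|idProduct|serial)\}'
}

# Typical use (assumes ttyUSB0 exists on your machine):
#   udevadm info --attribute-walk --name=ttyUSB0 | walk_filter
```

Keeping the headers matters because it shows which parent device each attribute belongs to, which is what decides whether two ATTRS matches can share a rule.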

Everything above is all of the information we need. Let’s get started creating the rule. I’m going to put it in /etc/udev/rules.d/99-test-usb-dongle.rules. First of all, we will specify that we want to match a device that has a subsystem of tty:

SUBSYSTEM=="tty"

This will ensure we’re changing the permissions of the actual tty device rather than any of the parent USB or PCI devices. Next, we want to match the USB dongle by its USB product and vendor IDs. The USB dongle is represented by several parent devices. In order to match a parent device’s attributes, we will have to use the items that end in S (e.g. ATTRS, KERNELS, SUBSYSTEMS). In this case, we will use the idVendor and idProduct attributes of the device I talked about above:

SUBSYSTEM=="tty", ATTRS{idVendor}=="067b", ATTRS{idProduct}=="2303"

Luckily, both of the ATTRS{} values we need to match belong to the same parent device, which is exactly how it has to work with udev. Now that we’ve matched ATTRS{} contained in the device “/devices/pci0000:00/0000:00:11.0/0000:02:00.0/usb2/2-2/2-2.1”, we can’t match the ATTRS{} or other “ending in S” keys of any other parent device. In this case, we luckily don’t have to worry about that. In other cases, it may be useful to be able to match against keys of multiple parent devices. If you ever run into that kind of situation, you can use LABEL and GOTO to jump to another rule that does another check for other properties of a different parent device. I’m not going to talk about that in this post, but if you look at some of the bundled rules you should be able to figure out how GOTO and LABEL work.

Anyway, back to what we were doing. We have completed the match portion of our rule. All that is left to do is add an assignment to the permissions when the conditions are matched:

SUBSYSTEM=="tty", ATTRS{idVendor}=="067b", ATTRS{idProduct}=="2303", MODE="0666"

This sets the permissions of the tty device to allow reading and writing for the owner, the group, and everyone else. Notice that the assignment uses a single “=” rather than a double “==” like the comparisons used. I think that difference was the biggest stumbling block when I first began looking at udev rules. I was very familiar with the difference between “=” and “==” in C, but I didn’t expect udev to use the same distinction. It turns out that it does.
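
If the octal mode is unfamiliar, this small sketch shows what 0666 looks like on an ordinary scratch file. It does not touch udev at all; mode_string is a hypothetical helper that prints the ten-character permission field from ls.

```shell
#!/bin/sh
# Sketch: apply mode 0666 to a scratch file and print its permission
# field. mode_string is a hypothetical helper name.
mode_string() {
    ls -ld "$1" | cut -c1-10
}

demo=$(mktemp)
chmod 0666 "$demo"
mode_string "$demo"    # prints -rw-rw-rw-
rm -f "$demo"
```

A device node that picked up the rule would show the same rw-rw-rw- bits in ls -l /dev, with a leading c for a character device instead of the dash.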

That’s really all it takes. If you save that file in the location I mentioned earlier (/etc/udev/rules.d/99-test-usb-dongle.rules) and reload the udev rules, the next time you plug a PL-2303 USB dongle in, it should have correct permissions.

To reload the udev rules type the following command:

sudo udevadm control --reload-rules
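
If you would rather not unplug and replug the device, udevadm trigger can replay events for devices that are already present. The sketch below wraps both commands. The DRY_RUN variable and the function names are assumptions made for this example, and whether a replayed add event is enough for your particular rule is something to verify on your own system.

```shell
#!/bin/sh
# Sketch: reload the rules, then replay "add" events for tty devices.
# DRY_RUN, run and reload_and_retrigger are names made up for this example.
run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_and_retrigger() {
    run udevadm control --reload-rules &&
    run udevadm trigger --action=add --subsystem-match=tty
}

# Usage (as root):  reload_and_retrigger
# Preview only:     DRY_RUN=1 reload_and_retrigger
```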

That’s really all there is to it. Now I’d like to do a more complicated example:

More complicated example

Here’s a rule I created for giving certain USB-serial adapters a different name. If a USB-serial adapter with the serial number “12345678” is plugged in, I want a symlink of the form “ttyBlah#” to be created to point to it, where # is an integer. So if I plug three of them in, I want them to be called ttyBlah0, ttyBlah1, and ttyBlah2. This seems straightforward, but there might be other USB-serial adapters plugged in so I can’t just copy the kernel’s device number. For example, let’s pretend five USB-serial adapters are plugged in, named ttyUSB0 through ttyUSB4. ttyUSB0, ttyUSB1, and ttyUSB4 have the serial number “12345678” and ttyUSB2 and ttyUSB3 are other things that don’t use that serial number. I want my symlinks to be called ttyBlah0, ttyBlah1, and ttyBlah2. Notice that I can’t just blindly copy the number over because I want the third one to be called ttyBlah2, not ttyBlah4. It turns out udev can do this with a little bit of help.

I know this idea of having multiple USB devices with the same serial number seems goofy, but I actually did this in the real world. I programmed several FTDI chips to have the same serial number. It’s more convenient than getting a new USB device ID and not knowing if the Linux kernel is going to support it out of the box. The main reason I did it, though, was not Linux-related: it helps prevent Windows from creating a new COM port number each time a device with a different serial number is plugged in. In certain scenarios this behavior is undesirable.

Let’s get to the fun part. Here’s my example rule:

ACTION=="add", KERNEL=="ttyUSB[0-9]*", SUBSYSTEMS=="usb", ATTRS{serial}=="12345678", PROGRAM="/home/doug/symlink-number.sh ttyBlah", SYMLINK+="ttyBlah%c"

This rule makes use of several new ideas. First of all, I have to match a device with a kernel name of ttyUSB followed by a digit and then anything else. udev patterns are shell-style globs rather than regular expressions, so [0-9]* means “one digit, then any characters”. This should match all USB serial ports. A parent device needs to be in the USB subsystem. This same parent device also needs to have a serial number of 12345678. The reason we’re checking for a subsystem of USB is to ensure that we don’t match against a device that has a PCI serial number (for example) of 12345678 instead. It’s probably overkill since ttyUSB devices are only going to be USB devices anyway, but it’s a good example of how the “ending in S” keys work. Anyway, we’re making sure we match against a ttyUSB* device which has a USB serial number of 12345678. Those conditions combined should match my three adapters that have that serial number.

If they match, we have two assignments. The first assignment is to PROGRAM. If you assign a value to PROGRAM, the value is interpreted as a command to run. The command is executed and the program’s output to stdout can be used in various ways in udev. (Strictly speaking, the man page lists PROGRAM among the match keys: the key is true only if the program exits successfully.) This script (which we will see later) will print out the next available number to use for ttyBlah. Note that I passed a single argument to my script: ttyBlah. As you’ll see, I made this script generic so it could be used for numbering various different symlinks.

The second assignment uses the += operator, which as I said earlier adds a value to a list. The SYMLINK variable is a list of names of symlinks to create that will point to the matched device. We use += in case another rule has already set up a symlink for this same port.

You’ll notice that the symlink will be named ttyBlah%c. Several of the assigned values (one of which is SYMLINK) can be given names with printf-style formatters to add extra information. %c will be replaced with the stdout output from the program that was executed from the PROGRAM assignment. Another example of the available printf-style substitutions is %b which is the name of the parent device that was matched with the “ending in S” keys.

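OK, so that’s pretty straightforward. Finally, I’ll give you the contents of my symlink-number.sh script (make sure you chmod +x it!):

#!/bin/sh

# The symlink prefix (e.g. ttyBlah) is passed as the first argument.
SYMPREFIX="$1"

# Print the first index from 0 to 100 that is not already taken in /dev.
for i in `seq 0 100`
do
    if [ ! -e "/dev/${SYMPREFIX}${i}" ]; then
        echo ${i}
        exit 0
    fi
done

echo Unknown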

This is a really crude script, but basically it searches for the next free index between 0 and 100. If for some reason it can’t find a free one, it prints out Unknown. This is actually a pretty lame way to write a script and there are probably better ways to do it. I was just lazy. I’ll never have more than about 5 USB-serial adapters plugged in at a time, so the crude technique doesn’t really bother me.
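
For anyone who wants to try the search without touching /dev, here is a sketch of the same idea with the directory passed in as a parameter. The function name next_free_index and its arguments are made up for this example. It also checks for dangling symlinks, which a plain -e test would treat as free.

```shell
#!/bin/sh
# Sketch: find the first unused index for a prefix inside a directory.
# next_free_index and its arguments are hypothetical names.
next_free_index() {
    dir="$1"      # directory to search, e.g. /dev
    prefix="$2"   # symlink prefix, e.g. ttyBlah
    i=0
    while [ "$i" -le 100 ]; do
        # -e follows symlinks, so also test -L to catch dangling links.
        if [ ! -e "$dir/$prefix$i" ] && [ ! -L "$dir/$prefix$i" ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    echo Unknown
    return 1
}

# Example: next_free_index /dev ttyBlah
```

Note one difference from my script: this version returns a non-zero status when nothing is free, so a caller can tell the difference without parsing the output.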


Don’t be scared by udev. It’s really not that bad. Just sit down, read the very helpful man page, and don’t be afraid to use udevadm’s attribute-walk feature to help you figure out what you need to do.

A project I’ve been working on uses an SVN repository for its version control. I needed a way to automatically insert the SVN revision into the code at compile time so that an “about” window could be automatically updated to display the current revision number. I’d like to share how I did it, both for my own future reference and for anyone else out there.

I got this working on a Linux build host, but I’m not sure exactly how well it will work on Windows. If you have Windows versions of sh, svnversion, and sed, it might be possible. I would assume it should work on Mac OS X just fine too. Let me know how it goes for you below.

Start out with a shell script

Let’s make a shell script that will grab the revision using the svnversion command. This script will go somewhere in the project directory. In my example, it’s going to go in a directory called “scripts” inside the main project directory. The script will be called updateSVNVersion.sh. You should make it executable with “chmod +x”.

#!/bin/sh
# Usage: updateSVNVersion.sh <project directory> <build directory>

SVNVERSIONSTR=$(svnversion "$1" | sed -e "s/.*://")
VERSIONFILE="$2/version.cpp"

echo "/* This file is auto-generated by the build script. Do not modify it. */" > "$VERSIONFILE"
echo "#include \"version.h\"" >> "$VERSIONFILE"
echo "QString SVNVersionString() {" >> "$VERSIONFILE"
echo "    return \"${SVNVERSIONSTR}\";" >> "$VERSIONFILE"
echo "}" >> "$VERSIONFILE"

This script takes two parameters. The first parameter is the main project directory to run the svnversion command on. The second parameter is the build directory, which is where all the object files are stored during compilation. This is also where we will generate a file called version.cpp. The build directory is passed as a parameter so that the script knows where to put the generated file.

The output of svnversion is passed to sed to grab the second half of the version string if there are two halves separated by a colon. This ensures we grab the newer revision number of a mixed-revision checkout (this happens if you commit a file and then don’t update the full checkout, for example). This generated version string may also contain other letters such as M if there are pending changes that haven’t yet been committed.
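
To see what the sed expression does, here is a small sketch that applies it to a few svnversion-style strings. The helper name newest_revision and the revision numbers are made up for illustration.

```shell
#!/bin/sh
# Sketch: apply the same sed expression to sample svnversion-style output.
# newest_revision is a hypothetical helper; the revisions are invented.
newest_revision() {
    # Drop everything up to and including the last colon, if there is one.
    printf '%s\n' "$1" | sed -e "s/.*://"
}

newest_revision "4123:4168"     # mixed-revision checkout, prints 4168
newest_revision "4168M"         # local modifications, prints 4168M
newest_revision "4123:4168MS"   # mixed, modified and switched, prints 4168MS
```

Strings without a colon pass through unchanged, which is why the same script works for both ordinary and mixed-revision checkouts.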

Create version.h in the project directory:

#ifndef VERSION_H
#define VERSION_H

#include <QString>

QString SVNVersionString();

#endif // VERSION_H

All this file does is provide a prototype for the function in the auto-generated version.cpp file. Any code that needs access to the SVN revision will #include "version.h" and call SVNVersionString() to get it.

Now that we have the script, we need to make it automatically run before each build.

Don’t use a Qt Creator pre-build step…

I know someone will suggest just adding a pre-build step in Qt Creator to run a shell script. That’s great, except Qt Creator’s pre-build step is a user setting that isn’t stored with the actual project file. Pre- and post-build steps are stored in a .pro.user file that is used by Qt Creator. qmake knows nothing about it. The project I am working on has multiple developers, and it would be a big pain to make sure that everyone had their .user file set up correctly. Instead…

…do it in qmake/make

Doing it this way will work better.

The thing that’s annoying about setting it up this way is you have to make sure you get the dependencies correct so that version.cpp is recompiled every time you re-run make. I ran into several problems trying to put the SVN revision directly into header files until I finally pieced together this solution which made it easier to force version.cpp to be recompiled every time. No fear: just follow these steps and everything should work fine.

For the rest of these instructions we will be operating on your project’s .pro file. Add the generated version.cpp file to your SOURCES variable:

SOURCES += $$OUT_PWD/version.cpp

Note that I prefixed version.cpp with $$OUT_PWD. OUT_PWD is a qmake variable that points to the build directory. As we’ll see in a bit, there is also a variable PWD that points to the directory containing the current file, which in this case is the .pro file. Anyway, this entry may stick out like a sore thumb next to the rest of the entries in your SOURCES variable, but you have to do it this way because version.cpp is not stored in the same directory as the rest of your source files.

Next, add version.h to your HEADERS variable:

HEADERS += version.h

That one’s not complicated!

Finally, add some rules (again, somewhere in your .pro file) that will cause extra items to be added to the final Makefile to force version.cpp to be auto-generated:

PRE_TARGETDEPS += version.cpp
QMAKE_EXTRA_TARGETS += svnrevision
svnrevision.target = version.cpp
svnrevision.commands = $$PWD/scripts/updateSVNVersion.sh $$PWD $$OUT_PWD
svnrevision.depends = FORCE
QMAKE_DISTCLEAN += $$svnrevision.target

I’m honestly not a huge qmake or make expert, but I’ll explain these to the best of my ability. PRE_TARGETDEPS ensures that version.cpp is added early in the dependencies list, although I’m not 100% sure whether it’s necessary. I left it in place because it seems to work. QMAKE_EXTRA_TARGETS adds another target internally named svnrevision, which we set up in the next three lines. These lines create a Makefile target called version.cpp which is generated with the updateSVNVersion.sh script. Notice how we pass the source and build directories to the script as expected. The dependencies for this target are set to FORCE to force the target to run its command every time. This is a common thing done in Makefiles, and Makefiles generated by qmake do indeed include the FORCE target so it appears to be safe to use.

Basically, it causes this extra target to appear in the generated Makefile:

version.cpp: FORCE
        /path/to/updateSVNVersion.sh /path/to/source_dir /path/to/build_dir

Don’t add these lines to your Makefile; that will happen automatically. This is just an example to show what the end result looks like.

Finally, version.cpp is added to QMAKE_DISTCLEAN so it’s deleted if you run make distclean. Like I said earlier, I’m not an expert at getting qmake or make to work correctly, so I might have added some unnecessary extras. The important part is that this combination works for me!

One last thing: you may get a warning that version.cpp is not present when you run qmake the first time. It’s a harmless warning that occurs because the file doesn’t exist until you run make. Just ignore it and everything will still work fine.

The other day, I was working on a TI AM3517-based Linux device and I needed to open a serial port at a weird baud rate (31250 bps). The AM3517 is one of TI’s Sitara processors, which are very similar to the popular OMAP chips. The AM3517 is definitely physically capable of doing this baud rate — it starts out with a 48 MHz clock, then it oversamples by either 16X or 13X (thus giving you a 3 MHz or 3.692308 MHz base clock), and then you can supply an integer divisor at that point. With the 16X oversampling, a divisor of 96 would give you 48,000,000 / 16 / 96 = 31250 baud. There was no question that the AM3517 could do the job.
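
The divisor arithmetic is easy to check from a shell. This is just a sketch of the calculation described above; uart_baud is a hypothetical helper, not a kernel or TI API, and shell arithmetic truncates to integers.

```shell
#!/bin/sh
# Sketch: integer baud-rate arithmetic for a UART clocked as described.
# uart_baud is a hypothetical helper name.
uart_baud() {
    clock_hz="$1"     # UART input clock in Hz, e.g. 48000000
    oversample="$2"   # oversampling factor, 16 or 13
    divisor="$3"      # integer divisor
    echo $(( clock_hz / oversample / divisor ))
}

uart_baud 48000000 16 1     # base clock with 16X oversampling: 3000000
uart_baud 48000000 16 96    # the rate I needed: 31250
```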

Although the hardware supports it, Linux made it difficult to get the job done. Well, I guess it’s more accurate to say that the combination of Linux and the C library made it difficult.

First of all, the standard way of setting the serial speed is to do the following:

struct termios tio;
tcgetattr(fd, &tio);
cfsetispeed(&tio, B115200);
cfsetospeed(&tio, B115200);
/* do other miscellaneous setup options with the flags here */
tcsetattr(fd, TCSANOW, &tio);

That’s great, except only some standard baud rates have constants: B9600, B19200, B38400, B57600, B115200, etc. There’s no B31250 constant, and you can’t just pass a number to cfsetispeed() and cfsetospeed(). It doesn’t work that way because the speeds are really combinations of bits that get set in the c_cflag member of the termios struct. So essentially, the standard POSIX tcsetattr/tcgetattr API is kind of lame.

I already knew there were ways to get around that problem. I started out by trying the old method that setserial uses:

struct serial_struct serialsettings;
ioctl(fd, TIOCGSERIAL, &serialsettings);
serialsettings.flags &= ~ASYNC_SPD_MASK;
serialsettings.flags |= ASYNC_SPD_CUST;
serialsettings.custom_divisor = 96;
ioctl(fd, TIOCSSERIAL, &serialsettings);

After running that, opening the port with a baud rate of 38400 is supposed to actually use the custom divisor. It’s an ugly way to do it, but it works on a Linux 2.4-based device that I’ve used in the past.

It compiled just fine, but as soon as my program ran the code above, the kernel (the device I’m using has Linux 2.6.37) warned me that setting a custom divisor is deprecated. Worse yet, the TIOCSSERIAL ioctl failed with an error of EINVAL. This is because the kernel’s OMAP serial driver (drivers/serial/omap-serial.c, or drivers/tty/serial/omap-serial.c on newer kernels) doesn’t support that call: its serial_omap_verify_port() function returns -EINVAL. So deprecated or not, it just plain doesn’t work with the OMAP serial driver.

I did some looking and dug up a discussion about a replacement for the old, deprecated TIOCSSERIAL method. I also found some other promising leads, including a post on StackOverflow asking the same question.

The discussion about a replacement comes to a conclusion: if the CBAUDEX bit in the c_cflag member of the termios structure is set without any of the other bits from the other B#### constants, then you can put an integer value for the desired baud rate into the c_ispeed and c_ospeed members in the struct. Then, I found a draft patch for this change. The patch includes a new constant called BOTHER (that’s B-OTHER, not “bother!”) defined as the same thing as CBAUDEX by itself.

The patch ended up changing–nowadays, you’ll find that in the kernel, there’s a struct termios and then a struct termios2. termios2 is the new struct that implements the new speed fields. I looked in my glibc header file bits/termios.h, and glibc’s header file actually includes the c_ispeed and c_ospeed members in the standard termios struct. However, the actual code that talks to the Linux kernel, at least in my case, was copying the termios data into an old-style termios struct and using the old-style ioctl that doesn’t look at c_ispeed and c_ospeed.

So after a ton of messing around, I figured out what it takes to do a new-style termios call to set a custom baud rate on newer (2.6+) versions of Linux:

#include <asm/termios.h>

struct termios2 tio;
ioctl(fd, TCGETS2, &tio);
tio.c_cflag &= ~CBAUD;
tio.c_cflag |= BOTHER;
tio.c_ispeed = 31250;
tio.c_ospeed = 31250;
/* do other miscellaneous setup options with the flags here */
ioctl(fd, TCSETS2, &tio);

After doing that, everything worked fine and the OMAP UART worked perfectly at 31250 baud, confirmed with an oscilloscope.

I ran into some #include hell with asm/termios.h, especially if I already had the normal termios.h included elsewhere, so you might have to copy relevant struct definitions into your own code if you can’t get it to work (I know, nasty!). The other thing to keep in mind is that the definition of NCCS might be different between your libc and kernel, so you need to make sure you’re using the kernel’s definition of NCCS. I believe asm/termbits.h has the correct NCCS and structures defined.

Make sure you don’t do another normal tcsetattr() call after the TCSETS2 ioctl, because I’m guessing it’ll probably revert the custom baud rate back to a standard baud rate if you do. I’m not 100% sure on that though.

Anyway, I hope this helps someone. In the end, it wasn’t really that complicated to get working, but it took a lot of research and playing around to understand how to make it work. Maybe I’m missing something, but it seems like glibc and the kernel don’t play well together in this case. It’s also possible that glibc supports this all perfectly and my glibc is compiled incorrectly or something, but either way, manually doing the new-style ioctl like I did above seems to work.

This weekend, I finally decided to try to set up my MacBook Pro to boot into OS X, Windows XP, and Ubuntu. I ran into a few problems so I’d like to share what I did to make it all work. A lot of what I did was driven by other people’s wiki articles and blog posts, particularly this excellent blog about manually editing partitions. Without further ado, I will describe my setup from start to finish.

After playing around with a ton of parallel port cards in an attempt to figure out Willem programmer compatibility, I decided it would be useful to write a parallel port tester program. Just a simple utility where you can set the output value of each output pin and read the value of each input pin. Several other test programs exist, but I wasn’t happy with their interfaces, and a lot of them weren’t prepared to easily handle PCI/PCI Express/ExpressCard parallel ports. At worst, they only supported the standard parallel port I/O addresses that new motherboards don’t have anymore. At best, they supported custom parallel ports but required you to manually look up the I/O address range of the card.

Oh, and a lot of the existing tools don’t work with newer versions of Windows. I wanted a tool compatible with newer versions of Windows, both 32-bit and 64-bit.

Introducing Parallel Port Tester:


Simply locate your parallel port in the list that appears in the bottom right. I use several methods to discover parallel ports and their I/O ranges, so you shouldn’t need to enter them manually. If for some reason your port doesn’t show up in the list, you can manually enter the base address of your parallel port (e.g. 0x3000) and hit Enter.

You can toggle parallel port outputs on or off by clicking on the circle representing the pin you want to change. Green represents on (high), black represents off (low). The circles representing input pins are automatically updated while you have the port selected, so you can easily test your inputs. You can also choose whether to use the four control pins as inputs or outputs.

For people who are really interested, I also display the raw register values of the three standard parallel port registers. I show separate values for the control register because what you write to the control register is not guaranteed to be what you read back.


Let’s get the requirements out of the way. You will need:

  • Windows 2000 or newer. Should work with 2000, XP, Vista, 7, 8, and the various newer server versions. 32- or 64-bit Windows are both supported.
  • Microsoft .NET Framework 2.0 or newer
  • Inpout32.dll

Note that I am linking to the best version of Inpout32.dll. It should be compatible with all versions of Windows supported by my program. If you are using Windows Vista or newer, you will need to run the InstallDriver.exe program inside the Win32 directory of Inpout32’s distribution in order to install the driver. You only have to do this once, and it’s just so you can get administrator privileges in order to install the driver. Even if you are on a 64-bit operating system, that’s OK. Still run InstallDriver.exe from the Win32 directory.

Put Inpout32.dll from the Win32 directory into the same directory as ParallelPortTester.exe. Don’t use the x64 directory; that DLL would only be for a 64-bit program, but Parallel Port Tester is not a 64-bit program.


OK, so you’re ready to download it? Here you go. Make sure you also get Inpout32.dll and install it as described above, because it’s not included with my program.


Download Parallel Port Tester version

Version History

  • 1.0: Initial release
  • Improves parallel port detection algorithm; previous algorithm was incorrect in certain cases.
  • Fixes a bug that was causing a crash for some users.

Any problems/questions/concerns? Please let me know in the comments below and I’ll try my best to help get you going. Be careful; it might be possible to fry your parallel port if you hook the outputs up wrong. I’m not responsible for any damage done to your computer by using this software.

For sample code that describes how I detect the computer’s parallel ports in Parallel Port Tester, see my blog post: Detecting parallel ports and their I/O addresses in Windows.

My previous blog posting on this subject from a few years ago sparked quite a bit of interest, so I’d like to follow it up with the latest compatibility information I have. First a quick summary:

Traditional Willem EPROM programmers require your computer to have a parallel port, and almost no computers today have them. You can find add-on parallel port cards, but a good chunk of today’s software is written to work directly with the parallel port addresses that were found on motherboards of older computers (0x378, 0x278, and 0x3BC). Add-on PCI/PCI Express/ExpressCard parallel ports don’t use those addresses.

Unfortunately, the Willem software only lets you pick from a hardcoded list of addresses to work with. The good news, however, is that it does its port access through a DLL called io.dll. There are replacement versions of io.dll that trick the Willem software into talking to a different parallel port address:

It’s actually a good thing that multiple options exist, because sometimes one option works for someone while the other option doesn’t, and vice versa. I would like to list what I have discovered about the various options available in terms of both hardware and software.

Parallel port cards

I have tried a total of four different parallel port cards, some of which are hard to find at this time:

My experience with these is the following. The Syba cards both use Moschip (now ASIX) chipsets, while the Shentek and StarTech cards use Oxford (now PLX Technologies) chipsets.

Both of the ASIX-based cards seem to work fine with no messing around needed. I’ve tested them on my desktop computer with Windows 7 64-bit (my DLL) and my laptop with Windows XP (Ben’s DLL).

The PLX-based cards throw a couple of curveballs into the picture, though. First of all, their bidirectional control pins (strobe, auto/linefeed, initialize, and select printer) do not have pull-up resistors. This causes a problem because they are open-drain/open-collector outputs, so something needs to pull them up when a high value is needed. Otherwise it’s impossible to use them as outputs — and the Willem programmer uses them as outputs. The Willem programmer (mine, at least) doesn’t supply its own pull-ups for those pins either. So in order to gain compatibility with those two cards, you will need to add pull-ups somewhere. To get these cards to work, I manually soldered some 10Kohm pull-up resistors for those lines onto my Willem board. It was a pretty ugly hack, though, so I removed the pull-ups after successfully testing it. Maybe someone can find a cleaner way to do it.

I have no idea about full-size PCI Express or PCI cards that use the PLX chipset. Perhaps they would already have the pull-up resistors in place. I’m just not sure.

The second issue with the PLX cards is described in the “DLLs” section below.

DLLs

The ASIX cards seem to work great with both my DLL and Ben’s DLL, so no further comments are needed here about them.

The PLX (formerly Oxford) cards don’t behave quite so nicely with my DLL. It seems that TVicPort has trouble reading bytes from odd addresses with the PLX cards, even though it has no such trouble with the ASIX cards. Inpout32 does not have this same issue. I haven’t narrowed down the root problem, but I don’t really care at this point anyway because there’s a fix: if you’re using a PLX chipset, you should use Ben’s DLL instead of mine.

I originally made my DLL because I couldn’t get Ben’s DLL to work correctly with 64-bit Windows 7. It seems that Inpout32 is now 64-bit compatible (and signed), so it may be possible to simply stick with Ben’s DLL. If you plan on going that route, I would recommend downloading the latest version of Inpout32 and grabbing the Inpout32.dll file included with that to go along with Ben’s io.dll. You’ll want to use the 32-bit version even if you’re on a 64-bit operating system, because the Willem software itself is 32-bit. You may need to run the InstallDriver.exe program included with it to get everything to install correctly, but I’m not an expert at Inpout32.

Testing compatibility

If you’d like, you can test your parallel port card with my Parallel Port Tester utility. Make sure all of the outputs and bidirectional pins can correctly output both high and low values. Physically check each pin with a voltmeter while you test. Every card I’ve seen so far outputs 3.3V as a high value. If your bidirectional pins don’t appear to output a high value correctly (or you get weird readbacks in the tester utility), the high value may be floating, thus indicating you need pull-up resistors on those pins. Also make sure the input pins read a high value with nothing attached and a low value when you ground them. It’s OK if the control pins don’t work correctly as inputs; the Willem programmer uses them all as outputs.


The information available in my original post is still very useful and should help walk you through setting up a Willem programmer with various types of cards. I just wanted to share all of my latest compatibility knowledge in a new post because it was all buried in the comments of my original post.

In the early- to mid-1990s, Apple produced their Macintosh Performa line of computers. These computers were meant for home users and typically came bundled with software such as ClarisWorks and Mario Teaches Typing, along with interactive tutorials teaching the basics of how to use the Mac OS. They were sold in places such as Wal-Mart and Sears.

One interesting thing about these computers is that at least initially, they did not come with any software restore disks. If something bad happened and you wiped out your system software (which was easy enough to do — somehow I did it as a kid), you didn’t have a supplied set of disks (or a CD) for restoring the software. Instead, these computers came with software called Apple Backup, and you were supposed to back up your system onto 1.44 MB floppies when you first got it. When you ran Apple Backup, it would let you choose to back up either the full hard drive or just your system folder:

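[Screenshot: Apple Backup offering a choice between backing up the full hard drive or just the System Folder]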

The number of disks needed here is small because I ran it on a simple barebones system to make this screenshot. With all of the bundled software on a stock Performa machine, it would have taken somewhere in the ballpark of 50 floppy disks to complete the backup of the full hard drive. Seriously, how many people would have bothered to buy all of the floppies that would have been necessary, and then actually taken the time to do it? Some people definitely did, and I’m impressed by the dedication. I know my family didn’t bother when I was growing up. It would have been more realistic to only back up user-created files, and then provide a system restore disk (or set of disks) to restore any original software, but the backup software didn’t work that way. Apple obviously learned from their mistake because they began bundling Performa computers with restore CDs at some point later on. (Note: To be completely fair to Apple, it was possible to obtain restore disks from them if you needed to restore your system and your Performa didn’t come with any.)

I ran the backup process on an old Mac for fun, and it guided me through the process of backing up the system:

[Screenshots: eight steps of the Apple Backup process]

The resulting disks were normal Mac floppy disks, named “Backup Disk 1”, “Backup Disk 2”, and so on, each containing a single 1,414 KB file named Apple Backup Data.

To restore your data after a failure, you would boot up your Performa using the Utilities disk that came with the computer. The Utilities disk contained a barebones system folder along with disk formatting/repair utilities and a program called Apple Restore. You guessed it: Apple Restore was used to restore the system. You would run it and then insert your backup floppies one at a time to restore everything you backed up.

I looked at some of Apple’s later Performa restore CDs, and interestingly enough, they came with programs called “Restore All Software” and “Restore System Software”, each with a folder full of 1,414 KB files named “Data File 1”, “Data File 2”, and so on. So presumably Apple simply used Apple Backup to back up a stock system and stuck the resulting data files onto a CD to create the restore CD.

I decided that it might be useful to have Apple Backup’s file format documented in case someone out there ends up needing to restore files from their old backups (or wants to extract files from a factory restore CD). Although backup floppies from the 90s are probably going bad by now, I think it’s still cool to have the information out there. Anyway, I decided to reverse engineer Apple Backup and one of the factory CD restore programs (which does essentially the same thing as Apple Restore). I believe I have successfully reverse engineered the Apple Backup file format.

The rest of this blog posting contains the technical details about the format of the files that Apple Backup creates.

Overall layout of an Apple Backup data file

  • Type code: ‘OBDa’ (CD restore files have type ‘OBDc’ instead, but are otherwise identical in format)
  • Creator code: ‘OBBa’
  • There is no resource fork, and the data fork content is summarized in the table below:
Backup disk header
Boot blocks
File #1 header
File #1 full path
File #1 data fork data (if any)
File #1 resource fork data (if any)
Zero padding to next multiple of 0x200 bytes
File #2 header
File #2 full path
File #2 data fork data (if any)
File #2 resource fork data (if any)
Zero padding to next multiple of 0x200 bytes
File #3, 4, 5, … until data file is full

Next, allow me to describe the content of the data fork in more detail:

Backup disk header format

All offsets and lengths are in bytes. All multi-byte quantities are in big-endian format as would be expected from a Mac file format.

Offset | Length | Name | Notes
0x00 | 2 | Version | This spec is valid up to and including version 0x0104
0x02 | 4 | Magic number | ‘CMWL’ – identifies this file as an Apple Backup file
0x06 | 2 | This disk number | Value is between 1 and the number of disks.
0x08 | 2 | Total number of disks | The total number of disks used for the backup.
0x0A | 4 | Backup start time | In a Mac time format (seconds since January 1, 1904 00:00:00 local time)
0x0E | 4 | Backup start time | Appears to always be a duplicate of the value above
0x12 | 32 | Hard drive name | The name of the hard drive that was backed up. Stored as a Pascal-style Str31 (1 byte length, 31 bytes of string data)
0x32 | 4 | Total size of this file | The total size of this restore file; value typically (always?) seen is 0x161800
0x36 | 4 | Total size used in this file | The number of bytes actually used in this restore file; value is typically 0x161800 except on last disk where it is probably going to be smaller.
0x3A | 0x1C6 | Unused | Filled with zeros

Total length: 0x200 bytes
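To make the table above concrete, here is a minimal sketch in C of how the disk header could be decoded from the first 0x200 bytes of a data file. The struct, its field names, and the function names are my own inventions for illustration; only the offsets, the big-endian byte order, and the ‘CMWL’ magic number come from the table. The Mac-to-Unix time conversion just subtracts the number of seconds between January 1, 1904 and January 1, 1970 and ignores time zones, since the stored time is local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DISK_HEADER_SIZE  0x200u
/* Seconds between the Mac epoch (1904-01-01) and the Unix epoch (1970-01-01). */
#define MAC_TO_UNIX_EPOCH 2082844800u

/* Hypothetical struct: the field names are illustrative, not Apple's. */
typedef struct {
    uint16_t version;         /* offset 0x00 */
    uint16_t disk_number;     /* offset 0x06 */
    uint16_t total_disks;     /* offset 0x08 */
    uint32_t start_time;      /* offset 0x0A, Mac time (local) */
    char     volume_name[32]; /* offset 0x12, Str31 converted to a C string */
    uint32_t file_size;       /* offset 0x32 */
    uint32_t bytes_used;      /* offset 0x36 */
} backup_disk_header;

/* Read big-endian values one byte at a time so host endianness doesn't matter. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int parse_disk_header(const uint8_t *buf, size_t len, backup_disk_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < DISK_HEADER_SIZE || memcmp(buf + 0x02, "CMWL", 4) != 0)
        return -1;

    out->version     = be16(buf + 0x00);
    out->disk_number = be16(buf + 0x06);
    out->total_disks = be16(buf + 0x08);
    out->start_time  = be32(buf + 0x0A);

    /* Str31: one length byte followed by up to 31 characters. */
    size_t n = buf[0x12];
    if (n > 31)
        n = 31;
    memcpy(out->volume_name, buf + 0x13, n);
    out->volume_name[n] = '\0';

    out->file_size  = be32(buf + 0x32);
    out->bytes_used = be32(buf + 0x36);
    return 0;
}

/* Convert Mac time to Unix-style seconds (no time zone adjustment). */
int64_t mac_time_to_unix(uint32_t mac_seconds)
{
    return (int64_t)mac_seconds - (int64_t)MAC_TO_UNIX_EPOCH;
}
```

Note that the volume name bytes are copied as-is; classic Mac OS names are normally in the Mac OS Roman encoding, so a real tool would probably want to convert them before displaying them.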

Boot blocks

These appear to be standard Mac OS boot blocks, 0x400 bytes in size. They are easy to spot in a hex editor because they begin with LK and soon thereafter contain the names System, Finder, MacsBug, Disassembler, StartUpScreen, Finder, and Clipboard. They are written to the hard drive by the restore program when the System Folder is blessed as it’s restored. For just extracting files, they are not really relevant. They begin at an offset of 0x200 from the start of the backup file and end at an offset of 0x600 (where the first file header begins).

File/folder header format

Each file or folder starts out with a header. Again, all offsets and lengths are in bytes, and everything is big-endian. This header will always begin on a 0x200-byte boundary; padding bytes of zero are added to the end of the previous file’s data fork/resource fork data if needed. The first file header in a backup data file is always at 0x600, immediately after the boot blocks.

Offset | Length | Name | Notes
0x00 | 2 | Version | This spec is valid up to and including version 0x0104
0x02 | 4 | Magic number | ‘RLDW’ – identifies this as a file/folder header
0x06 | 2 | Disk number that contains first part of this file/folder | This will match the current disk number, unless this file is split across multiple disks and this is the second, third, etc. part of the file.
0x08 | 4 | Backup start time | Will be the same as the time in the backup disk header
0x0C | 4 | Offset of header | The offset where this header begins in the disk (example: 0x00000600 in the first file header in every disk)
0x10 | 32 | File/folder name | The name of this file or folder. Stored as a Pascal-style Str31
0x30 | 2 | Which file part this is | 1, unless this is part of a file that has been split across multiple disks, in which case it will be 2, 3, etc.
0x32 | 1 | Folder flags | Bit 7 = 1 if this is a folder, 0 if this is a file. Bit 0 = 1 if this is the system folder and it needs to be blessed [selected as the current system folder]
0x33 | 1 | Validity flag | Bit 0 = 1 if the following file info/attributes/dates are valid. Bit 0 = 0 if this was a folder that is known to exist but its properties could not be read during the backup. If a file’s properties cannot be read, the file is skipped during the backup process, so bit 0 = 0 could only happen with folders.
0x34 | 16 | FInfo/DInfo about this file/folder | A standard Mac FInfo or DInfo struct containing info about this file or folder (from HFileInfo or DirInfo)
0x44 | 16 | FXInfo/DXInfo about this file/folder | A standard Mac FXInfo or DXInfo struct containing info about this file or folder (from HFileInfo or DirInfo)
0x54 | 1 | File/folder attributes | Standard ioFlAttrib byte from Mac Toolbox HFileInfo/DirInfo struct
0x55 | 1 | Unused |
0x56 | 4 | Creation date | Standard Mac time
0x5A | 4 | Modification date | Standard Mac time
0x5E | 4 | Length of file’s data fork | Total length of data fork of the full restored file including all split parts (zero for folders)
0x62 | 4 | Length of file’s resource fork | Total length of resource fork of the full restored file including all split parts (zero for folders)
0x66 | 4 | Length of data fork provided by this disk | The length of data fork data this disk is providing for this file
0x6A | 4 | Length of resource fork provided by this disk | The length of resource fork data this disk is providing for this file
0x6E | 2 | Length of full file path | Maximum length of 33*50 (enough space for 50 colon-delimited path elements, with many extra bytes left over). This is the length of the string that immediately follows this header.

Total length: 0x70 bytes. See Inside Macintosh: Files and the Mac Toolbox C headers named “Files.h” and “Finder.h” for more info on FInfo, DInfo, FXInfo, DXInfo, and ioFlAttrib. These items are where the type and creator code, invisible flag, and icon position in folder are stored, for example.
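As a rough illustration of the header layout above, here is a sketch in C that pulls out the fields an extraction tool would most likely care about. The struct, field names, and function name are hypothetical; the offsets, flag bits, and the ‘RLDW’ magic number are taken from the table. The FInfo/FXInfo blocks and the dates are skipped here to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILE_HEADER_SIZE 0x70u

/* Hypothetical struct: the field names are illustrative, not Apple's. */
typedef struct {
    uint16_t first_disk;     /* offset 0x06 */
    char     name[32];       /* offset 0x10, Str31 converted to a C string */
    uint16_t part_number;    /* offset 0x30 */
    int      is_folder;      /* offset 0x32, bit 7 */
    int      needs_blessing; /* offset 0x32, bit 0 */
    int      info_valid;     /* offset 0x33, bit 0 */
    uint32_t data_len_total; /* offset 0x5E */
    uint32_t rsrc_len_total; /* offset 0x62 */
    uint32_t data_len_here;  /* offset 0x66 */
    uint32_t rsrc_len_here;  /* offset 0x6A */
    uint16_t path_len;       /* offset 0x6E */
} backup_file_header;

static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int parse_file_header(const uint8_t *buf, size_t len, backup_file_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < FILE_HEADER_SIZE || memcmp(buf + 0x02, "RLDW", 4) != 0)
        return -1;

    out->first_disk = be16(buf + 0x06);

    /* Str31: one length byte followed by up to 31 characters. */
    size_t n = buf[0x10];
    if (n > 31)
        n = 31;
    memcpy(out->name, buf + 0x11, n);
    out->name[n] = '\0';

    out->part_number    = be16(buf + 0x30);
    out->is_folder      = (buf[0x32] & 0x80) != 0;
    out->needs_blessing = (buf[0x32] & 0x01) != 0;
    out->info_valid     = (buf[0x33] & 0x01) != 0;
    out->data_len_total = be32(buf + 0x5E);
    out->rsrc_len_total = be32(buf + 0x62);
    out->data_len_here  = be32(buf + 0x66);
    out->rsrc_len_here  = be32(buf + 0x6A);
    out->path_len       = be16(buf + 0x6E);
    return 0;
}
```

The caller would pass a pointer to a 0x200-aligned position in the data file (0x600 for the first header) and then read `path_len` bytes of path immediately after the 0x70-byte header.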

Full path format

The full path is just that: the full path to the folder or file, with the hard drive already being assumed. The components of the path are colon-delimited. The file being restored is the last component of the path. Example:

System Folder:Control Panels:Memory

(if the file being restored is the Memory control panel). The path is stored as raw bytes with no null terminator or anything; it’s basically a Pascal string with a two-byte length, and the length is at the end of the file/folder header.

The actual file data

Immediately after the full path, the data fork bytes begin (number of bytes = “Length of data fork provided by this disk”), followed by resource fork bytes (number of bytes = “Length of resource fork provided by this disk”). It’s perfectly OK for the length of either (or both) of these to be zero. After that, there is padding (filled with “0” bytes) to the nearest 0x200 byte boundary, and then the next file/folder header begins.
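Walking from one header to the next is just a matter of adding up the pieces and rounding up to a multiple of 0x200. Here is a small sketch in C; the function name and parameters are mine, and I’m assuming that no padding is added when the data already ends exactly on a 0x200-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE       0x200u
#define FILE_HEADER_SIZE 0x70u

/* Given where a file/folder header starts and the lengths stored in it,
 * return the offset where the next header should begin. */
uint32_t next_header_offset(uint32_t header_offset, uint16_t path_len,
                            uint32_t data_len_here, uint32_t rsrc_len_here)
{
    /* Headers always start on a 0x200-byte boundary. */
    assert(header_offset % BLOCK_SIZE == 0);

    uint32_t end = header_offset + FILE_HEADER_SIZE + path_len +
                   data_len_here + rsrc_len_here;

    /* Round up to the next multiple of 0x200 (a power of two). */
    return (end + BLOCK_SIZE - 1u) & ~(BLOCK_SIZE - 1u);
}
```

For example, a folder entry at 0x600 with the 35-byte path “System Folder:Control Panels:Memory” and no fork data ends at 0x693, so the next header would start at 0x800.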

When a file overflows the disk

If there is not enough space remaining on a disk for a complete file (this almost always happens at the end of each data file), the amount of file data that will fit on the disk is stored so that the full disk file size matches the size given in the disk header. Then the first file header on the next disk will be the same file’s header with the exception of the “Which file part this is” field, which will be incremented by one. It is possible for a file’s data to span several disks in this manner; the intermediate disks will only have one file header, followed by a repeat of the file path and data using up all available space on the disk.

For example: Let’s pretend you have 0x1000 bytes remaining on the current data file before its size reaches 1,414 KB. You’re ready to back up a file “Applications:TestApp” that has a 512 KB data fork and a 128 KB resource fork. The file header will take up 0x70 of those bytes and the full path will take up 20 (0x14) of those bytes, for a total of 0x84 bytes, so there are 0xF7C (3,964) bytes left. So the file header is going to specify a data fork length in this disk of 3,964 bytes, and a resource fork length of 0 on this disk (the total lengths will be filled in as 512 KB and 128 KB though). Then the first 3,964 bytes of the data fork will be written to the file, giving it a total length of 1,414 KB. Then this disk will end. The first file header on the next disk will finish the remaining 520,324 data fork bytes and all of the 131,072 resource fork bytes, and then the next file’s header will begin on the nearest 0x200 byte boundary after that.
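The arithmetic in that example can be written out as a short C sketch. This is only my reconstruction of the splitting rule as described above (header and path first, then as much data fork as fits, then resource fork); the names are made up, and what the real program does when there isn’t even room for the header and path is a guess on my part.

```c
#include <assert.h>
#include <stdint.h>

#define FILE_HEADER_SIZE 0x70u

/* How many fork bytes of a file land on the current disk. */
typedef struct {
    uint32_t data_here;
    uint32_t rsrc_here;
} fork_split;

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

fork_split split_for_disk(uint32_t space_left, uint16_t path_len,
                          uint32_t data_remaining, uint32_t rsrc_remaining)
{
    fork_split s = {0, 0};
    uint32_t overhead = FILE_HEADER_SIZE + path_len;

    /* Assumption: if the header and path don't fit, nothing is stored here. */
    if (space_left <= overhead)
        return s;

    uint32_t room = space_left - overhead;
    s.data_here = min_u32(room, data_remaining);  /* data fork goes first */
    room -= s.data_here;
    s.rsrc_here = min_u32(room, rsrc_remaining);  /* then the resource fork */

    assert(overhead + s.data_here + s.rsrc_here <= space_left);
    return s;
}
```

Calling `split_for_disk(0x1000, 20, 524288, 131072)` gives 3,964 data fork bytes and 0 resource fork bytes for the first disk. Calling it again for the next disk with 0x161800 - 0x600 bytes of space and the remaining 520,324 data fork bytes gives 520,324 and 131,072, matching the numbers above.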


As you can see, the format is pretty simple. It’s just basically a disk header followed by a flat list of files until the end of the disk is reached. It should be possible to extract full and partial files from these backup archives even if the data files from some disks are missing. The partial files probably wouldn’t be very useful though.

The original Apple Restore utility didn’t let you pick and choose which files you wanted to restore — it just tried to restore them all, and it would ask you what to do if the file already existed and was newer than the backed up version. I see no reason why a utility couldn’t go through all of these files, give you a list of everything available, and let you selectively extract the files you want. It’s easy to detect what disk a particular backup data file came from because the disk header contains a field for what disk it belongs to.

If anyone’s interested in adding the ability to decode this format in an archive expander program, I would be happy to provide some sample data files. I may or may not decide to write a program to extract files from these backups, depending on how bored I get 🙂

I can’t remember exactly why, but I recently decided I wanted to play around with Super Nintendo games that support more than two players. I have a Super Nintendo console along with several such games — Super Bomberman, NBA Jam, and Madden 97 come to mind. The most popular adapter designed for this purpose is called the Super Multitap. It plugs into the second controller port on the SNES and provides four controller ports, so you get a total of five ports available for use.

I looked on eBay for Super Multitaps and found plenty for sale, but I decided I wanted to make things more complicated. I stumbled upon a schematic for it, found in the official Super Nintendo developer manual. Google for it if you’re interested — I don’t want to link to it just in case lawyers want to ruin all of the fun. I was intrigued by the simplicity of the circuit. It’s basically 3 chips (a mux and a bunch of tri-state buffers), eight resistors, and the controller connectors. Why buy the ready-made adapter if I could just make my own? In theory, buying the parts would be cheaper than buying a used Multitap on eBay, but in practice, the controller sockets and plugs are hard to find so it’s probably not worth it from a monetary standpoint.

But still, I think it’s worth it for the geek cred! Right?

Well, I decided to do it.

For a few years now, I’ve been fighting a weird problem: X-CTU (which is a software utility provided by Digi for programming XBee modules) is only available for Windows. I do most of my development in Linux so X-CTU is always a pain to work with. It does run pretty well under Wine if you need to use it with Linux or Mac OS X, but when you run it under Wine, it doesn’t detect any serial ports:


Now of course, we all know that you have to add a symlink in your ~/.wine/dosdevices directory to link Wine to your computer’s serial ports:

# ls -l ~/.wine/dosdevices/
total 0
lrwxrwxrwx 1 doug doug  8 Oct 30 21:30 a:: -> /dev/fd0
lrwxrwxrwx 1 doug doug 10 Oct 30 21:30 c: -> ../drive_c
lrwxrwxrwx 1 doug doug 10 Mar 29 18:08 com1 -> /dev/ttyS0
lrwxrwxrwx 1 doug doug 10 Mar 29 18:08 com2 -> /dev/ttyS1
lrwxrwxrwx 1 doug doug  8 Oct 30 21:30 d:: -> /dev/sr0
lrwxrwxrwx 1 doug doug  1 Oct 30 21:30 z: -> /

But even after doing that, X-CTU still doesn’t detect anything. All of the workarounds that I have found require you to define a user COM port in X-CTU:


After going through that process, you can pick the user COM port and X-CTU works perfectly fine (aside from not being able to download newer firmware versions from Digi’s site). As soon as you quit X-CTU, though, the user COM ports you have defined are gone. So whenever you re-open X-CTU, you have to redefine your user COM port. It gets old. So that’s the problem I’ve been fighting: having to manually add the user COM port every time I open X-CTU.

Today I got fed up and ran X-CTU with all of Wine’s debugging information enabled so I could get a clear idea of what X-CTU does when it first loads, in an attempt to figure out how to get the serial ports to show up. Good news: I got it working and now my serial ports show up automatically when I open X-CTU!


I’d like to explain how X-CTU detects attached serial ports, what Wine does in response, and finally, how you can get it working for yourself. Let’s dive in!

How X-CTU detects attached serial ports

X-CTU uses Windows’ Setup API to get a list of attached serial ports. I ran it with Wine set for full debugging and traced out the calls to Setup API functions to figure out exactly what it does. It starts out with a call to SetupDiClassGuidsFromName which, given the name of a device class (“Ports” in this case), returns a list of GUIDs that go with that class. Next, it calls SetupDiGetClassDevs with the list of GUIDs to get a list of devices that belong to the Ports class. It goes through the list of devices and requests the “friendly name” of each port by calling SetupDiGetDeviceRegistryProperty. The “friendly name” will look like one of these examples:

  • USB Serial Port (COM5)
  • Communications Port (COM1)
  • Printer Port (LPT1)
  • Blah blah blah port (COM7)

Notice how the friendly name of a serial port always seems to end with (COM#), and that the list also includes other ports such as printer ports. Well, X-CTU uses this info to detect the port: if the name of the port contains the string “(COM”, then it grabs the number directly after that string and uses it as the COM port number. It also ignores parallel ports.
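To illustrate the matching rule, here is a small C sketch of that kind of parsing. To be clear, this is not X-CTU’s actual code; it’s just my approximation of the behavior described above, and the function name and the 1 to 256 range check are my own assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the COM port number found in a friendly name such as
 * "USB Serial Port (COM5)", or -1 if the name has no "(COM<number>" part. */
int com_port_from_friendly_name(const char *name)
{
    assert(name != NULL);

    const char *p = strstr(name, "(COM");
    if (p == NULL)
        return -1;              /* e.g. "Printer Port (LPT1)" */

    p += 4;                     /* skip past "(COM" */
    if (!isdigit((unsigned char)*p))
        return -1;              /* nothing numeric after "(COM" */

    long n = strtol(p, NULL, 10);
    if (n < 1 || n > 256)       /* assumed sane range for COM port numbers */
        return -1;
    return (int)n;
}
```

With the example names above, “USB Serial Port (COM5)” yields 5 and “Printer Port (LPT1)” yields -1.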

So to get Wine to correctly populate the list, we need to figure out what Wine is doing in response to the three Windows functions I listed above. This information was readily available by both checking out the debug trace from earlier and also reading the Wine source code. Let’s go there now…

What Wine does when the setup API functions are called

SetupDiClassGuidsFromName searches in the registry for classes named “Ports”. I don’t think Wine behaves exactly like an actual Windows machine here, but here’s what it does in Wine. It searches for subkeys of:


and looks for subkeys that have a class matching the name provided to it. In Wine, it finds the class with GUID {4d36e978-e325-11ce-bfc1-08002be10318}, which according to Microsoft is for COM and LPT ports. Anyway, that’s the single GUID it finds in Wine.

SetupDiGetClassDevs searches in:


for items that match the GUID we found earlier. The above key contains various keys that represent different categories. Then the categories contain keys that represent devices, and the devices contain keys that represent instances of the devices (I believe). The gist of it is that it goes three levels deep through the Enum directory to try to find anything that has a string value “ClassGUID”. If the GUID matches the GUID we found earlier, Wine decides it’s a serial port and returns it in the list of discovered devices. This is the root cause of the whole problem — nothing is put into the registry automatically by Wine for these serial ports. So we’ll definitely need to add this manually, as we’ll see later.

SetupDiGetDeviceRegistryProperty is finally used to get the friendly name for the port. It looks in the same location it looked for the ClassGUID value, but this time it looks for a string called FriendlyName — which, as you guessed it, contains a string in the format of my examples above.

Once I figured this out, I was pretty much home free. So without further ado, here are the instructions for getting it working.

What to add to the registry

The key (no pun intended) is to add your serial ports as subkeys of:


to satisfy the functions I described in the section above:

  • Create a subkey of Enum and call it SERIAL (although the name you use doesn’t really matter–I believe it searches everything, not just the SERIAL subkey).
  • Create a subkey of SERIAL and call it COM1 (if your port is COM1 — although this name doesn’t really matter)
  • Create a subkey of the COM1 key you just created and call it COM1 also (this name doesn’t really matter either though)
  • In the final COM1 subkey you created, add two string values:
    • ClassGUID — containing the value {4D36E978-E325-11CE-BFC1-08002BE10318} (the GUID for the Ports class)
    • FriendlyName — containing a name in the format “Serial Port (COM1)” without the quotes of course.
      • Make sure the name ends with the COM port name in parentheses as in this example — (COM1). It has to be that way or it won’t work–it might appear in the list, but it will fail to open unless you do it exactly in that format.
      • This is what X-CTU actually uses to decide which port to open. The other “COM1” subkeys we added in the earlier steps aren’t checked for anything — I just named them that way for clarity while you’re browsing the registry.

You can make these modifications using regedit in Wine (type “wine regedit” in a terminal window). Here’s an example screenshot to make it clear what you have to add:


That’s it! You’re done. The ports should now appear automatically in X-CTU. Assuming you have also created the symlinks in ~/.wine/dosdevices for the COM ports you added, they should also be operational.


This is tested in Ubuntu 12.10 with Wine 1.4.1. I would imagine if you can figure out where the dosdevices folder is to stick the symlinks, it will probably work in Mac OS X as well. Your mileage may vary. Good luck!

This strategy definitely works for X-CTU, but it’s not a generic strategy that will work for any Windows program under Wine. Different programs use different methods to get a list of serial ports. Some programs may check for a different key called PortName next to FriendlyName. X-CTU in particular only checks for FriendlyName. If you’re trying to get this to work with a different Windows program, play around. Check programming tutorials to see the various methods people use to enumerate COM ports on Windows. Figure out which method your particular program is using — disassemble it, check what functions it links against, run it in Wine with debugging enabled, etc. Once you’ve figured it out, use Wine’s debugging facilities (and the Wine source code) to see what Wine is doing in response to the various functions that are called. Chances are good that it is looking into the registry and you just need to tweak your registry to give the Windows functions the results they are expecting.

I hope this helps someone out there someday!

I’ve seen plenty of articles about this topic already, but I’d like to talk about it on my blog as well. It’s a pretty important topic in my opinion.

Let’s say you have a microcontroller peripheral that sends and/or receives a bunch of data. A good example of this type of microcontroller peripheral is a UART (universal asynchronous receiver/transmitter). That’s essentially a fancy name for a serial port–you know, the old 9-pin connectors you’d find on PCs. They were commonly used for connecting mice and modems back before PS/2, USB, and broadband took over. A UART basically does two things: receives data and transmits data. When you want to transmit data, you write a byte to a register in the UART peripheral. The UART converts the 8 bits into a serial data stream and spits it out over a single wire at the baud rate you have configured. Likewise, when it detects an incoming serial data stream on a wire, it converts the stream into a byte and gives you a chance to read it.

I’m actually not going to go into detail about UARTs yet because I feel it’s really important to understand a different concept that we will need to know when we learn about how to use UARTs. That concept is an interrupt-safe circular buffer. This lesson is dedicated to understanding this very important data structure. The next post I write will then go into more detail about UARTs, and that post will use a concept that you will learn about in this post. OK, let’s get started with circular buffers.

A common problem when you are receiving or transmitting a lot of data is that you need to buffer it up. For example, let’s say you want to transmit 50 bytes of data. You have to write the bytes one-by-one into the peripheral’s data register to send it out. Every time you write a byte, you have to wait until it has transmitted before writing the next byte. (Note: Some microcontroller peripherals have a hardware buffer so you can write up to 16 [for example] bytes before you have to wait, but that hardware buffer can still fill up, so this concept I’m describing still applies)

A typical busy loop sending the data would look like this:

char data[50]; // filled with data you're transmitting

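for (int x = 0; x < 50; x++) {
    while (peripheral_is_busy()); // wait until the peripheral is ready
    peripheral_send_byte(data[x]); // hand the next byte to the peripheral
}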

This loop is simple. For each of the 50 bytes, it first waits until the peripheral isn’t busy, then tells the peripheral to send it. You can imagine what implementations of the peripheral_is_busy() and peripheral_send_byte() functions might look like.

While you’re transmitting these 50 bytes, the rest of your program can’t run because you’re busy in this loop making sure all of the bytes are sent correctly. What a waste, especially if the data transmission rate is much slower than your microcontroller! (Typically, that will be the case.) There are so many more important tasks your microcontroller could be doing in the meantime than sitting in a loop waiting for a slow transmission to complete. The solution is to buffer the data and allow it to be sent in the background while the rest of your program does other things.

So how do you buffer the data? You create a buffer that will store data waiting to be transmitted. If the peripheral is busy, rather than waiting around for it to finish, you put your data into the buffer. When the peripheral finishes transmitting a byte, it fires an interrupt. Your interrupt handler takes the next byte from the buffer and sends it to the peripheral, then immediately returns back to your program. Your program can then continue to do other things while the peripheral is transmitting. You will periodically be interrupted to send another byte, but it will be a very short period of time — all the interrupt handler has to do is grab the next byte waiting to be transmitted and tell the peripheral to send it off. Then your program can get back to doing more important stuff. This is called interrupt-driven I/O, and it’s awesome. The original code I showed above is called polled I/O.

You can probably imagine how useful this idea is. It will make your program much more efficient. If you’re a computer science person, you’re probably thinking,  “Doug, the abstract data type you should use for your buffer is a queue!” You would be absolutely correct. A queue is a first-in, first-out (FIFO) data structure. Everything will come out of the queue in the same order it went in. There’s no cutting in line! (Well, unless you have a bug in your code, which I’m not afraid to admit has happened to me enough times that I finally bothered to write up this article so I’ll have a reference for myself.) Anyway, that’s exactly how we want it to be. If I queue up “BRITNEYSPEARS” I don’t want it to be transmitted as “PRESBYTERIANS”. (Yes, amazingly enough, that is an anagram.)

A really easy way to implement a queue is by creating a ring buffer, also called a circular buffer or a circular queue. It’s a regular old array, but when you reach the end of the array, you wrap back around to the beginning. You keep two indexes: head and tail. The head is updated when an item is inserted into the queue, and it is the index of the next free location in the ring buffer. The tail is updated when an item is removed from the queue, and it is the index of the next item available for reading from the buffer. When the head and tail are the same, the buffer is empty. As you add things to the buffer, the head index increases. If the head wraps all the way back around to the point where it’s right behind the tail, the buffer is considered full and there is no room to add any more items until something is removed. (This means one slot always stays unused, so an array of N slots holds at most N - 1 items.) As items are removed, the tail index increases until it reaches the head and it’s empty again. The head and tail endlessly follow this circular pattern: the tail is always trying to catch up with the head, and it will catch up unless you’re constantly transmitting new data so quickly that the tail is always busy chasing the head.
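Here is a bare-bones sketch in C of the head/tail bookkeeping just described, with no interrupts involved yet. The names and the tiny size of 8 are placeholders I picked for illustration, and the wrap-around uses plain modulo for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 8u   /* hypothetical size; holds at most QUEUE_SIZE - 1 items */

static char    queue_data[QUEUE_SIZE];
static uint8_t queue_head;  /* index of the next free slot */
static uint8_t queue_tail;  /* index of the next item to read */

int queue_is_empty(void)
{
    return queue_head == queue_tail;
}

int queue_is_full(void)
{
    /* Full when advancing the head would make it land on the tail. */
    return (uint8_t)((queue_head + 1u) % QUEUE_SIZE) == queue_tail;
}

/* Returns 0 on success, -1 if the queue is full. */
int queue_add(char c)
{
    if (queue_is_full())
        return -1;
    queue_data[queue_head] = c;
    queue_head = (uint8_t)((queue_head + 1u) % QUEUE_SIZE);
    return 0;
}

/* Returns 0 on success, -1 if the queue is empty. */
int queue_remove(char *out)
{
    assert(out != NULL);
    if (queue_is_empty())
        return -1;
    *out = queue_data[queue_tail];
    queue_tail = (uint8_t)((queue_tail + 1u) % QUEUE_SIZE);
    return 0;
}
```

Items come back out of `queue_remove()` in the same order they went into `queue_add()`, and with 8 slots the eighth consecutive add fails because one slot is always left empty to tell “full” apart from “empty”.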

Anyway, we’ve determined that you need three things:

  1. An array
  2. A head
  3. A tail

These will all be accessed by both the main loop and the interrupt handler, so they should all be declared as volatile. Also, each update to the head or the tail needs to be an atomic operation, so the index type should be no wider than the native word size of your architecture. For example, if you’re on an 8-bit processor like an AVR, it should be a uint8_t (which also means the ring can have at most 256 slots). On a 16-bit processor it can be a uint16_t, and so on. Let’s assume we’re on an 8-bit processor here, so ring_pos_t in the code below is defined to be a uint8_t.

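#include <stdint.h>

#define RING_SIZE   64                  /* number of slots in the ring */
typedef uint8_t ring_pos_t;             /* index type; native size on an 8-bit CPU */
volatile ring_pos_t ring_head;          /* next free slot (only add() writes it) */
volatile ring_pos_t ring_tail;          /* next item to read (only remove() writes it) */
volatile char ring_data[RING_SIZE];     /* the storage itself */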
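
With just these three variables, the empty and full conditions described earlier can already be written down. Here is a minimal sketch; the helper names ring_is_empty, ring_is_full and ring_count are made up for illustration and are not part of the original code.

```c
#include <stdint.h>

#define RING_SIZE   64
typedef uint8_t ring_pos_t;
volatile ring_pos_t ring_head;
volatile ring_pos_t ring_tail;
volatile char ring_data[RING_SIZE];

/* Hypothetical helpers, named here only for illustration. */

/* Empty: head and tail point at the same slot. */
int ring_is_empty(void) {
    return ring_head == ring_tail;
}

/* Full: advancing the head one more slot would land on the tail. */
int ring_is_full(void) {
    return (ring_pos_t)((ring_head + 1) % RING_SIZE) == ring_tail;
}

/* Number of items waiting to be read. Adding RING_SIZE before the
   subtraction keeps the result non-negative after the head has wrapped. */
ring_pos_t ring_count(void) {
    return (ring_pos_t)((ring_head + RING_SIZE - ring_tail) % RING_SIZE);
}
```

Keep in mind that ring_count() reads both indexes, so when an interrupt is active the result is only a snapshot that may already be out of date by the time you use it.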

One final thing before I give you the queue code: it’s a really good idea to use a power of two for your ring size (16, 32, 64, 128, etc.). The reason is that the wrapping operation (where index 63 wraps back around to index 0, for example) is much quicker if the size is a power of two. I’ll explain why. Normally a programmer would use the modulo (%) operator to do the wrapping. For example:

ring_tail = (ring_tail + 1) % 64;

If your tail began at 60 and you repeated this line above multiple times, the tail would do the following:

61 -> 62 -> 63 -> 0 -> 1 -> …

That works perfectly, but the problem with this approach is that modulo is a divide operation, and division is slow on most processors. Many small microcontrollers, the AVR included, have no hardware divide instruction at all, so the division runs as a software routine. It turns out that when the divisor is a power of two, you can do the equivalent of a modulo with a bitwise AND, which is a much quicker operation. It works because if you take a power of two and subtract one, you get a number whose binary representation is a string of all 1 bits (63 is 00111111). In the case of a queue of size 64, bitwise ANDing the head or tail with 63 will always keep the index between 0 and 63. So you can do the wrap-around like so:

ring_tail = (ring_tail + 1) & 63;
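
If you want to convince yourself that the two forms really do the same thing, a quick check like the following can help. It is only a sketch: wrap_mod, wrap_and and wraps_agree are hypothetical names, and the size of 64 is hard-coded to match the examples above.

```c
#include <stdint.h>

/* Wrap using modulo; works for any ring size. */
uint8_t wrap_mod(uint8_t index) {
    return (uint8_t)((index + 1) % 64);
}

/* Wrap using bitwise AND; only valid because 64 is a power of two,
   so 63 is 00111111 in binary and masks off everything above bit 5. */
uint8_t wrap_and(uint8_t index) {
    return (uint8_t)((index + 1) & 63);
}

/* Returns 1 if both versions agree for every valid index 0..63, else 0. */
int wraps_agree(void) {
    for (uint8_t i = 0; i < 64; i++) {
        if (wrap_mod(i) != wrap_and(i)) {
            return 0;
        }
    }
    return 1;
}
```

The AND trick stops working as soon as the size is not a power of two (try it with 50 and a mask of 49), which is why the modulo form is the safer one to write by hand.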

A good compiler will automatically convert the modulo to the faster AND operation if it’s a power of two, so you can just use my first example with the “% 64” since it makes the intent of the code clearer. I’ve been told I have “a lot of faith in compilers” but it’s true–if you compile with optimizations enabled, GCC will correctly optimize my first example into the assembly equivalent of my second example. Looking at the assembly code that your compiler generates is a very valuable tool for you to have available.

OK. Now that I’ve explained everything, here are my add() and remove() functions for the queue:

int add(char c) {
    ring_pos_t next_head = (ring_head + 1) % RING_SIZE;
    if (next_head != ring_tail) {
        /* there is room */
        ring_data[ring_head] = c;
        ring_head = next_head;
        return 0;
    } else {
        /* no room left in the buffer */
        return -1;
    }
}

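/* Note: <stdio.h> also declares a remove(); rename this one if you include it. */
int remove(void) {
    if (ring_head != ring_tail) {
        /* cast so bytes >= 0x80 can't be mistaken for the -1 "empty" result */
        int c = (unsigned char)ring_data[ring_tail];
        ring_tail = (ring_tail + 1) % RING_SIZE;
        return c;
    } else {
        /* buffer is empty */
        return -1;
    }
}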

The add function calculates the next position for the head, ensures that there’s room in the buffer, writes the character to the end of the queue, and finally updates the head. The remove function ensures the buffer isn’t empty, grabs the first character waiting to be read out of the queue, and finally updates the tail. The code is pretty straightforward, but there are a couple of details you should be aware of:

  • I only modify the head index in the add function, and I only modify the tail index in the remove function.
  • I only modify the index after reading/writing the data in the buffer.

By doing both of the above, I have ensured that this is an interrupt-safe, lock-free ring buffer (if there is a single consumer and a single producer). What I mean is you can add to the queue from an interrupt handler and remove from the queue in your main loop (or vice-versa), and you don’t have to worry about interrupt safety in your main loop. This is a really cool thing! It means that you don’t have to temporarily disable interrupts in order to update the buffer from your main loop. It’s all because of the two bullets above. If the interrupt only modifies the head, and the main loop only modifies the tail, there’s no conflict between the two of them that would require disabling interrupts. As for my second bullet point, updating the head or tail index only after the read or write is strictly necessary only on whichever side of the queue runs in the main loop; it doesn’t matter in the interrupt handler, because the main loop can’t interrupt it. If the main loop moved the head before writing the data, for example, an interrupt arriving between those two steps could remove a byte that hasn’t been written yet. But if I follow that convention in both the add() and remove() functions, they can be swapped around as needed: in one program add() could be used in an interrupt handler and remove() could be used in the main loop, but in a different program, remove() would be in the interrupt handler and add() would be in the main loop.
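
To see the FIFO behavior in action without any hardware, here is a small self-contained sketch. It repeats the queue logic under the hypothetical names ring_add and ring_remove (renamed so the sketch can't clash with the remove() declared in <stdio.h>), and adds a made-up round_trip helper that queues a string and drains it again.

```c
#include <stdint.h>

#define RING_SIZE   64
typedef uint8_t ring_pos_t;
volatile ring_pos_t ring_head;
volatile ring_pos_t ring_tail;
volatile char ring_data[RING_SIZE];

/* Same logic as add() above, renamed for this sketch. */
int ring_add(char c) {
    ring_pos_t next_head = (ring_head + 1) % RING_SIZE;
    if (next_head == ring_tail) {
        return -1;                  /* no room left in the buffer */
    }
    ring_data[ring_head] = c;       /* write the data first... */
    ring_head = next_head;          /* ...then publish it by moving the head */
    return 0;
}

/* Same logic as remove() above, renamed for this sketch. */
int ring_remove(void) {
    if (ring_head == ring_tail) {
        return -1;                  /* buffer is empty */
    }
    int c = (unsigned char)ring_data[ring_tail];  /* read the data first... */
    ring_tail = (ring_tail + 1) % RING_SIZE;      /* ...then free the slot */
    return c;
}

/* Queue as much of `in` as fits, then drain the queue into `out`.
   Returns the number of characters that came back out. */
int round_trip(const char *in, char *out) {
    int n = 0;
    int c;
    while (*in != '\0' && ring_add(*in) == 0) {
        in++;
    }
    while ((c = ring_remove()) != -1) {
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

Feeding "BRITNEYSPEARS" through round_trip() gives back the same 13 characters in the same order. If you keep calling ring_add() on an empty queue without removing anything, it accepts RING_SIZE - 1 characters (63 here) before returning -1, because one slot is always left unused.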

OK, so back to the original example of transmitting 50 bytes. In this case, the main loop will use the add() function. You will add 50 bytes to the queue by calling add() 50 times. In the meantime, the microcontroller peripheral will interrupt every time it successfully sends a byte. The interrupt handler will call remove(), tell the peripheral to transmit the character it just removed, and give control back to the main loop. Later on, the transmission of that character will complete, the interrupt will occur again, it will send the next character, and so on until all 50 bytes have been sent, at which point the queue will be empty. It works like a charm!

The only problem remaining is that the first time you need to send a character, the transmitter isn’t running yet. So you have to “prime the pump” by telling the transmitter to start. Sometimes a peripheral will have a way to induce the first interrupt to get the ball rolling. Otherwise, you may have to add some extra logic that forces the first transmission to occur from the main loop. Since this technically breaks the “single consumer and single producer” rule, this one exception to the rule may still require some careful interrupt safety. I’ll talk more about that in a later post. The main purpose of this post is just to get you familiar with ring buffers and their inherent interrupt safety (aside from that one “gotcha”). They are very, very useful and very, very common. I’ve used them in drivers for UARTs and CANbus, and as I understand it, they are very common in audio applications as well.
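
The 50-byte example, including the “prime the pump” step, can be sketched with a simulated transmitter. Everything hardware-related below is an assumption made for illustration: uart_send, tx_complete_isr, send_byte, tx_busy and the wire array are hypothetical stand-ins, and on a real microcontroller the interrupt would fire on its own instead of being called from a loop.

```c
#include <stdint.h>

#define RING_SIZE   64
typedef uint8_t ring_pos_t;
volatile ring_pos_t ring_head;
volatile ring_pos_t ring_tail;
volatile char ring_data[RING_SIZE];

/* Same logic as add() and remove() above, renamed for this sketch. */
int ring_add(char c) {
    ring_pos_t next_head = (ring_head + 1) % RING_SIZE;
    if (next_head == ring_tail) {
        return -1;                  /* no room left */
    }
    ring_data[ring_head] = c;
    ring_head = next_head;
    return 0;
}

int ring_remove(void) {
    if (ring_head == ring_tail) {
        return -1;                  /* empty */
    }
    int c = (unsigned char)ring_data[ring_tail];
    ring_tail = (ring_tail + 1) % RING_SIZE;
    return c;
}

/* ---- Simulated hardware: everything below is hypothetical ---- */
char wire[128];                     /* bytes that "went out on the wire" */
int wire_len;
volatile int tx_busy;               /* 1 while the fake transmitter has work */

/* Stand-in for writing a byte to the peripheral's data register. */
void uart_send(char c) {
    wire[wire_len++] = c;
    tx_busy = 1;
}

/* Stand-in for the "byte transmitted" interrupt handler (the consumer). */
void tx_complete_isr(void) {
    int c = ring_remove();
    if (c != -1) {
        uart_send((char)c);
    } else {
        tx_busy = 0;                /* queue drained; transmitter goes idle */
    }
}

/* Main-loop side (the producer): queue a byte, priming the pump if idle. */
int send_byte(char c) {
    if (ring_add(c) != 0) {
        return -1;
    }
    /* On real hardware this check-and-start is the one spot that would
       likely need interrupts masked, since it consumes from the main loop. */
    if (!tx_busy) {
        tx_complete_isr();
    }
    return 0;
}

/* Queue n bytes, then let the fake interrupt fire until the queue is empty.
   Returns how many bytes reached the wire. */
int simulate_transmit(int n) {
    ring_head = 0;
    ring_tail = 0;
    wire_len = 0;
    tx_busy = 0;
    for (int i = 0; i < n; i++) {
        if (send_byte((char)('A' + i % 26)) != 0) {
            break;                  /* queue full; stop queuing */
        }
    }
    while (tx_busy) {
        tx_complete_isr();          /* pretend the interrupt fired again */
    }
    return wire_len;
}
```

Calling simulate_transmit(50) sends the first byte directly from send_byte() because the transmitter is idle, queues the other 49, and then drains them one “interrupt” at a time, so all 50 bytes end up in wire in the order they were queued.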

I hope this made some sense. If it didn’t, I hope my next posting about UARTs will help clear things up.

Note: The information I have given may not be completely correct if you’re dealing with a system that has more than one processor or core, or other weird stuff like that. In that case you may need something extra such as a memory barrier between writing the queue data and updating the index. That’s beyond the scope of this article, which is targeted toward microcontrollers. This technique should work fine on microcontrollers with a single processor core.