For the next post in my series about upgrading my Chumby 8’s Linux kernel (following parts 1, 2, and 3), I thought I’d look at what was involved in getting the reboot and poweroff commands working properly. I noticed early in the development process that they didn’t work, which was pretty annoying:

The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
[   46.457580] reboot: Restarting system
[   47.458947] Reboot failed -- System halted

This meant that in order to restart the Chumby, I had to physically press a button to power it off and again to power it back on. As you can imagine, this got old really fast during development. For that reason it was one of the earliest things that I got working.

I actually implemented it in U-Boot first, but I thought the Linux side of it would be more fun to share. If you want to see what was involved on the U-Boot side, see this commit from my fork of U-Boot.

Before I got it working in both U-Boot and Linux, I had to do some research to figure out how the reboot and poweroff commands worked in the stock Chumby 2.6.28 kernel, as well as how they are expected to work in modern kernels. On a lot of ARM variants, the existing code in the kernel handles it all for you and you don’t have to do anything special. Unfortunately, the PXA168 is not one of those platforms. It’s not a problem unique to the PXA168 though. On some boards you might need to toggle a GPIO pin, set a bit in a register, enable the watchdog timer and let it expire, or send a command to an external microcontroller. Modern mainline kernels have a bunch of reset drivers for these cases; see the drivers/power/reset directory in the kernel source. This directory didn’t exist in the 2.6.28 kernel, so the mechanism has changed quite a bit over the years.

The Chumby 8/Insignia Infocast 8 was known internally by Chumby as “silvermoon”. Thus, one way to find a bunch of the Chumby-specific kernel customizations is simply to search the Chumby 2.6.28 kernel source for the string silvermoon:

cd linux-2.6.28
grep -iR silvermoon

This revealed all sorts of custom drivers and patches to the kernel, including: touchscreen support, PWM backlight control, a custom boot logo, several sound-related tweaks and drivers, and tweaks to the PXA168 framebuffer, USB, and UART drivers. If you’re curious about which topics are remaining to be covered in this series, that’s a pretty accurate list!

Several lines it found in the UART driver looked interesting:

drivers/serial/pxa.c:static void silvermoon_pm_power_off(void)
drivers/serial/pxa.c:static void silvermoon_pm_restart(char ignored)
drivers/serial/pxa.c:		arm_pm_restart = silvermoon_pm_restart;
drivers/serial/pxa.c:		pm_power_off   = silvermoon_pm_power_off;

At first glance, it seemed kind of…odd…that the serial driver would have functions related to poweroff and reboot. As I investigated more deeply, it started to make sense. The Chumby 8’s power control is actually handled by the STM32F101 “cryptoprocessor” on the same board. The PXA168 communicates with the STM32 through a UART, and then the STM32 twiddles GPIO pins in response to enable/disable power rails and/or reset the PXA168. It seems that it was easiest for the original developers to insert the reset code into the UART driver since it was all set up for sending and receiving data.

I investigated the silvermoon_pm_restart() and silvermoon_pm_power_off() functions in more depth. They are both pretty simple. They send a command 8 times over the UART: “RSET” for reset or “DOWN” for power down. The repetition is to be safe, because the UART communication could potentially fail.

static void silvermoon_pm_power_off(void)
{
	struct uart_pxa_port *up = serial_pxa_ports[2];

	CHLOG("Powering off...\n");
	// Have the CP power us down.
	if(!up) {
		CHLOG("serial_pxa_ports is NULL.  Try to power off too soon?\n");
		return;
	}

	/*
	 * Due to timing reasons and lack of flow-control, the first try might
	 * not bring the system down.  Try multiple times.
	 */
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);
	send_cp_command(up, "DOWN", NULL, NULL);

	CHLOG("We should be powered down now.\n");
}

These functions both call a common send_cp_command() function that sets up the UART for simple polling, tries to get the cryptoprocessor’s attention by sending “!!!!”, waits for a confirmation response of ‘?’, repeats if there is no response, and finally sends the command followed by a linefeed and carriage return.
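Here is a minimal userspace sketch of that handshake, just to make the sequence concrete. It is not the stock Chumby code: the cp_io callbacks, the mock coprocessor, and the retry counts (CP_ATTENTION_TRIES and CP_POLLS_PER_TRY) are all hypothetical names and values chosen for illustration, and the delays between steps are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport hooks; a real driver would touch UART registers. */
struct cp_io {
	void (*put)(void *ctx, char c); /* send one byte */
	int (*get)(void *ctx);          /* return a byte, or -1 if none waiting */
	void *ctx;
};

#define CP_ATTENTION_TRIES 5  /* assumed value, not from the original code */
#define CP_POLLS_PER_TRY   10 /* assumed value, not from the original code */

/* Returns 0 once the command has been sent, -1 if '?' never arrived. */
static int cp_send_command(const struct cp_io *io, const char *cmd)
{
	assert(io != NULL && cmd != NULL);

	for (int attempt = 0; attempt < CP_ATTENTION_TRIES; attempt++) {
		/* Ask for the coprocessor's attention. */
		for (int i = 0; i < 4; i++)
			io->put(io->ctx, '!');

		/* Poll for the '?' confirmation before giving up on this try. */
		for (int poll = 0; poll < CP_POLLS_PER_TRY; poll++) {
			if (io->get(io->ctx) != '?')
				continue;

			/* Confirmed: send the command, then LF and CR. */
			for (const char *p = cmd; *p; p++)
				io->put(io->ctx, *p);
			io->put(io->ctx, '\n');
			io->put(io->ctx, '\r');
			return 0;
		}
	}
	return -1;
}

/* Mock coprocessor that answers '?' after a set number of '!' bytes. */
struct mock_cp {
	char sent[128];       /* everything written so far, NUL-terminated */
	size_t sent_len;
	int bangs_before_ack; /* '!' bytes required before replying */
	int bangs_seen;
	int ack_pending;
};

static void mock_put(void *ctx, char c)
{
	struct mock_cp *m = ctx;

	if (m->sent_len < sizeof(m->sent) - 1)
		m->sent[m->sent_len++] = c;
	if (c == '!' && ++m->bangs_seen >= m->bangs_before_ack)
		m->ack_pending = 1;
}

static int mock_get(void *ctx)
{
	struct mock_cp *m = ctx;

	if (!m->ack_pending)
		return -1;
	m->ack_pending = 0;
	return '?';
}

/* Convenience check for what the mock has received so far. */
static int mock_sent_is(const struct mock_cp *m, const char *expected)
{
	return strcmp(m->sent, expected) == 0;
}
```

With a mock that only answers after eight '!' bytes, the first attempt times out and the second one gets through, so the bytes on the wire are "!!!!!!!!" followed by the command, a linefeed, and a carriage return. The callers in the stock kernel then repeat the whole thing 8 times on top of this.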

With that knowledge in hand, I decided to figure out a clean approach for adding this functionality to newer kernels. I wanted to do it the “correct” way now that there is a dedicated subsystem for reset drivers, rather than patching it directly into the PXA serial driver. Let’s walk backwards from the “Reboot failed” error message and trace out how rebooting works in modern kernels:

$ grep -R 'Reboot failed'
arch/mips/kernel/reset.c:	pr_emerg("Reboot failed -- System halted\n");
arch/sparc/kernel/process_32.c:	panic("Reboot failed!");
arch/sparc/kernel/reboot.c:	panic("Reboot failed!");
arch/openrisc/kernel/process.c:	pr_emerg("Reboot failed -- System halted\n");
arch/arm64/kernel/process.c:	printk("Reboot failed -- System halted\n");
arch/arm/kernel/reboot.c:	printk("Reboot failed -- System halted\n");
arch/microblaze/kernel/reset.c:	pr_emerg("Reboot failed -- System halted\n");

Clearly, the error message is coming from architecture-specific code. In particular, we are interested in the ARM version:

void machine_restart(char *cmd)
{
	local_irq_disable();
	smp_send_stop();

	do_kernel_restart(cmd);

	/* Give a grace period for failure to restart of 1s */
	mdelay(1000);

	/* Whoops - the platform was unable to reboot. Tell the user! */
	printk("Reboot failed -- System halted\n");
	while (1);
}

This seems pretty straightforward, and explains why the failure message appears almost exactly one second after the “Restarting system” message. The relevant function is do_kernel_restart(), which is actually a platform-independent function in kernel/reboot.c:

void do_kernel_restart(char *cmd)
{
	atomic_notifier_call_chain(&restart_handler_list, reboot_mode, cmd);
}

Aha, so it’s just executing a list of registered restart handlers. Right above do_kernel_restart(), there’s a function called register_restart_handler() that adds a handler to the list. That function is called extensively in lots of drivers, especially the power/reset directory I mentioned earlier. It’s also called by the common ARM setup_arch function, which is how it’s hooked up on a lot of the ARM platforms.
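To illustrate the idea, here is a small userspace model of a priority-ordered handler chain. It is only a sketch and not the kernel’s implementation: the model_ functions are hypothetical stand-ins that mirror the kernel names, the struct is a simplified copy of the kernel’s notifier_block, and the priority values in the usage are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_DONE      0x0000 /* handler did nothing final, keep going */
#define NOTIFY_STOP_MASK 0x8000 /* handler asks the chain to stop */

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
	int priority;
};

static struct notifier_block *restart_handler_list;

/* Insert a handler so that higher priorities are called first. */
static void model_register_restart_handler(struct notifier_block *nb)
{
	struct notifier_block **pos = &restart_handler_list;

	assert(nb != NULL && nb->notifier_call != NULL);
	while (*pos && (*pos)->priority >= nb->priority)
		pos = &(*pos)->next;
	nb->next = *pos;
	*pos = nb;
}

/* Walk the chain in order; returns how many handlers were called. */
static int model_do_kernel_restart(char *cmd)
{
	int calls = 0;

	for (struct notifier_block *nb = restart_handler_list; nb; nb = nb->next) {
		calls++;
		if (nb->notifier_call(nb, 0, cmd) & NOTIFY_STOP_MASK)
			break;
	}
	return calls;
}

/* Example handler: records its priority and "fails" to reset anything. */
static int call_order[8];
static int call_count;

static int logging_handler(struct notifier_block *nb,
			   unsigned long action, void *data)
{
	(void)action;
	(void)data;
	if (call_count < 8)
		call_order[call_count++] = nb->priority;
	return NOTIFY_DONE;
}
```

On real hardware a working handler resets the machine and never returns. If every handler returns, as in this model, control comes back to machine_restart(), which is exactly when the “Reboot failed” message gets printed.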

I inspected some of the other reset drivers to see how they worked. It’s all pretty simple. In the probe function, you set up any necessary resources (e.g. I/O addresses) and register the reset handler. Some of them also deal with poweroff. Those drivers set the global pm_power_off variable, a function pointer that the kernel calls when the system needs to be powered off. I knew I would need to implement that functionality too. By the way, it looks like there is a new register_sys_off_handler() function that is similar to register_restart_handler(). Most drivers aren’t using it yet though, and pm_power_off eventually ends up being hooked up to it under the hood. If I submit this driver upstream, I should probably use the new mechanism instead.

Anyway, the Chumby’s requirements are a little unique compared to the other drivers I inspected. It’s not just a simple GPIO pin toggle or a register write. It’s an algorithm that requires two-way communication with the STM32. I have to send the “!!!!”, wait for the ‘?’, retry a few times if I don’t get a response, and then send the command. Then, that whole process has to be retried several times in case it fails due to serial communication problems. The closest driver I found was qnap-poweroff, which is designed for some QNAP and Synology NAS devices. It sends a single character through a UART to tell the system to shut down.

My driver needed to do a bit more work, but I liked the way the qnap-poweroff driver handled it. Rather than trying to figure out how to interact with any serial drivers, it takes over complete control of the UART. It seems to be a 16550A-compatible UART just like the PXA168. It sets it up with the correct baud rate and framing, disables the FIFOs and interrupts, and manually sends the command by directly writing to the UART’s TX register.

I figured I could do pretty much the same thing, just with some added two-way communication. After all, we’re trying to shut down or reboot, so it’s not really a big deal if I take over the entire CPU busy-waiting for a response from the STM32. In some ways it’s even preferable. It doesn’t depend on any other drivers or subsystems that could introduce other failures. It just does a sequence of register reads and writes that should be pretty safe as long as the hardware is still working. With that in mind, I coded up a reset/poweroff driver.

One thing that required a bit of thinking was how I would handle it from a device tree standpoint. The thing is, the cryptoprocessor also has to be functional inside of Linux because you can run the cpi utility to communicate with it. I wanted to retain compatibility with cpi, because I could use it for purposes such as determining how many seconds have elapsed since the cryptoprocessor restarted. This would allow me to emulate an RTC, for example. So I needed both the normal serial driver and this new reset driver to work. They both needed to be mapped to the same address. This turned out to not be a problem at all. The reset driver doesn’t interfere with the UART driver during normal operation and they seem to coexist peacefully:

uart3: serial@d4026000 {
	compatible = "mrvl,mmp-uart", "intel,xscale-uart";
	reg = <0xd4026000 0x1000>;
	reg-shift = <2>;
	interrupts = <29>;
	clocks = <&soc_clocks PXA168_CLK_UART2>;
	resets = <&soc_clocks PXA168_CLK_UART2>;
	status = "okay";
};
/* Note: Mapped to same address as UART3 */
reset: reset@d4026000 {
	compatible = "chumby,chumby8-reboot-controller";
	reg = <0xd4026000 0x1000>;
	clocks = <&soc_clocks PXA168_CLK_UART2>;
	resets = <&soc_clocks PXA168_CLK_UART2>;
	status = "okay";
};

I don’t think it would be much fun to go line-by-line through the entire driver, but here are a few highlights of how it’s set up:

static struct chumby8_rebootcon {
	struct clk *clk;
	void __iomem *base;
	struct notifier_block nb;
} c8_poweroff;

I keep track of the UART’s peripheral clock. This is needed so that I can calculate the correct divisor to use in order to set up the UART for 115,200 baud before sending a command. The stock Chumby 2.6.28 reset code didn’t do this, so if something in userspace didn’t set up the baud rate correctly, the reboot/poweroff commands wouldn’t work. This was a nonissue with the stock Chumby firmware because it always set it up properly somewhere in userspace. I wanted to make sure it would always work in my updated kernel, regardless of what userspace had or hadn’t done. The base address is also saved so that I know where the UART registers are. Finally, “nb” is the notifier block which contains a function pointer to my restart callback.

When a command is ready to be sent to the UART, I manually set up various UART registers. I turn off the FIFOs, turn off interrupts, and set it up for 115,200 8N1.

#include <uapi/linux/serial_reg.h>

/* Disable UART interrupts and FIFOs; UUE must stay set to keep the UART enabled */
chumby8_coproc_serial_out(UART_IER, UART_IER_UUE);
chumby8_coproc_serial_out(UART_FCR, 0);

/* Ensure UART is set for 115200, 8N1 */
clk_rate = clk_get_rate(c8_poweroff.clk);
divisor = clk_rate / 16 / 115200;
chumby8_coproc_serial_out(UART_LCR, UART_LCR_DLAB | UART_LCR_WLEN8);
chumby8_coproc_serial_out(UART_DLL, divisor & 0xFF);
chumby8_coproc_serial_out(UART_DLM, divisor >> 8);
chumby8_coproc_serial_out(UART_LCR, UART_LCR_WLEN8);

Luckily, the 16550A UART is so common that Linux has a header that includes defines for the register offsets and bits. Note that the PXA168 UART has a special non-standard UUE (UART unit enable) bit in the interrupt enable register that has to be set to a 1 in order to be used.
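As a worked example of the divisor math, here is a tiny standalone sketch. The uart_divisor_for() helper and its struct are hypothetical names for illustration, and the 14.7456 MHz clock used in the usage note is an assumed value; in the driver the real rate comes from clk_get_rate() at runtime.

```c
#include <assert.h>
#include <stdint.h>

/* The two bytes that end up in the divisor latch registers. */
struct uart_divisor {
	uint8_t dll; /* low byte, written to UART_DLL */
	uint8_t dlm; /* high byte, written to UART_DLM */
};

/*
 * A 16550-style UART samples at 16x the baud rate, so the divisor is
 * the input clock divided by 16 and then by the desired baud rate.
 */
static struct uart_divisor uart_divisor_for(unsigned long clk_rate,
					    unsigned long baud)
{
	assert(baud != 0);

	unsigned long divisor = clk_rate / 16 / baud;
	struct uart_divisor d = {
		.dll = divisor & 0xFF,
		.dlm = (divisor >> 8) & 0xFF,
	};
	return d;
}
```

Assuming a 14,745,600 Hz UART clock, 115,200 baud works out to 14745600 / 16 / 115200 = 8, so DLL would be 8 and DLM would be 0. A slow rate like 300 baud gives 3072 (0x0C00), which is why the divisor has to be split across two registers.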

Reading data involves checking to make sure that data is actually available to read, and then grabbing it from the register if so:

int c = -1;

if (chumby8_coproc_serial_in(UART_LSR) & UART_LSR_DR) {
	c = chumby8_coproc_serial_in(UART_RX);
}

Writing data works similarly. I wait until the UART is ready to send another byte, and then write it out:

/* wait for transmit available */
while (!(chumby8_coproc_serial_in(UART_LSR) & UART_LSR_THRE));

/* send the char */
chumby8_coproc_serial_out(UART_TX, c);

Other than my manual register accesses described above, it’s all just the same basic solution that the original Chumby code used. The original code had a few delays, and we know it works okay, so I replicated the delays exactly as they were: 300 microseconds between checks for ‘?’ after “!!!!”, and 1 millisecond between each character when writing out the final command.

After enabling this driver, adding it to my device tree as shown above, and rebooting, I tried it out. It succeeded! Now, when I reboot, I get the following output:

The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
[   17.097732] reboot: Restarting system

At that point, the system reboots instead of complaining a second later. The poweroff command is also slightly different from its previous output. It now prints this:

The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[   28.257771] reboot: Power down

Previously the last line said “reboot: System halted” instead, and it wasn’t actually turning off. I had to press the power button once to turn it off, and then again to turn it back on. Now it definitely turns off. Pressing the power button once turns it back on.

While developing in the kernel, I have occasionally run into situations where I make a mistake and the kernel gets hung up or crashes. Luckily, reset drivers play nicely with the magic SysRq functionality to reboot the system. This is enabled with CONFIG_MAGIC_SYSRQ_SERIAL=y and CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1. In minicom, I activate the magic SysRq by typing Ctrl-A, then F. This sends a break, and then I can type the letter b to reboot. Even with the system in bad shape, my reset driver still works properly and reboots it. It’s much more convenient than fiddling with the power button.

This ended up being fairly simple to implement. It has been really nice to have the reset/poweroff driver working. I’m undecided on whether I want to attempt upstreaming this driver to the mainline kernel or not. I suspect that I will always be maintaining a fork of the kernel with a few custom tweaks, especially related to the display driver. The reset driver is so simple that I’m not super concerned about the difficulty of keeping it updated as the kernel evolves. We’ll see how it goes though.

Speaking of the display driver, my next post in the series is going to talk about what was involved in getting the LCD up and running. There was a lot of work required. I’m not sure if I can even fit it into a single post! Stay tuned and you’ll find out.

Continue on to part 5, where I get the display working.
