Hardware Watchdog

This chapter provides an overview of the K3 RTI-Windowed Watchdog Timer (WWDT) driver, which supports the watchdog found in TI’s AM64x SoCs. The WWDT is a digital windowed watchdog: it must be serviced within a configurable time window. If it is serviced outside this window, or not serviced at all before the window closes, it generates an interrupt to the MCU Error Signaling Module (ESM). The ESM processes these interrupts and, if configured to do so, triggers the reset logic to reset the device.
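
The window logic can be illustrated with a small shell sketch. It is not the driver's implementation: the function name classify_service and the 50 % window used in the calls are assumptions chosen for illustration. The sketch assumes that the open window sits at the end of the timeout period and classifies a service attempt by the time elapsed since the previous one.

```shell
# classify_service ELAPSED TIMEOUT WINDOW_PERCENT
# ELAPSED: seconds since the last service
# TIMEOUT: watchdog period in seconds
# WINDOW_PERCENT: size of the open window as a percentage of the period
#                 (assumed to sit at the end of the period)
classify_service() {
    elapsed=$1
    timeout=$2
    window=$3
    # The window opens once the closed part of the period has passed.
    open_at=$(( timeout * (100 - window) / 100 ))
    if [ "$elapsed" -lt "$open_at" ]; then
        echo "too-early"   # serviced before the window opened: violation
    elif [ "$elapsed" -lt "$timeout" ]; then
        echo "ok"          # serviced inside the open window
    else
        echo "expired"     # not serviced in time: violation
    fi
}

classify_service 10 60 50   # prints "too-early"
classify_service 45 60 50   # prints "ok"
classify_service 60 60 50   # prints "expired"
```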

Note

By default, our BSPs are configured so that the ESM resets the device. This is enabled in the U-Boot bootloader by the CONFIG_ESM_K3 option in the R5 config, in combination with the esm-pins properties from the device tree.

Timeout Configuration

In our BSP, systemd is the default userspace program that services the watchdog. The timeout is set to 60 seconds by default, and systemd sends keepalive pings at half that interval, i.e. every 30 seconds. The configuration is located in /usr/lib/systemd/system.conf.d/10-watchdog.conf.

/usr/lib/systemd/system.conf.d/10-watchdog.conf
[Manager]
RuntimeWatchdogSec=60
RebootWatchdogSec=120
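
To make the relation between the configured timeout and the keepalive interval concrete, the following sketch reads RuntimeWatchdogSec from a configuration file and prints half of it. The function name watchdog_ping_interval is made up for this example. The sketch assumes the value is a plain number of seconds, as in the file above; systemd also accepts values with unit suffixes, which this sketch does not handle.

```shell
# watchdog_ping_interval FILE
# Prints half of RuntimeWatchdogSec, i.e. the keepalive interval described
# above. Assumes a plain integer value in seconds; returns 1 if none is found.
watchdog_ping_interval() {
    timeout=$(sed -n 's/^RuntimeWatchdogSec=\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | tail -n 1)
    [ -n "$timeout" ] || return 1
    echo $(( timeout / 2 ))
}

# Usage on the target:
#   watchdog_ping_interval /usr/lib/systemd/system.conf.d/10-watchdog.conf
```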

Note

Please note that systemd includes native hardware watchdog support and does not rely on a separate service for this functionality.

For more details please refer to systemd’s documentation: https://www.freedesktop.org/software/systemd/man/latest/systemd-system.conf.html#Hardware%20Watchdog

Alternatively, instead of modifying the 10-watchdog.conf file, you can set the watchdog timeout on the kernel command line, for example from the bootloader, using rti_wdt.heartbeat=10.

Set timeout
sh-uboot:~# setenv optargs "rti_wdt.heartbeat=10"
sh-uboot:~# saveenv
sh-uboot:~# boot
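
After booting, you can check whether the parameter reached the kernel. The sketch below extracts the value from a kernel command line string. The helper name heartbeat_from_cmdline is hypothetical; only the parameter name rti_wdt.heartbeat is taken from the text above.

```shell
# heartbeat_from_cmdline "CMDLINE"
# Prints the value of rti_wdt.heartbeat if it is present in the given string.
# If the parameter appears more than once, the last occurrence is printed.
heartbeat_from_cmdline() {
    value=""
    for arg in $1; do
        case "$arg" in
            rti_wdt.heartbeat=*) value=${arg#rti_wdt.heartbeat=} ;;
        esac
    done
    [ -n "$value" ] && echo "$value"
}

# Usage on the target:
#   heartbeat_from_cmdline "$(cat /proc/cmdline)"
```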

Verify the Watchdog

To find out which process is servicing the watchdog, check which process holds the device node open.

fuser
sh-phyboard-electra-am64xx-2:~# fuser /dev/watchdog0
1 # 1 is the process id of systemd
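
If fuser is not installed on your image, the same information can be read from procfs. The following sketch, with the hypothetical helper name watchdog_holders, walks the file descriptor links below /proc and prints the IDs of all processes that have the given device open. The optional second argument exists only so that the function can be tried against a directory other than /proc. Reading the descriptors of other processes usually requires root.

```shell
# watchdog_holders DEVICE [PROC_ROOT]
# Prints the PID of every process with an open file descriptor on DEVICE.
watchdog_holders() {
    dev=$1
    proc=${2:-/proc}
    for fd in "$proc"/[0-9]*/fd/*; do
        # Each entry in <pid>/fd is a symlink to the opened file.
        [ "$(readlink "$fd" 2>/dev/null)" = "$dev" ] || continue
        pid=${fd#"$proc"/}     # strip the proc root, leaves "<pid>/fd/<n>"
        echo "${pid%%/*}"      # keep only "<pid>"
    done | sort -un
}

# Usage on the target:
#   watchdog_holders /dev/watchdog0
```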

Additionally, you can log into the system and use journalctl to view information about the active watchdog device, configured timeout, and related details.

journalctl
sh-phyboard-electra-am64xx-2:~# journalctl | grep -i watchdog
phyboard-electra-am64xx-2 systemd[1]: Using hardware watchdog 'K3 RTI Watchdog', version 0, device /dev/watchdog0
phyboard-electra-am64xx-2 systemd[1]: Modifying watchdog timeout is not supported, reusing the programmed timeout.
phyboard-electra-am64xx-2 systemd[1]: Watchdog running with a timeout of 10s.
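
For scripted checks, the timeout can be extracted from such log lines. The sketch below assumes the message format shown above, with the timeout printed as a plain number of seconds. The function name watchdog_timeout_from_log is made up for this example.

```shell
# Reads log lines on stdin and prints the timeout in seconds from the most
# recent "Watchdog running with a timeout of <N>s." message, if any.
watchdog_timeout_from_log() {
    sed -n 's/.*Watchdog running with a timeout of \([0-9][0-9]*\)s\..*/\1/p' | tail -n 1
}

# Usage on the target (log of the current boot):
#   journalctl -b | watchdog_timeout_from_log
```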

Trigger the Watchdog

To verify that your device resets correctly in the event of a system failure, you can deliberately trigger a kernel panic and observe whether the watchdog causes a reset. The following command forces a kernel panic using the Linux SysRq mechanism:

Trigger Kernel panic
sh-phyboard-electra-am64xx-2:~# echo c > /proc/sysrq-trigger

/proc/sysrq-trigger is part of the Magic SysRq key mechanism in the Linux kernel. It allows users to send low-level commands directly to the kernel, even when the system is under heavy load or otherwise unresponsive (unless it is completely locked up).

By writing c to /proc/sysrq-trigger, you trigger an immediate kernel panic. This simulates a critical failure, providing a controlled way to test system behavior during a crash.

More information about SysRq commands can be found in the official Linux kernel documentation: https://www.kernel.org/doc/html/v4.10/admin-guide/sysrq.html

This test simulates a real-world system failure. In production environments, such failures should be detected and handled automatically, typically by a hardware watchdog timer.

Triggering a kernel panic allows you to confirm:

  • The system crashes as expected.

  • The watchdog detects the failure.

  • The system resets in response, demonstrating automatic recovery.

This test helps ensure your watchdog configuration is effective and that your system can recover from unrecoverable errors autonomously.
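
One way to confirm the reset from a script is to compare the kernel's boot ID before and after the test. The sketch below uses /proc/sys/kernel/random/boot_id, which changes on every boot. The helper names save_boot_id and rebooted_since and the marker path in the usage comments are assumptions for illustration. The marker file has to be on storage that survives the reset. Note that a changed boot ID only shows that the board rebooted; a non-zero kernel.panic setting would also reboot the system after a panic.

```shell
BOOT_ID_FILE=/proc/sys/kernel/random/boot_id

# save_boot_id MARKER [BOOT_ID_FILE]
# Stores the boot ID of the current boot in MARKER.
save_boot_id() {
    cat "${2:-$BOOT_ID_FILE}" > "$1"
}

# rebooted_since MARKER [BOOT_ID_FILE]
# Succeeds if the current boot ID differs from the one stored in MARKER.
rebooted_since() {
    [ "$(cat "$1")" != "$(cat "${2:-$BOOT_ID_FILE}")" ]
}

# Usage on the target (marker path is hypothetical):
#   before the test:  save_boot_id /var/lib/wdt-test.id && sync
#   after the reset:  rebooted_since /var/lib/wdt-test.id && echo "board rebooted"
```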

Disable Reset

The simplest way to prevent the reset is to disable the CONFIG_ESM_K3 driver in U-Boot or to change the ESM routing through the esm-pins configuration.