e1000e 0000:00:1f.6 ethX/ensX : Detected Hardware Unit Hang

Recently one of my Hetzner servers started going down on its own. It happened twice. The first time I just rebooted the server from the Robot control panel and assumed it wouldn't happen again (I was wrong). Three days later it happened again. Since this server didn't come with IPMI/KVMoIP to check the console screen, I decided to look through the syslog, messages and kern.log files in /var/log for anything meaningful about this issue.

kernel: [952571.824765] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
kernel: [952571.824765] TDH <68>
kernel: [952571.824765] TDT <9a>
kernel: [952571.824765] next_to_use <9a>
kernel: [952571.824765] next_to_clean <68>
kernel: [952571.824765] buffer_info[next_to_clean]:
kernel: [952571.824765] time_stamp <10e30aba8>
kernel: [952571.824765] next_to_watch <69>
kernel: [952571.824765] jiffies <10e30b000>
kernel: [952571.824765] next_to_watch.status <0>
kernel: [952571.824765] MAC Status <80083>
kernel: [952571.824765] PHY Status <796d>
kernel: [952571.824765] PHY 1000BASE-T Status <3800>
kernel: [952571.824765] PHY Extended Status <3000>
kernel: [952571.824765] PCI Status <10>
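
To see whether earlier outages line up with the same error, the logs can be searched for the hang message. This is only a minimal sketch: `count_hangs` is a hypothetical helper name, and the log paths in the example comment are assumptions that vary by distribution.

```shell
#!/bin/sh
# count_hangs FILE...: print how many e1000e hang reports the given logs contain.
# Hypothetical helper; adjust the log paths to your distribution.
count_hangs() {
    # Concatenate all files first so grep -c prints one total, not one per file.
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # aborting scripts that run with "set -e" (the printed count is still 0).
    cat -- "$@" | grep -c 'e1000e .*Detected Hardware Unit Hang' || true
}

# Example (paths are assumptions):
#   count_hangs /var/log/kern.log /var/log/syslog
```

A non-zero count around the time of each outage suggests the outages share this cause.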

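Digging through Google, I found out that the issue is related to a fix in the e1000e driver that was introduced in kernel version 4.15 (here). The workaround is to disable TCP segmentation offload (TSO) and generic segmentation offload (GSO) on the affected interface: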


ethtool -K <interface> tso off gso off
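
The command above can be wrapped in a small script that applies the setting and then shows the result. This is a sketch under assumptions, not a definitive implementation: the function names `disable_offload` and `show_offload` and the `ETHTOOL` override variable are made up for illustration. Only `ethtool -K` (set offload features) and `ethtool -k` (list them) are real ethtool options.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround above.
# ETHTOOL can be overridden (for example with a stub) to do a dry run.
ETHTOOL="${ETHTOOL:-ethtool}"

# disable_offload IFACE: turn off TSO and GSO on the given interface.
disable_offload() {
    iface="$1"
    if [ -z "$iface" ]; then
        echo "usage: disable_offload <interface>" >&2
        return 1
    fi
    # Uppercase -K changes offload settings.
    "$ETHTOOL" -K "$iface" tso off gso off
}

# show_offload IFACE: print the current TSO/GSO state of the interface.
show_offload() {
    # Lowercase -k lists the settings; keep only the two lines of interest.
    "$ETHTOOL" -k "$1" | grep -E '^(tcp-segmentation-offload|generic-segmentation-offload):'
}

# Example (interface name taken from the log above; run as root):
#   disable_offload enp0s31f6 && show_offload enp0s31f6
```

Settings changed with ethtool -K do not survive a reboot, so the command has to be re-applied at boot using whatever mechanism your distribution provides.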

