> It's only when you assume that the link beat is a "serious" sign for
> link healthiness. Unfortunately there are many error cases where a link
> can fail, but the link beat is still there - for example the software
> on the other machine crashing but the NIC still working fine. These
> seem to be the majority of the cases in fact except for demo situations
> where people pull cables on purpose.
I disagree. Two very real situations come to mind immediately:
- Many telecom providers loop back a synchronous serial line whenever
  the connection fails somewhere on the WAN. Strictly speaking this is
  not a link beat, but it can be detected instantly as an interface
  failure (!IFF_RUNNING), which is much faster than any routing
  protocol, even a highly tuned one.
- You have two redundant routers pointing into an ethernet segment. If
  the NIC of one router starts failing, the switch might turn off the
  port in question because of excessive errors (many switches do). Or
  the switch may simply lose power. Without link beat detection, the
  affected router cannot see either situation and will happily continue
  advertising the connected network, so the automated backup mechanism
  fails. ARP probing is no solution, as it can only detect that a host
  on the network has gone down, not that the router's own interface has
  lost its link.
Of course, link beat detection is no magic wand that makes all
networking problems vanish, but in my networking experience it is
important enough to be worth supporting.
Stefan