*** John Goerzen [2020-12-28 12:32]:
>1) No incoming or outgoing NNCP packets to the site;
>2) No response to PING commands in that amount of time?
>I'm guessing from the documentation the answer is #1.

Yes, it is #1. The logic is simple, as far as I can see from the code
(written years ago):

* if nothing (except for PINGs) was received or sent during
  onlinedeadline seconds, then the connection is dropped.

  onlinedeadline is just a timeout mechanism to decide when a peer must
  terminate the connection. If a peer has no packets to send, should it
  terminate the connection? Obviously not, because the remote side may
  still have packets to send. If the node sent some notification that
  it has no packets, then the connection would become kind of
  half-closed, but the remote side could keep sending packets for a
  long time, during which new packets can appear on our "already
  closed" side.

  Actually I have not thought much about how to close connections and
  how both sides could negotiate that the connection can be closed. I
  just used that simple timeout on lack of traffic.

  So onlinedeadline=20 means that if no packets (except for PINGs) were
  sent or received during 20 seconds, then the connection is closed. If
  a packet appears on either side and is transmitted, then the
  onlinedeadline timer is of course reset to wait another 20 seconds.

* if maxonlinetime is specified, then the connection will be forcefully
  terminated at (connection establishment time + maxonlinetime).
  Actually I added it as a hack. For example, you have some limitations
  (quota, speed, whatever) on your communication channel, but they
  depend on the exact time of day. For example, you can use maximal
  bandwidth in your office at night, but have to limit it during
  working hours (say from 09:00). You configure your "calls" section
  correspondingly. But if you use a large onlinedeadline (for example
  3600), then a connection established at 08:00, which has no bandwidth
  limit, and on which a packet appeared at 08:50, won't be terminated
  at 09:00, because that packet resets the onlinedeadline timer.
  In practice that connection can live forever, as long as packets
  appear at least once per onlinedeadline, so there will still be a
  live connection without a bandwidth limit at 09:00, 10:00 and any
  later time. maxonlinetime just allows to forcefully terminate it, so
  that new connections made from 09:00 use another set of limits.

* every minute, if no other packet was sent, a PING packet is sent
  (it is just a dummy empty-payload packet, but fully
  encrypted/authenticated, so we are sure that it is not some kind of
  replay)

* if no packets were received during 2*PING timeouts (2 minutes), then
  the remote side is treated as dead and the connection is terminated
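
The four rules above can be sketched as a small timer model. This is
only an illustration in Python, not NNCP's actual code: the Session
class, its method names and the order in which the checks are made are
all assumptions for the sake of the example. Only the 60 second PING
interval, the 2*PING dead-peer timeout and the onlinedeadline and
maxonlinetime semantics come from the description above.

```python
from typing import Optional

PING_INTERVAL = 60                # seconds without sending before a PING
PING_TIMEOUT = 2 * PING_INTERVAL  # seconds without receiving before giving up


class Session:
    """Hypothetical model of one online connection (times in seconds)."""

    def __init__(self, established: float, onlinedeadline: float,
                 maxonlinetime: Optional[float] = None) -> None:
        self.established = established
        self.onlinedeadline = onlinedeadline
        self.maxonlinetime = maxonlinetime
        self.last_payload = established  # last non-PING packet, either way
        self.last_rx = established       # last packet received, PING or not
        self.last_tx = established       # last packet sent, PING or not

    def on_sent(self, now: float, ping: bool = False) -> None:
        self.last_tx = now
        if not ping:
            self.last_payload = now      # PINGs do not reset onlinedeadline

    def on_received(self, now: float, ping: bool = False) -> None:
        self.last_rx = now
        if not ping:
            self.last_payload = now

    def should_ping(self, now: float) -> bool:
        # A PING is due only if nothing else was sent for a whole interval.
        return now - self.last_tx >= PING_INTERVAL

    def close_reason(self, now: float) -> Optional[str]:
        # The order of these checks is an assumption of this sketch.
        if (self.maxonlinetime is not None
                and now >= self.established + self.maxonlinetime):
            return "maxonlinetime"
        if now - self.last_rx >= PING_TIMEOUT:
            return "peer dead"
        if now - self.last_payload >= self.onlinedeadline:
            return "onlinedeadline"
        return None


if __name__ == "__main__":
    H = 3600
    for limit in (None, 3600):
        s = Session(established=8 * H, onlinedeadline=3600,
                    maxonlinetime=limit)
        s.on_received(8 * H + 50 * 60)        # real packet at 08:50
        s.on_received(9 * H - 30, ping=True)  # peer is still answering
        print(limit, s.close_reason(9 * H))
```

Run as a script, it replays the 08:00 example: without maxonlinetime
the connection is still considered alive at 09:00 because of the 08:50
packet, and with maxonlinetime=3600 it is closed at 09:00.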

>That raises the
>question: can the code be configured to use SO_KEEPALIVE or a protocol-level
>ping to hold the TCP connection open?  This would help, eg, for NAT devices
>with short timeouts or a remote that's crashed and rebooted (to detect that
>it is no longer communicating).

PINGs are already sent every minute. I think that interval is short
enough for NAT "heartbeating".

I believe (not sure) TCP keepalives won't help with NATs at all,
because, as far as I can see, by default they have huge timeouts
(2 hours) before sending any heartbeats:
https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
https://webhostinggeeks.com/howto/configure-linux-tcp-keepalive-setting/
The default FreeBSD sysctl options (in milliseconds) have the same
huge values:

    net.inet.tcp.keepidle: 7200000
    net.inet.tcp.keepintvl: 75000
    net.inet.tcp.keepinit: 75000
    net.inet.tcp.keepcnt: 8
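
To put numbers on those defaults, here is a rough calculation in
Python. It assumes the sysctl values are milliseconds and that a
connection is dropped once keepcnt probes, sent keepintvl apart, have
gone unanswered after keepidle of silence. The function names are made
up for this example.

```python
# Default FreeBSD values quoted above; assumed to be in milliseconds.
KEEPIDLE_MS = 7_200_000
KEEPINTVL_MS = 75_000
KEEPCNT = 8


def first_probe_after_s(keepidle_ms: int = KEEPIDLE_MS) -> float:
    """Seconds an idle connection waits before the first keepalive probe."""
    return keepidle_ms / 1000


def dead_peer_detected_after_s(keepidle_ms: int = KEEPIDLE_MS,
                               keepintvl_ms: int = KEEPINTVL_MS,
                               keepcnt: int = KEEPCNT) -> float:
    """Seconds of silence until all probes have gone unanswered."""
    return (keepidle_ms + keepcnt * keepintvl_ms) / 1000


if __name__ == "__main__":
    print(first_probe_after_s() / 3600)       # 2.0 (hours)
    print(dead_peer_detected_after_s() / 60)  # 130.0 (minutes)
```

Compared with a PING every 60 seconds and a 2 minute dead-peer
timeout, the stock keepalive settings react far too late to keep a
short-lived NAT mapping open.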

>Complexity is a source of security bugs.

Completely agree with you. But I will still think about "multicasting"
next year, unrelated to your use-case. Probably I will bury that
idea :-) if I don't find it simple enough and with valuable use-cases.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF