Table of contents
Linux networking: the TCP/IP stack
See also:
- MPTCP, SCTP, DCCP
See:
- hping2
man 7 tcp
Conntrack
See:
- /proc/net/nf_conntrack
- /proc/sys/net/nf_conntrack_max
apt-get install conntrack
Flush
conntrack -F
/proc/sys/net/ipv4/tcp_syn_retries
$ sysctl net.ipv4.tcp_syn_retries
net.ipv4.tcp_syn_retries = 6
Effectively, the retry timeout doubles on each attempt (1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 s), so it takes about 127 s before the connection attempt finally aborts.
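The arithmetic above can be sketched in shell (assuming the 1 s initial RTO and plain doubling, no RTO cap reached at these values):

```shell
# Total time before connect() gives up: the initial SYN plus tcp_syn_retries
# retransmissions, with the timer starting at 1 s and doubling each time.
retries=6          # default value, per: sysctl net.ipv4.tcp_syn_retries
total=0
rto=1
i=0
while [ "$i" -le "$retries" ]; do
    total=$((total + rto))   # 1 + 2 + 4 + ... + 64
    rto=$((rto * 2))
    i=$((i + 1))
done
echo "${total}s"             # prints 127s
```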
/proc/sys/net/ipv4/tcp_synack_retries
/proc/sys/net/ipv4/tcp_retries2
See also:
- /proc/sys/net/ipv4/tcp_retries
- /proc/sys/net/ipv4/tcp_syn_retries
- /proc/sys/net/ipv4/tcp_synack_retries
Cluster
In a High Availability (HA) situation consider decreasing the setting to 3.
RFC 1122 recommends at least 100 seconds for the timeout, which corresponds to a value of at least 8. Oracle suggests a value of 3 for a RAC configuration.
Number of retransmissions vs. time
An experiment confirms that (on a recent Linux at least) the timeout is more like 13 s with the suggested net.ipv4.tcp_retries2=5.
“Windows defaults to just 5 retransmissions, which corresponds with a timeout of around 6 seconds.”
- tcp_retries2=5 means timeout after the first transmission plus 5 retransmissions: 12.6 s = (2^6 - 1) * 0.2
- tcp_retries2=15: 924.6 s = (2^10 - 1) * 0.2 + (16 - 10) * 120
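Those figures can be reproduced with a small awk loop (assuming the RTO starts at the 0.2 s minimum and doubles up to the 120 s cap, i.e. the RtoMin/RtoMax values shown elsewhere in these notes):

```shell
# Approximate give-up time for a given net.ipv4.tcp_retries2
timeout_for() {
    awk -v n="$1" 'BEGIN {
        rto = 0.2; total = 0
        for (i = 0; i <= n; i++) {       # initial transmission + n retransmissions
            total += rto
            rto *= 2
            if (rto > 120) rto = 120     # RTO is capped at RtoMax (120 s)
        }
        printf "%.1f\n", total
    }'
}
timeout_for 5     # -> 12.6
timeout_for 15    # -> 924.6
```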
Source : https://github.com/elastic/elasticsearch/issues/102788
Voir aussi : https://www.elastic.co/guide/en/elasticsearch/reference/current/system-config-tcpretries.html#_related_configuration
F-RTO (Forward RTO-Recovery, net.ipv4.tcp_frto)
TCP keepalive
Configuring TCP/IP keepalive parameters for high availability clients (JDBC)
- tcp_keepalive_probes - the number of probes that are sent and unacknowledged before the client considers the connection broken and notifies the application layer
- tcp_keepalive_time - the interval between the last data packet sent and the first keepalive probe
- tcp_keepalive_intvl - the interval between subsequent keepalive probes
- tcp_retries2 - the maximum number of times a packet is retransmitted before giving up
echo "6" > /proc/sys/net/ipv4/tcp_keepalive_time
echo "1" > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo "10" > /proc/sys/net/ipv4/tcp_keepalive_probes
echo "3" > /proc/sys/net/ipv4/tcp_retries2
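These echo writes only last until reboot; to persist them, the same values can go in a sysctl drop-in (a sketch; the file name 99-tcp-keepalive.conf is arbitrary):

```
# /etc/sysctl.d/99-tcp-keepalive.conf, applied with: sysctl --system
net.ipv4.tcp_keepalive_time = 6
net.ipv4.tcp_keepalive_intvl = 1
net.ipv4.tcp_keepalive_probes = 10
net.ipv4.tcp_retries2 = 3
```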
ss -o
Process / diag tools
Tools
TCP retransmissions
See:
- net.ipv4.tcp_early_retrans
Tools:
- tcpretrans.bt (bpftrace)
- tcpretrans (perf-tools)
- tcpretrans.py (bpfcc-tools, iovisor/bcc)
Finding rto_min and rto_max
# grep ^Tcp /proc/net/snmp |column -t |cut -c1-99
Tcp:  RtoAlgorithm  RtoMin  RtoMax  MaxConn  ActiveOpens  PassiveOpens  AttemptFails  EstabResets
Tcp:  1             200     120000  -1       6834         964           161           4614
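The first Tcp: line holds the counter names and the second the values, so awk can pair them up; a sketch using the sample above (on a live system, feed it `grep ^Tcp /proc/net/snmp` instead):

```shell
# Captured sample of the two Tcp lines from /proc/net/snmp
sample='Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets
Tcp: 1 200 120000 -1 6834 964 161 4614'
# Pair each counter name (line 1) with its value (line 2)
pairs=$(echo "$sample" | awk 'NR == 1 { split($0, h); next } { for (i = 2; i <= NF; i++) print h[i] "=" $i }')
# RtoMin=200 means a 200 ms minimum retransmission timeout,
# RtoMax=120000 the 120 s cap (both in milliseconds)
echo "$pairs" | grep -E '^Rto'
```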
yum install bpftrace
/usr/share/bcc/tools/tcpretrans
timeout 60 ./tcpretrans | nl
sar -n ETCP
sar -n TCP
# netstat -s |egrep 'segments retransmited|segments send out'
107428604792 segments send out
47511527 segments retransmited
# echo "$(( 47511527 * 10000 / 107428604792 ))"
4
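The integer shell arithmetic above yields the rate in hundredths of a percent (so 4 is about 0.04 %); awk gives the percentage directly:

```shell
# Retransmitted segments as a percentage of segments sent out
awk 'BEGIN { printf "%.4f%%\n", 47511527 / 107428604792 * 100 }'   # 0.0442%
```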
https://www.ibm.com/support/pages/tracking-tcp-retransmissions-linux
tcpretransmits.sh
#!/usr/bin/bash
test -x /usr/sbin/tcpretrans.bt && TCPRETRANS=/usr/sbin/tcpretrans.bt
test -x /usr/share/bpftrace/tools/tcpretrans.bt && TCPRETRANS=/usr/share/bpftrace/tools/tcpretrans.bt
# https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans
test -x ./tcpretrans.pl && TCPRETRANS=./tcpretrans.pl

OUT=/tmp/tcpretransmits.log

if [ -z "$TCPRETRANS" ]; then
    echo "It looks like 'bpftrace' is not installed"
else
    date > $OUT
    netstat -s | awk '/segments sen. out$/ { R=$1; } /segments retransmit+ed$/ { printf("%.4f\n", ($1/R)*100); }' >> $OUT
    $TCPRETRANS | tee -a $OUT
    netstat -s | awk '/segments sen. out$/ { R=$1; } /segments retransmit+ed$/ { printf("%.4f\n", ($1/R)*100); }' >> $OUT
fi
Resolving The Problem
TCP retransmissions are almost exclusively caused by failing network hardware, not applications or middleware. Report the failing IP pairs to a network administrator.
Other
tcp_low_latency (Boolean; default: disabled; since Linux 2.4.21/2.6; obsolete since Linux 4.14)
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_moderate_rcvbuf = 1
# ip route get 192.168.100.11
192.168.100.11 dev virbr1 src 192.168.100.1 uid 1000
cache
# ip route show dev virbr1
192.168.100.0/24 proto kernel scope link src 192.168.100.1
# ip route change dev virbr1 192.168.100.0/24 proto kernel scope link src 192.168.100.1 rto_min 8ms
