Difference between revisions of "Network/Visualize ping latency"


Revision as of 22:42, 20 June 2014

I recently ended up in a troubleshooting session where a performance issue between two hosts looked like a network problem. Network round trip time, which should have been in the range of 0.2 ms, shot up to around 2000 ms every 30 seconds. To narrow down the problem I set up ad hoc ping probes on multiple hosts, each pinging a different target. After letting the data collection run for a few days, I had a file of 1.5 million lines. To analyze the data I created a graph with R.

Goal

Gather a large amount of network latency data with ping on multiple hosts, store it in semicolon-separated files, and compare the data by plotting it with R.

Prerequisites

  • A bunch of Linux hosts (other operating systems might work, but the collector code would need to be adjusted)
  • R
  • Passwordless access to the hosts running the ping [optional]

Howto

Collect data

To collect the ping data, an ssh connection is made to each host that runs the ping, and the collected data is written to a file on that host. The HOSTS variable holds a space-separated list of pairs, each consisting of the ping source and the ping destination separated by a colon ":". The awk program in this example expects GNU awk, because it uses the gawk functions strftime() and gensub().

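# Space-separated list of <ping source>:<ping target> pairs
HOSTS="ping01:pong01 ping02:pong02"
for hosts in ${HOSTS}; do
    source=${hosts%%:*}                                       # part before the colon
    source_ip=$( getent hosts ${source} | cut -d " " -f 1 )   # resolve the source name locally
    target=${hosts##*:}                                       # part after the colon
    file=/var/tmp/$( date +%F )_${source}-${target}_ping-test.txt   # result file on the source host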
    ssh -f -n ${source} \
      "echo 'datetime;source_ip;target_ip;proto;sequence;ttl;time_ms' > ${file}; \
       ping -i 1 -n ${target} | \
       awk -F '[ =_]' -v source_ip=${source_ip} \
        '!/NA|^PING/{ 
           printf( \"%s;%s;%s;%s;%s;%s;%s\n\", 
                   strftime( \"%F %T\" ), 
                   source_ip, 
                   gensub( \":\", \"\", \"\", \$4 ), 
                   \$5, 
                   \$7, 
                   \$9, 
                   \$11 ); 
        }' >> ${file}" #2>/dev/null
done
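
To see which fields the awk program picks, the field splitting can be tried on a single reply line. The following is only a sketch: the sample line assumes the reply format printed by ping on Linux, parse_ping_line is a hypothetical helper name, and sub() stands in for gensub() so that the snippet also runs without GNU awk (the timestamp is left out for the same reason).

```shell
# Sample reply line (assumed format of ping on Linux).
line='64 bytes from 172.16.0.34: icmp_seq=1 ttl=64 time=10.7 ms'

# Split on space, "=" and "_" like the collector does, then print the
# same fields it selects: target IP, protocol, sequence, ttl, time.
parse_ping_line() {
    awk -F '[ =_]' '{
        target = $4
        sub(/:$/, "", target)    # strip the trailing colon after the IP
        printf("%s;%s;%s;%s;%s\n", target, $5, $7, $9, $11)
    }'
}

parsed=$(printf '%s\n' "${line}" | parse_ping_line)
echo "${parsed}"
```

With the sample line above this prints 172.16.0.34;icmp;1;64;10.7, which matches the last five columns of the data file.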

The collected data file looks like the excerpt below:

datetime;source_ip;target_ip;proto;sequence;ttl;time_ms
2014-06-20 23:30:05;10.0.0.34;172.16.0.34;icmp;1;64;10.7
2014-06-20 23:30:06;10.0.0.34;172.16.0.34;icmp;2;64;10.0
2014-06-20 23:30:07;10.0.0.34;172.16.0.34;icmp;3;64;4.61
2014-06-20 23:30:08;10.0.0.34;172.16.0.34;icmp;4;64;11.5
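
Before moving on, a quick look at a collected file can confirm that the format is as expected. The snippet below is a minimal sketch and not part of the original procedure: it writes the excerpt above to a temporary file (a stand-in for a real result file), then uses awk to count the samples and find the slowest one.

```shell
# Temporary stand-in for a real result file, filled with the excerpt above.
sample=$(mktemp)
cat > "${sample}" <<'EOF'
datetime;source_ip;target_ip;proto;sequence;ttl;time_ms
2014-06-20 23:30:05;10.0.0.34;172.16.0.34;icmp;1;64;10.7
2014-06-20 23:30:06;10.0.0.34;172.16.0.34;icmp;2;64;10.0
2014-06-20 23:30:07;10.0.0.34;172.16.0.34;icmp;3;64;4.61
2014-06-20 23:30:08;10.0.0.34;172.16.0.34;icmp;4;64;11.5
EOF

# Skip the header line, count the samples and remember the largest
# value of the time_ms column (field 7).
summary=$(awk -F ';' '
    NR > 1 { n++; if ($7 + 0 > max) max = $7 + 0 }
    END    { printf("%d;%s\n", n, max) }
' "${sample}")

echo "${summary}"    # <number of samples>;<largest time_ms>
rm -f "${sample}"
```

For the four sample rows this prints 4;11.5.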

Combining the files

Depending on how long the test runs, the files can become quite large. For the analysis, the data from all files needs to be merged into one large file. This is again best done with a loop.
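
One possible way to write such a loop is sketched below. It assumes the result files have already been copied from the source hosts to a local directory; merge_ping_files and the output file name in the example call are hypothetical names chosen for illustration. The header is written once, and the header line of each input file is skipped.

```shell
# Merge several result files into one, keeping the header only once.
# Arguments: output file, followed by the input files.
merge_ping_files() {
    out=$1
    shift
    # Start the merged file with the column header.
    echo 'datetime;source_ip;target_ip;proto;sequence;ttl;time_ms' > "${out}"
    for f in "$@"; do
        # Append everything except the first line (the per-file header).
        tail -n +2 "${f}" >> "${out}"
    done
}

# Example call, assuming the files are in the current directory:
# merge_ping_files ping-test_all.txt *_ping-test.txt
```

The output name in the example call deliberately does not match the *_ping-test.txt pattern, so a second run does not merge the previous result into itself.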

See also