Difference between revisions of "Sar/Visualize CPU data"
Revision as of 21:08, 8 June 2014
This is a five-minute guide to visualizing Linux sar data, as collected by the sysstat package, without much mangling of the data.
Goal
Create CPU graphs in R from the sar utility without massaging the output data too much.
Prerequisites
- The Linux sysstat package installed and configured to report performance data.
- R
- ggplot2 R library
Howto
Dumping the sar data with sadf
The data sar collects is stored in a binary format and must first be converted to a format that R can import. This is done with the sadf command, which converts the collected data into semicolon-delimited tabular data.
Note: On CentOS 6 and higher, sadf also prints a header line. To make use of it, the header needs slight changes: remove the leading # and remove the % signs from the CPU column names, but only on the first line. All other lines that start with # or contain LINUX-RESTART should also be removed. Your mileage may vary!
sadf -t -d -P ALL <SAR-FILE> | \
  sed -e '1,1s/\(^#\|%\)//g' \
      -e '/\(^#\|LINUX-RESTART\)/d' \
  > <SADF-OUTPUT>
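To see what the sed filter above does, it can help to run it on a few lines by hand. The following is a minimal sketch: the helper name clean_sadf and the sample lines are made up for illustration and only approximate the shape of sadf output with a header, and the \| alternation assumes GNU sed.

```shell
# Hypothetical helper: the same sed filter as in the pipeline, reading stdin.
# Note: \| (alternation) is a GNU sed extension.
clean_sadf() {
    sed -e '1,1s/\(^#\|%\)//g' \
        -e '/\(^#\|LINUX-RESTART\)/d'
}

# Made-up sample lines (not real measurements), assumed to resemble
# sadf output that includes header and restart lines.
sample='# hostname;interval;timestamp;CPU;%user;%nice;%system;%iowait;%steal;%idle
host1;600;2014-06-08 00:10:01;-1;1.50;0.00;0.50;0.10;0.00;97.90
host1;-1;2014-06-08 00:15:00;LINUX-RESTART
# hostname;interval;timestamp;CPU;%user;%nice;%system;%iowait;%steal;%idle
host1;600;2014-06-08 00:30:01;-1;2.00;0.00;0.70;0.20;0.00;97.10'

# Keep the cleaned header, drop the repeated header and the restart line.
cleaned=$(printf '%s\n' "$sample" | clean_sadf)
printf '%s\n' "$cleaned"
```

Only the first header survives, with # and % stripped, so the column names become timestamp, CPU, user and so on. A single space remains in front of hostname because the filter removes only the # character itself.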
Importing the data into R
The next step is to read the tabular data into R and plot the graphs. Only a handful of commands are needed. In R, type the following commands:
library( ggplot2 )
cpu.data <- read.csv( file="<SADF-OUTPUT>", sep=";" )
cpu.data$timestamp <- as.POSIXct( cpu.data$timestamp )
cpu.data$CPU[ cpu.data$CPU == "-1" ] <- "all"
cpu.graph <- ggplot( data=cpu.data, aes( x=timestamp, y=user, group=CPU, colour=CPU ) )
cpu.graph + geom_line()
This results in a graph like this:
Plotting each CPU separately
To show which CPU or core is used most, it is probably better to plot each CPU separately. The ggplot2 library comes with a nifty function called facet_grid(). To plot each CPU in its own panel, simply add it to the end of the previous command.
cpu.graph + geom_line() + facet_grid( CPU~. )
This results in a graph like this:
Saving the graph to a file
Saving to a file is a must, especially when automation comes into play. This example shows how to save the graph as a PNG.
setwd( "/home/r-user/cpu-data" )
png( "cpu-graph-grid.png", width=600, height=360, res=72 )
cpu.graph + geom_line() + facet_grid( CPU~. )
dev.off()
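For automation, the R commands from this guide could be collected in a script file and run non-interactively. The following is a rough sketch, not a tested setup: the file names plot-cpu.R and sadf-output.csv are hypothetical, and it assumes Rscript is on the PATH and ggplot2 is installed. The plot is wrapped in print() so that it is drawn even when the code is not typed at the interactive prompt.

```shell
# Hypothetical file names; adjust them to your setup.
r_script="plot-cpu.R"
sadf_output="sadf-output.csv"
png_file="cpu-graph-grid.png"

# Quoted heredoc ('EOF') so the shell leaves R's $ operators untouched.
cat > "$r_script" <<'EOF'
library( ggplot2 )
# args[1]: cleaned sadf output, args[2]: PNG file to write
args <- commandArgs( trailingOnly=TRUE )
cpu.data <- read.csv( file=args[1], sep=";" )
cpu.data$timestamp <- as.POSIXct( cpu.data$timestamp )
cpu.data$CPU[ cpu.data$CPU == "-1" ] <- "all"
cpu.graph <- ggplot( data=cpu.data, aes( x=timestamp, y=user, group=CPU, colour=CPU ) )
png( args[2], width=600, height=360, res=72 )
print( cpu.graph + geom_line() + facet_grid( CPU~. ) )
dev.off()
EOF

# Run the script only when Rscript and the input file are both present.
if command -v Rscript >/dev/null 2>&1 && [ -f "$sadf_output" ]; then
    Rscript "$r_script" "$sadf_output" "$png_file"
else
    echo "Skipping plot: Rscript or $sadf_output not found" >&2
fi
```

Passing the file names as arguments keeps the R script free of hard-coded paths, so the same script can be reused for each day's sar file.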