Subject: RE: [Bacula-users] monitoring bacula with Nagios
From: "Julian Hein" <jhein@netways.de>
To: <bacula-users@lists.sourceforge.net>

> Anyway: I would really like to write such a check_bacula
> don't know what I need to implement to achieve a successful
> authentication. And maybe to get some infos out. Like current
> jobs, runtime or so.

We check Bacula with Nagios in two ways. First, we check on all servers that the necessary services are running: the fd on all Bacula clients (Windows and Linux), the directors, the sd, etc. Second, we look in Bacula's MySQL database to see whether there is a successful job for every host within the last 24 hours:

1. Check if the fd is running
=============================

check_command check_spezial_procs_by_ssh!2:!1:!bacula-fd

check_command check_spezial_procs_by_ssh!2:!1:!bacula-sd

check_command check_spezial_procs_by_ssh!2:!1:!bacula-dir

check_command check_nt_service!bacula

# check for services by name with ssh
command_name check_spezial_procs_by_ssh
command_line $USER1$/check_by_ssh -t 60 -H $HOSTADDRESS$ -C "/opt/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -C $ARG3$"

# check for the bacula-fd on windows with nsclient
command_name check_nt_service
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p portno. -s password -v SERVICESTATE -l $ARG1$
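
A note on the thresholds above: check_procs takes Nagios-style ranges for -w and -c, and a range written as "N:" raises the alert when the value drops below N. So "2:" and "1:" should mean WARNING with fewer than two matching processes and CRITICAL with none. Here is a minimal Perl sketch of that rule, handling only the "N:" form. The helper names below_min_range and procs_state are made up for illustration and are not part of check_procs:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if $value violates a "MIN:" range,
# i.e. the value is below MIN. Only this one range form is handled.
sub below_min_range {
    my ($value, $range) = @_;
    my ($min) = $range =~ /^(\d+):$/
        or die "unsupported range '$range'\n";
    return $value < $min ? 1 : 0;
}

# Hypothetical helper: map a process count to a Nagios state name.
# CRITICAL is tested first so that it wins over WARNING.
sub procs_state {
    my ($count, $warn, $crit) = @_;
    return 'CRITICAL' if below_min_range($count, $crit);
    return 'WARNING'  if below_min_range($count, $warn);
    return 'OK';
}

# Thresholds as passed in the service definitions: -w 2: -c 1:
for my $n (0, 1, 2) {
    printf "%d process(es): %s\n", $n, procs_state($n, '2:', '1:');
}
```

With these thresholds a single matching process is reported as WARNING, so the values are worth adjusting if a daemon normally runs as one process on your systems.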

2. Is there a successful job in the database
============================================

check_command check_bacula_by_ssh!27!1!1

The names of our backup jobs have to match the host names in Nagios, so on the backup server we can check for a job called $HOSTNAME$:

command_name check_bacula_by_ssh
command_line $USER1$/check_by_ssh -t 60 -H my.backup.server -C "/opt/nagios/libexec/check_bacula.pl -H $ARG1$ -w $ARG2$ -c $ARG3$ -j $HOSTNAME$"
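
To make the argument passing explicit: Nagios splits the check_command value on "!" and fills $ARG1$, $ARG2$, ... in the command_line, so "!27!1!1" ends up as -H 27 (hours), -w 1 and -c 1. The following Perl sketch imitates that substitution in a much simplified way. It is not how Nagios is implemented; expand_check_command is a hypothetical helper and "client1" is a made-up host name:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split "name!arg1!arg2..." on "!" and substitute the
# resulting $ARGn$ macros, plus any extra macros, into a command_line.
sub expand_check_command {
    my ($check_command, $command_line, %macros) = @_;
    my ($name, @args) = split /!/, $check_command;
    $macros{ 'ARG' . ($_ + 1) } = $args[$_] for 0 .. $#args;
    # Macros without a known value are left untouched.
    $command_line =~ s/\$(\w+)\$/exists $macros{$1} ? $macros{$1} : "\$$1\$"/ge;
    return $command_line;
}

# The part of the command_line that runs on the backup server.
my $template = '/opt/nagios/libexec/check_bacula.pl '
             . '-H $ARG1$ -w $ARG2$ -c $ARG3$ -j $HOSTNAME$';

print expand_check_command('check_bacula_by_ssh!27!1!1', $template,
                           HOSTNAME => 'client1'), "\n";
# prints: /opt/nagios/libexec/check_bacula.pl -H 27 -w 1 -c 1 -j client1
```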

my $progname = basename($0);

my %ERRORS = ( 'UNKNOWN' => '-1',

Getopt::Long::Configure('bundling');

"c=s" => \$opt_critical, "critical=s" => \$opt_critical,
"w=s" => \$opt_warning, "warning=s" => \$opt_warning,
"H=s" => \$opt_hours, "hours=s" => \$opt_hours,
"j=s" => \$opt_job, "job=s" => \$opt_job,
"h" => \$opt_help, "help" => \$opt_help,
"usage" => \$opt_usage,
"V" => \$opt_version, "version" => \$opt_version
) || die "Try '$progname --help' for more information.\n";

print "PRINT HELP...\n";

print "PRINT USAGE...\n";

my $now = defined $_[0] ? $_[0] : time;
my $out = strftime("%Y-%m-%d %X", localtime($now));

my $now = defined $_[0] ? $_[0] : time;
my $new = $now - ((60*60*1) * $day);
my $out = strftime("%Y-%m-%d %X", localtime($new));

exit $ERRORS{'UNKNOWN'};

exit $ERRORS{'UNKNOWN'};

print "$progname 0.0.1\n";
exit $ERRORS{'UNKNOWN'};

if ($opt_job && $opt_warning && $opt_critical) {
my $dsn = "DBI:mysql:database=bacula;host=localhost";
my $dbh = DBI->connect( $dsn,'root','' ) or die "Error connecting to: '$dsn': $DBI::errstr\n";

$date_stop = get_date($opt_hours);

$date_stop = '1970-01-01 01:00:00';

$date_start = get_now();

$sql = "SELECT count(*) as 'count' from Job where (Name='$opt_job') and (JobStatus='T') and (EndTime <> '') and ((EndTime <= '$date_start') and (EndTime >= '$date_stop'));";

my $sth = $dbh->prepare($sql) or die "Error preparing statement",$dbh->errstr;

while (my @row = $sth->fetchrow_array()) {

if ($count<$opt_warning) { $state='WARNING' }
if ($count<$opt_critical) { $state='CRITICAL' }

print "Bacula $state: Found $count successful jobs\n";
exit $ERRORS{$state};
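
The core of the check (time window, query, threshold comparison) can be tried without a database. The sketch below is only one possible way to write it, not the script itself. It assumes -H is always given, spells the time format as %H:%M:%S because %X depends on the locale, leaves out the (EndTime <> '') condition, and uses "?" placeholders rather than putting the job name into the SQL string. The names format_ts, job_query and state_for, and the job name "client1", are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch time as "YYYY-MM-DD HH:MM:SS" in local time.
sub format_ts {
    my ($epoch) = @_;
    return strftime("%Y-%m-%d %H:%M:%S", localtime($epoch));
}

# Hypothetical helper: return the count query and its bind values for
# jobs named $job that ended successfully within the last $hours hours.
sub job_query {
    my ($job, $hours, $now) = @_;
    $now = time unless defined $now;
    my $sql = "SELECT count(*) FROM Job WHERE Name = ? AND JobStatus = 'T' "
            . "AND EndTime >= ? AND EndTime <= ?";
    return ($sql, $job, format_ts($now - 3600 * $hours), format_ts($now));
}

# Hypothetical helper: same comparisons as in the script. CRITICAL is
# tested last so that it overrides WARNING.
sub state_for {
    my ($count, $warning, $critical) = @_;
    my $state = 'OK';
    $state = 'WARNING'  if $count < $warning;
    $state = 'CRITICAL' if $count < $critical;
    return $state;
}

my ($sql, @bind) = job_query('client1', 27);
print "$sql\n";
print "bind values: @bind\n";
print "0 jobs with -w 1 -c 1: ", state_for(0, 1, 1), "\n";
print "1 job with -w 1 -c 1: ", state_for(1, 1, 1), "\n";
```

With DBI, the statement and bind values would be passed as $dbh->prepare($sql) followed by $sth->execute(@bind). Note that with -w 1 -c 1 the WARNING state can never be reported, because any count below 1 is already CRITICAL.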

Well, this script is not really finished, but it works for us. Maybe it is helpful for you. If somebody makes enhancements, I would be happy to receive a copy.

Julian Hein                NETWAYS GmbH
Managing Director          Deutschherrnstr. 47a
Fon.0911/92885-0           D-90429 Nürnberg

jhein@netways.de           www.netways.de