%%
%%

\section*{What To Do When Bacula Crashes (Kaboom)}
\label{_ChapterStart47}
\index[general]{Kaboom!What To Do When Bacula Crashes }
\index[general]{What To Do When Bacula Crashes (Kaboom) }
\addcontentsline{toc}{section}{What To Do When Bacula Crashes (Kaboom)}

If you are running on a Linux system and you have a set of working
configuration files, it is very unlikely that {\bf Bacula} will crash. As with
all software, however, a crash remains possible, particularly if you are
running on another operating system or using a new or unusual feature.

This chapter explains what you should do if one of the three {\bf Bacula}
daemons (Director, File, Storage) crashes.

\subsection*{Traceback}
\index[general]{Traceback }
\addcontentsline{toc}{subsection}{Traceback}

Each of the three Bacula daemons has a built-in exception handler which, in
case of an error, will attempt to produce a traceback. If successful, the
traceback will be emailed to you.

For this to work, you need to ensure that a few things are set up correctly on
your system:

\begin{enumerate}
\item You must have an installed copy of {\bf gdb} (the GNU debugger), and it
   must be on {\bf Bacula's} path. On some systems such as Solaris, {\bf
   gdb} may be replaced by {\bf dbx}.
\item The script file {\bf btraceback}, which is installed with Bacula, must
   be in the same directory as the daemon which dies, and it must be marked
   as executable.
\item The script file {\bf btraceback.gdb} must have the correct path to it
   specified in the {\bf btraceback} file.
\item You must have a {\bf mail} program which is on {\bf Bacula's} path.
   By default, this {\bf mail} program is set to {\bf bsmtp}, so it must
   be correctly configured.
\end{enumerate}
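Part of this checklist can be automated. The following is a minimal sketch
and is not part of Bacula: the function name {\bf check\_traceback\_prereqs}
and its arguments are hypothetical. It only checks that the debugger and the
mail program are on the path and that {\bf btraceback} is executable. The
path to {\bf btraceback.gdb} inside the {\bf btraceback} file still has to be
checked by hand.

\footnotesize
\begin{verbatim}
#!/bin/sh
# Hypothetical helper (not shipped with Bacula): report missing
# traceback prerequisites.
# Usage: check_traceback_prereqs DAEMON_DIR [DEBUGGER] [MAILER]
check_traceback_prereqs() {
    dir=$1
    debugger=${2:-gdb}    # pass dbx on systems where it replaces gdb
    mailer=${3:-bsmtp}    # the default mail program named in the text
    status=0

    # The debugger and the mail program must be found on the path.
    for prog in "$debugger" "$mailer"; do
        if ! command -v "$prog" >/dev/null 2>&1; then
            echo "not on path: $prog"
            status=1
        fi
    done

    # btraceback must be in the daemon's directory and be executable.
    if [ ! -x "$dir/btraceback" ]; then
        echo "missing or not executable: $dir/btraceback"
        status=1
    fi

    return $status
}
\end{verbatim}
\normalsize

Run it with the environment of the user that starts the daemons, since it is
that user's path which matters. The function prints one line per problem
found and returns a non-zero status if anything is missing.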

If all the above conditions are met, the daemon that crashes will produce a
traceback report and email it to you. If they are not met,
you can either run the debugger by hand as described below, or you may be able
to correct the problems by editing the {\bf btraceback} file. I recommend not
spending too much time on trying to get the traceback to work as it can be
very difficult.

The changes that might be needed are to add a correct path to the {\bf gdb}
program, correct the path to the {\bf btraceback.gdb} file, change the {\bf
mail} program or its path, or change your email address. The key line in the
{\bf btraceback} file is:

\footnotesize
\begin{verbatim}
gdb -quiet -batch -x /home/kern/bacula/bin/btraceback.gdb \
     $1 $2 2>&1 | bsmtp -s "Bacula traceback" your-address@xxx.com
\end{verbatim}
\normalsize

Since each daemon has the same traceback code, a single {\bf btraceback} file
is sufficient if you are running more than one daemon on a machine.

\subsection*{Testing The Traceback}
\index[general]{Traceback!Testing The }
\index[general]{Testing The Traceback }
\addcontentsline{toc}{subsection}{Testing The Traceback}

To ``manually'' test the traceback feature, start {\bf Bacula}, then
obtain the {\bf PID} of the main daemon thread (there are multiple threads).
Unfortunately, the output had to be split to fit on this page:

\footnotesize
\begin{verbatim}
[kern@rufus kern]$ ps fax --columns 132 | grep bacula-dir
 2103 ?        S      0:00 /home/kern/bacula/k/src/dird/bacula-dir -c
                                       /home/kern/bacula/k/src/dird/dird.conf
 2104 ?        S      0:00  \_ /home/kern/bacula/k/src/dird/bacula-dir -c
                                       /home/kern/bacula/k/src/dird/dird.conf
 2106 ?        S      0:00      \_ /home/kern/bacula/k/src/dird/bacula-dir -c
                                       /home/kern/bacula/k/src/dird/dird.conf
 2105 ?        S      0:00      \_ /home/kern/bacula/k/src/dird/bacula-dir -c
                                       /home/kern/bacula/k/src/dird/dird.conf
\end{verbatim}
\normalsize

In this case, the {\bf PID} of the main thread is 2103. Then, while Bacula is
running, call the {\bf btraceback} program, giving it the path to the Bacula
executable and the {\bf PID}. In this case, it is:

\footnotesize
\begin{verbatim}
./btraceback /home/kern/bacula/k/src/dird/bacula-dir 2103
\end{verbatim}
\normalsize

It should produce an email showing you the current state of the daemon (in
this case the Director), and then exit leaving {\bf Bacula} running as if
nothing happened. If this is not the case, you will need to correct the
problem by modifying the {\bf btraceback} script.

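The PID lookup and the call to {\bf btraceback} can be combined. The sketch
below is only a convenience and is not part of Bacula: the function name
{\bf first\_pid} is hypothetical, and it assumes that {\bf ps fax} lists the
main process before its children, as in the listing above.

\footnotesize
\begin{verbatim}
#!/bin/sh
# Hypothetical helper: read "ps fax" output on standard input and print the
# PID (first column) of the first line that mentions the given daemon name.
# The name is passed through the environment so that it does not appear on
# awk's own command line, where ps could otherwise list it as a match.
first_pid() {
    DAEMON_NAME=$1 awk 'index($0, ENVIRON["DAEMON_NAME"]) { print $1; exit }'
}

# Intended use (paths are the ones from the example above):
#   pid=$(ps fax --columns 132 | first_pid bacula-dir)
#   ./btraceback /home/kern/bacula/k/src/dird/bacula-dir "$pid"
\end{verbatim}
\normalsize

Check the PID it prints against the {\bf ps} listing before relying on it,
since any other process whose command line mentions the daemon name would
also match.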
A typical problem is that {\bf gdb} is not on the default path. Fix this
by specifying the full path to it in the {\bf btraceback} file. Another common
problem is that the {\bf mail} program doesn't work or is not on the default
path. On some systems, it is preferable to use {\bf Mail} rather than {\bf
mail}.

\subsection*{Getting A Traceback On Other Systems}
\index[general]{Getting A Traceback On Other Systems }
\index[general]{Systems!Getting A Traceback On Other }
\addcontentsline{toc}{subsection}{Getting A Traceback On Other Systems}

It should be possible to produce a similar traceback on systems other than
Linux, either using {\bf gdb} or some other debugger. Solaris with {\bf gdb}
installed works well. On other systems, you will need to modify the {\bf
btraceback} program to invoke the correct debugger, and possibly correct the
{\bf btraceback.gdb} script to have appropriate commands for your debugger. If
anyone succeeds in making this work with another debugger, please send us a
copy of what you modified.
\label{ManuallyDebugging}

\subsection*{Manually Running Bacula Under The Debugger}
\index[general]{Manually Running Bacula Under The Debugger }
\index[general]{Debugger!Manually Running Bacula Under The }
\addcontentsline{toc}{subsection}{Manually Running Bacula Under The Debugger}

If for some reason you cannot get the automatic traceback, or if you want to
interactively examine the variable contents after a crash, you can run Bacula
under the debugger. Assuming you want to run the Storage daemon under the
debugger (the technique is the same for the other daemons, only the name
changes), you would do the following:

\begin{enumerate}
\item Start the Director and the File daemon. If the Storage daemon also
   starts, you will need to find its PID as shown above
   (\verb+ps fax | grep bacula-sd+) and kill it with a command like the
   following:

\footnotesize
\begin{verbatim}
      kill -15 PID
\end{verbatim}
\normalsize

where you replace {\bf PID} by the actual value.
\item At this point, the Director and the File daemon should be running but
   the Storage daemon should not.
\item Change ({\bf cd}) to the directory containing the Storage daemon.
\item Start the Storage daemon under the debugger:

\footnotesize
\begin{verbatim}
    gdb ./bacula-sd
\end{verbatim}
\normalsize

\item Run the Storage daemon:

\footnotesize
\begin{verbatim}
     run -s -f -c ./bacula-sd.conf
\end{verbatim}
\normalsize

You may replace {\bf ./bacula-sd.conf} with the full path to the Storage
daemon's configuration file.
\item At this point, Bacula will be fully operational.
\item In another shell command window, start the Console program and do what
   is necessary to cause Bacula to die.
\item When Bacula crashes, the {\bf gdb} shell window will become active and
   {\bf gdb} will show you the error that occurred.
\item To get a general traceback of all threads, issue the following command:

\footnotesize
\begin{verbatim}
       thread apply all bt
\end{verbatim}
\normalsize

After that you can issue any debugging command.
\end{enumerate}
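The {\bf gdb} steps of this procedure can also be scripted with a {\bf gdb}
command file, much as {\bf btraceback} does with {\bf btraceback.gdb}. The
following sketch is an assumption and is not something Bacula ships: the
function name {\bf write\_gdb\_commands} and the file names {\bf sd.gdb} and
{\bf sd-traceback.txt} are hypothetical.

\footnotesize
\begin{verbatim}
#!/bin/sh
# Hypothetical helper: write a gdb command file that starts the daemon with
# the options used above and, once gdb regains control after a crash, prints
# a traceback of all threads.
# Usage: write_gdb_commands COMMAND_FILE CONFIG_FILE
write_gdb_commands() {
    cat > "$1" <<EOF
run -s -f -c $2
thread apply all bt
EOF
}

# Intended use, from the directory containing the Storage daemon:
#   write_gdb_commands sd.gdb ./bacula-sd.conf
#   gdb -quiet -batch -x sd.gdb ./bacula-sd > sd-traceback.txt 2>&1
\end{verbatim}
\normalsize

In batch mode {\bf gdb} exits after the last command in the file, so this
variant suits unattended runs. Use the interactive procedure above if you
want to examine variables after the crash.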

\subsection*{Getting Debug Output from Bacula}
\index[general]{Getting Debug Output from Bacula }
\addcontentsline{toc}{subsection}{Getting Debug Output from Bacula}

Each of the daemons normally has debug code compiled into the program, but
disabled. There are two ways to enable the debug output. One is to add the
{\bf -d nnn} option on the command line when starting the daemon. The {\bf
nnn} is the debug level, and generally anything between 50 and 200 is
reasonable. The higher the number, the more output is produced. The output is
written to standard output.

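As an illustration of this first method, the sketch below starts a daemon in
the foreground with a debug level and captures what it prints in a file. The
function name {\bf run\_with\_debug} and the log file name are hypothetical.
The {\bf -f} and {\bf -c} options are the ones used in the debugger example
earlier in this chapter.

\footnotesize
\begin{verbatim}
#!/bin/sh
# Hypothetical wrapper: run a daemon in the foreground at a given debug
# level and save everything it prints (stdout and stderr) to a log file.
# Usage: run_with_debug DAEMON LEVEL CONFIG_FILE LOG_FILE
run_with_debug() {
    "$1" -f -d "$2" -c "$3" > "$4" 2>&1
}

# Intended use:
#   run_with_debug ./bacula-sd 100 ./bacula-sd.conf sd-debug.log
\end{verbatim}
\normalsize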
The second way of getting debug output is to turn it on dynamically from the
Console with the {\bf setdebug} command. The full syntax of the command is:

\footnotesize
\begin{verbatim}
 setdebug level=nnn client=client-name storage=storage-name dir
\end{verbatim}
\normalsize

If none of the options are given, the command will prompt you. You can
selectively turn on/off debugging in any or all the daemons (i.e. it is not
necessary to specify all the components of the above command).
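
A few concrete invocations may help. In the sketch below, the function name
{\bf setdebug\_cmd}, the resource name {\bf rufus-fd} and the use of {\bf
bconsole} as the name of the Console program are assumptions. The command
syntax itself is the one given above.

\footnotesize
\begin{verbatim}
#!/bin/sh
# Hypothetical helper: print a setdebug command for one daemon.
# Usage: setdebug_cmd LEVEL dir
#        setdebug_cmd LEVEL client|storage RESOURCE_NAME
setdebug_cmd() {
    level=$1
    kind=$2
    name=$3
    case $kind in
        dir)            echo "setdebug level=$level dir" ;;
        client|storage) echo "setdebug level=$level $kind=$name" ;;
        *)              echo "unknown daemon kind: $kind" >&2
                        return 1 ;;
    esac
}

# Intended use (assumes the Console program reads commands from stdin):
#   setdebug_cmd 100 dir | bconsole               # Director only
#   setdebug_cmd 150 client rufus-fd | bconsole   # one File daemon
#   setdebug_cmd 0 client rufus-fd | bconsole     # level 0, assumed to
#                                                 # switch output off again
\end{verbatim}
\normalsize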