Add documentation about ActionOnPurge

diff --git a/docs/manuals/en/concepts/newfeatures.tex b/docs/manuals/en/concepts/newfeatures.tex

index b5a620e6addcdb3699041729850d644496802c92..89bdc04c0ba38224c7dcf8344552282c320ecf17 100644
--- a/docs/manuals/en/concepts/newfeatures.tex
+++ b/docs/manuals/en/concepts/newfeatures.tex
@@ -2,16 +2,371 @@
  
  %%
  
-\chapter{New Features in 3.0.2}
+\chapter{New Features in 3.1.4 (Development Version)}
+\label{NewFeaturesChapter}
+
+This chapter presents the new features that are currently under development
+in the 3.1.x versions to be released as Bacula version 3.2.0 sometime in
+late 2009 or early 2010.
+
+\section{Truncate volume after purge}
+\label{sec:actiononpurge}
+
+The Pool directive \textbf{ActionOnPurge=Truncate} instructs Bacula to truncate
+the volume when it is purged. It is useful to prevent disk based volumes from
+consuming too much space. 
+
+\begin{verbatim}
+Pool {
+  Name = Default
+  Action On Purge = Truncate
+  ...
+}
+\end{verbatim}
+
+\section{Maximum concurrent jobs for Devices}
+\label{sec:maximumconcurentjobdevice}
+
+{\bf Maximum Concurrent Jobs} is a new Device directive in the Storage
+Daemon configuration that permits setting the maximum number of Jobs that can
+run concurrently on a specified Device.  Using this directive, it is
+possible to have different Jobs using multiple drives, because when the
+Maximum Concurrent Jobs limit is reached, the Storage Daemon will start new
+Jobs on any other available compatible drive.  This facilitates writing to
+multiple drives with multiple Jobs that all use the same Pool.
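+
+A minimal sketch of such a Device resource in the Storage Daemon
+configuration is shown below. The device name, media type and archive
+device are assumptions chosen for illustration; only the
+\textbf{Maximum Concurrent Jobs} line is specific to this feature.
+
+\begin{verbatim}
+Device {
+  Name = Drive-0                  # hypothetical device name
+  Media Type = LTO-4              # hypothetical media type
+  Archive Device = /dev/nst0      # hypothetical device path
+  Maximum Concurrent Jobs = 1     # further Jobs go to another compatible drive
+  ...
+}
+\end{verbatim}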
+
+\section{Restore from Multiple Storage Daemons}
+\index[general]{Restore}
+
+Previously, you were able to restore from multiple devices in a single Storage
+Daemon. Now, Bacula is able to restore from multiple Storage Daemons. For
+example, if your full backup runs on a Storage Daemon with an autochanger, and
+your incremental jobs use another Storage Daemon with lots of disks, Bacula
+will switch automatically from one Storage Daemon to another within the same
+Restore job.
+
+You must upgrade your File Daemon to version 3.1.3 or greater to use this feature.
+
+This project was funded by Bacula Systems with the help of Equiinet.
+
+\section{File Deduplication using Base Jobs}
+A base job is sort of like a Full save except that you will want the FileSet to
+contain only files that are unlikely to change in the future (i.e.  a snapshot
+of most of your system after installing it).  After the base job has been run,
+when you are doing a Full save, you specify one or more Base jobs to be used.
+All files that have been backed up in the Base job/jobs but not modified will
+then be excluded from the backup.  During a restore, the Base jobs will be
+automatically pulled in where necessary.
+
+This is something none of the competition does, as far as we know (except
+perhaps BackupPC, which is a Perl program that saves to disk only).  It is a big
+win for the user; it makes Bacula stand out as offering a unique optimization
+that immediately saves time and money.  Basically, imagine that you have 100
+nearly identical Windows or Linux machines containing the OS and user files.
+Now for the OS part, a Base job will be backed up once, and rather than making
+100 copies of the OS, there will be only one.  If one or more of the systems
+have some files updated, no problem, they will be automatically restored.
+
+A new Job directive \texttt{Base=Jobx, Joby...} permits specifying the list of
+Base jobs whose files will be used as the base during a Full backup.
+
+\begin{verbatim}
+Job {
+   Name = BackupLinux
+   Level = Base
+   ...
+}
+
+Job {
+   Name = BackupZog4
+   Base = BackupZog4, BackupLinux
+   Accurate = yes
+   ...
+}
+\end{verbatim}
+
+In this example, the job \texttt{BackupZog4} will use the most recent version
+of all files contained in \texttt{BackupZog4} and \texttt{BackupLinux}
+jobs. Base jobs should have run with \texttt{level=Base} to be used.
+
+By default, Bacula will compare permission bits, user and group fields,
+modification time, size and the checksum of the file to choose between the
+current backup and the BaseJob file list. You can change this behavior with the
+\texttt{BaseJob} FileSet option. This option works like the \texttt{verify=}
+option, which is described in the \ilink{FileSet}{FileSetResource} chapter.
+
+\begin{verbatim}
+FileSet {
+  Name = Full
+  Include = {
+    Options {
+       BaseJob  = pmugcs5
+       Accurate = mcs5
+       Verify   = pin5
+    }
+    File = /
+  }
+}
+\end{verbatim}
+
+
+This project was funded by Bacula Systems.
+
+
+\section{Accurate Fileset options}
+\label{sec:accuratefileset}
+
+In previous versions, the accurate code used the file creation and
+modification times to determine if a file was modified or not. Now you can specify
+which attributes to use (time, size, checksum, permission, owner, group,
+\dots), similar to the Verify options.
  
-This chapter presents the new features added to the development 3.0.2
-versions to be released as Bacula version 3.0.2 sometime in 2009.
+\begin{verbatim}
+FileSet {
+  Name = Full
+  Include = {
+    Options {
+       Accurate = mcs5
+       Verify   = pin5
+    }
+    File = /
+  }
+}
+\end{verbatim}
+
+\begin{description}  
+\item {\bf i}
+  compare the inodes  
+  
+\item {\bf p}
+  compare the permission bits  
+  
+\item {\bf n}
+  compare the number of links  
+  
+\item {\bf u}
+  compare the user id  
+  
+\item {\bf g}
+  compare the group id  
+  
+\item {\bf s}
+  compare the size  
+  
+\item {\bf a}
+  compare the access time  
+  
+\item {\bf m}
+  compare the modification time (st\_mtime)  
+  
+\item {\bf c}
+  compare the change time (st\_ctime)  
+  
+\item {\bf d}
+  report file size decreases  
+  
+\item {\bf 5}
+  compare the MD5 signature  
+  
+\item {\bf 1}
+  compare the SHA1 signature  
+\end{description}
+
+\textbf{Important note:} If you decide to use checksum in Accurate jobs,
+the File Daemon will have to read all files even if they normally would not
+be saved.  This increases the I/O load, but also the accuracy of the
+deduplication.  By default, Bacula will check modification/creation time
+and size.
+
+\section{Bvfs API}
+\label{sec:bvfs}
+
+To help developers of restore GUI interfaces, we have added new \textsl{dot
+  commands} that permit browsing the catalog in a very simple way.
+
+\begin{itemize}
+\item \texttt{.bvfs\_update [jobid=x,y,z]} This command is required to update the
+  Bvfs cache in the catalog. You need to run it before any access to the Bvfs
+  layer.
+
+\item \texttt{.bvfs\_lsdirs jobid=x,y,z path=/path | pathid=101} This command
+  will list all directories in the specified \texttt{path} or
+  \texttt{pathid}. Using \texttt{pathid} avoids problems with character
+  encoding of path/filenames.
+
+\item \texttt{.bvfs\_lsfiles jobid=x,y,z path=/path | pathid=101} This command
+  will list all files in the specified \texttt{path} or \texttt{pathid}. Using
+  \texttt{pathid} avoids problems with character encoding.
+\end{itemize}
+
+You can use \texttt{limit=xxx} and \texttt{offset=yyy} to limit the amount of
+data that will be displayed.
+
+\begin{verbatim}
+* .bvfs_update jobid=1,2
+* .bvfs_update
+* .bvfs_lsdirs path=/ jobid=1,2
+\end{verbatim}
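+
+As a sketch of how a GUI might page through a large directory with
+\texttt{limit} and \texttt{offset}, the commands below request the first
+and then the second block of 100 entries. The JobIds, the path and the
+page size are illustrative assumptions, and the sketch assumes that
+\texttt{offset} counts the entries to skip.
+
+\begin{verbatim}
+* .bvfs_update jobid=1,2
+* .bvfs_lsfiles jobid=1,2 path=/ limit=100 offset=0
+* .bvfs_lsfiles jobid=1,2 path=/ limit=100 offset=100
+\end{verbatim}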
+
+\section{Testing your tape drive}
+\label{sec:btapespeed}
+
+To determine the best configuration of your tape drive, you can run the new
+\texttt{speed} command available in the \texttt{btape} program.
+
+This command can have the following arguments:
+\begin{itemize}
+\item[\texttt{file\_size=n}] Specify the Maximum File Size for this test,
+  in GB (between 1 and 5).
+\item[\texttt{nb\_file=n}] Specify the number of files to be written. The total
+  amount of data ($file\_size*nb\_file$) should be greater than your memory.
+\item[\texttt{skip\_zero}] Skip the tests with constant data.
+\item[\texttt{skip\_random}] Skip the tests with random data.
+\item[\texttt{skip\_raw}] Skip the tests with raw access.
+\item[\texttt{skip\_block}] Skip the tests with Bacula block access.
+\end{itemize}
+
+\begin{verbatim}
+*speed file_size=3 skip_raw
+btape.c:1078 Test with zero data and bacula block structure.
+btape.c:956 Begin writing 3 files of 3.221 GB with blocks of 129024 bytes.
+++++++++++++++++++++++++++++++++++++++++++
+btape.c:604 Wrote 1 EOF to "Drive-0" (/dev/nst0)
+btape.c:406 Volume bytes=3.221 GB. Write rate = 44.128 MB/s
+...
+btape.c:383 Total Volume bytes=9.664 GB. Total Write rate = 43.531 MB/s
+
+btape.c:1090 Test with random data, should give the minimum throughput.
+btape.c:956 Begin writing 3 files of 3.221 GB with blocks of 129024 bytes.
++++++++++++++++++++++++++++++++++++++++++++
+btape.c:604 Wrote 1 EOF to "Drive-0" (/dev/nst0)
+btape.c:406 Volume bytes=3.221 GB. Write rate = 7.271 MB/s
++++++++++++++++++++++++++++++++++++++++++++
+...
+btape.c:383 Total Volume bytes=9.664 GB. Total Write rate = 7.365 MB/s
+
+\end{verbatim}
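+
+As a rough check of the sizing rule, the sample run above writes 3 files
+with \texttt{file\_size=3}. Assuming the counter is in units of $2^{30}$
+bytes (which matches the reported 3.221 GB per file), the total amount of
+data is
+\[
+  3 \times 3 \times 2^{30}\ \mbox{bytes} = 9\,663\,676\,416\ \mbox{bytes}
+  \approx 9.664\ \mbox{GB},
+\]
+so this particular run only satisfies the rule on a host with less than
+about 9.6 GB of memory.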
+
+When using compression, the random test will give you the minimum throughput
+of your drive. The test using a constant string will give you the maximum speed
+of your hardware chain (CPU, memory, SCSI card, cable, drive, tape).
+
+You can change the block size in the Storage Daemon configuration file.
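+
+For example, the block size could be raised with the \textbf{Maximum Block
+Size} Device directive, as sketched below. The device name, the device path
+and the value of 262144 bytes are assumptions for illustration only, not
+recommended settings; rerun the \texttt{speed} test after each change to
+compare results.
+
+\begin{verbatim}
+Device {
+  Name = Drive-0                  # hypothetical device name
+  Archive Device = /dev/nst0      # hypothetical device path
+  Maximum Block Size = 262144     # in bytes; illustrative value only
+  ...
+}
+\end{verbatim}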
+
+\section{New {\bf Block Checksum} Device directive}
+You may now turn off the Block Checksum (CRC32) code
+that Bacula uses when writing blocks to a Volume.  This is
+done by adding:
+
+\begin{verbatim}
+Block Checksum = no
+\end{verbatim}
+
+Doing so can reduce the Storage daemon CPU usage slightly.  It
+will also permit Bacula to read a Volume that has corrupted data.
+
+The default is {\bf yes} -- i.e. the checksum is computed on write
+and checked on read. 
+
+We do not recommend turning this off, particularly on older tape
+drives or for disk Volumes, where doing so may allow corrupted data
+to go undetected.
+
+\section{New Bat Features}
+
+\subsection{Media information view}
+
+By double-clicking on a volume (on the Media list, in the Autochanger content
+or in the Job information panel), you can access a detailed overview of your
+Volume. (cf \ref{fig:mediainfo}.)
+\begin{figure}[htbp]
+  \centering
+  \includegraphics[width=13cm]{\idir bat11.eps}  
+  \caption{Media information}
+  \label{fig:mediainfo}
+\end{figure}
+
+\subsection{Job information view}
+
+By double-clicking on a Job record (on the Job run list or in the Media
+information panel), you can access a detailed overview of your Job. (cf
+\ref{fig:jobinfo}.)
+\begin{figure}[htbp]
+  \centering
+  \includegraphics[width=13cm]{\idir bat12.eps}  
+  \caption{Job information}
+  \label{fig:jobinfo}
+\end{figure}
+
+\subsection{Autochanger content view}
+
+By double-clicking on a Storage record (on the Storage list panel), you can
+access a detailed overview of your Autochanger. (cf \ref{fig:achcontent}.)
+\begin{figure}[htbp]
+  \centering
+  \includegraphics[width=13cm]{\idir bat13.eps}  
+  \caption{Autochanger content}
+  \label{fig:achcontent}
+\end{figure}
+
+\section{Console timeout option}
+You can now use the -u option of bconsole to set a timeout for each command.
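+
+A minimal invocation might look like the following. The value 30 is an
+arbitrary example, and the sketch assumes the timeout is expressed in
+seconds.
+
+\begin{verbatim}
+bconsole -u 30
+\end{verbatim}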
+
+\chapter{New Features in Released Version 3.0.2}
+
+This chapter presents the new features added to the
+Released Bacula Version 3.0.2.
+
+\section{Full restore from a given JobId}
+\index[general]{Restore menu}
+
+This feature allows selecting a single JobId and having Bacula
+automatically select all the other jobs that comprise a full backup up to
+and including the selected date (through JobId).
+
+Assume we start with the following jobs:
+\begin{verbatim}
++-------+--------------+---------------------+-------+----------+------------+
+| jobid | client       | starttime           | level | jobfiles | jobbytes   |
++-------+--------------+---------------------+-------+----------+------------+
+| 6     | localhost-fd | 2009-07-15 11:45:49 | I     | 2        | 0          |
+| 5     | localhost-fd | 2009-07-15 11:45:45 | I     | 15       | 44143      |
+| 3     | localhost-fd | 2009-07-15 11:45:38 | I     | 1        | 10         |
+| 1     | localhost-fd | 2009-07-15 11:45:30 | F     | 1527     | 44143073   |
++-------+--------------+---------------------+-------+----------+------------+
+\end{verbatim}
+
+Below is an example of this new feature (which is number 12 in the
+menu).
+
+\begin{verbatim}
+* restore
+To select the JobIds, you have the following choices:
+     1: List last 20 Jobs run
+     2: List Jobs where a given File is saved
+...
+    12: Select full restore to a specified Job date
+    13: Cancel
+
+Select item:  (1-13): 12
+Enter JobId to get the state to restore: 5
+Selecting jobs to build the Full state at 2009-07-15 11:45:45
+You have selected the following JobIds: 1,3,5
+
+Building directory tree for JobId(s) 1,3,5 ...  +++++++++++++++++++
+1,444 files inserted into the tree.
+\end{verbatim}
+
+This project was funded by Bacula Systems.
  
  \section{Source Address}
  \index[general]{Source Address}
  
  A feature has been added which allows the administrator to specify the address
-from which the director and file daemons will attempt connections from.  This
+from which the Director and File daemons will establish connections.  This
  may be used to simplify system configuration overhead when working in complex
  networks utilizing multi-homing and policy-routing.
  
@@ -26,9 +381,10 @@ Director {
  }
  \end{verbatim}
  
-Simply adding specific host routes would have an undesirable side-effect: any
+Simply adding specific host routes on the OS
+would have an undesirable side-effect: any
  application trying to contact the destination host would be forced to use the
-more specific route, possibly diverting management traffic onto a backup VLAN.
+more specific route possibly diverting management traffic onto a backup VLAN.
  Instead of adding host routes for each client connected to a multi-homed backup
  server (for example where there are management and backup VLANs), one can
  use the new directives to specify a specific source address at the application
@@ -43,7 +399,7 @@ This project was funded by Collaborative Fusion, Inc.
  
  \section{Show volume availability when doing restore}
  
-When doing a restore the restore selection dialog ends by displaying this
+When doing a restore, the selection dialog ends by displaying this
  screen:
  
  \begin{verbatim}
@@ -60,11 +416,11 @@ screen:
      001762L3                  LTO-4                     LTO3 
      001767L3                  LTO-4                     LTO3 
  
-Volumes marked with ``*'' are online.
+Volumes marked with ``*'' are online (in the autochanger).
  \end{verbatim}
  
-This should help getting large restores through minimizing the time spent
-waiting for operator to drop by and change tapes in the library.
+This should help speed up large restores by minimizing the time spent
+waiting for the operator to discover that he must change tapes in the library.
  
  This project was funded by Bacula Systems.
  
@@ -73,8 +429,8 @@ This project was funded by Bacula Systems.
  The \texttt{estimate} command can now use the accurate code to detect changes
  and give a better estimation.
  
-You can set the accurate behavior on command line using
-\texttt{accurate=yes/no} or use the Job setting as default value.
+You can set the accurate behavior on the command line by using
+\texttt{accurate=yes\vb{}no} or use the Job setting as default value.
  
  \begin{verbatim}
  * estimate listing accurate=yes level=incremental job=BackupJob
@@ -100,8 +456,8 @@ time, then the file will be backed up.  This does not, however, permit tracking
  what files have been deleted and will miss any file with an old time that may
  have been restored to or moved onto the client filesystem.
  
-\subsection{Accurate = \lt{}yes|no\gt{}}
-If the {\bf Accurate = \lt{}yes|no\gt{}} directive is enabled (default no) in
+\subsection{Accurate = \lt{}yes\vb{}no\gt{}}
+If the {\bf Accurate = \lt{}yes\vb{}no\gt{}} directive is enabled (default no) in
  the Job resource, the job will be run as an Accurate Job. For a {\bf Full}
  backup, there is no difference, but for {\bf Differential} and {\bf
    Incremental} backups, the Director will send a list of all previous files
@@ -165,21 +521,22 @@ Building directory tree for JobId(s) 19,2 ...  +++++++++++++++++++++++++++++++++
  The Copy Job runs without using the File daemon by copying the data from the
  old backup Volume to a different Volume in a different Pool. See the Migration
  documentation for additional details. For copy Jobs there is a new selection
-criterium named PoolUncopiedJobs which copies all jobs from a pool to an other
-pool which were not copied before. Next to that the client, volume, job or sql
-query are possible ways of selecting jobs which should be copied.  Selection
-types like smallestvolume, oldestvolume, pooloccupancy and pooltime are
-probably more suited for migration jobs only. But we could imagine some people
-have a valid use for those kind of copy jobs too.
-
-If bacula founds a copy when a job record is purged (deleted) from the catalog,
-it will promote the copy as \textsl{real} backup and will make it available for
-automatic restore. If more than one copy is available, it will promote the copy
-with the smallest jobid.
-
-A nice solution which can be build with the new copy jobs is what is
-called the disk-to-disk-to-tape backup (DTDTT). A sample config could
-look somethings like the one below:
+directive named {\bf PoolUncopiedJobs} which selects all Jobs that were
+not already copied to another Pool. 
+
+As with Migration, the Client, Volume, Job, or SQL query are
+other possible ways of selecting the Jobs to be copied. Selection
+types like SmallestVolume, OldestVolume, PoolOccupancy and PoolTime also
+work, but are probably more suited for Migration Jobs. 
+
+If Bacula finds a Copy of a job record that is purged (deleted) from the catalog,
+it will promote the Copy to a \textsl{real} backup job and will make it available for
+automatic restore. If more than one Copy is available, it will promote the copy
+with the smallest JobId.
+
+A nice solution which can be built with the new Copy feature is often
+called disk-to-disk-to-tape backup (DTDTT). A sample config could
+look something like the one below:
  
  \begin{verbatim}
  Pool {
@@ -637,7 +994,7 @@ are specified in the Job resource.
  
  They are:
  
-\subsection{Allow Duplicate Jobs = \lt{}yes|no\gt{}}
+\subsection{Allow Duplicate Jobs = \lt{}yes\vb{}no\gt{}}
  \index[general]{Allow Duplicate Jobs}
    If this directive is enabled duplicate jobs will be run.  If
    the directive is set to {\bf no} (default) then only one job of a given name
@@ -650,7 +1007,7 @@ They are:
    will be cancelled.
  
  
-\subsection{Allow Higher Duplicates = \lt{}yes|no\gt{}}
+\subsection{Allow Higher Duplicates = \lt{}yes\vb{}no\gt{}}
  \index[general]{Allow Higher Duplicates}
    If this directive is set to {\bf yes} (default) the job with a higher
    priority (lower priority number) will be permitted to run, and
@@ -658,14 +1015,14 @@ They are:
    priorities of the two jobs are the same, the outcome is determined by
    other directives (see below).
  
-\subsection{Cancel Queued Duplicates = \lt{}yes|no\gt{}}
+\subsection{Cancel Queued Duplicates = \lt{}yes\vb{}no\gt{}}
  \index[general]{Cancel Queued Duplicates}
    If {\bf Allow Duplicate Jobs} is set to {\bf no} and
    if this directive is set to {\bf yes} any job that is
    already queued to run but not yet running will be canceled.
    The default is {\bf no}. 
  
-\subsection{Cancel Running Duplicates = \lt{}yes|no\gt{}}
+\subsection{Cancel Running Duplicates = \lt{}yes\vb{}no\gt{}}
  \index[general]{Cancel Running Duplicates}
    If {\bf Allow Duplicate Jobs} is set to {\bf no} and
    if this directive is set to {\bf yes} any job that is already running
@@ -734,7 +1091,7 @@ greater than the specified interval, and the job would normally be an
  {\bf Incremental}, it will be automatically
  upgraded to a {\bf Differential} backup.
  
-\section{Honor No Dump Flag = \lt{}yes|no\gt{}}
+\section{Honor No Dump Flag = \lt{}yes\vb{}no\gt{}}
  \index[general]{MaxDiffInterval}
  On FreeBSD systems, each file has a {\bf no dump flag} that can be set
  by the user, and when it is set it is an indication to backup programs
@@ -743,7 +1100,7 @@ new Options directive within a FileSet resource, which instructs Bacula to
  obey this flag.  The new directive is:
  
  \begin{verbatim}
    Honor No Dump Flag = yes|no
  \end{verbatim}
  
  The default value is {\bf no}.
@@ -1311,7 +1668,7 @@ in the directory {\bf linux/usb}.
  \section{Miscellaneous}
  \index[general]{Misc New Features}
  
-\subsection{Allow Mixed Priority = \lt{}yes|no\gt{}}
+\subsection{Allow Mixed Priority = \lt{}yes\vb{}no\gt{}}
  \index[general]{Allow Mixed Priority}
     This directive is only implemented in version 2.5 and later.  When
     set to {\bf yes} (default {\bf no}), this job may run even if lower
@@ -1344,7 +1701,7 @@ in the directory {\bf linux/usb}.
    There were no files inserted into the tree, so file selection
    is not possible.Most likely your retention policy pruned the files
    
    Do you want to restore all the files? (yes|no): no
    
    Regexp matching files to restore? (empty to abort): /tmp/regress/(bin|tests)/
    Bootstrap records written to /tmp/regress/working/zog4-dir.restore.1.bsr