\section*{Migration}
\label{_MigrationChapter}
\index[general]{Migration}
\addcontentsline{toc}{section}{Migration}

The term Migration, as used in the context of Bacula, means moving data from
one Volume to another.  In particular it refers to a Job (similar to a backup
job) that reads data that was previously backed up to a Volume and writes
it to another Volume.  As part of this process, the File catalog records
associated with the first backup job are purged.  In other words, Migration
moves Bacula Job data from one Volume to another by reading the Job data
from the Volume it is stored on, writing it to a different Volume in a
different Pool, and then purging the database records for the first Job.

The selection process that determines which Job or Jobs are migrated
can be based on a number of different criteria, such as:
\begin{itemize}
\item a single previous Job
\item a Volume
\item a Client
\item a regular expression matching a Job, Volume, or Client name
\item the time a Job is on a Volume
\item high and low water marks (usage or occupation) of a Pool
\item Volume size
\end{itemize}

The details of these selection criteria are described below.

To run a Migration job, you must first define a Job resource very similar
to a Backup Job but with {\bf Type = Migrate} instead of {\bf Type =
Backup}.  One of the key points to remember is that the Pool that is
specified for the migration job is the only pool from which jobs will
be migrated, with one exception noted below. Also, Bacula permits pools
to contain Volumes with different Media Types. However, when doing
migration, this is a very undesirable condition. For migration to work
properly, you should use pools containing only Volumes of the same
Media Type for all migration jobs.

The migration job is normally either started manually or started
from a Schedule, much like a backup job. It searches
for a previous backup Job or Jobs that match the parameters you have
specified in the migration Job resource, primarily a {\bf Selection Type}
(detailed a bit later).  Then, for
each previous backup JobId found, the Migration Job will run a new Job which
copies the old Job data from the previous Volume to a new Volume in
the Migration Pool.  It is possible that no prior Jobs are found for
migration, in which case the Migration job will simply terminate having
done nothing, but normally at a minimum, three jobs are involved during a
migration:

\begin{itemize}
\item The currently running Migration control Job. This is only
      a control job for starting the migration child jobs.
\item The previous Backup Job (already run). The File records
      for this Job are purged if the Migration job successfully
      terminates.  The original data remains on the Volume until
      it is recycled and rewritten.
\item A new Migration Backup Job that moves the data from the
      previous Backup job to the new Volume.  If you subsequently
      do a restore, the data will be read from this Job.
\end{itemize}

If the Migration control job finds a number of JobIds to migrate (e.g.
it is asked to migrate one or more Volumes), it will start one new
migration backup job for each JobId found on the specified Volumes.

\subsection*{Migration Job Resource Directives}
\addcontentsline{toc}{subsection}{Migration Job Resource Directives}

The following directives can appear in a Director's Job resource, and they
are used to define a Migration job.

\begin{description}
\item [Pool = \lt{}Pool-name\gt{}] The Pool specified in the Migration
   control Job is not a new directive for the Job resource, but it is
   particularly important because it determines what Pool will be examined for
   finding JobIds to migrate.  The exception to this is when {\bf Selection
   Type = SQLQuery}, in which case no Pool is used, unless you
   specifically include it in the SQL query.

\item [Type = Migrate]
   {\bf Migrate} is a new type that defines the job that is run as being a
   Migration Job.  A Migration Job is a sort of control job and does not have
   any Files associated with it, and in that sense it is more or less like
   an Admin job.  Migration jobs simply check to see if there is anything to
   migrate, then possibly start and control new Backup jobs to migrate the data
   from the specified Pool to another Pool.

\item [Selection Type = \lt{}Selection-type-keyword\gt{}]
  The \lt{}Selection-type-keyword\gt{} determines how the migration job
  will go about selecting what JobIds to migrate. In most cases, it is
  used in conjunction with a {\bf Selection Pattern} to give you fine
  control over exactly what JobIds are selected.  The possible values
  for \lt{}Selection-type-keyword\gt{} are:
  \begin{description}
  \item [SmallestVolume] This selection keyword selects the volume with the
        fewest bytes from the Pool to be migrated.  The Pool to be migrated
        is the Pool defined in the Migration Job resource.  The migration
        control job will then start and run one migration backup job for
        each of the Jobs found on this Volume.  The Selection Pattern, if
        specified, is not used.

  \item [OldestVolume] This selection keyword selects the volume with the
        oldest last write time in the Pool to be migrated.  The Pool to be
        migrated is the Pool defined in the Migration Job resource.  The
        migration control job will then start and run one migration backup
        job for each of the Jobs found on this Volume.  The Selection
        Pattern, if specified, is not used.

  \item [Client] The Client selection type first selects all the Clients
        that have been backed up in the Pool specified by the Migration
        Job resource, then it applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Client names, giving
        a filtered Client name list.  All jobs that were backed up for those
        filtered (regexed) Clients will be migrated.
        The migration control job will then start and run one migration
        backup job for each of the JobIds found for those filtered Clients.

  \item [Volume] The Volume selection type first selects all the Volumes
        in the Pool specified by the Migration
        Job resource, then it applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Volume names, giving
        a filtered Volume list.  All JobIds stored on those
        filtered (regexed) Volumes will be migrated.
        The migration control job will then start and run one migration
        backup job for each of the JobIds found on those filtered Volumes.

  \item [Job] The Job selection type first selects all the Jobs (as
        defined on the {\bf Name} directive in a Job resource)
        that have been backed up in the Pool specified by the Migration
        Job resource, then it applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Job names, giving
        a filtered Job name list.  All JobIds that were run for those
        filtered (regexed) Job names will be migrated.  Note that for a given
        Job name, there can be many jobs (JobIds) that ran.
        The migration control job will then start and run one migration
        backup job for each of the Jobs found.

  \item [SQLQuery] The SQLQuery selection type uses the {\bf Selection
        Pattern} as an SQL query to obtain the JobIds to be migrated.
        The Selection Pattern must be a valid SELECT SQL statement for your
        SQL engine, and it must return the JobId as the first field
        of the SELECT.

  \item [PoolOccupancy] This selection type will cause the Migration job
        to compute the total size of the specified pool for all Media Types
        combined. If it exceeds the {\bf Migration High Bytes} defined in
        the Pool, the Migration job will migrate all JobIds beginning with
        the oldest Volume in the pool (determined by Last Write time) until
        the Pool bytes drop below the {\bf Migration Low Bytes} defined in the
        Pool. This calculation should be considered rather approximate because
        it is made once by the Migration job before migration is begun, and
        thus does not take into account additional data written into the Pool
        during the migration.  In addition, the calculation of the total Pool
        byte size is based on the Volume bytes saved in the Volume (Media)
        database entries. The bytes calculated for Migration are based on the
        value stored in the Job records of the Jobs to be migrated. These do
        not include the Storage daemon overhead that is included in the total
        Pool size. As a consequence, the migration will normally migrate more
        bytes than strictly necessary.

  \item [PoolTime] The PoolTime selection type will cause the Migration job to
        look at the time each JobId has been in the Pool since the job ended.
        All Jobs in the Pool longer than the time specified in the
        {\bf Migration Time} directive in the Pool resource will be migrated.
  \end{description}

\item [Selection Pattern = \lt{}Quoted-string\gt{}]
  The Selection Patterns permitted for each Selection-type-keyword are
  described above.

  For the OldestVolume and SmallestVolume keywords, this
  Selection Pattern is not used (ignored).

  For the Client, Volume, and Job
  keywords, this pattern must be a valid regular expression that will filter
  the appropriate item names found in the Pool.

  For the SQLQuery keyword, this pattern must be a valid SELECT SQL statement
  that returns JobIds.

\end{description}
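
As an illustration of the {\bf SQLQuery} selection type, the following is a
minimal sketch rather than a tested configuration.  The job name
{\bf migrate-sql} is hypothetical, and the query assumes that the catalog Job
table has the columns {\bf JobId}, {\bf Name}, and {\bf Type}; check it against
your own catalog and SQL engine before use.  The resource names (rufus-fd,
Full Set, Default, DLTDrive, NightlySave) are borrowed from the example
configuration at the end of this chapter.

\footnotesize
\begin{verbatim}
# Hypothetical Migration Job that selects JobIds with an SQL query.
# No Pool filtering is applied unless the query itself does it.
Job {
  Name = "migrate-sql"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Selection Type = SQLQuery
  # JobId must be the first field returned by the SELECT
  Selection Pattern = "SELECT JobId FROM Job WHERE Type='B' AND Name='NightlySave'"
}
\end{verbatim}
\normalsize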

\subsection*{Migration Pool Resource Directives}
\addcontentsline{toc}{subsection}{Migration Pool Resource Directives}

The following directives can appear in a Director's Pool resource, and they
are used to configure migration.

\begin{description}
\item [Migration Time = \lt{}time-specification\gt{}]
   If a PoolTime migration is done, the time specified here in seconds (time
   modifiers are permitted -- e.g. hours, ...) will be used. If the
   previous Backup Job or Jobs selected have been in the Pool longer than
   the specified Migration Time, then they will be migrated.

\item [Migration High Bytes = \lt{}byte-specification\gt{}]
   This directive specifies the number of bytes in the Pool which will
   trigger a migration if a {\bf PoolOccupancy} migration selection
   type has been specified. The fact that the Pool
   usage goes above this level does not automatically trigger a migration
   job. However, if a migration job runs and has the PoolOccupancy selection
   type set, the Migration High Bytes will be applied.  Bacula does not
   currently restrict a pool to have only a single Media Type, so you
   must keep in mind that if you mix Media Types in a Pool, the results
   may not be what you want, as the Pool count of all bytes will be
   for all Media Types combined.

\item [Migration Low Bytes = \lt{}byte-specification\gt{}]
   This directive specifies the number of bytes in the Pool which will
   stop a migration if a {\bf PoolOccupancy} migration selection
   type has been specified and triggered by more than Migration High
   Bytes being in the pool. In other words, once a migration job
   is started with {\bf PoolOccupancy} migration selection and it
   determines that there are more than Migration High Bytes, the
   migration job will continue to run jobs until the number of
   bytes in the Pool drops to or below Migration Low Bytes.

\item [Next Pool = \lt{}pool-specification\gt{}]
   The Next Pool directive specifies the pool to which Jobs will be
   migrated.

\item [Storage = \lt{}storage-specification\gt{}]
   The Storage directive specifies what Storage resource will be used
   for all Jobs that use this Pool. It takes precedence over any other
   Storage specifications that may have been given such as in the
   Schedule Run directive, or in the Job resource.
\end{description}
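
To show how these directives fit together, here is a hedged sketch of a Pool
resource that sets all of them.  The values are purely illustrative, and the
{\bf G} size modifier is an assumption about the byte-specification syntax;
the pool and storage names (Default, Tape, File) are taken from the example
configuration at the end of this chapter.

\footnotesize
\begin{verbatim}
# Illustrative Pool using the migration directives (values are examples only)
Pool {
  Name = Default
  Pool Type = Backup
  Storage = File                # used for all Jobs that use this Pool
  Next Pool = Tape              # pool to which Jobs will be migrated
  Migration Time = 72 hours     # applies when Selection Type = PoolTime
  Migration High Bytes = 4G     # PoolOccupancy: migrate when Pool exceeds this
  Migration Low Bytes = 2G      # PoolOccupancy: stop at or below this
}
\end{verbatim}
\normalsize

None of these directives starts a migration on its own; they only take effect
when a Migration job with the matching Selection Type runs against this Pool.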

\subsection*{Important Migration Considerations}
\index[general]{Important Migration Considerations}
\addcontentsline{toc}{subsection}{Important Migration Considerations}
\begin{itemize}
\item Each Pool into which you migrate Jobs or Volumes {\bf must}
      contain Volumes of only one Media Type.

\item Migration takes place on a JobId by JobId basis. That is,
      each JobId is migrated in its entirety and independently
      of other JobIds. Once the Job is migrated, it will be
      on the new medium in the new Pool, but for the most part,
      aside from having a new JobId, it will appear with all the
      same characteristics of the original job (start, end time, ...).
      The column RealEndTime in the catalog Job table will contain the
      time and date that the Migration terminated, and by comparing
      it with the EndTime column you can tell whether or not the
      job was migrated.  The original job is purged of its File
      records, and its Type field is changed from "B" to "M" to
      indicate that the job was migrated.

\item Jobs on Volumes will be migrated only if the Volume is
      marked Full, Used, or Error.  Volumes that are still
      marked Append will not be considered for migration. This
      prevents Bacula from attempting to read the Volume at
      the same time it is writing it.

\item As noted above, for the Migration High Bytes, the calculation
      of the bytes to migrate is somewhat approximate.

\item If you keep Volumes of different Media Types in the same Pool,
      it is not clear how well migration will work.  We recommend only
      one Media Type per pool.

\item It is possible to get into a resource deadlock where Bacula does
      not find enough drives to simultaneously read and write all the
      Volumes needed to do Migrations. For the moment, you must take
      care, as not all the resource deadlock algorithms are implemented yet.

\item Migration is done only when you run a Migration job. If you set a
      Migration High Bytes and that number of bytes is exceeded in the Pool,
      no migration job will automatically start.  You must schedule the
      migration jobs yourself.

\item If you migrate a number of Volumes, a very large number of Migration
      jobs may start.

\item Figuring out what jobs will actually be migrated can be a bit complicated
      due to the flexibility provided by the regex patterns and the number of
      different options.  Turning on a debug level of 100 or more will provide
      a limited amount of debug information about the migration selection
      process.

\item Bacula currently does only minimal Storage conflict resolution, so you
      must take care to ensure that you don't try to read and write to the
      same device or Bacula may block waiting to reserve a drive that it
      will never find. In general, ensure that all your migration
      pools contain only one Media Type, and that you always
      migrate to pools with different Media Types.
\end{itemize}
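
Because migration happens only when a Migration job runs, a PoolOccupancy
check is typically attached to a Schedule.  The following is a sketch under
stated assumptions, not a tested configuration: the names
{\bf NightlyMigration} and {\bf migrate-occupancy} are hypothetical, the
{\bf Run} line follows the usual Schedule syntax and its time is arbitrary,
and the Pool named Default is assumed to define {\bf Migration High Bytes},
{\bf Migration Low Bytes}, and {\bf Next Pool}.

\footnotesize
\begin{verbatim}
# Hypothetical schedule: run the occupancy check every night
Schedule {
  Name = "NightlyMigration"
  Run = Full sun-sat at 3:05
}

# Hypothetical Migration Job; it migrates only if the Pool exceeds
# Migration High Bytes at the time the job runs
Job {
  Name = "migrate-occupancy"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Schedule = "NightlyMigration"
  Selection Type = PoolOccupancy
}
\end{verbatim}
\normalsize

If the Pool is below the high water mark when the job runs, the job simply
terminates having done nothing.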


\subsection*{Example Migration Jobs}
\index[general]{Example Migration Jobs}
\addcontentsline{toc}{subsection}{Example Migration Jobs}

When you specify a Migration Job, you must specify all the standard
directives as for a Job.  However, certain directives, such as Level, Client,
and FileSet, though they must be defined, are ignored by the Migration job
because the values from the original job are used instead.

As an example, suppose you have the following Job that
you run every night:

\footnotesize
\begin{verbatim}
# Define the backup Job
Job {
  Name = "NightlySave"
  Type = Backup
  Level = Incremental                 # default
  Client=rufus-fd
  FileSet="Full Set"
  Schedule = "WeeklyCycle"
  Messages = Standard
  Pool = Default
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Next Pool = Tape
  Storage = File
  LabelFormat = "File"
}

# Tape pool definition
Pool {
  Name = Tape
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Storage = DLTDrive
}

# Definition of File storage device
Storage {
  Name = File
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "File"          # same as Device in Storage daemon
  Media Type = File        # same as MediaType in Storage daemon
}

# Definition of DLT tape storage device
Storage {
  Name = DLTDrive
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "HP DLT 80"      # same as Device in Storage daemon
  Media Type = DLT8000      # same as MediaType in Storage daemon
}

\end{verbatim}
\normalsize

Here we have included only the essential information, i.e. the
Director, FileSet, Catalog, Client, Schedule, and Messages resources are
omitted.

As you can see, by running the NightlySave Job, the data will be backed up
to File storage using the Default pool to specify the Storage as File.

Now, if we add the following Job resource to this conf file:

\footnotesize
\begin{verbatim}
Job {
  Name = "migrate-volume"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Volume
  Selection Pattern = "File"
}
\end{verbatim}
\normalsize

and then run the job named {\bf migrate-volume}, all volumes in the Pool
named Default (as specified in the migrate-volume Job) that match the
regular expression pattern {\bf File} will be migrated to tape storage
DLTDrive because the {\bf Next Pool} in the Default Pool specifies that
Migrations should go to the pool named {\bf Tape}, which uses
Storage {\bf DLTDrive}.

If instead, we use a Job resource as follows:

\footnotesize
\begin{verbatim}
Job {
  Name = "migrate"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet="Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Job
  Selection Pattern = ".*Save"
}
\end{verbatim}
\normalsize

All jobs whose names end with Save will be migrated from the Default Pool to
the Tape Pool, that is, from File storage to Tape storage.
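
The same approach applies to the other regular-expression selection types.
As a hedged sketch (the job name {\bf migrate-client} and the pattern are
hypothetical), a Job that migrates everything in the Default Pool that was
backed up for Clients whose names begin with rufus might look like this:

\footnotesize
\begin{verbatim}
# Hypothetical Migration Job selecting by Client name
Job {
  Name = "migrate-client"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Client
  Selection Pattern = "rufus.*"
}
\end{verbatim}
\normalsize

Here the Client directive itself is still ignored, as described above; only
the Selection Pattern decides which Clients' jobs are migrated.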