\section*{Migration}
\label{_MigrationChapter}
\index[general]{Migration}
\addcontentsline{toc}{section}{Migration}

The term Migration, as used in the context of Bacula, means moving data from
one Volume to another.  In particular it refers to a Job (similar to a backup
job) that reads data that was previously backed up to a Volume and writes
it to another Volume.  As part of this process, the File catalog records
associated with the first backup job are purged.  In other words, Migration
moves Bacula Job data from one Volume to another.  Although we mention
Volumes to simplify the subject, in reality, Migration reads the data
from one Volume and writes it to a different Volume in a different Pool,
which is equivalent to moving individual Jobs from one Pool to another.

Migrations can be based on quite a number of different criteria such as:
\begin{itemize}
\item a single previous Job
\item a Volume
\item a Client
\item a regular expression matching a Job, Volume, or Client name
\item the time a Job has been on a Volume
\item high and low water marks (usage or occupation) of a Pool
\item Volume size
\end{itemize}

The details of these selection criteria are given below.

To run a Migration job, you must first define a Job resource very similar
to a Backup Job but with {\bf Type = Migrate} instead of {\bf Type =
Backup}.  One of the key points to remember is that the Pool that is
specified for the migration job is the only pool from which jobs will
be migrated, with one exception noted below. Also, Bacula permits pools
to contain Volumes with different Media Types. However, when doing
migration, this is a very undesirable condition. For migration to work
properly, you {\bf must} use pools containing only Volumes of the same
Media Type for all migration jobs.

The migration job is normally either started manually or started
from a Schedule, much like a backup job. It searches
for a previous backup Job or Jobs that match the parameters you have
specified in the migration Job resource, primarily a {\bf Selection Type}
(detailed a bit later).  Then for
each previous backup JobId found, the Migration Job will run a new Job which
copies the old Job data from the previous Volume to a new Volume in
the Migration Pool.  It is possible that no prior Jobs are found for
migration, in which case the Migration job will simply terminate having
done nothing, but normally at a minimum, three jobs are involved during a
migration:

\begin{itemize}
\item The currently running Migration control Job
\item The previous Backup Job (already run)
\item A new Migration Backup Job that moves the data from the
      previous Backup job to the new Volume.
\end{itemize}

If the Migration control job finds a number of JobIds to migrate (e.g.
it is asked to migrate one or more Volumes), it will start one new
migration backup job for each JobId found.

\subsection*{Migration Job Resource Directives}
\addcontentsline{toc}{subsection}{Migration Job Resource Directives}

The following directives can appear in a Director's Job resource, and they
are used to define a Migration job.

\begin{description}
\item [Pool = \lt{}Pool-name\gt{}] The Pool specified in the Migration
   control Job is not a new directive for the Job resource, but it is
   particularly important because it determines what Pool will be examined for
   finding JobIds to migrate.  The exception to this is when {\bf Selection
   Type = SQLQuery}, in which case no Pool is used, unless you
   specifically include it in the SQL query.

\item [Type = Migrate]
   {\bf Migrate} is a new type that defines the job that is run as being a
   Migration Job.  A Migration Job is a sort of control job and does not have
   any Files associated with it, and in that sense it is more or less like
   an Admin job.  Migration jobs simply check to see if there is anything to
   migrate, then possibly start and control new Backup jobs to migrate the data
   from the specified Pool to another Pool.

\item [Selection Type = \lt{}Selection-type-keyword\gt{}]
  The \lt{}Selection-type-keyword\gt{} determines how the migration job
  will go about selecting what JobIds to migrate. In most cases, it is
  used in conjunction with a {\bf Selection Pattern} to give you fine
  control over exactly what JobIds are selected.  The possible values
  for \lt{}Selection-type-keyword\gt{} are:
  \begin{description}
  \item [SmallestVolume] This selection keyword selects the volume with the
        fewest bytes from the Pool to be migrated.  The Pool to be migrated
        is the Pool defined in the Migration Job resource.  The migration
        control job will then start and run one migration backup job for
        each of the Jobs found on this Volume.  The Selection Pattern, if
        specified, is not used.

  \item [OldestVolume] This selection keyword selects the volume with the
        oldest last write time in the Pool to be migrated.  The Pool to be
        migrated is the Pool defined in the Migration Job resource.  The
        migration control job will then start and run one migration backup
        job for each of the Jobs found on this Volume.  The Selection
        Pattern, if specified, is not used.

  \item [Client] The Client selection type first selects all the Clients
        that have been backed up in the Pool specified by the Migration
        Job resource, then it applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Client names, giving
        a filtered Client name list.  All jobs that were backed up for those
        filtered (regexed) Clients will be migrated.
        The migration control job will then start and run one migration
        backup job for each of the JobIds found for those filtered Clients.

  \item [Volume] The Volume selection type first selects all the Volumes
        that have been backed up in the Pool specified by the Migration
        Job resource, then it applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Volume names, giving
        a filtered Volume list.  All JobIds that were backed up for those
        filtered (regexed) Volumes will be migrated.
        The migration control job will then start and run one migration
        backup job for each of the JobIds found on those filtered Volumes.

  \item [Job] The Job selection type first selects all the Jobs (as
        defined on the {\bf Name} directive in a Job resource)
        that have been backed up in the Pool specified by the Migration
        Job resource, then it applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Job names, giving
        a filtered Job name list.  All JobIds that were run for those
        filtered (regexed) Job names will be migrated.  Note that for a given
        Job name, there can be many jobs (JobIds) that ran.
        The migration control job will then start and run one migration
        backup job for each of the Jobs found.

  \item [SQLQuery] The SQLQuery selection type uses the {\bf Selection
        Pattern} as an SQL query to obtain the JobIds to be migrated.
        The Selection Pattern must be a valid SELECT SQL statement for your
        SQL engine, and it must return the JobId as the first field
        of the SELECT.

  \item [PoolOccupancy] This selection type will cause the Migration job
        to compute the total size of the specified pool for all Media Types
        combined. If it exceeds the {\bf Migration High Bytes} defined in
        the Pool, the Migration job will migrate all JobIds beginning with
        the oldest Volume in the pool (determined by Last Write time) until
        the Pool bytes drop below the {\bf Migration Low Bytes} defined in the
        Pool. This calculation should be considered rather approximate because
        it is made once by the Migration job before migration is begun, and
        thus does not take into account additional data written into the Pool
        during the migration.  In addition, the calculation of the total Pool
        byte size is based on the Volume bytes saved in the Volume (Media)
        database entries. The bytes calculated for Migration are based on the
        value stored in the Job records of the Jobs to be migrated. These do
        not include the Storage daemon overhead that is included in the total
        Pool size. As a consequence, normally, the migration will migrate more
        bytes than strictly necessary.

  \item [PoolTime] The PoolTime selection type will cause the Migration job to
        look at the time each JobId has been in the Pool since the job ended.
        All Jobs in the Pool longer than the time specified in the
        {\bf Migration Time} directive in the Pool resource will be migrated.
  \end{description}

\item [Selection Pattern = \lt{}Quoted-string\gt{}]
  The Selection Patterns permitted for each Selection-type-keyword are
  described above.

  For the OldestVolume and SmallestVolume keywords, this
  Selection Pattern is not used (ignored).

  For the Client, Volume, and Job
  keywords, this pattern must be a valid regular expression that will filter
  the appropriate item names found in the Pool.

  For the SQLQuery keyword, this pattern must be a valid SELECT SQL statement
  that returns JobIds.

\end{description}
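
As an illustration of the {\bf SQLQuery} selection type, the following is a
minimal sketch rather than a tested configuration.  The Job name
{\bf migrate-sql} is hypothetical, and the query assumes catalog tables
named {\bf Job} and {\bf Pool} with the columns shown; verify the table and
column names against your own catalog before using it.  Because no Pool is
applied automatically with this selection type, the query itself restricts
the selection to the Default Pool, and it returns the JobId as the first
field:

\footnotesize
\begin{verbatim}
# Hypothetical migration Job that selects JobIds with an SQL query
Job {
  Name = "migrate-sql"          # illustrative name
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Selection Type = SQLQuery
  # Assumed schema: Job(JobId, PoolId, Type, JobStatus), Pool(PoolId, Name).
  # Selects backup Jobs (Type 'B') that terminated normally (JobStatus 'T').
  Selection Pattern = "SELECT Job.JobId FROM Job,Pool WHERE Pool.Name='Default' AND Pool.PoolId=Job.PoolId AND Job.Type='B' AND Job.JobStatus='T'"
}
\end{verbatim}
\normalsize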

\subsection*{Migration Pool Resource Directives}
\addcontentsline{toc}{subsection}{Migration Pool Resource Directives}

The following directives can appear in a Director's Pool resource, and they
are used to control how Jobs in that Pool are migrated.

\begin{description}
\item [Migration Time = \lt{}time-specification\gt{}]
   If a PoolTime migration is done, the time specified here in seconds (time
   modifiers are permitted -- e.g. hours, ...) will be used. If the
   previous Backup Job or Jobs selected have been in the Pool longer than
   the specified Migration Time, then they will be migrated.

\item [Migration High Bytes = \lt{}byte-specification\gt{}]
   This directive specifies the number of bytes in the Pool which will
   trigger a migration if a {\bf PoolOccupancy} migration selection
   type has been specified. The fact that the Pool
   usage goes above this level does not automatically trigger a migration
   job. However, if a migration job runs and has the PoolOccupancy selection
   type set, the Migration High Bytes will be applied.  Bacula does not
   currently restrict a pool to have only a single Media Type, so you
   must keep in mind that if you mix Media Types in a Pool, the results
   may not be what you want, as the Pool count of all bytes will be
   for all Media Types combined.

\item [Migration Low Bytes = \lt{}byte-specification\gt{}]
   This directive specifies the number of bytes in the Pool which will
   stop a migration if a {\bf PoolOccupancy} migration selection
   type has been specified and triggered by more than Migration High
   Bytes being in the pool. In other words, once a migration job
   is started with {\bf PoolOccupancy} migration selection and it
   determines that there are more than Migration High Bytes, the
   migration job will continue to run jobs until the number of
   bytes in the Pool drops to or below Migration Low Bytes.

\item [Next Pool = \lt{}pool-specification\gt{}]
   The Next Pool directive specifies the pool to which Jobs will be
   migrated.

\item [Storage = \lt{}storage-specification\gt{}]
   The Storage directive specifies what Storage resource will be used
   for all Jobs that use this Pool. It takes precedence over any other
   Storage specifications that may have been given such as in the
   Schedule Run directive, or in the Job resource.
\end{description}
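
To make the interaction of these Pool directives concrete, here is a
minimal sketch, not a recommended configuration.  The threshold values,
the 30 day limit, and the Job name {\bf migrate-occupancy} are
illustrative assumptions; the Pool, Client, FileSet, and Storage names
follow the examples given at the end of this chapter.

\footnotesize
\begin{verbatim}
# Source pool: Jobs are migrated from here to the pool named Tape
Pool {
  Name = Default
  Pool Type = Backup
  Storage = File
  Next Pool = Tape                # destination pool for migrated Jobs
  Migration High Bytes = 40G      # assumed level that triggers migration
  Migration Low Bytes = 20G       # assumed level at which migration stops
  Migration Time = 30 days        # used only with Selection Type = PoolTime
}

# Hypothetical control Job that applies the byte thresholds when it runs
Job {
  Name = "migrate-occupancy"      # illustrative name
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Selection Type = PoolOccupancy
}
\end{verbatim}
\normalsize

With these assumed values, a run of this Job that finds more than 40G in
the Default Pool selects JobIds starting with the oldest Volume until
the computed Pool size is at or below 20G.  Changing the
{\bf Selection Type} to {\bf PoolTime} would instead select the Jobs that
have been in the Pool longer than 30 days.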

\subsection*{Important Migration Considerations}
\index[general]{Important Migration Considerations}
\addcontentsline{toc}{subsection}{Important Migration Considerations}
\begin{itemize}
\item Each Pool into which you migrate Jobs or Volumes {\bf must}
      contain Volumes of only one Media Type.

\item Migration takes place on a JobId by JobId basis. That is,
      each JobId is migrated in its entirety and independently
      of other JobIds. Once the Job is migrated, it will be
      on the new medium in the new Pool, but for the most part,
      aside from having a new JobId, it will appear with all the
      same characteristics of the original job (start, end time, ...).
      The column RealEndTime in the Job table will contain the
      time and date that the Migration terminated, and by comparing
      it with the EndTime column you can tell whether or not the
      job was migrated.  The original job is purged of its File
      records, and its Type field is changed from "B" to "M" to
      indicate that the job was migrated.

\item Jobs on Volumes will be migrated only if the Volume is
      marked Full, Used, or Error.  Volumes that are still
      marked Append will not be considered for migration. This
      prevents Bacula from attempting to read the Volume at
      the same time it is writing it.

\item As noted above, for the Migration High Bytes, the calculation
      of the bytes to migrate is somewhat approximate.

\item If you keep Volumes of different Media Types in the same Pool,
      it is not clear how well migration will work.  We recommend only
      one Media Type per pool.

\item It is possible to get into a resource deadlock where Bacula does
      not find enough drives to simultaneously read and write all the
      Volumes needed to do Migrations. For the moment, you must take
      care, as not all of the resource deadlock algorithms are implemented yet.

\item Migration is done only when you run a Migration job. If you set a
      Migration High Bytes and that number of bytes is exceeded in the Pool,
      no migration job will automatically start.  You must schedule the
      migration jobs yourself.

\item If you migrate a number of Volumes, a very large number of Migration
      jobs may start.

\item Figuring out what jobs will actually be migrated can be a bit complicated
      due to the flexibility provided by the regex patterns and the number of
      different options.  Turning on a debug level of 100 or more will provide
      a limited amount of debug information about the migration selection
      process.

\item Bacula does not currently do any Storage conflict resolution, so you
      must take care to ensure that you don't try to read and write to the
      same device or Bacula will block waiting to reserve a drive that it
      will never find. In general, ensure that all your migration
      pools contain only one Media Type, and that you always
      migrate to pools with different Media Types.
\end{itemize}
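
Because migration happens only when a Migration job runs, one way to
arrange for regular runs is to attach a Schedule to the Migration control
Job.  The following is a sketch under assumptions: the resource names
{\bf MigrateWeekly} and {\bf migrate-scheduled} and the run time are
hypothetical, and the Level given on the Run line is only a placeholder.

\footnotesize
\begin{verbatim}
# Hypothetical schedule: start the migration control Job once a week
Schedule {
  Name = "MigrateWeekly"          # illustrative name
  Run = Level=Full sun at 23:10   # assumed time; pick a quiet period
}

Job {
  Name = "migrate-scheduled"      # illustrative name
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Schedule = "MigrateWeekly"
  Pool = Default
  Selection Type = OldestVolume   # only one Volume is selected per run
}
\end{verbatim}
\normalsize

Selecting a single Volume per run, as {\bf OldestVolume} does, is one way
to keep the number of migration backup jobs started at a time small.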


\subsection*{Example Migration Jobs}
\index[general]{Example Migration Jobs}
\addcontentsline{toc}{subsection}{Example Migration Jobs}

When you specify a Migration Job, you must specify all the standard
directives as for a Job.  However, certain directives, such as Level,
Client, and FileSet, though they must be defined, are ignored by the
Migration job because the values from the original job are used instead.

As an example, suppose you have the following Job that
you run every night:

\footnotesize
\begin{verbatim}
# Define the backup Job
Job {
  Name = "NightlySave"
  Type = Backup
  Level = Incremental                 # default
  Client=rufus-fd
  FileSet="Full Set"
  Schedule = "WeeklyCycle"
  Messages = Standard
  Pool = Default
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Next Pool = Tape
  Storage = File
  LabelFormat = "File"
}

# Tape pool definition
Pool {
  Name = Tape
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Storage = DLTDrive
}

# Definition of File storage device
Storage {
  Name = File
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "File"          # same as Device in Storage daemon
  Media Type = File        # same as MediaType in Storage daemon
}

# Definition of DLT tape storage device
Storage {
  Name = DLTDrive
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "HP DLT 80"      # same as Device in Storage daemon
  Media Type = DLT8000      # same as MediaType in Storage daemon
}

\end{verbatim}
\normalsize

Here we have included only the essential information; that is, the
Director, FileSet, Catalog, Client, Schedule, and Messages resources are
omitted.

As you can see, by running the NightlySave Job, the data will be backed up
to File storage using the Default pool to specify the Storage as File.

Now, if we add the following Job resource to this conf file:

\footnotesize
\begin{verbatim}
Job {
  Name = "migrate-volume"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Volume
  Selection Pattern = "File"
}
\end{verbatim}
\normalsize

and then run the job named {\bf migrate-volume}, all volumes in the Pool
named Default (as specified in the migrate-volume Job) that match the
regular expression pattern {\bf File} will be migrated to tape storage
DLTDrive because the {\bf Next Pool} in the Default Pool specifies that
Migrations should go to the pool named {\bf Tape}, which uses
Storage {\bf DLTDrive}.

If instead, we use a Job resource as follows:

\footnotesize
\begin{verbatim}
Job {
  Name = "migrate"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet="Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Job
  Selection Pattern = ".*Save"
}
\end{verbatim}
\normalsize

All jobs whose names end with Save will be migrated from the Default Pool to
the Tape Pool, that is, from File storage to Tape storage.
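
The other regular expression selection types follow the same pattern.
As a further sketch, with the Job name {\bf migrate-client} and the
pattern chosen purely for illustration, the following would select the
Jobs in the Default Pool that were run for Clients whose names begin
with rufus:

\footnotesize
\begin{verbatim}
# Hypothetical variant: select Jobs by Client name instead of Job name
Job {
  Name = "migrate-client"       # illustrative name
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Client
  Selection Pattern = "^rufus"  # regular expression applied to Client names
}
\end{verbatim}
\normalsize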