\section*{Migration}
\label{_MigrationChapter}
\index[general]{Migration}
\addcontentsline{toc}{section}{Migration}

The term Migration, as used in the context of Bacula, means moving data from
one Volume to another.  In particular it refers to a Job (similar to a backup
job) that reads data that was previously backed up to a Volume and writes
it to another Volume.  As part of this process, the File catalog records
associated with the first backup job are purged.  In other words, Migration
moves Bacula Job data from one Volume to another.  Although we mention
Volumes to simplify the subject, in reality, Migration reads the data
from one Volume and writes it to a different Volume in a different Pool,
which is equivalent to moving individual Jobs from one Pool to another.

Migrations can be based on quite a number of different criteria such as:
\begin{itemize}
\item a single previous Job
\item a Volume
\item a Client
\item a regular expression matching a Job, Volume, or Client name
\item the length of time a Job has been on a Volume
\item high and low water marks (usage or occupation) of a Pool
\item Volume size
\end{itemize}

The details of these selection criteria are described below.

To run a Migration job, you must first define a Job resource very similar
to a Backup Job but with {\bf Type = Migrate} instead of {\bf Type =
Backup}.  One of the key points to remember is that the Pool that is
specified for the migration job is the only pool from which jobs will
be migrated, with one exception noted below.

A migration job is normally either started manually or started
from a Schedule, much like a backup job. It searches
for a previous backup Job or Jobs that match the parameters you have
specified in the migration Job resource, primarily a {\bf Selection Type}
(detailed a bit later).  Then for
each previous backup JobId found, the Migration Job will run a new Job which
copies the old Job data from the previous Volume to a new Volume in
the Migration Pool.  It is possible that no prior Jobs are found for
migration, in which case the Migration job will simply terminate having
done nothing, but normally, at a minimum, three jobs are involved during a
migration:

\begin{itemize}
\item The currently running Migration control Job
\item The previous Backup Job (already run)
\item A new Migration Backup Job that moves the data from the
      previous Backup job to the new Volume.
\end{itemize}

If the Migration control job finds a number of JobIds to migrate (e.g.
it is asked to migrate one or more Volumes), it will start one new
migration backup job for each JobId found.

\subsection*{Migration Job Resource Directives}
\addcontentsline{toc}{subsection}{Migration Job Resource Directives}

The following directives can appear in a Director's Job resource, and they
are used to define a Migration job.

\begin{description}
\item [Pool = \lt{}Pool-name\gt{}] The Pool specified in the Migration
   control Job is not a new directive for the Job resource, but it is
   particularly important because it determines what Pool will be examined for
   finding JobIds to migrate.  The exception to this is when {\bf Selection
   Type = SQLQuery}, in which case no Pool is used, unless you
   specifically include it in the SQL query.

\item [Type = Migrate]
   {\bf Migrate} is a new type that defines the job that is run as being a
   Migration Job.  A Migration Job is a sort of control job and does not have
   any Files associated with it, and in that sense it is more or less like
   an Admin job.  Migration jobs simply check to see if there is anything to
   migrate, then possibly start and control new Backup jobs to migrate the data
   from the specified Pool to another Pool.

\item [Selection Type = \lt{}Selection-type-keyword\gt{}]
  The \lt{}Selection-type-keyword\gt{} determines how the migration job
  will go about selecting what JobIds to migrate. In most cases, it is
  used in conjunction with a {\bf Selection Pattern} to give you fine
  control over exactly what JobIds are selected.  The possible values
  for \lt{}Selection-type-keyword\gt{} are:
  \begin{description}
  \item [SmallestVolume] This selection keyword selects the volume with the
        fewest bytes from the Pool to be migrated.  The Pool to be migrated
        is the Pool defined in the Migration Job resource.  The migration
        control job will then start and run one migration backup job for
        each of the Jobs found on this Volume.  The Selection Pattern, if
        specified, is not used.

  \item [OldestVolume] This selection keyword selects the volume with the
        oldest last write time in the Pool to be migrated.  The Pool to be
        migrated is the Pool defined in the Migration Job resource.  The
        migration control job will then start and run one migration backup
        job for each of the Jobs found on this Volume.  The Selection
        Pattern, if specified, is not used.

  \item [Client] The Client selection type first selects all the Clients
        that have been backed up in the Pool specified by the Migration
        Job resource, then applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Client names, giving
        a filtered Client name list.  All jobs that were backed up for those
        filtered (regexed) Clients will be migrated.
        The migration control job will then start and run one migration
        backup job for each of the JobIds found for those filtered Clients.

  \item [Volume] The Volume selection type first selects all the Volumes
        in the Pool specified by the Migration
        Job resource, then applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Volume names, giving
        a filtered Volume list.  All JobIds that were backed up to those
        filtered (regexed) Volumes will be migrated.
        The migration control job will then start and run one migration
        backup job for each of the JobIds found on those filtered Volumes.

  \item [Job] The Job selection type first selects all the Jobs (as
        defined on the {\bf Name} directive in a Job resource)
        that have been backed up in the Pool specified by the Migration
        Job resource, then applies the {\bf Selection Pattern} (defined
        below) as a regular expression to the list of Job names, giving
        a filtered Job name list.  All JobIds that were run for those
        filtered (regexed) Job names will be migrated.  Note that for a given
        Job name, there can be many jobs (JobIds) that ran.
        The migration control job will then start and run one migration
        backup job for each of the Jobs found.

  \item [SQLQuery] The SQLQuery selection type uses the {\bf Selection
        Pattern} as an SQL query to obtain the JobIds to be migrated.
        The Selection Pattern must be a valid SELECT SQL statement for your
        SQL engine, and it must return the JobId as the first field
        of the SELECT.

  \item [PoolOccupancy] This selection type will cause the Migration job
        to compute the total size of the specified pool for all Media Types
        combined. If it exceeds the {\bf Migration High Bytes} defined in
        the Pool, the Migration job will migrate all JobIds beginning with
        the oldest Volume in the pool (determined by Last Write time) until
        the Pool bytes drop below the {\bf Migration Low Bytes} defined in the
        Pool. This calculation should be considered rather approximate because
        it is made once by the Migration job before migration is begun, and
        thus does not take into account additional data written into the Pool
        during the migration.  In addition, the calculation of the total Pool
        byte size is based on the Volume bytes saved in the Volume (Media)
        database entries. The bytes calculated for Migration are based on the
        values stored in the Job records of the Jobs to be migrated. These do
        not include the Storage daemon overhead that is included in the total
        Pool size. As a consequence, the migration will normally migrate more
        bytes than strictly necessary.

  \item [PoolTime] The PoolTime selection type will cause the Migration job to
        look at the time each JobId has been in the Pool since the job ended.
        All Jobs in the Pool longer than the time specified on the
        {\bf Migration Time} directive in the Pool resource will be migrated.
  \end{description}

\item [Selection Pattern = \lt{}Quoted-string\gt{}]
  The Selection Patterns permitted for each Selection-type-keyword are
  described above.

  For the OldestVolume and SmallestVolume keywords, this
  Selection Pattern is not used (ignored).

  For the Client, Volume, and Job
  keywords, this pattern must be a valid regular expression that will filter
  the appropriate item names found in the Pool.

  For the SQLQuery keyword, this pattern must be a valid SELECT SQL statement
  that returns JobIds.

\end{description}
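
As a hedged illustration of the {\bf SQLQuery} selection type, the following
sketch selects the successfully terminated backups of a backup Job assumed to
be named NightlySave.  The Job name {\bf migrate-sql}, the Client and FileSet
names, and the table and column names used in the query (Job, JobId, Name,
Type, JobStatus) are assumptions about your configuration and catalog schema,
so check the query against your own database before relying on it:

\footnotesize
\begin{verbatim}
# Hypothetical example: select JobIds to migrate with an SQL query
Job {
  Name = "migrate-sql"
  Type = Migrate
  Level = Full                # must be defined, but ignored by Migration
  Client = rufus-fd           # must be defined, but ignored by Migration
  FileSet = "Full Set"        # must be defined, but ignored by Migration
  Messages = Standard
  Pool = Default
  Selection Type = SQLQuery
  # JobId must be the first field returned by the SELECT
  Selection Pattern = "SELECT JobId FROM Job WHERE Name='NightlySave' AND Type='B' AND JobStatus='T'"
}
\end{verbatim}
\normalsize

Because no Pool is used with this selection type, a query like this one
selects matching Jobs regardless of Pool unless the query itself restricts
the Pool.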

\subsection*{Migration Pool Resource Directives}
\addcontentsline{toc}{subsection}{Migration Pool Resource Directives}

The following directives can appear in a Director's Pool resource, and they
control how Jobs are migrated from that Pool.

\begin{description}
\item [Migration Time = \lt{}time-specification\gt{}]
   If a PoolTime migration is done, the time specified here in seconds (time
   modifiers such as hours are permitted) will be used. If the
   previous Backup Job or Jobs selected have been in the Pool longer than
   the specified time, then they will be migrated.

\item [Migration High Bytes =  \lt{}byte-specification\gt{}]
   This directive specifies the number of bytes in the Pool which will
   trigger a migration if a {\bf PoolOccupancy} migration selection
   type has been specified. The fact that the Pool
   usage goes above this level does not automatically trigger a migration
   job. However, if a migration job runs and has the PoolOccupancy selection
   type set, the Migration High Bytes will be applied.  Bacula does not
   currently restrict a pool to have only a single Media Type, so you
   must keep in mind that if you mix Media Types in a Pool, the results
   may not be what you want, as the Pool count of all bytes will be
   for all Media Types combined.

\item [Migration Low Bytes = \lt{}byte-specification\gt{}]
   This directive specifies the number of bytes in the Pool which will
   stop a migration if a {\bf PoolOccupancy} migration selection
   type has been specified and triggered by more than Migration High
   Bytes being in the pool. In other words, once a migration job
   is started with {\bf PoolOccupancy} migration selection and it
   determines that there are more than Migration High Bytes, the
   migration job will continue to run jobs until the number of
   bytes in the Pool drops to or below Migration Low Bytes.

\item [Next Pool = \lt{}pool-specification\gt{}]
   The Next Pool directive specifies the pool to which Jobs will be
   migrated.

\item [Storage = \lt{}storage-specification\gt{}]
   The Storage directive specifies what Storage resource will be used
   for all Jobs that use this Pool. It takes precedence over any other
   Storage specifications that may have been given such as in the
   Schedule Run directive, or in the Job resource.
\end{description}
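
A minimal sketch of how these Pool directives might be combined follows.
The sizes and the time are illustrative values rather than recommendations,
the resource names are assumptions, and the Job name
{\bf migrate-occupancy} is hypothetical:

\footnotesize
\begin{verbatim}
# Source pool with migration thresholds (illustrative values only)
Pool {
  Name = Default
  Pool Type = Backup
  Storage = File
  Next Pool = Tape               # Jobs are migrated to this pool
  Migration Time = 30 days       # used by Selection Type = PoolTime
  Migration High Bytes = 50G     # PoolOccupancy migrates above this level
  Migration Low Bytes = 40G      # ... and stops at or below this level
}

# Control job that applies the high and low water marks when it runs
Job {
  Name = "migrate-occupancy"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Selection Type = PoolOccupancy
}
\end{verbatim}
\normalsize

With a configuration along these lines, nothing happens when the pool merely
grows past the high mark.  The thresholds are only evaluated when
{\bf migrate-occupancy} is actually run.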

\subsection*{Important Migration Considerations}
\index[general]{Important Migration Considerations}
\addcontentsline{toc}{subsection}{Important Migration Considerations}
\begin{itemize}
\item Migration takes place on a JobId by JobId basis. That is,
      each JobId is migrated in its entirety and independently
      of other JobIds. Once the Job is migrated, it will be
      on the new medium in the new Pool, but for the most part,
      aside from having a new JobId, it will appear with all the
      same characteristics of the original job (start, end time, ...).
      The column RealEndTime in the Job table will contain the
      time and date that the Migration terminated, and by comparing
      it with the EndTime column you can tell whether or not the
      job was migrated.  The original job is purged of its File
      records, and its Type field is changed from "B" to "M" to
      indicate that the job was migrated.

\item Jobs on Volumes will be migrated only if the Volume is
      marked Full, Used, or Error.  Volumes that are still
      marked Append will not be considered for migration. This
      prevents Bacula from attempting to read the Volume at
      the same time it is writing it.

\item As noted above, for the Migration High Bytes, the calculation
      of the bytes to migrate is somewhat approximate.

\item If you keep Volumes of different Media Types in the same Pool,
      it is not clear how well migration will work.  We recommend only
      one Media Type per pool.

\item It is possible to get into a resource deadlock where Bacula does
      not find enough drives to simultaneously read and write all the
      Volumes needed to do Migrations. For the moment, you must take
      care, as not all the resource deadlock algorithms are implemented yet.

\item Migration is done only when you run a Migration job. If you set a
      Migration High Bytes and that number of bytes is exceeded in the Pool,
      no migration job will automatically start.  You must schedule the
      migration jobs yourself.

\item If you migrate a number of Volumes, a very large number of Migration
      jobs may start, because one migration backup job is started for each
      JobId found.

\item Figuring out what jobs will actually be migrated can be a bit complicated
      due to the flexibility provided by the regex patterns and the number of
      different options.  Turning on a debug level of 100 or more will provide
      a limited amount of debug information about the migration selection
      process.
\end{itemize}
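
Since migration happens only when a Migration job runs, one way to automate
it is to attach a Schedule to the Migration control job.  The following is a
sketch only: the names {\bf MigrateDaily} and {\bf migrate-scheduled}, the
Client and FileSet names, and the run time are assumptions chosen for
illustration, and the PoolTime selection presumes that a
{\bf Migration Time} has been set in the Default Pool:

\footnotesize
\begin{verbatim}
# Hypothetical schedule: run the migration check once a day
Schedule {
  Name = "MigrateDaily"
  Run = daily at 6:00
}

Job {
  Name = "migrate-scheduled"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Schedule = "MigrateDaily"
  Selection Type = PoolTime      # uses Migration Time from the Pool
}
\end{verbatim}
\normalsize

Choosing a time when no backups are writing to the source Pool may reduce
the chance of competing for drives.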


\subsection*{Example Migration Jobs}
\index[general]{Example Migration Jobs}
\addcontentsline{toc}{subsection}{Example Migration Jobs}

When you specify a Migration Job, you must specify all the standard
directives as for a Job.  However, certain directives such as the Level,
Client, and FileSet, though they must be defined, are ignored by the Migration
job because the values from the original job are used instead.

As an example, suppose you have the following Job that
you run every night:

\footnotesize
\begin{verbatim}
# Define the backup Job
Job {
  Name = "NightlySave"
  Type = Backup
  Level = Incremental                 # default
  Client=rufus-fd
  FileSet="Full Set"
  Schedule = "WeeklyCycle"
  Messages = Standard
  Pool = Default
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Next Pool = Tape
  Storage = File
  LabelFormat = "File"
}

# Tape pool definition
Pool {
  Name = Tape
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Storage = DLTDrive
}

# Definition of File storage device
Storage {
  Name = File
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "File"          # same as Device in Storage daemon
  Media Type = File        # same as MediaType in Storage daemon
}

# Definition of DLT tape storage device
Storage {
  Name = DLTDrive
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "HP DLT 80"      # same as Device in Storage daemon
  Media Type = DLT8000      # same as MediaType in Storage daemon
}

\end{verbatim}
\normalsize

Here we have included only the essential information; the
Director, FileSet, Catalog, Client, Schedule, and Messages resources are
omitted.

As you can see, by running the NightlySave Job, the data will be backed up
to File storage using the Default pool to specify the Storage as File.

Now, if we add the following Job resource to this conf file

\footnotesize
\begin{verbatim}
Job {
  Name = "migrate-volume"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Volume
  Selection Pattern = "File"
}
\end{verbatim}
\normalsize

and then run the job named {\bf migrate-volume}, all volumes in the Pool
named Default (as specified in the migrate-volume Job) that match the
regular expression pattern {\bf File} will be migrated to tape storage
DLTDrive because the {\bf Next Pool} in the Default Pool specifies that
Migrations should go to the pool named {\bf Tape}, which uses
Storage {\bf DLTDrive}.

If instead, we use a Job resource as follows:

\footnotesize
\begin{verbatim}
Job {
  Name = "migrate"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet="Full Set"
  Messages = Standard
  Storage = DLTDrive
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Job
  Selection Pattern = ".*Save"
}
\end{verbatim}
\normalsize

All jobs with names ending in Save will be migrated from the Default Pool to
the Tape Pool, that is, from File storage to Tape storage.
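
The same pattern applies to the other selection types.  As a sketch, the
following hypothetical Job (the name {\bf migrate-client} and the pattern are
assumptions made for illustration) would select the Jobs in the Default Pool
that were backed up for Clients whose names match {\bf rufus}:

\footnotesize
\begin{verbatim}
# Hypothetical example: migrate by Client name
Job {
  Name = "migrate-client"
  Type = Migrate
  Level = Full
  Client = rufus-fd           # must be defined, but ignored by Migration
  FileSet = "Full Set"        # must be defined, but ignored by Migration
  Messages = Standard
  Pool = Default
  Selection Type = Client
  Selection Pattern = "rufus.*"   # regular expression on Client names
}
\end{verbatim}
\normalsize

Here the Client directive and the Selection Pattern play different roles:
the first is required by the Job resource but ignored, while the second is
what actually filters the Client names found in the Pool.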