\chapter{Migration and Copy}
\label{MigrationChapter}
\index[general]{Migration}

The term Migration, as used in the context of Bacula, means moving data from
one Volume to another. In particular it refers to a Job (similar to a backup
job) that reads data that was previously backed up to a Volume and writes
it to another Volume. As part of this process, the File catalog records
associated with the first backup job are purged. In other words, Migration
moves Bacula Job data from one Volume to another by reading the Job data
from the Volume it is stored on, writing it to a different Volume in a
different Pool, and then purging the database records for the first Job.

The Copy process is essentially identical to the Migration feature with the
exception that the Job that is copied is left unchanged. This essentially
creates two identical copies of the same backup. However, the copy is treated
as a copy rather than a backup job, and hence is not directly available for
restore. If Bacula finds a copy when a job record is purged (deleted) from the
catalog, it will promote the copy to a \textsl{real} backup and make it
available for automatic restore.

The Copy and the Migration jobs run without using the File daemon by copying
the data from the old backup Volume to a different Volume in a different Pool.
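
As a rough illustration of the difference between the two job types, a Copy
control job might be declared along the lines of the following sketch. The
resource names used here ({\bf copy-to-tape}, {\bf rufus-fd}, {\bf Full Set},
{\bf Standard}, and {\bf Default}) are placeholders chosen for this
illustration and are not taken from a real configuration; only the directive
names come from this chapter.

\begin{verbatim}
# Hypothetical Copy control job; every resource name is a placeholder
Job {
  Name = "copy-to-tape"
  Type = Copy                        # Type = Migrate would move the data instead
  Level = Full                       # must be defined
  Client = rufus-fd                  # must be defined
  FileSet = "Full Set"               # must be defined
  Messages = Standard
  Pool = Default                     # source Pool; its Next Pool is the target
  Selection Type = PoolUncopiedJobs  # select Jobs that were not copied before
}
\end{verbatim}

Changing only the {\bf Type} directive to {\bf Migrate} (and choosing a
selection type that is valid for migration) would turn the same skeleton
into a Migration job.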

The selection process for which Job or Jobs are migrated
can be based on quite a number of different criteria such as:

\begin{itemize}
\item a single previous Job
\item a regular expression matching a Job, Volume, or Client name
\item the time a Job has been on a Volume
\item high and low water marks (usage or occupation) of a Pool
\end{itemize}

The details of these selection criteria will be defined below.

To run a Migration job, you must first define a Job resource very similar
to a Backup Job but with {\bf Type = Migrate} instead of {\bf Type =
Backup}. One of the key points to remember is that the Pool that is
specified for the migration job is the only pool from which jobs will
be migrated, with one exception noted below. In addition, the Pool to
which the selected Job or Jobs will be migrated is defined by the {\bf
Next Pool = ...} directive in the Pool resource specified for the
Migration Job.

Bacula permits Pools to contain Volumes with different Media Types.
However, when doing migration, this is a very undesirable condition. For
migration to work properly, you should use Pools containing only Volumes of
the same Media Type for all migration jobs.

The migration job normally is either manually started or starts
from a Schedule much like a backup job. It searches
for a previous backup Job or Jobs that match the parameters you have
specified in the migration Job resource, primarily a {\bf Selection Type}
(detailed a bit later). Then for
each previous backup JobId found, the Migration Job will run a new Job which
copies the old Job data from the previous Volume to a new Volume in
the Migration Pool. It is possible that no prior Jobs are found for
migration, in which case the Migration job will simply terminate having
done nothing, but normally, at a minimum, three jobs are involved during a
migration:

\begin{itemize}
\item The currently running Migration control Job. This is only
      a control job for starting the migration child jobs.
\item The previous Backup Job (already run). The File records
      for this Job are purged if the Migration job successfully
      terminates. The original data remains on the Volume until
      it is recycled and rewritten.
\item A new Migration Backup Job that moves the data from the
      previous Backup job to the new Volume. If you subsequently
      do a restore, the data will be read from this Job.
\end{itemize}

If the Migration control job finds a number of JobIds to migrate (e.g.
it is asked to migrate one or more Volumes), it will start one new
migration backup job for each JobId found on the specified Volumes.
Please note that Migration doesn't scale too well since Migrations are
done on a Job by Job basis. Thus, if you select a very large volume or
a number of volumes for migration, you may have a large number of
Jobs that start. Because each job must read the same Volume, they will
run consecutively (not simultaneously).

\section{Migration and Copy Job Resource Directives}

The following directives can appear in a Director's Job resource, and they
are used to define a Migration job.

\begin{description}
\item [Pool = \lt{}Pool-name\gt{}] The Pool specified in the Migration
  control Job is not a new directive for the Job resource, but it is
  particularly important because it determines what Pool will be examined
  for finding JobIds to migrate. The exception to this is when {\bf
  Selection Type = SQLQuery}: although a Pool directive must still be
  specified, no Pool is used unless you specifically include it in the
  SQL query. Note that, in any case, the Pool resource defined by the Pool
  directive must contain a {\bf Next Pool = ...} directive to define the
  Pool to which the data will be migrated.

\item [Type = Migrate]
  {\bf Migrate} is a new type that defines the job that is run as being a
  Migration Job. A Migration Job is a sort of control job and does not have
  any Files associated with it, and in that sense it is more or less like
  an Admin job. Migration jobs simply check to see if there is anything to
  migrate, then possibly start and control new Backup jobs to migrate the data
  from the specified Pool to another Pool. Note that any original JobId that
  is migrated will be marked as having been migrated, and the original
  JobId can no longer be used for restores; all restores will be done from
  the new migrated Job.

\item [Type = Copy]
  {\bf Copy} is a new type that defines the job that is run as being a
  Copy Job. A Copy Job is a sort of control job and does not have
  any Files associated with it, and in that sense it is more or less like
  an Admin job. Copy jobs simply check to see if there is anything to
  copy, then possibly start and control new Backup jobs to copy the data
  from the specified Pool to another Pool. Note that when a copy is
  made, the original JobIds are left unchanged. The new copies cannot
  be used for restoration unless you specifically choose them by JobId.
  If you subsequently delete a JobId that has a copy, the copy will be
  automatically upgraded to a Backup rather than a Copy, and it will
  subsequently be used for restoration.

\item [Selection Type = \lt{}Selection-type-keyword\gt{}]
  The \lt{}Selection-type-keyword\gt{} determines how the migration job
  will go about selecting what JobIds to migrate. In most cases, it is
  used in conjunction with a {\bf Selection Pattern} to give you fine
  control over exactly what JobIds are selected. The possible values
  for \lt{}Selection-type-keyword\gt{} are:

  \begin{description}
  \item [SmallestVolume] This selection keyword selects the volume with the
    fewest bytes from the Pool to be migrated. The Pool to be migrated
    is the Pool defined in the Migration Job resource. The migration
    control job will then start and run one migration backup job for
    each of the Jobs found on this Volume. The Selection Pattern, if
    specified, is not used.

  \item [OldestVolume] This selection keyword selects the volume with the
    oldest last write time in the Pool to be migrated. The Pool to be
    migrated is the Pool defined in the Migration Job resource. The
    migration control job will then start and run one migration backup
    job for each of the Jobs found on this Volume. The Selection
    Pattern, if specified, is not used.

  \item [Client] The Client selection type first selects all the Clients
    that have been backed up in the Pool specified by the Migration
    Job resource, then it applies the {\bf Selection Pattern} (defined
    below) as a regular expression to the list of Client names, giving
    a filtered Client name list. All jobs that were backed up for those
    filtered (regexed) Clients will be migrated.
    The migration control job will then start and run one migration
    backup job for each of the JobIds found for those filtered Clients.

  \item [Volume] The Volume selection type first selects all the Volumes
    that have been written in the Pool specified by the Migration
    Job resource, then it applies the {\bf Selection Pattern} (defined
    below) as a regular expression to the list of Volume names, giving
    a filtered Volume list. All JobIds that were backed up to those
    filtered (regexed) Volumes will be migrated.
    The migration control job will then start and run one migration
    backup job for each of the JobIds found on those filtered Volumes.

  \item [Job] The Job selection type first selects all the Jobs (as
    defined on the {\bf Name} directive in a Job resource)
    that have been backed up in the Pool specified by the Migration
    Job resource, then it applies the {\bf Selection Pattern} (defined
    below) as a regular expression to the list of Job names, giving
    a filtered Job name list. All JobIds that were run for those
    filtered (regexed) Job names will be migrated. Note that, for a given
    Job name, there can be many jobs (JobIds) that ran.
    The migration control job will then start and run one migration
    backup job for each of the Jobs found.

  \item [SQLQuery] The SQLQuery selection type uses the {\bf Selection
    Pattern} as an SQL query to obtain the JobIds to be migrated.
    The Selection Pattern must be a valid SELECT SQL statement for your
    SQL engine, and it must return the JobId as the first field.

  \item [PoolOccupancy] This selection type will cause the Migration job
    to compute the total size of the specified pool for all Media Types
    combined. If it exceeds the {\bf Migration High Bytes} defined in
    the Pool, the Migration job will migrate all JobIds beginning with
    the oldest Volume in the pool (determined by Last Write time) until
    the Pool bytes drop below the {\bf Migration Low Bytes} defined in the
    Pool. This calculation should be considered rather approximate because
    it is made once by the Migration job before migration is begun, and
    thus does not take into account additional data written into the Pool
    during the migration. In addition, the calculation of the total Pool
    byte size is based on the Volume bytes saved in the Volume (Media)
    entries. The bytes calculated for Migration are based on the value stored
    in the Job records of the Jobs to be migrated. These do not include the
    Storage daemon overhead that is included in the total Pool size. As a
    consequence, the migration will normally migrate more bytes than
    strictly necessary.

  \item [PoolTime] The PoolTime selection type will cause the Migration job to
    look at the time each JobId has been in the Pool since the job ended.
    All Jobs in the Pool longer than the time specified on the {\bf Migration
    Time} directive in the Pool resource will be migrated.

  \item [PoolUncopiedJobs] This selection type copies all jobs from one pool
    to another that have not been copied before. It is available only for
    Copy jobs.
  \end{description}

\item [Selection Pattern = \lt{}Quoted-string\gt{}]
  The Selection Patterns permitted for each Selection-type-keyword are
  as follows.

  For the OldestVolume and SmallestVolume keywords, this
  Selection Pattern is not used (ignored).

  For the Client, Volume, and Job
  keywords, this pattern must be a valid regular expression that will filter
  the appropriate item names found in the Pool.

  For the SQLQuery keyword, this pattern must be a valid SELECT SQL statement
  that returns the JobId as the first field.

\item [Purge Migration Job = \lt{}yes/no\gt{}]
  This directive may be added to the Migration Job definition in the Director
  configuration file to purge the migrated job at the end of a migration.
\end{description}
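
To make the SQLQuery selection type more concrete, the following is a
minimal sketch of a Migration job that obtains its JobIds from a query.
It assumes that the catalog {\bf Job} table has the columns {\bf JobId},
{\bf Name}, {\bf Type}, and {\bf JobStatus}; the resource names
({\bf migrate-by-query}, {\bf rufus-fd}, {\bf Full Set}, {\bf Standard},
{\bf Default}, and {\bf NightlySave}) are placeholders for this
illustration, and the query should be checked against your own catalog
before it is used.

\begin{verbatim}
# Hypothetical Migration job using an SQL query; names are placeholders
Job {
  Name = "migrate-by-query"
  Type = Migrate
  Level = Full                 # must be defined
  Client = rufus-fd            # must be defined
  FileSet = "Full Set"         # must be defined
  Messages = Standard
  Pool = Default               # still required; its Next Pool is the target
  Selection Type = SQLQuery
  # JobId must be the first field returned by the SELECT
  Selection Pattern = "SELECT JobId FROM Job WHERE Type='B' AND JobStatus='T' AND Name='NightlySave'"
}
\end{verbatim}

Because no Pool is applied to an SQLQuery selection unless the query
mentions one, a query like this selects matching Jobs regardless of the
Pool they were written to.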

\section{Migration Pool Resource Directives}

The following directives can appear in a Director's Pool resource, and they
are used to define a Migration job.

\begin{description}
\item [Migration Time = \lt{}time-specification\gt{}]
  If a PoolTime migration is done, the time specified here in seconds (time
  modifiers such as hours are permitted) will be used. If the
  previous Backup Job or Jobs selected have been in the Pool longer than
  the specified Migration Time, then they will be migrated.

\item [Migration High Bytes = \lt{}byte-specification\gt{}]
  This directive specifies the number of bytes in the Pool which will
  trigger a migration if a {\bf PoolOccupancy} migration selection
  type has been specified. The fact that the Pool
  usage goes above this level does not automatically trigger a migration
  job. However, if a migration job runs and has the PoolOccupancy selection
  type set, the Migration High Bytes will be applied. Bacula does not
  currently restrict a pool to have only a single Media Type, so you
  must keep in mind that if you mix Media Types in a Pool, the results
  may not be what you want, as the Pool count of all bytes will be
  for all Media Types combined.

\item [Migration Low Bytes = \lt{}byte-specification\gt{}]
  This directive specifies the number of bytes in the Pool which will
  stop a migration if a {\bf PoolOccupancy} migration selection
  type has been specified and triggered by more than Migration High
  Bytes being in the pool. In other words, once a migration job
  is started with {\bf PoolOccupancy} migration selection and it
  determines that there are more than Migration High Bytes, the
  migration job will continue to run jobs until the number of
  bytes in the Pool drops to or below Migration Low Bytes.

\item [Next Pool = \lt{}pool-specification\gt{}]
  The Next Pool directive specifies the pool to which Jobs will be
  migrated. This directive is required to define the Pool into which
  the data will be migrated. Without this directive, the migration job
  will terminate in error.

\item [Storage = \lt{}storage-specification\gt{}]
  The Storage directive specifies what Storage resource will be used
  for all Jobs that use this Pool. It takes precedence over any other
  Storage specifications that may have been given such as in the
  Schedule Run directive, or in the Job resource. We highly recommend
  that you define the Storage resource to be used in the Pool rather
  than elsewhere (job, schedule run, ...).
\end{description}
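
As an illustration of how these directives fit together, a source Pool
might be declared as in the sketch below. The sizes and the time shown are
arbitrary values chosen for this example, and the names {\bf Default},
{\bf Tape}, and {\bf File} are assumed to refer to resources defined
elsewhere in the Director configuration.

\begin{verbatim}
# Hypothetical source Pool; thresholds are arbitrary example values
Pool {
  Name = Default
  Pool Type = Backup
  Storage = File                # Storage defined in the Pool, as recommended
  Next Pool = Tape              # required: where migrated data is written
  Migration Time = 30 days      # used only by Selection Type = PoolTime
  Migration High Bytes = 400G   # used only by Selection Type = PoolOccupancy
  Migration Low Bytes = 300G    # PoolOccupancy stops at or below this size
}
\end{verbatim}

With these example values, a PoolOccupancy migration that runs while the
Pool holds roughly 450 gigabytes would select Jobs from the oldest Volumes
until the computed Pool size is at or below 300 gigabytes, while a PoolTime
migration would select Jobs that have been in the Pool for more than 30 days.
Neither threshold starts a migration by itself; a Migration job must still
be run.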

\section{Important Migration Considerations}
\index[general]{Important Migration Considerations}

\begin{itemize}
\item Each Pool into which you migrate Jobs or Volumes {\bf must}
      contain Volumes of only one Media Type.

\item Migration takes place on a JobId by JobId basis. That is,
      each JobId is migrated in its entirety and independently
      of other JobIds. Once the Job is migrated, it will be
      on the new medium in the new Pool, but for the most part,
      aside from having a new JobId, it will appear with all the
      same characteristics of the original job (start, end time, ...).
      The column RealEndTime in the catalog Job table will contain the
      time and date that the Migration terminated, and by comparing
      it with the EndTime column you can tell whether or not the
      job was migrated. The original job is purged of its File
      records, and its Type field is changed from "B" to "M" to
      indicate that the job was migrated.

\item Jobs on Volumes will be migrated only if the Volume is
      marked Full, Used, or Error. Volumes that are still
      marked Append will not be considered for migration. This
      prevents Bacula from attempting to read the Volume at
      the same time it is writing it. It also reduces other deadlock
      situations, and avoids the problem of migrating a
      Volume and later finding new files appended to that Volume.

\item As noted above, for the Migration High Bytes, the calculation
      of the bytes to migrate is somewhat approximate.

\item If you keep Volumes of different Media Types in the same Pool,
      it is not clear how well migration will work. We recommend only
      one Media Type per pool.

\item It is possible to get into a resource deadlock where Bacula does
      not find enough drives to simultaneously read and write all the
      Volumes needed to do Migrations. For the moment, you must take
      care because not all of the resource deadlock algorithms are
      implemented yet.

\item Migration is done only when you run a Migration job. If you set a
      Migration High Bytes and that number of bytes is exceeded in the Pool,
      no migration job will automatically start. You must schedule the
      migration jobs, and they must run for any migration to take place.

\item If you migrate a number of Volumes, a very large number of Migration
      jobs may start.

\item Figuring out what jobs will actually be migrated can be a bit complicated
      due to the flexibility provided by the regex patterns and the number of
      different options. Turning on a debug level of 100 or more will provide
      a limited amount of debug information about the migration selection
      process.

\item Bacula currently does only minimal Storage conflict resolution, so you
      must take care to ensure that you don't try to read and write to the
      same device or Bacula may block waiting to reserve a drive that it
      will never find. In general, ensure that all your migration
      pools contain only one Media Type, and that you always
      migrate to pools with different Media Types.

\item The {\bf Next Pool = ...} directive must be defined in the Pool
      referenced in the Migration Job to define the Pool into which the
      data will be migrated.

\item Pay particular attention to the fact that data is migrated on a Job
      by Job basis, and for any particular Volume, only one Job can read
      that Volume at a time (no simultaneous read), so migration jobs that
      all reference the same Volume will run sequentially. This can be a
      potential bottleneck and does not scale very well to large numbers
      of jobs.

\item Only migration with the Job and Volume Selection Types has
      been carefully tested. All the other migration methods (time,
      occupancy, smallest, oldest, ...) need additional testing.

\item Migration is only implemented for a single Storage daemon. You
      cannot read on one Storage daemon and write on another.
\end{itemize}
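
Since no migration happens unless a Migration job actually runs, one
possible arrangement is to attach the job to its own Schedule, as in the
sketch below. The names ({\bf MigrationCycle}, {\bf migrate-occupancy},
{\bf rufus-fd}, {\bf Full Set}, {\bf Standard}, and {\bf Default}) and the
run time are placeholders for this illustration; the source Pool is assumed
to define {\bf Next Pool}, {\bf Migration High Bytes}, and
{\bf Migration Low Bytes}.

\begin{verbatim}
# Hypothetical weekly schedule for a migration job; names are placeholders
Schedule {
  Name = "MigrationCycle"
  Run = sun at 23:05
}

Job {
  Name = "migrate-occupancy"
  Type = Migrate
  Level = Full                 # must be defined
  Client = rufus-fd            # must be defined
  FileSet = "Full Set"         # must be defined
  Messages = Standard
  Pool = Default               # Pool checked against Migration High Bytes
  Schedule = "MigrationCycle"
  Selection Type = PoolOccupancy
}
\end{verbatim}

Each scheduled run checks the Pool occupancy once; if the Pool is below
Migration High Bytes at that moment, the job terminates without migrating
anything.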

\section{Example Migration Jobs}
\index[general]{Example Migration Jobs}

When you specify a Migration Job, you must specify all the standard
directives as for a Job. However, certain directives, such as the Level,
Client, and FileSet, though they must be defined, are ignored by the
Migration job because the values from the original job are used instead.

As an example, suppose you have the following Job that
you run every night. Note that there is no Storage directive in the
Job resource; there is a Storage directive in each of the Pool
resources; and the Pool to be migrated (File) contains a Next Pool
directive that defines the output Pool (where the data is written
by the migration job).

\begin{verbatim}
# Define the backup Job
...
  Level = Incremental                 # default
...
  Schedule = "WeeklyCycle"
...

# Default pool definition
...

# Tape pool definition
...

# Definition of File storage device
...
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "File"          # same as Device in Storage daemon
  Media Type = File        # same as MediaType in Storage daemon

# Definition of DLT tape storage device
...
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "HP DLT 80"      # same as Device in Storage daemon
  Media Type = DLT8000      # same as MediaType in Storage daemon
\end{verbatim}

Here we have included only the essential information, i.e. the
Director, FileSet, Catalog, Client, Schedule, and Messages resources are
omitted.

As you can see, by running the NightlySave Job, the data will be backed up
to File storage using the Default pool to specify the Storage as File.

Now, if we add the following Job resource to this configuration file:

\begin{verbatim}
  Name = "migrate-volume"
...
  Maximum Concurrent Jobs = 4
  Selection Type = Volume
  Selection Pattern = "File"
\end{verbatim}

and then run the job named {\bf migrate-volume}, all volumes in the Pool
named Default (as specified in the migrate-volume Job) that match the
regular expression pattern {\bf File} will be migrated to tape storage
DLTDrive because the {\bf Next Pool} in the Default Pool specifies that
Migrations should go to the pool named {\bf Tape}, which uses
Storage {\bf DLTDrive}.

If instead, we use a Job resource as follows:

\begin{verbatim}
...
  Maximum Concurrent Jobs = 4
...
  Selection Pattern = ".*Save"
\end{verbatim}

then all jobs whose names end in Save will be migrated from the Default
Pool to the Tape Pool, that is, from File storage to Tape storage.