3 Bacula Projects Roadmap
6 Below, you will find more information on future projects:
8 Item 1: Implement a Migration job type that will move the job
9 data from one device to another.
10 Origin: Sponsored by Riege Software International GmbH. Contact:
11 Daniel Holtkamp <holtkamp at riege dot com>
13 Status: Partially coded in 1.37 -- much more to do. Assigned to
16 What: The ability to copy, move, or archive data that is on a
17 device to another device is very important.
19 Why: An ISP might want to back up to disk, but after 30 days
20 migrate the data to tape backup and delete it from
21 disk. Bacula should be able to handle this
22 automatically. It needs to know what was put where,
23 and when, and what to migrate -- it is a bit like
24 retention periods. Doing so would allow space to be
25 freed up for current backups while maintaining older
28 Notes: Migration could be triggered by:
32 Highwater size (keep total size)
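A minimal sketch of how such triggers might select jobs, assuming a hypothetical JobRecord with an age and a size (none of these names come from Bacula). It combines the 30-day age rule from the example above with a highwater limit on the total size kept on the source device.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical catalog entry; not Bacula's actual schema."""
    job_id: int
    age_days: int
    size_bytes: int


def select_for_migration(jobs, max_age_days=30, highwater_bytes=None):
    """Return the jobs to migrate, oldest first."""
    oldest_first = sorted(jobs, key=lambda j: j.age_days, reverse=True)
    # Age trigger: everything older than the limit is migrated.
    selected = [j for j in oldest_first if j.age_days > max_age_days]
    if highwater_bytes is not None:
        chosen = {j.job_id for j in selected}
        remaining = sum(j.size_bytes for j in jobs if j.job_id not in chosen)
        # Size trigger: keep migrating the oldest jobs until the data left
        # on the source device fits under the highwater mark.
        for job in oldest_first:
            if remaining <= highwater_bytes:
                break
            if job.job_id not in chosen:
                selected.append(job)
                chosen.add(job.job_id)
                remaining -= job.size_bytes
    return selected
```

Whether the two triggers should be combined or configured separately is left open here; the sketch only shows that both reduce to picking the oldest jobs first.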
35 Item 2: Implement extraction of Win32 BackupWrite data.
36 Origin: Thorsten Engel <thorsten.engel at matrix-computer dot com>
38 Status: Assigned to Thorsten. Implemented in current CVS
40 What: This provides the Bacula File daemon with code that
41 can pick apart the stream output that Microsoft writes
42 for BackupWrite data, and thus the data can be read
43 and restored on non-Win32 machines.
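As an illustration only, here is how such a stream could be picked apart, assuming the data is a plain sequence of WIN32_STREAM_ID records as documented by Microsoft (a 20-byte little-endian header, an optional UTF-16 stream name, then the stream data). The function names are hypothetical and this is not the code in the File daemon.

```python
import struct

# WIN32_STREAM_ID header: dwStreamId, dwStreamAttributes, Size, dwStreamNameSize
HEADER = struct.Struct("<IIqI")
BACKUP_DATA = 1  # stream id of the ordinary file contents


def iter_streams(buf):
    """Yield (stream_id, name, data) for each record in a BackupWrite buffer."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        stream_id, _attrs, size, name_size = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        name = buf[offset:offset + name_size].decode("utf-16-le")
        offset += name_size
        data = buf[offset:offset + size]
        offset += size
        yield stream_id, name, data


def extract_file_data(buf):
    """Keep only the file contents, dropping security and other streams."""
    return b"".join(d for sid, _name, d in iter_streams(buf) if sid == BACKUP_DATA)
```

On a non-Win32 machine only the BACKUP_DATA stream is usually meaningful, which is why the sketch discards the rest.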
45 Why: BackupWrite data is the portable=no option in Win32
46 FileSets, and in previous Baculas, this data could
47 only be extracted using a Win32 FD. With this new code,
48 the Windows data can be extracted and restored on
52 Item 3: Implement a Bacula GUI/management tool using Python
59 What: Implement a Bacula console, and management tools
62 Why: Don't we already have a wxWidgets GUI? Yes, but
63 it is written in C++ and changes to the user interface
64 must be hand tailored using C++ code. By developing
65 the user interface using Qt designer, the interface
66 can be very easily updated and most of the new Python
67 code will be automatically created. The user interface
68 changes become very simple, and only the new features
69 must be implemented. In addition, the code will be in
70 Python, which will give many more users easy (or easier)
71 access to making additions or modifications.
73 Item 4: Implement a Python interface to the Bacula catalog.
78 What: Implement an interface for Python scripts to access
79 the catalog through Bacula.
81 Why: This will permit users to customize Bacula through
84 Item 5: Implement more Python events in Bacula.
89 What: Allow Python scripts to be called at more places
90 within Bacula and provide additional access to Bacula
93 Why: This will permit users to customize Bacula through
101 Item 6: Implement Base jobs.
102 Date: 28 October 2005
106 What: A base job is sort of like a Full save except that you
107 will want the FileSet to contain only files that are
108 unlikely to change in the future (i.e. a snapshot of
109 most of your system after installing it). After the
110 base job has been run, when you are doing a Full save,
111 you specify one or more Base jobs to be used. All
112 files that have been backed up in the Base job/jobs but
113 not modified will then be excluded from the backup.
114 During a restore, the Base jobs will be automatically
115 pulled in where necessary.
117 Why: This is something none of the competition does, as far as
118 we know (except perhaps BackupPC, which is a Perl program that
119 saves to disk only). It is a big win for the user; it
120 makes Bacula stand out as offering a unique
121 optimization that immediately saves time and money.
122 Basically, imagine that you have 100 nearly identical
123 Windows or Linux machines containing the OS and user
124 files. Now for the OS part, a Base job will be backed
125 up once, and rather than making 100 copies of the OS,
126 there will be only one. If one or more of the systems
127 have some files updated, no problem, they will be
128 automatically restored.
130 Notes: Huge savings in tape usage even for a single machine.
131 Will require more resources because the DIR must send
132 FD a list of files/attribs, and the FD must search the
133 list and compare it for each file to be saved.
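The comparison described in the Notes can be sketched as follows, assuming the Director's list arrives as a mapping from path to an attribute tuple such as (mtime, size). The data layout and the function name are illustrative assumptions, not Bacula's protocol.

```python
def files_to_back_up(current, base):
    """Return the paths that must be saved in the Full backup.

    current and base both map path -> (mtime, size). A file is skipped
    only if the Base job holds it with identical attributes.
    """
    return sorted(path for path, attrs in current.items()
                  if base.get(path) != attrs)
```

With a hash-based mapping each lookup is cheap, so the extra cost is mostly the memory needed to hold the Base job's file list in the FD.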
135 Item 7: Add Plug-ins to the FileSet Include statements.
136 Date: 28 October 2005
138 Status: Partially coded in 1.37 -- much more to do.
140 What: Allow users to specify wild-card and/or regular
141 expressions to be matched in both the Include and
142 Exclude directives in a FileSet. At the same time,
143 allow users to define plug-ins to be called (based on
144 regular expression/wild-card matching).
146 Why: This would give the users the ultimate ability to control
147 how files are backed up/restored. A user could write a
148 plug-in that knows how to back up his Oracle database without
149 stopping/starting it, for example.
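A rough sketch of the dispatch idea, assuming each rule is a (kind, pattern, handler) triple where kind is "wild" or "regex". The rule format and handler signature are invented for illustration; the real FileSet syntax is not decided here.

```python
import fnmatch
import re


def make_dispatcher(rules, default):
    """Build a function that routes a path to the first matching plug-in."""
    matchers = []
    for kind, pattern, handler in rules:
        if kind == "wild":
            # Shell-style wild-card, matched against the whole path.
            match = lambda p, pat=pattern: fnmatch.fnmatchcase(p, pat)
        else:
            # Regular expression, matched anywhere in the path.
            match = re.compile(pattern).search
        matchers.append((match, handler))

    def dispatch(path):
        for match, handler in matchers:
            if match(path):
                return handler(path)
        return default(path)  # no plug-in claimed the file

    return dispatch
```

For example, a rule such as ("wild", "/u01/oradata/*.dbf", oracle_backup) would hand database files to a hypothetical oracle_backup plug-in while everything else goes through the normal backup path.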
151 Item 8: Implement huge exclude list support using hashing.
152 Date: 28 October 2005
156 What: Allow users to specify a very large exclude list (currently
157 more than about 1000 files is too many).
159 Why: This would give the users the ability to exclude all
160 files that are loaded with the OS (e.g. using rpms
161 or debs). If the user can restore the base OS from
162 CDs, there is no need to back up all those files. A
163 complete restore would be to restore the base OS, then
164 do a Bacula restore. By excluding the base OS files, the
165 backup set will be *much* smaller.
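The hashing idea can be illustrated with a hash set, assuming the exclude list is supplied as one path per line (however it was generated). The function names are hypothetical.

```python
def load_exclude_set(lines):
    """Load an exclude list into a hash set, ignoring blank lines."""
    return {line.strip() for line in lines if line.strip()}


def filter_excluded(paths, excluded):
    """Drop excluded paths; each membership test is a hash lookup."""
    return [p for p in paths if p not in excluded]
```

Because each lookup takes roughly constant time, the cost of checking a file no longer grows with the length of the exclude list, which is what makes lists far beyond 1000 entries practical.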
168 Item 9: Implement data encryption (as opposed to communications
170 Date: 28 October 2005
171 Origin: Sponsored by Landon and 13 contributors to EFF.
172 Status: Landon Fuller is currently implementing this.
174 What: Currently the data that is stored on the Volume is not
175 encrypted. For confidentiality, encryption of data at
176 the File daemon level is essential.
177 Data encryption encrypts the data in the File daemon and
178 decrypts the data in the File daemon during a restore.
180 Why: Large sites require this.
182 Item 10: Permit multiple Media Types in an Autochanger
186 What: Modify the Storage daemon so that multiple Media Types
187 can be specified in an autochanger. This would be somewhat
188 of a simplistic implementation in that each drive would
189 still be allowed to have only one Media Type. However,
190 the Storage daemon will ensure that only a drive with
191 the Media Type that matches what the Director specifies
194 Why: This will permit users with several different drive types
195 to make full use of their autochangers.
197 Item 11: Allow two different autochanger definitions that refer
198 to the same autochanger.
199 Date: 28 October 2005
203 What: Currently, the autochanger script is locked based on
204 the autochanger. That is, if multiple drives are being
205 simultaneously used, the Storage daemon ensures that only
206 one drive at a time can access the mtx-changer script.
207 This change would base the locking on the control device,
208 rather than the autochanger. It would then permit two autochanger
209 definitions for the same autochanger, but with different
210 drives. Logically, the autochanger could then be "partitioned"
211 for different jobs, clients, or classes of jobs, and if the locking
212 is based on the control device (e.g. /dev/sg0) the mtx-changer
213 script will be locked appropriately.
215 Why: This will permit users to partition autochangers for specific
216 use. It would also permit implementation of multiple Media
217 Types with no changes to the Storage daemon.
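To illustrate locking on the control device rather than on the autochanger definition, here is a small sketch. The dictionary layout for an autochanger definition and the function names are assumptions made for the example.

```python
import threading
from collections import defaultdict

_locks = defaultdict(threading.Lock)   # one lock per control device path
_registry_guard = threading.Lock()     # protects the lock table itself


def changer_lock(control_device):
    """Return the lock shared by every definition using this device."""
    with _registry_guard:
        return _locks[control_device]


def run_changer(autochanger, command):
    """Run a changer command while holding the control device's lock."""
    with changer_lock(autochanger["control_device"]):
        return command()
```

Two definitions that both name /dev/sg0 would then serialize their mtx-changer calls, while a changer on another control device is not blocked.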
219 Item 12: Implement red/black binary tree routines.
220 Date: 28 October 2005
224 What: Implement a red/black binary tree class. This could
225 then replace the current binary insert/search routines
226 used in the in-memory restore tree. This could significantly
227 speed up the creation of the in-memory restore tree.
229 Why: Performance enhancement.
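For illustration, a compact sketch of one red/black variant, the left-leaning red-black tree, with insert and search only. This is a swapped-in textbook technique chosen for brevity, not a proposal for the actual class design.

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None
        self.red = True  # new nodes start out red


def is_red(node):
    return node is not None and node.red


def rotate_left(h):
    x = h.right
    h.right = x.left
    x.left = h
    x.red = h.red
    h.red = True
    return x


def rotate_right(h):
    x = h.left
    h.left = x.right
    x.right = h
    x.red = h.red
    h.red = True
    return x


def _insert(h, key, value):
    if h is None:
        return Node(key, value)
    if key < h.key:
        h.left = _insert(h.left, key, value)
    elif key > h.key:
        h.right = _insert(h.right, key, value)
    else:
        h.value = value
    # Restore the left-leaning red-black invariants on the way back up.
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):
        h.red = True
        h.left.red = False
        h.right.red = False
    return h


def insert(root, key, value):
    """Insert or update a key and return the new (black) root."""
    root = _insert(root, key, value)
    root.red = False
    return root


def search(root, key):
    node = root
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return node.value
    return None


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))
```

File names tend to arrive in sorted order, which is the worst case for a plain binary tree; a balanced tree keeps the height logarithmic even then.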
231 Item 13: Let Bacula log tape usage and handle drive cleaning cycles.
232 Date: November 11, 2005
233 Origin: Arno Lehmann <al at its-lehmann dot de>
236 What: Make Bacula manage tape life cycle information and drive
239 Why: Both parts of this project are important when operating backups.
240 We need to know which tapes need replacement, and we need to
241 make sure the drives are cleaned when necessary. While many
242 tape libraries and even autoloaders can handle all this
243 automatically, support by Bacula can be helpful for smaller
244 (older) libraries and single drives. Also, checking drive
245 status during operation can prevent some failures (as I had to
246 learn the hard way...)
248 Notes: First, Bacula could (and even does, to some limited extent)
249 record tape and drive usage. For tapes, the number of mounts,
250 the amount of data, and the time the tape has actually been
251 running could be recorded. Data fields for Read and Write time
252 and Number of mounts already exist in the catalog (I'm not sure
253 if VolBytes is the sum of all bytes ever written to that volume
254 by Bacula). This information can be important when determining
255 which media to replace. For the tape drives known to Bacula,
256 similar information is interesting to determine the device
257 status and expected life time: Time it's been Reading and
258 Writing, number of tape Loads / Unloads / Errors. This
259 information is not yet recorded as far as I know.
261 The next step would be implementing drive cleaning setup.
262 Bacula already has knowledge about cleaning tapes. Once it has
263 some information about cleaning cycles (measured in drive run
264 time, number of tapes used, or calendar days, for example) it
265 can automatically execute tape cleaning (with an autochanger,
266 obviously) or ask for operator assistance loading a cleaning
269 The next step would be to implement TAPEALERT checks not only
270 when changing tapes and only sending the information to the
271 administrator, but rather checking after each tape error,
272 checking on a regular basis (for example after each tape file),
273 and also before unloading and after loading a new tape. Then,
274 depending on the drive's TAPEALERT state and the known drive
275 cleaning state, Bacula could automatically schedule later
276 cleaning, clean immediately, or inform the operator.
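The decision described above might look roughly like this. The inputs (two TAPEALERT-derived booleans, a run-time counter, and a cleaning interval) are hypothetical names chosen for the sketch; nothing here reflects existing SD code.

```python
def cleaning_action(clean_now, clean_periodic, hours_since_clean,
                    clean_interval_hours, has_autochanger):
    """Choose what to do about drive cleaning after a status check."""
    if clean_now:
        # The drive reports it needs cleaning before further use.
        return "clean_immediately" if has_autochanger else "inform_operator"
    if clean_periodic or hours_since_clean >= clean_interval_hours:
        # Cleaning is due but not urgent: do it after the running job.
        return "schedule_cleaning"
    return "none"
```

The same function could be called after each tape error, at regular intervals, and around load and unload operations.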
278 Implementing this would perhaps require another catalog change
279 and perhaps major changes in SD code and the DIR-SD protocol,
280 so I'd only consider this worth implementing if it would
281 actually be used or even needed by many people.
283 Item 14: Merging of multiple backups into a single one. (Also called Synthetic
284 Backup or Consolidation).
286 Origin: Marc Cousin and Eric Bollengier
287 Date: 15 November 2005
288 Status: Depends on first implementing project Item 1 (Migration).
290 What: A merged backup is a backup made without connecting to the Client.
291 It would be a Merge of existing backups into a single backup.
292 In effect, it is like a restore but to the backup medium.
294 For instance, say that last Sunday we made a full backup. Then
295 all week long, we created incremental backups, in order to do
296 them fast. Now comes Sunday again, and we need another full.
297 The merged backup makes it possible to do instead an incremental
298 backup (during the night for instance), and then create a merged
299 backup during the day, by using the full and incrementals from
300 the week. The merged backup will be exactly like a full made
301 Sunday night on the tape, but the production interruption on the
302 Client will be minimal, as the Client will only have to send
305 In fact, if it's done correctly, you could merge all the
306 Incrementals into a single Incremental, or all the Incrementals
307 and the last Differential into a new Differential, or the Full,
308 last differential and all the Incrementals into a new Full
309 backup. And there is no need to involve the Client.
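A simplified sketch of the merge, assuming each backup is represented as a mapping from path to a file record, listed oldest first, and that a record of None marks a file known to be deleted. That representation is an assumption for illustration; how deletions would actually be tracked is not specified here.

```python
def merge_backups(backups):
    """Merge a chronological list of backups into one file set.

    The newest record for each path wins, so merging a Full with its
    Incrementals yields what a new Full would have contained.
    """
    merged = {}
    for backup in backups:
        for path, record in backup.items():
            if record is None:
                merged.pop(path, None)   # file was deleted later on
            else:
                merged[path] = record
    return merged
```

The same function covers all three cases above: only the list of input backups changes, and the Client is never contacted.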
311 Why: The benefits are:
312 - the Client only does an incremental;
313 - the merged backup on tape is just like a single full backup,
314 and can be restored very fast.
316 This is also a way of reducing the backup data since the old
317 data can then be pruned (or not) from the catalog, possibly
318 allowing older volumes to be recycled.
320 Item 15: Automatic disabling of devices
322 Origin: Peter Eriksson <peter at ifm.liu dot se>
325 What: After a configurable number of fatal errors with a tape drive
326 Bacula should automatically disable further use of that
327 tape drive. There should also be "disable"/"enable" commands in
330 Why: On a multi-drive jukebox there is a possibility of tape drives
331 going bad during large backups (needing a cleaning tape run,
332 tapes getting stuck). It would be advantageous if Bacula would
333 automatically disable further use of a problematic tape drive
334 after a configurable number of errors has occurred.
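A minimal sketch of the requested behaviour, assuming a per-drive error counter and a configurable threshold. The class and method names are hypothetical and only mirror the "disable"/"enable" commands mentioned above.

```python
class DriveMonitor:
    """Track fatal errors per drive and disable drives that exceed a limit."""

    def __init__(self, max_fatal_errors=3):
        self.max_fatal_errors = max_fatal_errors
        self.errors = {}
        self.disabled = set()

    def record_fatal_error(self, drive):
        self.errors[drive] = self.errors.get(drive, 0) + 1
        if self.errors[drive] >= self.max_fatal_errors:
            self.disabled.add(drive)

    def disable(self, drive):
        """Manual 'disable' command."""
        self.disabled.add(drive)

    def enable(self, drive):
        """Manual 'enable' command; also resets the error count."""
        self.disabled.discard(drive)
        self.errors[drive] = 0

    def pick_drive(self, drives):
        """Return the first usable drive, or None if all are disabled."""
        for drive in drives:
            if drive not in self.disabled:
                return drive
        return None
```

With several drives available, new jobs would simply be routed to the next usable drive until an operator re-enables the faulty one.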
336 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
337 where tapes occasionally get stuck inside the drive. Bacula will
338 notice that the "mtx-changer" command will fail and then fail
339 any backup jobs trying to use that drive. However, it will still
340 keep on trying to run new jobs using that drive and fail -
341 forever, and thus failing lots and lots of jobs... Since we have
342 many drives Bacula could have just automatically disabled
343 further use of that drive and used one of the other ones
347 ============= Empty RFC form ===========
348 Item n: One line summary ...
350 Origin: Name and email of originator.
353 What: More detailed explanation ...
355 Why: Why it is important ...
357 Notes: Additional notes or features (omit if not used)
358 ============== End RFC form ==============
361 Items completed for release 1.38.0:
362 #4 Embedded Python Scripting (implemented in all Daemons)
363 #5 Events that call a Python program (Implemented in all
364 daemons, but more cleanup work to be done).
365 #6 Select one from among Multiple Storage Devices for Job.
366 This is already implemented in 1.37.
367 #7 Single Job Writing to Multiple Storage Devices. This is
368 currently implemented with a Clone feature.
369 #- Full multiple drive Autochanger support (done in 1.37)
370 #- Built in support for communications encryption (TLS)
371 done by Landon Fuller.
372 # Support for Unicode characters
373 (via UTF-8) on Win32 machines thanks to Thorsten Engel.
374 Item 8: Break the one-to-one Relationship between a Job and a
375 Specific Storage Device (or Devices if #10 is implemented).
377 Completed items from last year's list:
378 Item 1: Multiple simultaneous Jobs. (done)
379 Item 3: Write the bscan program -- also write a bcopy program (done).
380 Item 5: Implement Label templates (done).
381 Item 6: Write a regression script (done)
382 Item 9: Add SSL to daemon communications (done by Landon Fuller)
383 Item 10: Define definitive tape format (done)
384 Item 3: GUI for interactive restore. Partially Implemented in 1.34
385 Note, there is now a complete Webmin plugin, a partial
386 GNOME console, and an excellent wx-console GUI.
387 Item 4: GUI for interactive backup
388 Item 2: Job Data Spooling.
389 Done: Regular expression matching.
390 Item 10: New daemon communication protocol (this has been dropped).