3 Bacula Projects Roadmap
4 Prioritized by user vote 07 December 2005
5 Status updated 15 December 2006
8 Item 1: Implement data encryption (as opposed to comm encryption)
9 Item 2: Implement Migration that moves Jobs from one Pool to another.
10 Item 3: Accurate restoration of renamed/deleted files from
11 Item 4: Implement a Bacula GUI/management tool using Python.
12 Item 5: Implement Base jobs.
13 Item 6: Allow FD to initiate a backup
14 Item 7: Improve Bacula's tape and drive usage and cleaning management.
15 Item 8: Implement creation and maintenance of copy pools
16 Item 9: Implement new {Client}Run{Before|After}Job feature.
17 Item 10: Merge multiple backups (Synthetic Backup or Consolidation).
18 Item 11: Deletion of Disk-Based Bacula Volumes
19 Item 12: Directive/mode to backup only file changes, not entire file
20 Item 13: Multiple threads in file daemon for the same job
21 Item 14: Implement red/black binary tree routines.
22 Item 15: Add support for FileSets in user directories CACHEDIR.TAG
23 Item 16: Implement extraction of Win32 BackupWrite data.
24 Item 17: Implement a Python interface to the Bacula catalog.
25 Item 18: Archival (removal) of User Files to Tape
26 Item 19: Add Plug-ins to the FileSet Include statements.
27 Item 20: Implement more Python events in Bacula.
28 Item 21: Quick release of FD-SD connection after backup.
29 Item 22: Permit multiple Media Types in an Autochanger
30 Item 23: Allow different autochanger definitions for one autochanger.
31 Item 24: Automatic disabling of devices
32 Item 25: Implement huge exclude list support using hashing.
34 Items complete and to be released in version 1.40.0:
35 Item 1: Implement data encryption (as opposed to comm encryption)
36 Item 2: Implement Migration that moves Jobs from one Pool to another.
37 Item 9: Implement new {Client}Run{Before|After}Job feature.
38 Item 16: Implement extraction of Win32 BackupWrite data.
40 Items implemented but not tested and hence consequences are unknown:
41 Item 22: Permit multiple Media Types in an Autochanger
44 Below, you will find more information on future projects:
46 Item 1: Implement data encryption (as opposed to comm encryption)
48 Origin: Sponsored by Landon and 13 contributors to EFF.
49 Status: Done: Landon Fuller has implemented this in 1.39.x.
51 What: Currently the data that is stored on the Volume is not
52 encrypted. For confidentiality, encryption of data at
53 the File daemon level is essential.
54 Data encryption encrypts the data in the File daemon and
55 decrypts the data in the File daemon during a restore.
57 Why: Large sites require this.
59 Item 2: Implement Migration that moves Jobs from one Pool to another.
60 Origin: Sponsored by Riege Software International GmbH. Contact:
61 Daniel Holtkamp <holtkamp at riege dot com>
63 Status: Done. Completed in version 1.39.31 by Kern.
65 What: The ability to copy, move, or archive data that is on a
66 device to another device is very important.
68 Why: An ISP might want to backup to disk, but after 30 days
69 migrate the data to tape backup and delete it from
70 disk. Bacula should be able to handle this
71 automatically. It needs to know what was put where,
72 and when, and what to migrate -- it is a bit like
73 retention periods. Doing so would allow space to be
74 freed up for current backups while maintaining older
77 Notes: Riege Software have asked for the following migration
80 Highwater mark (stopped by Lowwater mark?)
82 Notes: Migration could be additionally triggered by:
86 Item 3: Accurate restoration of renamed/deleted files from
87 Incremental/Differential backups
88 Date: 28 November 2005
89 Origin: Martin Simmons (martin at lispworks dot com)
92 What: When restoring a fileset for a specified date (including "most
93 recent"), Bacula should give you exactly the files and directories
94 that existed at the time of the last backup prior to that date.
96 Currently this only works if the last backup was a Full backup.
97 When the last backup was Incremental/Differential, files and
98 directories that have been renamed or deleted since the last Full
99 backup are not currently restored correctly. Ditto for files with
100 extra/fewer hard links than at the time of the last Full backup.
102 Why: Incremental/Differential would be much more useful if this worked.
104 Notes: Item 10 (Merging of multiple backups into a single one) seems to
105 rely on this working, otherwise the merged backups will not be
106 truly equivalent to a Full backup.
108 Kern: notes shortened. This can be done without the need for
109 inodes. It is essentially the same as the current Verify job,
110 but one additional database record must be written, which does
111 not need any database change.
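
A minimal sketch of the idea in Python, assuming each Incremental/Differential also stores the complete list of paths that existed when it ran (one reading of the "additional database record" above). The data layout and function name are hypothetical and do not reflect Bacula's catalog schema.

```python
def accurate_restore_set(backups):
    """Return the files to restore, as a dict of path -> version id.

    backups is a list of (saved_files, present_paths), oldest first:
      saved_files   -- dict path -> version id of data saved by that job
      present_paths -- set of every path that existed when the job ran
    """
    latest = {}
    for saved_files, _present in backups:
        latest.update(saved_files)      # newer jobs override older ones
    _saved, present_at_end = backups[-1]
    # Drop anything renamed away or deleted before the last backup ran.
    return {p: v for p, v in latest.items() if p in present_at_end}


# /etc/b was renamed to /etc/c between the Full and the Incremental.
full = ({"/etc/a": "F1", "/etc/b": "F1"}, {"/etc/a", "/etc/b"})
incr = ({"/etc/c": "I1"}, {"/etc/a", "/etc/c"})
print(accurate_restore_set([full, incr]))
```

Without the present_paths record, the stale /etc/b from the Full backup would be restored alongside /etc/c.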
113 Kern: see if we can correct restoration of directories if
114 replace=ifnewer is set. Currently, if the directory does not
115 exist, a "dummy" directory is created, then when all the files
116 are updated, the dummy directory is newer so the real values
119 Item 4: Implement a Bacula GUI/management tool using Python.
121 Date: 28 October 2005
122 Status: Lucas is working on this for Python GTK+.
124 What: Implement a Bacula console, and management tools
125 using Python and Qt or GTK.
127 Why: Don't we already have a wxWidgets GUI? Yes, but
128 it is written in C++ and changes to the user interface
129 must be hand tailored using C++ code. By developing
130 the user interface using Qt designer, the interface
131 can be very easily updated and most of the new Python
132 code will be automatically created. The user interface
133 changes become very simple, and only the new features
134 must be implemented. In addition, the code will be in
135 Python, which will give many more users easy (or easier)
136 access to making additions or modifications.
138 Notes: This is currently being implemented using Python-GTK by
139 Lucas Di Pentima <lucas at lunix dot com dot ar>
141 Item 5: Implement Base jobs.
142 Date: 28 October 2005
146 What: A base job is sort of like a Full save except that you
147 will want the FileSet to contain only files that are
148 unlikely to change in the future (i.e. a snapshot of
149 most of your system after installing it). After the
150 base job has been run, when you are doing a Full save,
151 you specify one or more Base jobs to be used. All
152 files that have been backed up in the Base job/jobs but
153 not modified will then be excluded from the backup.
154 During a restore, the Base jobs will be automatically
155 pulled in where necessary.
157 Why: This is something none of the competition does, as far as
158 we know (except perhaps BackupPC, which is a Perl program that
159 saves to disk only). It is a big win for the user; it
160 makes Bacula stand out as offering a unique
161 optimization that immediately saves time and money.
162 Basically, imagine that you have 100 nearly identical
163 Windows or Linux machines containing the OS and user
164 files. Now for the OS part, a Base job will be backed
165 up once, and rather than making 100 copies of the OS,
166 there will be only one. If one or more of the systems
167 have some files updated, no problem, they will be
168 automatically restored.
170 Notes: Huge savings in tape usage even for a single machine.
171 Will require more resources because the DIR must send
172 FD a list of files/attribs, and the FD must search the
173 list and compare it for each file to be saved.
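
The per-file comparison described in these notes could look roughly like the following sketch. It assumes the DIR sends the Base job contents as a mapping from path to an attribute tuple; the attribute choice (size, mtime) and the function name are illustrative assumptions only.

```python
def files_to_save(current_files, base_files):
    """Return the paths a Full save must still back up.

    current_files -- dict path -> (size, mtime) as the FD sees them now
    base_files    -- dict path -> (size, mtime) sent by the DIR for the Base job(s)
    """
    return sorted(
        path for path, attribs in current_files.items()
        if base_files.get(path) != attribs   # new file, or changed since the Base job
    )


base = {"/bin/ls": (100, 1), "/etc/hosts": (50, 1)}
now = {"/bin/ls": (100, 1), "/etc/hosts": (60, 9), "/home/u/x": (5, 9)}
print(files_to_save(now, base))
```

Using a dictionary keeps each lookup close to constant time, which matters because the FD has to consult the list once for every file it considers.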
175 Item 6: Allow FD to initiate a backup
176 Origin: Frank Volf (frank at deze dot org)
177 Date: 17 November 2005
180 What: Provide some means, possibly by a restricted console that
181 allows a FD to initiate a backup, and that uses the connection
182 established by the FD to the Director for the backup so that
183 a Director that is firewalled can do the backup.
185 Why: Makes backup of laptops much easier.
187 Item 7: Improve Bacula's tape and drive usage and cleaning management.
188 Date: 8 November 2005, 11 November 2005
189 Origin: Adam Thornton <athornton at sinenomine dot net>,
190 Arno Lehmann <al at its-lehmann dot de>
193 What: Make Bacula manage tape life cycle information, tape reuse
194 times and drive cleaning cycles.
196 Why: All three parts of this project are important when operating
198 We need to know which tapes need replacement, and we need to
199 make sure the drives are cleaned when necessary. While many
200 tape libraries and even autoloaders can handle all this
201 automatically, support by Bacula can be helpful for smaller
202 (older) libraries and single drives. Limiting the number of
203 times a tape is used might prevent tape errors when using
204 tapes until the drives can't read them any more. Also, checking
205 drive status during operation can prevent some failures (as I
206 [Arno] had to learn the hard way...)
208 Notes: First, Bacula could (and even does, to some limited extent)
209 record tape and drive usage. For tapes, the number of mounts,
210 the amount of data, and the time the tape has actually been
211 running could be recorded. Data fields for Read and Write
212 time and Number of mounts already exist in the catalog (I'm
213 not sure if VolBytes is the sum of all bytes ever written to
214 that volume by Bacula). This information can be important
215 when determining which media to replace. The ability to mark
216 Volumes as "used up" after a given number of write cycles
217 should also be implemented so that a tape is never actually
218 worn out. For the tape drives known to Bacula, similar
219 information is interesting to determine the device status and
220 expected life time: Time it's been Reading and Writing, number
221 of tape Loads / Unloads / Errors. This information is not yet
222 recorded as far as I [Arno] know. A new volume status would
223 be necessary for the new state, like "Used up" or "Worn out".
224 Volumes with this state could be used for restores, but not
225 for writing. These volumes should be migrated first (assuming
226 migration is implemented) and, once they are no longer needed,
227 could be moved to a Trash pool.
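
As a rough illustration of the proposed "Used up" state, the following Python sketch retires a volume after a configured number of write cycles and then permits reads only. The status strings other than "Used up", the parameter names, and the idea of a per-volume cycle limit setting are assumptions made for the example.

```python
def volume_status(write_cycles, max_write_cycles, current_status="Append"):
    """Return the new status of a volume after a write cycle completes.

    max_write_cycles of 0 or None means "no limit" (hypothetical convention).
    """
    if max_write_cycles and write_cycles >= max_write_cycles:
        return "Used up"
    return current_status


def may_use(status, purpose):
    """purpose is "read" (restore) or "write" (backup)."""
    if status == "Used up":
        return purpose == "read"    # still valid for restores, never for writing
    return True


print(volume_status(100, 100), may_use("Used up", "write"))
```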
229 The next step would be to implement a drive cleaning setup.
230 Bacula already has knowledge about cleaning tapes. Once it
231 has some information about cleaning cycles (measured in drive
232 run time, number of tapes used, or calendar days, for example)
233 it can automatically execute tape cleaning (with an
234 autochanger, obviously) or ask for operator assistance loading
237 The final step would be to implement TAPEALERT checks not only
238 when changing tapes and only sending the information to the
239 administrator, but rather checking after each tape error,
240 checking on a regular basis (for example after each tape
241 file), and also before unloading and after loading a new tape.
242 Then, depending on the drive's TAPEALERT state and the known
243 drive cleaning state Bacula could automatically schedule later
244 cleaning, clean immediately, or inform the operator.
246 Implementing this would perhaps require another catalog change
247 and perhaps major changes in SD code and the DIR-SD protocol,
248 so I'd only consider this worth implementing if it would
249 actually be used or even needed by many people.
251 Implementation of these projects could happen in three distinct
252 sub-projects: Measuring Tape and Drive usage, retiring
253 volumes, and handling drive cleaning and TAPEALERTs.
255 Item 8: Implement creation and maintenance of copy pools
256 Date: 27 November 2005
257 Origin: David Boyes (dboyes at sinenomine dot net)
260 What: I would like Bacula to have the capability to write copies
261 of backed-up data on multiple physical volumes selected
262 from different pools without transferring the data
263 multiple times, and to accept any of the copy volumes
264 as valid for restore.
266 Why: In many cases, businesses are required to keep offsite
267 copies of backup volumes, or just wish for simple
268 protection against a human operator dropping a storage
269 volume and damaging it. The ability to generate multiple
270 volumes in the course of a single backup job allows
271 customers to simply check out one copy and send it
272 offsite, marking it as out of changer or otherwise
273 unavailable. Currently, the library and magazine
274 management capability in Bacula does not make this process
277 Restores would use the copy of the data on the first
278 available volume, in order of copy pool chain definition.
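
The restore rule above can be sketched as a simple ordered search. The pool names, the mapping of pools to volumes, and the function name are hypothetical; this is only meant to show the "first available, in chain order" behaviour.

```python
def pick_restore_volume(copy_chain, copies, available):
    """Pick the volume to restore from.

    copy_chain -- pool names in the order the copy pool chain defines them
    copies     -- dict pool name -> volume holding this job's data in that pool
    available  -- set of volume names currently mountable
    """
    for pool in copy_chain:
        volume = copies.get(pool)
        if volume is not None and volume in available:
            return volume
    return None     # no copy is reachable; operator intervention needed


chain = ["Primary", "Offsite"]
copies = {"Primary": "Vol0001", "Offsite": "Vol0101"}
print(pick_restore_volume(chain, copies, {"Vol0101"}))
```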
280 This is also a major scalability issue -- as the number of
281 clients increases beyond several thousand, and the volume
282 of data increases, transferring the data multiple times to
283 produce additional copies of the backups will become
284 physically impossible due to transfer speed
285 issues. Generating multiple copies at server side will
286 become the only practical option.
288 How: I suspect that this will require adding a multiplexing
289 SD that appears to be a SD to a specific FD, but 1-n FDs
290 to the specific back end SDs managing the primary and copy
291 pools. Storage pools will also need to acquire parameters
292 to define the pools to be used for copies.
294 Notes: I would commit some of my developers' time if we can agree
295 on the design and behavior.
297 Item 9: Implement new {Client}Run{Before|After}Job feature.
298 Date: 26 September 2005
299 Origin: Phil Stracchino
300 Status: Done. This has been implemented by Eric Bollengier
302 What: Some time ago, there was a discussion of RunAfterJob and
303 ClientRunAfterJob, and the fact that they do not run after failed
304 jobs. At the time, there was a suggestion to add a
305 RunAfterFailedJob directive (and, presumably, a matching
306 ClientRunAfterFailedJob directive), but to my knowledge these
307 were never implemented.
309 The current implementation does not make it easy to add new features.
311 An alternate way of approaching the problem has just occurred to
312 me. Suppose the RunBeforeJob and RunAfterJob directives were
313 expanded in a manner like this example:
316 Command = "/opt/bacula/etc/checkhost %c"
317 RunsOnClient = No # default
318 AbortJobOnError = Yes # default
322 Command = c:/bacula/systemstate.bat
330 Command = c:/bacula/deletestatefile.bat
335 It's now possible to specify more than 1 command per Job.
336 (you can stop your database and your webserver without a script)
341 JobDefs = "DefaultJob"
342 Write Bootstrap = "/tmp/bacula/var/bacula/working/Client1.bsr"
345 RunBeforeJob = "echo test before ; echo test before2"
346 RunBeforeJob = "echo test before (2nd time)"
347 RunBeforeJob = "echo test before (3rd time)"
348 RunAfterJob = "echo test after"
349 ClientRunAfterJob = "echo test after client"
352 Command = "echo test RunScript in error"
356 RunsWhen = After # never by default
359 Command = "echo test RunScript on success"
361 RunsOnSuccess = yes # default
362 RunsOnFailure = no # default
367 Why: It would be a significant change to the structure of the
368 directives, but allows for a lot more flexibility, including
369 RunAfter commands that will run regardless of whether the job
370 succeeds, or RunBefore tasks that still allow the job to run even
371 if that specific RunBefore fails.
373 Notes: (More notes from Phil, Kern, David and Eric)
374 I would prefer to have a single new Resource called
377 RunsWhen = After|Before|Always
378 RunsAtJobLevels = All|Full|Diff|Inc # not yet implemented
380 The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives
381 could be optional, and possibly RunsWhen as well.
383 AbortJobOnError would be ignored unless RunsWhen was set to Before
384 and would default to Yes if omitted.
385 If AbortJobOnError was set to No, failure of the script
386 would still generate a warning.
388 RunsOnSuccess would be ignored unless RunsWhen was set to After
389 (or RunsBeforeJob set to No), and default to Yes.
391 RunsOnFailure would be ignored unless RunsWhen was set to After,
394 Allow having the before/after status on the script command
395 line so that the same script can be used both before/after.
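
The interaction of these directives can be summarised in a short Python sketch. The defaults follow the examples above (RunsOnSuccess = yes, RunsOnFailure = no, AbortJobOnError = yes); treating "Always" as "both before and after" and applying the success/failure filters in that case are assumptions, not documented behaviour.

```python
def should_run(runs_when, phase, job_ok=True,
               runs_on_success=True, runs_on_failure=False):
    """Decide whether a RunScript fires.

    runs_when -- "Before", "After", "Always" or "Never"
    phase     -- "Before" or "After" (where the job currently is)
    job_ok    -- job outcome; only meaningful in the "After" phase
    """
    if runs_when == "Never":
        return False
    if phase == "Before":
        return runs_when in ("Before", "Always")
    if runs_when not in ("After", "Always"):
        return False
    return runs_on_success if job_ok else runs_on_failure


def job_may_start(script_exit_code, abort_job_on_error=True):
    """A failing Before script stops the job unless AbortJobOnError = No."""
    return script_exit_code == 0 or not abort_job_on_error


print(should_run("After", "After", job_ok=False, runs_on_failure=True))
```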
397 Item 10: Merge multiple backups (Synthetic Backup or Consolidation).
398 Origin: Marc Cousin and Eric Bollengier
399 Date: 15 November 2005
400 Status: Awaiting implementation. Depends on first implementing
401 project Item 2 (Migration).
403 What: A merged backup is a backup made without connecting to the Client.
404 It would be a Merge of existing backups into a single backup.
405 In effect, it is like a restore but to the backup medium.
407 For instance, say that last Sunday we made a full backup. Then
408 all week long, we created incremental backups, in order to do
409 them fast. Now comes Sunday again, and we need another full.
410 The merged backup makes it possible to do instead an incremental
411 backup (during the night for instance), and then create a merged
412 backup during the day, by using the full and incrementals from
413 the week. The merged backup will be exactly like a full made
414 Sunday night on the tape, but the production interruption on the
415 Client will be minimal, as the Client will only have to send
418 In fact, if it's done correctly, you could merge all the
419 Incrementals into a single Incremental, or all the Incrementals
420 and the last Differential into a new Differential, or the Full,
421 last differential and all the Incrementals into a new Full
422 backup. And there is no need to involve the Client.
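
The three merge cases listed above reduce to one rule: the merged job takes the highest level among its inputs. A small Python sketch of that rule follows; the level strings and function name are chosen for illustration.

```python
def merged_level(levels):
    """Return the level of the backup produced by merging jobs of these levels."""
    if not levels:
        raise ValueError("nothing to merge")
    if "Full" in levels:
        return "Full"
    if "Differential" in levels:
        return "Differential"
    return "Incremental"


print(merged_level(["Full", "Differential", "Incremental", "Incremental"]))
```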
424 Why: The benefits are:
425 - the Client just does an incremental;
426 - the merged backup on tape is just like a single full backup,
427 and can be restored very fast.
429 This is also a way of reducing the backup data since the old
430 data can then be pruned (or not) from the catalog, possibly
431 allowing older volumes to be recycled.
433 Item 11: Deletion of Disk-Based Bacula Volumes
435 Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
439 What: Provide a way for Bacula to automatically remove Volumes
440 from the filesystem, or optionally to truncate them.
441 Obviously, the Volume must be pruned prior to removal.
443 Why: This would allow users more control over their Volumes and
444 prevent disk based volumes from consuming too much space.
446 Notes: The following two directives might do the trick:
448 Volume Data Retention = <time period>
449 Remove Volume After = <time period>
451 The migration project should also remove a Volume that is
452 migrated. This might also work for tape Volumes.
454 Item 12: Directive/mode to backup only file changes, not entire file
455 Date: 11 November 2005
456 Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
457 Marek Bajon <mbajon at bimsplus dot com dot pl>
460 What: Currently when a file changes, the entire file will be backed up in
461 the next incremental or full backup. To save space on the tapes
462 it would be nice to have a mode whereby only the changes to the
463 file would be backed up when it is changed.
465 Why: This would save lots of space when backing up large files such as
466 logs, mbox files, Outlook PST files and the like.
468 Notes: This would require the usage of disk-based volumes as comparing
469 files would not be feasible using a tape drive.
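
One possible variant, shown here as a Python sketch, avoids reading the old file back by keeping a digest per fixed-size block from the previous backup. This swaps direct file comparison for stored digests, so it is an assumption rather than the approach described in the notes. The block size and function names are hypothetical, and the sketch only suits files that mostly grow (logs, mbox); it does not record truncation.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size


def block_digests(data, block_size=BLOCK_SIZE):
    """Digest of each fixed-size block; would be stored with the previous backup."""
    return [hashlib.sha1(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return (block index, bytes) for blocks that differ from the last backup."""
    changes = []
    for idx, digest in enumerate(block_digests(new_data, block_size)):
        if idx >= len(old_digests) or old_digests[idx] != digest:
            start = idx * block_size
            changes.append((idx, new_data[start:start + block_size]))
    return changes


old = b"a" * 8192
new = old + b"new log line\n"
print(changed_blocks(block_digests(old), new))
```

Here only the third block, holding the appended line, would need to be written to the volume.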
471 Item 13: Multiple threads in file daemon for the same job
472 Date: 27 November 2005
473 Origin: Ove Risberg (Ove.Risberg at octocode dot com)
476 What: I want the file daemon to start multiple threads for a backup
477 job so the fastest possible backup can be made.
479 The file daemon could parse the FileSet information and start
480 one thread for each File entry located on a separate
483 A configuration option in the job section should be used to
484 enable or disable this feature. The configuration option could
485 specify the maximum number of threads in the file daemon.
487 If the threads could spool the data to separate spool files
488 the restore process will not be much slower.
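
A minimal Python sketch of this scheme: group the File entries, then hand each group to a worker from a bounded thread pool. Grouping by device is an assumption about what "separate" refers to above; device_of and backup_group are caller-supplied placeholders (a real FD might use os.stat(path).st_dev for the former).

```python
from concurrent.futures import ThreadPoolExecutor


def backup_in_parallel(file_entries, backup_group, max_threads, device_of):
    """Back up each group of File entries in its own thread.

    file_entries -- iterable of paths from the FileSet
    backup_group -- callable taking a list of paths; does the actual backup
    max_threads  -- upper bound on concurrent threads (the configuration option)
    device_of    -- callable mapping a path to a grouping key (assumed: its device)
    """
    groups = {}
    for entry in file_entries:
        groups.setdefault(device_of(entry), []).append(entry)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in the same order as the groups were created
        return list(pool.map(backup_group, groups.values()))


entries = ["/disk1/home", "/disk1/var", "/disk2/srv"]
print(backup_in_parallel(entries, len, 2, lambda p: p.split("/")[1]))
```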
490 Why: Multiple concurrent backups of a large fileserver with many
491 disks and controllers will be much faster.
493 Notes: I am willing to try to implement this but I will probably
494 need some help and advice. (No problem -- Kern)
496 Item 14: Implement red/black binary tree routines.
497 Date: 28 October 2005
499 Status: Class code is complete. Code needs to be integrated into
502 What: Implement a red/black binary tree class. This could
503 then replace the current binary insert/search routines
504 used in the restore in memory tree. This could significantly
505 speed up the creation of the in memory restore tree.
507 Why: Performance enhancement.
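
For readers unfamiliar with the data structure, here is a compact Python sketch of insertion and lookup in a left-leaning red/black tree (Sedgewick's variant, chosen because it is short). It is not the Bacula class mentioned in the Status line, and deletion is omitted.

```python
RED, BLACK = True, False


class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None
        self.color = RED            # new nodes are always red


def is_red(node):
    return node is not None and node.color == RED


def rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x


def rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x


def flip_colors(h):
    h.color = RED
    h.left.color = h.right.color = BLACK


def _insert(h, key, value):
    if h is None:
        return Node(key, value)
    if key < h.key:
        h.left = _insert(h.left, key, value)
    elif key > h.key:
        h.right = _insert(h.right, key, value)
    else:
        h.value = value
    # Restore the left-leaning red/black invariants on the way back up.
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):
        flip_colors(h)
    return h


def insert(root, key, value):
    root = _insert(root, key, value)
    root.color = BLACK
    return root


def search(root, key):
    node = root
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return node.value
    return None


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))
```

Inserting keys in sorted order, the worst case for a plain binary tree, still leaves the height logarithmic, which is the property that would speed up building the in-memory restore tree.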
509 Item 15: Add support for FileSets in user directories CACHEDIR.TAG
510 Origin: Norbert Kiesel <nkiesel at tbdnetworks dot com>
511 Date: 21 November 2005
512 Status: (I think this is better done using a Python event that I
513 will implement in version 1.39.x).
515 What: CACHEDIR.TAG is a proposal for identifying directories which
516 should be ignored for archiving/backup. It works by ignoring
517 directory trees which have a file named CACHEDIR.TAG with a
518 specific content. See
519 http://www.brynosaurus.com/cachedir/spec.html
523 I suggest that if this is implemented (I've also asked for this
524 feature some years ago) that it is made compatible with Legato
525 Networker's ".nsr" files where you can specify a lot of options on
526 how to handle files/directories (including denying further
527 parsing of .nsr files lower down into the directory trees). A
528 PDF version of the .nsr man page can be viewed at:
530 http://www.ifm.liu.se/~peter/nsr.pdf
532 Why: It's a nice alternative to "exclude" patterns for directories
533 which don't have regular pathnames. Also, it allows users to
534 control backup for themselves. Implementation should be pretty
535 simple. GNU tar >= 1.14 or so supports it, too.
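
To show how small the check is, here is a Python sketch of a directory walk that honours CACHEDIR.TAG. The signature string is the one given in the specification linked above; the function names are illustrative and this is not Bacula code.

```python
import os

# First bytes a valid tag file must contain, per the CACHEDIR.TAG spec.
SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_cache_dir(directory):
    """True if the directory holds a CACHEDIR.TAG file with the right signature."""
    tag = os.path.join(directory, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as fh:
            return fh.read(len(SIGNATURE)) == SIGNATURE
    except OSError:
        return False


def walk_for_backup(top):
    """Yield files under top, skipping every tagged directory tree."""
    for dirpath, dirnames, filenames in os.walk(top):
        if is_cache_dir(dirpath):
            dirnames[:] = []        # prune: do not descend any further
            continue
        for name in filenames:
            yield os.path.join(dirpath, name)
```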
537 Notes: I envision this as an optional feature to a fileset
541 Item 16: Implement extraction of Win32 BackupWrite data.
542 Origin: Thorsten Engel <thorsten.engel at matrix-computer dot com>
543 Date: 28 October 2005
544 Status: Done. Assigned to Thorsten. Implemented in current CVS
546 What: This provides the Bacula File daemon with code that
547 can pick apart the stream output that Microsoft writes
548 for BackupWrite data, and thus the data can be read
549 and restored on non-Win32 machines.
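
As an illustration of what "picking apart" the stream involves, the Python sketch below walks a buffer of concatenated streams using the WIN32_STREAM_ID header layout documented by Microsoft (stream id, attributes, 64-bit size, name size, then name and data). It assumes a packed little-endian header and is not the code Thorsten wrote for Bacula.

```python
import struct

# dwStreamId, dwStreamAttributes, Size (64-bit), dwStreamNameSize
HEADER = struct.Struct("<IIqI")
BACKUP_DATA = 0x00000001    # stream id of the ordinary file contents


def iter_streams(blob):
    """Yield (stream id, stream name, data) for each stream in the buffer."""
    pos = 0
    while pos + HEADER.size <= len(blob):
        stream_id, _attrs, size, name_size = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        name = blob[pos:pos + name_size].decode("utf-16-le")
        pos += name_size
        yield stream_id, name, blob[pos:pos + size]
        pos += size


def extract_file_data(blob):
    """Keep only the plain file data, dropping security, EA and other streams."""
    return b"".join(data for sid, _name, data in iter_streams(blob)
                    if sid == BACKUP_DATA)


sample = HEADER.pack(2, 0, 3, 0) + b"EA!" + HEADER.pack(1, 0, 5, 0) + b"hello"
print(extract_file_data(sample))
```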
551 Why: BackupWrite data is the portable=no option in Win32
552 FileSets, and in previous Baculas, this data could
553 only be extracted using a Win32 FD. With this new code,
554 the Windows data can be extracted and restored on
558 Item 17: Implement a Python interface to the Bacula catalog.
559 Date: 28 October 2005
563 What: Implement an interface for Python scripts to access
564 the catalog through Bacula.
566 Why: This will permit users to customize Bacula through
569 Item 18: Archival (removal) of User Files to Tape
573 Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
576 What: The ability to archive data to storage based on certain parameters
577 such as age, size, or location. Once the data has been written to
578 storage and logged it is then pruned from the originating
579 filesystem. Note! We are talking about users' files and not
582 Why: This would allow fully automatic storage management which becomes
583 useful for large datastores. It would also allow for auto-staging
584 from one media type to another.
586 Example 1) Medical imaging needs to store large amounts of data.
587 They decide to keep data on their servers for 6 months and then put
588 it away for long term storage. The server then finds all files
589 older than 6 months and writes them to tape. The files are then removed
592 Example 2) All data that hasn't been accessed in 2 months could be
593 moved from high-cost, fibre-channel disk storage to a low-cost
594 large-capacity SATA disk storage pool which doesn't have as quick an
595 access time. Then after another 6 months (or possibly as one
596 storage pool gets full) data is migrated to Tape.
598 Item 19: Add Plug-ins to the FileSet Include statements.
599 Date: 28 October 2005
601 Status: Partially coded in 1.37 -- much more to do.
603 What: Allow users to specify wild-card and/or regular
604 expressions to be matched in both the Include and
605 Exclude directives in a FileSet. At the same time,
606 allow users to define plug-ins to be called (based on
607 regular expression/wild-card matching).
609 Why: This would give the users the ultimate ability to control
610 how files are backed up/restored. A user could write a
611 plug-in that knows how to backup his Oracle database without
612 stopping/starting it, for example.
614 Item 20: Implement more Python events in Bacula.
615 Date: 28 October 2005
619 What: Allow Python scripts to be called at more places
620 within Bacula and provide additional access to Bacula
623 Why: This will permit users to customize Bacula through
631 Also add a way to get a listing of currently running
632 jobs (possibly also scheduled jobs).
635 Item 21: Quick release of FD-SD connection after backup.
636 Origin: Frank Volf (frank at deze dot org)
637 Date: 17 November 2005
640 What: In the Bacula implementation a backup is finished after all data
641 and attributes are successfully written to storage. When using a
642 tape backup it is very annoying that a backup can take a day,
643 simply because the current tape (or whatever) is full and the
644 administrator has not put a new one in. During that time the
645 system cannot be taken off-line, because there is still an open
646 session between the storage daemon and the file daemon on the
649 Although this is a very good strategy for making "safe backups",
650 it can be annoying for e.g. laptops, which must remain
651 connected until the backup is completed.
653 Using a new feature called "migration" it will be possible to
654 spool first to harddisk (using a special 'spool' migration
655 scheme) and then migrate the backup to tape.
657 There is still the problem of getting the attributes committed.
658 If it takes a very long time to do, with the current code, the
659 job has not terminated, and the File daemon is not freed up. The
660 Storage daemon should release the File daemon as soon as all the
661 file data and all the attributes have been sent to it (the SD).
662 Currently the SD waits until everything is on tape and all the
663 attributes are transmitted to the Director before signaling
664 completion to the FD. I don't think I would have any problem
665 changing this. The reason is that even if the FD reports back to
666 the Dir that all is OK, the job will not terminate until the SD
667 has done the same thing -- so in a way keeping the SD-FD link
668 open to the very end is not really very productive ...
670 Why: Makes backup of laptops much easier.
672 Item 22: Permit multiple Media Types in an Autochanger
674 Status: Done. Implemented in 1.38.9 (I think).
676 What: Modify the Storage daemon so that multiple Media Types
677 can be specified in an autochanger. This would be somewhat
678 of a simplistic implementation in that each drive would
679 still be allowed to have only one Media Type. However,
680 the Storage daemon will ensure that only a drive with
681 the Media Type that matches what the Director specifies
684 Why: This will permit users with several different drive types
685 to make full use of their autochangers.
687 Item 23: Allow different autochanger definitions for one autochanger.
688 Date: 28 October 2005
692 What: Currently, the autochanger script is locked based on
693 the autochanger. That is, if multiple drives are being
694 simultaneously used, the Storage daemon ensures that only
695 one drive at a time can access the mtx-changer script.
696 This change would base the locking on the control device,
697 rather than the autochanger. It would then permit two autochanger
698 definitions for the same autochanger, but with different
699 drives. Logically, the autochanger could then be "partitioned"
700 for different jobs, clients, or class of jobs, and if the locking
701 is based on the control device (e.g. /dev/sg0) the mtx-changer
702 script will be locked appropriately.
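
The change amounts to keying the lock on the control device instead of on the autochanger definition. A small Python sketch of that idea follows, with hypothetical names; the Storage daemon itself is written in C and its real locking code is not shown here.

```python
import threading

_locks = {}                         # control device path -> lock
_locks_guard = threading.Lock()     # protects the dictionary itself


def changer_lock(control_device):
    """Return the single lock shared by every definition using this device."""
    with _locks_guard:
        return _locks.setdefault(control_device, threading.Lock())


# Two autochanger definitions that partition one physical changer
# (/dev/sg0) end up serialising on the same lock.
lock_a = changer_lock("/dev/sg0")   # e.g. definition covering drives 0-1
lock_b = changer_lock("/dev/sg0")   # e.g. definition covering drives 2-3
with lock_a:
    pass                            # mtx-changer would be run here
print(lock_a is lock_b)
```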
704 Why: This will permit users to partition autochangers for specific
705 use. It would also permit implementation of multiple Media
706 Types with no changes to the Storage daemon.
708 Item 24: Automatic disabling of devices
710 Origin: Peter Eriksson <peter at ifm.liu dot se>
713 What: After a configurable number of fatal errors with a tape drive
714 Bacula should automatically disable further use of a certain
715 tape drive. There should also be "disable"/"enable" commands in
718 Why: On a multi-drive jukebox there is a possibility of tape drives
719 going bad during large backups (needing a cleaning tape run,
720 tapes getting stuck). It would be advantageous if Bacula would
721 automatically disable further use of a problematic tape drive
722 after a configurable number of errors has occurred.
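
The requested behaviour can be sketched in a few lines of Python: count fatal errors per drive, disable the drive at a threshold, and let job scheduling skip disabled drives. The class, method names, and default threshold are hypothetical.

```python
class Drive:
    def __init__(self, name, max_fatal_errors=3):
        self.name = name
        self.max_fatal_errors = max_fatal_errors    # the configurable limit
        self.fatal_errors = 0
        self.enabled = True

    def record_fatal_error(self):
        self.fatal_errors += 1
        if self.fatal_errors >= self.max_fatal_errors:
            self.enabled = False    # automatic disable

    def disable(self):              # manual "disable" command
        self.enabled = False

    def enable(self):               # manual "enable" command; clears the counter
        self.enabled = True
        self.fatal_errors = 0


def pick_drive(drives):
    """Return the first enabled drive, or None if every drive is disabled."""
    for drive in drives:
        if drive.enabled:
            return drive
    return None
```

With this in place, a stuck tape costs at most max_fatal_errors failed jobs before new jobs are routed to the remaining drives.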
724 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
725 where tapes occasionally get stuck inside the drive. Bacula will
726 notice that the "mtx-changer" command will fail and then fail
727 any backup jobs trying to use that drive. However, it will still
728 keep on trying to run new jobs using that drive and fail -
729 forever, and thus failing lots and lots of jobs... Since we have
730 many drives Bacula could have just automatically disabled
731 further use of that drive and used one of the other ones
734 Item 25: Implement huge exclude list support using hashing.
735 Date: 28 October 2005
739 What: Allow users to specify a very large exclude list (currently
740 more than about 1000 files is too many).
742 Why: This would give the users the ability to exclude all
743 files that are loaded with the OS (e.g. using rpms
744 or debs). If the user can restore the base OS from
745 CDs, there is no need to backup all those files. A
746 complete restore would be to restore the base OS, then
747 do a Bacula restore. By excluding the base OS files, the
748 backup set will be *much* smaller.
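
A hashed exclude list makes each membership test roughly constant time regardless of list size. The Python sketch below uses the built-in set type to stand in for the hash table; the function names and the sample package file list are illustrative.

```python
def build_exclude_set(paths):
    """Hash the exclude list once, e.g. from a package manager's file list."""
    return frozenset(paths)


def filter_excluded(candidates, excluded):
    """Keep only the files that are not in the exclude set."""
    return [path for path in candidates if path not in excluded]


os_files = build_exclude_set(["/bin/ls", "/bin/cp", "/usr/lib/libc.so"])
print(filter_excluded(["/bin/ls", "/home/u/notes.txt"], os_files))
```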
751 ============= Empty Feature Request form ===========
752 Item n: One line summary ...
754 Origin: Name and email of originator.
757 What: More detailed explanation ...
759 Why: Why it is important ...
761 Notes: Additional notes or features (omit if not used)
762 ============== End Feature Request form ==============
765 ===============================================
766 Feature requests submitted after cutoff for December 2005 vote
767 and not yet discussed.
768 ===============================================
769 Item n: Allow skipping execution of Jobs
770 Date: 29 November 2005
771 Origin: Florian Schnabel <florian.schnabel at docufy dot de>
774 What: An easy option to skip a certain job on a certain date.
775 Why: You could then easily skip tape backups on holidays. Especially
776 if you have no autochanger and can only fit one backup on a tape
777 that would be really handy, other jobs could proceed normally
778 and you won't get errors that way.
780 ===================================================
784 Origin: calvin streeting calvin at absentdream dot com
787 What: The ability to archive to media (DVD/CD) in an uncompressed format
788 for dead filing (archiving, not backing up).
790 Why: At my workplace, finished jobs are moved off the main file
791 servers (RAID-based systems) onto a simple Linux file server (IDE-based
792 system) so users can find old information without contacting the IT
795 So this data doesn't really change; it only gets added to.
796 But it also needs backing up. At the moment it takes
797 about 8 hours to back up our servers (working data), so
798 rather than add more time to existing backups I am trying
799 to implement a system where we back up the archive data to
800 CD/DVD. These disks would only need to be appended to
801 (burn only new/changed files to new disks for off-site
802 storage). Basically, understand the difference between
803 archive data and live data.
805 Notes: Scan the data and email me when it needs burning;
806 divide it into predefined chunks;
807 keep a record of what is on what disk;
808 make me a label (simple php->mysql=>pdf stuff; I could do this bit);
809 save data uncompressed so it can be read on any other system (future-proof data);
810 save the catalog with the disk as some kind of menu
813 Item : Tray monitor window cleanups
814 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
817 What: Resizeable and scrollable windows in the tray monitor.
819 Why: With multiple clients, or with many jobs running, the displayed
820 window often ends up larger than the available screen, making
821 the trailing items difficult to read.
825 Item : Clustered file-daemons
826 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
829 What: A "virtual" filedaemon, which is actually a cluster of real ones.
831 Why: In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
832 multiple machines may have access to the same set of filesystems
834 For performance reasons, one may wish to initiate backups from
835 several of these machines simultaneously, instead of just using
836 one backup source for the common clustered filesystem.
838 For obvious reasons, normally backups of $A-FD/$PATH and
839 $B-FD/$PATH are treated as different backup sets. In this case
840 they are the same communal set.
842 Likewise when restoring, it would be easier to just specify
843 one of the cluster machines and let bacula decide which to use.
845 This can be faked to some extent using DNS round robin entries
846 and a virtual IP address, however it means "status client" will
847 always give bogus answers. Additionally there is no way of
848 spreading the load evenly among the servers.
850 What is required is something similar to the storage daemon
851 autochanger directives, so that Bacula can keep track of
852 operating backups/restores and direct new jobs to a "free"
869 Item: Commercial database support
870 Origin: Russell Howe <russell_howe dot wreckage dot org>
874 What: It would be nice for the database backend to support more
875 databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
876 DB2, MaxDB, etc are all candidates. SQL Server would presumably be
877 implemented using FreeTDS or maybe an ODBC library?
879 Why: We only really have one database server, which is MS SQL Server
880 2000. Maintaining a second one for the backup software is a burden (we
881 grew out of SQLite, which I liked, but which didn't work so well with our
882 database size). We don't really have a machine with the resources to run
883 postgres, and would rather only maintain a single DBMS. We're stuck with
884 SQL Server because pretty much all the company's custom applications
885 (written by consultants) are locked into SQL Server 2000. I can imagine
886 this scenario is fairly common, and it would be nice to use the existing
887 properly specced database server for storing Bacula's catalog, rather
888 than having to run a second DBMS.
891 Item n: Split documentation
892 Origin: Maxx <maxxatworkat gmail dot com>
896 What: Split documentation in several books
898 Why: The Bacula manual now has more than 600 pages, and looking for
899 implementation details is getting complicated. I think
900 it would be good to split the single volume in two or
903 1) Introduction, requirements and tutorial, which are typically
904 useful only until first installation time
906 2) Basic installation and configuration, with all the
907 gory details about the directives supported
908 3) Advanced Bacula: testing, troubleshooting, GUI and
909 ancillary programs, security management, scripting,
914 Item n: Include an option to operate on all pools when doing
915 update vol parameters
917 Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
921 What: When I do update -> Volume parameters -> All Volumes
922 from Pool, then I have to select pools one by one. I'd like
923 console to have an option like "0: All Pools" in the list of
926 Why: I have many pools and am therefore unhappy with manually
927 updating each of them using update -> Volume parameters -> All
928 Volumes from Pool -> pool #.
930 Item n: Automatic promotion of backup levels
931 Date: 19 January 2006
932 Origin: Adam Thornton <athornton@sinenomine.net>
935 What: Amanda has a feature whereby it estimates the space that a
936 differential, incremental, and full backup would take. If the
937 difference in space required between the scheduled level and the next
938 level up is beneath some user-defined critical threshold, the backup
939 level is bumped to the next type. Doing this minimizes the number of
940 volumes necessary during a restore, with a fairly minimal cost in
943 Why: I know at least one (quite sophisticated and smart) user
944 for whom the absence of this feature is a deal-breaker in terms of
945 using Bacula; if we had it it would eliminate the one cool thing
946 Amanda can do and we can't (at least, the one cool thing I know of).
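
A minimal sketch of the promotion rule described above, assuming per-level
size estimates are already available. The function name, the level ordering
list and the threshold parameter are hypothetical; they are not existing
Bacula or Amanda interfaces.

```python
# Hypothetical sketch; none of these names exist in Bacula or Amanda.
LEVELS = ["incremental", "differential", "full"]  # ordered lowest to highest


def promote_level(scheduled, estimates, threshold_bytes):
    """Return the level to run.

    The scheduled level is bumped to the next level up when the extra
    space that level would need is below the user-defined threshold.
    `estimates` maps each level name to its estimated size in bytes.
    """
    idx = LEVELS.index(scheduled)
    if idx == len(LEVELS) - 1:
        return scheduled  # already a full backup, nothing to promote to
    next_level = LEVELS[idx + 1]
    extra = estimates[next_level] - estimates[scheduled]
    return next_level if extra < threshold_bytes else scheduled
```

This sketch promotes by at most one level per decision, which matches the
"bumped to the next type" wording; whether repeated promotion should be
allowed is left open here.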
951 Item n+1: Incorporation of XACML2/SAML2 parsing
952 Date: 19 January 2006
953 Origin: Adam Thornton <athornton@sinenomine.net>
956 What: XACML is the "eXtensible Access Control Markup Language" and
957 SAML is the "Security Assertion Markup Language"--an XML standard
958 for making statements about identity and authorization. Having these
959 would give us a framework to approach ACLs in a generic manner, and
960 in a way flexible enough to support the four major sorts of ACLs I
961 see as a concern to Bacula at this point, as well as (probably) to
962 deal with new sorts of ACLs that may appear in the future.
964 Why: Bacula is beginning to need to back up systems with ACLs
965 that do not map cleanly onto traditional Unix permissions. I see
966 four sets of ACLs--in general, mutually incompatible with one
967 another--that we're going to need to deal with. These are: NTFS
968 ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
969 relevance of AFS; AFS is one of Sine Nomine's core consulting
970 businesses, and having a reputable file-level backup and restore
971 technology for it (as Tivoli is probably going to drop AFS support
972 soon since IBM no longer supports AFS) would be of huge benefit to
973 our customers; we'd most likely create the AFS support at Sine Nomine
974 for inclusion into the Bacula (and perhaps some changes to the
975 OpenAFS volserver) core code.)
977 Now, obviously, Bacula already handles NTFS just fine. However, I
978 think there's a lot of value in implementing a generic ACL model, so
979 that it's easy to support whatever particular instances of ACLs come
980 down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
981 things arriving in the Linux world in a big way in the near future.
982 XACML, although overcomplicated for our needs, provides this
983 framework, and we should be able to leverage other people's
984 implementations to minimize the amount of work *we* have to do to get
985 a generic ACL framework. Basically, the costs of implementation are
986 high, but they're largely both external to Bacula and already sunk.
988 Item 1: Add an over-ride in the Schedule configuration to use a
989 different pool for different backup types.
992 Origin: Chad Slater <chad.slater@clickfox.com>
995 What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
996 would help those of us who use different storage devices for different
997 backup levels cope with the "auto-upgrade" of a backup.
999 Why: Assume I add several new devices to be backed up, i.e. several
1000 hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
1001 stored in a disk set on a 2TB RAID. If you add these devices in the
1002 middle of the month, the incrementals are upgraded to "full" backups,
1003 but they try to use the same storage device as requested in the
1004 incremental job, filling up the RAID holding the differentials. If we
1005 could override the Storage parameter for full and/or differential
1006 backups, then the Full job would use the proper Storage device, which
1007 has more capacity (e.g. an 8TB tape library).
1010 Item: Implement multiple numeric backup levels as supported by dump
1012 Origin: Daniel Rich <drich@employees.org>
1014 What: Dump allows specification of backup levels numerically instead of just
1015 "full", "incr", and "diff". In this system, at any given level, all
1016 files are backed up that were modified since the last backup of a
1017 higher level (with 0 being the highest and 9 being the lowest). A
1018 level 0 is therefore equivalent to a full, level 9 an incremental, and
1019 the levels 1 through 8 are varying levels of differentials. For
1020 bacula's sake, these could be represented as "full", "incr", and
1021 "diff1", "diff2", etc.
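
A minimal sketch of the level rule described above, assuming the backup
history is available as (timestamp, level) pairs. The function name and the
history format are hypothetical and only illustrate the rule; they are not
part of Bacula or dump.

```python
def reference_backup(history, level):
    """Return the timestamp a level-`level` backup is taken relative to.

    `history` is a list of (timestamp, level) tuples for earlier backups.
    Files modified since the most recent backup of a higher level (that
    is, a numerically lower level number) are backed up. Returns None when
    no such backup exists, meaning everything must be backed up.
    """
    candidates = [ts for ts, lvl in history if lvl < level]
    return max(candidates) if candidates else None
```

With a history of level 0 at time 100, level 5 at time 200 and level 3 at
time 300, a level 4 backup is taken relative to time 300, a level 3 backup
relative to time 100, and a level 0 backup has no reference at all.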
1023 Why: Support of multiple backup levels would provide for more advanced backup
1024 rotation schemes such as "Towers of Hanoi". This would allow better
1025 flexibility in performing backups, and can lead to shorter recover
1028 Notes: Legato Networker supports a similar system with full, incr, and 1-9 as
1031 Kern notes: I think this would add very little functionality, but a *lot* of
1032 additional overhead to Bacula.
1034 Item 1: include JobID in spool file name
1035 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1036 Date: Tue Aug 22 17:13:39 EDT 2006
1039 What: Change the name of the spool file to include the JobID
1041 Why: JobIDs are the common key used to refer to jobs, yet the
1042 spoolfile name doesn't include that information. The date/time
1043 stamp is useful (and should be retained).
1047 Item 2: include timestamp of job launch in "stat clients" output
1048 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1049 Date: Tue Aug 22 17:13:39 EDT 2006
1052 What: The "stat clients" command doesn't include any detail on when
1053 the active backup jobs were launched.
1055 Why: Including the timestamp would make it much easier to decide whether
1056 a job is running properly.
1058 Notes: It may be helpful to have the output from "stat clients" formatted
1059 more like that from "stat dir" (and other commands), in a column
1060 format. The per-client information that's currently shown (level,
1061 client name, JobId, Volume, pool, device, Files, etc.) is good, but
1062 somewhat hard to parse (both programmatically and visually),
1063 particularly when there are many active clients.
1065 Item 1: Filesystemwatch triggered backup.
1066 Date: 31 August 2006
1067 Origin: Jesper Krogh <jesper@krogh.cc>
1068 Status: Unimplemented, depends probably on "client initiated backups"
1070 What: With inotify and similar filesystem-triggered notification
1071 systems it is possible to have the file daemon monitor
1072 filesystem changes and initiate a backup.
1074 Why: There are 2 situations where this is nice to have.
1075 1) It is possible to get a much finer-grained backup than
1076 the fixed schedules used now. A file created and deleted
1077 a few hours later can automatically be caught.
1079 2) The introduced load on the system will probably be
1080 distributed more evenly.
1082 Notes: This can be combined with configuration that specifies
1083 something like: "at most every 15 minutes or when changes
1086 Item n: Message mailing based on backup types
1087 Origin: Evan Kaufman <evan.kaufman@gmail.com>
1088 Date: January 6, 2006
1091 What: In the "Messages" resource definitions, allowing messages
1092 to be mailed based on the type (backup, restore, etc.) and level
1093 (full, differential, etc) of job that created the originating
1096 Why: It would, for example, allow someone's boss to be emailed
1097 automatically only when a Full Backup job runs, so he can
1098 retrieve the tapes for offsite storage, even if the IT dept.
1099 doesn't (or can't) explicitly notify him. At the same time, his
1100 mailbox wouldn't be filled by notifications of Verifies, Restores,
1101 or Incremental/Differential Backups (which would likely be kept
1105 One way this could be done is through additional message types, for example:
1108 # email the boss only on full system backups
1109 Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
1111 # email us only when something breaks
1112 MailOnError = itdept@mycompany.com = all
1116 Item n: Allow inclusion/exclusion of files in a fileset by creation/mod times
1117 Origin: Evan Kaufman <evan.kaufman@gmail.com>
1118 Date: January 11, 2006
1121 What: In the vein of the Wild and Regex directives in a Fileset's
1122 Options, it would be helpful to allow a user to include or exclude
1123 files and directories by creation or modification times.
1125 You could factor the Exclude=yes|no option in much the same way it
1126 affects the Wild and Regex directives. For example, you could exclude
1127 all files modified before a certain date:
1131 Modified Before = ####
1134 Or you could exclude all files created/modified since a certain date:
1138 Created Modified Since = ####
1141 The format of the time/date could be done several ways, say the number
1142 of seconds since the epoch:
1143 1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`
1145 Or a human readable date in a cryptic form:
1146 20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss
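
A minimal sketch of how both date formats above could be accepted and then
compared against a file time. The function names are hypothetical, and
treating the 14-digit form as local time is an assumption; the request does
not say which time zone is intended.

```python
from datetime import datetime


def parse_cutoff(value):
    """Parse a cutoff into seconds since the epoch.

    Accepts either a count of seconds since the epoch or a 14-digit
    YYYYMMDDhhmmss string. The 14-digit form is assumed to be local time.
    """
    if len(value) == 14 and value.isdigit():
        return datetime.strptime(value, "%Y%m%d%H%M%S").timestamp()
    return float(int(value))


def matches(file_time, cutoff, relation):
    """Return True when file_time falls 'before' or 'since' the cutoff."""
    if relation == "before":
        return file_time < cutoff
    if relation == "since":
        return file_time >= cutoff
    raise ValueError("relation must be 'before' or 'since'")
```

A file whose mtime (for example from os.stat) satisfies
matches(mtime, parse_cutoff("20060111134233"), "before") would be the
candidate for exclusion in the first example above.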
1148 Why: I imagine a feature like this could have many uses. It would
1149 allow a user to do a full backup while excluding the base operating
1150 system files, so if I installed a Linux snapshot from a CD yesterday,
1151 I'll *exclude* all files modified *before* today. If I need to
1152 recover the system, I use the CD I already have, plus the tape backup.
1153 Or if, say, a Windows client is hit by a particularly corrosive
1154 virus, and I need to *exclude* any files created/modified *since* the
1157 Notes: Of course, this feature would work in concert with other
1158 in/exclude rules, and wouldn't override them (or each other).
1160 Notes: The directives I'd imagine would be along the lines of
1161 "[Created] [Modified] [Before|Since] = <date>".
1162 So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
1166 Item: Implement support for stacking arbitrary stream filters, sinks.
1167 Date: 23 November 2006
1168 Origin: Landon Fuller <landonf@threerings.net>
1169 Status: Planning. Assigned to landonf.
1172 Implement support for the following:
1173 - Stacking arbitrary stream filters (eg, encryption, compression,
1174 sparse data handling)
1175 - Attaching file sinks to terminate stream filters (ie, write out
1176 the resultant data to a file)
1177 - Refactor the restoration state machine accordingly
1180 The existing stream implementation suffers from the following:
1181 - All state (compression, encryption, stream restoration) is
1182 global across the entire restore process, for all streams. There are
1183 multiple entry and exit points in the restoration state machine, and
1184 thus multiple places where state must be allocated, deallocated,
1185 initialized, or reinitialized. This results in exceptional complexity
1186 for the author of a stream filter.
1187 - The developer must enumerate all possible combinations of filters
1188 and stream types (ie, win32 data with encryption, without encryption,
1189 with encryption AND compression, etc).
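
The idea of stacking filters on top of a sink, with each filter holding its
own state, can be sketched as follows. This is only an illustration of the
concept, not the planned Bacula design: all class names are hypothetical,
the in-memory sink stands in for a real file, and the XOR filter is a toy
placeholder for decryption rather than real cryptography.

```python
import zlib


class MemorySink:
    """Terminates a filter stack by collecting the resulting data."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)

    def close(self):
        pass


class DecompressFilter:
    """Decompresses zlib data and passes the result downstream."""

    def __init__(self, downstream):
        self.downstream = downstream
        self._z = zlib.decompressobj()  # state is private to this stream

    def write(self, data):
        self.downstream.write(self._z.decompress(data))

    def close(self):
        self.downstream.write(self._z.flush())
        self.downstream.close()


class XorFilter:
    """Toy stand-in for a decryption filter (not real cryptography)."""

    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key

    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))

    def close(self):
        self.downstream.close()


def restore(chunks, stack):
    """Feed raw stream chunks through the stack, then close it."""
    for chunk in chunks:
        stack.write(chunk)
    stack.close()
```

Because each filter only knows its downstream neighbour, a stream that is
"encrypted" and compressed is handled by XorFilter(DecompressFilter(sink),
key), while a compressed-only stream uses DecompressFilter(sink); no
combination has to be enumerated by hand.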
1192 This feature request only covers implementing the stream filters/
1193 sinks, and refactoring the file daemon's restoration implementation
1194 accordingly. If I have extra time, I will also rewrite the backup
1195 implementation. My intent in implementing the restoration first is to
1196 solve pressing bugs in the restoration handling, and to ensure that
1197 the new restore implementation handles existing backups correctly.
1199 I do not plan on changing the network or tape data structures to
1200 support defining arbitrary stream filters, but supporting that
1201 functionality is the ultimate goal.
1203 Assistance with either code or testing would be fantastic.
1205 Item 1: On the bconsole "restore" command line, implement separate
1206 option for specifying the host to restore from, and the
1209 Date: 11 December 2006
1211 Origin: Discussion on Bacula-users entitled 'Scripted restores to
1212 different clients', December 2006
1214 Status: New feature request
1216 What: While using bconsole interactively, you can specify the client
1217 that a backup job is to be restored for, and then you can
1218 specify later a different client to send the restored files
1219 back to. However, using the 'restore' command with all options
1220 on the command line, this cannot be done, due to the ambiguous
1221 'client' parameter. Additionally, this parameter means different
1222 things depending on if it's specified on the command line or
1223 afterwards, in the Modify Job screens.
1225 Why: This feature would enable restore jobs to be more completely
1226 automated, for example by a web or GUI front-end.
1228 Notes: client can also be implied by specifying the jobid on the command