Bacula Projects Roadmap
Prioritized by user vote 07 December 2005
Status updated 30 July 2006
Item 1: Implement data encryption (as opposed to comm encryption)
Item 2: Implement Migration that moves Jobs from one Pool to another.
Item 3: Accurate restoration of renamed/deleted files from
        Incremental/Differential backups
Item 4: Implement a Bacula GUI/management tool using Python.
Item 5: Implement Base jobs.
Item 6: Allow FD to initiate a backup
Item 7: Improve Bacula's tape and drive usage and cleaning management.
Item 8: Implement creation and maintenance of copy pools
Item 9: Implement new {Client}Run{Before|After}Job feature.
Item 10: Merge multiple backups (Synthetic Backup or Consolidation).
Item 11: Deletion of Disk-Based Bacula Volumes
Item 12: Directive/mode to backup only file changes, not entire file
Item 13: Multiple threads in file daemon for the same job
Item 14: Implement red/black binary tree routines.
Item 15: Add support for FileSets in user directories CACHEDIR.TAG
Item 16: Implement extraction of Win32 BackupWrite data.
Item 17: Implement a Python interface to the Bacula catalog.
Item 18: Archival (removal) of User Files to Tape
Item 19: Add Plug-ins to the FileSet Include statements.
Item 20: Implement more Python events in Bacula.
Item 21: Quick release of FD-SD connection after backup.
Item 22: Permit multiple Media Types in an Autochanger
Item 23: Allow different autochanger definitions for one autochanger.
Item 24: Automatic disabling of devices
Item 25: Implement huge exclude list support using hashing.

Below, you will find more information on future projects:
Item 1: Implement data encryption (as opposed to comm encryption)
Origin: Sponsored by Landon and 13 contributors to EFF.
Status: Done: Landon Fuller has implemented this in 1.39.x.

What: Currently the data that is stored on the Volume is not
      encrypted. For confidentiality, encryption of data at
      the File daemon level is essential. Data encryption
      encrypts the data in the File daemon and decrypts the
      data in the File daemon during a restore.

Why: Large sites require this.
Item 2: Implement Migration that moves Jobs from one Pool to another.
Origin: Sponsored by Riege Software International GmbH. Contact:
        Daniel Holtkamp <holtkamp at riege dot com>
Status: 90% complete: working in 1.39, more to do. Assigned to

What: The ability to copy, move, or archive data that is on a
      device to another device is very important.

Why: An ISP might want to backup to disk, but after 30 days
     migrate the data to tape backup and delete it from disk.
     Bacula should be able to handle this automatically. It
     needs to know what was put where, and when, and what to
     migrate -- it is a bit like retention periods. Doing so
     would allow space to be freed up for current backups
     while maintaining older

Notes: Riege Software has asked for the following migration
       triggers:
       Highwater mark (stopped by Lowwater mark?)

Notes: Migration could be additionally triggered by:
Item 3: Accurate restoration of renamed/deleted files from
        Incremental/Differential backups
Date: 28 November 2005
Origin: Martin Simmons (martin at lispworks dot com)

What: When restoring a fileset for a specified date (including "most
      recent"), Bacula should give you exactly the files and directories
      that existed at the time of the last backup prior to that date.

      Currently this only works if the last backup was a Full backup.
      When the last backup was Incremental/Differential, files and
      directories that have been renamed or deleted since the last Full
      backup are not currently restored correctly. Ditto for files with
      extra/fewer hard links than at the time of the last Full backup.

Why: Incremental/Differential would be much more useful if this worked.

Notes: Item 10 (Merging of multiple backups into a single one) seems to
       rely on this working, otherwise the merged backups will not be
       truly equivalent to a Full backup.

       Kern: notes shortened. This can be done without the need for
       inodes. It is essentially the same as the current Verify job,
       but one additional database record must be written, which does
       not need any database change.

       Kern: see if we can correct restoration of directories if
       replace=ifnewer is set. Currently, if the directory does not
       exist, a "dummy" directory is created, then when all the files
       are updated, the dummy directory is newer so the real values
Item 4: Implement a Bacula GUI/management tool using Python.
Date: 28 October 2005
Status: Lucas is working on this for Python GTK+.

What: Implement a Bacula console and management tools
      using Python and Qt or GTK.

Why: Don't we already have a wxWidgets GUI? Yes, but
     it is written in C++ and changes to the user interface
     must be hand tailored using C++ code. By developing
     the user interface using Qt designer, the interface
     can be very easily updated and most of the new Python
     code will be automatically created. The user interface
     changes become very simple, and only the new features
     must be implemented. In addition, the code will be in
     Python, which will give many more users easy (or easier)
     access to making additions or modifications.

Notes: This is currently being implemented using Python-GTK by
       Lucas Di Pentima <lucas at lunix dot com dot ar>
Item 5: Implement Base jobs.
Date: 28 October 2005

What: A base job is sort of like a Full save except that you
      will want the FileSet to contain only files that are
      unlikely to change in the future (i.e. a snapshot of
      most of your system after installing it). After the
      base job has been run, when you are doing a Full save,
      you specify one or more Base jobs to be used. All
      files that have been backed up in the Base job/jobs but
      not modified will then be excluded from the backup.
      During a restore, the Base jobs will be automatically
      pulled in where necessary.

Why: This is something none of the competition does, as far as
     we know (except perhaps BackupPC, which is a Perl program that
     saves to disk only). It is a big win for the user: it
     makes Bacula stand out as offering a unique
     optimization that immediately saves time and money.
     Basically, imagine that you have 100 nearly identical
     Windows or Linux machines containing the OS and user
     files. Now for the OS part, a Base job will be backed
     up once, and rather than making 100 copies of the OS,
     there will be only one. If one or more of the systems
     have some files updated, no problem, they will be
     automatically restored.

Notes: Huge savings in tape usage even for a single machine.
       Will require more resources because the DIR must send
       the FD a list of files/attribs, and the FD must search the
       list and compare it for each file to be saved.
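The comparison described in the Notes can be sketched as follows. This is a minimal illustration, not Bacula code: the Base job's catalog list is modeled as a plain dict keyed by path, with an (mtime, size) pair standing in for the full attribute record a real FD would compare.

```python
# Sketch: deciding which files a Full-with-Base backup must send.
# The Base job's file list is modeled as {path: (mtime, size)};
# a real implementation would compare the full attribute record
# (and possibly a checksum) rather than just these two fields.

def files_to_backup(current_files, base_job_files):
    """Return the paths that are new or changed since the Base job."""
    to_send = []
    for path, attribs in current_files.items():
        if base_job_files.get(path) != attribs:
            to_send.append(path)      # new or modified since the Base job
    return sorted(to_send)

base = {"/bin/ls": (1000, 90000), "/etc/motd": (1000, 120)}
now = {"/bin/ls": (1000, 90000),           # unchanged: excluded
       "/etc/motd": (2000, 140),           # modified: backed up
       "/home/user/new.txt": (2100, 10)}   # new: backed up

print(files_to_backup(now, base))
```

Everything in `base` that is unchanged in `now` is silently dropped from the backup set; the restore side would pull those files back in from the Base job.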
Item 6: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: Provide some means, possibly by a restricted console, that
      allows an FD to initiate a backup, and that uses the connection
      established by the FD to the Director for the backup so that
      a Director that is firewalled can do the backup.

Why: Makes backup of laptops much easier.
Item 7: Improve Bacula's tape and drive usage and cleaning management.
Date: 8 November 2005, November 11, 2005
Origin: Adam Thornton <athornton at sinenomine dot net>,
        Arno Lehmann <al at its-lehmann dot de>

What: Make Bacula manage tape life cycle information, tape reuse
      times and drive cleaning cycles.

Why: All three parts of this project are important when operating
     We need to know which tapes need replacement, and we need to
     make sure the drives are cleaned when necessary. While many
     tape libraries and even autoloaders can handle all this
     automatically, support by Bacula can be helpful for smaller
     (older) libraries and single drives. Limiting the number of
     times a tape is used might prevent the tape errors that occur
     when tapes are used until the drives can no longer read them.
     Also, checking drive status during operation can prevent some
     failures (as I [Arno] had to learn the hard way...)

Notes: First, Bacula could (and even does, to some limited extent)
       record tape and drive usage. For tapes, the number of mounts,
       the amount of data, and the time the tape has actually been
       running could be recorded. Data fields for Read and Write
       time and Number of mounts already exist in the catalog (I'm
       not sure if VolBytes is the sum of all bytes ever written to
       that volume by Bacula). This information can be important
       when determining which media to replace. The ability to mark
       Volumes as "used up" after a given number of write cycles
       should also be implemented so that a tape is never actually
       worn out. For the tape drives known to Bacula, similar
       information is interesting to determine the device status and
       expected lifetime: time spent Reading and Writing, number
       of tape Loads / Unloads / Errors. This information is not yet
       recorded as far as I [Arno] know. A new volume status would
       be necessary for the new state, like "Used up" or "Worn out".
       Volumes with this state could be used for restores, but not
       for writing. These volumes should be migrated first (assuming
       migration is implemented) and, once they are no longer needed,
       could be moved to a Trash pool.

       The next step would be to implement a drive cleaning setup.
       Bacula already has knowledge about cleaning tapes. Once it
       has some information about cleaning cycles (measured in drive
       run time, number of tapes used, or calendar days, for example)
       it can automatically execute tape cleaning (with an
       autochanger, obviously) or ask for operator assistance loading

       The final step would be to implement TAPEALERT checks not only
       when changing tapes and only sending the information to the
       administrator, but rather checking after each tape error,
       checking on a regular basis (for example after each tape
       file), and also before unloading and after loading a new tape.
       Then, depending on the drive's TAPEALERT state and the known
       drive cleaning state, Bacula could automatically schedule later
       cleaning, clean immediately, or inform the operator.

       Implementing this would perhaps require another catalog change
       and perhaps major changes in SD code and the DIR-SD protocol,
       so I'd only consider this worth implementing if it would
       actually be used or even needed by many people.

       Implementation of these projects could happen in three distinct
       sub-projects: measuring tape and drive usage, retiring
       volumes, and handling drive cleaning and TAPEALERTs.
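The "used up after a given number of write cycles" idea from the Notes can be sketched as a simple catalog-side check. This is a toy illustration: VolMounts and VolStatus mirror real Bacula catalog columns, but the per-media-type mount limits below are invented for the example.

```python
# Sketch: retiring a volume once its mount count reaches a limit,
# so a tape is taken out of the write rotation before it wears out.
MOUNT_LIMITS = {"LTO-2": 200, "DDS-4": 100}   # hypothetical thresholds

def updated_status(media_type, vol_mounts, vol_status):
    """Return the status a volume should have after a mount-count check."""
    limit = MOUNT_LIMITS.get(media_type)
    if vol_status == "Append" and limit is not None and vol_mounts >= limit:
        return "Used up"   # still readable for restores, never written again
    return vol_status

print(updated_status("LTO-2", 250, "Append"))   # over the limit
print(updated_status("LTO-2", 10, "Append"))    # still usable
```

Volumes in the "Used up" state would then be candidates for migration and, eventually, a Trash pool, exactly as the Notes describe.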
Item 8: Implement creation and maintenance of copy pools
Date: 27 November 2005
Origin: David Boyes (dboyes at sinenomine dot net)

What: I would like Bacula to have the capability to write copies
      of backed-up data on multiple physical volumes selected
      from different pools without transferring the data
      multiple times, and to accept any of the copy volumes
      as valid for restore.

Why: In many cases, businesses are required to keep offsite
     copies of backup volumes, or just wish for simple
     protection against a human operator dropping a storage
     volume and damaging it. The ability to generate multiple
     volumes in the course of a single backup job allows
     customers to simply check out one copy and send it
     offsite, marking it as out of changer or otherwise
     unavailable. Currently, the library and magazine
     management capability in Bacula does not make this process

     Restores would use the copy of the data on the first
     available volume, in order of copy pool chain definition.

     This is also a major scalability issue -- as the number of
     clients increases beyond several thousand, and the volume
     of data increases, transferring the data multiple times to
     produce additional copies of the backups will become
     physically impossible due to transfer speed
     issues. Generating multiple copies at server side will
     become the only practical option.

How: I suspect that this will require adding a multiplexing
     SD that appears to be an SD to a specific FD, but 1-n FDs
     to the specific back end SDs managing the primary and copy
     pools. Storage pools will also need to acquire parameters
     to define the pools to be used for copies.

Notes: I would commit some of my developers' time if we can agree
       on the design and behavior.
Item 9: Implement new {Client}Run{Before|After}Job feature.
Date: 26 September 2005
Origin: Phil Stracchino
Status: Done. This has been implemented by Eric Bollengier

What: Some time ago, there was a discussion of RunAfterJob and
      ClientRunAfterJob, and the fact that they do not run after failed
      jobs. At the time, there was a suggestion to add a
      RunAfterFailedJob directive (and, presumably, a matching
      ClientRunAfterFailedJob directive), but to my knowledge these
      were never implemented.

      The current implementation doesn't make it easy to add new
      features.

      An alternate way of approaching the problem has just occurred to
      me. Suppose the RunBeforeJob and RunAfterJob directives were
      expanded in a manner like this example:

        Command = "/opt/bacula/etc/checkhost %c"
        RunsOnClient = No          # default
        AbortJobOnError = Yes      # default

        Command = c:/bacula/systemstate.bat

        Command = c:/bacula/deletestatefile.bat

      It's now possible to specify more than one command per Job
      (you can stop your database and your webserver without a script).

        JobDefs = "DefaultJob"
        Write Bootstrap = "/tmp/bacula/var/bacula/working/Client1.bsr"
        RunBeforeJob = "echo test before ; echo test before2"
        RunBeforeJob = "echo test before (2nd time)"
        RunBeforeJob = "echo test before (3rd time)"
        RunAfterJob = "echo test after"
        ClientRunAfterJob = "echo test after client"

        Command = "echo test RunScript in error"
        RunsWhen = After           # never by default

        Command = "echo test RunScript on success"
        RunsOnSuccess = yes        # default
        RunsOnFailure = no         # default

Why: It would be a significant change to the structure of the
     directives, but allows for a lot more flexibility, including
     RunAfter commands that will run regardless of whether the job
     succeeds, or RunBefore tasks that still allow the job to run even
     if that specific RunBefore fails.

Notes: (More notes from Phil, Kern, David and Eric)
       I would prefer to have a single new Resource called

         RunsWhen = After|Before|Always
         RunsAtJobLevels = All|Full|Diff|Inc   # not yet implemented

       The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives
       could be optional, and possibly RunsWhen as well.

       AbortJobOnError would be ignored unless RunsWhen was set to Before
       and would default to Yes if omitted.
       If AbortJobOnError was set to No, failure of the script
       would still generate a warning.

       RunsOnSuccess would be ignored unless RunsWhen was set to After
       (or RunsBeforeJob set to No), and default to Yes.

       RunsOnFailure would be ignored unless RunsWhen was set to After,

       Allow having the before/after status on the script command
       line so that the same script can be used both before/after.
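Pieced together from the fragments above, a complete resource under the proposed scheme might look like the following. This is a hedged reconstruction for illustration, not the final implemented syntax: the brace layout, the Job name, and the Bootstrap path reuse values from the example fragments, and the grouping of directives into a RunScript block is an assumption based on the Notes.

```
Job {
  Name = "BackupClient1"                # assumed name, for illustration
  JobDefs = "DefaultJob"
  Write Bootstrap = "/tmp/bacula/var/bacula/working/Client1.bsr"
  RunScript {
    Command = "echo test RunScript on success"
    RunsWhen = After
    RunsOnSuccess = yes                 # default
    RunsOnFailure = no                  # default
  }
}
```

The point of the single RunScript resource is that new per-script options (RunsOnClient, AbortJobOnError, RunsAtJobLevels, ...) can be added in one place instead of multiplying Run{Before|After}FailedJob-style directives.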
Item 10: Merge multiple backups (Synthetic Backup or Consolidation).
Origin: Marc Cousin and Eric Bollengier
Date: 15 November 2005
Status: Waiting implementation. Depends on first implementing
        project Item 2 (Migration).

What: A merged backup is a backup made without connecting to the Client.
      It would be a Merge of existing backups into a single backup.
      In effect, it is like a restore but to the backup medium.

      For instance, say that last Sunday we made a full backup. Then
      all week long, we created incremental backups, in order to do
      them fast. Now comes Sunday again, and we need another full.
      The merged backup makes it possible to do instead an incremental
      backup (during the night for instance), and then create a merged
      backup during the day, by using the full and incrementals from
      the week. The merged backup will be exactly like a full made
      Sunday night on the tape, but the production interruption on the
      Client will be minimal, as the Client will only have to send

      In fact, if it's done correctly, you could merge all the
      Incrementals into a single Incremental, or all the Incrementals
      and the last Differential into a new Differential, or the Full,
      last Differential and all the Incrementals into a new Full
      backup. And there is no need to involve the Client.

Why: The benefits are that:
     - the Client just does an incremental;
     - the merged backup on tape is just like a single full backup,
       and can be restored very fast.

     This is also a way of reducing the backup data since the old
     data can then be pruned (or not) from the catalog, possibly
     allowing older volumes to be recycled.
Item 11: Deletion of Disk-Based Bacula Volumes
Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited

What: Provide a way for Bacula to automatically remove Volumes
      from the filesystem, or optionally to truncate them.
      Obviously, the Volume must be pruned prior to removal.

Why: This would allow users more control over their Volumes and
     prevent disk based volumes from consuming too much space.

Notes: The following two directives might do the trick:

         Volume Data Retention = <time period>
         Remove Volume After = <time period>

       The migration project should also remove a Volume that is
       migrated. This might also work for tape Volumes.
Item 12: Directive/mode to backup only file changes, not entire file
Date: 11 November 2005
Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
        Marek Bajon <mbajon at bimsplus dot com dot pl>

What: Currently when a file changes, the entire file will be backed up in
      the next incremental or full backup. To save space on the tapes
      it would be nice to have a mode whereby only the changes to the
      file would be backed up when it is changed.

Why: This would save lots of space when backing up large files such as
     logs, mbox files, Outlook PST files and the like.

Notes: This would require the usage of disk-based volumes as comparing
       files would not be feasible using a tape drive.
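One common way to implement "only the changes" is a block-level delta: hash fixed-size blocks of the previous copy and store only blocks whose hash changed. A minimal sketch, not Bacula's design; the tiny block size and the use of SHA-256 are arbitrary choices for the example.

```python
import hashlib

BLOCK = 4   # tiny block size for illustration; real tools use 4-64 KiB

def block_hashes(data):
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def changed_blocks(old, new):
    """Return (index, bytes) for each block of `new` differing from `old`."""
    old_h = block_hashes(old)
    delta = []
    for i, h in enumerate(block_hashes(new)):
        if i >= len(old_h) or old_h[i] != h:
            delta.append((i, new[i * BLOCK:(i + 1) * BLOCK]))
    return delta

old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCCDDDD"   # second block changed, fourth block appended
print(changed_blocks(old, new))
```

As the Notes point out, applying such deltas at restore time requires random access to the previous copy, which is why disk-based volumes would be needed.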
Item 13: Multiple threads in file daemon for the same job
Date: 27 November 2005
Origin: Ove Risberg (Ove.Risberg at octocode dot com)

What: I want the file daemon to start multiple threads for a backup
      job so the fastest possible backup can be made.

      The file daemon could parse the FileSet information and start
      one thread for each File entry located on a separate

      A configuration option in the job section should be used to
      enable or disable this feature. The configuration option could
      specify the maximum number of threads in the file daemon.

      If the threads could spool the data to separate spool files
      the restore process will not be much slower.

Why: Multiple concurrent backups of a large fileserver with many
     disks and controllers will be much faster.

Notes: I am willing to try to implement this but I will probably
       need some help and advice. (No problem -- Kern)
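The per-File-entry threading idea can be sketched with a thread pool: one worker per FileSet entry, each filling its own spool buffer, with the pool size playing the role of the proposed "maximum number of threads" option. A toy illustration assuming in-memory "files" instead of real disk I/O:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for a FileSet: entry name -> data that would be read.
FILESET = {"/disk1": b"data-1", "/disk2": b"data-2", "/disk3": b"data-3"}

def backup_entry(entry):
    """'Back up' one File entry into its own spool buffer."""
    data = FILESET[entry]   # real code would read from disk here
    return entry, data      # one spool per thread, merged afterwards

# max_workers models the proposed per-job maximum-threads option.
with ThreadPoolExecutor(max_workers=2) as pool:
    spools = dict(pool.map(backup_entry, sorted(FILESET)))

print(sorted(spools))
```

Separate spool buffers per thread are what keeps the resulting volume layout contiguous per File entry, which is the property the item relies on to avoid slowing down restores.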
Item 14: Implement red/black binary tree routines.
Date: 28 October 2005
Status: Class code is complete. Code needs to be integrated into

What: Implement a red/black binary tree class. This could
      then replace the current binary insert/search routines
      used in the restore in-memory tree. This could significantly
      speed up the creation of the in-memory restore tree.

Why: Performance enhancement.
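The gain comes from balance: a naive binary tree degenerates to a linked list when keys arrive in sorted order (common when inserting catalog paths), making tree construction O(n^2), while a red/black tree keeps inserts O(log n). A compact sketch of the insert path, in Python rather than the C++ class the item describes:

```python
class Node:
    __slots__ = ("key", "red", "left", "right", "parent")
    def __init__(self, key, parent=None):
        self.key, self.red, self.parent = key, True, parent
        self.left = self.right = None

class RBTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        parent, cur = None, self.root      # ordinary BST descent
        while cur:
            parent = cur
            cur = cur.left if key < cur.key else cur.right
        node = Node(key, parent)
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fixup(node)                  # restore red/black invariants

    def _rotate(self, x, left):
        y = x.right if left else x.left    # one routine for both rotations
        if left:
            x.right = y.left
            if y.left: y.left.parent = x
        else:
            x.left = y.right
            if y.right: y.right.parent = x
        y.parent = x.parent
        if x.parent is None: self.root = y
        elif x is x.parent.left: x.parent.left = y
        else: x.parent.right = y
        if left: y.left = x
        else: y.right = x
        x.parent = y

    def _fixup(self, z):
        while z.parent and z.parent.red:
            gp = z.parent.parent
            left_side = z.parent is gp.left
            uncle = gp.right if left_side else gp.left
            if uncle and uncle.red:        # case 1: recolor and move up
                z.parent.red = uncle.red = False
                gp.red = True
                z = gp
            else:                          # cases 2/3: rotate into shape
                if z is (z.parent.right if left_side else z.parent.left):
                    z = z.parent
                    self._rotate(z, left=left_side)
                z.parent.red = False
                gp.red = True
                self._rotate(gp, left=not left_side)
        self.root.red = False              # root is always black

    def inorder(self):
        out, stack, cur = [], [], self.root
        while stack or cur:
            while cur:
                stack.append(cur); cur = cur.left
            cur = stack.pop()
            out.append(cur.key)
            cur = cur.right
        return out

t = RBTree()
for k in [5, 2, 8, 1, 3]:
    t.insert(k)
print(t.inorder())   # keys come back sorted regardless of insert order
```

The rebalancing logic follows the standard textbook insert-fixup; the actual Bacula class is in C++, so this is only a reference for the algorithm.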
Item 15: Add support for FileSets in user directories CACHEDIR.TAG
Origin: Norbert Kiesel <nkiesel at tbdnetworks dot com>
Date: 21 November 2005
Status: (I think this is better done using a Python event that I
        will implement in version 1.39.x).

What: CACHEDIR.TAG is a proposal for identifying directories which
      should be ignored for archiving/backup. It works by ignoring
      directory trees which have a file named CACHEDIR.TAG with a
      specific content. See
      http://www.brynosaurus.com/cachedir/spec.html

      I suggest that if this is implemented (I've also asked for this
      feature some years ago) that it is made compatible with Legato
      Networker's ".nsr" files, where you can specify a lot of options on
      how to handle files/directories (including denying further
      parsing of .nsr files lower down into the directory trees). A
      PDF version of the .nsr man page can be viewed at:

      http://www.ifm.liu.se/~peter/nsr.pdf

Why: It's a nice alternative to "exclude" patterns for directories
     which don't have regular pathnames. Also, it allows users to
     control backup for themselves. Implementation should be pretty
     simple. GNU tar >= 1.14 or so supports it, too.

Notes: I envision this as an optional feature to a fileset
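Per the spec linked above, a directory is skipped when it contains a CACHEDIR.TAG file whose first bytes are a fixed signature line. A small sketch of the pruning logic as standalone Python, not an FD patch:

```python
import os

# First 43 octets required by the cachedir spec.
SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"

def is_cache_dir(path):
    """True if `path` holds a CACHEDIR.TAG with the spec's signature."""
    tag = os.path.join(path, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as f:
            return f.read(len(SIGNATURE)) == SIGNATURE
    except OSError:
        return False

def walk_for_backup(top):
    """Yield files to back up, pruning tagged cache directories."""
    for root, dirs, files in os.walk(top):
        # Prune in place so os.walk never descends into tagged dirs.
        dirs[:] = [d for d in dirs
                   if not is_cache_dir(os.path.join(root, d))]
        for name in files:
            yield os.path.join(root, name)
```

Checking the signature, rather than the mere presence of the file, is what the spec requires; a file named CACHEDIR.TAG with other contents does not mark a cache directory.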
Item 16: Implement extraction of Win32 BackupWrite data.
Origin: Thorsten Engel <thorsten.engel at matrix-computer dot com>
Date: 28 October 2005
Status: Done. Assigned to Thorsten. Implemented in current CVS

What: This provides the Bacula File daemon with code that
      can pick apart the stream output that Microsoft writes
      for BackupWrite data, and thus the data can be read
      and restored on non-Win32 machines.

Why: BackupWrite data is the portable=no option in Win32
     FileSets, and in previous Baculas, this data could
     only be extracted using a Win32 FD. With this new code,
     the Windows data can be extracted and restored on
Item 17: Implement a Python interface to the Bacula catalog.
Date: 28 October 2005

What: Implement an interface for Python scripts to access
      the catalog through Bacula.

Why: This will permit users to customize Bacula through
Item 18: Archival (removal) of User Files to Tape
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]

What: The ability to archive data to storage based on certain parameters
      such as age, size, or location. Once the data has been written to
      storage and logged it is then pruned from the originating
      filesystem. Note! We are talking about user's files and not

Why: This would allow fully automatic storage management, which becomes
     useful for large datastores. It would also allow for auto-staging
     from one media type to another.

     Example 1) Medical imaging needs to store large amounts of data.
     They decide to keep data on their servers for 6 months and then put
     it away for long term storage. The server then finds all files
     older than 6 months and writes them to tape. The files are then removed

     Example 2) All data that hasn't been accessed in 2 months could be
     moved from high-cost, fibre-channel disk storage to a low-cost
     large-capacity SATA disk storage pool which doesn't have as quick an
     access time. Then after another 6 months (or possibly as one
     storage pool gets full) data is migrated to tape.
Item 19: Add Plug-ins to the FileSet Include statements.
Date: 28 October 2005
Status: Partially coded in 1.37 -- much more to do.

What: Allow users to specify wild-card and/or regular
      expressions to be matched in both the Include and
      Exclude directives in a FileSet. At the same time,
      allow users to define plug-ins to be called (based on
      regular expression/wild-card matching).

Why: This would give the users the ultimate ability to control
     how files are backed up/restored. A user could write a
     plug-in that knows how to back up his Oracle database without
     stopping/starting it, for example.
Item 20: Implement more Python events in Bacula.
Date: 28 October 2005

What: Allow Python scripts to be called at more places
      within Bacula and provide additional access to Bacula

      Also add a way to get a listing of currently running
      jobs (possibly also scheduled jobs).

Why: This will permit users to customize Bacula through
Item 21: Quick release of FD-SD connection after backup.
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: In the Bacula implementation a backup is finished after all data
      and attributes are successfully written to storage. When using a
      tape backup it is very annoying that a backup can take a day,
      simply because the current tape (or whatever) is full and the
      administrator has not put a new one in. During that time the
      system cannot be taken off-line, because there is still an open
      session between the storage daemon and the file daemon on the

      Although this is a very good strategy for making "safe backups",
      it can be annoying for e.g. laptops, which must remain
      connected until the backup is completed.

      Using a new feature called "migration" it will be possible to
      spool first to harddisk (using a special 'spool' migration
      scheme) and then migrate the backup to tape.

      There is still the problem of getting the attributes committed.
      If it takes a very long time to do, with the current code, the
      job has not terminated, and the File daemon is not freed up. The
      Storage daemon should release the File daemon as soon as all the
      file data and all the attributes have been sent to it (the SD).
      Currently the SD waits until everything is on tape and all the
      attributes are transmitted to the Director before signaling
      completion to the FD. I don't think I would have any problem
      changing this. The reason is that even if the FD reports back to
      the Dir that all is OK, the job will not terminate until the SD
      has done the same thing -- so in a way keeping the SD-FD link
      open to the very end is not really very productive ...

Why: Makes backup of laptops much easier.
Item 22: Permit multiple Media Types in an Autochanger
Status: Done. Implemented in 1.38.9 (I think).

What: Modify the Storage daemon so that multiple Media Types
      can be specified in an autochanger. This would be somewhat
      of a simplistic implementation in that each drive would
      still be allowed to have only one Media Type. However,
      the Storage daemon will ensure that only a drive with
      the Media Type that matches what the Director specifies

Why: This will permit users with several different drive types
     to make full use of their autochangers.
Item 23: Allow different autochanger definitions for one autochanger.
Date: 28 October 2005

What: Currently, the autochanger script is locked based on
      the autochanger. That is, if multiple drives are being
      simultaneously used, the Storage daemon ensures that only
      one drive at a time can access the mtx-changer script.
      This change would base the locking on the control device,
      rather than the autochanger. It would then permit two autochanger
      definitions for the same autochanger, but with different
      drives. Logically, the autochanger could then be "partitioned"
      for different jobs, clients, or classes of jobs, and if the locking
      is based on the control device (e.g. /dev/sg0) the mtx-changer
      script will be locked appropriately.

Why: This will permit users to partition autochangers for specific
     use. It would also permit implementation of multiple Media
     Types with no changes to the Storage daemon.
Item 24: Automatic disabling of devices
Origin: Peter Eriksson <peter at ifm.liu dot se>

What: After a configurable number of fatal errors with a tape drive
      Bacula should automatically disable further use of that
      tape drive. There should also be "disable"/"enable" commands in

Why: On a multi-drive jukebox there is a possibility of tape drives
     going bad during large backups (needing a cleaning tape run,
     tapes getting stuck). It would be advantageous if Bacula would
     automatically disable further use of a problematic tape drive
     after a configurable number of errors has occurred.

     An example: I have a multi-drive jukebox (6 drives, 380+ slots)
     where tapes occasionally get stuck inside the drive. Bacula will
     notice that the "mtx-changer" command will fail and then fail
     any backup jobs trying to use that drive. However, it will still
     keep on trying to run new jobs using that drive and fail --
     forever, thus failing lots and lots of jobs... Since we have
     many drives Bacula could have just automatically disabled
     further use of that drive and used one of the other ones
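The "disable after N fatal errors" policy amounts to a per-drive error counter. A minimal sketch; the threshold value and drive names here are invented for illustration:

```python
MAX_ERRORS = 3   # hypothetical stand-in for the configurable threshold

class DrivePool:
    """Toy model of drive selection with automatic disabling."""
    def __init__(self, drives):
        self.errors = {d: 0 for d in drives}
        self.disabled = set()

    def record_error(self, drive):
        """Count a fatal error; auto-disable at the threshold."""
        self.errors[drive] += 1
        if self.errors[drive] >= MAX_ERRORS:
            self.disabled.add(drive)

    def pick_drive(self):
        """Return any enabled drive, or None if all are disabled."""
        for d in sorted(self.errors):
            if d not in self.disabled:
                return d
        return None

pool = DrivePool(["Drive-0", "Drive-1"])
for _ in range(3):
    pool.record_error("Drive-0")   # stuck tape: three strikes
print(pool.pick_drive())           # jobs move on to the healthy drive
```

The proposed console "enable" command would simply remove a drive from the disabled set (and presumably reset its counter) once the operator has fixed it.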
Item 25: Implement huge exclude list support using hashing.
Date: 28 October 2005

What: Allow users to specify very large exclude lists (currently
      more than about 1000 files is too many).

Why: This would give the users the ability to exclude all
     files that are loaded with the OS (e.g. using rpms
     or debs). If the user can restore the base OS from
     CDs, there is no need to backup all those files. A
     complete restore would be to restore the base OS, then
     do a Bacula restore. By excluding the base OS files, the
     backup set will be *much* smaller.
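Hashing turns the per-file scan of a long exclude list into a constant-time lookup. In Python the hash table is simply a set; a C implementation would use its own hash table, but the idea is the same:

```python
# Sketch: hash-based exclude lookup. Membership tests on a set are
# O(1) on average, so even a 100,000-entry exclude list stays fast,
# unlike scanning a linear list once per file.

def load_exclude_set(paths):
    """Build a hash set from an exclude list (one path per entry)."""
    return set(paths)

def should_backup(path, excluded):
    return path not in excluded

excluded = load_exclude_set(f"/usr/lib/file{i}" for i in range(100000))
print(should_backup("/usr/lib/file42", excluded))    # excluded -> False
print(should_backup("/home/user/doc.txt", excluded)) # kept -> True
```

Note this covers exact-path excludes only; wild-card and regex excludes (Item 19) still need pattern matching and cannot be hashed this way.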
============= Empty Feature Request form ===========
Item n: One line summary ...
Origin: Name and email of originator.

What: More detailed explanation ...

Why: Why it is important ...

Notes: Additional notes or features (omit if not used)
============== End Feature Request form ==============
===============================================
Feature requests submitted after cutoff for December 2005 vote
and not yet discussed.
===============================================

Item n: Allow skipping execution of Jobs
Date: 29 November 2005
Origin: Florian Schnabel <florian.schnabel at docufy dot de>

What: An easy option to skip a certain job on a certain date.

Why: You could then easily skip tape backups on holidays. Especially
     if you have no autochanger and can only fit one backup on a tape,
     that would be really handy; other jobs could proceed normally
     and you won't get errors that way.
===================================================

Item n: Archive to media (DVD/CD) in uncompressed format
Origin: Calvin Streeting <calvin at absentdream dot com>

What: The ability to archive to media (DVD/CD) in an uncompressed format
      for dead filing (archiving, not backing up).

Why: At my work, when jobs are finished they are moved off of the main file
     servers (RAID-based systems) onto a simple Linux file server (IDE-based
     system) so users can find old information without contacting the IT
     department.

     This data doesn't really change, it only gets added to, but it
     also needs backing up. At the moment it takes about 8 hours to
     back up our servers (working data), so rather than add more time
     to existing backups I am trying to implement a system where we
     back up the archive data to CD/DVD. These disks would only need
     to be appended to (burn only new/changed files to new disks for
     off-site storage). Basically, understand the difference between
     archive data and live data.

Notes: Scan the data and email me when it needs burning; divide it
       into predefined chunks; keep a record of what is on what disk;
       make me a label (simple php->mysql->pdf stuff -- I could do
       this bit); the ability to save data uncompressed so it can be
       read on any other system (future-proof data); save the catalog
       with the disk as some kind of menu
Item n: Tray monitor window cleanups
Origin: Alan Brown <ajb2 at mssl dot ucl dot ac dot uk>

What: Resizeable and scrollable windows in the tray monitor.

Why: With multiple clients, or with many jobs running, the displayed
     window often ends up larger than the available screen, making
     the trailing items difficult to read.
Item n: Clustered file daemons
Origin: Alan Brown <ajb2 at mssl dot ucl dot ac dot uk>

What: A "virtual" file daemon, which is actually a cluster of real ones.

Why: In the case of clustered filesystems (SAN setups, GFS, OCFS2,
     etc.), multiple machines may have access to the same set of
     filesystems. For performance reasons, one may wish to initiate
     backups from several of these machines simultaneously, instead of
     just using one backup source for the common clustered filesystem.

     For obvious reasons, backups of $A-FD/$PATH and $B-FD/$PATH are
     normally treated as different backup sets. In this case they are
     the same communal set.

     Likewise when restoring, it would be easier to just specify
     one of the cluster machines and let Bacula decide which to use.

     This can be faked to some extent using DNS round-robin entries
     and a virtual IP address, but that means "status client" will
     always give bogus answers. Additionally, there is no way of
     spreading the load evenly among the servers.

     What is required is something similar to the storage daemon
     autochanger directives, so that Bacula can keep track of
     operating backups/restores and direct new jobs to a "free"
     machine.
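The "direct new jobs to a free machine" idea, analogous to autochanger drive selection, could be sketched as picking the cluster member with the fewest running jobs. This is an illustrative sketch with hypothetical names, not Bacula code:

```python
def pick_node(nodes):
    """nodes: dict mapping file-daemon name -> number of jobs it is
    currently running. Returns the least-busy member of the "virtual"
    file daemon, with ties broken by name for determinism."""
    return min(sorted(nodes), key=lambda n: nodes[n])
```

A real implementation would also have to handle nodes going offline and jobs whose data lives only on one member, which the sketch ignores.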
Item n: Commercial database support
Origin: Russell Howe <russell_howe at wreckage dot org>

What: It would be nice for the database backend to support more
      databases. I'm thinking of SQL Server at the moment, but I guess
      Oracle, DB2, MaxDB, etc. are all candidates. SQL Server would
      presumably be implemented using FreeTDS or maybe an ODBC library?

Why: We only really have one database server, which is MS SQL Server
     2000, and maintaining a second one just for the backup software is
     a burden (we grew out of SQLite, which I liked, but which didn't
     work so well with our database size). We don't really have a
     machine with the resources to run postgres, and would rather only
     maintain a single DBMS. We're stuck with SQL Server because pretty
     much all the company's custom applications (written by
     consultants) are locked into SQL Server 2000. I can imagine this
     scenario is fairly common, and it would be nice to use the
     existing properly specced database server for storing Bacula's
     catalog, rather than having to run a second DBMS.
Item n: Split documentation
Origin: Maxx <maxxatworkat gmail dot com>

What: Split the documentation into several books.

Why: The Bacula manual now has more than 600 pages, and looking for
     implementation details is getting complicated. I think it would
     be good to split the single volume into two or three:

     1) Introduction, requirements, and tutorial, typically
        useful only until first installation time.

     2) Basic installation and configuration, with all the
        gory details about the directives supported.

     3) Advanced Bacula: testing, troubleshooting, GUI and
        ancillary programs, security management, scripting,
        etc.
Item n: Include an option to operate on all pools when doing
        "update vol parameters"

Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>

What: When I do update -> Volume parameters -> All Volumes
      from Pool, I have to select pools one by one. I'd like the
      console to have an option like "0: All Pools" in the list of
      defined pools.

Why: I have many pools and am therefore unhappy with manually
     updating each of them using update -> Volume parameters -> All
     Volumes from Pool -> pool #.
Item n: Automatic promotion of backup levels
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: Amanda has a feature whereby it estimates the space that a
      differential, incremental, and full backup would take. If the
      difference in space required between the scheduled level and the
      next level up is beneath some user-defined critical threshold,
      the backup level is bumped to the next type. Doing this minimizes
      the number of volumes necessary during a restore, with a fairly
      minimal cost in media.

Why: I know at least one (quite sophisticated and smart) user
     for whom the absence of this feature is a deal-breaker in terms of
     using Bacula; if we had it, it would eliminate the one cool thing
     Amanda can do and we can't (at least, the one cool thing I know
     of).
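The promotion rule described above can be sketched in a few lines. This is an illustration of the idea, with hypothetical names; it assumes estimated backup sizes (in bytes) for each level are already available:

```python
# Levels from lowest to highest; "next level up" means the next entry.
LEVELS = ["Incremental", "Differential", "Full"]

def promote_level(scheduled, estimates, threshold):
    """Bump the scheduled level while the extra space needed for the
    next level up stays below the user-defined critical threshold.

    estimates: dict mapping level name -> estimated size in bytes.
    """
    level = scheduled
    while level != "Full":
        next_up = LEVELS[LEVELS.index(level) + 1]
        if estimates[next_up] - estimates[level] < threshold:
            level = next_up      # cheap to upgrade: fewer restore volumes
        else:
            break
    return level
```

For example, if an incremental is estimated at 100 MB and a differential at 140 MB, a 100 MB threshold would promote the job to a differential but not all the way to a full.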
Item n+1: Incorporation of XACML2/SAML2 parsing
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: XACML is the "eXtensible Access Control Markup Language" and
      SAML is the "Security Assertion Markup Language", an XML standard
      for making statements about identity and authorization. Having
      these would give us a framework to approach ACLs in a generic
      manner, and in a way flexible enough to support the four major
      sorts of ACLs I see as a concern to Bacula at this point, as well
      as (probably) to deal with new sorts of ACLs that may appear in
      the future.

Why: Bacula is beginning to need to back up systems with ACLs
     that do not map cleanly onto traditional Unix permissions. I see
     four sets of ACLs, in general mutually incompatible with one
     another, that we're going to need to deal with. These are: NTFS
     ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
     relevance of AFS; AFS is one of Sine Nomine's core consulting
     businesses, and having a reputable file-level backup and restore
     technology for it (as Tivoli is probably going to drop AFS support
     soon since IBM no longer supports AFS) would be of huge benefit to
     our customers; we'd most likely create the AFS support at Sine
     Nomine for inclusion into the Bacula core code (and perhaps make
     some changes to the OpenAFS volserver).)

     Now, obviously, Bacula already handles NTFS just fine. However, I
     think there's a lot of value in implementing a generic ACL model,
     so that it's easy to support whatever particular instances of ACLs
     come down the pike: POSIX ACLs (think SELinux) and NFSv4 are the
     obvious things arriving in the Linux world in a big way in the
     near future. XACML, although overcomplicated for our needs,
     provides this framework, and we should be able to leverage other
     people's implementations to minimize the amount of work *we* have
     to do to get a generic ACL framework. Basically, the costs of
     implementation are high, but they're largely both external to
     Bacula and already sunk.
Item 1: Add an override in the Schedule configuration to use a
        different pool for different backup types.

Origin: Chad Slater <chad.slater@clickfox.com>

What: Adding a FullStorage=BigTapeLibrary directive in the Schedule
      resource would help those of us who use different storage devices
      for different backup levels cope with the "auto-upgrade" of a
      backup.

Why: Assume I add several new devices to be backed up, i.e. several
     hosts with 1 TB RAID. To avoid tape-switching hassles,
     incrementals are stored on a disk set on a 2 TB RAID. If you add
     these devices in the middle of the month, the incrementals are
     upgraded to "full" backups, but they try to use the same storage
     device as requested in the incremental job, filling up the RAID
     holding the differentials. If we could override the Storage
     parameter for full and/or differential backups, then the Full job
     would use the proper Storage device, which has more capacity
     (i.e. an 8 TB tape library).
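A hypothetical Schedule resource using the proposed override might look like the following. The FullStorage directive shown here is the submitter's suggested syntax, not an existing Bacula directive; the resource and storage names are made up:

```
Schedule {
  Name = "MonthlyCycle"
  # Proposed: full backups (including incrementals auto-upgraded to
  # full) would use the tape library instead of the Job's disk storage.
  Run = Level=Full FullStorage=BigTapeLibrary 1st sun at 23:05
  Run = Level=Incremental mon-sat at 23:05
}
```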
Item n: Implement multiple numeric backup levels as supported by dump

Origin: Daniel Rich <drich@employees.org>

What: Dump allows specification of backup levels numerically instead
      of just "full", "incr", and "diff". In this system, at any given
      level, all files are backed up that were modified since the last
      backup of a higher level (with 0 being the highest and 9 being
      the lowest). A level 0 is therefore equivalent to a full, level 9
      an incremental, and the levels 1 through 8 are varying levels of
      differentials. For Bacula's sake, these could be represented as
      "full", "incr", and "diff1", "diff2", etc.

Why: Support of multiple backup levels would provide for more advanced
     backup rotation schemes such as "Towers of Hanoi". This would
     allow better flexibility in performing backups, and can lead to
     shorter recover times.

Notes: Legato NetWorker supports a similar system with full, incr, and
       1-9 as levels.

       Kern notes: I think this would add very little functionality,
       but a *lot* of additional overhead to Bacula.
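The dump-style rule described above (back up everything modified since the most recent backup of a numerically lower, i.e. higher-priority, level) can be sketched as follows. The history record and file map are hypothetical, not Bacula structures:

```python
def reference_time(history, level):
    """history: list of (level, finished_at) tuples for past backups.
    Returns the cutoff time for a backup at `level`, or None when a
    full dump is needed (level 0, or no qualifying previous backup)."""
    candidates = [t for (lvl, t) in history if lvl < level]
    return max(candidates) if candidates else None

def files_to_backup(files, history, level):
    """files: dict mapping path -> mtime. Selects files changed since
    the reference time; all files when no reference exists."""
    since = reference_time(history, level)
    if since is None:
        return sorted(files)
    return sorted(p for p, mtime in files.items() if mtime > since)
```

With a level-0 backup at time 100 and a level-5 at time 200, a level-9 run picks up only files changed after 200, while a level-3 run reaches back to the level 0 at time 100, which is exactly what makes Tower-of-Hanoi rotations work.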
Item 1: Include JobID in spool file name
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: Change the name of the spool file to include the JobID.

Why: JobIDs are the common key used to refer to jobs, yet the
     spool file name doesn't include that information. The date/time
     stamp is useful (and should be retained).
Item 2: Include timestamp of job launch in "stat clients" output
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: The "stat clients" command doesn't include any detail on when
      the active backup jobs were launched.

Why: Including the timestamp would make it much easier to decide
     whether a job is running properly.

Notes: It may be helpful to have the output from "stat clients"
       formatted more like that from "stat dir" (and other commands),
       in a column format. The per-client information that's currently
       shown (level, client name, JobId, Volume, pool, device, Files,
       etc.) is good, but somewhat hard to parse (both programmatically
       and visually), particularly when there are many active clients.
Item 1: Filesystem-watch triggered backup
Date: 31 August 2006
Origin: Jesper Krogh <jesper@krogh.cc>
Status: Unimplemented, probably depends on "client initiated backups"

What: With inotify and similar filesystem-triggered notification
      systems, it is possible to have the file daemon monitor
      filesystem changes and initiate a backup.

Why: There are two situations where this is nice to have:

     1) It is possible to get a much finer-grained backup than with
        the fixed schedules used now. A file created and deleted a few
        hours later can automatically be caught.

     2) The load introduced on the system will probably be distributed
        more evenly over time.

Notes: This can be combined with configuration that specifies
       something like: "at most every 15 minutes or when changes
       occur".
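The "at most every 15 minutes" rate limit mentioned in the Notes could be realized as a small debounce check in the file daemon. This is an illustrative Python sketch with hypothetical names, not an inotify integration:

```python
import time

MIN_INTERVAL = 15 * 60  # seconds: "at most every 15 minutes"

class ChangeTrigger:
    """Decide whether a filesystem-change event should start a backup,
    enforcing a minimum interval between triggered backups."""

    def __init__(self, min_interval=MIN_INTERVAL, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock            # injectable for testing
        self.last_backup = None

    def on_change(self):
        """Called for each change event; returns True when a backup
        should be initiated now, False when the event is absorbed."""
        now = self.clock()
        if self.last_backup is None or now - self.last_backup >= self.min_interval:
            self.last_backup = now
            return True
        return False
```

Events arriving inside the interval are simply absorbed; the actual change notification would come from inotify (or a similar per-platform mechanism) feeding `on_change`.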
Item n: Message mailing based on backup types
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 6, 2006

What: In the "Messages" resource definitions, allow messages to be
      mailed based on the type (backup, restore, etc.) and level (full,
      differential, etc.) of the job that created the originating
      message.

Why: It would, for example, allow someone's boss to be emailed
     automatically only when a Full Backup job runs, so he can
     retrieve the tapes for offsite storage, even if the IT dept.
     doesn't (or can't) explicitly notify him. At the same time, his
     mailbox wouldn't be filled by notifications of Verifies, Restores,
     or Incremental/Differential Backups (which would likely be kept
     onsite).

     One way this could be done is through additional message types,
     for example:

       # email the boss only on full system backups
       Mail = boss@mycompany.com = full, !incremental, !differential,
              !restore
       # email us only when something breaks
       MailOnError = itdept@mycompany.com = all
Item n: Allow inclusion/exclusion of files in a fileset by
        creation/modification times
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 11, 2006

What: In the vein of the Wild and Regex directives in a FileSet's
      Options, it would be helpful to allow a user to include or
      exclude files and directories by creation or modification times.

      You could factor in the Exclude=yes|no option in much the same
      way it affects the Wild and Regex directives. For example, you
      could exclude all files modified before a certain date:

        Modified Before = ####

      Or you could exclude all files created/modified since a certain
      date:

        Created Modified Since = ####

      The format of the time/date could be done several ways, say the
      number of seconds since the epoch:

        1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`

      Or a human-readable date in a cryptic form:

        20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss

Why: I imagine a feature like this could have many uses. It would
     allow a user to do a full backup while excluding the base
     operating system files, so if I installed a Linux snapshot from a
     CD yesterday, I'll *exclude* all files modified *before* today.
     If I need to recover the system, I use the CD I already have, plus
     the tape backup. Or if, say, a Windows client is hit by a
     particularly corrosive virus, I need to *exclude* any files
     created/modified *since* the time of infection.

Notes: Of course, this feature would work in concert with other
       in/exclude rules, and wouldn't override them (or each other).

       The directives I'd imagine would be along the lines of
       "[Created] [Modified] [Before|Since] = <date>".
       So one could compare against 'ctime' and/or 'mtime', but ONLY
       'before' or 'since'.
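As an illustration of the matching logic only (not Bacula code, and with hypothetical function names), a "Modified Before" exclude test against a file's mtime, plus a parser for the cryptic YYYYMMDDhhmmss form, might look like:

```python
import os
import time

def excluded_modified_before(path, cutoff_epoch):
    """True if `path` would be excluded by a hypothetical
    'Exclude ... Modified Before' rule: its mtime is earlier than the
    cutoff, given in seconds since the epoch."""
    return os.stat(path).st_mtime < cutoff_epoch

def parse_cryptic(stamp):
    """Parse the YYYYMMDDhhmmss form mentioned above into an epoch
    value, interpreting it in the local timezone."""
    return time.mktime(time.strptime(stamp, "%Y%m%d%H%M%S"))
```

A real implementation would live in the FileSet option matching code and also honor 'ctime' for the "Created" variants.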
Item 1: Bacula support for a MailOnSuccess feature
Origin: Jaime Ventura <jaimeventura at ipp dot pt>
Date: 15 November 2006
Status: For 1.38.11: coded (patch attached), compiled, tested.
        For 1.39.28: coded (patch attached), compiled, NOT tested.

What: Be able to send an email message to a specified email address if
      (and only if) a job finishes successfully. It is similar to the
      MailOnError feature.

Why: The importance is about the same as that of the MailOnError
     feature. Since it is not possible to do this using Bacula's
     message-type (info, error, ...) filters, it could be done with
     some kind of filter applied right before the mail is sent. But
     since there is a MailOnError feature, why not have a MailOnSuccess
     feature?

     Why is it not possible to do this using Bacula's message types
     (info, error, ...)?

     Imagine I want Bacula to send ONLY successful job reports/messages
     to baculaOK@domain. When a job starts, Bacula sends the message:

       10-Nov 17:37 bserver-dir: Start Backup JobId 1605, Job=Job.GSI04.2006-11-10_17.37.30

     Since this is an info message (msgtype = M_INFO), Bacula's
     messaging system puts it on the job messages (jcr->jcr_msgs) to be
     sent to all destinations that have the info type enabled
     (including baculaOK@domain). But when/if the job fails, that
     message has already been queued for baculaOK@domain, even though
     it refers to an unsuccessful backup. So when it is time to send
     all messages by email, the messaging system sends that message to
     baculaOK@domain, but with the subject "Bacula ERROR", because the
     job terminated unsuccessfully.

     This "problem" could also happen if I wanted Bacula to send emails
     regarding unsuccessful backups and I didn't use the MailOnError
     feature. This feature is implemented so that messages queued for a
     success-only destination are discarded if the backup is
     unsuccessful.
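The queue-then-discard behavior described above can be modeled in a few lines. This is an illustrative sketch, not the attached patch; the class and function names are made up:

```python
# Model of MailOnSuccess: messages accumulate per destination during
# the job, and success-only destinations drop their queue on failure.

class Destination:
    def __init__(self, address, on_success_only=False):
        self.address = address
        self.on_success_only = on_success_only
        self.queue = []

def queue_message(destinations, msg):
    """Queue a job message for every destination (as the messaging
    system does for all destinations with the matching type enabled)."""
    for d in destinations:
        d.queue.append(msg)

def flush(destinations, job_ok):
    """Return (address, messages) pairs that would actually be mailed
    when the job ends, discarding success-only queues on failure."""
    sent = []
    for d in destinations:
        if d.on_success_only and not job_ok:
            d.queue.clear()       # discard, as the feature proposes
            continue
        if d.queue:
            sent.append((d.address, list(d.queue)))
            d.queue.clear()
    return sent
```

The key point matches the Why section: the "Start Backup" info message is queued before the job's outcome is known, so the filtering has to happen at flush time, not at queue time.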