3 Bacula Projects Roadmap
4 Status updated 3 January 2007
7 Item 1: Accurate restoration of renamed/deleted files
8 Item 2: Implement a Bacula GUI/management tool.
9 Item 3: Implement Base jobs.
10 Item 4: Implement from-client and to-client on restore command line.
11 Item 5: Implement creation and maintenance of copy pools
12 Item 6: Merge multiple backups (Synthetic Backup or Consolidation).
13 Item 8: Deletion of Disk-Based Bacula Volumes
14 Item 9: Implement a Python interface to the Bacula catalog.
15 Item 10: Archival (removal) of User Files to Tape
16 Item 11: Add Plug-ins to the FileSet Include statements.
17 Item 12: Implement more Python events in Bacula.
18 Item 13: Quick release of FD-SD connection after backup.
19 Item 14: Implement huge exclude list support using hashing.
20 Item 15: Allow skipping execution of Jobs
21 Item 16: Tray monitor window cleanups
22 Item 17: Split documentation
23 Item 18: Automatic promotion of backup levels
24 Item 19: Add an override in Schedule for Pools based on backup types.
25 Item 20: An option to operate on all pools with update vol parameters
26 Item 21: Include JobID in spool file name
27 Item 22: Include timestamp of job launch in "stat clients" output
28 Item 23: Message mailing based on backup types
29 Item 24: Allow inclusion/exclusion of files in a fileset by creation/mod times
30 Item 25: Add a scheduling syntax that permits weekly rotations
31 Item 26: Improve Bacula's tape and drive usage and cleaning management.
32 Item 27: Implement support for stacking arbitrary stream filters, sinks.
33 Item 28: Allow FD to initiate a backup
34 Item 29: Directive/mode to backup only file changes, not entire file
35 Item 30: Automatic disabling of devices
36 Item 31: Incorporation of XACML2/SAML2 parsing
37 Item 32: Clustered file-daemons
38 Item 33: Commercial database support
40 Item 35: Filesystem watch triggered backup.
41 Item 36: Implement multiple numeric backup levels as supported by dump
42 Item 37: Implement a server-side compression feature
43 Item 38: Cause daemons to use a specific IP address to source communications
44 Item 39: Multiple threads in file daemon for the same job
45 Item 40: Restore only file attributes (permissions, ACL, owner, group...)
46 Item 41: Add an item to the restore option where you can select a pool
48 Below, you will find more information on future projects:
50 Item 1: Accurate restoration of renamed/deleted files
51 Date: 28 November 2005
52 Origin: Martin Simmons (martin at lispworks dot com)
53 Status: Robert Nelson will implement this
55 What: When restoring a fileset for a specified date (including "most
56 recent"), Bacula should give you exactly the files and directories
57 that existed at the time of the last backup prior to that date.
59 Currently this only works if the last backup was a Full backup.
60 When the last backup was Incremental/Differential, files and
61 directories that have been renamed or deleted since the last Full
62 backup are not currently restored correctly. Ditto for files with
63 extra/fewer hard links than at the time of the last Full backup.
65 Why: Incremental/Differential would be much more useful if this worked.
67 Notes: Merging of multiple backups into a single one seems to
68 rely on this working, otherwise the merged backups will not be
69 truly equivalent to a Full backup.
71 Kern: notes shortened. This can be done without the need for
72 inodes. It is essentially the same as the current Verify job,
73 but one additional database record must be written, which does
74 not need any database change.
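
The restore-time selection described above can be sketched as follows. This is only an illustration: the `Job` class and its `present` field (the list of every path that existed at backup time) are assumptions made for the example, not Bacula's catalog schema or the record Kern has in mind.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    """Hypothetical catalog view of one backup job (not Bacula's schema)."""
    saved: dict                                 # path -> version saved by this job
    present: set = field(default_factory=set)   # every path that existed at backup time

def files_to_restore(jobs):
    """Return path -> version for the state at the last job in `jobs`.

    `jobs` is ordered oldest first: one Full followed by later
    Incremental/Differential jobs.
    """
    latest = {}
    for job in jobs:
        latest.update(job.saved)        # newer versions replace older ones
    survivors = jobs[-1].present        # paths that still existed at the last backup
    return {path: ver for path, ver in latest.items() if path in survivors}

full = Job(saved={"/a": 1, "/b": 1, "/c": 1}, present={"/a", "/b", "/c"})
incr = Job(saved={"/a": 2, "/d": 1}, present={"/a", "/c", "/d"})  # /b deleted, /d new
print(files_to_restore([full, incr]))
```

Here /b is dropped from the restore because it no longer existed at the time of the Incremental, which is the behaviour the request asks for.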
76 Kern: see if we can correct restoration of directories if
77 replace=ifnewer is set. Currently, if the directory does not
78 exist, a "dummy" directory is created, then when all the files
79 are updated, the dummy directory is newer so the real values
82 Item 2: Implement a Bacula GUI/management tool.
87 What: Implement a Bacula console, and management tools
88 probably using Qt3 and C++.
90 Why: Don't we already have a wxWidgets GUI? Yes, but
91 it is written in C++ and changes to the user interface
92 must be hand tailored using C++ code. By developing
93 the user interface using Qt designer, the interface
94 can be very easily updated and most of the new Python
95 code will be automatically created. The user interface
96 changes become very simple, and only the new features
97 must be implemented. In addition, the code will be in
98 Python, which will give many more users easy (or easier)
99 access to making additions or modifications.
101 Notes: There is a partial Python-GTK implementation
102 by Lucas Di Pentima <lucas at lunix dot com dot ar> but
103 it is no longer being developed.
106 Item 3: Implement Base jobs.
107 Date: 28 October 2005
111 What: A base job is sort of like a Full save except that you
112 will want the FileSet to contain only files that are
113 unlikely to change in the future (i.e. a snapshot of
114 most of your system after installing it). After the
115 base job has been run, when you are doing a Full save,
116 you specify one or more Base jobs to be used. All
117 files that have been backed up in the Base job/jobs but
118 not modified will then be excluded from the backup.
119 During a restore, the Base jobs will be automatically
120 pulled in where necessary.
122 Why: This is something none of the competition does, as far as
123 we know (except perhaps BackupPC, which is a Perl program that
124 saves to disk only). It is a big win for the user: it
125 makes Bacula stand out as offering a unique
126 optimization that immediately saves time and money.
127 Basically, imagine that you have 100 nearly identical
128 Windows or Linux machines containing the OS and user
129 files. Now for the OS part, a Base job will be backed
130 up once, and rather than making 100 copies of the OS,
131 there will be only one. If one or more of the systems
132 have some files updated, no problem, they will be
133 automatically restored.
135 Notes: Huge savings in tape usage even for a single machine.
136 Will require more resources because the DIR must send
137 FD a list of files/attribs, and the FD must search the
138 list and compare it for each file to be saved.
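
A minimal sketch of the comparison the FD would have to perform, assuming the DIR sends a mapping of path to attributes. The `(size, mtime)` tuple is an assumption for illustration, not the attribute set Bacula would actually compare.

```python
def files_needing_backup(client_files, base_files):
    """Return the paths that a Full save must still store.

    Both arguments map path -> (size, mtime). A file is skipped only when it
    appears in the Base job with identical attributes.
    """
    return sorted(
        path for path, attrs in client_files.items()
        if base_files.get(path) != attrs    # new file, or changed since the Base job
    )

base = {"/bin/ls": (100, 1000), "/etc/hosts": (50, 1000)}
client = {"/bin/ls": (100, 1000), "/etc/hosts": (60, 2000), "/home/u/x": (5, 3000)}
print(files_needing_backup(client, base))
```

Using a dictionary for the Base list keeps each lookup cheap, which matters because the list is searched once for every file to be saved.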
140 Item 4: Implement from-client and to-client on restore command line.
141 Date: 11 December 2006
142 Origin: Discussion on Bacula-users entitled 'Scripted restores to
143 different clients', December 2006
144 Status: New feature request
146 What: While using bconsole interactively, you can specify the client
147 that a backup job is to be restored for, and then you can
148 specify later a different client to send the restored files
149 back to. However, using the 'restore' command with all options
150 on the command line, this cannot be done, due to the ambiguous
151 'client' parameter. Additionally, this parameter means different
152 things depending on if it's specified on the command line or
153 afterwards, in the Modify Job screens.
155 Why: This feature would enable restore jobs to be more completely
156 automated, for example by a web or GUI front-end.
158 Notes: client can also be implied by specifying the jobid on the command
161 Item 5: Implement creation and maintenance of copy pools
162 Date: 27 November 2005
163 Origin: David Boyes (dboyes at sinenomine dot net)
166 What: I would like Bacula to have the capability to write copies
167 of backed-up data on multiple physical volumes selected
168 from different pools without transferring the data
169 multiple times, and to accept any of the copy volumes
170 as valid for restore.
172 Why: In many cases, businesses are required to keep offsite
173 copies of backup volumes, or just wish for simple
174 protection against a human operator dropping a storage
175 volume and damaging it. The ability to generate multiple
176 volumes in the course of a single backup job allows
177 customers to simply check out one copy and send it
178 offsite, marking it as out of changer or otherwise
179 unavailable. Currently, the library and magazine
180 management capability in Bacula does not make this process
183 Restores would use the copy of the data on the first
184 available volume, in order of copy pool chain definition.
186 This is also a major scalability issue -- as the number of
187 clients increases beyond several thousand, and the volume
188 of data increases, transferring the data multiple times to
189 produce additional copies of the backups will become
190 physically impossible due to transfer speed
191 issues. Generating multiple copies at server side will
192 become the only practical option.
194 How: I suspect that this will require adding a multiplexing
195 SD that appears to be a SD to a specific FD, but 1-n FDs
196 to the specific back end SDs managing the primary and copy
197 pools. Storage pools will also need to acquire parameters
198 to define the pools to be used for copies.
200 Notes: I would commit some of my developers' time if we can agree
201 on the design and behavior.
203 Item 6: Merge multiple backups (Synthetic Backup or Consolidation).
204 Origin: Marc Cousin and Eric Bollengier
205 Date: 15 November 2005
206 Status: Awaiting implementation. Depends on first implementing
207 project Item 2 (Migration) which is now done.
209 What: A merged backup is a backup made without connecting to the Client.
210 It would be a Merge of existing backups into a single backup.
211 In effect, it is like a restore but to the backup medium.
213 For instance, say that last Sunday we made a full backup. Then
214 all week long, we created incremental backups, in order to do
215 them fast. Now comes Sunday again, and we need another full.
216 The merged backup makes it possible to do instead an incremental
217 backup (during the night for instance), and then create a merged
218 backup during the day, by using the full and incrementals from
219 the week. The merged backup will be exactly like a full made
220 Sunday night on the tape, but the production interruption on the
221 Client will be minimal, as the Client will only have to send
224 In fact, if it's done correctly, you could merge all the
225 Incrementals into a single Incremental, or all the Incrementals
226 and the last Differential into a new Differential, or the Full,
227 last differential and all the Incrementals into a new Full
228 backup. And there is no need to involve the Client.
230 Why: The benefits are that:
231 - the Client just does an incremental;
232 - the merged backup on tape is just like a single full backup,
233 and can be restored very fast.
235 This is also a way of reducing the backup data since the old
236 data can then be pruned (or not) from the catalog, possibly
237 allowing older volumes to be recycled.
239 Item 8: Deletion of Disk-Based Bacula Volumes
241 Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
245 What: Provide a way for Bacula to automatically remove Volumes
246 from the filesystem, or optionally to truncate them.
247 Obviously, the Volume must be pruned prior to removal.
249 Why: This would allow users more control over their Volumes and
250 prevent disk based volumes from consuming too much space.
252 Notes: The following two directives might do the trick:
254 Volume Data Retention = <time period>
255 Remove Volume After = <time period>
257 The migration project should also remove a Volume that is
258 migrated. This might also work for tape Volumes.
260 Item 9: Implement a Python interface to the Bacula catalog.
261 Date: 28 October 2005
265 What: Implement an interface for Python scripts to access
266 the catalog through Bacula.
268 Why: This will permit users to customize Bacula through
271 Item 10: Archival (removal) of User Files to Tape
275 Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
278 What: The ability to archive data to storage based on certain parameters
279 such as age, size, or location. Once the data has been written to
280 storage and logged it is then pruned from the originating
281 filesystem. Note! We are talking about user's files and not
284 Why: This would allow fully automatic storage management which becomes
285 useful for large datastores. It would also allow for auto-staging
286 from one media type to another.
288 Example 1) Medical imaging needs to store large amounts of data.
289 They decide to keep data on their servers for 6 months and then put
290 it away for long term storage. The server then finds all files
291 older than 6 months and writes them to tape. The files are then removed
294 Example 2) All data that hasn't been accessed in 2 months could be
295 moved from high-cost, fibre-channel disk storage to a low-cost
296 large-capacity SATA disk storage pool with slower
297 access times. Then after another 6 months (or possibly as one
298 storage pool gets full) data is migrated to Tape.
300 Item 11: Add Plug-ins to the FileSet Include statements.
301 Date: 28 October 2005
303 Status: Partially coded in 1.37 -- much more to do.
305 What: Allow users to specify wild-card and/or regular
306 expressions to be matched in both the Include and
307 Exclude directives in a FileSet. At the same time,
308 allow users to define plug-ins to be called (based on
309 regular expression/wild-card matching).
311 Why: This would give the users the ultimate ability to control
312 how files are backed up/restored. A user could write a
313 plug-in that knows how to back up his Oracle database without
314 stopping/starting it, for example.
316 Item 12: Implement more Python events in Bacula.
317 Date: 28 October 2005
321 What: Allow Python scripts to be called at more places
322 within Bacula and provide additional access to Bacula
325 Why: This will permit users to customize Bacula through
333 Also add a way to get a listing of currently running
334 jobs (possibly also scheduled jobs).
337 Item 13: Quick release of FD-SD connection after backup.
338 Origin: Frank Volf (frank at deze dot org)
339 Date: 17 November 2005
342 What: In the Bacula implementation a backup is finished after all data
343 and attributes are successfully written to storage. When using a
344 tape backup it is very annoying that a backup can take a day,
345 simply because the current tape (or whatever) is full and the
346 administrator has not put a new one in. During that time the
347 system cannot be taken off-line, because there is still an open
348 session between the storage daemon and the file daemon on the
351 Although this is a very good strategy for making "safe backups",
352 this can be annoying for e.g. laptops, which must remain
353 connected until the backup is completed.
355 Using a new feature called "migration" it will be possible to
356 spool first to harddisk (using a special 'spool' migration
357 scheme) and then migrate the backup to tape.
359 There is still the problem of getting the attributes committed.
360 If it takes a very long time to do, with the current code, the
361 job has not terminated, and the File daemon is not freed up. The
362 Storage daemon should release the File daemon as soon as all the
363 file data and all the attributes have been sent to it (the SD).
364 Currently the SD waits until everything is on tape and all the
365 attributes are transmitted to the Director before signaling
366 completion to the FD. I don't think I would have any problem
367 changing this. The reason is that even if the FD reports back to
368 the Dir that all is OK, the job will not terminate until the SD
369 has done the same thing -- so in a way keeping the SD-FD link
370 open to the very end is not really very productive ...
372 Why: Makes backup of laptops much faster.
376 Item 14: Implement huge exclude list support using hashing.
377 Date: 28 October 2005
381 What: Allow users to specify a very large exclude list (currently
382 more than about 1000 files is too many).
384 Why: This would give the users the ability to exclude all
385 files that are loaded with the OS (e.g. using rpms
386 or debs). If the user can restore the base OS from
387 CDs, there is no need to backup all those files. A
388 complete restore would be to restore the base OS, then
389 do a Bacula restore. By excluding the base OS files, the
390 backup set will be *much* smaller.
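
As a rough illustration of why hashing helps, the sketch below keeps the exclude list in a hash-based set, so each lookup costs about the same whether the list holds 1,000 or 100,000 entries. The path names are made up for the example, and this is not how the File daemon currently stores its exclude list.

```python
def build_exclude_set(paths):
    """Build a hash-based set so each membership test is O(1) on average."""
    return frozenset(paths)

def filter_excluded(candidates, excluded):
    """Return only the candidate paths that are not in the exclude set."""
    return [p for p in candidates if p not in excluded]

# 100,000 excluded paths, e.g. generated from the package database.
excluded = build_exclude_set(f"/usr/lib/pkg/file{i}" for i in range(100_000))
print(filter_excluded(["/usr/lib/pkg/file42", "/home/user/notes.txt"], excluded))
```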
393 Item 15: Allow skipping execution of Jobs
394 Date: 29 November 2005
395 Origin: Florian Schnabel <florian.schnabel at docufy dot de>
398 What: An easy option to skip a certain job on a certain date.
399 Why: You could then easily skip tape backups on holidays. This would be
400 especially handy if you have no autochanger and can only fit one
401 backup on a tape: other jobs could proceed normally and you would
402 not get errors.
405 Item 16: Tray monitor window cleanups
406 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
409 What: Resizeable and scrollable windows in the tray monitor.
411 Why: With multiple clients, or with many jobs running, the displayed
412 window often ends up larger than the available screen, making
413 the trailing items difficult to read.
416 Item 17: Split documentation
417 Origin: Maxx <maxxatwork at gmail dot com>
421 What: Split documentation in several books
423 Why: The Bacula manual now has more than 600 pages, and looking for
424 implementation details is getting complicated. I think
425 it would be good to split the single volume in two or
428 1) Introduction, requirements and tutorial, typically
429 are useful only until first installation time
431 2) Basic installation and configuration, with all the
432 gory details about the directives supported
433 3) Advanced Bacula: testing, troubleshooting, GUI and
434 ancillary programs, security management, scripting,
439 Item 18: Automatic promotion of backup levels
440 Date: 19 January 2006
441 Origin: Adam Thornton <athornton@sinenomine.net>
444 What: Amanda has a feature whereby it estimates the space that a
445 differential, incremental, and full backup would take. If the
446 difference in space required between the scheduled level and the next
447 level up is beneath some user-defined critical threshold, the backup
448 level is bumped to the next type. Doing this minimizes the number of
449 volumes necessary during a restore, with a fairly minimal cost in
452 Why: I know at least one (quite sophisticated and smart) user
453 for whom the absence of this feature is a deal-breaker in terms of
454 using Bacula; if we had it, it would eliminate the one cool thing
455 Amanda can do and we can't (at least, the one cool thing I know of).
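
The promotion rule in the What section could look roughly like this. The level ordering, the byte estimates and the single-step promotion are assumptions made for the sketch; how Bacula would obtain the estimates is not covered here.

```python
LEVELS = ["Incremental", "Differential", "Full"]

def promote_level(scheduled, estimates, threshold):
    """Return the level to run, bumping one step up when it is cheap to do so.

    `estimates` maps level name -> estimated bytes. The level is promoted
    when the extra space needed by the next level up is below `threshold`.
    """
    i = LEVELS.index(scheduled)
    if i + 1 < len(LEVELS):
        next_level = LEVELS[i + 1]
        if estimates[next_level] - estimates[scheduled] < threshold:
            return next_level
    return scheduled

estimates = {"Incremental": 900, "Differential": 950, "Full": 5000}
print(promote_level("Incremental", estimates, threshold=100))
print(promote_level("Differential", estimates, threshold=100))
```

With these numbers the Incremental is promoted to a Differential (50 extra bytes), while the Differential stays as scheduled because a Full would cost far more than the threshold.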
458 Item 19: Add an override in Schedule for Pools based on backup types.
460 Origin: Chad Slater <chad.slater@clickfox.com>
463 What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
464 would help those of us who use different storage devices for different
465 backup levels cope with the "auto-upgrade" of a backup.
467 Why: Assume I add several new devices to be backed up, i.e. several
468 hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
469 stored in a disk set on a 2TB RAID. If you add these devices in the
470 middle of the month, the incrementals are upgraded to "full" backups,
471 but they try to use the same storage device as requested in the
472 incremental job, filling up the RAID holding the differentials. If we
473 could override the Storage parameter for full and/or differential
474 backups, then the Full job would use the proper Storage device, which
475 has more capacity (i.e. an 8TB tape library).
477 Item 20: An option to operate on all pools with update vol parameters
478 Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
482 What: When I do update -> Volume parameters -> All Volumes
483 from Pool, then I have to select pools one by one. I'd like
484 console to have an option like "0: All Pools" in the list of
487 Why: I have many pools and am therefore unhappy with manually
488 updating each of them using update -> Volume parameters -> All
489 Volumes from Pool -> pool #.
493 Item 21: Include JobID in spool file name
494 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
495 Date: Tue Aug 22 17:13:39 EDT 2006
496 Status: Ok (patches/testing/project-include-jobid-in-spool-name.patch)
498 What: Change the name of the spool file to include the JobID
500 Why: JobIDs are the common key used to refer to jobs, yet the
501 spoolfile name doesn't include that information. The date/time
502 stamp is useful (and should be retained).
506 Item 22: Include timestamp of job launch in "stat clients" output
507 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
508 Date: Tue Aug 22 17:13:39 EDT 2006
511 What: The "stat clients" command doesn't include any detail on when
512 the active backup jobs were launched.
514 Why: Including the timestamp would make it much easier to decide whether
515 a job is running properly.
517 Notes: It may be helpful to have the output from "stat clients" formatted
518 more like that from "stat dir" (and other commands), in a column
519 format. The per-client information that's currently shown (level,
520 client name, JobId, Volume, pool, device, Files, etc.) is good, but
521 somewhat hard to parse (both programmatically and visually),
522 particularly when there are many active clients.
526 Item 23: Message mailing based on backup types
527 Origin: Evan Kaufman <evan.kaufman@gmail.com>
528 Date: January 6, 2006
531 What: In the "Messages" resource definitions, allowing messages
532 to be mailed based on the type (backup, restore, etc.) and level
533 (full, differential, etc) of job that created the originating
536 Why: It would, for example, allow someone's boss to be emailed
537 automatically only when a Full Backup job runs, so he can
538 retrieve the tapes for offsite storage, even if the IT dept.
539 doesn't (or can't) explicitly notify him. At the same time, his
540 mailbox wouldn't be filled by notifications of Verifies, Restores,
541 or Incremental/Differential Backups (which would likely be kept
544 Notes: One way this could be done is through additional message types, for example:
547 # email the boss only on full system backups
548 Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
550 # email us only when something breaks
551 MailOnError = itdept@mycompany.com = all
555 Item 24: Allow inclusion/exclusion of files in a fileset by creation/mod times
556 Origin: Evan Kaufman <evan.kaufman@gmail.com>
557 Date: January 11, 2006
560 What: In the vein of the Wild and Regex directives in a Fileset's
561 Options, it would be helpful to allow a user to include or exclude
562 files and directories by creation or modification times.
564 You could factor the Exclude=yes|no option in much the same way it
565 affects the Wild and Regex directives. For example, you could exclude
566 all files modified before a certain date:
570 Modified Before = ####
573 Or you could exclude all files created/modified since a certain date:
577 Created Modified Since = ####
580 The format of the time/date could be done several ways, say the number
581 of seconds since the epoch:
582 1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`
584 Or a human readable date in a cryptic form:
585 20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss
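
Both proposed formats are easy to parse, as the sketch below suggests. The function names are hypothetical, and treating the 14-digit form as UTC is an assumption made so the result is reproducible; the examples above are written in local time.

```python
from datetime import datetime, timezone

def parse_cutoff(value):
    """Return a timezone-aware UTC datetime for either proposed format.

    A 14-digit value is read as YYYYMMDDhhmmss; anything else is read as
    seconds since the epoch.
    """
    if len(value) == 14 and value.isdigit():
        return datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return datetime.fromtimestamp(int(value), tz=timezone.utc)

def modified_before(mtime_epoch, cutoff):
    """True if a file last modified at `mtime_epoch` falls before the cutoff."""
    return datetime.fromtimestamp(mtime_epoch, tz=timezone.utc) < cutoff

cutoff = parse_cutoff("20060111134233")
print(cutoff.isoformat())
print(modified_before(1136073600, cutoff))   # file modified 2006-01-01 00:00 UTC
```

A real implementation would have to decide how to tell a 14-digit date from a very large epoch value; a distinct directive or an explicit prefix would avoid the ambiguity.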
587 Why: I imagine a feature like this could have many uses. It would
588 allow a user to do a full backup while excluding the base operating
589 system files, so if I installed a Linux snapshot from a CD yesterday,
590 I'll *exclude* all files modified *before* today. If I need to
591 recover the system, I use the CD I already have, plus the tape backup.
592 Or if, say, a Windows client is hit by a particularly corrosive
593 virus, and I need to *exclude* any files created/modified *since* the
596 Notes: Of course, this feature would work in concert with other
597 in/exclude rules, and wouldn't override them (or each other).
599 Notes: The directives I'd imagine would be along the lines of
600 "[Created] [Modified] [Before|Since] = <date>".
601 So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
605 Item 25: Add a scheduling syntax that permits weekly rotations
606 Date: 15 December 2006
607 Origin: Gregory Brauer (greg at wildbrain dot com)
610 What: Currently, Bacula only understands how to deal with weeks of the
611 month or weeks of the year in schedules. This makes it impossible
612 to do a true weekly rotation of tapes. There will always be a
613 discontinuity that will require disruptive manual intervention at
614 least monthly or yearly because week boundaries never align with
615 month or year boundaries.
617 A solution would be to add a new syntax that defines (at least)
618 a start timestamp, and repetition period.
620 Why: Rotated backups done at weekly intervals are useful, and Bacula
621 cannot currently do them without extensive hacking.
623 Notes: Here is an example syntax showing a 3-week rotation where full
624 backups would be performed every week on Saturday, and an
625 incremental would be performed every week on Tuesday. Each
626 set of tapes could be removed from the loader for the following
627 two cycles before coming back and being reused on the third
628 week. Since the execution times are determined by intervals
629 from a given point in time, there will never be any issues with
630 having to adjust to any sort of arbitrary time boundary. In
631 the example provided, I even define the starting schedule
632 as crossing both a year and a month boundary, but the run times
633 would be based on the "Repeat" value and would therefore happen
638 Name = "Week 1 Rotation"
639 #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
643 Start = 2006-12-30 01:00
647 #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
651 Start = 2007-01-02 01:00
658 Name = "Week 2 Rotation"
659 #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
663 Start = 2007-01-06 01:00
667 #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
671 Start = 2007-01-09 01:00
678 Name = "Week 3 Rotation"
679 #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
683 Start = 2007-01-13 01:00
687 #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
691 Start = 2007-01-16 01:00
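
The run dates quoted in the comments above follow directly from a start timestamp plus a fixed repetition period. The sketch below assumes a repeat interval of three weeks, as implied by those comments; the function name is made up for the example.

```python
from datetime import datetime, timedelta

def run_times(start, repeat, count):
    """Return the first `count` run times: start, start + repeat, and so on."""
    return [start + i * repeat for i in range(count)]

# "Week 1 Rotation", Saturday job: Start = 2006-12-30 01:00, repeat every 3 weeks.
start = datetime(2006, 12, 30, 1, 0)
for when in run_times(start, timedelta(weeks=3), 3):
    print(when.isoformat(sep=" ", timespec="minutes"))
```

Because each run is computed as an offset from the start time, month and year boundaries never enter the calculation.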
698 Item 26: Improve Bacula's tape and drive usage and cleaning management.
699 Date: 8 November 2005, 11 November 2005
700 Origin: Adam Thornton <athornton at sinenomine dot net>,
701 Arno Lehmann <al at its-lehmann dot de>
704 What: Make Bacula manage tape life cycle information, tape reuse
705 times and drive cleaning cycles.
707 Why: All three parts of this project are important when operating
709 We need to know which tapes need replacement, and we need to
710 make sure the drives are cleaned when necessary. While many
711 tape libraries and even autoloaders can handle all this
712 automatically, support by Bacula can be helpful for smaller
713 (older) libraries and single drives. Limiting the number of
714 times a tape is used might prevent tape errors when using
715 tapes until the drives can't read them any more. Also, checking
716 drive status during operation can prevent some failures (as I
717 [Arno] had to learn the hard way...)
719 Notes: First, Bacula could (and even does, to some limited extent)
720 record tape and drive usage. For tapes, the number of mounts,
721 the amount of data, and the time the tape has actually been
722 running could be recorded. Data fields for Read and Write
723 time and Number of mounts already exist in the catalog (I'm
724 not sure if VolBytes is the sum of all bytes ever written to
725 that volume by Bacula). This information can be important
726 when determining which media to replace. The ability to mark
727 Volumes as "used up" after a given number of write cycles
728 should also be implemented so that a tape is never actually
729 worn out. For the tape drives known to Bacula, similar
730 information is interesting to determine the device status and
731 expected life time: Time it's been Reading and Writing, number
732 of tape Loads / Unloads / Errors. This information is not yet
733 recorded as far as I [Arno] know. A new volume status would
734 be necessary for the new state, like "Used up" or "Worn out".
735 Volumes with this state could be used for restores, but not
736 for writing. These volumes should be migrated first (assuming
737 migration is implemented) and, once they are no longer needed,
738 could be moved to a Trash pool.
740 The next step would be to implement a drive cleaning setup.
741 Bacula already has knowledge about cleaning tapes. Once it
742 has some information about cleaning cycles (measured in drive
743 run time, number of tapes used, or calendar days, for example)
744 it can automatically execute tape cleaning (with an
745 autochanger, obviously) or ask for operator assistance loading
748 The final step would be to implement TAPEALERT checks not only
749 when changing tapes and only sending the information to the
750 administrator, but rather checking after each tape error,
751 checking on a regular basis (for example after each tape
752 file), and also before unloading and after loading a new tape.
753 Then, depending on the drive's TAPEALERT state and the known
754 drive cleaning state Bacula could automatically schedule later
755 cleaning, clean immediately, or inform the operator.
757 Implementing this would perhaps require another catalog change
758 and perhaps major changes in SD code and the DIR-SD protocol,
759 so I'd only consider this worth implementing if it would
760 actually be used or even needed by many people.
762 Implementation of these projects could happen in three distinct
763 sub-projects: Measuring Tape and Drive usage, retiring
764 volumes, and handling drive cleaning and TAPEALERTs.
766 Item 27: Implement support for stacking arbitrary stream filters, sinks.
767 Date: 23 November 2006
768 Origin: Landon Fuller <landonf@threerings.net>
769 Status: Planning. Assigned to landonf.
771 What: Implement support for the following:
772 - Stacking arbitrary stream filters (eg, encryption, compression,
773 sparse data handling)
774 - Attaching file sinks to terminate stream filters (ie, write out
775 the resultant data to a file)
776 - Refactor the restoration state machine accordingly
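
To make the idea of stacked filters and sinks concrete, here is a small sketch in Python. The class names are hypothetical, the XOR step is only a stand-in for decryption, and none of this reflects the File daemon's actual code; it merely shows each stage keeping its own state and passing data to the next one.

```python
import zlib

class FileSink:
    """Terminates a filter stack by collecting the final bytes."""
    def __init__(self):
        self.chunks = []
    def write(self, data):
        self.chunks.append(data)
    def close(self):
        pass

class DecompressFilter:
    """Inflates zlib data and passes the result to the next stage."""
    def __init__(self, downstream):
        self.downstream = downstream
        self.inflater = zlib.decompressobj()    # state is per stream, not global
    def write(self, data):
        self.downstream.write(self.inflater.decompress(data))
    def close(self):
        self.downstream.write(self.inflater.flush())
        self.downstream.close()

class XorFilter:
    """Stand-in for a decryption filter (XOR is NOT real encryption)."""
    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key
    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))
    def close(self):
        self.downstream.close()

# Restore path: "decrypt", then decompress, then write to the sink.
sink = FileSink()
stack = XorFilter(DecompressFilter(sink), key=0x5A)

stored = bytes(b ^ 0x5A for b in zlib.compress(b"hello bacula"))
stack.write(stored[:5])      # data may arrive in arbitrary chunks
stack.write(stored[5:])
stack.close()
print(b"".join(sink.chunks))
```

Adding or removing a stage only changes how the stack is assembled, so no code has to enumerate every combination of filters.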
778 Why: The existing stream implementation suffers from the following:
779 - All state (compression, encryption, stream restoration), is
780 global across the entire restore process, for all streams. There are
781 multiple entry and exit points in the restoration state machine, and
782 thus multiple places where state must be allocated, deallocated,
783 initialized, or reinitialized. This results in exceptional complexity
784 for the author of a stream filter.
785 - The developer must enumerate all possible combinations of filters
786 and stream types (ie, win32 data with encryption, without encryption,
787 with encryption AND compression, etc).
789 Notes: This feature request only covers implementing the stream filters/
790 sinks, and refactoring the file daemon's restoration implementation
791 accordingly. If I have extra time, I will also rewrite the backup
792 implementation. My intent in implementing the restoration first is to
793 solve pressing bugs in the restoration handling, and to ensure that
794 the new restore implementation handles existing backups correctly.
796 I do not plan on changing the network or tape data structures to
797 support defining arbitrary stream filters, but supporting that
798 functionality is the ultimate goal.
800 Assistance with either code or testing would be fantastic.
802 Item 28: Allow FD to initiate a backup
803 Origin: Frank Volf (frank at deze dot org)
804 Date: 17 November 2005
807 What: Provide some means, possibly by a restricted console that
808 allows a FD to initiate a backup, and that uses the connection
809 established by the FD to the Director for the backup so that
810 a Director that is firewalled can do the backup.
812 Why: Makes backup of laptops much easier.
814 Item 29: Directive/mode to backup only file changes, not entire file
815 Date: 11 November 2005
816 Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
817 Marek Bajon <mbajon at bimsplus dot com dot pl>
820 What: Currently when a file changes, the entire file will be backed up in
821 the next incremental or full backup. To save space on the tapes
822 it would be nice to have a mode whereby only the changes to the
823 file would be backed up when it is changed.
825 Why: This would save lots of space when backing up large files such as
826 logs, mbox files, Outlook PST files and the like.
828 Notes: This would require the usage of disk-based volumes as comparing
829 files would not be feasible using a tape drive.
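One possible way to detect changes without re-reading the previous copy from the volume is to keep per-block digests from the last backup and compare against them. This is an assumption for illustration only, not something the request proposes; the block size and the function names below are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return (index, block) pairs for blocks that differ from the old digests."""
    changes = []
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(old_digests) or old_digests[index] != digest:
            start = index * block_size
            changes.append((index, new_data[start:start + block_size]))
    return changes
```

Fixed-size blocks suit files that mostly grow at the end, such as logs and mbox files. The sketch does not record truncation, and an insertion near the start of a file would shift every later block, so a real implementation would need more than this.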
831 Item 30: Automatic disabling of devices
833 Origin: Peter Eriksson <peter at ifm.liu dot se>
836 What: After a configurable number of fatal errors with a tape drive
837 Bacula should automatically disable further use of a certain
838 tape drive. There should also be "disable"/"enable" commands in
841 Why: On a multi-drive jukebox there is a possibility of tape drives
842 going bad during large backups (needing a cleaning tape run,
843 tapes getting stuck). It would be advantageous if Bacula would
844 automatically disable further use of a problematic tape drive
845 after a configurable number of errors has occurred.
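The behaviour described above can be sketched as a small error counter per drive. This is only an illustration under assumed names (DriveHealth, max_fatal_errors, pick_drive); it does not reflect how the Storage daemon is actually structured.

```python
class DriveHealth:
    """Tracks fatal errors per drive and disables a drive at a threshold."""

    def __init__(self, max_fatal_errors):
        self.max_fatal_errors = max_fatal_errors
        self.errors = {}
        self.disabled = set()

    def record_fatal_error(self, drive):
        """Count one fatal error; disable the drive once the limit is reached."""
        self.errors[drive] = self.errors.get(drive, 0) + 1
        if self.errors[drive] >= self.max_fatal_errors:
            self.disabled.add(drive)

    def disable(self, drive):
        """Manual disable, as an operator command might request."""
        self.disabled.add(drive)

    def enable(self, drive):
        """Manual re-enable, e.g. after an operator has fixed the drive."""
        self.disabled.discard(drive)
        self.errors[drive] = 0

    def pick_drive(self, drives):
        """Return the first usable drive, or None if all are disabled."""
        for drive in drives:
            if drive not in self.disabled:
                return drive
        return None
```

With a limit of 2, a second fatal error on one drive makes pick_drive skip it and hand new jobs to the next drive in the list until an operator calls enable.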
847 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
848 where tapes occasionally get stuck inside the drive. Bacula will
849 notice that the "mtx-changer" command will fail and then fail
850 any backup jobs trying to use that drive. However, it will still
851 keep on trying to run new jobs using that drive and fail -
852 forever, and thus failing lots and lots of jobs... Since we have
853 many drives Bacula could have just automatically disabled
854 further use of that drive and used one of the other ones
857 Item 31: Incorporation of XACML2/SAML2 parsing
858 Date: 19 January 2006
859 Origin: Adam Thornton <athornton@sinenomine.net>
862 What: XACML is "eXtensible Access Control Markup Language" and
863 SAML is the "Security Assertion Markup Language"--an XML standard
864 for making statements about identity and authorization. Having these
865 would give us a framework to approach ACLs in a generic manner, and
866 in a way flexible enough to support the four major sorts of ACLs I
867 see as a concern to Bacula at this point, as well as (probably) to
868 deal with new sorts of ACLs that may appear in the future.
870 Why: Bacula is beginning to need to back up systems with ACLs
871 that do not map cleanly onto traditional Unix permissions. I see
872 four sets of ACLs--in general, mutually incompatible with one
873 another--that we're going to need to deal with. These are: NTFS
874 ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
875 relevance of AFS; AFS is one of Sine Nomine's core consulting
876 businesses, and having a reputable file-level backup and restore
877 technology for it (as Tivoli is probably going to drop AFS support
878 soon since IBM no longer supports AFS) would be of huge benefit to
879 our customers; we'd most likely create the AFS support at Sine Nomine
880 for inclusion into the Bacula core code (and perhaps make some changes
881 to the OpenAFS volserver).)
883 Now, obviously, Bacula already handles NTFS just fine. However, I
884 think there's a lot of value in implementing a generic ACL model, so
885 that it's easy to support whatever particular instances of ACLs come
886 down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
887 things arriving in the Linux world in a big way in the near future.
888 XACML, although overcomplicated for our needs, provides this
889 framework, and we should be able to leverage other people's
890 implementations to minimize the amount of work *we* have to do to get
891 a generic ACL framework. Basically, the costs of implementation are
892 high, but they're largely both external to Bacula and already sunk.
895 Item 32: Clustered file-daemons
896 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
899 What: A "virtual" filedaemon, which is actually a cluster of real ones.
901 Why: In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
902 multiple machines may have access to the same set of filesystems.
904 For performance reasons, one may wish to initiate backups from
905 several of these machines simultaneously, instead of just using
906 one backup source for the common clustered filesystem.
908 For obvious reasons, normally backups of $A-FD/$PATH and
909 $B-FD/$PATH are treated as different backup sets. In this case
910 they are the same communal set.
912 Likewise when restoring, it would be easier to just specify
913 one of the cluster machines and let Bacula decide which to use.
915 This can be faked to some extent using DNS round robin entries
916 and a virtual IP address, however it means "status client" will
917 always give bogus answers. Additionally there is no way of
918 spreading the load evenly among the servers.
920 What is required is something similar to the storage daemon
921 autochanger directives, so that Bacula can keep track of
922 operating backups/restores and direct new jobs to a "free"
927 Item 33: Commercial database support
928 Origin: Russell Howe <russell_howe dot wreckage dot org>
932 What: It would be nice for the database backend to support more
933 databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
934 DB2, MaxDB, etc are all candidates. SQL Server would presumably be
935 implemented using FreeTDS or maybe an ODBC library?
937 Why: We only really have one database server, which is MS SQL Server
938 2000. Maintaining a second one for the backup software is a burden (we
939 grew out of SQLite, which I liked, but which didn't work so well with our
940 database size). We don't really have a machine with the resources to run
941 PostgreSQL, and would rather only maintain a single DBMS. We're stuck with
942 SQL Server because pretty much all the company's custom applications
943 (written by consultants) are locked into SQL Server 2000. I can imagine
944 this scenario is fairly common, and it would be nice to use the existing
945 properly specced database server for storing Bacula's catalog, rather
946 than having to run a second DBMS.
949 Item 34: Archive data
951 Origin: calvin streeting calvin at absentdream dot com
954 What: The ability to archive to media (DVD/CD) in an uncompressed format
955 for dead filing (archiving, not backing up).
957 Why: At my workplace, when jobs are finished they are moved off of the main
958 file servers (RAID-based systems) onto a simple Linux file server (IDE-based
959 system) so users can find old information without contacting the IT
962 So this data doesn't really change; it only gets added to.
963 But it also needs backing up. At the moment it takes
964 about 8 hours to back up our servers (working data), so
965 rather than add more time to existing backups I am trying
966 to implement a system where we back up the archive data to
967 CD/DVD. These disks would only need to be appended to
968 (burn only new/changed files to new disks for off-site
969 storage). Basically: understand the difference between
970 archive data and live data.
972 Notes: - Scan the data and email me when it needs burning.
973 - Divide it into predefined chunks.
974 - Keep a record of what is on which disk.
975 - Make me a label (simple php->mysql=>pdf stuff); I could do this bit.
976 - Ability to save data uncompressed so it can be read in any other
976 system (future-proof data).
977 - Save the catalog with the disk as some kind of menu
980 Item 35: Filesystem watch triggered backup.
982 Origin: Jesper Krogh <jesper@krogh.cc>
983 Status: Unimplemented, depends probably on "client initiated backups"
985 What: With inotify and similar filesystem-triggered notification
986 systems, it is possible to have the file daemon monitor
987 filesystem changes and initiate a backup.
989 Why: There are 2 situations where this is nice to have.
990 1) It is possible to get a much finer-grained backup than
991 the fixed schedules used now. A file created and deleted
992 a few hours later, can automatically be caught.
994 2) The introduced load on the system will probably be
995 distributed more evenly on the system.
997 Notes: This can be combined with configuration that specifies
998 something like: "at most every 15 minutes or when changes
1001 Kern Notes: I would rather see this implemented by an external program
1002 that monitors the Filesystem changes, then uses the console
1003 to start the appropriate job.
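A sketch of the rate limiting such an external monitor would need, reading the note above as "run at most once per interval, and only when something has changed". That reading is an assumption, since the note is cut off. The class and method names are hypothetical, and the code that watches the filesystem and talks to the console is deliberately left out.

```python
class BackupThrottle:
    """Decides when a change-triggered backup may start."""

    def __init__(self, min_interval_seconds):
        self.min_interval = min_interval_seconds
        self.last_run = None   # time of the last started backup, if any
        self.pending = False   # True once a change has been seen

    def notify_change(self):
        """Called by the filesystem watcher whenever something changed."""
        self.pending = True

    def should_run(self, now):
        """Return True if a backup should start at time now (in seconds)."""
        if not self.pending:
            return False
        if self.last_run is not None and now - self.last_run < self.min_interval:
            return False
        self.last_run = now
        self.pending = False
        return True
```

The monitor would call notify_change from its event handler and poll should_run with the current time; when it returns True, the monitor starts the job through the console.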
1005 Item 36: Implement multiple numeric backup levels as supported by dump
1007 Origin: Daniel Rich <drich@employees.org>
1009 What: Dump allows specification of backup levels numerically instead of just
1010 "full", "incr", and "diff". In this system, at any given level, all
1011 files are backed up that were modified since the last backup of a
1012 higher level (with 0 being the highest and 9 being the lowest). A
1013 level 0 is therefore equivalent to a full, level 9 an incremental, and
1014 the levels 1 through 8 are varying levels of differentials. For
1015 Bacula's sake, these could be represented as "full", "incr", and
1016 "diff1", "diff2", etc.
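The level rule can be sketched as a lookup of the backup that a new level-N backup is taken relative to: the most recent earlier backup with a numerically lower level. The function name and the history format are hypothetical.

```python
def reference_backup(history, level):
    """Return the most recent earlier backup with a numerically lower level.

    history is a list of (timestamp, level) tuples in chronological order.
    Level 0 (a full) has no reference, so None is returned for it.
    """
    for timestamp, past_level in reversed(history):
        if past_level < level:
            return (timestamp, past_level)
    return None
```

For a history of levels 0, 3, 2, 5 (in that order), a new level 4 backup is taken relative to the level 2 backup, while a new level 9 backup is taken relative to the level 5 backup.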
1018 Why: Support of multiple backup levels would provide for more advanced backup
1019 rotation schemes such as "Towers of Hanoi". This would allow better
1020 flexibility in performing backups, and can lead to shorter recover
1023 Notes: Legato Networker supports a similar system with full, incr, and 1-9 as
1026 Item 37: Implement a server-side compression feature
1027 Date: 18 December 2006
1028 Origin: Vadim A. Umanski, e-mail umanski@ext.ru
1030 What: The ability to compress backup data on server receiving data
1031 instead of doing that on client sending data.
1032 Why: The need is practical. I've got some machines that can send
1033 data to the network 4 or 5 times faster than they can compress
1034 it (I've measured that). They're using fast enough SCSI/FC
1035 disk subsystems but rather slow CPUs (ex. UltraSPARC II).
1036 And the backup server has quite fast CPUs (ex. Dual P4
1037 Xeons) and quite a low load. When you have 20, 50 or 100 GB
1038 of raw data - running a job 4 to 5 times faster - that
1039 really matters. On the other hand, the data can be
1040 compressed 50% or better - so using twice as much space for
1041 disk backup is not good at all. And the network is all mine
1042 (I have a dedicated management/provisioning network) and I
1043 can get as high bandwidth as I need - 100Mbps, 1000Mbps...
1044 That's why the server-side compression feature is needed!
1047 Item 38: Cause daemons to use a specific IP address to source communications
1048 Origin: Bill Moran <wmoran@collaborativefusion.com>
1051 What: Cause Bacula daemons (dir, fd, sd) to always use the IP address
1052 specified in the [DIR|FD|SD]Addr directive as the source IP
1053 for initiating communication.
1054 Why: On complex networks, as well as extremely secure networks, it's
1055 not unusual to have multiple possible routes through the network.
1056 Often, each of these routes is secured by different policies
1057 (effectively, firewalls allow or deny different traffic depending
1058 on the source address).
1059 Unfortunately, it can sometimes be difficult or impossible to
1060 represent this in a system routing table, as the result is
1061 excessive subnetting that quickly exhausts available IP space.
1062 The best available workaround is to provide multiple IPs to
1063 a single machine that are all on the same subnet. In order
1064 for this to work properly, applications must support the ability
1065 to bind outgoing connections to a specified address, otherwise
1066 the operating system will always choose the first IP that
1067 matches the required route.
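As an illustration of binding an outgoing connection to a specified address, here is a minimal Python sketch using the standard library's socket.create_connection and its source_address parameter. The helper name connect_from is hypothetical; inside Bacula the equivalent would be a bind() call on the socket before connect().

```python
import socket


def connect_from(source_ip, host, port, timeout=5.0):
    """Open a TCP connection whose local (source) address is source_ip."""
    # Port 0 lets the operating system pick any free local port.
    return socket.create_connection(
        (host, port), timeout=timeout, source_address=(source_ip, 0)
    )
```

The peer then sees source_ip as the origin of the connection, regardless of which address the operating system would have picked from the routing table.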
1068 Notes: Many other programs support this. For example, the following
1069 can be configured in BIND:
1070 query-source address 10.0.0.1;
1071 transfer-source 10.0.0.2;
1072 Which means queries from this server will always come from
1073 10.0.0.1 and zone transfers will always originate from
1076 Item 39: Multiple threads in file daemon for the same job
1077 Date: 27 November 2005
1078 Origin: Ove Risberg (Ove.Risberg at octocode dot com)
1081 What: I want the file daemon to start multiple threads for a backup
1082 job so the fastest possible backup can be made.
1084 The file daemon could parse the FileSet information and start
1085 one thread for each File entry located on a separate
1088 A configuration option in the job section should be used to
1089 enable or disable this feature. The configuration option could
1090 specify the maximum number of threads in the file daemon.
1092 If the threads could spool the data to separate spool files,
1093 the restore process would not be much slower.
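A sketch of the idea: one worker per File entry, capped by a configured maximum, each producing its own spool file. The names (backup_entry, run_parallel_backup, max_threads) are hypothetical, and backup_entry is only a placeholder that returns a spool file name instead of reading any data.

```python
from concurrent.futures import ThreadPoolExecutor


def backup_entry(path):
    """Placeholder for backing up one File entry; returns a spool file name."""
    return "spool-" + path.strip("/").replace("/", "_")


def run_parallel_backup(file_entries, max_threads, worker=backup_entry):
    """Back up each File entry in its own thread, at most max_threads at once."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() preserves the order of file_entries in its results.
        return list(pool.map(worker, file_entries))
```

Setting max_threads to 1 gives the current one-at-a-time behaviour, which is one way the proposed option could double as the enable/disable switch.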
1095 Why: Multiple concurrent backups of a large fileserver with many
1096 disks and controllers will be much faster.
1098 Item 40: Restore only file attributes (permissions, ACL, owner, group...)
1099 Origin: Eric Bollengier
1103 What: The goal of this project is to be able to restore only rights
1104 and attributes of files without overwriting them.
1106 Why: Who has never had to repair a chmod -R 777, or a wild recursive
1107 permissions update under Windows? At this time, you must have
1108 enough space to restore data, dump attributes (easy with ACLs,
1109 more complex with Unix/Windows rights) and apply them to your
1110 broken tree. With this option, it will be very easy to compare
1111 rights or ACLs over time.
1113 Notes: If the file exists, we skip the restore and change its rights.
1114 If the file doesn't exist, we can create an empty one and apply
1115 rights, or do nothing.
1117 Item 41: Add an item to the restore option where you can select a pool
1118 Origin: kshatriyak at gmail dot com
1122 What: In the restore option (Select the most recent backup for a
1123 client) it would be useful to add an option where you can limit
1124 the selection to a certain pool.
1126 Why: When using cloned jobs, most of the time you have 2 pools - a
1127 disk pool and a tape pool. People who have 2 pools would like to
1128 select the most recent backup from disk, not from tape (tape
1129 would be only needed in emergency). However, the most recent
1130 backup (which may just differ a second from the disk backup) may
1131 be on tape and would be selected. The problem becomes bigger if
1132 you have a full and differential - the most "recent" full backup
1133 may be on disk, while the most recent differential may be on tape
1134 (though the differential on disk may differ even only a second or
1135 so). Bacula will complain that the backups reside on different
1136 media then. For now, the only solution when restoring
1137 with 2 pools is to manually search for the right
1138 job-id's and enter them by hand, which is rather error-prone.
1140 ============= Empty Feature Request form ===========
1141 Item n: One line summary ...
1142 Date: Date submitted
1143 Origin: Name and email of originator.
1146 What: More detailed explanation ...
1148 Why: Why it is important ...
1150 Notes: Additional notes or features (omit if not used)
1151 ============== End Feature Request form ==============