Bacula Projects Roadmap
Status updated 15 December 2006

Item 1: Accurate restoration of renamed/deleted files from
        Incremental/Differential backups
Item 2: Implement a Bacula GUI/management tool.
Item 3: Implement Base jobs.
Item 4: Implement from-client and to-client on restore command line.
Item 5: Implement creation and maintenance of copy pools
Item 6: Merge multiple backups (Synthetic Backup or Consolidation).
Item 8: Deletion of Disk-Based Bacula Volumes
Item 9: Implement a Python interface to the Bacula catalog.
Item 10: Archival (removal) of User Files to Tape
Item 11: Add Plug-ins to the FileSet Include statements.
Item 12: Implement more Python events in Bacula.
Item 13: Quick release of FD-SD connection after backup.
Item 14: Implement huge exclude list support using hashing.
Item 15: Allow skipping execution of Jobs
Item 16: Tray monitor window cleanups
Item 17: Split documentation
Item 18: Automatic promotion of backup levels
Item 19: Add an override in Schedule for Pools based on backup types.
Item 20: An option to operate on all pools with update vol parameters
Item 21: Include JobID in spool file name
Item 22: Include timestamp of job launch in "stat clients" output
Item 23: Message mailing based on backup types
Item 24: Allow inclusion/exclusion of files in a fileset by creation/mod times
Item 25: Add a scheduling syntax that permits weekly rotations
Item 26: Improve Bacula's tape and drive usage and cleaning management.
Item 27: Implement support for stacking arbitrary stream filters, sinks.
Item 28: Allow FD to initiate a backup
Item 29: Directive/mode to backup only file changes, not entire file
Item 30: Automatic disabling of devices
Item 31: Incorporation of XACML2/SAML2 parsing
Item 32: Clustered file-daemons
Item 33: Commercial database support
Item 35: Filesystem watch triggered backup.
Item 36: Implement multiple numeric backup levels as supported by dump

Below, you will find more information on future projects:

Item 1: Accurate restoration of renamed/deleted files from
        Incremental/Differential backups
Date: 28 November 2005
Origin: Martin Simmons (martin at lispworks dot com)

What: When restoring a fileset for a specified date (including "most
recent"), Bacula should give you exactly the files and directories
that existed at the time of the last backup prior to that date.

Currently this only works if the last backup was a Full backup.
When the last backup was Incremental/Differential, files and
directories that have been renamed or deleted since the last Full
backup are not restored correctly. The same applies to files with
more or fewer hard links than at the time of the last Full backup.

Why: Incremental/Differential would be much more useful if this worked.

Notes: Merging of multiple backups into a single one seems to
rely on this working, otherwise the merged backups will not be
truly equivalent to a Full backup.

Kern: notes shortened. This can be done without the need for
inodes. It is essentially the same as the current Verify job,
but one additional database record must be written, which does
not need any database change.

Kern: see if we can correct restoration of directories if
replace=ifnewer is set. Currently, if the directory does not
exist, a "dummy" directory is created, then when all the files
are updated, the dummy directory is newer so the real values
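
Sketch (illustrative Python, not Bacula code): one way to picture the
behaviour requested in Item 1 is to replay the backup chain and drop
every path that a later job no longer saw on the client. The function
name, the tuple layout, and the idea that the additional record is a
per-job set of present paths are all assumptions made for this example.

```python
# Hypothetical sketch: none of these names exist in Bacula.
def files_at_restore_point(full, later_backups):
    """Return {path: job_id} as of the last backup in the chain.

    full          -- (job_id, saved) for the Full backup, where `saved`
                     is a dict {path: attrs}
    later_backups -- list of (job_id, saved, present) in time order;
                     `present` is the set of every path that existed on
                     the client when that job ran (assumed extra record)
    """
    job_id, saved = full
    view = {path: job_id for path in saved}
    for job_id, saved, present in later_backups:
        for path in saved:
            view[path] = job_id      # the newest saved copy wins
        for path in list(view):
            if path not in present:
                del view[path]       # deleted or renamed since the Full
    return view


full = (1, {"/etc/hosts": {}, "/home/a.txt": {}, "/home/old.txt": {}})
incr = (2, {"/home/new.txt": {}},
        {"/etc/hosts", "/home/a.txt", "/home/new.txt"})
restore = files_at_restore_point(full, [incr])
```

Here /home/old.txt was renamed to /home/new.txt between the two jobs,
so only the new name ends up in the restore view.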

Item 2: Implement a Bacula GUI/management tool.

What: Implement a Bacula console, and management tools
probably using Qt3 and C++.

Why: Don't we already have a wxWidgets GUI? Yes, but
it is written in C++ and changes to the user interface
must be hand-tailored using C++ code. By developing
the user interface using Qt designer, the interface
can be very easily updated and most of the new Python
code will be automatically created. The user interface
changes become very simple, and only the new features
must be implemented. In addition, the code will be in
Python, which will give many more users easy (or easier)
access to making additions or modifications.

Notes: There is a partial Python-GTK implementation by
Lucas Di Pentima <lucas at lunix dot com dot ar> but
it is no longer being developed.

Item 3: Implement Base jobs.
Date: 28 October 2005

What: A base job is sort of like a Full save except that you
will want the FileSet to contain only files that are
unlikely to change in the future (i.e. a snapshot of
most of your system after installing it). After the
base job has been run, when you are doing a Full save,
you specify one or more Base jobs to be used. All
files that have been backed up in the Base job/jobs but
not modified will then be excluded from the backup.
During a restore, the Base jobs will be automatically
pulled in where necessary.

Why: This is something none of the competition does, as far as
we know (except perhaps BackupPC, which is a Perl program that
saves to disk only). It is a big win for the user, and it
makes Bacula stand out as offering a unique
optimization that immediately saves time and money.
Basically, imagine that you have 100 nearly identical
Windows or Linux machines containing the OS and user
files. Now for the OS part, a Base job will be backed
up once, and rather than making 100 copies of the OS,
there will be only one. If one or more of the systems
have some files updated, no problem, they will be
automatically restored.

Notes: Huge savings in tape usage even for a single machine.
Will require more resources because the DIR must send
the FD a list of files/attribs, and the FD must search the
list and compare it for each file to be saved.
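
Sketch (illustrative Python, not Bacula code): the comparison described
in the Notes could look roughly like this on the FD side. The attribute
tuple (size, mtime) and all names are assumptions; a real implementation
would compare whatever attributes the Director actually sends. Keeping
the list in a dictionary makes each lookup cheap, which addresses the
"search the list" cost mentioned above.

```python
# Hypothetical sketch: names and attribute layout are assumptions.
def files_to_save(candidates, base_index):
    """Return the paths that still need to be written in a Full save.

    candidates -- iterable of (path, size, mtime) found on the client
    base_index -- {path: (size, mtime)} as recorded by the Base job(s)
    """
    return [path for path, size, mtime in candidates
            if base_index.get(path) != (size, mtime)]


base_index = {"/bin/ls": (1000, 100), "/bin/cp": (2000, 100)}
candidates = [
    ("/bin/ls", 1000, 100),      # unchanged since the Base job: skipped
    ("/bin/cp", 2048, 250),      # modified: saved again
    ("/home/u/doc", 10, 300),    # not in the Base job: saved
]
to_save = files_to_save(candidates, base_index)
```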

Item 4: Implement from-client and to-client on restore command line.
Date: 11 December 2006
Origin: Discussion on Bacula-users entitled 'Scripted restores to
different clients', December 2006
Status: New feature request

What: While using bconsole interactively, you can specify the client
that a backup job is to be restored for, and then you can
specify later a different client to send the restored files
back to. However, using the 'restore' command with all options
on the command line, this cannot be done, due to the ambiguous
'client' parameter. Additionally, this parameter means different
things depending on whether it is specified on the command line or
afterwards, in the Modify Job screens.

Why: This feature would enable restore jobs to be more completely
automated, for example by a web or GUI front-end.

Notes: client can also be implied by specifying the jobid on the command
line.

Item 5: Implement creation and maintenance of copy pools
Date: 27 November 2005
Origin: David Boyes (dboyes at sinenomine dot net)

What: I would like Bacula to have the capability to write copies
of backed-up data on multiple physical volumes selected
from different pools without transferring the data
multiple times, and to accept any of the copy volumes
as valid for restore.

Why: In many cases, businesses are required to keep offsite
copies of backup volumes, or just wish for simple
protection against a human operator dropping a storage
volume and damaging it. The ability to generate multiple
volumes in the course of a single backup job allows
customers to simply check out one copy and send it
offsite, marking it as out of changer or otherwise
unavailable. Currently, the library and magazine
management capability in Bacula does not make this process

Restores would use the copy of the data on the first
available volume, in order of copy pool chain definition.

This is also a major scalability issue -- as the number of
clients increases beyond several thousand, and the volume
of data increases, transferring the data multiple times to
produce additional copies of the backups will become
physically impossible due to transfer speed
issues. Generating multiple copies at server side will
become the only practical option.

How: I suspect that this will require adding a multiplexing
SD that appears to be an SD to a specific FD, but 1-n FDs
to the specific back end SDs managing the primary and copy
pools. Storage pools will also need to acquire parameters
to define the pools to be used for copies.

Notes: I would commit some of my developers' time if we can agree
on the design and behavior.

Item 6: Merge multiple backups (Synthetic Backup or Consolidation).
Origin: Marc Cousin and Eric Bollengier
Date: 15 November 2005
Status: Awaiting implementation. Depends on first implementing
project Item 2 (Migration) which is now done.

What: A merged backup is a backup made without connecting to the Client.
It would be a Merge of existing backups into a single backup.
In effect, it is like a restore but to the backup medium.

For instance, say that last Sunday we made a full backup. Then
all week long, we created incremental backups, in order to do
them fast. Now comes Sunday again, and we need another full.
The merged backup makes it possible to do instead an incremental
backup (during the night for instance), and then create a merged
backup during the day, by using the full and incrementals from
the week. The merged backup will be exactly like a full made
Sunday night on the tape, but the production interruption on the
Client will be minimal, as the Client will only have to send

In fact, if it's done correctly, you could merge all the
Incrementals into a single Incremental, or all the Incrementals
and the last Differential into a new Differential, or the Full,
last Differential and all the Incrementals into a new Full
backup. And there is no need to involve the Client.

Why: The benefits are that:
- the Client just does an incremental;
- the merged backup on tape is just like a single full backup,
  and can be restored very fast.

This is also a way of reducing the backup data since the old
data can then be pruned (or not) from the catalog, possibly
allowing older volumes to be recycled.

Item 8: Deletion of Disk-Based Bacula Volumes
Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited

What: Provide a way for Bacula to automatically remove Volumes
from the filesystem, or optionally to truncate them.
Obviously, the Volume must be pruned prior to removal.

Why: This would allow users more control over their Volumes and
prevent disk-based volumes from consuming too much space.

Notes: The following two directives might do the trick:

    Volume Data Retention = <time period>
    Remove Volume After = <time period>

The migration project should also remove a Volume that is
migrated. This might also work for tape Volumes.

Item 9: Implement a Python interface to the Bacula catalog.
Date: 28 October 2005

What: Implement an interface for Python scripts to access
the catalog through Bacula.

Why: This will permit users to customize Bacula through

Item 10: Archival (removal) of User Files to Tape
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]

What: The ability to archive data to storage based on certain parameters
such as age, size, or location. Once the data has been written to
storage and logged it is then pruned from the originating
filesystem. Note! We are talking about user's files and not

Why: This would allow fully automatic storage management which becomes
useful for large datastores. It would also allow for auto-staging
from one media type to another.

Example 1) Medical imaging needs to store large amounts of data.
They decide to keep data on their servers for 6 months and then put
it away for long term storage. The server then finds all files
older than 6 months and writes them to tape. The files are then removed

Example 2) All data that hasn't been accessed in 2 months could be
moved from high-cost, fibre-channel disk storage to a low-cost,
large-capacity SATA disk storage pool which has slower access
times. Then after another 6 months (or possibly as one
storage pool gets full) data is migrated to Tape.

Item 11: Add Plug-ins to the FileSet Include statements.
Date: 28 October 2005
Status: Partially coded in 1.37 -- much more to do.

What: Allow users to specify wild-card and/or regular
expressions to be matched in both the Include and
Exclude directives in a FileSet. At the same time,
allow users to define plug-ins to be called (based on
regular expression/wild-card matching).

Why: This would give the users the ultimate ability to control
how files are backed up/restored. A user could write a
plug-in that knows how to back up his Oracle database without
stopping/starting it, for example.

Item 12: Implement more Python events in Bacula.
Date: 28 October 2005

What: Allow Python scripts to be called at more places
within Bacula and provide additional access to Bacula

Why: This will permit users to customize Bacula through

Also add a way to get a listing of currently running
jobs (possibly also scheduled jobs).

Item 13: Quick release of FD-SD connection after backup.
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: In the Bacula implementation a backup is finished after all data
and attributes are successfully written to storage. When using a
tape backup it is very annoying that a backup can take a day,
simply because the current tape (or whatever) is full and the
administrator has not put a new one in. During that time the
system cannot be taken off-line, because there is still an open
session between the storage daemon and the file daemon on the

Although this is a very good strategy for making "safe backups",
it can be annoying for, e.g., laptops that must remain
connected until the backup is completed.

Using a new feature called "migration" it will be possible to
spool first to hard disk (using a special 'spool' migration
scheme) and then migrate the backup to tape.

There is still the problem of getting the attributes committed.
If it takes a very long time to do, with the current code, the
job has not terminated, and the File daemon is not freed up. The
Storage daemon should release the File daemon as soon as all the
file data and all the attributes have been sent to it (the SD).
Currently the SD waits until everything is on tape and all the
attributes are transmitted to the Director before signaling
completion to the FD. I don't think I would have any problem
changing this. The reason is that even if the FD reports back to
the Dir that all is OK, the job will not terminate until the SD
has done the same thing -- so in a way keeping the SD-FD link
open to the very end is not really very productive ...

Why: Makes backup of laptops much faster.

Item 14: Implement huge exclude list support using hashing.
Date: 28 October 2005

What: Allow users to specify very large exclude lists (currently
more than about 1000 files is too many).

Why: This would give the users the ability to exclude all
files that are loaded with the OS (e.g. using rpms
or debs). If the user can restore the base OS from
CDs, there is no need to back up all those files. A
complete restore would be to restore the base OS, then
do a Bacula restore. By excluding the base OS files, the
backup set will be *much* smaller.
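
Sketch (illustrative Python, not Bacula code): the point of hashing is
that testing a path against the exclude list costs roughly the same
whether the list holds a thousand entries or a million. Python's
built-in set is a hash table, so it stands in here for whatever hash
structure the File daemon would use. All names are hypothetical.

```python
# Hypothetical sketch: a set (hash table) replaces a linear list scan.
def build_exclude_set(lines):
    """Turn an iterable of path strings into a hash-based exclude set."""
    return {line.strip() for line in lines if line.strip()}


def filter_excluded(paths, excluded):
    """Keep only the paths that are not in the exclude set."""
    return [p for p in paths if p not in excluded]


# For example, a package manager's file list could feed the exclude set.
os_files = ["/bin/ls\n", "/bin/cp\n", "\n", "/usr/lib/libc.so\n"]
excluded = build_exclude_set(os_files)
kept = filter_excluded(["/bin/ls", "/home/u/notes.txt"], excluded)
```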

Item 15: Allow skipping execution of Jobs
Date: 29 November 2005
Origin: Florian Schnabel <florian.schnabel at docufy dot de>

What: An easy option to skip a certain job on a certain date.

Why: You could then easily skip tape backups on holidays. This would
be especially handy if you have no autochanger and can only fit
one backup on a tape. Other jobs could proceed normally
and you won't get errors that way.

Item 16: Tray monitor window cleanups
Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk

What: Resizable and scrollable windows in the tray monitor.

Why: With multiple clients, or with many jobs running, the displayed
window often ends up larger than the available screen, making
the trailing items difficult to read.

Item 17: Split documentation
Origin: Maxx <maxxatwork at gmail dot com>

What: Split documentation in several books

Why: The Bacula manual now has more than 600 pages, and looking for
implementation details is getting complicated. I think
it would be good to split the single volume in two or

1) Introduction, requirements and tutorial, typically
   useful only until first installation time

2) Basic installation and configuration, with all the
   gory details about the directives supported

3) Advanced Bacula: testing, troubleshooting, GUI and
   ancillary programs, security management, scripting,

Item 18: Automatic promotion of backup levels
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: Amanda has a feature whereby it estimates the space that a
differential, incremental, and full backup would take. If the
difference in space required between the scheduled level and the next
level up is beneath some user-defined critical threshold, the backup
level is bumped to the next type. Doing this minimizes the number of
volumes necessary during a restore, with a fairly minimal cost in

Why: I know at least one (quite sophisticated and smart) user
for whom the absence of this feature is a deal-breaker in terms of
using Bacula; if we had it, it would eliminate the one cool thing
Amanda can do and we can't (at least, the one cool thing I know of).
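
Sketch (illustrative Python, not Bacula or Amanda code): the promotion
rule described in Item 18 reduces to one comparison. The level ordering,
the estimates dictionary, and the threshold parameter are assumptions
made for this example; how the size estimates are obtained is left out.

```python
# Hypothetical sketch: bump the level once if the next level up would
# cost less extra space than a user-defined threshold.
LEVELS = ["Incremental", "Differential", "Full"]


def promote_level(scheduled, estimates, threshold_bytes):
    """Return the level to run, given per-level size estimates in bytes."""
    idx = LEVELS.index(scheduled)
    if idx + 1 == len(LEVELS):
        return scheduled                      # a Full cannot be promoted
    next_level = LEVELS[idx + 1]
    extra = estimates[next_level] - estimates[scheduled]
    return next_level if extra < threshold_bytes else scheduled


estimates = {"Incremental": 900, "Differential": 1000, "Full": 50000}
level_a = promote_level("Incremental", estimates, threshold_bytes=200)
level_b = promote_level("Differential", estimates, threshold_bytes=200)
```

With these numbers the Incremental is promoted to a Differential
(100 extra bytes is under the threshold), while the Differential stays
as scheduled because a Full would cost far more.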

Item 19: Add an override in Schedule for Pools based on backup types.
Origin: Chad Slater <chad.slater@clickfox.com>

What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
would help those of us who use different storage devices for different
backup levels cope with the "auto-upgrade" of a backup.

Why: Assume I add several new devices to be backed up, i.e. several
hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
stored in a disk set on a 2TB RAID. If you add these devices in the
middle of the month, the incrementals are upgraded to "full" backups,
but they try to use the same storage device as requested in the
incremental job, filling up the RAID holding the differentials. If we
could override the Storage parameter for full and/or differential
backups, then the Full job would use the proper Storage device, which
has more capacity (i.e. an 8TB tape library).

Item 20: An option to operate on all pools with update vol parameters
Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>

What: When I do update -> Volume parameters -> All Volumes
from Pool, then I have to select pools one by one. I'd like
console to have an option like "0: All Pools" in the list of

Why: I have many pools and am therefore unhappy with manually
updating each of them using update -> Volume parameters -> All
Volumes from Pool -> pool #.

Item 21: Include JobID in spool file name
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: Change the name of the spool file to include the JobID

Why: JobIDs are the common key used to refer to jobs, yet the
spool file name doesn't include that information. The date/time
stamp is useful (and should be retained).

Item 22: Include timestamp of job launch in "stat clients" output
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: The "stat clients" command doesn't include any detail on when
the active backup jobs were launched.

Why: Including the timestamp would make it much easier to decide whether
a job is running properly.

Notes: It may be helpful to have the output from "stat clients" formatted
more like that from "stat dir" (and other commands), in a column
format. The per-client information that's currently shown (level,
client name, JobId, Volume, pool, device, Files, etc.) is good, but
somewhat hard to parse (both programmatically and visually),
particularly when there are many active clients.

Item 23: Message mailing based on backup types
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 6, 2006

What: In the "Messages" resource definitions, allowing messages
to be mailed based on the type (backup, restore, etc.) and level
(full, differential, etc.) of job that created the originating

Why: It would, for example, allow someone's boss to be emailed
automatically only when a Full Backup job runs, so he can
retrieve the tapes for offsite storage, even if the IT dept.
doesn't (or can't) explicitly notify him. At the same time, his
mailbox wouldn't be filled by notifications of Verifies, Restores,
or Incremental/Differential Backups (which would likely be kept

Notes: One way this could be done is through additional message types,
for example:

    # email the boss only on full system backups
    Mail = boss@mycompany.com = full, !incremental, !differential, !restore,

    # email us only when something breaks
    MailOnError = itdept@mycompany.com = all

Item 24: Allow inclusion/exclusion of files in a fileset by creation/mod times
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 11, 2006

What: In the vein of the Wild and Regex directives in a Fileset's
Options, it would be helpful to allow a user to include or exclude
files and directories by creation or modification times.

You could factor the Exclude=yes|no option in much the same way it
affects the Wild and Regex directives. For example, you could exclude
all files modified before a certain date:

    Modified Before = ####

Or you could exclude all files created/modified since a certain date:

    Created Modified Since = ####

The format of the time/date could be done several ways, say the number
of seconds since the epoch:

    1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`

Or a human readable date in a cryptic form:

    20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss

Why: I imagine a feature like this could have many uses. It would
allow a user to do a full backup while excluding the base operating
system files, so if I installed a Linux snapshot from a CD yesterday,
I'll *exclude* all files modified *before* today. If I need to
recover the system, I use the CD I already have, plus the tape backup.
Or if, say, a Windows client is hit by a particularly corrosive
virus, and I need to *exclude* any files created/modified *since* the

Notes: Of course, this feature would work in concert with other
in/exclude rules, and wouldn't override them (or each other).

Notes: The directives I'd imagine would be along the lines of
"[Created] [Modified] [Before|Since] = <date>".
So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
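
Sketch (illustrative Python, not Bacula code): both date formats
suggested in Item 24 can be parsed with the standard library, and the
Before/Since test is then a plain comparison. The function names, the
"14 digits means YYYYMMDDhhmmss" rule, and the choice to read the
compact form as UTC are assumptions. Note that the epoch value quoted
above is 19:42:33 UTC, so the 1:42:33PM shown there is presumably the
author's local time.

```python
from datetime import datetime, timezone


# Hypothetical sketch: accept either of the two proposed formats.
def parse_when(value):
    """Parse epoch seconds or YYYYMMDDhhmmss (read as UTC here)."""
    if len(value) == 14:
        parsed = datetime.strptime(value, "%Y%m%d%H%M%S")
        return parsed.replace(tzinfo=timezone.utc)
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


def is_excluded(mtime, before=None, since=None):
    """True if a file modified at `mtime` matches the exclude rule."""
    if before is not None and mtime < before:
        return True
    if since is not None and mtime >= since:
        return True
    return False


cutoff = parse_when("20060111134233")
old_file = parse_when("20060101000000")
new_file = parse_when("1137008553")
```

With "Modified Before" set to the cutoff, old_file would be excluded
and new_file kept; with "Modified Since" the result is reversed.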

Item 25: Add a scheduling syntax that permits weekly rotations
Date: 15 December 2006
Origin: Gregory Brauer (greg at wildbrain dot com)

What: Currently, Bacula only understands how to deal with weeks of the
month or weeks of the year in schedules. This makes it impossible
to do a true weekly rotation of tapes. There will always be a
discontinuity that will require disruptive manual intervention at
least monthly or yearly because week boundaries never align with
month or year boundaries.

A solution would be to add a new syntax that defines (at least)
a start timestamp, and repetition period.

Why: Rotated backups done at weekly intervals are useful, and Bacula
cannot currently do them without extensive hacking.

Notes: Here is an example syntax showing a 3-week rotation where full
Backups would be performed every week on Saturday, and an
incremental would be performed every week on Tuesday. Each
set of tapes could be removed from the loader for the following
two cycles before coming back and being reused on the third
week. Since the execution times are determined by intervals
from a given point in time, there will never be any issues with
having to adjust to any sort of arbitrary time boundary. In
the example provided, I even define the starting schedule
as crossing both a year and a month boundary, but the run times
would be based on the "Repeat" value and would therefore happen

    Name = "Week 1 Rotation"
    #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
    Start = 2006-12-30 01:00
    #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
    Start = 2007-01-02 01:00

    Name = "Week 2 Rotation"
    #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
    Start = 2007-01-06 01:00
    #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
    Start = 2007-01-09 01:00

    Name = "Week 3 Rotation"
    #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
    Start = 2007-01-13 01:00
    #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
    Start = 2007-01-16 01:00
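
Sketch (illustrative Python, not Bacula code): the run dates quoted in
the comments above follow directly from a start timestamp plus a fixed
repetition period. The function name and the three-week repeat value
are assumptions based on the "3-week rotation" described in the Notes.

```python
from datetime import datetime, timedelta


# Hypothetical sketch: run times are start + n * repeat, so month and
# year boundaries never matter.
def run_times(start, repeat_weeks, count):
    """Return the first `count` run times for a repeating schedule."""
    first = datetime.strptime(start, "%Y-%m-%d %H:%M")
    step = timedelta(weeks=repeat_weeks)
    return [first + n * step for n in range(count)]


week1_full = run_times("2006-12-30 01:00", repeat_weeks=3, count=3)
week1_incr = run_times("2007-01-02 01:00", repeat_weeks=3, count=3)
week1_full_dates = [t.date().isoformat() for t in week1_full]
week1_incr_dates = [t.date().isoformat() for t in week1_incr]
```

The first list comes out as 2006-12-30, 2007-01-20, 2007-02-10 (all
Saturdays) and the second as 2007-01-02, 2007-01-23, 2007-02-13 (all
Tuesdays), matching the "Week 1 Rotation" comments.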

Item 26: Improve Bacula's tape and drive usage and cleaning management.
Date: 8 November 2005, November 11, 2005
Origin: Adam Thornton <athornton at sinenomine dot net>,
Arno Lehmann <al at its-lehmann dot de>

What: Make Bacula manage tape life cycle information, tape reuse
times and drive cleaning cycles.

Why: All three parts of this project are important when operating
We need to know which tapes need replacement, and we need to
make sure the drives are cleaned when necessary. While many
tape libraries and even autoloaders can handle all this
automatically, support by Bacula can be helpful for smaller
(older) libraries and single drives. Limiting the number of
times a tape is used might prevent tape errors when using
tapes until the drives can't read them any more. Also, checking
drive status during operation can prevent some failures (as I
[Arno] had to learn the hard way...)

Notes: First, Bacula could (and even does, to some limited extent)
record tape and drive usage. For tapes, the number of mounts,
the amount of data, and the time the tape has actually been
running could be recorded. Data fields for Read and Write
time and Number of mounts already exist in the catalog (I'm
not sure if VolBytes is the sum of all bytes ever written to
that volume by Bacula). This information can be important
when determining which media to replace. The ability to mark
Volumes as "used up" after a given number of write cycles
should also be implemented so that a tape is never actually
worn out. For the tape drives known to Bacula, similar
information is interesting to determine the device status and
expected life time: Time it's been Reading and Writing, number
of tape Loads / Unloads / Errors. This information is not yet
recorded as far as I [Arno] know. A new volume status would
be necessary for the new state, like "Used up" or "Worn out".
Volumes with this state could be used for restores, but not
for writing. These volumes should be migrated first (assuming
migration is implemented) and, once they are no longer needed,
could be moved to a Trash pool.

The next step would be to implement a drive cleaning setup.
Bacula already has knowledge about cleaning tapes. Once it
has some information about cleaning cycles (measured in drive
run time, number of tapes used, or calendar days, for example)
it can automatically execute tape cleaning (with an
autochanger, obviously) or ask for operator assistance loading

The final step would be to implement TAPEALERT checks not only
when changing tapes and only sending the information to the
administrator, but rather checking after each tape error,
checking on a regular basis (for example after each tape
file), and also before unloading and after loading a new tape.
Then, depending on the drive's TAPEALERT state and the known
drive cleaning state Bacula could automatically schedule later
cleaning, clean immediately, or inform the operator.

Implementing this would perhaps require another catalog change
and perhaps major changes in SD code and the DIR-SD protocol,
so I'd only consider this worth implementing if it would
actually be used or even needed by many people.

Implementation of these projects could happen in three distinct
sub-projects: Measuring Tape and Drive usage, retiring
volumes, and handling drive cleaning and TAPEALERTs.

Item 27: Implement support for stacking arbitrary stream filters, sinks.
Date: 23 November 2006
Origin: Landon Fuller <landonf@threerings.net>
Status: Planning. Assigned to landonf.

Implement support for the following:
- Stacking arbitrary stream filters (e.g., encryption, compression,
  sparse data handling)
- Attaching file sinks to terminate stream filters (i.e., write out
  the resultant data to a file)
- Refactor the restoration state machine accordingly

The existing stream implementation suffers from the following:
- All state (compression, encryption, stream restoration) is
  global across the entire restore process, for all streams. There are
  multiple entry and exit points in the restoration state machine, and
  thus multiple places where state must be allocated, deallocated,
  initialized, or reinitialized. This results in exceptional complexity
  for the author of a stream filter.
- The developer must enumerate all possible combinations of filters
  and stream types (i.e., win32 data with encryption, without encryption,
  with encryption AND compression, etc.).

This feature request only covers implementing the stream filters/
sinks, and refactoring the file daemon's restoration implementation
accordingly. If I have extra time, I will also rewrite the backup
implementation. My intent in implementing the restoration first is to
solve pressing bugs in the restoration handling, and to ensure that
the new restore implementation handles existing backups correctly.

I do not plan on changing the network or tape data structures to
support defining arbitrary stream filters, but supporting that
functionality is the ultimate goal.

Assistance with either code or testing would be fantastic.
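
Sketch (illustrative Python, not the planned Bacula implementation):
the idea of stacking filters and terminating them with a sink can be
shown with small objects that each own their state and pass data
downstream. The class names and the write/close interface are
assumptions, and the XOR step is only a toy stand-in for decryption.

```python
import io
import zlib


class Sink:
    """Terminates a stack by writing the final bytes to a file object."""
    def __init__(self, fileobj):
        self.fileobj = fileobj

    def write(self, data):
        self.fileobj.write(data)

    def close(self):
        self.fileobj.flush()


class DecompressFilter:
    """Inflates zlib data; its state is per stream, not global."""
    def __init__(self, downstream):
        self.downstream = downstream
        self._inflater = zlib.decompressobj()

    def write(self, data):
        self.downstream.write(self._inflater.decompress(data))

    def close(self):
        self.downstream.write(self._inflater.flush())
        self.downstream.close()


class XorFilter:
    """Toy stand-in for a decryption filter (NOT real cryptography)."""
    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key

    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))

    def close(self):
        self.downstream.close()


original = b"restore me " * 100
# Backup side of the toy example: compress, then "encrypt".
stored = bytes(b ^ 0x5A for b in zlib.compress(original))

out = io.BytesIO()
stack = XorFilter(DecompressFilter(Sink(out)), key=0x5A)
for offset in range(0, len(stored), 64):      # feed the data in chunks
    stack.write(stored[offset:offset + 64])
stack.close()
```

Because each filter only knows its downstream neighbour, adding or
removing a stage does not require enumerating filter combinations.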

Item 28: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: Provide some means, possibly by a restricted console that
allows a FD to initiate a backup, and that uses the connection
established by the FD to the Director for the backup so that
a Director that is firewalled can do the backup.

Why: Makes backup of laptops much easier.
815 Item 29: Directive/mode to backup only file changes, not entire file
816 Date: 11 November 2005
817 Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
818 Marek Bajon <mbajon at bimsplus dot com dot pl>
821 What: Currently when a file changes, the entire file will be backed up in
822 the next incremental or full backup. To save space on the tapes
823 it would be nice to have a mode whereby only the changes to the
824 file would be backed up when it is changed.
826 Why: This would save lots of space when backing up large files such as
827 logs, mbox files, Outlook PST files and the like.
829 Notes: This would require the usage of disk-based volumes as comparing
830 files would not be feasible using a tape drive.
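One possible way to back up only the changed parts, sketched in Python
under loud assumptions: the file is cut into fixed-size blocks, a digest
of each block from the previous backup is kept somewhere (for example in
the catalog), and only blocks whose digest differs are stored. The block
size and the function names are hypothetical, not an existing Bacula
mechanism. Restoring still needs the earlier version of the file, which
is why disk-based volumes are the natural fit.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Digest of every fixed-size block of a file's contents."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).digest()
        for offset in range(0, len(data), block_size)
    ]


def make_delta(old_digests: list, new_data: bytes,
               block_size: int = BLOCK_SIZE) -> dict:
    """Keep only the blocks that are new or differ from the old digests."""
    changed = {}
    for index, offset in enumerate(range(0, len(new_data), block_size)):
        block = new_data[offset:offset + block_size]
        is_new = index >= len(old_digests)
        if is_new or hashlib.sha256(block).digest() != old_digests[index]:
            changed[index] = block
    return {"length": len(new_data), "blocks": changed}


def apply_delta(old_data: bytes, delta: dict,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file from the old contents plus the saved blocks."""
    length = delta["length"]
    # Truncate or zero-pad the old contents to the new length first.
    out = bytearray(old_data[:length].ljust(length, b"\0"))
    for index, block in delta["blocks"].items():
        start = index * block_size
        out[start:start + len(block)] = block
    return bytes(out)
```

For an append-only file such as a log, only the last partial block and
the newly appended blocks end up in the delta.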
832 Item 30: Automatic disabling of devices
834 Origin: Peter Eriksson <peter at ifm.liu dot se>
837 What: After a configurable number of fatal errors with a tape drive,
838 Bacula should automatically disable further use of that
839 tape drive. There should also be "disable"/"enable" commands in
842 Why: On a multi-drive jukebox there is a possibility of tape drives
843 going bad during large backups (needing a cleaning tape run,
844 tapes getting stuck). It would be advantageous if Bacula would
845 automatically disable further use of a problematic tape drive
846 after a configurable number of errors has occurred.
848 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
849 where tapes occasionally get stuck inside the drive. Bacula
850 notices that the "mtx-changer" command fails and then fails
851 any backup jobs trying to use that drive. However, it still
852 keeps trying to run new jobs on that drive, and they fail
853 forever, so lots and lots of jobs fail. Since we have
854 many drives, Bacula could have just automatically disabled
855 further use of that drive and used one of the other ones
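The requested behaviour could look roughly like the following Python
sketch. The class and method names (DrivePool, record_fatal_error,
pick_drive) and the default threshold are hypothetical; they only
illustrate a per-drive error counter with a configurable limit plus
manual "disable"/"enable" operations, not how the Storage daemon is
actually structured.

```python
class DrivePool:
    """Tracks fatal errors per drive and stops using drives that hit a limit."""

    def __init__(self, drives, max_errors=3):
        self.max_errors = max_errors          # the configurable threshold
        self.errors = {drive: 0 for drive in drives}
        self.disabled = set()

    def record_fatal_error(self, drive):
        """Count one fatal error; disable the drive once the limit is hit."""
        self.errors[drive] += 1
        if self.errors[drive] >= self.max_errors:
            self.disabled.add(drive)

    def disable(self, drive):
        """Manual disable, as an operator command would do."""
        self.disabled.add(drive)

    def enable(self, drive):
        """Manual re-enable; the error count starts again from zero."""
        self.disabled.discard(drive)
        self.errors[drive] = 0

    def pick_drive(self):
        """Return the first drive that is still enabled, or None."""
        for drive in self.errors:
            if drive not in self.disabled:
                return drive
        return None
```

With this in place, new jobs are steered to the remaining drives as soon
as one drive crosses the threshold, and the operator can bring it back
after fixing it.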
858 Item 31: Incorporation of XACML2/SAML2 parsing
859 Date: 19 January 2006
860 Origin: Adam Thornton <athornton@sinenomine.net>
863 What: XACML is the "eXtensible Access Control Markup Language" and
864 SAML is the "Security Assertion Markup Language"--an XML standard
865 for making statements about identity and authorization. Having these
866 would give us a framework to approach ACLs in a generic manner, and
867 in a way flexible enough to support the four major sorts of ACLs I
868 see as a concern to Bacula at this point, as well as (probably) to
869 deal with new sorts of ACLs that may appear in the future.
871 Why: Bacula is beginning to need to back up systems with ACLs
872 that do not map cleanly onto traditional Unix permissions. I see
873 four sets of ACLs--in general, mutually incompatible with one
874 another--that we're going to need to deal with. These are: NTFS
875 ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
876 relevance of AFS; AFS is one of Sine Nomine's core consulting
877 businesses, and having a reputable file-level backup and restore
878 technology for it (as Tivoli is probably going to drop AFS support
879 soon since IBM no longer supports AFS) would be of huge benefit to
880 our customers; we'd most likely create the AFS support at Sine Nomine
881 for inclusion into the Bacula core code (and perhaps make some
882 changes to the OpenAFS volserver).)
884 Now, obviously, Bacula already handles NTFS just fine. However, I
885 think there's a lot of value in implementing a generic ACL model, so
886 that it's easy to support whatever particular instances of ACLs come
887 down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
888 things arriving in the Linux world in a big way in the near future.
889 XACML, although overcomplicated for our needs, provides this
890 framework, and we should be able to leverage other people's
891 implementations to minimize the amount of work *we* have to do to get
892 a generic ACL framework. Basically, the costs of implementation are
893 high, but they're largely both external to Bacula and already sunk.
896 Item 32: Clustered file-daemons
897 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
900 What: A "virtual" filedaemon, which is actually a cluster of real ones.
902 Why: In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
903 multiple machines may have access to the same set of filesystems
905 For performance reasons, one may wish to initiate backups from
906 several of these machines simultaneously, instead of just using
907 one backup source for the common clustered filesystem.
909 For obvious reasons, normally backups of $A-FD/$PATH and
910 $B-FD/$PATH are treated as different backup sets. In this case
911 they are the same communal set.
913 Likewise when restoring, it would be easier to just specify
914 one of the cluster machines and let Bacula decide which to use.
916 This can be faked to some extent using DNS round robin entries
917 and a virtual IP address, however it means "status client" will
918 always give bogus answers. Additionally there is no way of
919 spreading the load evenly among the servers.
921 What is required is something similar to the storage daemon
922 autochanger directives, so that Bacula can keep track of
923 operating backups/restores and direct new jobs to a "free"
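A rough Python sketch of the bookkeeping this would need: the "virtual"
client keeps a count of running jobs per real file daemon and hands each
new job to the least busy one. VirtualClient and its methods are
hypothetical names for illustration; nothing here corresponds to an
existing Director resource.

```python
class VirtualClient:
    """A virtual file daemon backed by several real cluster members."""

    def __init__(self, members):
        # Number of backups/restores currently running on each member.
        self.running = {member: 0 for member in members}

    def start_job(self):
        """Pick the member with the fewest running jobs and count the job."""
        member = min(self.running, key=self.running.get)
        self.running[member] += 1
        return member

    def finish_job(self, member):
        """Mark one job on the given member as finished."""
        self.running[member] -= 1
```

Because the Director would know which member ran each job, "status
client" could report on the real machine instead of whatever a DNS round
robin happened to return.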
928 Item 33: Commercial database support
929 Origin: Russell Howe <russell_howe dot wreckage dot org>
933 What: It would be nice for the database backend to support more
934 databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
935 DB2, MaxDB, etc are all candidates. SQL Server would presumably be
936 implemented using FreeTDS or maybe an ODBC library?
938 Why: We only really have one database server, which is MS SQL Server
939 2000, and we would rather not maintain a second one for the backup
940 software (we grew out of SQLite, which I liked, but which didn't work
941 so well with our database size). We don't really have a machine with
942 the resources to run PostgreSQL. We're stuck with
943 SQL Server because pretty much all the company's custom applications
944 (written by consultants) are locked into SQL Server 2000. I can imagine
945 this scenario is fairly common, and it would be nice to use the existing
946 properly specced database server for storing Bacula's catalog, rather
947 than having to run a second DBMS.
950 Item 34: Archive data
952 Origin: calvin streeting calvin at absentdream dot com
955 What: The ability to archive to media (DVD/CD) in an uncompressed format
956 for dead filing (archiving, not backing up).
958 Why: At my workplace, finished jobs are moved off the main file
959 servers (RAID-based systems) onto a simple Linux file server (IDE-based
960 system) so users can find old information without contacting the IT
963 So this data doesn't really change; it only gets added to,
964 but it also needs backing up. At the moment it takes
965 about 8 hours to back up our servers (working data), so
966 rather than add more time to existing backups I am trying
967 to implement a system where we back up the archive data to
968 CD/DVD. These disks would only need to be appended to
969 (burn only new/changed files to new disks for off-site
970 storage). Basically, understand the difference between
971 archive data and live data.
973 Notes: - Scan the data and email me when it needs burning.
974 - Divide it into predefined chunks.
975 - Keep a record of what is on which disk.
976 - Make me a label (simple PHP -> MySQL -> PDF stuff); I could do this bit.
977 - Save data uncompressed so it can be read on any other system (future-proof data).
978 - Save the catalog with the disk as some kind of menu
981 Item 35: Filesystem watch triggered backup.
983 Origin: Jesper Krogh <jesper@krogh.cc>
984 Status: Unimplemented, probably depends on "client initiated backups"
986 What: With inotify and similar filesystem-triggered notification
987 systems, it is possible to have the file daemon monitor
988 filesystem changes and initiate a backup.
990 Why: There are 2 situations where this is nice to have.
991 1) It is possible to get a much finer-grained backup than
992 the fixed schedules used now. A file created and deleted
993 a few hours later can automatically be caught.
995 2) The introduced load on the system will probably be
996 distributed more evenly over time.
998 Notes: This can be combined with configuration that specifies
999 something like: "at most every 15 minutes or when changes
1002 Kern Notes: I would rather see this implemented by an external program
1003 that monitors the Filesystem changes, then uses the console
1004 to start the appropriate job.
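The external-program approach could be sketched in Python as below. This
is only the throttling logic ("at most every 15 minutes", and only when
something changed); how changes are detected (inotify or otherwise) and
how the job is started through the console are left as assumptions. The
start_job callback is a hypothetical placeholder for whatever invokes the
console, and ThrottledTrigger is an illustrative name.

```python
class ThrottledTrigger:
    """Start a job when files changed, but no more often than min_interval."""

    def __init__(self, start_job, min_interval=15 * 60):
        self.start_job = start_job        # hypothetical: would call the console
        self.min_interval = min_interval  # seconds between two job starts
        self.dirty = False                # True once a change has been seen
        self.last_run = None              # time of the last job start

    def on_change(self):
        """Called by the filesystem watcher whenever something changes."""
        self.dirty = True

    def poll(self, now):
        """Called periodically with the current time; may start a job."""
        if not self.dirty:
            return False
        too_soon = (self.last_run is not None
                    and now - self.last_run < self.min_interval)
        if too_soon:
            return False
        self.start_job()
        self.last_run = now
        self.dirty = False
        return True
```

The current time is passed in rather than read from the clock so the
logic can be exercised without waiting; a real watcher loop would call
poll(time.time()).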
1006 Item 36: Implement multiple numeric backup levels as supported by dump
1008 Origin: Daniel Rich <drich@employees.org>
1010 What: Dump allows specification of backup levels numerically instead of just
1011 "full", "incr", and "diff". In this system, at any given level, all
1012 files are backed up that were modified since the last backup of a
1013 higher level (with 0 being the highest and 9 being the lowest). A
1014 level 0 is therefore equivalent to a full, level 9 an incremental, and
1015 the levels 1 through 8 are varying levels of differentials. For
1016 Bacula's sake, these could be represented as "full", "incr", and
1017 "diff1", "diff2", etc.
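The level rule described above can be written down in a few lines of
Python. The history format (a list of (time, level) pairs) and the
function name reference_backup are assumptions made for this sketch. A
backup at level N saves files modified since the most recent backup whose
level number is smaller than N; level 0, or the absence of any such
backup, means everything is saved.

```python
def reference_backup(history, level):
    """Return the (time, level) entry a new backup at `level` is based on.

    history: list of (time, level) pairs for earlier backups, where
             time is any comparable value (e.g. a timestamp).
    Returns None when the new backup must save everything (a full).
    """
    if level == 0:
        return None
    # Candidates are earlier backups with a smaller level number,
    # i.e. a "higher" level in the wording used above.
    candidates = [(when, lvl) for when, lvl in history if lvl < level]
    return max(candidates, default=None)
```

For example, with earlier backups at levels 0, 3, 2 and 5 (in that
order), a new level 4 backup is based on the level 2 backup, while a new
level 9 backup is based on the level 5 backup.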
1019 Why: Support of multiple backup levels would provide for more advanced backup
1020 rotation schemes such as "Towers of Hanoi". This would allow better
1021 flexibility in performing backups, and can lead to shorter recover
1024 Notes: Legato Networker supports a similar system with full, incr, and 1-9 as
1027 Kern notes: I think this would add very little functionality, but a *lot* of
1028 additional overhead to Bacula.
1032 ============= Empty Feature Request form ===========
1033 Item n: One line summary ...
1034 Date: Date submitted
1035 Origin: Name and email of originator.
1038 What: More detailed explanation ...
1040 Why: Why it is important ...
1042 Notes: Additional notes or features (omit if not used)
1043 ============== End Feature Request form ==============