3 Bacula Projects Roadmap
4 Status updated 26 January 2007
5 After re-ordering in vote priority
8 Item: 18 Quick release of FD-SD connection after backup.
9 Item: 40 Include JobID in spool file name
12 Item: 1 Accurate restoration of renamed/deleted files
13 Item: 2 Implement a Bacula GUI/management tool.
14 Item: 3 Allow FD to initiate a backup
15 Item: 4 Merge multiple backups (Synthetic Backup or Consolidation).
16 Item: 5 Deletion of Disk-Based Bacula Volumes
17 Item: 6 Implement Base jobs.
18 Item: 7 Implement creation and maintenance of copy pools
19 Item: 8 Directive/mode to backup only file changes, not entire file
20 Item: 9 Implement a server-side compression feature
21 Item: 10 Improve Bacula's tape and drive usage and cleaning management.
22 Item: 11 Allow skipping execution of Jobs
23 Item: 12 Add a scheduling syntax that permits weekly rotations
24 Item: 13 Archival (removal) of User Files to Tape
25 Item: 14 Cause daemons to use a specific IP address to source communications
26 Item: 15 Multiple threads in file daemon for the same job
27 Item: 16 Add Plug-ins to the FileSet Include statements.
28 Item: 17 Restore only file attributes (permissions, ACL, owner, group...)
29 Item: 18* Quick release of FD-SD connection after backup.
30 Item: 19 Implement a Python interface to the Bacula catalog.
32 Item: 21 Split documentation
33 Item: 22 Implement support for stacking arbitrary stream filters, sinks.
34 Item: 23 Implement from-client and to-client on restore command line.
35 Item: 24 Add an override in Schedule for Pools based on backup types.
36 Item: 25 Implement huge exclude list support using hashing.
37 Item: 26 Implement more Python events in Bacula.
38 Item: 27 Incorporation of XACML2/SAML2 parsing
39 Item: 28 Filesystem watch triggered backup.
40 Item: 29 Allow inclusion/exclusion of files in a fileset by creation/mod times
41 Item: 30 Tray monitor window cleanups
42 Item: 31 Implement multiple numeric backup levels as supported by dump
43 Item: 32 Automatic promotion of backup levels
44 Item: 33 Clustered file-daemons
45 Item: 34 Commercial database support
46 Item: 35 Automatic disabling of devices
47 Item: 36 An option to operate on all pools with update vol parameters
48 Item: 37 Add an item to the restore option where you can select a pool
49 Item: 38 Include timestamp of job launch in "stat clients" output
50 Item: 39 Message mailing based on backup types
51 Item: 40* Include JobID in spool file name
54 Item 1: Accurate restoration of renamed/deleted files
55 Date: 28 November 2005
56 Origin: Martin Simmons (martin at lispworks dot com)
57 Status: Robert Nelson will implement this
59 What: When restoring a fileset for a specified date (including "most
60 recent"), Bacula should give you exactly the files and directories
61 that existed at the time of the last backup prior to that date.
63 Currently this only works if the last backup was a Full backup.
64 When the last backup was Incremental/Differential, files and
65 directories that have been renamed or deleted since the last Full
66 backup are not currently restored correctly. Ditto for files with
67 extra/fewer hard links than at the time of the last Full backup.
69 Why: Incremental/Differential would be much more useful if this worked.
71 Notes: Merging of multiple backups into a single one seems to
72 rely on this working, otherwise the merged backups will not be
73 truly equivalent to a Full backup.
75 Kern: notes shortened. This can be done without the need for
76 inodes. It is essentially the same as the current Verify job,
77 but one additional database record must be written, which does
78 not need any database change.
80 Kern: see if we can correct restoration of directories if
81 replace=ifnewer is set. Currently, if the directory does not
82 exist, a "dummy" directory is created, then when all the files
83 are updated, the dummy directory is newer so the real values
86 Item 2: Implement a Bacula GUI/management tool.
91 What: Implement a Bacula console, and management tools
92 probably using Qt3 and C++.
94 Why: Don't we already have a wxWidgets GUI? Yes, but
95 it is written in C++ and changes to the user interface
96 must be hand tailored using C++ code. By developing
97 the user interface using Qt designer, the interface
98 can be very easily updated and most of the new Python
99 code will be automatically created. The user interface
100 changes become very simple, and only the new features
101 must be implemented. In addition, the code will be in
102 Python, which will give many more users easy (or easier)
103 access to making additions or modifications.
105 Notes: There is a partial Python-GTK implementation by
106 Lucas Di Pentima <lucas at lunix dot com dot ar> but
107 it is no longer being developed.
109 Item 3: Allow FD to initiate a backup
110 Origin: Frank Volf (frank at deze dot org)
111 Date: 17 November 2005
114 What: Provide some means, possibly by a restricted console that
115 allows a FD to initiate a backup, and that uses the connection
116 established by the FD to the Director for the backup so that
117 a Director that is firewalled can do the backup.
119 Why: Makes backup of laptops much easier.
122 Item 4: Merge multiple backups (Synthetic Backup or Consolidation).
123 Origin: Marc Cousin and Eric Bollengier
124 Date: 15 November 2005
125 Status: Awaiting implementation. Depends on first implementing
126 project Item 2 (Migration) which is now done.
128 What: A merged backup is a backup made without connecting to the Client.
129 It would be a Merge of existing backups into a single backup.
130 In effect, it is like a restore but to the backup medium.
132 For instance, say that last Sunday we made a full backup. Then
133 all week long, we created incremental backups, in order to do
134 them fast. Now comes Sunday again, and we need another full.
135 The merged backup makes it possible to do instead an incremental
136 backup (during the night for instance), and then create a merged
137 backup during the day, by using the full and incrementals from
138 the week. The merged backup will be exactly like a full made
139 Sunday night on the tape, but the production interruption on the
140 Client will be minimal, as the Client will only have to send
143 In fact, if it's done correctly, you could merge all the
144 Incrementals into a single Incremental, or all the Incrementals
145 and the last Differential into a new Differential, or the Full,
146 last differential and all the Incrementals into a new Full
147 backup. And there is no need to involve the Client.
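The merge described above can be sketched in a few lines. This is only an illustration under stated assumptions: each backup is modelled as a hypothetical mapping from path to file version, and each Incremental also carries the set of paths deleted since the previous backup (information that depends on Item 1 being implemented). None of these names come from Bacula itself.

```python
def merge_backups(full, incrementals):
    """Merge a Full backup with later Incrementals into a new Full.

    full:         dict mapping path -> file version (hypothetical model)
    incrementals: list of (files, deleted) pairs in chronological order,
                  where files is a dict like `full` and deleted is a set
                  of paths removed since the previous backup
    """
    merged = dict(full)  # copy so the original Full is left untouched
    for files, deleted in incrementals:
        merged.update(files)          # newer versions replace older ones
        for path in deleted:
            merged.pop(path, None)    # drop files that no longer exist
    return merged


sunday_full = {"/etc/hosts": "v1", "/home/u/report": "v1", "/tmp/old": "v1"}
week = [
    ({"/home/u/report": "v2"}, {"/tmp/old"}),  # Monday: one change, one delete
    ({"/home/u/new": "v1"}, set()),            # Tuesday: one new file
]
print(merge_backups(sunday_full, week))
```

Without the deletion sets the result would still contain /tmp/old, which is why the merged backup is only equivalent to a true Full once renamed and deleted files are tracked.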
149 Why: The benefits are:
150 - the Client just does an incremental;
151 - the merged backup on tape is just like a single full backup,
152 and can be restored very fast.
154 This is also a way of reducing the backup data since the old
155 data can then be pruned (or not) from the catalog, possibly
156 allowing older volumes to be recycled.
158 Item 5: Deletion of Disk-Based Bacula Volumes
160 Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
164 What: Provide a way for Bacula to automatically remove Volumes
165 from the filesystem, or optionally to truncate them.
166 Obviously, the Volume must be pruned prior to removal.
168 Why: This would allow users more control over their Volumes and
169 prevent disk based volumes from consuming too much space.
171 Notes: The following two directives might do the trick:
173 Volume Data Retention = <time period>
174 Remove Volume After = <time period>
176 The migration project should also remove a Volume that is
177 migrated. This might also work for tape Volumes.
179 Item 6: Implement Base jobs.
180 Date: 28 October 2005
184 What: A base job is sort of like a Full save except that you
185 will want the FileSet to contain only files that are
186 unlikely to change in the future (i.e. a snapshot of
187 most of your system after installing it). After the
188 base job has been run, when you are doing a Full save,
189 you specify one or more Base jobs to be used. All
190 files that have been backed up in the Base job/jobs but
191 not modified will then be excluded from the backup.
192 During a restore, the Base jobs will be automatically
193 pulled in where necessary.
195 Why: This is something none of the competition does, as far as
196 we know (except perhaps BackupPC, which is a Perl program that
197 saves to disk only). It is a big win for the user; it
198 makes Bacula stand out as offering a unique
199 optimization that immediately saves time and money.
200 Basically, imagine that you have 100 nearly identical
201 Windows or Linux machines containing the OS and user
202 files. Now for the OS part, a Base job will be backed
203 up once, and rather than making 100 copies of the OS,
204 there will be only one. If one or more of the systems
205 have some files updated, no problem, they will be
206 automatically restored.
208 Notes: Huge savings in tape usage even for a single machine.
209 Will require more resources because the DIR must send
210 FD a list of files/attribs, and the FD must search the
211 list and compare it for each file to be saved.
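The comparison step mentioned in the notes (the FD checking each file against the list sent by the DIR) could look roughly like this. It is a sketch only: the attribute tuple (size, mtime) and the function name are assumptions made for illustration, not Bacula's actual attribute format or API.

```python
def files_needing_backup(current, base):
    """Return the paths that must be saved in a Full that uses a Base job.

    current: dict path -> (size, mtime) as seen on the client now
    base:    dict path -> (size, mtime) as recorded by the Base job
    A file is skipped only when it is in the Base job with identical
    attributes; new or modified files are backed up as usual.
    """
    return sorted(path for path, attrs in current.items()
                  if base.get(path) != attrs)


base_job = {"/bin/ls": (100, 1), "/etc/hosts": (20, 5)}
on_disk = {"/bin/ls": (100, 1), "/etc/hosts": (25, 9), "/home/u/a": (1, 2)}
print(files_needing_backup(on_disk, base_job))
```

Keeping the Base list in a hash table makes each lookup constant time, so the extra cost on the FD grows with the number of files scanned rather than with the product of both list sizes.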
213 Item 7: Implement creation and maintenance of copy pools
214 Date: 27 November 2005
215 Origin: David Boyes (dboyes at sinenomine dot net)
218 What: I would like Bacula to have the capability to write copies
219 of backed-up data on multiple physical volumes selected
220 from different pools without transferring the data
221 multiple times, and to accept any of the copy volumes
222 as valid for restore.
224 Why: In many cases, businesses are required to keep offsite
225 copies of backup volumes, or just wish for simple
226 protection against a human operator dropping a storage
227 volume and damaging it. The ability to generate multiple
228 volumes in the course of a single backup job allows
229 customers to simply check out one copy and send it
230 offsite, marking it as out of changer or otherwise
231 unavailable. Currently, the library and magazine
232 management capability in Bacula does not make this process
235 Restores would use the copy of the data on the first
236 available volume, in order of copy pool chain definition.
238 This is also a major scalability issue -- as the number of
239 clients increases beyond several thousand, and the volume
240 of data increases, transferring the data multiple times to
241 produce additional copies of the backups will become
242 physically impossible due to transfer speed
243 issues. Generating multiple copies at server side will
244 become the only practical option.
246 How: I suspect that this will require adding a multiplexing
247 SD that appears to be a SD to a specific FD, but 1-n FDs
248 to the specific back end SDs managing the primary and copy
249 pools. Storage pools will also need to acquire parameters
250 to define the pools to be used for copies.
252 Notes: I would commit some of my developers' time if we can agree
253 on the design and behavior.
255 Item 8: Directive/mode to backup only file changes, not entire file
256 Date: 11 November 2005
257 Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
258 Marek Bajon <mbajon at bimsplus dot com dot pl>
261 What: Currently when a file changes, the entire file will be backed up in
262 the next incremental or full backup. To save space on the tapes
263 it would be nice to have a mode whereby only the changes to the
264 file would be backed up when it is changed.
266 Why: This would save lots of space when backing up large files such as
267 logs, mbox files, Outlook PST files and the like.
269 Notes: This would require the usage of disk-based volumes as comparing
270 files would not be feasible using a tape drive.
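One possible way to find "only the changes" is to compare fixed-size block checksums kept from the previous backup with the current file contents. The sketch below assumes that approach; the block size, function names, and the choice of SHA-256 are all hypothetical and are not part of the original proposal.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of `data`."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return indexes of blocks in `new_data` that differ from the old
    digests or lie beyond the end of the old file."""
    new_digests = block_digests(new_data, block_size)
    return [i for i, digest in enumerate(new_digests)
            if i >= len(old_digests) or digest != old_digests[i]]


old = b"a" * 8192
new = b"a" * 4096 + b"b" * 4096 + b"c" * 10   # block 1 rewritten, data appended
print(changed_blocks(block_digests(old), new))
```

For append-mostly files such as logs or mbox files only the tail blocks would be reported. A real implementation would also have to record truncation, which this sketch ignores.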
272 Item 9: Implement a server-side compression feature
273 Date: 18 December 2006
274 Origin: Vadim A. Umanski, e-mail umanski@ext.ru
276 What: The ability to compress backup data on server receiving data
277 instead of doing that on client sending data.
278 Why: The need is practical. I've got some machines that can send
279 data to the network 4 or 5 times faster than compressing
280 them (I've measured that). They're using fast enough SCSI/FC
281 disk subsystems but rather slow CPUs (ex. UltraSPARC II).
282 And the backup server has quite fast CPUs (ex. Dual P4
283 Xeons) and quite a low load. When you have 20, 50 or 100 GB
284 of raw data - running a job 4 to 5 times faster - that
285 really matters. On the other hand, the data can be
286 compressed 50% or better - so using twice as much space for
287 disk backup is not good at all. And the network is all mine
288 (I have a dedicated management/provisioning network) and I
289 can get as high bandwidth as I need - 100Mbps, 1000Mbps...
290 That's why the server-side compression feature is needed!
293 Item 10: Improve Bacula's tape and drive usage and cleaning management.
294 Date: 8 November 2005, 11 November 2005
295 Origin: Adam Thornton <athornton at sinenomine dot net>,
296 Arno Lehmann <al at its-lehmann dot de>
299 What: Make Bacula manage tape life cycle information, tape reuse
300 times and drive cleaning cycles.
302 Why: All three parts of this project are important when operating
304 We need to know which tapes need replacement, and we need to
305 make sure the drives are cleaned when necessary. While many
306 tape libraries and even autoloaders can handle all this
307 automatically, support by Bacula can be helpful for smaller
308 (older) libraries and single drives. Limiting the number of
309 times a tape is used might prevent tape errors when using
310 tapes until the drives can't read them any more. Also, checking
311 drive status during operation can prevent some failures (as I
312 [Arno] had to learn the hard way...)
314 Notes: First, Bacula could (and even does, to some limited extent)
315 record tape and drive usage. For tapes, the number of mounts,
316 the amount of data, and the time the tape has actually been
317 running could be recorded. Data fields for Read and Write
318 time and Number of mounts already exist in the catalog (I'm
319 not sure if VolBytes is the sum of all bytes ever written to
320 that volume by Bacula). This information can be important
321 when determining which media to replace. The ability to mark
322 Volumes as "used up" after a given number of write cycles
323 should also be implemented so that a tape is never actually
324 worn out. For the tape drives known to Bacula, similar
325 information is interesting to determine the device status and
326 expected life time: Time it's been Reading and Writing, number
327 of tape Loads / Unloads / Errors. This information is not yet
328 recorded as far as I [Arno] know. A new volume status would
329 be necessary for the new state, like "Used up" or "Worn out".
330 Volumes with this state could be used for restores, but not
331 for writing. These volumes should be migrated first (assuming
332 migration is implemented) and, once they are no longer needed,
333 could be moved to a Trash pool.
335 The next step would be to implement a drive cleaning setup.
336 Bacula already has knowledge about cleaning tapes. Once it
337 has some information about cleaning cycles (measured in drive
338 run time, number of tapes used, or calendar days, for example)
339 it can automatically execute tape cleaning (with an
340 autochanger, obviously) or ask for operator assistance loading
343 The final step would be to implement TAPEALERT checks not only
344 when changing tapes and only sending the information to the
345 administrator, but rather checking after each tape error,
346 checking on a regular basis (for example after each tape
347 file), and also before unloading and after loading a new tape.
348 Then, depending on the drive's TAPEALERT state and the known
349 drive cleaning state Bacula could automatically schedule later
350 cleaning, clean immediately, or inform the operator.
352 Implementing this would perhaps require another catalog change
353 and perhaps major changes in SD code and the DIR-SD protocol,
354 so I'd only consider this worth implementing if it would
355 actually be used or even needed by many people.
357 Implementation of these projects could happen in three distinct
358 sub-projects: Measuring Tape and Drive usage, retiring
359 volumes, and handling drive cleaning and TAPEALERTs.
361 Item 11: Allow skipping execution of Jobs
362 Date: 29 November 2005
363 Origin: Florian Schnabel <florian.schnabel at docufy dot de>
366 What: An easy option to skip a certain job on a certain date.
367 Why: You could then easily skip tape backups on holidays. Especially
368 if you have no autochanger and can only fit one backup on a tape
369 that would be really handy; other jobs could proceed normally
370 and you won't get errors that way.
372 Item 12: Add a scheduling syntax that permits weekly rotations
373 Date: 15 December 2006
374 Origin: Gregory Brauer (greg at wildbrain dot com)
377 What: Currently, Bacula only understands how to deal with weeks of the
378 month or weeks of the year in schedules. This makes it impossible
379 to do a true weekly rotation of tapes. There will always be a
380 discontinuity that will require disruptive manual intervention at
381 least monthly or yearly because week boundaries never align with
382 month or year boundaries.
384 A solution would be to add a new syntax that defines (at least)
385 a start timestamp, and repetition period.
387 Why: Rotated backups done at weekly intervals are useful, and Bacula
388 cannot currently do them without extensive hacking.
390 Notes: Here is an example syntax showing a 3-week rotation where full
391 backups would be performed every week on Saturday, and an
392 incremental would be performed every week on Tuesday. Each
393 set of tapes could be removed from the loader for the following
394 two cycles before coming back and being reused on the third
395 week. Since the execution times are determined by intervals
396 from a given point in time, there will never be any issues with
397 having to adjust to any sort of arbitrary time boundary. In
398 the example provided, I even define the starting schedule
399 as crossing both a year and a month boundary, but the run times
400 would be based on the "Repeat" value and would therefore happen
405 Name = "Week 1 Rotation"
406 #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
410 Start = 2006-12-30 01:00
414 #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
418 Start = 2007-01-02 01:00
425 Name = "Week 2 Rotation"
426 #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
430 Start = 2007-01-06 01:00
434 #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
438 Start = 2007-01-09 01:00
445 Name = "Week 3 Rotation"
446 #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
450 Start = 2007-01-13 01:00
454 #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
458 Start = 2007-01-16 01:00
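The interval arithmetic behind the proposed syntax is easy to check. The following sketch computes run times from a start timestamp and a repetition period, using the "Week 1 Rotation" Saturday start and the 3-week period from the example above; the function name is hypothetical and nothing here is existing Bacula code.

```python
from datetime import datetime, timedelta


def run_times(start, repeat, count):
    """Return the first `count` run times: start, start + repeat, ..."""
    return [start + i * repeat for i in range(count)]


# "Week 1 Rotation" Saturday Full from the example: starts 2006-12-30 01:00
# and repeats every 3 weeks.
for when in run_times(datetime(2006, 12, 30, 1, 0), timedelta(weeks=3), 3):
    print(when.strftime("%Y-%m-%d %H:%M"))
# prints:
# 2006-12-30 01:00
# 2007-01-20 01:00
# 2007-02-10 01:00
```

Because each run is a fixed offset from the start timestamp, the month and year boundaries crossed in the example have no effect on the schedule.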
464 Item 13: Archival (removal) of User Files to Tape
466 Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
469 What: The ability to archive data to storage based on certain parameters
470 such as age, size, or location. Once the data has been written to
471 storage and logged it is then pruned from the originating
472 filesystem. Note! We are talking about user's files and not
475 Why: This would allow fully automatic storage management which becomes
476 useful for large datastores. It would also allow for auto-staging
477 from one media type to another.
479 Example 1) Medical imaging needs to store large amounts of data.
480 They decide to keep data on their servers for 6 months and then put
481 it away for long term storage. The server then finds all files
482 older than 6 months and writes them to tape. The files are then removed
485 Example 2) All data that hasn't been accessed in 2 months could be
486 moved from high-cost, fibre-channel disk storage to a low-cost
487 large-capacity SATA disk storage pool which doesn't have as quick an
488 access time. Then after another 6 months (or possibly as one
489 storage pool gets full) data is migrated to Tape.
491 Item 14: Cause daemons to use a specific IP address to source communications
492 Origin: Bill Moran <wmoran@collaborativefusion.com>
495 What: Cause Bacula daemons (dir, fd, sd) to always use the IP address
496 specified in the [DIR|FD|SD]Addr directive as the source IP
497 for initiating communication.
498 Why: On complex networks, as well as extremely secure networks, it's
499 not unusual to have multiple possible routes through the network.
500 Often, each of these routes is secured by different policies
501 (effectively, firewalls allow or deny different traffic depending
502 on the source address)
503 Unfortunately, it can sometimes be difficult or impossible to
504 represent this in a system routing table, as the result is
505 excessive subnetting that quickly exhausts available IP space.
506 The best available workaround is to provide multiple IPs to
507 a single machine that are all on the same subnet. In order
508 for this to work properly, applications must support the ability
509 to bind outgoing connections to a specified address, otherwise
510 the operating system will always choose the first IP that
511 matches the required route.
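At the socket level, binding an outgoing connection to a chosen source address means calling bind() before connect(). A minimal Python illustration of that idea follows; the function name is hypothetical, and the example uses the loopback address only so that it can run anywhere. It is not Bacula code.

```python
import socket


def connect_from(source_ip, dest, timeout=10.0):
    """Open a TCP connection to dest (host, port) that originates from
    source_ip instead of whatever address the OS would pick by default."""
    # Source port 0 lets the OS choose an ephemeral port; only the
    # source address is pinned.
    return socket.create_connection(dest, timeout=timeout,
                                    source_address=(source_ip, 0))


# Demonstration on the loopback interface: a throwaway listener plays the
# role of the remote daemon.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

conn = connect_from("127.0.0.1", listener.getsockname())
print(conn.getsockname()[0])  # source address actually used
conn.close()
listener.close()
```

In a daemon the source address would come from the configured address directive, so firewalls along the route see a predictable source IP.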
512 Notes: Many other programs support this. For example, the following
513 can be configured in BIND:
514 query-source address 10.0.0.1;
515 transfer-source 10.0.0.2;
516 Which means queries from this server will always come from
517 10.0.0.1 and zone transfers will always originate from
520 Item 15: Multiple threads in file daemon for the same job
521 Date: 27 November 2005
522 Origin: Ove Risberg (Ove.Risberg at octocode dot com)
525 What: I want the file daemon to start multiple threads for a backup
526 job so the fastest possible backup can be made.
528 The file daemon could parse the FileSet information and start
529 one thread for each File entry located on a separate
532 A configuration option in the job section should be used to
533 enable or disable this feature. The configuration option could
534 specify the maximum number of threads in the file daemon.
536 If the threads could spool the data to separate spool files
537 the restore process will not be much slower.
539 Why: Multiple concurrent backups of a large fileserver with many
540 disks and controllers will be much faster.
542 Item 16: Add Plug-ins to the FileSet Include statements.
543 Date: 28 October 2005
545 Status: Partially coded in 1.37 -- much more to do.
547 What: Allow users to specify wild-card and/or regular
548 expressions to be matched in both the Include and
549 Exclude directives in a FileSet. At the same time,
550 allow users to define plug-ins to be called (based on
551 regular expression/wild-card matching).
553 Why: This would give the users the ultimate ability to control
554 how files are backed up/restored. A user could write a
555 plug-in that knows how to back up his Oracle database without
556 stopping/starting it, for example.
558 Item 17: Restore only file attributes (permissions, ACL, owner, group...)
559 Origin: Eric Bollengier
563 What: The goal of this project is to be able to restore only rights
564 and attributes of files without overwriting their contents.
566 Why: Who has never had to repair a chmod -R 777, or a wild update
567 of recursive rights under Windows? At this time, you must have
568 enough space to restore data, dump attributes (easy with acl,
569 more complex with unix/windows rights) and apply them to your
570 broken tree. With this option, it will be very easy to compare
571 rights or ACLs over time.
573 Notes: If the file exists, we skip the restore and only change its rights.
574 If the file doesn't exist, we can create an empty one and apply
575 rights or do nothing.
576 Item 18: Quick release of FD-SD connection after backup.
577 Origin: Frank Volf (frank at deze dot org)
578 Date: 17 November 2005
579 Status: Done -- implemented by Kern -- in CVS 26Jan07
581 What: In the Bacula implementation a backup is finished after all data
582 and attributes are successfully written to storage. When using a
583 tape backup it is very annoying that a backup can take a day,
584 simply because the current tape (or whatever) is full and the
585 administrator has not put a new one in. During that time the
586 system cannot be taken off-line, because there is still an open
587 session between the storage daemon and the file daemon on the
590 Although this is a very good strategy for making "safe backups"
591 this can be annoying for e.g. laptops, that must remain
592 connected until the backup is completed.
594 Using a new feature called "migration" it will be possible to
595 spool first to harddisk (using a special 'spool' migration
596 scheme) and then migrate the backup to tape.
598 There is still the problem of getting the attributes committed.
599 If it takes a very long time to do, with the current code, the
600 job has not terminated, and the File daemon is not freed up. The
601 Storage daemon should release the File daemon as soon as all the
602 file data and all the attributes have been sent to it (the SD).
603 Currently the SD waits until everything is on tape and all the
604 attributes are transmitted to the Director before signaling
605 completion to the FD. I don't think I would have any problem
606 changing this. The reason is that even if the FD reports back to
607 the Dir that all is OK, the job will not terminate until the SD
608 has done the same thing -- so in a way keeping the SD-FD link
609 open to the very end is not really very productive ...
611 Why: Makes backup of laptops much faster.
613 Item 19: Implement a Python interface to the Bacula catalog.
614 Date: 28 October 2005
618 What: Implement an interface for Python scripts to access
619 the catalog through Bacula.
621 Why: This will permit users to customize Bacula through
624 Item 20: Archive data
626 Origin: calvin streeting calvin at absentdream dot com
629 What: The ability to archive to media (dvd/cd) in an uncompressed format
630 for dead filing (archiving not backing up)
632 Why: At my work, when jobs are finished they are moved off of the main file
633 servers (raid based systems) onto a simple linux file server (ide based
634 system) so users can find old information without contacting the IT
637 So this data doesn't really change, it only gets added to,
638 but it also needs backing up. At the moment it takes
639 about 8 hours to back up our servers (working data) so
640 rather than add more time to existing backups I am trying
641 to implement a system where we back up the archive data to
642 cd/dvd. These disks would only need to be appended to
643 (burn only new/changed files to new disks for off site
644 storage). Basically, understand the difference between
645 archive data and live data.
647 Notes: Scan the data and email me when it needs burning; divide
648 into predefined chunks; keep a record of what is on what
649 disk; make me a label (simple php->mysql=>pdf stuff, I
650 could do this bit); ability to save data uncompressed so
651 it can be read in any other system (future proof data);
652 save the catalog with the disk as some kind of menu
655 Item 21: Split documentation
656 Origin: Maxx <maxxatwork at gmail dot com>
660 What: Split documentation in several books
662 Why: The Bacula manual now has more than 600 pages, and looking for
663 implementation details is getting complicated. I think
664 it would be good to split the single volume in two or
667 1) Introduction, requirements and tutorial, typically
668 are useful only until first installation time
670 2) Basic installation and configuration, with all the
671 gory details about the directives supported
672 3) Advanced Bacula: testing, troubleshooting, GUI and
673 ancillary programs, security management, scripting,
677 Item 22: Implement support for stacking arbitrary stream filters, sinks.
678 Date: 23 November 2006
679 Origin: Landon Fuller <landonf@threerings.net>
680 Status: Planning. Assigned to landonf.
682 What: Implement support for the following:
683 - Stacking arbitrary stream filters (eg, encryption, compression,
684 sparse data handling)
685 - Attaching file sinks to terminate stream filters (ie, write out
686 the resultant data to a file)
687 - Refactor the restoration state machine accordingly
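To make the idea of stacked filters ending in a file sink concrete, here is a small sketch. It is not the planned Bacula design: the class names are hypothetical, and the XOR filter is only a toy stand-in for a decryption stage, not real cryptography. The point it illustrates is that each filter keeps its own state, so filters can be combined in any order without enumerating every combination.

```python
import io
import zlib


class FileSink:
    """Terminates a filter stack by writing the resulting bytes to a file."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def write(self, data):
        self.fileobj.write(data)

    def close(self):
        self.fileobj.flush()


class DecompressFilter:
    """Inflates zlib data and passes it downstream; state is per instance."""

    def __init__(self, downstream):
        self.downstream = downstream
        self._inflater = zlib.decompressobj()

    def write(self, data):
        self.downstream.write(self._inflater.decompress(data))

    def close(self):
        self.downstream.write(self._inflater.flush())
        self.downstream.close()


class XorFilter:
    """Toy stand-in for a decryption filter (NOT real cryptography)."""

    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key

    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))

    def close(self):
        self.downstream.close()


def restore_stream(chunks, stack):
    """Feed stored chunks through the filter stack, then close it."""
    for chunk in chunks:
        stack.write(chunk)
    stack.close()


# A stream that was compressed and then "encrypted" is restored by stacking
# the inverse filters in reverse order on top of a file sink.
original = b"bacula restore test " * 200
stored = bytes(b ^ 0x5A for b in zlib.compress(original))
out = io.BytesIO()
stack = XorFilter(DecompressFilter(FileSink(out)), 0x5A)
restore_stream([stored[i:i + 16] for i in range(0, len(stored), 16)], stack)
print(out.getvalue() == original)
```

Adding a new stream type then means writing one more class with write() and close(), rather than touching a shared state machine.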
689 Why: The existing stream implementation suffers from the following:
690 - All state (compression, encryption, stream restoration), is
691 global across the entire restore process, for all streams. There are
692 multiple entry and exit points in the restoration state machine, and
693 thus multiple places where state must be allocated, deallocated,
694 initialized, or reinitialized. This results in exceptional complexity
695 for the author of a stream filter.
696 - The developer must enumerate all possible combinations of filters
697 and stream types (ie, win32 data with encryption, without encryption,
698 with encryption AND compression, etc).
700 Notes: This feature request only covers implementing the stream filters/
701 sinks, and refactoring the file daemon's restoration implementation
702 accordingly. If I have extra time, I will also rewrite the backup
703 implementation. My intent in implementing the restoration first is to
704 solve pressing bugs in the restoration handling, and to ensure that
705 the new restore implementation handles existing backups correctly.
707 I do not plan on changing the network or tape data structures to
708 support defining arbitrary stream filters, but supporting that
709 functionality is the ultimate goal.
711 Assistance with either code or testing would be fantastic.
713 Item 23: Implement from-client and to-client on restore command line.
714 Date: 11 December 2006
715 Origin: Discussion on Bacula-users entitled 'Scripted restores to
716 different clients', December 2006
717 Status: New feature request
719 What: While using bconsole interactively, you can specify the client
720 that a backup job is to be restored for, and then you can
721 specify later a different client to send the restored files
722 back to. However, using the 'restore' command with all options
723 on the command line, this cannot be done, due to the ambiguous
724 'client' parameter. Additionally, this parameter means different
725 things depending on if it's specified on the command line or
726 afterwards, in the Modify Job screens.
728 Why: This feature would enable restore jobs to be more completely
729 automated, for example by a web or GUI front-end.
731 Notes: client can also be implied by specifying the jobid on the command
734 Item 24: Add an override in Schedule for Pools based on backup types.
736 Origin: Chad Slater <chad.slater@clickfox.com>
739 What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
740 would help those of us who use different storage devices for different
741 backup levels cope with the "auto-upgrade" of a backup.
743 Why: Assume I add several new devices to be backed up, i.e. several
744 hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
745 stored in a disk set on a 2TB RAID. If you add these devices in the
746 middle of the month, the incrementals are upgraded to "full" backups,
747 but they try to use the same storage device as requested in the
748 incremental job, filling up the RAID holding the differentials. If we
749 could override the Storage parameter for full and/or differential
750 backups, then the Full job would use the proper Storage device, which
751 has more capacity (i.e. an 8TB tape library).
753 Item 25: Implement huge exclude list support using hashing.
754 Date: 28 October 2005
758 What: Allow users to specify very large exclude lists (currently
759 more than about 1000 files is too many).
761 Why: This would give the users the ability to exclude all
762 files that are loaded with the OS (e.g. using rpms
763 or debs). If the user can restore the base OS from
764 CDs, there is no need to backup all those files. A
765 complete restore would be to restore the base OS, then
766 do a Bacula restore. By excluding the base OS files, the
767 backup set will be *much* smaller.
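The reason hashing helps is that membership in a hash set is checked in roughly constant time, so the cost of an exclude list no longer grows with its length for every file examined. A minimal sketch, with hypothetical function names and a made-up package file list:

```python
def load_excludes(paths):
    """Build a hash-based exclude set from an iterable of exact paths,
    for example the file lists of installed rpm or deb packages."""
    return set(paths)


def files_to_back_up(candidates, excludes):
    """Keep only candidates that are not in the exclude set.
    Each lookup is a hash probe rather than a scan of the whole list."""
    return [path for path in candidates if path not in excludes]


os_files = load_excludes(["/bin/ls", "/bin/cat", "/usr/lib/libc.so.6"])
found = ["/bin/ls", "/home/u/thesis.tex", "/usr/lib/libc.so.6", "/etc/fstab"]
print(files_to_back_up(found, os_files))
# prints ['/home/u/thesis.tex', '/etc/fstab']
```

This only covers exact path matches; wild-card or regular-expression excludes would still need a separate matching pass.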
769 Item 26: Implement more Python events in Bacula.
770 Date: 28 October 2005
774 What: Allow Python scripts to be called at more places
775 within Bacula and provide additional access to Bacula
778 Why: This will permit users to customize Bacula through
786 Also add a way to get a listing of currently running
787 jobs (possibly also scheduled jobs).
790 Item 27: Incorporation of XACML2/SAML2 parsing
791 Date: 19 January 2006
792 Origin: Adam Thornton <athornton@sinenomine.net>
795 What: XACML is "eXtensible Access Control Markup Language" and
796 SAML is the "Security Assertion Markup Language"--an XML standard
797 for making statements about identity and authorization. Having these
798 would give us a framework to approach ACLs in a generic manner, and
799 in a way flexible enough to support the four major sorts of ACLs I
800 see as a concern to Bacula at this point, as well as (probably) to
801 deal with new sorts of ACLs that may appear in the future.
803 Why: Bacula is beginning to need to back up systems with ACLs
804 that do not map cleanly onto traditional Unix permissions. I see
805 four sets of ACLs--in general, mutually incompatible with one
806 another--that we're going to need to deal with. These are: NTFS
807 ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
808 relevance of AFS; AFS is one of Sine Nomine's core consulting
809 businesses, and having a reputable file-level backup and restore
810 technology for it (as Tivoli is probably going to drop AFS support
811 soon since IBM no longer supports AFS) would be of huge benefit to
812 our customers; we'd most likely create the AFS support at Sine Nomine
813 for inclusion into the Bacula (and perhaps some changes to the
814 OpenAFS volserver) core code.)
816 Now, obviously, Bacula already handles NTFS just fine. However, I
817 think there's a lot of value in implementing a generic ACL model, so
818 that it's easy to support whatever particular instances of ACLs come
819 down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
820 things arriving in the Linux world in a big way in the near future.
821 XACML, although overcomplicated for our needs, provides this
822 framework, and we should be able to leverage other people's
823 implementations to minimize the amount of work *we* have to do to get
824 a generic ACL framework. Basically, the costs of implementation are
825 high, but they're largely both external to Bacula and already sunk.
827 Item 28: Filesystem watch triggered backup.
829 Origin: Jesper Krogh <jesper@krogh.cc>
830 Status: Unimplemented, depends probably on "client initiated backups"
832 What: With inotify and similar filesystem-triggered notification
833 systems, it is possible to have the file daemon monitor
834 filesystem changes and initiate a backup.
836 Why: There are 2 situations where this is nice to have.
837 1) It is possible to get a much finer-grained backup than
838 the fixed schedules used now. A file created and deleted
839 a few hours later, can automatically be caught.
841 2) The introduced load on the system will probably be
842 distributed more evenly on the system.
844 Notes: This can be combined with configuration that specifies
845 something like: "at most every 15 minutes or when changes
848 Kern Notes: I would rather see this implemented by an external program
849 that monitors the Filesystem changes, then uses the console
850 to start the appropriate job.
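A minimal sketch of the rate limiting such an external program would
need, assuming it receives change notifications from some watcher
(inotify or similar) and then asks the console to start the job. The
class and method names are hypothetical; the watcher and the console
call are left out on purpose.

```python
# Hypothetical sketch: run a backup at most once per interval, and only
# when a filesystem change has been seen since the last run.

class BackupTrigger:
    def __init__(self, min_interval_seconds):
        self.min_interval = min_interval_seconds
        self.last_run = None      # time of the last triggered backup
        self.pending = False      # True once a change has been observed

    def note_change(self):
        """Called by the filesystem watcher for every change event."""
        self.pending = True

    def should_run(self, now):
        """Return True if a backup should start at time `now` (seconds)."""
        if not self.pending:
            return False
        if self.last_run is not None and now - self.last_run < self.min_interval:
            return False
        self.last_run = now
        self.pending = False
        return True


trigger = BackupTrigger(15 * 60)   # at most every 15 minutes
trigger.note_change()
first = trigger.should_run(0)      # change pending, nothing run yet
trigger.note_change()
too_soon = trigger.should_run(600) # only 10 minutes later
later = trigger.should_run(900)    # 15 minutes later, change still pending
print(first, too_soon, later)
```

Changes that arrive during the quiet period are not lost; they stay
pending and are picked up by the next run that is allowed.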
852 Item 29: Allow inclusion/exclusion of files in a fileset by creation/mod times
853 Origin: Evan Kaufman <evan.kaufman@gmail.com>
854 Date: January 11, 2006
857 What: In the vein of the Wild and Regex directives in a Fileset's
858 Options, it would be helpful to allow a user to include or exclude
859 files and directories by creation or modification times.
861 You could factor the Exclude=yes|no option in much the same way it
862 affects the Wild and Regex directives. For example, you could exclude
863 all files modified before a certain date:
867 Modified Before = ####
870 Or you could exclude all files created/modified since a certain date:
874 Created Modified Since = ####
877 The format of the time/date could be done several ways, say the number
878 of seconds since the epoch:
879 1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`
881 Or a human readable date in a cryptic form:
882 20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss
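A minimal sketch of how the two date formats could be parsed and
compared against a file's modification time. The function names and the
"before"/"since" mode strings are hypothetical, not existing Bacula
directives. Note that an epoch value only maps to a wall-clock time once
a UTC offset is chosen; the two sample values above coincide when the
epoch value is read at UTC-6.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helpers for the two proposed date formats.

def parse_compact(stamp):
    """Parse a YYYYMMDDhhmmss string into a naive local datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def parse_epoch(seconds, utc_offset_hours):
    """Convert seconds since the epoch to wall-clock time at a UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(seconds, tz).replace(tzinfo=None)


def is_excluded(mtime, cutoff, mode):
    """Decide exclusion: 'before' drops older files, 'since' drops newer ones."""
    if mode == "before":
        return mtime < cutoff
    if mode == "since":
        return mtime >= cutoff
    raise ValueError("mode must be 'before' or 'since'")


cutoff = parse_compact("20060111134233")
same_instant = parse_epoch(1137008553, -6) == cutoff
old_file = datetime(2006, 1, 10, 9, 0, 0)
print(same_instant, is_excluded(old_file, cutoff, "before"))
```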
884 Why: I imagine a feature like this could have many uses. It would
885 allow a user to do a full backup while excluding the base operating
886 system files, so if I installed a Linux snapshot from a CD yesterday,
887 I'll *exclude* all files modified *before* today. If I need to
888 recover the system, I use the CD I already have, plus the tape backup.
889 Or if, say, a Windows client is hit by a particularly corrosive
890 virus, and I need to *exclude* any files created/modified *since* the
893 Notes: Of course, this feature would work in concert with other
894 in/exclude rules, and wouldn't override them (or each other).
896 Notes: The directives I'd imagine would be along the lines of
897 "[Created] [Modified] [Before|Since] = <date>".
898 So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
902 Item 30: Tray monitor window cleanups
903 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
906 What: Resizeable and scrollable windows in the tray monitor.
908 Why: With multiple clients, or with many jobs running, the displayed
909 window often ends up larger than the available screen, making
910 the trailing items difficult to read.
913 Item 31: Implement multiple numeric backup levels as supported by dump
915 Origin: Daniel Rich <drich@employees.org>
917 What: Dump allows specification of backup levels numerically instead of just
918 "full", "incr", and "diff". In this system, at any given level, all
919 files are backed up that were modified since the last backup of a
920 higher level (with 0 being the highest and 9 being the lowest). A
921 level 0 is therefore equivalent to a full, level 9 an incremental, and
922 the levels 1 through 8 are varying levels of differentials. For
923 bacula's sake, these could be represented as "full", "incr", and
924 "diff1", "diff2", etc.
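A minimal sketch of the level rule described above, assuming the job
history is available as (level, timestamp) pairs. The function name is
hypothetical. A backup at level N is taken relative to the most recent
backup whose level number is lower than N; a level 0 has no reference
and is a full backup.

```python
# Hypothetical sketch of dump-style numeric levels.

def reference_backup(history, level):
    """Return the (level, timestamp) entry the new backup is relative to.

    history: list of (level, timestamp) tuples, in any order.
    Returns None when no earlier backup has a lower level number,
    which is always the case for level 0.
    """
    eligible = [entry for entry in history if entry[0] < level]
    if not eligible:
        return None
    return max(eligible, key=lambda entry: entry[1])


history = [(0, 100), (3, 200), (2, 300), (5, 400)]
print(reference_backup(history, 4))   # most recent backup below level 4
print(reference_backup(history, 9))   # level 9 behaves like an incremental
print(reference_backup(history, 0))   # level 0 is a full backup
```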
926 Why: Support of multiple backup levels would provide for more advanced backup
927 rotation schemes such as "Towers of Hanoi". This would allow better
928 flexibility in performing backups, and can lead to shorter recover
931 Notes: Legato Networker supports a similar system with full, incr, and 1-9 as
934 Item 32: Automatic promotion of backup levels
935 Date: 19 January 2006
936 Origin: Adam Thornton <athornton@sinenomine.net>
939 What: Amanda has a feature whereby it estimates the space that a
940 differential, incremental, and full backup would take. If the
941 difference in space required between the scheduled level and the next
942 level up is beneath some user-defined critical threshold, the backup
943 level is bumped to the next type. Doing this minimizes the number of
944 volumes necessary during a restore, with a fairly minimal cost in
947 Why: I know at least one (quite sophisticated and smart) user
948 for whom the absence of this feature is a deal-breaker in terms of
949 using Bacula; if we had it, it would eliminate the one cool thing
950 Amanda can do and we can't (at least, the one cool thing I know of).
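A minimal sketch of the promotion rule, assuming size estimates for each
level are already available in bytes. The function name, the level
names, and the threshold parameter are hypothetical; this is not how
Amanda implements it, only an illustration of the comparison described
above.

```python
# Hypothetical sketch: bump the level when the next level up costs
# less extra space than a user-defined threshold.

LEVELS = ["incremental", "differential", "full"]


def choose_level(scheduled, estimates, threshold_bytes):
    """Return the level to run, promoting by at most one step."""
    index = LEVELS.index(scheduled)
    if index + 1 == len(LEVELS):
        return scheduled                      # already a full backup
    next_level = LEVELS[index + 1]
    extra = estimates[next_level] - estimates[scheduled]
    return next_level if extra < threshold_bytes else scheduled


estimates = {"incremental": 900, "differential": 1000, "full": 5000}
print(choose_level("incremental", estimates, 200))    # extra cost is 100
print(choose_level("differential", estimates, 200))   # extra cost is 4000
```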
952 Item 33: Clustered file-daemons
953 Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
956 What: A "virtual" filedaemon, which is actually a cluster of real ones.
958 Why: In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
959 multiple machines may have access to the same set of filesystems
961 For performance reasons, one may wish to initiate backups from
962 several of these machines simultaneously, instead of just using
963 one backup source for the common clustered filesystem.
965 For obvious reasons, normally backups of $A-FD/$PATH and
966 $B-FD/$PATH are treated as different backup sets. In this case
967 they are the same communal set.
969 Likewise when restoring, it would be easier to just specify
970 one of the cluster machines and let bacula decide which to use.
972 This can be faked to some extent using DNS round robin entries
973 and a virtual IP address, however it means "status client" will
974 always give bogus answers. Additionally there is no way of
975 spreading the load evenly among the servers.
977 What is required is something similar to the storage daemon
978 autochanger directives, so that Bacula can keep track of
979 operating backups/restores and direct new jobs to a "free"
984 Item 34: Commercial database support
985 Origin: Russell Howe <russell_howe dot wreckage dot org>
989 What: It would be nice for the database backend to support more
990 databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
991 DB2, MaxDB, etc are all candidates. SQL Server would presumably be
992 implemented using FreeTDS or maybe an ODBC library?
994 Why: We only really have one database server, which is MS SQL Server
995 2000, and we would rather not maintain a second one for the backup software (we grew out of
996 SQLite, which I liked, but which didn't work so well with our database
997 size). We don't really have a machine with the resources to run
998 postgres, and would rather only maintain a single DBMS. We're stuck with
999 SQL Server because pretty much all the company's custom applications
1000 (written by consultants) are locked into SQL Server 2000. I can imagine
1001 this scenario is fairly common, and it would be nice to use the existing
1002 properly specced database server for storing Bacula's catalog, rather
1003 than having to run a second DBMS.
1005 Item 35: Automatic disabling of devices
1007 Origin: Peter Eriksson <peter at ifm.liu dot se>
1010 What: After a configurable number of fatal errors with a tape drive
1011 Bacula should automatically disable further use of that
1012 tape drive. There should also be "disable"/"enable" commands in
1013 the "bconsole" tool.
1015 Why: On a multi-drive jukebox there is a possibility of tape drives
1016 going bad during large backups (needing a cleaning tape run,
1017 tapes getting stuck). It would be advantageous if Bacula would
1018 automatically disable further use of a problematic tape drive
1019 after a configurable number of errors has occurred.
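A minimal sketch of the bookkeeping this would need, assuming fatal
errors are reported per drive. The class, its methods, and the drive
names are hypothetical; the enable and disable methods stand in for the
proposed console commands.

```python
# Hypothetical sketch: disable a drive after too many fatal errors.

class DrivePool:
    def __init__(self, drive_names, max_errors):
        self.max_errors = max_errors
        self.errors = {name: 0 for name in drive_names}
        self.disabled = set()

    def record_fatal_error(self, name):
        """Count a fatal error and disable the drive at the threshold."""
        self.errors[name] += 1
        if self.errors[name] >= self.max_errors:
            self.disabled.add(name)

    def disable(self, name):
        """Manual equivalent of the proposed "disable" command."""
        self.disabled.add(name)

    def enable(self, name):
        """Manual equivalent of the proposed "enable" command."""
        self.errors[name] = 0
        self.disabled.discard(name)

    def available(self):
        """Drives that new jobs may still be directed to."""
        return [name for name in self.errors if name not in self.disabled]


pool = DrivePool(["drive0", "drive1"], max_errors=3)
for _ in range(3):
    pool.record_fatal_error("drive0")
after_errors = pool.available()
pool.enable("drive0")
after_enable = pool.available()
print(after_errors, after_enable)
```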
1021 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
1022 where tapes occasionally get stuck inside the drive. Bacula will
1023 notice that the "mtx-changer" command will fail and then fail
1024 any backup jobs trying to use that drive. However, it will still
1025 keep on trying to run new jobs using that drive and fail -
1026 forever, and thus failing lots and lots of jobs... Since we have
1027 many drives Bacula could have just automatically disabled
1028 further use of that drive and used one of the other ones
1031 Item 36: An option to operate on all pools with update vol parameters
1032 Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
1033 Date: 16 August 2006
1036 What: When I do update -> Volume parameters -> All Volumes
1037 from Pool, then I have to select pools one by one. I'd like
1038 console to have an option like "0: All Pools" in the list of
1041 Why: I have many pools and am therefore unhappy with manually
1042 updating each of them using update -> Volume parameters -> All
1043 Volumes from Pool -> pool #.
1045 Item 37: Add an item to the restore option where you can select a pool
1046 Origin: kshatriyak at gmail dot com
1050 What: In the restore option (Select the most recent backup for a
1051 client) it would be useful to add an option where you can limit
1052 the selection to a certain pool.
1054 Why: When using cloned jobs, most of the time you have 2 pools - a
1055 disk pool and a tape pool. People who have 2 pools would like to
1056 select the most recent backup from disk, not from tape (tape
1057 would be only needed in emergency). However, the most recent
1058 backup (which may just differ a second from the disk backup) may
1059 be on tape and would be selected. The problem becomes bigger if
1060 you have a full and differential - the most "recent" full backup
1061 may be on disk, while the most recent differential may be on tape
1062 (though the differential on disk may differ even only a second or
1063 so). Bacula will complain that the backups reside on different
1064 media then. For now, the only solution when restoring things
1065 when you have 2 pools is to manually search for the right
1066 job-id's and enter them by hand, which is a bit error-prone.
1068 Item 38: Include timestamp of job launch in "stat clients" output
1069 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1070 Date: Tue Aug 22 17:13:39 EDT 2006
1073 What: The "stat clients" command doesn't include any detail on when
1074 the active backup jobs were launched.
1076 Why: Including the timestamp would make it much easier to decide whether
1077 a job is running properly.
1079 Notes: It may be helpful to have the output from "stat clients" formatted
1080 more like that from "stat dir" (and other commands), in a column
1081 format. The per-client information that's currently shown (level,
1082 client name, JobId, Volume, pool, device, Files, etc.) is good, but
1083 somewhat hard to parse (both programmatically and visually),
1084 particularly when there are many active clients.
1087 Item 39: Message mailing based on backup types
1088 Origin: Evan Kaufman <evan.kaufman@gmail.com>
1089 Date: January 6, 2006
1092 What: In the "Messages" resource definitions, allowing messages
1093 to be mailed based on the type (backup, restore, etc.) and level
1094 (full, differential, etc) of job that created the originating
1097 Why: It would, for example, allow someone's boss to be emailed
1098 automatically only when a Full Backup job runs, so he can
1099 retrieve the tapes for offsite storage, even if the IT dept.
1100 doesn't (or can't) explicitly notify him. At the same time, his
1101 mailbox wouldn't be filled by notifications of Verifies, Restores,
1102 or Incremental/Differential Backups (which would likely be kept
1105 Notes: One way this could be done is through additional message types, for example:
1108 # email the boss only on full system backups
1109 Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
1111 # email us only when something breaks
1112 MailOnError = itdept@mycompany.com = all
1116 Item 40: Include JobID in spool file name ****DONE****
1117 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1118 Date: Tue Aug 22 17:13:39 EDT 2006
1119 Status: Done. (patches/testing/project-include-jobid-in-spool-name.patch)
1120 No need to vote for this item.
1122 What: Change the name of the spool file to include the JobID
1124 Why: JobIDs are the common key used to refer to jobs, yet the
1125 spoolfile name doesn't include that information. The date/time
1126 stamp is useful (and should be retained).
1128 ============= Empty Feature Request form ===========
1129 Item n: One line summary ...
1130 Date: Date submitted
1131 Origin: Name and email of originator.
1134 What: More detailed explanation ...
1136 Why: Why it is important ...
1138 Notes: Additional notes or features (omit if not used)
1139 ============== End Feature Request form ==============