Bacula Projects Roadmap
Status updated 26 January 2007
After re-ordering in vote priority

Items Completed:
Item: 18   Quick release of FD-SD connection after backup.
Item: 40   Include JobID in spool file name
Item: 25   Implement huge exclude list support using dlist

Summary:
Item:  1   Accurate restoration of renamed/deleted files
Item:  2   Implement a Bacula GUI/management tool.
Item:  3   Allow FD to initiate a backup
Item:  4   Merge multiple backups (Synthetic Backup or Consolidation).
Item:  5   Deletion of Disk-Based Bacula Volumes
Item:  6   Implement Base jobs.
Item:  7   Implement creation and maintenance of copy pools
Item:  8   Directive/mode to backup only file changes, not entire file
Item:  9   Implement a server-side compression feature
Item: 10   Improve Bacula's tape and drive usage and cleaning management.
Item: 11   Allow skipping execution of Jobs
Item: 12   Add a scheduling syntax that permits weekly rotations
Item: 13   Archival (removal) of User Files to Tape
Item: 14   Cause daemons to use a specific IP address to source communications
Item: 15   Multiple threads in file daemon for the same job
Item: 16   Add Plug-ins to the FileSet Include statements.
Item: 17   Restore only file attributes (permissions, ACL, owner, group...)
Item: 18*  Quick release of FD-SD connection after backup.
Item: 19   Implement a Python interface to the Bacula catalog.
Item: 20   Archive data
Item: 21   Split documentation
Item: 22   Implement support for stacking arbitrary stream filters, sinks.
Item: 23   Implement from-client and to-client on restore command line.
Item: 24   Add an override in Schedule for Pools based on backup types.
Item: 25*  Implement huge exclude list support using hashing.
Item: 26   Implement more Python events in Bacula.
Item: 27   Incorporation of XACML2/SAML2 parsing
Item: 28   Filesystem watch triggered backup.
Item: 29   Allow inclusion/exclusion of files in a fileset by creation/mod times
Item: 30   Tray monitor window cleanups
Item: 31   Implement multiple numeric backup levels as supported by dump
Item: 32   Automatic promotion of backup levels
Item: 33   Clustered file-daemons
Item: 34   Commercial database support
Item: 35   Automatic disabling of devices
Item: 36   An option to operate on all pools with update vol parameters
Item: 37   Add an item to the restore option where you can select a pool
Item: 38   Include timestamp of job launch in "stat clients" output
Item: 39   Message mailing based on backup types
Item: 40*  Include JobID in spool file name

Item 1: Accurate restoration of renamed/deleted files
Date: 28 November 2005
Origin: Martin Simmons (martin at lispworks dot com)
Status: Robert Nelson will implement this

What: When restoring a fileset for a specified date (including "most
recent"), Bacula should give you exactly the files and directories
that existed at the time of the last backup prior to that date.

Currently this only works if the last backup was a Full backup.
When the last backup was Incremental/Differential, files and
directories that have been renamed or deleted since the last Full
backup are not currently restored correctly. Ditto for files with
extra/fewer hard links than at the time of the last Full backup.

Why: Incremental/Differential would be much more useful if this worked.

Notes: Merging of multiple backups into a single one seems to
rely on this working, otherwise the merged backups will not be
truly equivalent to a Full backup.

Kern: notes shortened. This can be done without the need for
inodes. It is essentially the same as the current Verify job,
but one additional database record must be written, which does
not need any database change.

Kern: see if we can correct restoration of directories if
replace=ifnewer is set. Currently, if the directory does not
exist, a "dummy" directory is created, then when all the files
are updated, the dummy directory is newer so the real values
are not restored.

Item 2: Implement a Bacula GUI/management tool.

What: Implement a Bacula console, and management tools
probably using Qt3 and Python.

Why: Don't we already have a wxWidgets GUI? Yes, but
it is written in C++ and changes to the user interface
must be hand tailored using C++ code. By developing
the user interface using Qt designer, the interface
can be very easily updated and most of the new Python
code will be automatically created. The user interface
changes become very simple, and only the new features
must be implemented. In addition, the code will be in
Python, which will give many more users easy (or easier)
access to making additions or modifications.

Notes: There is a partial Python-GTK implementation by
Lucas Di Pentima <lucas at lunix dot com dot ar> but
it is no longer being developed.

Item 3: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: Provide some means, possibly by a restricted console, that
allows an FD to initiate a backup, and that uses the connection
established by the FD to the Director for the backup so that
a Director that is firewalled can do the backup.

Why: Makes backup of laptops much easier.

Item 4: Merge multiple backups (Synthetic Backup or Consolidation).
Origin: Marc Cousin and Eric Bollengier
Date: 15 November 2005
Status: Waiting implementation. Depends on first implementing
project Item 2 (Migration), which is now done.

What: A merged backup is a backup made without connecting to the Client.
It would be a Merge of existing backups into a single backup.
In effect, it is like a restore but to the backup medium.

For instance, say that last Sunday we made a full backup. Then
all week long, we created incremental backups, in order to do
them fast. Now comes Sunday again, and we need another full.
The merged backup makes it possible to do instead an incremental
backup (during the night for instance), and then create a merged
backup during the day, by using the full and incrementals from
the week. The merged backup will be exactly like a full made
Sunday night on the tape, but the production interruption on the
Client will be minimal, as the Client will only have to send
an incremental.

In fact, if it's done correctly, you could merge all the
Incrementals into a single Incremental, or all the Incrementals
and the last Differential into a new Differential, or the Full,
last Differential and all the Incrementals into a new Full
backup. And there is no need to involve the Client.

Why: The benefit is that:
- the Client just does an incremental;
- the merged backup on tape is just like a single full backup,
and can be restored very fast.

This is also a way of reducing the backup data since the old
data can then be pruned (or not) from the catalog, possibly
allowing older volumes to be recycled.

Item 5: Deletion of Disk-Based Bacula Volumes
Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited)

What: Provide a way for Bacula to automatically remove Volumes
from the filesystem, or optionally to truncate them.
Obviously, the Volume must be pruned prior to removal.

Why: This would allow users more control over their Volumes and
prevent disk-based volumes from consuming too much space.

Notes: The following two directives might do the trick:

Volume Data Retention = <time period>
Remove Volume After = <time period>

The migration project should also remove a Volume that is
migrated. This might also work for tape Volumes.
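
To illustrate, a Pool resource using the two proposed directives might look
like the sketch below. Both directives are hypothetical (they do not exist
in Bacula yet); their names and time-period syntax are only assumptions based
on the wording above.

   Pool {
     Name = FileDiskPool
     Pool Type = Backup
     Volume Retention = 30 days
     # Proposed (hypothetical) directives:
     Volume Data Retention = 60 days   # truncate the Volume data after 60 days
     Remove Volume After = 90 days     # delete the Volume file from disk after 90 days
   }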

Item 6: Implement Base jobs.
Date: 28 October 2005

What: A base job is sort of like a Full save except that you
will want the FileSet to contain only files that are
unlikely to change in the future (i.e. a snapshot of
most of your system after installing it). After the
base job has been run, when you are doing a Full save,
you specify one or more Base jobs to be used. All
files that have been backed up in the Base job/jobs but
not modified will then be excluded from the backup.
During a restore, the Base jobs will be automatically
pulled in where necessary.

Why: This is something none of the competition does, as far as
we know (except perhaps BackupPC, which is a Perl program that
saves to disk only). It is a big win for the user; it
makes Bacula stand out as offering a unique
optimization that immediately saves time and money.
Basically, imagine that you have 100 nearly identical
Windows or Linux machines containing the OS and user
files. Now for the OS part, a Base job will be backed
up once, and rather than making 100 copies of the OS,
there will be only one. If one or more of the systems
have some files updated, no problem, they will be
automatically restored.

Notes: Huge savings in tape usage even for a single machine.
Will require more resources because the DIR must send
the FD a list of files/attribs, and the FD must search the
list and compare it for each file to be saved.
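
The FD-side comparison described in the Notes could look roughly like the
following sketch. It is only an illustration of the idea, not Bacula code;
the attribute tuple used for the comparison (size, mtime, checksum) is an
assumption.

   # Hypothetical sketch: skip files already present and unchanged in a Base job.
   import os, hashlib

   def file_signature(path):
       """Return (size, mtime, md5) for a regular file."""
       st = os.stat(path)
       md5 = hashlib.md5()
       with open(path, "rb") as f:
           for chunk in iter(lambda: f.read(65536), b""):
               md5.update(chunk)
       return (st.st_size, int(st.st_mtime), md5.hexdigest())

   def should_backup(path, base_list):
       """base_list: dict {path: (size, mtime, md5)} sent by the Director
       from the Base job(s). Back up the file only if it is new or changed."""
       return base_list.get(path) != file_signature(path)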

Item 7: Implement creation and maintenance of copy pools
Date: 27 November 2005
Origin: David Boyes (dboyes at sinenomine dot net)

What: I would like Bacula to have the capability to write copies
of backed-up data on multiple physical volumes selected
from different pools without transferring the data
multiple times, and to accept any of the copy volumes
as valid for restore.

Why: In many cases, businesses are required to keep offsite
copies of backup volumes, or just wish for simple
protection against a human operator dropping a storage
volume and damaging it. The ability to generate multiple
volumes in the course of a single backup job allows
customers to simply check out one copy and send it
offsite, marking it as out of changer or otherwise
unavailable. Currently, the library and magazine
management capability in Bacula does not make this process
simple.

Restores would use the copy of the data on the first
available volume, in order of copy pool chain definition.

This is also a major scalability issue -- as the number of
clients increases beyond several thousand, and the volume
of data increases, transferring the data multiple times to
produce additional copies of the backups will become
physically impossible due to transfer speed
issues. Generating multiple copies at the server side will
become the only practical option.

How: I suspect that this will require adding a multiplexing
SD that appears to be an SD to a specific FD, but 1-n FDs
to the specific back end SDs managing the primary and copy
pools. Storage pools will also need to acquire parameters
to define the pools to be used for copies.

Notes: I would commit some of my developers' time if we can agree
on the design and behavior.

Item 8: Directive/mode to backup only file changes, not entire file
Date: 11 November 2005
Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
Marek Bajon <mbajon at bimsplus dot com dot pl>

What: Currently when a file changes, the entire file will be backed up in
the next incremental or full backup. To save space on the tapes
it would be nice to have a mode whereby only the changes to the
file would be backed up when it is changed.

Why: This would save lots of space when backing up large files such as
logs, mbox files, Outlook PST files and the like.

Notes: This would require the usage of disk-based volumes as comparing
files would not be feasible using a tape drive.

Item 9: Implement a server-side compression feature
Date: 18 December 2006
Origin: Vadim A. Umanski, e-mail umanski@ext.ru

What: The ability to compress backup data on the server receiving the
data instead of doing that on the client sending the data.
Why: The need is practical. I've got some machines that can send
data to the network 4 or 5 times faster than compressing
it (I've measured that). They're using fast enough SCSI/FC
disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
And the backup server has quite fast CPUs (e.g. dual P4
Xeons) and quite a low load. When you have 20, 50 or 100 GB
of raw data - running a job 4 to 5 times faster - that
really matters. On the other hand, the data can be
compressed 50% or better - so losing twice the space for
disk backup is not good at all. And the network is all mine
(I have a dedicated management/provisioning network) and I
can get as much bandwidth as I need - 100Mbps, 1000Mbps...
That's why the server-side compression feature is needed!

Item 10: Improve Bacula's tape and drive usage and cleaning management.
Date: 8 November 2005, November 11, 2005
Origin: Adam Thornton <athornton at sinenomine dot net>,
Arno Lehmann <al at its-lehmann dot de>

What: Make Bacula manage tape life cycle information, tape reuse
times and drive cleaning cycles.

Why: All three parts of this project are important when operating in
production. We need to know which tapes need replacement, and we need to
make sure the drives are cleaned when necessary. While many
tape libraries and even autoloaders can handle all this
automatically, support by Bacula can be helpful for smaller
(older) libraries and single drives. Limiting the number of
times a tape is used might prevent tape errors when using
tapes until the drives can't read them any more. Also, checking
drive status during operation can prevent some failures (as I
[Arno] had to learn the hard way...)

Notes: First, Bacula could (and even does, to some limited extent)
record tape and drive usage. For tapes, the number of mounts,
the amount of data, and the time the tape has actually been
running could be recorded. Data fields for Read and Write
time and Number of mounts already exist in the catalog (I'm
not sure if VolBytes is the sum of all bytes ever written to
that volume by Bacula). This information can be important
when determining which media to replace. The ability to mark
Volumes as "used up" after a given number of write cycles
should also be implemented so that a tape is never actually
worn out. For the tape drives known to Bacula, similar
information is interesting to determine the device status and
expected life time: time spent Reading and Writing, number
of tape Loads / Unloads / Errors. This information is not yet
recorded as far as I [Arno] know. A new volume status would
be necessary for the new state, like "Used up" or "Worn out".
Volumes with this state could be used for restores, but not
for writing. These volumes should be migrated first (assuming
migration is implemented) and, once they are no longer needed,
could be moved to a Trash pool.

The next step would be to implement a drive cleaning setup.
Bacula already has knowledge about cleaning tapes. Once it
has some information about cleaning cycles (measured in drive
run time, number of tapes used, or calendar days, for example)
it can automatically execute tape cleaning (with an
autochanger, obviously) or ask for operator assistance in loading
a cleaning tape.

The final step would be to implement TAPEALERT checks not only
when changing tapes and only sending the information to the
administrator, but rather checking after each tape error,
checking on a regular basis (for example after each tape
file), and also before unloading and after loading a new tape.
Then, depending on the drive's TAPEALERT state and the known
drive cleaning state, Bacula could automatically schedule later
cleaning, clean immediately, or inform the operator.

Implementing this would perhaps require another catalog change
and perhaps major changes in SD code and the DIR-SD protocol,
so I'd only consider this worth implementing if it would
actually be used or even needed by many people.

Implementation of these projects could happen in three distinct
sub-projects: measuring tape and drive usage, retiring
volumes, and handling drive cleaning and TAPEALERTs.

Item 11: Allow skipping execution of Jobs
Date: 29 November 2005
Origin: Florian Schnabel <florian.schnabel at docufy dot de>

What: An easy option to skip a certain job on a certain date.
Why: You could then easily skip tape backups on holidays. Especially
if you have no autochanger and can only fit one backup on a tape,
that would be really handy: other jobs could proceed normally
and you won't get errors that way.

Item 12: Add a scheduling syntax that permits weekly rotations
Date: 15 December 2006
Origin: Gregory Brauer (greg at wildbrain dot com)

What: Currently, Bacula only understands how to deal with weeks of the
month or weeks of the year in schedules. This makes it impossible
to do a true weekly rotation of tapes. There will always be a
discontinuity that will require disruptive manual intervention at
least monthly or yearly because week boundaries never align with
month or year boundaries.

A solution would be to add a new syntax that defines (at least)
a start timestamp and a repetition period.

Why: Rotated backups done at weekly intervals are useful, and Bacula
cannot currently do them without extensive hacking.

Notes: Here is an example syntax showing a 3-week rotation where full
backups would be performed every week on Saturday, and an
incremental would be performed every week on Tuesday. Each
set of tapes could be removed from the loader for the following
two cycles before coming back and being reused on the third
week. Since the execution times are determined by intervals
from a given point in time, there will never be any issues with
having to adjust to any sort of arbitrary time boundary. In
the example provided, I even define the starting schedule
as crossing both a year and a month boundary, but the run times
would be based on the "Repeat" value and would therefore happen
at the desired times.

   Schedule {
       Name = "Week 1 Rotation"
       #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
       Run = Level=Full         Start = 2006-12-30 01:00  Repeat = 3w
       #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
       Run = Level=Incremental  Start = 2007-01-02 01:00  Repeat = 3w
   }
   Schedule {
       Name = "Week 2 Rotation"
       #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
       Run = Level=Full         Start = 2007-01-06 01:00  Repeat = 3w
       #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
       Run = Level=Incremental  Start = 2007-01-09 01:00  Repeat = 3w
   }
   Schedule {
       Name = "Week 3 Rotation"
       #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
       Run = Level=Full         Start = 2007-01-13 01:00  Repeat = 3w
       #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
       Run = Level=Incremental  Start = 2007-01-16 01:00  Repeat = 3w
   }

Item 13: Archival (removal) of User Files to Tape
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]

What: The ability to archive data to storage based on certain parameters
such as age, size, or location. Once the data has been written to
storage and logged it is then pruned from the originating
filesystem. Note! We are talking about user's files and not
Bacula Volumes.

Why: This would allow fully automatic storage management, which becomes
useful for large datastores. It would also allow for auto-staging
from one media type to another.

Example 1) Medical imaging needs to store large amounts of data.
They decide to keep data on their servers for 6 months and then put
it away for long term storage. The server then finds all files
older than 6 months and writes them to tape. The files are then removed
from the disk.

Example 2) All data that hasn't been accessed in 2 months could be
moved from high-cost, fibre-channel disk storage to a low-cost
large-capacity SATA disk storage pool which doesn't have as quick an
access time. Then after another 6 months (or possibly as one
storage pool gets full) data is migrated to Tape.

Item 14: Cause daemons to use a specific IP address to source communications
Origin: Bill Moran <wmoran@collaborativefusion.com>

What: Cause Bacula daemons (dir, fd, sd) to always use the IP address
specified in the [DIR|FD|SD]Addr directive as the source IP
for initiating communication.
Why: On complex networks, as well as extremely secure networks, it's
not unusual to have multiple possible routes through the network.
Often, each of these routes is secured by different policies
(effectively, firewalls allow or deny different traffic depending
on the source address).
Unfortunately, it can sometimes be difficult or impossible to
represent this in a system routing table, as the result is
excessive subnetting that quickly exhausts available IP space.
The best available workaround is to provide multiple IPs to
a single machine that are all on the same subnet. In order
for this to work properly, applications must support the ability
to bind outgoing connections to a specified address, otherwise
the operating system will always choose the first IP that
matches the required route.
Notes: Many other programs support this. For example, the following
can be configured in BIND:
query-source address 10.0.0.1;
transfer-source 10.0.0.2;
Which means queries from this server will always come from
10.0.0.1 and zone transfers will always originate from
10.0.0.2.
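
At the socket level, binding the outgoing connection to a specific source
address is just a bind() before the connect(). The following is a minimal
illustrative sketch (not Bacula code); the addresses are placeholders.

   # Hypothetical sketch: open an outgoing connection from a chosen source IP.
   import socket

   def connect_from(source_ip, dest_ip, dest_port):
       s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
       s.bind((source_ip, 0))           # bind to the desired source address, any port
       s.connect((dest_ip, dest_port))  # the OS now sources packets from source_ip
       return s

   # e.g. force traffic toward the Director to originate from 10.0.0.1:
   # conn = connect_from("10.0.0.1", "192.168.1.5", 9101)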

Item 15: Multiple threads in file daemon for the same job
Date: 27 November 2005
Origin: Ove Risberg (Ove.Risberg at octocode dot com)

What: I want the file daemon to start multiple threads for a backup
job so the fastest possible backup can be made.

The file daemon could parse the FileSet information and start
one thread for each File entry located on a separate
filesystem.

A configuration option in the job section should be used to
enable or disable this feature. The configuration option could
specify the maximum number of threads in the file daemon.

If the threads could spool the data to separate spool files
the restore process will not be much slower.

Why: Multiple concurrent backups of a large fileserver with many
disks and controllers will be much faster.
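
The idea of one worker per File entry, each spooling to its own file, can be
sketched as follows. This is only an illustration of the proposed behavior
(not Bacula code); save_tree() stands in for whatever routine actually reads
and spools the data.

   # Hypothetical sketch: back up each FileSet "File" entry in its own thread.
   from concurrent.futures import ThreadPoolExecutor

   def save_tree(path, spool_name):
       """Placeholder for the real work: walk 'path' and spool its data
       into the spool file 'spool_name'."""
       pass

   def parallel_backup(file_entries, max_threads=4):
       # max_threads corresponds to the proposed per-job configuration option.
       with ThreadPoolExecutor(max_workers=max_threads) as pool:
           futures = [pool.submit(save_tree, path, "spool-%d.dat" % i)
                      for i, path in enumerate(file_entries)]
           for f in futures:
               f.result()   # propagate any errors

   # parallel_backup(["/", "/home", "/var"], max_threads=3)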

Item 16: Add Plug-ins to the FileSet Include statements.
Date: 28 October 2005
Status: Partially coded in 1.37 -- much more to do.

What: Allow users to specify wild-card and/or regular
expressions to be matched in both the Include and
Exclude directives in a FileSet. At the same time,
allow users to define plug-ins to be called (based on
regular expression/wild-card matching).

Why: This would give the users the ultimate ability to control
how files are backed up/restored. A user could write a
plug-in that knows how to back up his Oracle database without
stopping/starting it, for example.

Item 17: Restore only file attributes (permissions, ACL, owner, group...)
Origin: Eric Bollengier

What: The goal of this project is to be able to restore only the rights
and attributes of files without overwriting their contents.

Why: Who has never had to repair a chmod -R 777, or a wild recursive
change of rights under Windows? At this time, you must have
enough space to restore the data, dump the attributes (easy with ACLs,
more complex with Unix/Windows rights) and apply them to your
broken tree. With this option, it would also be very easy to compare
rights or ACLs over time.

Notes: If the file is there, we skip the restore and only change the rights.
If the file isn't there, we can create an empty one and apply the
rights, or do nothing.
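
The restore-time behavior described in the Notes could look roughly like
this sketch (an illustration only, not Bacula code): for each catalog entry,
reapply owner, mode and timestamps to an existing file instead of rewriting
its data.

   # Hypothetical sketch: apply saved attributes without touching file data.
   import os

   def restore_attributes(path, uid, gid, mode, atime, mtime, create_missing=False):
       if not os.path.lexists(path):
           if not create_missing:
               return                   # "do nothing" variant
           open(path, "w").close()      # create an empty placeholder file
       os.chown(path, uid, gid)
       os.chmod(path, mode)
       os.utime(path, (atime, mtime))

   # restore_attributes("/data/report.txt", 1000, 1000, 0o640, 1167600000, 1167600000)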

Item 18: Quick release of FD-SD connection after backup.
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005
Status: Done -- implemented by Kern -- in CVS 26Jan07

What: In the Bacula implementation a backup is finished after all data
and attributes are successfully written to storage. When using a
tape backup it is very annoying that a backup can take a day,
simply because the current tape (or whatever) is full and the
administrator has not put a new one in. During that time the
system cannot be taken off-line, because there is still an open
session between the storage daemon and the file daemon on the
client.

Although this is a very good strategy for making "safe backups",
it can be annoying for e.g. laptops, which must remain
connected until the backup is completed.

Using a new feature called "migration" it will be possible to
spool first to hard disk (using a special 'spool' migration
scheme) and then migrate the backup to tape.

There is still the problem of getting the attributes committed.
If it takes a very long time to do, with the current code, the
job has not terminated, and the File daemon is not freed up. The
Storage daemon should release the File daemon as soon as all the
file data and all the attributes have been sent to it (the SD).
Currently the SD waits until everything is on tape and all the
attributes are transmitted to the Director before signaling
completion to the FD. I don't think I would have any problem
changing this. The reason is that even if the FD reports back to
the Dir that all is OK, the job will not terminate until the SD
has done the same thing -- so in a way keeping the SD-FD link
open to the very end is not really very productive ...

Why: Makes backup of laptops much faster.

Item 19: Implement a Python interface to the Bacula catalog.
Date: 28 October 2005

What: Implement an interface for Python scripts to access
the catalog through Bacula.

Why: This will permit users to customize Bacula through
Python scripts.

Item 20: Archive data
Origin: Calvin Streeting calvin at absentdream dot com

What: The ability to archive to media (DVD/CD) in an uncompressed format
for dead filing (archiving, not backing up).

Why: At my workplace, when jobs are finished they are moved off of the main
file servers (RAID-based systems) onto a simple Linux file server (IDE-based
system) so users can find old information without contacting the IT
department.

This data doesn't really change, it only gets added to,
but it also needs backing up. At the moment it takes
about 8 hours to back up our servers (working data), so
rather than add more time to existing backups I am trying
to implement a system where we back up the archive data to
CD/DVD; these disks would only need to be appended to
(burn only new/changed files to new disks for off-site
storage). Basically, understand the difference between
archive data and live data.

Notes: Scan the data and email me when it needs burning; divide it
into predefined chunks; keep a record of what is on what
disk; make me a label (simple php->mysql=>pdf stuff, I
could do this bit); the ability to save data uncompressed so
it can be read on any other system (future-proofing the data);
save the catalog with the disk as some kind of menu.

Item 21: Split documentation
Origin: Maxx <maxxatworkat gmail dot com>

What: Split the documentation into several books

Why: The Bacula manual now has more than 600 pages, and looking for
implementation details is getting complicated. I think
it would be good to split the single volume into three parts:

1) Introduction, requirements and tutorial, typically
useful only until first installation time

2) Basic installation and configuration, with all the
gory details about the directives supported

3) Advanced Bacula: testing, troubleshooting, GUI and
ancillary programs, security management, scripting,
etc.

Item 22: Implement support for stacking arbitrary stream filters, sinks.
Date: 23 November 2006
Origin: Landon Fuller <landonf@threerings.net>
Status: Planning. Assigned to landonf.

What: Implement support for the following:
- Stacking arbitrary stream filters (eg, encryption, compression,
sparse data handling)
- Attaching file sinks to terminate stream filters (ie, write out
the resultant data to a file)
- Refactor the restoration state machine accordingly

Why: The existing stream implementation suffers from the following:
- All state (compression, encryption, stream restoration) is
global across the entire restore process, for all streams. There are
multiple entry and exit points in the restoration state machine, and
thus multiple places where state must be allocated, deallocated,
initialized, or reinitialized. This results in exceptional complexity
for the author of a stream filter.
- The developer must enumerate all possible combinations of filters
and stream types (ie, win32 data with encryption, without encryption,
with encryption AND compression, etc).

Notes: This feature request only covers implementing the stream filters/
sinks, and refactoring the file daemon's restoration implementation
accordingly. If I have extra time, I will also rewrite the backup
implementation. My intent in implementing the restoration first is to
solve pressing bugs in the restoration handling, and to ensure that
the new restore implementation handles existing backups correctly.

I do not plan on changing the network or tape data structures to
support defining arbitrary stream filters, but supporting that
functionality is the ultimate goal.

Assistance with either code or testing would be fantastic.
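
As an illustration of the filter/sink idea (not the actual design planned for
Bacula), a stack can be built from small objects that each transform a chunk
of data and pass it on, with a file sink terminating the chain:

   # Hypothetical sketch of stackable stream filters terminated by a file sink.
   import zlib

   class FileSink:
       def __init__(self, path):
           self.f = open(path, "wb")
       def write(self, data):
           self.f.write(data)
       def close(self):
           self.f.close()

   class DecompressFilter:
       """Decompresses data and hands it to the next stage in the stack."""
       def __init__(self, downstream):
           self.downstream = downstream
           self.z = zlib.decompressobj()
       def write(self, data):
           self.downstream.write(self.z.decompress(data))
       def close(self):
           self.downstream.write(self.z.flush())
           self.downstream.close()

   # Build the stack per stream instead of keeping global state:
   # stack = DecompressFilter(FileSink("/tmp/restored.dat"))
   # stack.write(chunk); ...; stack.close()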

Item 23: Implement from-client and to-client on restore command line.
Date: 11 December 2006
Origin: Discussion on Bacula-users entitled 'Scripted restores to
different clients', December 2006
Status: New feature request

What: While using bconsole interactively, you can specify the client
that a backup job is to be restored for, and then you can
specify later a different client to send the restored files
back to. However, using the 'restore' command with all options
on the command line, this cannot be done, due to the ambiguous
'client' parameter. Additionally, this parameter means different
things depending on whether it's specified on the command line or
afterwards, in the Modify Job screens.

Why: This feature would enable restore jobs to be more completely
automated, for example by a web or GUI front-end.

Notes: client can also be implied by specifying the jobid on the command
line.

Item 24: Add an override in Schedule for Pools based on backup types.
Origin: Chad Slater <chad.slater@clickfox.com>

What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
would help those of us who use different storage devices for different
backup levels cope with the "auto-upgrade" of a backup.

Why: Assume I add several new devices to be backed up, i.e. several
hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
stored in a disk set on a 2TB RAID. If you add these devices in the
middle of the month, the incrementals are upgraded to "full" backups,
but they try to use the same storage device as requested in the
incremental job, filling up the RAID holding the differentials. If we
could override the Storage parameter for full and/or differential
backups, then the Full job would use the proper Storage device, which
has more capacity (i.e. an 8TB tape library).
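
A hypothetical Schedule using the requested per-level override might look
like this sketch; the FullStorage/IncrementalStorage names only illustrate
the proposed syntax and are not existing Bacula directives.

   Schedule {
     Name = "MonthlyCycle"
     # Proposed (hypothetical) per-level Storage overrides:
     Run = Level=Full         FullStorage=BigTapeLibrary          1st sun at 23:05
     Run = Level=Incremental  IncrementalStorage=DiskRAIDStorage  mon-sat at 23:05
   }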

Item 25: Implement huge exclude list support using hashing (dlists).
Date: 28 October 2005
Status: Done in 2.1.2, but it was done with dlists (doubly linked lists),
since hashing will not help. The huge list support also covers
large include lists.

What: Allow users to specify very large exclude lists (currently
more than about 1000 files is too many).

Why: This would give the users the ability to exclude all
files that are loaded with the OS (e.g. using rpms
or debs). If the user can restore the base OS from
CDs, there is no need to back up all those files. A
complete restore would be to restore the base OS, then
do a Bacula restore. By excluding the base OS files, the
backup set will be *much* smaller.

Item 26: Implement more Python events in Bacula.
Date: 28 October 2005

What: Allow Python scripts to be called at more places
within Bacula and provide additional access to Bacula
internals.

Why: This will permit users to customize Bacula through
Python scripts.

Also add a way to get a listing of currently running
jobs (possibly also scheduled jobs).

Item 27: Incorporation of XACML2/SAML2 parsing
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: XACML is the "eXtensible Access Control Markup Language" and
SAML is the "Security Assertion Markup Language"--an XML standard
for making statements about identity and authorization. Having these
would give us a framework to approach ACLs in a generic manner, and
in a way flexible enough to support the four major sorts of ACLs I
see as a concern to Bacula at this point, as well as (probably) to
deal with new sorts of ACLs that may appear in the future.

Why: Bacula is beginning to need to back up systems with ACLs
that do not map cleanly onto traditional Unix permissions. I see
four sets of ACLs--in general, mutually incompatible with one
another--that we're going to need to deal with. These are: NTFS
ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
relevance of AFS; AFS is one of Sine Nomine's core consulting
businesses, and having a reputable file-level backup and restore
technology for it (as Tivoli is probably going to drop AFS support
soon since IBM no longer supports AFS) would be of huge benefit to
our customers; we'd most likely create the AFS support at Sine Nomine
for inclusion into the Bacula (and perhaps some changes to the
OpenAFS volserver) core code.)

Now, obviously, Bacula already handles NTFS just fine. However, I
think there's a lot of value in implementing a generic ACL model, so
that it's easy to support whatever particular instances of ACLs come
down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
things arriving in the Linux world in a big way in the near future.
XACML, although overcomplicated for our needs, provides this
framework, and we should be able to leverage other people's
implementations to minimize the amount of work *we* have to do to get
a generic ACL framework. Basically, the costs of implementation are
high, but they're largely both external to Bacula and already sunk.

Item 28: Filesystem watch triggered backup.
Origin: Jesper Krogh <jesper@krogh.cc>
Status: Unimplemented, depends probably on "client initiated backups"

What: With inotify and similar filesystem-triggered notification
systems it is possible to have the file daemon monitor
filesystem changes and initiate a backup.

Why: There are 2 situations where this is nice to have.
1) It is possible to get a much finer-grained backup than
the fixed schedules used now. A file created and deleted
a few hours later can automatically be caught.

2) The load introduced on the system will probably be
distributed more evenly.

Notes: This can be combined with configuration that specifies
something like: "at most every 15 minutes or when changes
exceed a given threshold".

Kern Notes: I would rather see this implemented by an external program
that monitors the filesystem changes, then uses the console
to start the appropriate job.
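
Following Kern's suggestion, such an external watcher could be as simple as
the sketch below: detect changes (here by naive mtime polling, to keep the
example self-contained; inotify would be the better mechanism) and start the
job through bconsole. The job and path names are placeholders.

   # Hypothetical sketch: poll a directory and trigger a Bacula job on change.
   import os, subprocess, time

   def latest_mtime(root):
       newest = 0
       for dirpath, _dirs, files in os.walk(root):
           for name in files:
               try:
                   newest = max(newest, os.path.getmtime(os.path.join(dirpath, name)))
               except OSError:
                   pass             # file vanished between listing and stat
       return newest

   def watch_and_backup(root="/home", job="HomeBackup", interval=900):
       last = latest_mtime(root)
       while True:
           time.sleep(interval)     # "at most every 15 minutes"
           current = latest_mtime(root)
           if current > last:
               subprocess.run(["bconsole"], input="run job=%s yes\n" % job, text=True)
               last = current

   # watch_and_backup()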

Item 29: Allow inclusion/exclusion of files in a fileset by creation/mod times
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 11, 2006

What: In the vein of the Wild and Regex directives in a Fileset's
Options, it would be helpful to allow a user to include or exclude
files and directories by creation or modification times.

You could factor the Exclude=yes|no option in much the same way it
affects the Wild and Regex directives. For example, you could exclude
all files modified before a certain date:

Modified Before = ####

Or you could exclude all files created/modified since a certain date:

Created Modified Since = ####

The format of the time/date could be done several ways, say the number
of seconds since the epoch:
1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`

Or a human readable date in a cryptic form:
20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss

Why: I imagine a feature like this could have many uses. It would
allow a user to do a full backup while excluding the base operating
system files, so if I installed a Linux snapshot from a CD yesterday,
I'll *exclude* all files modified *before* today. If I need to
recover the system, I use the CD I already have, plus the tape backup.
Or if, say, a Windows client is hit by a particularly corrosive
virus, and I need to *exclude* any files created/modified *since* the
time of infection.

Notes: Of course, this feature would work in concert with other
in/exclude rules, and wouldn't override them (or each other).

Notes: The directives I'd imagine would be along the lines of
"[Created] [Modified] [Before|Since] = <date>".
So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
or 'since'.

Item 30: Tray monitor window cleanups
Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk

What: Resizeable and scrollable windows in the tray monitor.

Why: With multiple clients, or with many jobs running, the displayed
window often ends up larger than the available screen, making
the trailing items difficult to read.

Item 31: Implement multiple numeric backup levels as supported by dump
Origin: Daniel Rich <drich@employees.org>

What: Dump allows specification of backup levels numerically instead of just
"full", "incr", and "diff". In this system, at any given level, all
files are backed up that were modified since the last backup of a
higher level (with 0 being the highest and 9 being the lowest). A
level 0 is therefore equivalent to a full, level 9 an incremental, and
the levels 1 through 8 are varying levels of differentials. For
bacula's sake, these could be represented as "full", "incr", and
"diff1", "diff2", etc.

Why: Support of multiple backup levels would provide for more advanced backup
rotation schemes such as "Towers of Hanoi". This would allow better
flexibility in performing backups, and can lead to shorter recovery
times.

Notes: Legato Networker supports a similar system with full, incr, and 1-9 as
backup levels.
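
The selection rule dump uses can be written down compactly; the sketch below
only illustrates that rule and is not Bacula code. A file is included in a
level-N backup if it changed after the most recent backup run at a
numerically lower (i.e. higher-priority) level.

   # Hypothetical sketch of dump-style numeric level selection.
   def reference_time(history, level):
       """history: list of (level, end_time) of previous backups.
       Return the time of the most recent backup at a lower level,
       or 0 (back up everything) if none exists."""
       times = [t for (lvl, t) in history if lvl < level]
       return max(times) if times else 0

   def include_file(mtime, history, level):
       return mtime > reference_time(history, level)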

Item 32: Automatic promotion of backup levels
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: Amanda has a feature whereby it estimates the space that a
differential, incremental, and full backup would take. If the
difference in space required between the scheduled level and the next
level up is beneath some user-defined critical threshold, the backup
level is bumped to the next type. Doing this minimizes the number of
volumes necessary during a restore, with a fairly minimal cost.

Why: I know at least one (quite sophisticated and smart) user
for whom the absence of this feature is a deal-breaker in terms of
using Bacula; if we had it it would eliminate the one cool thing
Amanda can do and we can't (at least, the one cool thing I know of).
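
The promotion rule itself is a simple comparison; the sketch below only
illustrates the idea described above, and the threshold semantics are an
assumption.

   # Hypothetical sketch: promote the backup level when the size difference
   # to the next level up falls below a user-defined threshold.
   def choose_level(scheduled, estimates, threshold_bytes):
       """estimates: dict like {"Incremental": 2e9, "Differential": 3e9, "Full": 40e9}.
       Returns the level to actually run."""
       next_up = {"Incremental": "Differential", "Differential": "Full"}
       promoted = next_up.get(scheduled)
       if promoted and estimates[promoted] - estimates[scheduled] < threshold_bytes:
           return promoted
       return scheduled

   # choose_level("Incremental",
   #              {"Incremental": 1.9e9, "Differential": 2.0e9, "Full": 40e9},
   #              threshold_bytes=0.5e9)   -> "Differential"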

Item 33: Clustered file-daemons
Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk

What: A "virtual" filedaemon, which is actually a cluster of real ones.

Why: In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
multiple machines may have access to the same set of filesystems.

For performance reasons, one may wish to initiate backups from
several of these machines simultaneously, instead of just using
one backup source for the common clustered filesystem.

For obvious reasons, normally backups of $A-FD/$PATH and
$B-FD/$PATH are treated as different backup sets. In this case
they are the same communal set.

Likewise when restoring, it would be easier to just specify
one of the cluster machines and let Bacula decide which to use.

This can be faked to some extent using DNS round robin entries
and a virtual IP address, however it means "status client" will
always give bogus answers. Additionally there is no way of
spreading the load evenly among the servers.

What is required is something similar to the storage daemon
autochanger directives, so that Bacula can keep track of
operating backups/restores and direct new jobs to a "free"
machine in the cluster.

Item 34: Commercial database support
Origin: Russell Howe <russell_howe dot wreckage dot org>

What: It would be nice for the database backend to support more
databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
DB2, MaxDB, etc are all candidates. SQL Server would presumably be
implemented using FreeTDS or maybe an ODBC library?

Why: We only really have one database server, which is MS SQL Server
2000. Maintaining a second one just for the backup software is a burden
(we grew out of SQLite, which I liked, but which didn't work so well with
our database size). We don't really have a machine with the resources to run
postgres, and would rather only maintain a single DBMS. We're stuck with
SQL Server because pretty much all the company's custom applications
(written by consultants) are locked into SQL Server 2000. I can imagine
this scenario is fairly common, and it would be nice to use the existing
properly specced database server for storing Bacula's catalog, rather
than having to run a second DBMS.

Item 35: Automatic disabling of devices
Origin: Peter Eriksson <peter at ifm.liu dot se>

What: After a configurable amount of fatal errors with a tape drive
Bacula should automatically disable further use of a certain
tape drive. There should also be "disable"/"enable" commands in
the "bconsole" tool.

Why: On a multi-drive jukebox there is a possibility of tape drives
going bad during large backups (needing a cleaning tape run,
tapes getting stuck). It would be advantageous if Bacula would
automatically disable further use of a problematic tape drive
after a configurable amount of errors has occurred.

An example: I have a multi-drive jukebox (6 drives, 380+ slots)
where tapes occasionally get stuck inside the drive. Bacula will
notice that the "mtx-changer" command will fail and then fail
any backup jobs trying to use that drive. However, it will still
keep on trying to run new jobs using that drive and fail -
forever, and thus failing lots and lots of jobs... Since we have
many drives Bacula could have just automatically disabled
further use of that drive and used one of the other ones
instead.

Item 36: An option to operate on all pools with update vol parameters
Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
Date: 16 August 2006

What: When I do update -> Volume parameters -> All Volumes
from Pool, then I have to select pools one by one. I'd like the
console to have an option like "0: All Pools" in the list of
pools.

Why: I have many pools and am therefore unhappy with manually
updating each of them using update -> Volume parameters -> All
Volumes from Pool -> pool #.

Item 37: Add an item to the restore option where you can select a pool
Origin: kshatriyak at gmail dot com

What: In the restore option (Select the most recent backup for a
client) it would be useful to add an option where you can limit
the selection to a certain pool.

Why: When using cloned jobs, most of the time you have 2 pools - a
disk pool and a tape pool. People who have 2 pools would like to
select the most recent backup from disk, not from tape (tape
would be only needed in emergency). However, the most recent
backup (which may just differ a second from the disk backup) may
be on tape and would be selected. The problem becomes bigger if
you have a full and differential - the most "recent" full backup
may be on disk, while the most recent differential may be on tape
(though the differential on disk may differ even only a second or
so). Bacula will complain that the backups reside on different
media then. For now the only solution when restoring when you
have 2 pools is to manually search for the right
JobIds and enter them by hand, which is rather error-prone.

Item 38: Include timestamp of job launch in "stat clients" output
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: The "stat clients" command doesn't include any detail on when
the active backup jobs were launched.

Why: Including the timestamp would make it much easier to decide whether
a job is running properly.

Notes: It may be helpful to have the output from "stat clients" formatted
more like that from "stat dir" (and other commands), in a column
format. The per-client information that's currently shown (level,
client name, JobId, Volume, pool, device, Files, etc.) is good, but
somewhat hard to parse (both programmatically and visually),
particularly when there are many active clients.

Item 39: Message mailing based on backup types
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 6, 2006

What: In the "Messages" resource definitions, allow messages
to be mailed based on the type (backup, restore, etc.) and level
(full, differential, etc) of job that created the originating
message.

Why: It would, for example, allow someone's boss to be emailed
automatically only when a Full Backup job runs, so he can
retrieve the tapes for offsite storage, even if the IT dept.
doesn't (or can't) explicitly notify him. At the same time, his
mailbox wouldn't be filled by notifications of Verifies, Restores,
or Incremental/Differential Backups (which would likely be kept
on-site).

Notes: One way this could be done is through additional message types, for example:

# email the boss only on full system backups
Mail = boss@mycompany.com = full, !incremental, !differential, !restore,

# email us only when something breaks
MailOnError = itdept@mycompany.com = all

Item 40: Include JobID in spool file name ****DONE****
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006
Status: Done. (patches/testing/project-include-jobid-in-spool-name.patch)
No need to vote for this item.

What: Change the name of the spool file to include the JobID.

Why: JobIDs are the common key used to refer to jobs, yet the
spoolfile name doesn't include that information. The date/time
stamp is useful (and should be retained).

============= New Feature Requests after vote of 26 Jan 2007 ========

Item n: Enable relocation of files and directories when restoring
Origin: Eric Bollengier <eric@eb.homelinux.org>

What: The where= option is not powerful enough. It would be
a great feature if Bacula could restore a file into the
same directory but with a different name, or into
another directory without recreating the full path.

Why: When I want to restore a production environment to a
development environment, I just want to change the first
directory, i.e. restore /prod/data/file.dat to /rect/data/file.dat.
At this time, I have to move the files by hand, and you must have a big
dump space to restore to and then move the data afterwards.

When I use Linux or SAN snapshots, I mount them to /mnt/snap_xxx,
so when I restore a file, I have to move it by hand
from /mnt/snap_xxx/file to /xxx/file. I can't replace the file
directly.

When a user asks me to restore a file into their personal folder
(without replacing the existing one), I can't restore from
my_file.txt to my_file.txt.old, which would be very practical.

Notes: I think we can enhance the where= option very easily by
allowing regular expressions (by replacing bregex with libpcre,
see http://en.wikipedia.org/wiki/PCRE and http://www.pcre.org/).

Since many users think that regexps are not user friendly, I think
that bat, bconsole or brestore must provide a simple way to
configure the where= option (I am thinking of something like
OpenOffice's "search and replace").

I.e., if the user uses where=/tmp/bacula-restore, we keep the old
behavior.

If the user uses something like where=s!/prod!/test!, files will
be restored from /prod/xxx to /test/xxx.

If the user uses something like where=s/$/.old/, files will
be restored from /prod/xxx.txt to /prod/xxx.txt.old.

If the user uses something like where=s/txt$/old.txt/, files will
be restored from /prod/xxx.txt to /prod/xxx.old.txt.

If the user uses something like where=s/([a-z]+)$/old.$1/, files will
be restored from /prod/xxx.ext to /prod/xxx.old.ext.
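
The sed-like rules above map naturally onto regular-expression substitution.
A minimal sketch of how a console or GUI could apply such a rule to each
restored path (purely illustrative, not the Bacula implementation):

   # Hypothetical sketch: apply a where=s!pat!rep! style rule to restore paths.
   import re

   def parse_rule(rule):
       """Turn 's!/prod!/test!' (any delimiter) into a (pattern, replacement) pair."""
       delim = rule[1]
       _, pattern, replacement, _ = rule.split(delim)
       return pattern, replacement

   def relocate(path, rule):
       pattern, replacement = parse_rule(rule)
       # PCRE-style $1 back-references become \1 for Python's re module.
       return re.sub(pattern, replacement.replace("$1", r"\1"), path)

   # relocate("/prod/data/file.dat", "s!/prod!/test!")     -> "/test/data/file.dat"
   # relocate("/prod/xxx.txt",       "s/$/.old/")          -> "/prod/xxx.txt.old"
   # relocate("/prod/xxx.ext",       "s/([a-z]+)$/old.$1/") -> "/prod/xxx.old.ext"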

Item n: Implement Catalog directive for Pool resource in Director
Origin: Alan Davis adavis@ruckus.com

What: The current behavior is for the director to create all pools
found in the configuration file in all catalogs. Add a
Catalog directive to the Pool resource to specify which
catalog to use for each pool definition.

Why: This allows different catalogs to have different pool
attributes and eliminates the side-effect of adding
pools to catalogs that don't need/use them.
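
A sketch of what the proposed directive could look like in bacula-dir.conf.
The Catalog directive inside Pool is hypothetical; the surrounding directives
are standard:

   Catalog {
     Name = SiteACatalog
     dbname = bacula_a; user = bacula; password = ""
   }

   Pool {
     Name = SiteAFull
     Pool Type = Backup
     Catalog = SiteACatalog   # proposed: define/use this pool only in this catalog
   }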

Item n: Implement NDMP protocol support

What: Network Data Management Protocol is implemented by a number of
NAS filer vendors to enable backups using third-party
software.

Why: This would allow NAS filer backups in Bacula without incurring
the overhead of NFS or SMB/CIFS.

Notes: Further information is available:
http://www.ndmp.org/wp/wp.shtml
http://www.traakan.com/ndmjob/index.html

There are currently no viable open-source NDMP
implementations. There is a reference SDK and example
app available from ndmp.org but it has problems
compiling on recent Linux and Solaris OSes. The ndmjob
reference implementation from Traakan is known to
compile on Solaris 10.

Notes (Kern): I am not at all in favor of this until NDMP becomes
an Open Standard or until there are Open Source libraries
that interface to it.

============= Empty Feature Request form ===========
Item n: One line summary ...
Date: Date submitted
Origin: Name and email of originator.

What: More detailed explanation ...

Why: Why it is important ...

Notes: Additional notes or features (omit if not used)
============== End Feature Request form ==============