Bacula Projects Roadmap
Status updated 8 August 2010
Item 1: Ability to restart failed jobs
Item 6: Include timestamp of job launch in "stat clients" output
Item 7: Include all conf files in specified directory
Item 8: Reduction of communications bandwidth for a backup
Item 9: Concurrent spooling and despooling within a single job
Item 10: Start spooling even when waiting on tape
Item 11: Add ability to Verify any specified Job
Item 12: Data encryption on storage daemon
Item 13: Possibility to schedule Jobs on last Friday of the month
Item 14: Scheduling syntax that permits more flexibility and options
Item 15: Ability to defer Batch Insert to a later time
Item 16: Add MaxVolumeSize/MaxVolumeBytes to Storage resource
Item 17: Message mailing based on backup types
Item 18: Handle Windows Encrypted Files using Win raw encryption
Item 19: Job migration between different SDs
Item 19: Allow FD to initiate a backup
Item 20: Implement Storage daemon compression
Item 21: Ability to import/export Bacula database entities
Item 22: Implementation of a running Job speed limit
Item 23: Add an override in Schedule for Pools based on backup types
Item 24: Automatic promotion of backup levels based on backup size
Item 25: Allow FileSet inclusion/exclusion by creation/mod times
Item 26: Archival (removal) of User Files to Tape
Item 27: Ability to reconnect a disconnected comm line
Item 28: Multiple threads in file daemon for the same job
Item 29: Automatic disabling of devices
Item 30: Enable persistent naming/numbering of SQL queries
Item 31: Bacula Dir, FD and SD to support proxies
Item 32: Add Minimum Spool Size directive
Item 33: Command that releases all drives in an autochanger
Item 34: Run bscan on a remote storage daemon from within bconsole
Item 35: Implement a Migration job type that will create a reverse
Item 36: Extend the verify code to make it possible to verify
Item 37: Separate "Storage" and "Device" in the bacula-dir.conf
Item 38: Least recently used device selection for tape drives in autochanger
Item 39: Implement a Storage device like Amazon's S3
Item 40: Convert tray monitor on Windows to a standalone program
Item 41: Improve Bacula's tape and drive usage and cleaning management
Item 42: Relabel disk volume after recycling
Item 1: Ability to restart failed jobs

What: Often jobs fail because of a communications line drop, max run time,
      a cancel, or some other non-critical problem. Currently any data
      saved is lost. This implementation should modify the Storage daemon
      so that it saves all the files that it knows are completely backed
      up.
      The jobs should then be marked as incomplete, and a subsequent
      Incremental Accurate backup will then take into account all the
      files already saved.

Why: Avoids backing up data already saved.

Notes: Requires Accurate to restart correctly. Must have a minimum
       volume of data or files stored on the Volume before enabling.
What: Various ideas for redesigns planned for the SD:
      1. One thread per drive.
      2. Design a class structure for all objects in the SD.
      3. Make Device into C++ classes for each device type.
      4. Give Device a proxy (a front-end intercept class) that permits
         control over locking and changing the real device pointer. It
         can also permit delayed opening, so that we can adapt to having
         another program tell us the Archive device name.
      5. Allow plugins to create new devices on the fly.
      6. Separate SD volume manager.
      7. Volume manager tells Bacula what drive or device to use for a
         given volume.

Why: It will simplify the SD, make it more modular, reduce locking
     conflicts, and allow multiple buffer backups.
Origin: Bacula Systems
Status: Enterprise only if implemented by Bacula Systems

What: Backup/restore via NDMP -- most importantly, NetApp compatibility

Origin: Bacula Systems
Status: Enterprise only if implemented by Bacula Systems

What: Backup/restore SAP databases (MaxDB, Oracle, possibly DB2)
Item 5: Oracle backup

Origin: Bacula Systems
Status: Enterprise only if implemented by Bacula Systems

What: Backup/restore Oracle databases
Item 6: Include timestamp of job launch in "stat clients" output
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: The "stat clients" command doesn't include any detail on when
      the active backup jobs were launched.

Why: Including the timestamp would make it much easier to decide whether
     a job is running properly.

Notes: It may be helpful to have the output from "stat clients" formatted
       more like that from "stat dir" (and other commands), in a column
       format. The per-client information that's currently shown (level,
       client name, JobId, Volume, pool, device, Files, etc.) is good, but
       somewhat hard to parse (both programmatically and visually),
       particularly when there are many active clients.
Item 7: Include all conf files in specified directory
Date: 18 October 2008
Origin: Database, Lda. Maputo, Mozambique
Contact: Cameron Smith / cameron.ord@database.co.mz

What: A directive something like "IncludeConf = /etc/bacula/subconfs". Every
      time the Bacula Director restarts or reloads, it would walk the given
      directory (non-recursively) and include the contents of any files
      therein, as though they were appended to bacula-dir.conf.

Why: Permits simplified and safer configuration for larger installations with
     many client PCs. Currently, through judicious use of JobDefs and
     similar directives, it is possible to reduce the client-specific part of
     a configuration to a minimum. The client-specific directives can be
     prepared according to a standard template and dropped into a known
     directory. However, it is still necessary to add a line to the "master"
     (bacula-dir.conf) referencing each new file. This exposes the master to
     unnecessary risk of accidental mistakes and makes automation of adding
     new client confs more difficult (it is easier to automate dropping a
     file into a directory than rewriting an existing file). Ken has previously
     made a convincing argument for NOT including Bacula's core configuration
     in an RDBMS, but I believe that the present request is a reasonable
     extension to the current "flat-file-based" configuration philosophy.

Notes: There is NO need for any special syntax in these files. They should
       contain standard directives which are simply "inlined" into the parent
       file, as already happens when you explicitly reference an external file.

Notes: (kes) this can already be done with scripting
       From: John Jorgensen <jorgnsn@lcd.uregina.ca>
       The bacula-dir.conf at our site contains these lines:

       # Include subfiles associated with configuration of clients.
       # They define the bulk of the Clients, Jobs, and FileSets.
       @|"sh -c 'for f in /etc/bacula/clientdefs/*.conf ; do echo @${f} ; done'"

       and when we get a new client, we just put its configuration into
       a new file called something like:

       /etc/bacula/clientdefs/clientname.conf
Item 8: Reduction of communications bandwidth for a backup
Date: 14 October 2008
Origin: Robin O'Leary (Equiinet)

What: Using rdiff techniques, Bacula could significantly reduce
      the network data transfer volume needed to do a backup.

Why: Faster backup across the Internet.

Notes: This requires retaining certain data on the client during a Full
       backup that will speed up subsequent backups.
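As an illustration only (not Bacula code; the fixed block size and SHA-1 digests are assumptions), the rdiff idea boils down to retaining per-block digests on the client from the last Full backup and later sending only the blocks whose digest changed:

```python
import hashlib

BLOCK = 64 * 1024  # assumed block size, not a real Bacula parameter

def block_digests(data: bytes):
    """Digest each fixed-size block; this is what the client would
    retain from the Full backup to speed up subsequent backups."""
    return [hashlib.sha1(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def changed_blocks(old_digests, new_data: bytes):
    """Return only the (index, block) pairs that differ from the
    retained digests; only these would cross the network."""
    changed = []
    for i, digest in enumerate(block_digests(new_data)):
        if i >= len(old_digests) or old_digests[i] != digest:
            changed.append((i, new_data[i * BLOCK:(i + 1) * BLOCK]))
    return changed
```

A real implementation (e.g. librsync, which rdiff uses) additionally employs a rolling weak checksum so that inserted bytes do not shift and invalidate every later block.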
Item 9: Concurrent spooling and despooling within a single job
Origin: Jesper Krogh <jesper@krogh.cc>

What: When a job has spooling enabled and the spool area size is
      less than the total volume size, the storage daemon will:
      1) Spool to the spool area.
      2) Despool the spool area to tape.
      3) Go to 1 if there is more data to be backed up.

      Typical disks will serve data at a speed of 100 MB/s when
      dealing with large files; the network is typically capable of
      115 MB/s (GbE). Tape drives will despool at 50-90 MB/s (LTO3) or
      70-120 MB/s (LTO4), depending on compression and data.

      As Bacula currently works, it will hold back data from the client
      until despooling is done, no matter whether the spool area could
      handle another block of data. Given a FileSet of 4 TB, a spool
      area of 100 GB, and a Maximum Job Spool Size of 50 GB, the above
      sequence could be changed to spool to the other 50 GB while
      despooling the first 50 GB, without holding back the client while
      doing it. As the numbers above show, depending on tape drive and
      disk arrays this potentially cuts the backup time of the
      individual jobs by 50%.

      Real-world example: backing up 112.6 GB (large files) to LTO4
      tapes (despools at ~75 MB/s); data is gzipped on the remote
      filesystem. Maximum Job Spool Size = 8 GB.

      Elapsed time (total time): 46m 15s => 2775s
      Despooling time: 25m 41s => 1541s (55%)
      Spooling time: 20m 34s => 1234s (45%)
      Reported speed: 40.58 MB/s
      Spooling speed: 112.6 GB / 1234s => 91.25 MB/s
      Despooling speed: 112.6 GB / 1541s => 73.07 MB/s

      So disk + net can "keep up" with the LTO4 drive (in this test).

      The proposed change would effectively make the backup run in the
      "despooling time" of 1541s, a reduction to 55% of the total run
      time.
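The reported figures can be sanity-checked directly (decimal gigabytes, as drive vendors quote them):

```python
GB = 1000 ** 3
data = 112.6 * GB

elapsed = 46 * 60 + 15   # 2775 s total
despool = 25 * 60 + 41   # 1541 s
spool   = 20 * 60 + 34   # 1234 s

spool_speed   = data / spool / 1e6    # ~91.25 MB/s
despool_speed = data / despool / 1e6  # ~73.07 MB/s
reduction     = despool / elapsed     # ~0.555, i.e. ~55% of the run time
```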
      In the situation where the individual job cannot keep up with the
      LTO drive, spooling enables efficient multiplexing of multiple
      concurrent jobs onto one drive.

Why: When dealing with larger volumes, the general utilization of the
     network/disk is important to maximize in order to be able to run a
     full backup over a weekend. The current work-around is to split the
     FileSet into smaller FileSets and Jobs, but that leads to more
     configuration management and is harder to review for completeness.
     It also makes restores more complicated.
Item 10: Start spooling even when waiting on tape
Origin: Tobias Barth <tobias.barth@web-arts.com>

What: If a job can be spooled to disk before writing it to tape, it should
      be spooled immediately. Currently, Bacula waits until the correct
      tape is inserted into the drive.

Why: It could save hours. While Bacula waits on the operator to insert
     the correct tape (e.g. a new tape or a tape from another media
     pool), it could already prepare the spooled data in the spooling
     directory and immediately start despooling when the tape is
     inserted by the operator.

     2nd step: Use 2 or more spooling directories. When one directory is
     currently despooling, the next (on different disk drives) could
     already be spooling the next data.

Notes: I am using Bacula 2.2.8, which has none of those features.
Item 11: Add ability to Verify any specified Job
Date: 17 January 2008
Origin: portrix.net Hamburg, Germany.
Contact: Christian Sabelmann
Status: 70% of the required code is part of the Verify function since v. 2.x

What: The ability to tell Bacula which Job it should verify, instead of
      automatically verifying just the last one.

Why: It is a pity that such a powerful feature as Verify Jobs
     (VolumeToCatalog) is restricted to the last backup Job of a client.
     Users who do daily backups are currently forced to also run daily
     Verify Jobs in order to take advantage of this useful feature. This
     verify-after-every-backup practice is not always desired, and Verify
     Jobs sometimes have to be scheduled separately (not necessarily in
     Bacula). With this feature, admins could verify Jobs once a week or
     once a month, selecting the Jobs they want to verify. The feature
     should also not be too difficult to implement, taking into account
     older bug reports about it and the selection of the Job to be
     verified.

Notes: For the Verify Job, the user could select the Job to be verified
       from a list of the latest Jobs of a client. It would also be
       possible to verify a certain volume. All of this would naturally
       apply only to Jobs whose file information is still in the catalog.
Item 12: Data encryption on storage daemon
Origin: Tobias Barth <tobias.barth at web-arts.com>
Date: 04 February 2009

What: The storage daemon should be able to do the data encryption that can
      currently be done by the file daemon.

Why: This would have 2 advantages:
     1) one could encrypt the data of unencrypted tapes by doing a
        migration job;
     2) the storage daemon would be the only machine that would have
        to keep the encryption keys.

Notes: As an addendum to the feature request, here are some crypto
       implementation details I wrote up regarding SD encryption back in
       January:
       http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg28860.html
Item 13: Possibility to schedule Jobs on last Friday of the month
Origin: Carsten Menke <bootsy52 at gmx dot net>

What: Currently, if you want to run your monthly backups on the last
      Friday of each month, this is only possible with workarounds
      (e.g. scripting), as some months have 4 Fridays and some have 5.
      The same is true if you plan to run your yearly backups on the
      last Friday of the year. It would be nice to have the ability to
      use the built-in scheduler for this.

Why: In many companies the last working day of the week is Friday (or
     Saturday), so to get the most data of the month onto the monthly
     tape, the employees are advised to insert the tape for the
     monthly backups on the last Friday of the month.

Notes: To give this complete functionality, it would be nice if the
       "first" and "last" keywords could be implemented in the
       scheduler, so it would also be possible to run monthly backups
       on the first Friday of the month, and much more. If the syntax
       were expanded to {first|last} {Month|Week|Day|Mo-Fri} of the
       {Year|Month|Week}, you would be able to run really flexible jobs.

       To get a certain Job to run on the last Friday of the month, for
       example:

       Run = pool=Monthly last Fri of the Month at 23:50

       ## Yearly Backup on the last Friday of the Year

       Run = pool=Yearly last Fri of the Year at 23:50

       ## Certain Jobs the last Week of a Month

       Run = pool=LastWeek last Week of the Month at 23:50

       ## Monthly Backup on the last day of the month

       Run = pool=Monthly last Day of the Month at 23:50
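For reference, the "last Friday" rule itself is cheap date arithmetic; a sketch of what the scheduler would need (standard library only, not Bacula code):

```python
import calendar
import datetime

def last_weekday(year: int, month: int, weekday: int) -> datetime.date:
    """Last occurrence of `weekday` (Mon=0 .. Sun=6) in the month:
    take the month's final day and step back to that weekday."""
    last_day = calendar.monthrange(year, month)[1]
    end = datetime.date(year, month, last_day)
    return end - datetime.timedelta(days=(end.weekday() - weekday) % 7)

# last Friday (weekday 4) of August 2010
print(last_weekday(2010, 8, 4))  # -> 2010-08-27
```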
Item 14: Scheduling syntax that permits more flexibility and options
Date: 15 December 2006
Origin: Gregory Brauer (greg at wildbrain dot com) and
        Florian Schnabel <florian.schnabel at docufy dot de>

What: Currently, Bacula only understands how to deal with weeks of the
      month or weeks of the year in schedules. This makes it impossible
      to do a true weekly rotation of tapes. There will always be a
      discontinuity that will require disruptive manual intervention at
      least monthly or yearly, because week boundaries never align with
      month or year boundaries.

      A solution would be to add a new syntax that defines (at least)
      a start timestamp and a repetition period.

      Also useful: an easy option to skip a certain job on a certain
      date.

Why: Rotated backups done at weekly intervals are useful, and Bacula
     cannot currently do them without extensive hacking.

     You could then easily skip tape backups on holidays. Especially
     if you have no autochanger and can only fit one backup on a tape,
     that would be really handy; other jobs could proceed normally
     and you won't get errors that way.
Notes: Here is an example syntax showing a 3-week rotation where full
       backups would be performed every week on Saturday, and an
       incremental would be performed every week on Tuesday. Each
       set of tapes could be removed from the loader for the following
       two cycles before coming back and being reused on the third
       week. Since the execution times are determined by intervals
       from a given point in time, there will never be any issues with
       having to adjust to any sort of arbitrary time boundary. In
       the example provided, I even define the starting schedule
       as crossing both a year and a month boundary, but the run times
       would be based on the "Repeat" value and would therefore happen
       at the proper intervals.

       Name = "Week 1 Rotation"
       #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
       Start = 2006-12-30 01:00
       #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
       Start = 2007-01-02 01:00

       Name = "Week 2 Rotation"
       #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
       Start = 2007-01-06 01:00
       #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
       Start = 2007-01-09 01:00

       Name = "Week 3 Rotation"
       #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
       Start = 2007-01-13 01:00
       #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
       Start = 2007-01-16 01:00

Notes: Kern: I have merged the previously separate project of skipping
       jobs (via Schedule syntax) into this.
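The proposed Start/Repeat semantics reduce to plain interval arithmetic; a sketch (the 21-day period stands in for a hypothetical "Repeat = 3 weeks"):

```python
import datetime

def run_dates(start: datetime.date, period_days: int, count: int):
    """Run dates from a fixed start plus a repetition period; they
    never need adjusting to month or year boundaries."""
    return [start + datetime.timedelta(days=period_days * i)
            for i in range(count)]

# "Week 1 Rotation" full backups: Start = 2006-12-30, every 3 weeks
saturdays = run_dates(datetime.date(2006, 12, 30), 21, 3)
# -> Dec 30, Jan 20, Feb 10 (matching the comments above)
```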
Item 15: Ability to defer Batch Insert to a later time

What: Instead of doing the Job Batch Insert at the end of the Job,
      which might create resource contention with lots of Jobs,
      defer the insert to a later time.

Why: Permits focusing on getting the data onto the Volume and
     putting the metadata into the Catalog outside the backup
     window.

Notes: Will use the proposed Bacula ASCII database import/export
       format (i.e. dependent on the import/export entities project).
Item 16: Add MaxVolumeSize/MaxVolumeBytes to Storage resource
Origin: Bastian Friedrich <bastian.friedrich@collax.com>

What: The SD has a "Maximum Volume Size" statement, which is deprecated
      and superseded by the Pool resource statement "Maximum Volume
      Bytes". It would be good if either statement could be used in
      Storage resources as well.

Why: Pools do not have to be restricted to a single storage type/device;
     thus, it may be impossible to define Maximum Volume Bytes in the
     Pool resource. The old MaxVolSize statement is deprecated, as it
     is SD-side only. I am using the same pool for different devices.

Notes: State of the idea currently unknown. Storage resources in the dir
       config currently translate to very slim catalog entries; these
       entries would require extensions to implement what is described
       here. Quite possibly, numerous other statements that are currently
       available in Pool resources could be used in Storage resources
       too.
Item 17: Message mailing based on backup types
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 6, 2006

What: In the "Messages" resource definitions, allow messages
      to be mailed based on the type (backup, restore, etc.) and level
      (full, differential, etc.) of the job that created the originating
      message.

Why: It would, for example, allow someone's boss to be emailed
     automatically only when a Full Backup job runs, so he can
     retrieve the tapes for offsite storage, even if the IT dept.
     doesn't (or can't) explicitly notify him. At the same time, his
     mailbox wouldn't be filled by notifications of Verifies, Restores,
     or Incremental/Differential Backups (which would likely be kept
     internal to the IT dept.).

Notes: One way this could be done is through additional message types,
       for example:

       # email the boss only on full system backups
       Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
       # email us only when something breaks
       MailOnError = itdept@mycompany.com = all

Notes: Kern: This should be rather trivial to implement.
Item 18: Handle Windows Encrypted Files using Win raw encryption
Origin: Michael Mohr, SAG Mohr.External@infineon.com
Date: 22 February 2008
Origin: Alex Ehrlich (Alex.Ehrlich-at-mail.ee)

What: Make it possible to back up and restore Encrypted Files from and
      to Windows systems without the need to decrypt them, by using the
      raw encryption API functions (see:
      http://msdn2.microsoft.com/en-us/library/aa363783.aspx)
      that are provided for that reason by Microsoft.
      Whether a file is encrypted can be determined by evaluating the
      FILE_ATTRIBUTE_ENCRYPTED flag returned by GetFileAttributes.

      For each file backed up or restored by the FD on Windows, check if
      the file is encrypted; if so, then use OpenEncryptedFileRaw,
      ReadEncryptedFileRaw, WriteEncryptedFileRaw and
      CloseEncryptedFileRaw instead of BackupRead and BackupWrite.

Why: Without the usage of this interface, the FD running
     under the system account can't read encrypted files because
     it does not have the key needed for the decryption. As a result,
     encrypted files are currently not backed up
     by Bacula, and no error is shown for these missing files either.

Notes: Using the xxxEncryptedFileRaw API would allow backing up and
       restoring EFS-encrypted files without decrypting their data.
       Note that such files cannot be restored "portably" (at least,
       not easily), but they would be restorable to a different (or
       reinstalled) Win32 machine; the restore would require setup
       of an EFS recovery agent in advance, of course, and this shall
       be clearly reflected in the documentation, but this is the
       normal Windows SysAdmin's business.
       When a "portable" backup is requested, the EFS-encrypted files
       shall be clearly reported as errors.
       See MSDN on the "Backup and Restore of Encrypted Files" topic:
       http://msdn.microsoft.com/en-us/library/aa363783.aspx
       Maybe the EFS support requires a new flag in the database for
       each file.
       Unfortunately, the implementation is not as straightforward as a
       1-to-1 replacement of BackupRead with ReadEncryptedFileRaw,
       requiring some FD code rewrite to work with
       encrypted-file-related callback functions.
Item 19: Job migration between different SDs
Origin: Mariusz Czulada <manieq AT wp DOT eu>

What: Allow specifying, in a migration job, devices on a Storage Daemon
      other than the one used for the migrated jobs (possibly on a
      different/distant host).

Why: Sometimes we have more than one system which requires a backup
     implementation. Often, these systems are functionally unrelated and
     placed in different locations. Having a big backup device (a tape
     library) in each location is not cost-effective. It would be much
     better to have one powerful enough tape library which could handle
     backups from all systems, assuming relatively fast and reliable WAN
     connections. In such an architecture, backups are done in service
     windows on local bacula servers, then migrated to central storage
     off peak hours.

Notes: If migration to a different SD is working, migration to the same
       SD, as now, could be done the same way (I mean 'localhost') to
       unify the code.
Item 19: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: Provide some means, possibly by a restricted console, that
      allows a FD to initiate a backup, and that uses the connection
      established by the FD to the Director for the backup, so that
      a Director that is firewalled can do the backup.

Why: Makes backup of laptops much easier.

Notes: - The FD already has code for the monitor interface
       - It could be nice to have a .job command that lists authorized
         jobs
       - Commands need to be restricted on the Director side
         (for example by re-using the runscript flag)
       - The Client resource can be used to authorize the connection
       - Initially, the client can't modify job parameters
       - We need a way to run a status command to follow job progression

       This project consists of the following points:
       1. Modify the FD to have a "mini-console" interface that
          permits it to connect to the Director and start a
          backup job of itself.
       2. The list of jobs that can be started by the FD is
          defined in the Director (possibly via a restricted
          console).
       3. Modify the existing tray monitor code in the Win32 FD
          so that it is a separate program from the FD.
       4. The tray monitor program should be extended to permit
          initiating a backup.
       5. No new Director directives should be added without
          prior consultation with the Bacula developers.
       6. The comm line used by the FD to connect to the Director
          should be re-used by the Director to do the backup.
          This feature is partially implemented in the Director.
       7. The FD may have a new directive that allows it to start
          a backup when the FD starts.
       8. The console interface to the FD should be extended to
          permit a properly authorized console to initiate a
          backup via the FD.
Item 20: Implement Storage daemon compression
Date: 18 December 2006
Origin: Vadim A. Umanski, e-mail umanski@ext.ru

What: The ability to compress backup data on the SD receiving the data
      instead of doing that on the client sending the data.

Why: The need is practical. I've got some machines that can send
     data to the network 4 or 5 times faster than compressing
     it (I've measured that). They're using fast enough SCSI/FC
     disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
     And the backup server has quite fast CPUs (e.g. dual P4
     Xeons) and quite a low load. When you have 20, 50 or 100 GB
     of raw data, running a job 4 to 5 times faster really
     matters. On the other hand, the data can be
     compressed 50% or better, so losing twice the space for
     disk backup is not good at all. And the network is all mine
     (I have a dedicated management/provisioning network) and I
     can get as high a bandwidth as I need: 100 Mbps, 1000 Mbps...
     That's why the server-side compression feature is needed!
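Conceptually, the change only moves where the compression call happens; a minimal sketch of SD-side compression of a received block (zlib is chosen for illustration, and the `sd_store_block` hook is hypothetical, not a Bacula function):

```python
import zlib

def sd_store_block(block: bytes, level: int = 6) -> bytes:
    """Compress a data block on the receiving side (the SD) rather
    than on the sending client, trading SD CPU for client speed."""
    return zlib.compress(block, level)

raw = b"typical compressible backup data\n" * 2000
stored = sd_store_block(raw)
# redundant data compresses far below its raw size;
# the restore path would call zlib.decompress(stored)
```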
Item 21: Ability to import/export Bacula database entities

What: Create a Bacula ASCII SQL-database-independent format that permits
      importing and exporting database catalog Job entities.

Why: For archival, database clustering, and transfer to other databases
     of different types.

Notes: Job selection should be by Job, time, Volume, Client, Pool and
       possibly other criteria.
Item 22: Implementation of a running Job speed limit
Origin: Alex F, alexxzell at yahoo dot com
Date: 29 January 2009

What: I noticed the need for an integrated bandwidth limiter for
      running jobs. It would be very useful just to specify another
      field in bacula-dir.conf, like speed = how much speed you wish
      for that specific job to run at.

Why: For a couple of reasons. First, it's very hard to implement a
     traffic shaping utility and also make it reliable. Second, it is
     very uncomfortable to have to deploy such apps to, let's say, 50
     clients (including desktops and servers). This would also be
     unreliable because you have to make sure that the apps are properly
     working when needed; users could also disable them (accidentally or
     not). It would be very useful to give Bacula this ability. All
     information would be centralized: you would not have to go to 50
     different clients in 10 different locations for configuration;
     eliminating 3rd party additions helps in establishing efficiency.
     It would also avoid bandwidth congestion, especially where there is
     little available.
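A job speed limit is typically a token-bucket throttle in the data path; a minimal sketch (the class name and bytes-per-second unit are assumptions, not Bacula syntax):

```python
import time

class JobSpeedLimit:
    """Token bucket: each send 'spends' bytes; when the bucket is
    overdrawn, sleep long enough that sustained throughput stays
    at or below `rate` bytes/second."""
    def __init__(self, rate: float):
        self.rate = rate
        self.allowance = rate          # allow an initial burst
        self.last = time.monotonic()

    def wait(self, nbytes: int):
        now = time.monotonic()
        self.allowance = min(self.rate,
                             self.allowance + (now - self.last) * self.rate)
        self.last = now
        self.allowance -= nbytes
        if self.allowance < 0:         # overdrawn: pay it back in sleep
            time.sleep(-self.allowance / self.rate)
```

The daemon sending data would call `wait(len(block))` before each network write; the configured `speed =` value becomes `rate`.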
Item 23: Add an override in Schedule for Pools based on backup types
Origin: Chad Slater <chad.slater@clickfox.com>

What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
      would help those of us who use different storage devices for
      different backup levels cope with the "auto-upgrade" of a backup.

Why: Assume I add several new devices to be backed up, i.e. several
     hosts with 1 TB RAID. To avoid tape switching hassles, incrementals
     are stored in a disk set on a 2 TB RAID. If you add these devices
     in the middle of the month, the incrementals are upgraded to "full"
     backups, but they try to use the same storage device as requested
     in the incremental job, filling up the RAID holding the
     differentials. If we could override the Storage parameter for full
     and/or differential backups, then the Full job would use the proper
     Storage device, which has more capacity (e.g. an 8 TB tape
     library).
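A sketch of how such an override might look in a Schedule (hypothetical syntax: FullStorage is the directive name suggested above, and the resource names are made up):

```
Schedule {
  Name = "MixedMedia"
  # Hypothetical: fulls (including auto-upgraded ones) forced to tape
  Run = Level=Full FullStorage=BigTapeLibrary Pool=Monthly 1st sat at 03:05
  # Incrementals stay on the 2TB disk set
  Run = Level=Incremental Storage=DiskRAID Pool=Daily mon-fri at 03:05
}
```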
Item 24: Automatic promotion of backup levels based on backup size
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: Other backup programs have a feature whereby they estimate the
      space that a differential, incremental, and full backup would
      take. If the difference in space required between the scheduled
      level and the next level up is beneath some user-defined critical
      threshold, the backup level is bumped to the next type. Doing this
      minimizes the number of volumes necessary during a restore, with a
      fairly minimal cost in backup media space.

Why: I know at least one (quite sophisticated and smart) user for whom
     the absence of this feature is a deal-breaker in terms of using
     Bacula; if we had it, it would eliminate the one cool thing other
     backup programs can do and we can't (at least, the one cool thing I
     know of).
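The decision rule described above is simple to state precisely; a sketch (the level names match Bacula's, but the 90% threshold and function are illustrative, not an existing directive):

```python
def promote_level(scheduled: str, est_bytes: dict,
                  threshold: float = 0.9) -> str:
    """Bump the backup level while the scheduled level's estimated
    size is at least `threshold` of the next level up's size, i.e.
    while the space saved by staying at the lower level is negligible.
    `est_bytes` maps level name to estimated backup size."""
    order = ["Incremental", "Differential", "Full"]
    i = order.index(scheduled)
    while i < len(order) - 1 and \
            est_bytes[order[i]] >= threshold * est_bytes[order[i + 1]]:
        i += 1
    return order[i]

# a 95 GB incremental vs. a 100 GB full: just run the full
print(promote_level("Incremental",
                    {"Incremental": 95, "Differential": 96, "Full": 100}))
# -> Full
```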
Item 25: Allow FileSet inclusion/exclusion by creation/mod times
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 11, 2006

What: In the vein of the Wild and Regex directives in a FileSet's
      Options, it would be helpful to allow a user to include or exclude
      files and directories by creation or modification times.

      You could factor in the Exclude=yes|no option in much the same way
      it affects the Wild and Regex directives. For example, you could
      exclude all files modified before a certain date:

      Modified Before = ####

      Or you could exclude all files created/modified since a certain
      date:

      Created Modified Since = ####

      The format of the time/date could be done several ways, say the
      number of seconds since the epoch:

      1137008553 = Jan 11 2006, 1:42:33PM  # result of `date +%s`

      Or a human-readable date in a cryptic form:

      20060111134233 = Jan 11 2006, 1:42:33PM  # YYYYMMDDhhmmss

Why: I imagine a feature like this could have many uses. It would
     allow a user to do a full backup while excluding the base operating
     system files, so if I installed a Linux snapshot from a CD
     yesterday, I'll *exclude* all files modified *before* today. If I
     need to recover the system, I use the CD I already have, plus the
     tape backup. Or if, say, a Windows client is hit by a particularly
     corrosive virus, and I need to *exclude* any files created/modified
     *since* the time of infection.

Notes: Of course, this feature would work in concert with other
       in/exclude rules, and wouldn't override them (or each other).

Notes: The directives I'd imagine would be along the lines of
       "[Created] [Modified] [Before|Since] = <date>".
       So one could compare against 'ctime' and/or 'mtime', but ONLY
       'before' or 'since'.
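The matching semantics can be sketched with a plain stat-based filter (illustration only; the function name is made up, and the epoch-seconds cutoff matches the `date +%s` example above):

```python
import os

def select_files(root: str, modified_since: float):
    """Walk `root` and keep only files whose mtime is at or after
    the cutoff (seconds since the epoch, as from `date +%s`);
    a 'Modified Before' rule would simply invert the comparison."""
    selected = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime >= modified_since:
                selected.append(path)
    return selected
```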
Item 26: Archival (removal) of User Files to Tape
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]

What: The ability to archive data to storage based on certain parameters
      such as age, size, or location. Once the data has been written to
      storage and logged, it is then pruned from the originating
      filesystem. Note! We are talking about the user's files and not
      Bacula's files.

Why: This would allow fully automatic storage management, which becomes
     useful for large datastores. It would also allow for auto-staging
     from one media type to another.

     Example 1) Medical imaging needs to store large amounts of data.
     They decide to keep data on their servers for 6 months and then put
     it away for long term storage. The server then finds all files
     older than 6 months and writes them to tape. The files are then
     removed from disk.

     Example 2) All data that hasn't been accessed in 2 months could be
     moved from high-cost fibre-channel disk storage to a low-cost
     large-capacity SATA disk storage pool which doesn't have as quick
     an access time. Then after another 6 months (or possibly as one
     storage pool gets full) data is migrated to tape.
824 Item 27: Ability to reconnect a disconnected comm line
829 What: Often jobs fail because of a communications line drop. In that
case, Bacula should be able to reconnect to the other daemon and
resume the job.
Why: Avoids re-backing up data that has already been saved.
Notes: *Very* complicated from a design point of view because of authentication.
837 Item 28: Multiple threads in file daemon for the same job
838 Date: 27 November 2005
839 Origin: Ove Risberg (Ove.Risberg at octocode dot com)
842 What: I want the file daemon to start multiple threads for a backup
843 job so the fastest possible backup can be made.
845 The file daemon could parse the FileSet information and start
one thread for each File entry located on a separate
disk or filesystem.
A configuration option in the job section should be used to
enable or disable this feature. The configuration option could
specify the maximum number of threads in the file daemon.
If the threads could spool the data to separate spool files
854 the restore process will not be much slower.
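As a sketch, the configuration option mentioned above might look like
this in the Job resource (the thread directive is hypothetical, not
implemented; other names are illustrative):

```
Job {
  Name = "fileserver-backup"
  Client = fileserver-fd
  FileSet = "MultiDiskSet"
  Storage = TapeLibrary
  Pool = Full
  # Hypothetical: run up to 4 FD threads, one per File entry on a
  # separate disk, each spooling to its own spool file
  Maximum File Daemon Threads = 4
}
```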
856 Why: Multiple concurrent backups of a large fileserver with many
857 disks and controllers will be much faster.
859 Notes: (KES) This is not necessary and could be accomplished
860 by having two jobs. In addition, the current VSS code
864 Item 29: Automatic disabling of devices
866 Origin: Peter Eriksson <peter at ifm.liu dot se>
What: After a configurable number of fatal errors with a tape drive,
Bacula should automatically disable further use of that
tape drive. There should also be "disable"/"enable" commands in
bconsole.
874 Why: On a multi-drive jukebox there is a possibility of tape drives
875 going bad during large backups (needing a cleaning tape run,
876 tapes getting stuck). It would be advantageous if Bacula would
877 automatically disable further use of a problematic tape drive
after a configurable number of errors has occurred.
880 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
881 where tapes occasionally get stuck inside the drive. Bacula will
882 notice that the "mtx-changer" command will fail and then fail
883 any backup jobs trying to use that drive. However, it will still
884 keep on trying to run new jobs using that drive and fail -
885 forever, and thus failing lots and lots of jobs... Since we have
886 many drives Bacula could have just automatically disabled
further use of that drive and used one of the other ones instead.
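A sketch of how this might look in configuration and in bconsole (the
directive and the drive-level commands are hypothetical, not
implemented):

```
# In bacula-sd.conf (hypothetical directive):
Device {
  Name = Drive-3
  Archive Device = /dev/nst3
  Maximum Errors = 3        # disable drive after 3 fatal errors
}

# In bconsole (hypothetical commands):
*disable storage=Autochanger drive=3
*enable storage=Autochanger drive=3
```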
891 Item 30: Enable persistent naming/number of SQL queries
897 Change the parsing of the query.sql file and the query command so that
queries are named/numbered by a fixed value, not their order in the
file.
903 One of the real strengths of bacula is the ability to query the
904 database, and the fact that complex queries can be saved and
905 referenced from a file is very powerful. However, the choice
906 of query (both for interactive use, and by scripting input
907 to the bconsole command) is completely dependent on the order
within the query.sql file. The descriptive labels are helpful for
909 interactive use, but users become used to calling a particular
910 query "by number", or may use scripts to execute queries. This
presents a problem if the number or order of queries in the file
changes.
914 If the query.sql file used the numeric tags as a real value (rather
915 than a comment), then users could have a higher confidence that they
are executing the intended query, and that their local changes
wouldn't conflict with future bacula upgrades.
919 For scripting, it's very important that the intended query is
920 what's actually executed. The current method of parsing the
921 query.sql file discourages scripting because the addition or
922 deletion of queries within the file will require corresponding
923 changes to scripts. It may not be obvious to users that deleting
924 query "17" in the query.sql file will require changing all
925 references to higher numbered queries. Similarly, when new
926 bacula distributions change the number of "official" queries,
927 user-developed queries cannot simply be appended to the file
928 without also changing any references to those queries in scripts
929 or procedural documentation, etc.
931 In addition, using fixed numbers for queries would encourage more
user-initiated development of queries, by supporting conventions
such as:
queries numbered 1-50 are supported/developed/distributed with
official bacula releases
938 queries numbered 100-200 are community contributed, and are
939 related to media management
941 queries numbered 201-300 are community contributed, and are
942 related to checksums, finding duplicated files across
943 different backups, etc.
945 queries numbered 301-400 are community contributed, and are
946 related to backup statistics (average file size, size per
947 client per backup level, time for all clients by backup level,
948 storage capacity by media type, etc.)
950 queries numbered 500-999 are locally created
Alternatively, queries could be called by keyword (tag), rather
than by number.
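As a sketch, a query.sql format with fixed numeric tags might look like
this (the tag syntax is hypothetical; today the leading text is only a
comment and queries are numbered by their position in the file; the SQL
is merely illustrative):

```
:501:Volumes with the most errors
SELECT VolumeName, VolErrors
  FROM Media
 ORDER BY VolErrors DESC
 LIMIT 20;

:502:Jobs that wrote more than 10 GB
SELECT Name, JobId, JobBytes
  FROM Job
 WHERE JobBytes > 10737418240;
```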
957 Item 31: Bacula Dir, FD and SD to support proxies
958 Origin: Karl Grindley @ MIT Lincoln Laboratory <kgrindley at ll dot mit dot edu>
962 What: Support alternate methods for nailing up a TCP session such
963 as SOCKS5, SOCKS4 and HTTP (CONNECT) proxies. Such a feature
would allow tunneling of bacula traffic in and out of proxied
networks.
Why: Currently, bacula is architected to only function on a flat network, with
no barriers or limitations. Due to the large configuration states of
any network and the infinite configurations in which file daemons and
storage daemons may sit in relation to one another, bacula is often
not usable on a network where filtered or air-gapped networks exist.
While solutions such as ACL modifications to firewalls or port
redirection via SNAT or DNAT will often solve the issue, these
solutions are sometimes not adequate or not allowed by hard policy.
In an air-gapped network where only highly locked-down proxy services
are provided (SOCKS4/5 and/or HTTP and/or SSH outbound), ACLs or
iptables rules will not work.
Notes: Director resource tunneling: the configuration option to utilize a
proxy to connect to a client should be specified in the Client
resource. Client resource tunneling: should this be configured in the
Client resource in the director config file, or in the bacula-fd
configuration file on the FD host itself? If the latter, this would
allow only certain clients to use a proxy, while others do not, when
establishing the TCP connection to the storage server.
Also worth noting, there are other 3rd party, lightweight apps that
could be utilized to bootstrap this. Instead of socksifying bacula
990 itself, use an external program to broker proxy authentication, and
991 connection to the remote host. OpenSSH does this by using the
992 "ProxyCommand" syntax in the client configuration and uses stdin and
993 stdout to the command. Connect.c is a very popular one.
994 (http://bent.latency.net/bent/darcs/goto-san-connect-1.85/src/connect.html).
995 One could also possibly use stunnel, netcat, etc.
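For reference, the OpenSSH "ProxyCommand" mechanism mentioned above
looks like this when used with the connect program (host names are
illustrative):

```
# ~/.ssh/config: broker the TCP session through a SOCKS5 proxy,
# speaking to the external program over stdin/stdout
Host backup-client.example.com
    ProxyCommand connect -S socks-proxy.example.com:1080 %h %p
```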
Item 32: Add Minimum Spool Size directive
1000 Origin: Frank Sweetser <fs@wpi.edu>
1002 What: Add a new SD directive, "minimum spool size" (or similar). This
1003 directive would specify a minimum level of free space available for
1004 spooling. If the unused spool space is less than this level, any
1005 new spooling requests would be blocked as if the "maximum spool
size" threshold had been reached. Already-spooling jobs would be
1007 unaffected by this directive.
1009 Why: I've been bitten by this scenario a couple of times:
1011 Assume a maximum spool size of 100M. Two concurrent jobs, A and B,
1012 are both running. Due to timing quirks and previously running jobs,
1013 job A has used 99.9M of space in the spool directory. While A is
1014 busy despooling to disk, B is happily using the remaining 0.1M of
spool space. This results in a spool/despool cycle for every 0.1M of
data. In addition to fragmenting the data on the volume far more
than necessary, with larger data sets (ie, tens or hundreds of
gigabytes) it can easily produce multi-megabyte report emails!
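The arithmetic behind this scenario can be sketched quickly (a sketch
only; the numbers are the illustrative ones from the example above,
with a hypothetical 20 MB minimum reserved):

```python
import math

def despool_cycles(job_size, free_spool):
    """Spool/despool cycles needed when each cycle can only buffer
    free_spool worth of data (any consistent unit)."""
    return math.ceil(job_size / free_spool)

# Units below are tenths of a megabyte, matching the example above.
job_b = 102400                        # a 10 GB job for B
print(despool_cycles(job_b, 1))       # 0.1 MB free: 102400 cycles
print(despool_cycles(job_b, 200))     # 20 MB minimum: 512 cycles
```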
1024 Item 33: Command that releases all drives in an autochanger
1025 Origin: Blake Dunlap (blake@nxs.net)
1029 What: It would be nice if there was a release command that
1030 would release all drives in an autochanger instead of having to
1031 do each one in turn.
Why: It can take some time for a release to occur, and the
commands must be given for each drive in turn, which can quickly
add up if there are several drives in the library. (Having to
watch the console to give each command can waste a good bit of
time when you get into the 16-drive range, where the
tapes can take up to 3 minutes each to eject.)
1040 Notes: Due to the way some autochangers/libraries work, you
1041 cannot assume that new tapes inserted will go into slots that are
1042 not currently believed to be in use by bacula (the tape from that
1043 slot is in a drive). This would make any changes in
1044 configuration quicker/easier, as all drives need to be released
1045 before any modifications to slots.
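Today this looks like the first form below; the proposal amounts to
something like the second (the "alldrives" keyword is hypothetical, and
the storage name is illustrative):

```
# Current: one command per drive
*release storage=LTO-Library drive=0
*release storage=LTO-Library drive=1
*release storage=LTO-Library drive=2

# Proposed (hypothetical):
*release storage=LTO-Library alldrives
```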
1047 Item 34: Run bscan on a remote storage daemon from within bconsole.
1048 Date: 07 October 2009
1049 Origin: Graham Keeling <graham@equiinet.com>
1052 What: The ability to be able to run bscan on a remote storage daemon from
1053 within bconsole in order to populate your catalog.
1055 Why: Currently, it seems you have to:
1056 a) log in to a console on the remote machine
1057 b) figure out where the storage daemon config file is
1058 c) figure out the storage device from the config file
1059 d) figure out the catalog IP address
1060 e) figure out the catalog port
1061 f) open the port on the catalog firewall
g) configure the catalog database to accept connections from the
remote machine
1064 h) build a 'bscan' command from (b)-(e) above and run it
It would be much nicer to be able to type something like this into
bconsole:
1067 *bscan storage=<storage> device=<device> volume=<volume>
1069 *bscan storage=<storage> all
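For comparison, step (h) today looks something like this (the config
path, volume name, and device name are illustrative):

```
bscan -s -m -v -c /etc/bacula/bacula-sd.conf -V Full-0001 FileStorage
```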
1070 It seems to me that the scan could also do a better job than the
1071 external bscan program currently does. It would possibly be able to
deduce some extra details, such as the catalog StorageId for the
volume.
1075 Notes: (Kern). If you need to do a bscan, you have done something wrong,
so this functionality should not need to be integrated into
the Storage daemon. However, I am not opposed to someone implementing
1078 this feature providing that all the code is in a shared object (or dll)
1079 and does not add significantly to the size of the Storage daemon. In
1080 addition, the code should be written in a way such that the same source
1081 code is used in both the bscan program and the Storage daemon to avoid
1082 adding a lot of new code that must be maintained by the project.
1084 Item 35: Implement a Migration job type that will create a reverse
1085 incremental (or decremental) backup from two existing full backups.
1086 Date: 05 October 2009
1087 Origin: Griffith College Dublin. Some sponsorship available.
1088 Contact: Gavin McCullagh <gavin.mccullagh@gcd.ie>
1091 What: The ability to take two full backup jobs and derive a reverse
incremental backup from them. The older full backup data may then
be discarded.
1095 Why: Long-term backups based on keeping full backups can be expensive in
1096 media. In many cases (eg a NAS), as the client accumulates files
1097 over months and years, the same file will be duplicated unchanged,
across many media and datasets. E.g., less than 10% (and
1099 shrinking) of our monthly full mail server backup is new files,
1100 the other 90% is also in the previous full backup.
1101 Regularly converting the oldest full backup into a reverse
1102 incremental backup allows the admin to keep access to old backup
1103 jobs, but remove all of the duplicated files, freeing up media.
1105 Notes: This feature was previously discussed on the bacula-devel list
1106 here: http://www.mail-archive.com/bacula-devel@lists.sourceforge.net/msg04962.html
1110 Item 36: Extend the verify code to make it possible to verify
1111 older jobs, not only the last one that has finished
1113 Origin: Ralf Gross (Ralf-Lists <at> ralfgross.de)
1114 Status: not implemented or documented
1116 What: At the moment a VolumeToCatalog job compares only the
1117 last job with the data in the catalog. It's not possible
1118 to compare the data (md5sums) of an older volume with the
1119 data in the catalog.
Why: If a verify job fails, one has to immediately check the
source of the problem, fix it, and rerun the verify job.
This has to happen before the next backup of the
verified backup job starts.
More importantly: it's not possible to check jobs that are
kept for a long time (archives). If a jobid could be
specified for a verify job, older backups/tapes could be
checked on a regular basis.
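A hypothetical bconsole syntax for this might be (the jobid parameter
is the proposal, not an existing option):

```
# Hypothetical: verify a specific older job by JobId
*run job=VerifyServerXXX level=VolumeToCatalog jobid=1234
```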
1130 Notes: verify documentation:
1131 VolumeToCatalog: This level causes Bacula to read the file
1132 attribute data written to the Volume from the last Job [...]
1134 Verify Job = <Job-Resource-Name> If you run a verify job
1135 without this directive, the last job run will be compared
1136 with the catalog, which means that you must immediately
1137 follow a backup by a verify command. If you specify a Verify
1138 Job Bacula will find the last job with that name that ran [...]
1140 example bconsole verify dialog:
1143 JobName: VerifyServerXXX
1144 Level: VolumeToCatalog
1145 Client: ServerXXX-fd
1146 FileSet: ServerXXX-Vol1
1147 Pool: Full (From Job resource)
1148 Storage: Neo4100 (From Pool resource)
1149 Verify Job: ServerXXX-Vol1
1151 When: 2009-04-20 09:03:04
1153 OK to run? (yes/mod/no): m
1154 Parameters to modify:
1167 Item 37: Separate "Storage" and "Device" in the bacula-dir.conf
1169 Origin: "James Harper" <james.harper@bendigoit.com.au>
1170 Status: not implemented or documented
1172 What: Separate "Storage" and "Device" in the bacula-dir.conf
The resulting config would look something like:
Storage {
  Name = name_of_server
  Address = hostname/IP address
  Password = shh_its_a_secret
  Maximum Concurrent Jobs = 7
}

Device {
  Name = name_of_device
  Storage = name_of_server
  Device = name_of_device_on_sd
  Media Type = media_type
  Maximum Concurrent Jobs = 1
}
1191 Maximum Concurrent Jobs would be specified with a server and a device
1192 maximum, which would both be honoured by the director. Almost everything
1193 that mentions a 'Storage' would need to be changed to 'Device', although
perhaps a 'Storage' would just be a synonym for 'Device' for backwards
compatibility.
1197 Why: If you have multiple Storage definitions pointing to different
1198 Devices in the same Storage daemon, the "status storage" command
prompts for each different device, but they all give the same
output.
1204 Item 38: Least recently used device selection for tape drives in autochanger.
1205 Date: 12 October 2009
1206 Origin: Thomas Carter <tcarter@memc.com>
1209 What: A better tape drive selection algorithm for multi-drive
1210 autochangers. The AUTOCHANGER class contains an array list of tape
1211 devices. When a tape drive is needed, this list is always searched in
order. This causes lower-numbered drives (specifically drive 0) to do a
majority of the work, with higher-numbered drives possibly never being
1214 used. When a drive in an autochanger is reserved for use, its entry should
be moved to the end of the list; this would give a rough LRU drive
selection.
1218 Why: The current implementation places a majority of use and wear on drive
1219 0 of a multi-drive autochanger.
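The proposed move-to-end behaviour can be sketched in a few lines (a
sketch only; Bacula's AUTOCHANGER list is C code, and the drive names
are illustrative):

```python
from collections import deque

# Sketch of the proposed selection: a reserved drive moves to the end
# of the list, so the least recently used drive is chosen next time.
drives = deque(["drive0", "drive1", "drive2"])

def reserve_drive():
    drive = drives.popleft()  # least recently used drive
    drives.append(drive)      # goes to the back of the list
    return drive

# Reservations now rotate instead of always hitting drive0:
print([reserve_drive() for _ in range(4)])
# → ['drive0', 'drive1', 'drive2', 'drive0']
```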
1223 Item 39: Implement a Storage device like Amazon's S3.
1224 Date: 25 August 2008
1225 Origin: Soren Hansen <soren@ubuntu.com>
1226 Status: Not started.
What: Enable the storage daemon to store backup data on Amazon's
S3 service.
1230 Why: Amazon's S3 is a cheap way to store data off-site.
Notes: If we configure the Pool to put only one job per volume (they don't
support append operations), and the volume size isn't too big (100MB?),
it should be easy to adapt the disk-changer script to add get/put
procedures using curl. So, the data would be safely copied during the
backup.
The cloud should only be used with Copy jobs; users should always
have a copy of their data on their own site.
We should also think about having our own cache, always trying to keep
the cloud volume on the local disk. (I don't know if users want to
store 100GB in the cloud, so it shouldn't be a disk size problem.)
For example, if bacula wants to recycle a volume, it will start by
downloading the file only to truncate it a few seconds later; it would
be good if we could avoid that.
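A minimal sketch of the get/put hooks mentioned above, assuming an
HTTP-accessible bucket (the URL, variable names, and authentication
are hand-waved; a real S3 transfer needs signed requests):

```
# Hypothetical additions to the disk-changer script
put_volume() {
    curl --upload-file "$archive_dir/$volume" \
         "https://bucket.example.com/$volume"
}

get_volume() {
    curl --output "$archive_dir/$volume" \
         "https://bucket.example.com/$volume"
}
```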
1247 Item 40: Convert tray monitor on Windows to a stand alone program
What: Make the Win32 tray monitor a stand-alone program.
1254 Why: Vista does not allow SYSTEM services to interact with the
1255 desktop, so the current tray monitor does not work on Vista
1258 Notes: Requires communicating with the FD via the network (simulate
1259 a console connection).
1261 Item 41: Improve Bacula's tape and drive usage and cleaning management
1262 Date: 8 November 2005, November 11, 2005
1263 Origin: Adam Thornton <athornton at sinenomine dot net>,
1264 Arno Lehmann <al at its-lehmann dot de>
1268 1. Measure tape and drive usage (mostly implemented)
1269 2. Retiring a volume when too old or too many errors
1270 3. Handle cleaning and tape alerts.
1275 Item 42: Relabel disk volume after recycling
1276 Origin: Pasi Kärkkäinen <pasik@iki.fi>
1278 Status: Not implemented yet, no code written.
1280 What: The ability to relabel the disk volume (and thus rename the file on the
1281 disk) after it has been recycled. Useful when you have a single job
1282 per disk volume, and you use a custom Label format, for example:
1284 "${Client}-${Level}-${NumVols:p/4/0/r}-${Year}_${Month}_${Day}-${Hour}_${Minute}"
1286 Why: Disk volumes in Bacula get the label/filename when they are used for the
1287 first time. If you use recycling and custom label format like above,
1288 the disk volume name doesn't match the contents after it has been
1289 recycled. This feature makes it possible to keep the label/filename
1290 in sync with the content and thus makes it easy to check/monitor the
1291 backups from the shell and/or normal file management tools, because
1292 the filenames of the disk volumes match the content.
1294 Notes: The configuration option could be "Relabel after Recycling = Yes".
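Put together, the Pool resource might look like this (the "Relabel
after Recycling" directive is the proposal, not an existing option;
the pool name is illustrative):

```
Pool {
  Name = SingleJobDisk
  Pool Type = Backup
  Maximum Volume Jobs = 1
  Label Format = "${Client}-${Level}-${NumVols:p/4/0/r}-${Year}_${Month}_${Day}-${Hour}_${Minute}"
  # Proposed:
  Relabel after Recycling = Yes
}
```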
1298 ========= New items after last vote ====================
1301 Note to renumber items use:
1302 scripts/renumber_projects.pl projects >1
1305 ========= Add new items above this line =================
1308 ============= Empty Feature Request form ===========
1309 Item n: One line summary ...
1310 Date: Date submitted
1311 Origin: Name and email of originator.
1314 What: More detailed explanation ...
1316 Why: Why it is important ...
1318 Notes: Additional notes or features (omit if not used)
1319 ============== End Feature Request form ==============
1322 ========== Items put on hold by Kern ============================
1325 ========== Items completed in version 5.0.0 ====================
1326 *Item : 'restore' menu: enter a JobId, automatically select dependents
1327 *Item : Deletion of disk Volumes when pruned (partial -- truncate when pruned)
1328 *Item : Implement Base jobs
1329 *Item : Restore from volumes on multiple storage daemons
1330 *Item : Enable/disable compression depending on storage device (disk/tape)
1331 *Item : Cause daemons to use a specific IP address to source communications
1332 *Item : "Maximum Concurrent Jobs" for drives when used with changer device
1333 *Item : List InChanger flag when doing restore.
1334 *Item : Port bat to Win32
1335 *Item : An option to operate on all pools with update vol parameters