Bacula Projects Roadmap
Status updated 25 February 2010
Item 1: Ability to restart failed jobs
Item 2: Scheduling syntax that permits more flexibility and options
Item 3: Data encryption on storage daemon
Item 4: Add ability to Verify any specified Job.
Item 5: Improve Bacula's tape and drive usage and cleaning management
Item 6: Allow FD to initiate a backup
Item 7: Implement Storage daemon compression
Item 8: Reduction of communications bandwidth for a backup
Item 9: Ability to reconnect a disconnected comm line
Item 10: Start spooling even when waiting on tape
Item 11: Include all conf files in specified directory
Item 12: Multiple threads in file daemon for the same job
Item 13: Possibility to schedule Jobs on the last Friday of the month
Item 14: Include timestamp of job launch in "stat clients" output
Item 15: Message mailing based on backup types
Item 16: Ability to import/export Bacula database entities
Item 17: Implementation of running Job speed limit.
Item 18: Add an override in Schedule for Pools based on backup types
Item 19: Automatic promotion of backup levels based on backup size
Item 20: Allow FileSet inclusion/exclusion by creation/mod times
Item 21: Archival (removal) of User Files to Tape
Item 22: An option to operate on all pools with update vol parameters
Item 23: Automatic disabling of devices
Item 24: Ability to defer Batch Insert to a later time
Item 25: Add MaxVolumeSize/MaxVolumeBytes to Storage resource
Item 26: Enable persistent naming/numbering of SQL queries
Item 27: Bacula Dir, FD and SD to support proxies
Item 28: Add Minimum Spool Size directive
Item 29: Handle Windows Encrypted Files using Win raw encryption
Item 30: Implement a Storage device like Amazon's S3.
Item 31: Convert tray monitor on Windows to a stand-alone program
Item 32: Relabel disk volume after recycling
Item 33: Command that releases all drives in an autochanger
Item 34: Run bscan on a remote storage daemon from within bconsole.
Item 35: Implement a Migration job type that will create a reverse
Item 36: Job migration between different SDs
Item 37: Concurrent spooling and despooling within a single job.
Item 39: Extend the verify code to make it possible to verify
Item 40: Separate "Storage" and "Device" in the bacula-dir.conf
Item 1: Ability to restart failed jobs

What: Often jobs fail because of a communications line drop or max run time,
      cancel, or some other non-critical problem. Currently any data
      saved is lost. This implementation should modify the Storage daemon
      so that it saves all the files that it knows are completely backed
      up to the Volume.

      The jobs should then be marked as incomplete and a subsequent
      Incremental Accurate backup will then take into account all the
      files already saved.

Why: Avoids re-backing up data already saved.

Notes: Requires Accurate mode to restart correctly. A minimum volume of
       data or files must be stored on the Volume before enabling.
Item 2: Scheduling syntax that permits more flexibility and options
Date: 15 December 2006
Origin: Gregory Brauer (greg at wildbrain dot com) and
        Florian Schnabel <florian.schnabel at docufy dot de>

What: Currently, Bacula only understands how to deal with weeks of the
      month or weeks of the year in schedules. This makes it impossible
      to do a true weekly rotation of tapes. There will always be a
      discontinuity that will require disruptive manual intervention at
      least monthly or yearly because week boundaries never align with
      month or year boundaries.

      A solution would be to add a new syntax that defines (at least)
      a start timestamp and a repetition period.

      Also, an easy option to skip a certain job on a certain date.

Why: Rotated backups done at weekly intervals are useful, and Bacula
     cannot currently do them without extensive hacking.

     You could then easily skip tape backups on holidays. Especially
     if you have no autochanger and can only fit one backup on a tape,
     that would be really handy; other jobs could proceed normally
     and you won't get errors that way.
Notes: Here is an example syntax showing a 3-week rotation where full
       backups would be performed every week on Saturday, and an
       incremental would be performed every week on Tuesday. Each
       set of tapes could be removed from the loader for the following
       two cycles before coming back and being reused on the third
       week. Since the execution times are determined by intervals
       from a given point in time, there will never be any issues with
       having to adjust to any sort of arbitrary time boundary. In
       the example provided, I even define the starting schedule
       as crossing both a year and a month boundary, but the run times
       would be based on the "Repeat" value and would therefore happen
       at the expected intervals regardless.
       Schedule {
           Name = "Week 1 Rotation"
           #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
           Run {
               Options {
                   Type   = Full
                   Start  = 2006-12-30 01:00
                   Repeat = 3w
               }
           }
           #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
           Run {
               Options {
                   Type   = Incremental
                   Start  = 2007-01-02 01:00
                   Repeat = 3w
               }
           }
       }

       Schedule {
           Name = "Week 2 Rotation"
           #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
           Run {
               Options {
                   Type   = Full
                   Start  = 2007-01-06 01:00
                   Repeat = 3w
               }
           }
           #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
           Run {
               Options {
                   Type   = Incremental
                   Start  = 2007-01-09 01:00
                   Repeat = 3w
               }
           }
       }

       Schedule {
           Name = "Week 3 Rotation"
           #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
           Run {
               Options {
                   Type   = Full
                   Start  = 2007-01-13 01:00
                   Repeat = 3w
               }
           }
           #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
           Run {
               Options {
                   Type   = Incremental
                   Start  = 2007-01-16 01:00
                   Repeat = 3w
               }
           }
       }
Notes: Kern: I have merged the previously separate project of skipping
       jobs (via Schedule syntax) into this.
Item 3: Data encryption on storage daemon
Origin: Tobias Barth <tobias.barth at web-arts.com>
Date: 04 February 2009

What: The storage daemon should be able to do the data encryption that can
      currently be done by the file daemon.

Why: This would have 2 advantages:
     1) one could encrypt the data of unencrypted tapes by doing a
        migration job
     2) the storage daemon would be the only machine that would have
        to keep the encryption keys.

Notes: As an addendum to the feature request, here are some crypto
       implementation details I wrote up regarding SD-encryption back in Jan:
       http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg28860.html
Item 4: Add ability to Verify any specified Job.
Date: 17 January 2008
Origin: portrix.net Hamburg, Germany.
Contact: Christian Sabelmann
Status: 70% of the required code is part of the Verify function since v. 2.x

What: The ability to tell Bacula which Job to verify, instead of it
      automatically verifying just the last one.

Why: It is sad that such a powerful feature like Verify Jobs
     (VolumeToCatalog) is restricted to the last backup Job of a
     client. Users who do daily backups are currently forced to also
     do daily Verify Jobs in order to take advantage of this useful
     feature. This daily verify-after-backup routine is not always
     desired, and Verify Jobs sometimes have to be scheduled (not
     necessarily scheduled in Bacula). With this feature, admins could
     verify Jobs once a week or a few times a month, selecting the
     Jobs they want to verify. This feature is also not too difficult
     to implement, taking into account older bug reports about this
     feature and the selection of the Job to be verified.

Notes: For the verify Job, the user could select the Job to be verified
       from a list of the latest Jobs of a client. It would also be
       possible to verify a certain volume. All of these would naturally
       apply only to Jobs whose file information is still in the catalog.
Item 5: Improve Bacula's tape and drive usage and cleaning management
Date: 8 November 2005, November 11, 2005
Origin: Adam Thornton <athornton at sinenomine dot net>,
        Arno Lehmann <al at its-lehmann dot de>

What: Make Bacula manage tape life cycle information, tape reuse
      times and drive cleaning cycles.

Why: All three parts of this project are important when operating
     backups. We need to know which tapes need replacement, and we need
     to make sure the drives are cleaned when necessary. While many
     tape libraries and even autoloaders can handle all this
     automatically, support by Bacula can be helpful for smaller
     (older) libraries and single drives. Limiting the number of
     times a tape is used might prevent tape errors from using
     tapes until the drive can't read them any more. Also, checking
     drive status during operation can prevent some failures (as I
     [Arno] had to learn the hard way...)
Notes: First, Bacula could (and even does, to some limited extent)
       record tape and drive usage. For tapes, the number of mounts,
       the amount of data, and the time the tape has actually been
       running could be recorded. Data fields for Read and Write
       time and Number of mounts already exist in the catalog (I'm
       not sure if VolBytes is the sum of all bytes ever written to
       that volume by Bacula). This information can be important
       when determining which media to replace. The ability to mark
       Volumes as "used up" after a given number of write cycles
       should also be implemented so that a tape is never actually
       worn out. For the tape drives known to Bacula, similar
       information is interesting to determine the device status and
       expected life time: time spent reading and writing, number
       of tape Loads / Unloads / Errors. This information is not yet
       recorded as far as I [Arno] know. A new volume status would
       be necessary for the new state, like "Used up" or "Worn out".
       Volumes with this state could be used for restores, but not
       for writing. These volumes should be migrated first (assuming
       migration is implemented) and, once they are no longer needed,
       could be moved to a Trash pool.
       The next step would be to implement a drive cleaning setup.
       Bacula already has knowledge about cleaning tapes. Once it
       has some information about cleaning cycles (measured in drive
       run time, number of tapes used, or calendar days, for example)
       it can automatically execute tape cleaning (with an
       autochanger, obviously) or ask for operator assistance loading
       a cleaning tape.

       The final step would be to implement TAPEALERT checks not only
       when changing tapes and only sending the information to the
       administrator, but rather checking after each tape error,
       checking on a regular basis (for example after each tape
       file), and also before unloading and after loading a new tape.
       Then, depending on the drive's TAPEALERT state and the known
       drive cleaning state, Bacula could automatically schedule later
       cleaning, clean immediately, or inform the operator.

       Implementing this would perhaps require another catalog change
       and perhaps major changes in SD code and the DIR-SD protocol,
       so I'd only consider this worth implementing if it would
       actually be used or even needed by many people.

       Implementation of these projects could happen in three distinct
       sub-projects: measuring tape and drive usage, retiring
       volumes, and handling drive cleaning and TAPEALERTs.
Item 6: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005

What: Provide some means, possibly via a restricted console, that
      allows a FD to initiate a backup, and that uses the connection
      established by the FD to the Director for the backup so that
      a Director that is firewalled can do the backup.

Why: Makes backup of laptops much easier.

Notes: - The FD already has code for the monitor interface
       - It could be nice to have a .job command that lists authorized
         jobs
       - Commands need to be restricted on the Director side
         (for example by re-using the runscript flag)
       - The Client resource can be used to authorize the connection
       - Initially, the client will not be able to modify job parameters
       - We need a way to run a status command to follow job progression

       This project consists of the following points:
       1. Modify the FD to have a "mini-console" interface that
          permits it to connect to the Director and start a
          backup job of itself.
       2. The list of jobs that can be started by the FD are
          defined in the Director (possibly via a restricted
          console).
       3. Modify the existing tray monitor code in the Win32 FD
          so that it is a separate program from the FD.
       4. The tray monitor program should be extended to permit
          initiating a backup.
       5. No new Director directives should be added without
          prior consultation with the Bacula developers.
       6. The comm line used by the FD to connect to the Director
          should be re-used by the Director to do the backup.
          This feature is partially implemented in the Director.
       7. The FD may have a new directive that allows it to start
          a backup when the FD starts.
       8. The console interface to the FD should be extended to
          permit a properly authorized console to initiate a
          backup via the FD.
Item 7: Implement Storage daemon compression
Date: 18 December 2006
Origin: Vadim A. Umanski, e-mail umanski@ext.ru

What: The ability to compress backup data on the SD receiving the data
      instead of on the client sending it.

Why: The need is practical. I've got some machines that can send
     data to the network 4 or 5 times faster than they can compress
     it (I've measured that). They're using fast enough SCSI/FC
     disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
     And the backup server has quite fast CPUs (e.g. dual P4
     Xeons) and quite a low load. When you have 20, 50 or 100 GB
     of raw data, running a job 4 to 5 times faster really
     matters. On the other hand, the data can be compressed 50%
     or better, so using twice the space for disk backup is not
     good at all. And the network is all mine (I have a dedicated
     management/provisioning network) and I can get as high
     bandwidth as I need - 100Mbps, 1000Mbps... That's why the
     server-side compression feature is needed!
Item 8: Reduction of communications bandwidth for a backup
Date: 14 October 2008
Origin: Robin O'Leary (Equiinet)

What: Using rdiff techniques, Bacula could significantly reduce
      the network data transfer volume needed to do a backup.

Why: Faster backups across the Internet.

Notes: This requires retaining certain data on the client during a Full
       backup that will speed up subsequent backups.
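To make the idea concrete, here is a much-simplified Python sketch of signature-based delta transfer (all names invented, and not Bacula code; unlike rdiff it matches only at fixed block boundaries instead of using a rolling checksum):

```python
import hashlib

BLOCK = 4  # toy block size; real tools use KB-sized blocks

def signatures(old):
    # Block digests the client would retain after a Full backup.
    return {hashlib.md5(old[i:i + BLOCK]).digest(): i
            for i in range(0, len(old), BLOCK)}

def delta(new, sigs):
    # Encode `new` as references into the old data plus literal bytes;
    # only the literals and the small references cross the network.
    ops, i = [], 0
    while i < len(new):
        blk = new[i:i + BLOCK]
        key = hashlib.md5(blk).digest()
        if len(blk) == BLOCK and key in sigs:
            ops.append(("ref", sigs[key]))  # cheap: an offset
            i += BLOCK
        else:
            ops.append(("lit", blk[:1]))    # expensive: a literal byte
            i += 1
    return ops

def apply_delta(old, ops):
    # Server-side reconstruction from the old data plus the delta.
    return b"".join(old[arg:arg + BLOCK] if op == "ref" else arg
                    for op, arg in ops)
```

With `old = b"abcdefgh"` and `new = b"abcdXefgh"`, only the single inserted byte is sent as a literal; the two unchanged blocks travel as offsets.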
Item 9: Ability to reconnect a disconnected comm line

What: Often jobs fail because of a communications line drop. In that
      case, Bacula should be able to reconnect to the other daemon and
      resume the job.

Why: Avoids re-backing up data already saved.

Notes: *Very* complicated from a design point of view because of
       authentication.
Item 10: Start spooling even when waiting on tape
Origin: Tobias Barth <tobias.barth@web-arts.com>

What: If a job can be spooled to disk before writing it to tape, it should
      be spooled immediately. Currently, Bacula waits until the correct
      tape is inserted into the drive.

Why: It could save hours. When Bacula waits on the operator who must insert
     the correct tape (e.g. a new tape or a tape from another media
     pool), Bacula could already prepare the spooled data in the spooling
     directory and immediately start despooling when the tape is
     inserted by the operator.

     2nd step: use 2 or more spooling directories. When one directory is
     currently despooling, the next (on different disk drives) could
     already be spooling the next data.

Notes: I am using Bacula 2.2.8, which has none of these features.
Item 11: Include all conf files in specified directory
Date: 18 October 2008
Origin: Database, Lda. Maputo, Mozambique
Contact: Cameron Smith / cameron.ord@database.co.mz

What: A directive something like "IncludeConf = /etc/bacula/subconfs". Every
      time the Bacula Director restarts or reloads, it will walk the given
      directory (non-recursively) and include the contents of any files
      therein, as though they were appended to bacula-dir.conf.

Why: Permits simplified and safer configuration for larger installations with
     many client PCs. Currently, through judicious use of JobDefs and
     similar directives, it is possible to reduce the client-specific part of
     a configuration to a minimum. The client-specific directives can be
     prepared according to a standard template and dropped into a known
     directory. However it is still necessary to add a line to the "master"
     (bacula-dir.conf) referencing each new file. This exposes the master to
     unnecessary risk of accidental mistakes and makes automation of adding
     new client confs more difficult (it is easier to automate dropping a
     file into a dir than rewriting an existing file). Kern has previously
     made a convincing argument for NOT including Bacula's core configuration
     in an RDBMS, but I believe that the present request is a reasonable
     extension to the current "flat-file-based" configuration philosophy.

Notes: There is NO need for any special syntax in these files. They should
       contain standard directives which are simply "inlined" into the parent
       file as already happens when you explicitly reference an external file.

Notes: (kes) this can already be done with scripting
       From: John Jorgensen <jorgnsn@lcd.uregina.ca>
       The bacula-dir.conf at our site contains these lines:

          # Include subfiles associated with configuration of clients.
          # They define the bulk of the Clients, Jobs, and FileSets.
          @|"sh -c 'for f in /etc/bacula/clientdefs/*.conf ; do echo @${f} ; done'"

       and when we get a new client, we just put its configuration into
       a new file called something like:

          /etc/bacula/clientdefs/clientname.conf
Item 12: Multiple threads in file daemon for the same job
Date: 27 November 2005
Origin: Ove Risberg (Ove.Risberg at octocode dot com)

What: I want the file daemon to start multiple threads for a backup
      job so the fastest possible backup can be made.

      The file daemon could parse the FileSet information and start
      one thread for each File entry located on a separate
      filesystem.

      A configuration option in the job section should be used to
      enable or disable this feature. The configuration option could
      specify the maximum number of threads in the file daemon.

      If the threads could spool the data to separate spool files
      the restore process will not be much slower.

Why: Multiple concurrent backups of a large fileserver with many
     disks and controllers will be much faster.
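As a toy illustration of the idea (plain Python, not Bacula code; `backup_entry` is a hypothetical stand-in for backing up one FileSet File entry):

```python
from concurrent.futures import ThreadPoolExecutor

def backup_entry(path):
    # Hypothetical placeholder: back up one File entry and report status.
    return (path, "OK")

def parallel_backup(entries, max_threads=4):
    # One worker per File entry, bounded by the proposed
    # "maximum number of threads" configuration option.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(backup_entry, entries))
```

`pool.map` preserves the order of the File entries, so results stay deterministic even though the work runs concurrently.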
Notes: (KES) This is not necessary and could be accomplished
       by having two jobs. In addition, the current VSS code
       is single-threaded.
Item 13: Possibility to schedule Jobs on the last Friday of the month
Origin: Carsten Menke <bootsy52 at gmx dot net>

What: Currently if you want to run your monthly backups on the last
      Friday of each month this is only possible with workarounds (e.g.
      scripting), as some months have four Fridays and some have five.
      The same is true if you plan to run your yearly backups on the
      last Friday of the year. It would be nice to have the ability to
      use the built-in scheduler for this.

Why: In many companies the last working day of the week is Friday (or
     Saturday), so to get the most data of the month onto the monthly
     tape, the employees are advised to insert the tape for the
     monthly backups on the last Friday of the month.

Notes: To give this complete functionality it would be nice if the
       "first" and "last" keywords could be implemented in the
       scheduler, so it is also possible to run monthly backups on the
       first Friday of the month, and much more. So if the syntax
       expanded to {first|last} {Month|Week|Day|Mo-Fri} of the
       {Year|Month|Week} you would be able to run really flexible jobs.

       To have a certain Job run on the last Friday of the month, for
       example, one could use:
          Run = pool=Monthly last Fri of the Month at 23:50

          ## Yearly Backup
          Run = pool=Yearly last Fri of the Year at 23:50

          ## Certain Jobs the last Week of a Month
          Run = pool=LastWeek last Week of the Month at 23:50

          ## Monthly Backup on the last day of the month
          Run = pool=Monthly last Day of the Month at 23:50
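For reference, the day a "last Fri of the Month" rule would fire on is easy to compute; a small Python sketch of the calendar arithmetic (illustrative only, not scheduler code):

```python
import calendar

def last_weekday(year, month, weekday):
    # Day-of-month of the last given weekday (0=Monday .. 6=Sunday).
    last_day = calendar.monthrange(year, month)[1]  # days in the month
    offset = (calendar.weekday(year, month, last_day) - weekday) % 7
    return last_day - offset

# Last Friday (weekday 4) of January 2006 falls on the 27th.
print(last_weekday(2006, 1, 4))
```

Walking back from the month's final day to the previous Friday handles both four-Friday and five-Friday months uniformly, which is exactly what the scripting workarounds have to emulate today.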
Item 14: Include timestamp of job launch in "stat clients" output
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006

What: The "stat clients" command doesn't include any detail on when
      the active backup jobs were launched.

Why: Including the timestamp would make it much easier to decide whether
     a job is running properly.

Notes: It may be helpful to have the output from "stat clients" formatted
       more like that from "stat dir" (and other commands), in a column
       format. The per-client information that's currently shown (level,
       client name, JobId, Volume, pool, device, Files, etc.) is good, but
       somewhat hard to parse (both programmatically and visually),
       particularly when there are many active clients.
Item 15: Message mailing based on backup types
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 6, 2006

What: In the "Messages" resource definitions, allow messages
      to be mailed based on the type (backup, restore, etc.) and level
      (full, differential, etc.) of the job that created the originating
      message.

Why: It would, for example, allow someone's boss to be emailed
     automatically only when a Full Backup job runs, so he can
     retrieve the tapes for offsite storage, even if the IT dept.
     doesn't (or can't) explicitly notify him. At the same time, his
     mailbox wouldn't be filled by notifications of Verifies, Restores,
     or Incremental/Differential Backups (which would likely be kept
     on site).

Notes: One way this could be done is through additional message types,
       for example:

          # email the boss only on full system backups
          Mail = boss@mycompany.com = full, !incremental, !differential,
                 !restore, !verify, !admin
          # email us only when something breaks
          MailOnError = itdept@mycompany.com = all

Notes: Kern: This should be rather trivial to implement.
Item 16: Ability to import/export Bacula database entities

What: Create a Bacula ASCII SQL database independent format that permits
      importing and exporting database catalog Job entities.

Why: For archival, database clustering, and transfer to other databases.

Notes: Job selection should be by Job, time, Volume, Client, Pool and
       possibly other attributes.
Item 17: Implementation of running Job speed limit.
Origin: Alex F, alexxzell at yahoo dot com
Date: 29 January 2009

What: I noticed the need for an integrated bandwidth limiter for
      running jobs. It would be very useful just to specify another
      field in bacula-dir.conf, like speed = how much speed you wish
      for that specific job to run at.

Why: For a couple of reasons. First, it's very hard to implement a
     traffic shaping utility and also make it reliable. Second, it is very
     uncomfortable to have to deploy such apps to, let's say, 50 clients
     (including desktops and servers). This would also be unreliable because
     you have to make sure that the apps are working properly when needed;
     users could also disable them (accidentally or not). It would be very
     useful to give Bacula this ability. All information would be
     centralized; you would not have to go to 50 different clients in 10
     different locations for configuration. Eliminating 3rd party additions
     also helps in establishing efficiency. It would avoid bandwidth
     congestion, especially where there is little available.
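A sketch of what such a directive might look like in bacula-dir.conf (the directive name "Maximum Bandwidth", its value syntax, and all resource names here are hypothetical; no such option exists at the time of this roadmap):

```
Job {
  Name = "NightlyClient"       # illustrative names only
  Client = client1-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = File
  Pool = Default
  Messages = Standard
  Maximum Bandwidth = 2mb/s    # proposed per-job speed limit
}
```

Putting the limit in the Job resource keeps all rate policy in the Director's configuration, which is the centralization the request asks for.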
Item 18: Add an override in Schedule for Pools based on backup types
Origin: Chad Slater <chad.slater@clickfox.com>

What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
      would help those of us who use different storage devices for
      different backup levels cope with the "auto-upgrade" of a backup.

Why: Assume I add several new devices to be backed up, i.e. several
     hosts with 1TB RAID. To avoid tape switching hassles, incrementals
     are stored in a disk set on a 2TB RAID. If you add these devices in
     the middle of the month, the incrementals are upgraded to "full"
     backups, but they try to use the same storage device as requested
     in the incremental job, filling up the RAID holding the
     differentials. If we could override the Storage parameter for full
     and/or differential backups, then the Full job would use the proper
     Storage device, which has more capacity (e.g. an 8TB tape library).
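Sketched as a Schedule resource, by analogy with the existing FullPool override ("FullStorage" is the proposed, hypothetical directive; the resource names are illustrative):

```
Schedule {
  Name = "MixedLevels"
  FullStorage = BigTapeLibrary    # proposed: Storage used when a job
                                  # runs (or is auto-upgraded) at Full
  Run = Level=Full 1st sun at 23:05
  Run = Level=Incremental mon-sat at 23:05
}
```

An incremental auto-upgraded to Full mid-month would then land on the tape library instead of filling the incremental disk pool.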
Item 19: Automatic promotion of backup levels based on backup size
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>

What: Other backup programs have a feature whereby they estimate the space
      that a differential, incremental, and full backup would take. If
      the difference in space required between the scheduled level and the
      next level up is beneath some user-defined critical threshold, the
      backup level is bumped to the next type. Doing this minimizes the
      number of volumes necessary during a restore, with a fairly minimal
      cost in backup media space.

Why: I know at least one (quite sophisticated and smart) user for whom the
     absence of this feature is a deal-breaker in terms of using Bacula;
     if we had it, it would eliminate the one cool thing other backup
     programs can do and we can't (at least, the one cool thing I know
     of).
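The promotion rule itself is tiny; a minimal Python sketch with made-up names (real size estimates would come from the catalog, and the 10% default threshold is an arbitrary assumption):

```python
def promote_level(scheduled_bytes, next_level_bytes, threshold=0.10):
    # Bump the backup to the next level up (e.g. Incremental -> Full)
    # when doing so costs no more than `threshold` extra media space.
    # Assumes next_level_bytes > 0.
    extra = (next_level_bytes - scheduled_bytes) / next_level_bytes
    return extra <= threshold

# An incremental estimated at 95 GB vs. a 100 GB full: promote.
print(promote_level(95e9, 100e9))
```

The payoff is at restore time: one promoted Full replaces a chain of volumes that would otherwise all be needed.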
Item 20: Allow FileSet inclusion/exclusion by creation/mod times
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 11, 2006

What: In the vein of the Wild and Regex directives in a FileSet's
      Options, it would be helpful to allow a user to include or exclude
      files and directories by creation or modification times.

      You could factor in the Exclude=yes|no option in much the same way it
      affects the Wild and Regex directives. For example, you could exclude
      all files modified before a certain date:

         Modified Before = ####

      Or you could exclude all files created/modified since a certain date:

         Created/Modified Since = ####

      The format of the time/date could be done several ways, say the number
      of seconds since the epoch:

         1137008553 = Jan 11 2006, 1:42:33PM  # result of `date +%s`

      Or a human-readable date in a cryptic form:

         20060111134233 = Jan 11 2006, 1:42:33PM  # YYYYMMDDhhmmss

Why: I imagine a feature like this could have many uses. It would
     allow a user to do a full backup while excluding the base operating
     system files, so if I installed a Linux snapshot from a CD yesterday,
     I'll *exclude* all files modified *before* today. If I need to
     recover the system, I use the CD I already have, plus the tape backup.
     Or if, say, a Windows client is hit by a particularly corrosive
     virus, I need to *exclude* any files created/modified *since* the
     time of infection.

Notes: Of course, this feature would work in concert with other
       in/exclude rules, and wouldn't override them (or each other).

Notes: The directives I'd imagine would be along the lines of
       "[Created] [Modified] [Before|Since] = <date>".
       So one could compare against 'ctime' and/or 'mtime', but ONLY
       'before' or 'since'.
Item 21: Archival (removal) of User Files to Tape
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]

What: The ability to archive data to storage based on certain parameters
      such as age, size, or location. Once the data has been written to
      storage and logged it is then pruned from the originating
      filesystem. Note! We are talking about user's files and not
      Bacula's own files.

Why: This would allow fully automatic storage management, which becomes
     useful for large datastores. It would also allow for auto-staging
     from one media type to another.

     Example 1) Medical imaging needs to store large amounts of data.
     They decide to keep data on their servers for 6 months and then put
     it away for long-term storage. The server then finds all files
     older than 6 months and writes them to tape. The files are then
     removed from disk.

     Example 2) All data that hasn't been accessed in 2 months could be
     moved from high-cost fibre-channel disk storage to a low-cost
     large-capacity SATA disk storage pool with slower access times.
     Then after another 6 months (or possibly as one storage pool gets
     full) the data is migrated to tape.
Item 22: An option to operate on all pools with update vol parameters
Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
Status: Patch made by Nigel Stepp

What: When I do update -> Volume parameters -> All Volumes
      from Pool, I have to select pools one by one. I'd like the
      console to have an option like "0: All Pools" in the list of
      pools.

Why: I have many pools and am therefore unhappy with manually
     updating each of them using update -> Volume parameters -> All
     Volumes from Pool -> pool #.
Item 23: Automatic disabling of devices
Origin: Peter Eriksson <peter at ifm.liu dot se>

What: After a configurable number of fatal errors with a tape drive,
      Bacula should automatically disable further use of that
      tape drive. There should also be "disable"/"enable" commands in
      the console.

Why: On a multi-drive jukebox there is a possibility of tape drives
     going bad during large backups (needing a cleaning tape run,
     tapes getting stuck). It would be advantageous if Bacula would
     automatically disable further use of a problematic tape drive
     after a configurable number of errors has occurred.

     An example: I have a multi-drive jukebox (6 drives, 380+ slots)
     where tapes occasionally get stuck inside the drive. Bacula will
     notice that the "mtx-changer" command will fail and then fail
     any backup jobs trying to use that drive. However, it will still
     keep on trying to run new jobs using that drive and fail -
     forever, and thus failing lots and lots of jobs... Since we have
     many drives, Bacula could have just automatically disabled
     further use of that drive and used one of the other ones
     instead.
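The proposed behavior could be as simple as a per-drive error counter; a hypothetical sketch in Python (Bacula's SD is written in C, so this is purely illustrative, with invented names):

```python
class DriveGuard:
    """Disable a drive after a configurable number of fatal errors."""

    def __init__(self, max_errors=3):
        self.max_errors = max_errors
        self.errors = 0
        self.enabled = True

    def record_error(self):
        self.errors += 1
        if self.errors >= self.max_errors:
            self.enabled = False  # stop scheduling new jobs on this drive

    def record_success(self):
        self.errors = 0  # a clean operation resets the counter
```

The console "enable" command from the request would simply reset the counter and set the drive back to enabled.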
Item 24: Ability to defer Batch Insert to a later time

What: Instead of doing a Job Batch Insert at the end of the Job,
      which might create resource contention with lots of Jobs,
      defer the insert to a later time.

Why: Permits focusing on getting the data onto the Volume and
     putting the metadata into the Catalog outside the backup
     window.

Notes: Will use the proposed Bacula ASCII database import/export
       format (i.e. dependent on the import/export entities project).
Item 25: Add MaxVolumeSize/MaxVolumeBytes to Storage resource
Origin: Bastian Friedrich <bastian.friedrich@collax.com>

What: The SD has a "Maximum Volume Size" statement, which is deprecated and
      superseded by the Pool resource statement "Maximum Volume Bytes".
      It would be good if either statement could be used in Storage
      resources.

Why: Pools do not have to be restricted to a single storage type/device;
     thus, it may be impossible to define Maximum Volume Bytes in the
     Pool resource. The old MaxVolSize statement is deprecated, as it
     is SD side only. I am using the same pool for different devices.
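What the request amounts to, sketched as a hypothetical Storage resource in bacula-dir.conf (today "Maximum Volume Bytes" is only valid in the Pool resource; all names here are illustrative):

```
Storage {
  Name = FileStorage            # illustrative names only
  Address = sd.example.com
  Device = FileDev
  Media Type = File
  Maximum Volume Bytes = 50G    # proposed: per-storage volume size cap
}
```

A pool shared between a tape library and disk storage could then get a sensible per-device cap instead of one pool-wide value.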
Notes: State of the idea is currently unknown. Storage resources in the dir
       config currently translate to very slim catalog entries; these
       entries would require extensions to implement what is described
       here. Quite possibly, numerous other statements that are currently
       available in Pool resources could be used in Storage resources too.
Item 26: Enable persistent naming/numbering of SQL queries

What: Change the parsing of the query.sql file and the query command so that
      queries are named/numbered by a fixed value, not their order in the
      file.

Why: One of the real strengths of Bacula is the ability to query the
     database, and the fact that complex queries can be saved and
     referenced from a file is very powerful. However, the choice
     of query (both for interactive use, and by scripting input
     to the bconsole command) is completely dependent on the order
     within the query.sql file. The descriptive labels are helpful for
     interactive use, but users become used to calling a particular
     query "by number", or may use scripts to execute queries. This
     presents a problem if the number or order of queries in the file
     changes.

     If the query.sql file used the numeric tags as a real value (rather
     than a comment), then users could have a higher confidence that they
     are executing the intended query, and that their local changes wouldn't
     conflict with future Bacula upgrades.
839 For scripting, it's very important that the intended query is
840 what's actually executed. The current method of parsing the
841 query.sql file discourages scripting because the addition or
842 deletion of queries within the file will require corresponding
843 changes to scripts. It may not be obvious to users that deleting
844 query "17" in the query.sql file will require changing all
845 references to higher numbered queries. Similarly, when new
846 bacula distributions change the number of "official" queries,
847 user-developed queries cannot simply be appended to the file
848 without also changing any references to those queries in scripts
849 or procedural documentation, etc.
851 In addition, using fixed numbers for queries would encourage more
852 user-initiated development of queries, by supporting conventions
855 queries numbered 1-50 are supported/developed/distributed by
856 with official bacula releases
858 queries numbered 100-200 are community contributed, and are
859 related to media management
861 queries numbered 201-300 are community contributed, and are
862 related to checksums, finding duplicated files across
863 different backups, etc.
865 queries numbered 301-400 are community contributed, and are
866 related to backup statistics (average file size, size per
867 client per backup level, time for all clients by backup level,
868 storage capacity by media type, etc.)
870 queries numbered 500-999 are locally created
873 Alternatively, queries could be called by keyword (tag), rather
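As a sketch of the fixed-tag idea, the parser could key queries on an
explicit numeric tag line instead of on file position. The
`:<number>: <label>` line format below is purely hypothetical, not
Bacula's actual query.sql syntax:

```python
import re

def parse_queries(text):
    """Parse a query.sql-style file where each query is introduced by a
    line of the form ':<number>: <label>' (a hypothetical tagged format).
    Returns {number: (label, sql)}, so queries are addressed by their
    fixed tag rather than by their order in the file."""
    queries = {}
    tag, label, body = None, None, []
    for line in text.splitlines():
        m = re.match(r"^:(\d+):\s*(.*)$", line)
        if m:
            if tag is not None:
                queries[tag] = (label, "\n".join(body).strip())
            tag, label, body = int(m.group(1)), m.group(2), []
        elif tag is not None:
            body.append(line)
    if tag is not None:
        queries[tag] = (label, "\n".join(body).strip())
    return queries
```

With such a scheme, deleting query 17 or appending a local query 501
leaves every other tag stable, so scripts keep working across upgrades.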
Item 27: Bacula Dir, FD and SD to support proxies

Origin: Karl Grindley @ MIT Lincoln Laboratory <kgrindley at ll dot mit dot edu>

What: Support alternate methods of nailing up a TCP session, such
as SOCKS5, SOCKS4 and HTTP (CONNECT) proxies. Such a feature
would allow tunneling of bacula traffic in and out of proxied
networks.

Why: Currently, bacula is architected to only function on a flat network,
with no barriers or limitations. Due to the large configuration
space of any network, and the countless configurations in which file
daemons and storage daemons may sit in relation to one another,
bacula is often not usable on networks where filtered or air-gapped
segments exist. While solutions such as ACL modifications to
firewalls or port redirection via SNAT or DNAT will often solve the
issue, these solutions are frequently not adequate or not allowed by
hard policy.

In an air-gapped network where only highly locked-down proxy
services are provided (SOCKS4/5 and/or HTTP and/or SSH outbound),
ACLs or iptables rules will not work.

Notes: Director resource tunneling: the configuration option to utilize a
proxy to connect to a client should be specified in the client
resource. Client resource tunneling: should this be configured in
the client resource in the director config file, or in the
bacula-fd configuration file on the fd host itself? If the latter,
this would allow only certain clients to use a proxy, while others
do not, when establishing the TCP connection to the storage server.

Also worth noting, there are other lightweight 3rd-party apps that
could be utilized to bootstrap this. Instead of socksifying bacula
itself, use an external program to broker proxy authentication and
the connection to the remote host. OpenSSH does this with the
"ProxyCommand" option in the client configuration, talking to the
command over stdin and stdout. Connect.c is a very popular one
(http://bent.latency.net/bent/darcs/goto-san-connect-1.85/src/connect.html).
One could also possibly use stunnel, netcat, etc.
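To illustrate the "nail up a TCP session through a proxy" idea, here is
a minimal sketch of the HTTP CONNECT handshake that a daemon (or an
external ProxyCommand-style broker) would perform before the normal
Bacula protocol starts. All host names and the 30s timeout are
illustrative assumptions:

```python
import socket

def build_connect_request(dest_host, dest_port):
    # Build the HTTP CONNECT request that asks a proxy for a raw TCP
    # tunnel to dest_host:dest_port (see RFC 7231, section 4.3.6).
    return (f"CONNECT {dest_host}:{dest_port} HTTP/1.1\r\n"
            f"Host: {dest_host}:{dest_port}\r\n\r\n").encode("ascii")

def connect_via_http_proxy(proxy_host, proxy_port, dest_host, dest_port):
    # Returns a connected socket once the proxy answers 200; the caller
    # can then speak the ordinary daemon protocol over it.
    s = socket.create_connection((proxy_host, proxy_port), timeout=30)
    s.sendall(build_connect_request(dest_host, dest_port))
    resp = b""
    while b"\r\n\r\n" not in resp:  # read the proxy's reply headers
        chunk = s.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed connection")
        resp += chunk
    if b" 200" not in resp.split(b"\r\n", 1)[0]:
        s.close()
        raise ConnectionError("proxy refused tunnel")
    return s
```

SOCKS4/5 would follow the same pattern with a binary handshake instead
of the text request.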
Item 28: Add Minimum Spool Size directive

Origin: Frank Sweetser <fs@wpi.edu>

What: Add a new SD directive, "minimum spool size" (or similar). This
directive would specify a minimum level of free space available for
spooling. If the unused spool space is less than this level, any
new spooling requests would be blocked as if the "maximum spool
size" threshold had been reached. Jobs that are already spooling
would be unaffected by this directive.

Why: I've been bitten by this scenario a couple of times:

Assume a maximum spool size of 100M. Two concurrent jobs, A and B,
are both running. Due to timing quirks and previously running jobs,
job A has used 99.9M of space in the spool directory. While A is
busy despooling to the volume, B is happily using the remaining
0.1M of spool space. This ends up in a spool/despool sequence every
0.1M of data. In addition to fragmenting the data on the volume far
more than necessary, with larger data sets (i.e. tens or hundreds
of gigabytes) it can easily produce multi-megabyte report emails!
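The proposed admission check is simple enough to sketch; the function
and parameter names here are hypothetical, not actual directives:

```python
def may_start_spooling(spool_capacity_mb, spool_used_mb, minimum_free_mb):
    """Sketch of the proposed check: a *new* spooling request is blocked
    (as if "maximum spool size" had been reached) whenever the free
    spool space is below the "minimum spool size" threshold. Jobs that
    are already spooling are not affected by this check."""
    return spool_capacity_mb - spool_used_mb >= minimum_free_mb
```

In the scenario above, with a 100M spool area, job A holding 99.9M, and
a minimum of, say, 10M, job B's new request would simply be blocked
instead of triggering a despool cycle every 0.1M.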
Item 29: Handle Windows Encrypted Files using Win raw encryption

Origin: Michael Mohr, SAG Mohr.External@infineon.com
Date: 22 February 2008
Origin: Alex Ehrlich (Alex.Ehrlich-at-mail.ee)

What: Make it possible to back up and restore Encrypted Files from and to
Windows systems without the need to decrypt them, by using the raw
encryption API (see:
http://msdn2.microsoft.com/en-us/library/aa363783.aspx)
that Microsoft provides for exactly this purpose.
Whether a file is encrypted can be determined by evaluating the
FILE_ATTRIBUTE_ENCRYPTED flag returned by GetFileAttributes().

For each file backed up or restored by the FD on Windows, check if
the file is encrypted; if so, use OpenEncryptedFileRaw,
ReadEncryptedFileRaw, WriteEncryptedFileRaw and
CloseEncryptedFileRaw instead of BackupRead and BackupWrite.

Why: Without this interface, the fd daemon running under the system
account can't read encrypted files, because it lacks the key
needed for decryption. As a result, encrypted files are currently
not backed up by bacula, and no error is reported for these
missing files.

Notes: Using the xxxEncryptedFileRaw API would make it possible to back
up and restore EFS-encrypted files without decrypting their data.
Note that such files cannot be restored "portably" (at least, not
easily), but they would be restorable to a different (or
reinstalled) Win32 machine; the restore would of course require
setup of an EFS recovery agent in advance, and this shall be
clearly reflected in the documentation, but this is the normal
Windows SysAdmin's business.
When a "portable" backup is requested, the EFS-encrypted files
shall be clearly reported as errors.
See MSDN on the "Backup and Restore of Encrypted Files" topic:
http://msdn.microsoft.com/en-us/library/aa363783.aspx
Maybe the EFS support requires a new flag in the database for
each file.
Unfortunately, the implementation is not as straightforward as a
1-to-1 replacement of BackupRead with ReadEncryptedFileRaw; it
requires some FD code rewrite to work with the
encrypted-file-related callback functions.
Item 30: Implement a Storage device like Amazon's S3.

Origin: Soren Hansen <soren@ubuntu.com>

What: Enable the storage daemon to store backup data on Amazon's
S3 service.

Why: Amazon's S3 is a cheap way to store data off-site.

Notes: If we configure the Pool to put only one job per volume (S3 doesn't
support an append operation), and the volume size isn't too big
(100MB?), it should be easy to adapt the disk-changer script to add
get/put procedures with curl. That way, the data would be safely
copied during the backup.

The cloud should only be used with Copy jobs; users should always
have a copy of their data on their own site.

We should also think about having our own cache, always trying to
keep cloud volumes on the local disk. (I don't know if users want
to store 100GB in the cloud, so it shouldn't be a disk size
problem.) For example, if bacula wants to recycle a volume, it
would start by downloading the file only to truncate it a few
seconds later; we should avoid that if we can.
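The get/put-with-curl idea from the note might look like this inside
the disk-changer script. The bucket URL and volume naming are
assumptions, and real S3 access would also need authentication headers,
which are omitted here:

```python
def curl_put_cmd(volume_path, bucket_url, volume_name):
    # argv for uploading a finished volume file to the cloud
    # (hypothetical bucket_url; auth omitted for brevity).
    return ["curl", "-sf", "-T", volume_path, f"{bucket_url}/{volume_name}"]

def curl_get_cmd(bucket_url, volume_name, dest_path):
    # argv for fetching a volume into the local cache before Bacula
    # reads (or relabels) it.
    return ["curl", "-sf", "-o", dest_path, f"{bucket_url}/{volume_name}"]

# The disk-changer script would run these via subprocess.run(cmd, check=True)
# after a volume is closed (put) or before it is mounted (get).
```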
Item 31: Convert tray monitor on Windows to a stand-alone program

What: Separate the Win32 tray monitor into a stand-alone program.

Why: Vista does not allow SYSTEM services to interact with the
desktop, so the current tray monitor does not work on Vista.

Notes: Requires communicating with the FD via the network (simulating
a console connection).
Item 32: Relabel disk volume after recycling

Origin: Pasi Kärkkäinen <pasik@iki.fi>
Status: Not implemented yet, no code written.

What: The ability to relabel the disk volume (and thus rename the file on
the disk) after it has been recycled. Useful when you have a single
job per disk volume and you use a custom Label format, for example:

"${Client}-${Level}-${NumVols:p/4/0/r}-${Year}_${Month}_${Day}-${Hour}_${Minute}"

Why: Disk volumes in Bacula get their label/filename when they are used
for the first time. If you use recycling and a custom label format
like the above, the disk volume name no longer matches the contents
after it has been recycled. This feature makes it possible to keep
the label/filename in sync with the content, and thus makes it easy
to check/monitor the backups from the shell and/or normal file
management tools, because the filenames of the disk volumes match
their content.

Notes: The configuration option could be "Relabel after Recycling = Yes".
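To illustrate why relabeling matters with such a format, here is a tiny
sketch of variable expansion for the example label above. It handles
only plain ${Name} and the ${Name:p/width/char/r} padding modifier, and
is an illustration, not Bacula's actual variable-expansion code:

```python
import re

def expand_label(fmt, values):
    """Expand ${Name} and ${Name:p/width/char/r} (pad to 'width' with
    'char', right-aligned) in a label format string."""
    def repl(m):
        name, mods = m.group(1), m.group(2)
        val = str(values[name])
        if mods:
            _, width, char, align = mods.split("/")
            if align == "r":
                val = val.rjust(int(width), char)
        return val
    return re.sub(r"\$\{(\w+)(?::([^}]*))?\}", repl, fmt)
```

A recycled volume keeps its old expanded name (old client, level, and
date), which is exactly the mismatch the proposed relabel would fix.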
Item 33: Command that releases all drives in an autochanger

Origin: Blake Dunlap (blake@nxs.net)

What: It would be nice if there was a release command that
would release all drives in an autochanger instead of having to
do each one in turn.

Why: It can take some time for a release to occur, and the
commands must be given for each drive in turn, which can quickly
add up if there are several drives in the library. (Having to
watch the console to give each command can waste a good bit of
time once you get into the 16-drive range, where tapes can take
up to 3 minutes each to eject.)

Notes: Due to the way some autochangers/libraries work, you
cannot assume that newly inserted tapes will go into slots that
are not currently believed to be in use by bacula (the tape from
that slot is in a drive). This command would make configuration
changes quicker/easier, as all drives need to be released
before any modifications to slots.
Item 34: Run bscan on a remote storage daemon from within bconsole.

Date: 07 October 2009
Origin: Graham Keeling <graham@equiinet.com>

What: The ability to run bscan on a remote storage daemon from
within bconsole in order to populate your catalog.

Why: Currently, it seems you have to:
a) log in to a console on the remote machine
b) figure out where the storage daemon config file is
c) figure out the storage device from the config file
d) figure out the catalog IP address
e) figure out the catalog port
f) open the port on the catalog firewall
g) configure the catalog database to accept connections from the
remote machine
h) build a 'bscan' command from (b)-(e) above and run it

It would be much nicer to be able to type something like this into
the console:

*bscan storage=<storage> device=<device> volume=<volume>
or:
*bscan storage=<storage> all

It seems to me that the scan could also do a better job than the
external bscan program currently does. It would possibly be able to
deduce some extra details, such as the catalog StorageId for the
volume.

Notes: (Kern) If you need to do a bscan, you have done something wrong,
so this functionality should not need to be integrated into the
Storage daemon. However, I am not opposed to someone implementing
this feature provided that all the code is in a shared object (or
DLL) and does not add significantly to the size of the Storage
daemon. In addition, the code should be written in a way such that
the same source code is used in both the bscan program and the
Storage daemon, to avoid adding a lot of new code that must be
maintained by the project.
Item 35: Implement a Migration job type that will create a reverse
incremental (or decremental) backup from two existing full backups.

Date: 05 October 2009
Origin: Griffith College Dublin. Some sponsorship available.
Contact: Gavin McCullagh <gavin.mccullagh@gcd.ie>

What: The ability to take two full backup jobs and derive a reverse
incremental backup from them. The older full backup data may then
be discarded.

Why: Long-term backups based on keeping full backups can be expensive in
media. In many cases (e.g. a NAS), as the client accumulates files
over months and years, the same file will be duplicated unchanged
across many media and datasets. E.g., less than 10% (and
shrinking) of our monthly full mail server backup is new files;
the other 90% is also in the previous full backup.
Regularly converting the oldest full backup into a reverse
incremental backup allows the admin to keep access to old backup
jobs, but remove all of the duplicated files, freeing up media.

Notes: This feature was previously discussed on the bacula-devel list
here: http://www.mail-archive.com/bacula-devel@lists.sourceforge.net/msg04962.html
Item 36: Job migration between different SDs

Origin: Mariusz Czulada <manieq AT wp DOT eu>

What: Allow specifying, in a migration job, devices on a Storage Daemon
other than the one used for the migrated jobs (possibly on a
different/distant host).

Why: Sometimes we have more than one system which requires a backup
implementation. Often, these systems are functionally unrelated and
placed in different locations. Having a big backup device (a tape
library) in each location is not cost-effective. It would be much
better to have one sufficiently powerful tape library which could
handle backups from all systems, assuming relatively fast and
reliable WAN connections. In such an architecture, backups are done
in service windows on local bacula servers, then migrated to
central storage off peak hours.

Notes: If migration to a different SD works, migration to the same SD, as
now, could be done the same way (I mean 'localhost') to unify the
code.
Item 37: Concurrent spooling and despooling within a single job.

Origin: Jesper Krogh <jesper@krogh.cc>

What: When a job has spooling enabled and the spool area size is
less than the total volume size, the storage daemon will:
1) Spool to the spool area
2) Despool to the volume
3) Go to 1 if there is more data to be backed up.

Typical disks will serve data at a speed of 100MB/s when
dealing with large files; the network is typically capable of
115MB/s (GbitE). Tape drives will despool at 50-90MB/s (LTO3) or
70-120MB/s (LTO4), depending on compression and data.

As bacula currently works, it'll hold back data from the client
until despooling is done, no matter whether the spool area could
handle another block of data. Given, say, a FileSet of 4TB, a
spool area of 100GB, and a Maximum Job Spool Size of 50GB, the
above sequence could be changed to spool into the other 50GB
while despooling the first 50GB, without holding back the client.
As the numbers below show, depending on the tape drive and disk
arrays, this could potentially cut the backup time of individual
jobs by 50%.

Real-world example, backing up 112.6GB (large files) to LTO4 tapes
(despools at ~75MB/s; the data is gzipped on the remote
filesystem), with Maximum Job Spool Size = 8GB:

Elapsed time (total time): 46m 15s => 2775s
Despooling time: 25m 41s => 1541s (55%)
Spooling time: 20m 34s => 1234s (45%)
Reported speed: 40.58MB/s
Spooling speed: 112.6GB/1234s => 91.25MB/s
Despooling speed: 112.6GB/1541s => 73.07MB/s

So disk + net can "keep up" with the LTO4 drive (in this test).

The proposed change would effectively make the backup run in the
"despooling time" of 1541s, giving a reduction to 55% of the total
run time.

In the situation where an individual job cannot keep up with the
LTO drive, spooling enables efficient multiplexing of multiple
concurrent jobs onto the same drive.
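The arithmetic of the example above can be checked directly (decimal GB
and MB, as in the job report):

```python
data_gb = 112.6
spool_s = 20 * 60 + 34        # spooling time:   20m 34s -> 1234 s
despool_s = 25 * 60 + 41      # despooling time: 25m 41s -> 1541 s
total_s = 46 * 60 + 15        # elapsed time:    46m 15s -> 2775 s

spool_rate = data_gb * 1000 / spool_s      # ~91.25 MB/s
despool_rate = data_gb * 1000 / despool_s  # ~73.07 MB/s

# With spooling and despooling overlapped, the slower phase dominates,
# so the job would finish in roughly the despooling time alone.
overlapped_s = max(spool_s, despool_s)
fraction = overlapped_s / total_s          # ~0.555 -> "55% of the total"
```

This confirms the claimed reduction: an overlapped run of ~1541s is
about 55% of the measured 2775s sequential run.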
Why: When dealing with larger volumes, the general utilization of the
network/disk is important to maximize in order to be able to run a
full backup over a weekend. The current workaround is to split the
FileSet into smaller FileSets and Jobs, but that leads to more
configuration management, is harder to review for completeness,
and subsequently makes restores more complex.
Item 39: Extend the verify code to make it possible to verify
older jobs, not only the last one that has finished

Origin: Ralf Gross (Ralf-Lists <at> ralfgross.de)
Status: not implemented or documented

What: At the moment, a VolumeToCatalog job compares only the
last job with the data in the catalog. It's not possible
to compare the data (md5sums) of an older volume with the
data in the catalog.

Why: If a verify job fails, one has to immediately check the
source of the problem, fix it, and rerun the verify job.
This has to happen before the next backup of the
verified backup job starts.
More importantly: it's not possible to check jobs that are
kept for a long time (archive). If a jobid could be
specified for a verify job, older backups/tapes could be
checked on a regular basis.

Notes: verify documentation:
VolumeToCatalog: This level causes Bacula to read the file
attribute data written to the Volume from the last Job [...]

Verify Job = <Job-Resource-Name> If you run a verify job
without this directive, the last job run will be compared
with the catalog, which means that you must immediately
follow a backup by a verify command. If you specify a Verify
Job, Bacula will find the last job with that name that ran [...]

example bconsole verify dialog:

JobName: VerifyServerXXX
Level: VolumeToCatalog
Client: ServerXXX-fd
FileSet: ServerXXX-Vol1
Pool: Full (From Job resource)
Storage: Neo4100 (From Pool resource)
Verify Job: ServerXXX-Vol1
When: 2009-04-20 09:03:04
OK to run? (yes/mod/no): m
Parameters to modify: [...]
Item 40: Separate "Storage" and "Device" in the bacula-dir.conf

Origin: "James Harper" <james.harper@bendigoit.com.au>
Status: not implemented or documented

What: Separate "Storage" and "Device" in the bacula-dir.conf.
The resulting config would look something like:

Storage {
  Name = name_of_server
  Address = hostname/IP address
  Password = shh_its_a_secret
  Maximum Concurrent Jobs = 7
}

Device {
  Name = name_of_device
  Storage = name_of_server
  Device = name_of_device_on_sd
  Media Type = media_type
  Maximum Concurrent Jobs = 1
}

Maximum Concurrent Jobs would be specified with a server and a device
maximum, which would both be honoured by the director. Almost
everything that mentions a 'Storage' would need to be changed to
'Device', although perhaps a 'Storage' could just be a synonym for
'Device' for backwards compatibility.

Why: If you have multiple Storage definitions pointing to different
Devices in the same Storage daemon, the "status storage" command
prompts for each different device, but they all give the same
output.
========= New items after last vote ====================

========= Add new items above this line =================

============= Empty Feature Request form ===========
Item n: One line summary ...
Date: Date submitted
Origin: Name and email of originator.

What: More detailed explanation ...

Why: Why it is important ...

Notes: Additional notes or features (omit if not used)
============== End Feature Request form ==============
========== Items put on hold by Kern ============================

========== Items completed in version 5.0.0 ====================
*Item 2: 'restore' menu: enter a JobId, automatically select dependents
*Item 5: Deletion of disk Volumes when pruned (partial -- truncate when pruned)
*Item 6: Implement Base jobs
*Item 10: Restore from volumes on multiple storage daemons
*Item 15: Enable/disable compression depending on storage device (disk/tape)
*Item 20: Cause daemons to use a specific IP address to source communications
*Item 23: "Maximum Concurrent Jobs" for drives when used with changer device
*Item 31: List InChanger flag when doing restore.
*Item 35: Port bat to Win32