3 Bacula Projects Roadmap
4 Status updated 25 February 2010
9 Item 1: Ability to restart failed jobs
10 Item 2: Scheduling syntax that permits more flexibility and options
11 Item 3: Data encryption on storage daemon
12 Item 4: Add ability to Verify any specified Job.
13 Item 5: Improve Bacula's tape and drive usage and cleaning management
14 Item 6: Allow FD to initiate a backup
15 Item 7: Implement Storage daemon compression
16 Item 8: Reduction of communications bandwidth for a backup
17 Item 9: Ability to reconnect a disconnected comm line
18 Item 10: Start spooling even when waiting on tape
19 Item 11: Include all conf files in specified directory
20 Item 12: Multiple threads in file daemon for the same job
Item 13: Possibility to schedule Jobs on last Friday of the month
22 Item 14: Include timestamp of job launch in "stat clients" output
23 Item 15: Message mailing based on backup types
24 Item 16: Ability to import/export Bacula database entities
25 Item 17: Implementation of running Job speed limit.
26 Item 18: Add an override in Schedule for Pools based on backup types
27 Item 19: Automatic promotion of backup levels based on backup size
28 Item 20: Allow FileSet inclusion/exclusion by creation/mod times
29 Item 21: Archival (removal) of User Files to Tape
30 Item 22: An option to operate on all pools with update vol parameters
31 Item 23: Automatic disabling of devices
32 Item 24: Ability to defer Batch Insert to a later time
33 Item 25: Add MaxVolumeSize/MaxVolumeBytes to Storage resource
Item 26: Enable persistent naming/numbering of SQL queries
35 Item 27: Bacula Dir, FD and SD to support proxies
Item 28: Add Minimum Spool Size directive
37 Item 29: Handle Windows Encrypted Files using Win raw encryption
38 Item 30: Implement a Storage device like Amazon's S3.
39 Item 31: Convert tray monitor on Windows to a stand alone program
40 Item 32: Relabel disk volume after recycling
41 Item 33: Command that releases all drives in an autochanger
42 Item 34: Run bscan on a remote storage daemon from within bconsole.
43 Item 35: Implement a Migration job type that will create a reverse
44 Item 36: Job migration between different SDs
Item 37: Concurrent spooling and despooling within a single job.
46 Item 39: Extend the verify code to make it possible to verify
47 Item 40: Separate "Storage" and "Device" in the bacula-dir.conf
48 Item 41: Least recently used device selection for tape drives in autochanger.
51 Item 1: Ability to restart failed jobs
56 What: Often jobs fail because of a communications line drop or max run time,
cancel, or some other non-critical problem. Currently any data
58 saved is lost. This implementation should modify the Storage daemon
so that it saves all the files that it knows are completely backed
up to the Volume.
62 The jobs should then be marked as incomplete and a subsequent
Incremental Accurate backup will then take into account all the
files already saved.
Why: Avoids backing up data already saved.
Notes: Requires Accurate mode to restart correctly. A minimum volume
of data or files must have been stored on the Volume before enabling this.
72 Item 2: Scheduling syntax that permits more flexibility and options
73 Date: 15 December 2006
74 Origin: Gregory Brauer (greg at wildbrain dot com) and
75 Florian Schnabel <florian.schnabel at docufy dot de>
78 What: Currently, Bacula only understands how to deal with weeks of the
79 month or weeks of the year in schedules. This makes it impossible
80 to do a true weekly rotation of tapes. There will always be a
81 discontinuity that will require disruptive manual intervention at
82 least monthly or yearly because week boundaries never align with
83 month or year boundaries.
85 A solution would be to add a new syntax that defines (at least)
a start timestamp and a repetition period.
There should also be an easy option to skip a certain job on a certain date.
91 Why: Rotated backups done at weekly intervals are useful, and Bacula
92 cannot currently do them without extensive hacking.
You could then easily skip tape backups on holidays. This would be
especially handy if you have no autochanger and can only fit one backup
on a tape: other jobs could proceed normally and you wouldn't get
errors that way.
100 Notes: Here is an example syntax showing a 3-week rotation where full
101 Backups would be performed every week on Saturday, and an
102 incremental would be performed every week on Tuesday. Each
103 set of tapes could be removed from the loader for the following
104 two cycles before coming back and being reused on the third
105 week. Since the execution times are determined by intervals
106 from a given point in time, there will never be any issues with
107 having to adjust to any sort of arbitrary time boundary. In
108 the example provided, I even define the starting schedule
109 as crossing both a year and a month boundary, but the run times
would be based on the "Repeat" value and would therefore happen
regardless of those boundaries:

  Schedule {
      Name = "Week 1 Rotation"
      #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
      Run = Level=Full Start = 2006-12-30 01:00 Repeat = 3w
      #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
      Run = Level=Incremental Start = 2007-01-02 01:00 Repeat = 3w
  }

  Schedule {
      Name = "Week 2 Rotation"
      #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
      Run = Level=Full Start = 2007-01-06 01:00 Repeat = 3w
      #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
      Run = Level=Incremental Start = 2007-01-09 01:00 Repeat = 3w
  }

  Schedule {
      Name = "Week 3 Rotation"
      #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
      Run = Level=Full Start = 2007-01-13 01:00 Repeat = 3w
      #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
      Run = Level=Incremental Start = 2007-01-16 01:00 Repeat = 3w
  }
174 Notes: Kern: I have merged the previously separate project of skipping
175 jobs (via Schedule syntax) into this.
178 Item 3: Data encryption on storage daemon
179 Origin: Tobias Barth <tobias.barth at web-arts.com>
180 Date: 04 February 2009
What: The storage daemon should be able to do the data encryption that can
184 currently be done by the file daemon.
186 Why: This would have 2 advantages:
1) one could encrypt the data of unencrypted tapes by doing a
migration job, and
189 2) the storage daemon would be the only machine that would have
190 to keep the encryption keys.
193 As an addendum to the feature request, here are some crypto
implementation details I wrote up regarding SD-encryption back in
January 2009:
196 http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg28860.html
199 Item 4: Add ability to Verify any specified Job.
200 Date: 17 January 2008
201 Origin: portrix.net Hamburg, Germany.
202 Contact: Christian Sabelmann
203 Status: 70% of the required Code is part of the Verify function since v. 2.x
What: The ability to tell Bacula which Job it should verify, instead of
automatically verifying just the last one.
Why: It is sad that such a powerful feature like Verify Jobs
(VolumeToCatalog) is restricted to the last backup Job of a client.
Users who have to do daily Backups are forced to also do daily Verify
Jobs in order to take advantage of this useful feature. This daily
verify-after-backup routine is not always desired, and Verify Jobs
sometimes have to be scheduled separately (not necessarily in Bacula).
With this feature, Admins could verify Jobs once a week or once a
month, selecting the Jobs they want to verify. This feature is also not
too difficult to implement, taking into account older bug reports about
it and the selection of the Job to be verified.
221 Notes: For the verify Job, the user could select the Job to be verified
222 from a List of the latest Jobs of a client. It would also be possible to
verify a certain volume. All of these would naturally apply only to
Jobs whose file information is still in the catalog.
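A possible console interaction (hypothetical syntax; a jobid selection
for verify does not exist yet):

   *run job=VerifyServerXXX level=VolumeToCatalog jobid=4711

or, interactively, picking the JobId from a list of the client's most
recent Jobs.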
227 Item 5: Improve Bacula's tape and drive usage and cleaning management
228 Date: 8 November 2005, November 11, 2005
229 Origin: Adam Thornton <athornton at sinenomine dot net>,
230 Arno Lehmann <al at its-lehmann dot de>
233 What: Make Bacula manage tape life cycle information, tape reuse
234 times and drive cleaning cycles.
Why: All three parts of this project are important when operating
backups.
238 We need to know which tapes need replacement, and we need to
239 make sure the drives are cleaned when necessary. While many
240 tape libraries and even autoloaders can handle all this
241 automatically, support by Bacula can be helpful for smaller
242 (older) libraries and single drives. Limiting the number of
times a tape is used might prevent tape errors that occur when
tapes are used until the drives can't read them any more. Also, checking
245 drive status during operation can prevent some failures (as I
246 [Arno] had to learn the hard way...)
248 Notes: First, Bacula could (and even does, to some limited extent)
249 record tape and drive usage. For tapes, the number of mounts,
250 the amount of data, and the time the tape has actually been
251 running could be recorded. Data fields for Read and Write
252 time and Number of mounts already exist in the catalog (I'm
253 not sure if VolBytes is the sum of all bytes ever written to
254 that volume by Bacula). This information can be important
255 when determining which media to replace. The ability to mark
256 Volumes as "used up" after a given number of write cycles
257 should also be implemented so that a tape is never actually
258 worn out. For the tape drives known to Bacula, similar
259 information is interesting to determine the device status and
260 expected life time: Time it's been Reading and Writing, number
261 of tape Loads / Unloads / Errors. This information is not yet
262 recorded as far as I [Arno] know. A new volume status would
263 be necessary for the new state, like "Used up" or "Worn out".
264 Volumes with this state could be used for restores, but not
265 for writing. These volumes should be migrated first (assuming
266 migration is implemented) and, once they are no longer needed,
267 could be moved to a Trash pool.
269 The next step would be to implement a drive cleaning setup.
270 Bacula already has knowledge about cleaning tapes. Once it
271 has some information about cleaning cycles (measured in drive
run time, number of tapes used, or calendar days, for example)
273 it can automatically execute tape cleaning (with an
autochanger, obviously) or ask for operator assistance in loading
a cleaning tape.
277 The final step would be to implement TAPEALERT checks not only
278 when changing tapes and only sending the information to the
279 administrator, but rather checking after each tape error,
280 checking on a regular basis (for example after each tape
281 file), and also before unloading and after loading a new tape.
Then, depending on the drive's TAPEALERT state and the known
283 drive cleaning state Bacula could automatically schedule later
284 cleaning, clean immediately, or inform the operator.
286 Implementing this would perhaps require another catalog change
287 and perhaps major changes in SD code and the DIR-SD protocol,
288 so I'd only consider this worth implementing if it would
289 actually be used or even needed by many people.
291 Implementation of these projects could happen in three distinct
292 sub-projects: Measuring Tape and Drive usage, retiring
293 volumes, and handling drive cleaning and TAPEALERTs.
296 Item 6: Allow FD to initiate a backup
297 Origin: Frank Volf (frank at deze dot org)
298 Date: 17 November 2005
301 What: Provide some means, possibly by a restricted console that
302 allows a FD to initiate a backup, and that uses the connection
303 established by the FD to the Director for the backup so that
304 a Director that is firewalled can do the backup.
305 Why: Makes backup of laptops much easier.
306 Notes: - The FD already has code for the monitor interface
- It would be nice to have a .job command that lists authorized jobs
309 - Commands need to be restricted on the Director side
310 (for example by re-using the runscript flag)
311 - The Client resource can be used to authorize the connection
- Initially, the client would not be able to modify job parameters
313 - We need a way to run a status command to follow job progression
This project consists of the following points:
316 1. Modify the FD to have a "mini-console" interface that
317 permits it to connect to the Director and start a
318 backup job of itself.
2. The list of jobs that can be started by the FD is
defined in the Director (possibly via a restricted
console).
322 3. Modify the existing tray monitor code in the Win32 FD
323 so that it is a separate program from the FD.
4. The tray monitor program should be extended to permit
initiating a backup.
326 5. No new Director directives should be added without
327 prior consultation with the Bacula developers.
328 6. The comm line used by the FD to connect to the Director
329 should be re-used by the Director to do the backup.
330 This feature is partially implemented in the Director.
331 7. The FD may have a new directive that allows it to start
332 a backup when the FD starts.
333 8. The console interface to the FD should be extended to
permit a properly authorized console to initiate a
backup via the FD.
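A rough sketch of how points 1 and 7 might look in bacula-fd.conf
(hypothetical directive, name invented for illustration):

   FileDaemon {
     Name = laptop-fd
     ...
     # Hypothetical: connect to the Director at startup and request
     # this Director-authorized job over the same comm line
     Run Job On Start = "BackupLaptop"
   }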
338 Item 7: Implement Storage daemon compression
339 Date: 18 December 2006
340 Origin: Vadim A. Umanski , e-mail umanski@ext.ru
What: The ability to compress backup data on the SD receiving the data
instead of on the client sending it.
344 Why: The need is practical. I've got some machines that can send
345 data to the network 4 or 5 times faster than compressing
346 them (I've measured that). They're using fast enough SCSI/FC
347 disk subsystems but rather slow CPUs (ex. UltraSPARC II).
And the backup server has quite fast CPUs (ex. Dual P4
349 Xeons) and quite a low load. When you have 20, 50 or 100 GB
350 of raw data - running a job 4 to 5 times faster - that
351 really matters. On the other hand, the data can be
compressed 50% or better - so losing twice the space for
353 disk backup is not good at all. And the network is all mine
354 (I have a dedicated management/provisioning network) and I
355 can get as high bandwidth as I need - 100Mbps, 1000Mbps...
356 That's why the server-side compression feature is needed!
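A sketch of how this might be expressed in the SD Device resource
(hypothetical directive, invented for illustration):

   Device {
     Name = FileStorage
     Media Type = File
     ...
     # Hypothetical: compress incoming data on the SD instead of the FD
     Compression = GZIP6
   }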
360 Item 8: Reduction of communications bandwidth for a backup
361 Date: 14 October 2008
362 Origin: Robin O'Leary (Equiinet)
365 What: Using rdiff techniques, Bacula could significantly reduce
366 the network data transfer volume to do a backup.
368 Why: Faster backup across the Internet
370 Notes: This requires retaining certain data on the client during a Full
371 backup that will speed up subsequent backups.
374 Item 9: Ability to reconnect a disconnected comm line
379 What: Often jobs fail because of a communications line drop. In that
case, Bacula should be able to reconnect to the other daemon and
continue the job.
Why: Avoids backing up data already saved.
Notes: *Very* complicated from a design point of view because of authentication.
387 Item 10: Start spooling even when waiting on tape
388 Origin: Tobias Barth <tobias.barth@web-arts.com>
392 What: If a job can be spooled to disk before writing it to tape, it should
393 be spooled immediately. Currently, bacula waits until the correct
394 tape is inserted into the drive.
396 Why: It could save hours. When bacula waits on the operator who must insert
397 the correct tape (e.g. a new tape or a tape from another media
398 pool), bacula could already prepare the spooled data in the spooling
directory and start despooling immediately once the operator
inserts the tape.
402 2nd step: Use 2 or more spooling directories. When one directory is
403 currently despooling, the next (on different disk drives) could
404 already be spooling the next data.
406 Notes: I am using bacula 2.2.8, which has none of those features
410 Item 11: Include all conf files in specified directory
411 Date: 18 October 2008
412 Origin: Database, Lda. Maputo, Mozambique
413 Contact:Cameron Smith / cameron.ord@database.co.mz
416 What: A directive something like "IncludeConf = /etc/bacula/subconfs" Every
417 time Bacula Director restarts or reloads, it will walk the given
418 directory (non-recursively) and include the contents of any files
419 therein, as though they were appended to bacula-dir.conf
421 Why: Permits simplified and safer configuration for larger installations with
422 many client PCs. Currently, through judicious use of JobDefs and
423 similar directives, it is possible to reduce the client-specific part of
424 a configuration to a minimum. The client-specific directives can be
425 prepared according to a standard template and dropped into a known
426 directory. However it is still necessary to add a line to the "master"
427 (bacula-dir.conf) referencing each new file. This exposes the master to
428 unnecessary risk of accidental mistakes and makes automation of adding
new client-confs more difficult (it is easier to automate dropping a
file into a dir, than rewriting an existing file). Kern has previously
431 made a convincing argument for NOT including Bacula's core configuration
432 in an RDBMS, but I believe that the present request is a reasonable
433 extension to the current "flat-file-based" configuration philosophy.
435 Notes: There is NO need for any special syntax to these files. They should
436 contain standard directives which are simply "inlined" to the parent
437 file as already happens when you explicitly reference an external file.
439 Notes: (kes) this can already be done with scripting
440 From: John Jorgensen <jorgnsn@lcd.uregina.ca>
441 The bacula-dir.conf at our site contains these lines:
444 # Include subfiles associated with configuration of clients.
445 # They define the bulk of the Clients, Jobs, and FileSets.
447 @|"sh -c 'for f in /etc/bacula/clientdefs/*.conf ; do echo @${f} ; done'"
449 and when we get a new client, we just put its configuration into
450 a new file called something like:
452 /etc/bacula/clientdefs/clientname.conf
455 Item 12: Multiple threads in file daemon for the same job
456 Date: 27 November 2005
457 Origin: Ove Risberg (Ove.Risberg at octocode dot com)
460 What: I want the file daemon to start multiple threads for a backup
461 job so the fastest possible backup can be made.
463 The file daemon could parse the FileSet information and start
one thread for each File entry located on a separate
filesystem.
A configuration option in the job section should be used to
enable or disable this feature. The configuration option could
469 specify the maximum number of threads in the file daemon.
If the threads could spool the data to separate spool files
472 the restore process will not be much slower.
474 Why: Multiple concurrent backups of a large fileserver with many
475 disks and controllers will be much faster.
477 Notes: (KES) This is not necessary and could be accomplished
478 by having two jobs. In addition, the current VSS code
Item 13: Possibility to schedule Jobs on last Friday of the month
483 Origin: Carsten Menke <bootsy52 at gmx dot net>
487 What: Currently if you want to run your monthly Backups on the last
Friday of each month, this is only possible with workarounds (e.g.
scripting), since some months have four Fridays and some have five.
490 The same is true if you plan to run your yearly Backups on the
491 last Friday of the year. It would be nice to have the ability to
492 use the builtin scheduler for this.
494 Why: In many companies the last working day of the week is Friday (or
495 Saturday), so to get the most data of the month onto the monthly
496 tape, the employees are advised to insert the tape for the
monthly backups on the last Friday of the month.
499 Notes: To give this a complete functionality it would be nice if the
500 "first" and "last" Keywords could be implemented in the
scheduler, so it would also be possible to run monthly backups on the
first Friday of the month, and much more. If the syntax were expanded
to {first|last} {Month|Week|Day|Mo-Fri} of the {Year|Month|Week}, you
would be able to run really flexible jobs.
To get a certain Job to run on the last Friday of the month, for example:
509 Run = pool=Monthly last Fri of the Month at 23:50
## Yearly Backup on the last Friday of the Year
Run = pool=Yearly last Fri of the Year at 23:50
515 ## Certain Jobs the last Week of a Month
517 Run = pool=LastWeek last Week of the Month at 23:50
519 ## Monthly Backup on the last day of the month
521 Run = pool=Monthly last Day of the Month at 23:50
524 Item 14: Include timestamp of job launch in "stat clients" output
525 Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
526 Date: Tue Aug 22 17:13:39 EDT 2006
529 What: The "stat clients" command doesn't include any detail on when
530 the active backup jobs were launched.
532 Why: Including the timestamp would make it much easier to decide whether
533 a job is running properly.
535 Notes: It may be helpful to have the output from "stat clients" formatted
536 more like that from "stat dir" (and other commands), in a column
537 format. The per-client information that's currently shown (level,
538 client name, JobId, Volume, pool, device, Files, etc.) is good, but
539 somewhat hard to parse (both programmatically and visually),
540 particularly when there are many active clients.
543 Item 15: Message mailing based on backup types
544 Origin: Evan Kaufman <evan.kaufman@gmail.com>
545 Date: January 6, 2006
548 What: In the "Messages" resource definitions, allowing messages
549 to be mailed based on the type (backup, restore, etc.) and level
(full, differential, etc.) of the job that created the originating
message.
553 Why: It would, for example, allow someone's boss to be emailed
554 automatically only when a Full Backup job runs, so he can
555 retrieve the tapes for offsite storage, even if the IT dept.
556 doesn't (or can't) explicitly notify him. At the same time, his
mailbox wouldn't be filled with notifications of Verifies, Restores,
or Incremental/Differential Backups (which would likely be kept
onsite).
Notes: One way this could be done is through additional message types, for
example:
565 # email the boss only on full system backups
Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
       !verify, !admin
568 # email us only when something breaks
569 MailOnError = itdept@mycompany.com = all
572 Notes: Kern: This should be rather trivial to implement.
575 Item 16: Ability to import/export Bacula database entities
580 What: Create a Bacula ASCII SQL database independent format that permits
581 importing and exporting database catalog Job entities.
Why: For archival, database clustering, and transfer to other databases
Notes: Job selection should be by Job, time, Volume, Client, Pool and possibly
other criteria.
590 Item 17: Implementation of running Job speed limit.
591 Origin: Alex F, alexxzell at yahoo dot com
592 Date: 29 January 2009
594 What: I noticed the need for an integrated bandwidth limiter for
595 running jobs. It would be very useful just to specify another
596 field in bacula-dir.conf, like speed = how much speed you wish
for that specific job to run at.
599 Why: Because of a couple of reasons. First, it's very hard to implement a
traffic shaping utility and also make it reliable. Second, it is very
cumbersome to have to deploy such tools to, say, 50 clients
602 (including desktops, servers). This would also be unreliable because you
603 have to make sure that the apps are properly working when needed; users
604 could also disable them (accidentally or not). It would be very useful
605 to provide Bacula this ability. All information would be centralized,
606 you would not have to go to 50 different clients in 10 different
locations for configuration; eliminating 3rd-party additions helps in
establishing efficiency. It would also avoid bandwidth congestion,
609 especially where there is little available.
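A sketch of how such a limit might look in bacula-dir.conf
(hypothetical directive name and syntax):

   Job {
     Name = "BackupRemoteOffice"
     ...
     # Hypothetical: cap this job's network throughput
     Maximum Bandwidth = 2mb/s
   }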
612 Item 18: Add an override in Schedule for Pools based on backup types
614 Origin: Chad Slater <chad.slater@clickfox.com>
617 What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
618 would help those of us who use different storage devices for different
619 backup levels cope with the "auto-upgrade" of a backup.
621 Why: Assume I add several new devices to be backed up, i.e. several
622 hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
623 stored in a disk set on a 2TB RAID. If you add these devices in the
624 middle of the month, the incrementals are upgraded to "full" backups,
625 but they try to use the same storage device as requested in the
incremental job, filling up the RAID holding the incrementals. If we
627 could override the Storage parameter for full and/or differential
628 backups, then the Full job would use the proper Storage device, which
has more capacity (i.e. an 8TB tape library).
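A sketch of the requested override in a Schedule resource (FullStorage
is the name proposed above; the syntax is hypothetical):

   Schedule {
     Name = "NightlyCycle"
     # Hypothetical: jobs run (or upgraded) at level Full would use the
     # tape library instead of the Job's normal Storage
     Run = Level=Full FullStorage=BigTapeLibrary 1st sun at 23:05
     Run = Level=Incremental mon-sat at 23:05
   }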
632 Item 19: Automatic promotion of backup levels based on backup size
633 Date: 19 January 2006
634 Origin: Adam Thornton <athornton@sinenomine.net>
What: Other backup programs have a feature whereby they estimate the space
638 that a differential, incremental, and full backup would take. If
639 the difference in space required between the scheduled level and the
640 next level up is beneath some user-defined critical threshold, the
641 backup level is bumped to the next type. Doing this minimizes the
642 number of volumes necessary during a restore, with a fairly minimal
643 cost in backup media space.
645 Why: I know at least one (quite sophisticated and smart) user for whom the
646 absence of this feature is a deal-breaker in terms of using Bacula;
if we had it, it would eliminate the one cool thing other backup
programs can do and we can't (at least, the one cool thing I know of).
652 Item 20: Allow FileSet inclusion/exclusion by creation/mod times
653 Origin: Evan Kaufman <evan.kaufman@gmail.com>
654 Date: January 11, 2006
657 What: In the vein of the Wild and Regex directives in a Fileset's
658 Options, it would be helpful to allow a user to include or exclude
659 files and directories by creation or modification times.
661 You could factor the Exclude=yes|no option in much the same way it
662 affects the Wild and Regex directives. For example, you could exclude
663 all files modified before a certain date:
667 Modified Before = ####
670 Or you could exclude all files created/modified since a certain date:
674 Created Modified Since = ####
677 The format of the time/date could be done several ways, say the number
678 of seconds since the epoch:
679 1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`
681 Or a human readable date in a cryptic form:
682 20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss
684 Why: I imagine a feature like this could have many uses. It would
685 allow a user to do a full backup while excluding the base operating
686 system files, so if I installed a Linux snapshot from a CD yesterday,
687 I'll *exclude* all files modified *before* today. If I need to
688 recover the system, I use the CD I already have, plus the tape backup.
689 Or if, say, a Windows client is hit by a particularly corrosive
virus, and I need to *exclude* any files created/modified *since* the
time of infection.
693 Notes: Of course, this feature would work in concert with other
in/exclude rules, and wouldn't override them (or each other).
696 Notes: The directives I'd imagine would be along the lines of
697 "[Created] [Modified] [Before|Since] = <date>".
So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
or 'since'.
702 Item 21: Archival (removal) of User Files to Tape
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
707 What: The ability to archive data to storage based on certain parameters
708 such as age, size, or location. Once the data has been written to
709 storage and logged it is then pruned from the originating
filesystem. Note! We are talking about user's files and not
Bacula's files.
713 Why: This would allow fully automatic storage management which becomes
714 useful for large datastores. It would also allow for auto-staging
715 from one media type to another.
717 Example 1) Medical imaging needs to store large amounts of data.
718 They decide to keep data on their servers for 6 months and then put
719 it away for long term storage. The server then finds all files
older than 6 months and writes them to tape. The files are then removed
from disk.
723 Example 2) All data that hasn't been accessed in 2 months could be
724 moved from high-cost, fibre-channel disk storage to a low-cost
large-capacity SATA disk storage pool which doesn't have as quick an
access time. Then after another 6 months (or possibly as one
727 storage pool gets full) data is migrated to Tape.
730 Item 22: An option to operate on all pools with update vol parameters
731 Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
733 Status: Patch made by Nigel Stepp
735 What: When I do update -> Volume parameters -> All Volumes
736 from Pool, then I have to select pools one by one. I'd like
console to have an option like "0: All Pools" in the list of pools.
Why: I have many pools and am therefore unhappy with manually
741 updating each of them using update -> Volume parameters -> All
742 Volumes from Pool -> pool #.
745 Item 23: Automatic disabling of devices
747 Origin: Peter Eriksson <peter at ifm.liu dot se>
What: After a configurable number of fatal errors with a tape drive
751 Bacula should automatically disable further use of a certain
tape drive. There should also be "disable"/"enable" commands in
the console.
755 Why: On a multi-drive jukebox there is a possibility of tape drives
756 going bad during large backups (needing a cleaning tape run,
757 tapes getting stuck). It would be advantageous if Bacula would
758 automatically disable further use of a problematic tape drive
after a configurable number of errors has occurred.
761 An example: I have a multi-drive jukebox (6 drives, 380+ slots)
762 where tapes occasionally get stuck inside the drive. Bacula will
763 notice that the "mtx-changer" command will fail and then fail
764 any backup jobs trying to use that drive. However, it will still
765 keep on trying to run new jobs using that drive and fail -
766 forever, and thus failing lots and lots of jobs... Since we have
767 many drives Bacula could have just automatically disabled
further use of that drive and used one of the other ones instead.
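One way the pieces might fit together (both the directive and the
console commands are hypothetical sketches):

   Device {
     Name = LTO-Drive2
     ...
     # Hypothetical: stop scheduling this drive after 3 fatal errors
     Maximum Fatal Errors = 3
   }

and, for manual control in the console:

   *disable storage=Autochanger drive=2
   *enable storage=Autochanger drive=2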
772 Item 24: Ability to defer Batch Insert to a later time
777 What: Instead of doing a Job Batch Insert at the end of the Job
which might create resource contention with lots of Jobs,
779 defer the insert to a later time.
Why: Permits focusing on getting the data onto the Volume, and putting
the metadata into the Catalog outside the backup window.
785 Notes: Will use the proposed Bacula ASCII database import/export
786 format (i.e. dependent on the import/export entities project).
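A sketch of how this might be enabled in a Job resource (hypothetical
directive, building on the import/export format mentioned above):

   Job {
     Name = "BigNightly"
     ...
     # Hypothetical: spool the attributes now, run the Batch Insert
     # later (e.g. from an Admin job outside the backup window)
     Defer Batch Insert = yes
   }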
789 Item 25: Add MaxVolumeSize/MaxVolumeBytes to Storage resource
790 Origin: Bastian Friedrich <bastian.friedrich@collax.com>
794 What: SD has a "Maximum Volume Size" statement, which is deprecated and
795 superseded by the Pool resource statement "Maximum Volume Bytes".
It would be good if either statement could be used in Storage
resources as well.
799 Why: Pools do not have to be restricted to a single storage type/device;
800 thus, it may be impossible to define Maximum Volume Bytes in the
801 Pool resource. The old MaxVolSize statement is deprecated, as it
802 is SD side only. I am using the same pool for different devices.
804 Notes: State of idea currently unknown. Storage resources in the dir
805 config currently translate to very slim catalog entries; these
806 entries would require extensions to implement what is described
807 here. Quite possibly, numerous other statements that are currently
available in Pool resources could be used in Storage resources too.
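A sketch of the requested usage (hypothetical; today the directive is
only accepted in the Pool resource):

   Storage {
     Name = FileStore
     Address = sd.example.org
     ...
     # Hypothetical: per-storage volume size cap
     Maximum Volume Bytes = 4G
   }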
Item 26: Enable persistent naming/numbering of SQL queries
What: Change the parsing of the query.sql file and the query command so that
queries are named/numbered by a fixed value, not their order in the file.
Why: One of the real strengths of bacula is the ability to query the
825 database, and the fact that complex queries can be saved and
826 referenced from a file is very powerful. However, the choice
827 of query (both for interactive use, and by scripting input
828 to the bconsole command) is completely dependent on the order
within the query.sql file. The descriptive labels are helpful for
830 interactive use, but users become used to calling a particular
831 query "by number", or may use scripts to execute queries. This
832 presents a problem if the number or order of queries in the file
835 If the query.sql file used the numeric tags as a real value (rather
836 than a comment), then users could have a higher confidence that they
are executing the intended query, and that their local changes wouldn't
838 conflict with future bacula upgrades.
840 For scripting, it's very important that the intended query is
841 what's actually executed. The current method of parsing the
842 query.sql file discourages scripting because the addition or
843 deletion of queries within the file will require corresponding
844 changes to scripts. It may not be obvious to users that deleting
845 query "17" in the query.sql file will require changing all
846 references to higher numbered queries. Similarly, when new
847 bacula distributions change the number of "official" queries,
848 user-developed queries cannot simply be appended to the file
849 without also changing any references to those queries in scripts
850 or procedural documentation, etc.
852 In addition, using fixed numbers for queries would encourage more
user-initiated development of queries, by supporting conventions
such as:
queries numbered 1-50 are supported/developed/distributed with
official bacula releases
859 queries numbered 100-200 are community contributed, and are
860 related to media management
862 queries numbered 201-300 are community contributed, and are
863 related to checksums, finding duplicated files across
864 different backups, etc.
866 queries numbered 301-400 are community contributed, and are
867 related to backup statistics (average file size, size per
868 client per backup level, time for all clients by backup level,
869 storage capacity by media type, etc.)
871 queries numbered 500-999 are locally created
Alternatively, queries could be called by keyword (tag), rather
than by number.
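A sketch of what a fixed-tag query.sql format might look like
(hypothetical syntax; today the numbers are assigned purely by parsing
order):

   # The tag :501: is parsed as the query's permanent number
   :501: List Volumes that are full
   SELECT VolumeName,VolStatus,VolBytes FROM Media WHERE VolStatus='Full';

A script could then call the query by its fixed number (again
hypothetical), e.g. "query 501", regardless of where it sits in the file.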
878 Item 27: Bacula Dir, FD and SD to support proxies
879 Origin: Karl Grindley @ MIT Lincoln Laboratory <kgrindley at ll dot mit dot edu>
883 What: Support alternate methods for nailing up a TCP session such
884 as SOCKS5, SOCKS4 and HTTP (CONNECT) proxies. Such a feature
would allow tunneling of bacula traffic in and out of proxied
networks.
888 Why: Currently, bacula is architected to only function on a flat network, with
889 no barriers or limitations. Due to the large configuration states of
890 any network and the infinite configuration where file daemons and
891 storage daemons may sit in relation to one another, bacula often is
not usable on a network where filtered or air-gapped segments exist.
893 While often solutions such as ACL modifications to firewalls or port
894 redirection via SNAT or DNAT will solve the issue, often however,
895 these solutions are not adequate or not allowed by hard policy.
In an air-gapped network where only highly locked-down proxy services
are provided (SOCKS4/5 and/or HTTP and/or SSH outbound), ACLs or
iptables rules will not work.
Notes: Director resource tunneling: the configuration option to utilize a
proxy to connect to a client should be specified in the Client
resource. Client resource tunneling: should this be configured in the
Client resource in the director config file, or in the bacula-fd
configuration file on the FD host itself? If the latter, this would
allow only certain clients to use a proxy, while others do not, when
establishing the TCP connection to the storage server.
909 Also worth noting, there are other 3rd party, light weight apps that
could be utilized to bootstrap this. Instead of socksifying bacula
911 itself, use an external program to broker proxy authentication, and
912 connection to the remote host. OpenSSH does this by using the
913 "ProxyCommand" syntax in the client configuration and uses stdin and
914 stdout to the command. Connect.c is a very popular one.
915 (http://bent.latency.net/bent/darcs/goto-san-connect-1.85/src/connect.html).
916 One could also possibly use stunnel, netcat, etc.
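A sketch of the ProxyCommand-style approach in a Client resource
(hypothetical directive; "connect" is the third-party program linked
above, and the %h/%p placeholders are borrowed from OpenSSH):

   Client {
     Name = remote-fd
     Address = 10.20.30.40
     ...
     # Hypothetical: broker the TCP session through an external command
     Proxy Command = "connect -S socks.example.org:1080 %h %p"
   }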
Item 28: Add Minimum Spool Size directive
921 Origin: Frank Sweetser <fs@wpi.edu>
923 What: Add a new SD directive, "minimum spool size" (or similar). This
924 directive would specify a minimum level of free space available for
925 spooling. If the unused spool space is less than this level, any
926 new spooling requests would be blocked as if the "maximum spool
927 size" threshold had bee reached. Already spooling jobs would be
928 unaffected by this directive.
930 Why: I've been bitten by this scenario a couple of times:
932 Assume a maximum spool size of 100M. Two concurrent jobs, A and B,
933 are both running. Due to timing quirks and previously running jobs,
934 job A has used 99.9M of space in the spool directory. While A is
935 busy despooling to disk, B is happily using the remaining 0.1M of
936 spool space. This ends up in a spool/despool sequence every 0.1M of
937 data. In addition to fragmenting the data on the volume far more
than was necessary, in larger data sets (i.e. tens or hundreds of
939 gigabytes) it can easily produce multi-megabyte report emails!
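A sketch of the proposed directive next to the existing one in an SD
Device resource (the new name is hypothetical, per the request):

   Device {
     Name = LTO-Drive0
     ...
     Maximum Spool Size = 100G
     # Hypothetical: block new spooling jobs when free spool space
     # falls below this threshold; jobs already spooling continue
     Minimum Spool Size = 10G
   }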
942 Item 29: Handle Windows Encrypted Files using Win raw encryption
943 Origin: Michael Mohr, SAG Mohr.External@infineon.com
944 Date: 22 February 2008
945 Origin: Alex Ehrlich (Alex.Ehrlich-at-mail.ee)
What: Make it possible to back up and restore Encrypted Files from and to
Windows systems without the need to decrypt them, by using the raw
951 encryption functions API (see:
952 http://msdn2.microsoft.com/en-us/library/aa363783.aspx)
953 that is provided for that reason by Microsoft.
Whether a file is encrypted can be determined by evaluating the
FILE_ATTRIBUTE_ENCRYPTED flag returned by GetFileAttributes.
957 For each file backed up or restored by FD on Windows, check if
958 the file is encrypted; if so then use OpenEncryptedFileRaw,
959 ReadEncryptedFileRaw, WriteEncryptedFileRaw,
CloseEncryptedFileRaw instead of the BackupRead and BackupWrite calls.
Why: Without this interface, the FD running under the system account
can't read encrypted files, because it lacks the key needed for
decryption. As a result, encrypted files are currently not backed up
by bacula, and no error is shown for the skipped files.
Notes: Using the xxxEncryptedFileRaw API would make it possible to back up and
970 restore EFS-encrypted files without decrypting their data.
971 Note that such files cannot be restored "portably" (at least,
easily), but they would be restorable on a different (or
reinstalled) Win32 machine; the restore would require setup
of an EFS recovery agent in advance, of course, and this shall
975 be clearly reflected in the documentation, but this is the
976 normal Windows SysAdmin's business.
977 When "portable" backup is requested the EFS-encrypted files
978 shall be clearly reported as errors.
979 See MSDN on the "Backup and Restore of Encrypted Files" topic:
980 http://msdn.microsoft.com/en-us/library/aa363783.aspx
Maybe the EFS support requires a new flag in the database for
such files.
983 Unfortunately, the implementation is not as straightforward as
984 1-to-1 replacement of BackupRead with ReadEncryptedFileRaw,
985 requiring some FD code rewrite to work with
986 encrypted-file-related callback functions.
989 Item 30: Implement a Storage device like Amazon's S3.
991 Origin: Soren Hansen <soren@ubuntu.com>
What: Enable the storage daemon to store backup data on Amazon's
S3 service.
996 Why: Amazon's S3 is a cheap way to store data off-site.
998 Notes: If we configure the Pool to put only one job per volume (they don't
support an append operation), and the volume size isn't too big (100MB?),
it should be easy to adapt the disk-changer script to add a get/put
procedure with curl. That way, the data would be safely copied to the cloud.
The cloud should only be used with Copy jobs; users should always have
a copy of their data at their own site.
We should also think about having our own cache, trying always to keep
the cloud volumes on the local disk. (I don't know if users want to
store 100GB in the cloud, so disk size shouldn't be a problem.) For
example, if bacula wants to recycle a volume, it will start by
downloading the file only to truncate it a few seconds later; we should
avoid that if we can.
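A minimal sketch of the curl get/put idea for a disk-changer-style
script (assumed URL layout; real S3 requests also need signed
authentication headers):

   #!/bin/sh
   # $1 = load|unload, $2 = volume name, $3 = local volume directory
   case "$1" in
     load)   curl -f -o "$3/$2" "https://s3.example.com/bacula/$2" ;;
     unload) curl -f -T "$3/$2" "https://s3.example.com/bacula/$2" ;;
   esac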
1013 Item 31: Convert tray monitor on Windows to a stand alone program
1018 What: Separate Win32 tray monitor to be a separate program.
1020 Why: Vista does not allow SYSTEM services to interact with the
1021 desktop, so the current tray monitor does not work on Vista
1024 Notes: Requires communicating with the FD via the network (simulate
1025 a console connection).
1028 Item 32: Relabel disk volume after recycling
1029 Origin: Pasi Kärkkäinen <pasik@iki.fi>
1031 Status: Not implemented yet, no code written.
1033 What: The ability to relabel the disk volume (and thus rename the file on the
1034 disk) after it has been recycled. Useful when you have a single job
1035 per disk volume, and you use a custom Label format, for example:
1037 "${Client}-${Level}-${NumVols:p/4/0/r}-${Year}_${Month}_${Day}-${Hour}_${Minute}"
1039 Why: Disk volumes in Bacula get the label/filename when they are used for the
1040 first time. If you use recycling and custom label format like above,
1041 the disk volume name doesn't match the contents after it has been
1042 recycled. This feature makes it possible to keep the label/filename
1043 in sync with the content and thus makes it easy to check/monitor the
1044 backups from the shell and/or normal file management tools, because
1045 the filenames of the disk volumes match the content.
1047 Notes: The configuration option could be "Relabel after Recycling = Yes".
1049 Item 33: Command that releases all drives in an autochanger
1050 Origin: Blake Dunlap (blake@nxs.net)
1054 What: It would be nice if there was a release command that
1055 would release all drives in an autochanger instead of having to
1056 do each one in turn.
1058 Why: It can take some time for a release to occur, and the
commands must be given for each drive in turn, which can quickly
add up if there are several drives in the library. (Having to
watch the console to give each command can waste a good bit of
time once you get into the 16-drive range, where each tape can
take up to 3 minutes to eject.)
1065 Notes: Due to the way some autochangers/libraries work, you
1066 cannot assume that new tapes inserted will go into slots that are
1067 not currently believed to be in use by bacula (the tape from that
1068 slot is in a drive). This would make any changes in
1069 configuration quicker/easier, as all drives need to be released
1070 before any modifications to slots.
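A possible console form (hypothetical keyword; today release must be
issued once per drive):

   *release storage=NeoLibrary alldrives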
1072 Item 34: Run bscan on a remote storage daemon from within bconsole.
1073 Date: 07 October 2009
1074 Origin: Graham Keeling <graham@equiinet.com>
1077 What: The ability to be able to run bscan on a remote storage daemon from
1078 within bconsole in order to populate your catalog.
1080 Why: Currently, it seems you have to:
1081 a) log in to a console on the remote machine
1082 b) figure out where the storage daemon config file is
1083 c) figure out the storage device from the config file
1084 d) figure out the catalog IP address
1085 e) figure out the catalog port
1086 f) open the port on the catalog firewall
g) configure the catalog database to accept connections from the
remote machine
1089 h) build a 'bscan' command from (b)-(e) above and run it
It would be much nicer to be able to type something like this into
bconsole:
1092 *bscan storage=<storage> device=<device> volume=<volume>
or:
*bscan storage=<storage> all
1095 It seems to me that the scan could also do a better job than the
1096 external bscan program currently does. It would possibly be able to
deduce some extra details, such as the catalog StorageId for the
volume.
1100 Notes: (Kern). If you need to do a bscan, you have done something wrong,
1101 so this functionality should not need to be integrated into the
Storage daemon. However, I am not opposed to someone implementing
1103 this feature providing that all the code is in a shared object (or dll)
1104 and does not add significantly to the size of the Storage daemon. In
1105 addition, the code should be written in a way such that the same source
1106 code is used in both the bscan program and the Storage daemon to avoid
1107 adding a lot of new code that must be maintained by the project.
1109 Item 35: Implement a Migration job type that will create a reverse
1110 incremental (or decremental) backup from two existing full backups.
1111 Date: 05 October 2009
1112 Origin: Griffith College Dublin. Some sponsorship available.
1113 Contact: Gavin McCullagh <gavin.mccullagh@gcd.ie>
1116 What: The ability to take two full backup jobs and derive a reverse
incremental backup from them. The older full backup data may then
be discarded.
1120 Why: Long-term backups based on keeping full backups can be expensive in
1121 media. In many cases (eg a NAS), as the client accumulates files
1122 over months and years, the same file will be duplicated unchanged,
across many media and datasets. E.g., less than 10% (and
1124 shrinking) of our monthly full mail server backup is new files,
1125 the other 90% is also in the previous full backup.
1126 Regularly converting the oldest full backup into a reverse
1127 incremental backup allows the admin to keep access to old backup
1128 jobs, but remove all of the duplicated files, freeing up media.
1130 Notes: This feature was previously discussed on the bacula-devel list
1131 here: http://www.mail-archive.com/bacula-devel@lists.sourceforge.net/msg04962.html
1133 Item 36: Job migration between different SDs
1134 Origin: Mariusz Czulada <manieq AT wp DOT eu>
What: Allow migration jobs to use devices on a Storage Daemon other than
the one used for the migrated jobs (possibly on a different/distant host)
Why: Sometimes we have more than one system which requires backup
1142 implementation. Often, these systems are functionally unrelated and
1143 placed in different locations. Having a big backup device (a tape
1144 library) in each location is not cost-effective. It would be much
1145 better to have one powerful enough tape library which could handle
1146 backups from all systems, assuming relatively fast and reliable WAN
1147 connections. In such architecture backups are done in service windows
on local bacula servers, then migrated to central storage during
off-peak hours.
Notes: If migration to a different SD works, migration to the same SD,
as now, could be done the same way (i.e. via 'localhost') to unify the
implementation.
Item 37: Concurrent spooling and despooling within a single job.
1157 Origin: Jesper Krogh <jesper@krogh.cc>
1159 What: When a job has spooling enabled and the spool area size is
less than the total volume size, the storage daemon will:
1161 1) Spool to spool-area
2) Despool from the spool-area to tape
3) Go to 1 if there is more data to be backed up.
1165 Typical disks will serve data with a speed of 100MB/s when
dealing with large files; the network is typically capable of 115MB/s
1167 (GbitE). Tape drives will despool with 50-90MB/s (LTO3) 70-120MB/s
1168 (LTO4) depending on compression and data.
1170 As bacula currently works it'll hold back data from the client until
de-spooling is done, no matter whether the spool area could handle another
1172 block of data. Say given a FileSet of 4TB and a spool-area of 100GB and
1173 a Maximum Job Spool Size set to 50GB then above sequence could be
1174 changed to allow to spool to the other 50GB while despooling the first
1175 50GB and not holding back the client while doing it. As above numbers
1176 show, depending on tape-drive and disk-arrays this potentially leads to
1177 a cut of the backup-time of 50% for the individual jobs.
1179 Real-world example, backing up 112.6GB (large files) to LTO4 tapes
(despools with ~75MB/s; data is gzipped on the remote filesystem).
1181 Maximum Job Spool Size = 8GB
1185 Elapsed time (total time): 46m 15s => 2775s
1186 Despooling time: 25m 41s => 1541s (55%)
1187 Spooling time: 20m 34s => 1234s (45%)
1188 Reported speed: 40.58MB/s
1189 Spooling speed: 112.6GB/1234s => 91.25MB/s
1190 Despooling speed: 112.6GB/1541s => 73.07MB/s
1192 So disk + net can "keep up" with the LTO4 drive (in this test)
The proposed change would effectively make the backup run in the "despooling
1195 time" 1541s giving a reduction to 55% of the total run time.
1197 In the situation where the individual job cannot keep up with LTO-drive
spooling enables efficient multiplexing of multiple concurrent jobs onto
the same drive.
Why: When dealing with larger volumes the general utilization of the
network/disk is important to maximize in order to be able to run a full
backup over a weekend. The current work-around is to split the FileSet
into smaller FileSets and Jobs, but that leads to more configuration
management and is harder to review for completeness. It also makes
restores more complex.
1208 Item 39: Extend the verify code to make it possible to verify
1209 older jobs, not only the last one that has finished
1211 Origin: Ralf Gross (Ralf-Lists <at> ralfgross.de)
1212 Status: not implemented or documented
1214 What: At the moment a VolumeToCatalog job compares only the
1215 last job with the data in the catalog. It's not possible
1216 to compare the data (md5sums) of an older volume with the
1217 data in the catalog.
1219 Why: If a verify job fails, one has to immediately check the
1220 source of the problem, fix it and rerun the verify job.
1221 This has to happen before the next backup of the
1222 verified backup job starts.
1223 More important: It's not possible to check jobs that are
kept for a long time (archive). If a jobid could be
specified for a verify job, older backups/tapes could be
checked on a regular basis.
1228 Notes: verify documentation:
1229 VolumeToCatalog: This level causes Bacula to read the file
1230 attribute data written to the Volume from the last Job [...]
1232 Verify Job = <Job-Resource-Name> If you run a verify job
1233 without this directive, the last job run will be compared
1234 with the catalog, which means that you must immediately
1235 follow a backup by a verify command. If you specify a Verify
1236 Job Bacula will find the last job with that name that ran [...]
1238 example bconsole verify dialog:
1241 JobName: VerifyServerXXX
1242 Level: VolumeToCatalog
1243 Client: ServerXXX-fd
1244 FileSet: ServerXXX-Vol1
1245 Pool: Full (From Job resource)
1246 Storage: Neo4100 (From Pool resource)
1247 Verify Job: ServerXXX-Vol1
1249 When: 2009-04-20 09:03:04
1251 OK to run? (yes/mod/no): m
1252 Parameters to modify:
1265 Item 40: Separate "Storage" and "Device" in the bacula-dir.conf
1267 Origin: "James Harper" <james.harper@bendigoit.com.au>
1268 Status: not implemented or documented
What: Separate "Storage" and "Device" in the bacula-dir.conf
      The resulting config would look something like:

      Storage {
        Name = name_of_server
        Address = hostname/IP address
        Password = shh_its_a_secret
        Maximum Concurrent Jobs = 7
      }

      Device {
        Name = name_of_device
        Storage = name_of_server
        Device = name_of_device_on_sd
        Media Type = media_type
        Maximum Concurrent Jobs = 1
      }
1289 Maximum Concurrent Jobs would be specified with a server and a device
1290 maximum, which would both be honoured by the director. Almost everything
1291 that mentions a 'Storage' would need to be changed to 'Device', although
perhaps a 'Storage' would just be a synonym for 'Device' for backwards
compatibility.
1295 Why: If you have multiple Storage definitions pointing to different
1296 Devices in the same Storage daemon, the "status storage" command
prompts for each different device, but they all give the same
output.
1302 Item 41: Least recently used device selection for tape drives in autochanger.
1303 Date: 12 October 2009
1304 Origin: Thomas Carter <tcarter@memc.com>
1307 What: A better tape drive selection algorithm for multi-drive
1308 autochangers. The AUTOCHANGER class contains an array list of tape
1309 devices. When a tape drive is needed, this list is always searched in
order. This causes lower-numbered drives (specifically drive 0) to do a
1311 majority of the work with higher numbered drives possibly never being
1312 used. When a drive in an autochanger is reserved for use, its entry should
be moved to the end of the list; this would give a rough LRU drive
selection.
1316 Why: The current implementation places a majority of use and wear on drive
1317 0 of a multi-drive autochanger.
1321 ========= New items after last vote ====================
1324 ========= Add new items above this line =================
1327 ============= Empty Feature Request form ===========
1328 Item n: One line summary ...
1329 Date: Date submitted
1330 Origin: Name and email of originator.
1333 What: More detailed explanation ...
1335 Why: Why it is important ...
1337 Notes: Additional notes or features (omit if not used)
1338 ============== End Feature Request form ==============
1341 ========== Items put on hold by Kern ============================
1344 ========== Items completed in version 5.0.0 ====================
1345 *Item 2: 'restore' menu: enter a JobId, automatically select dependents
1346 *Item 5: Deletion of disk Volumes when pruned (partial -- truncate when pruned)
1347 *Item 6: Implement Base jobs
1348 *Item 10: Restore from volumes on multiple storage daemons
1349 *Item 15: Enable/disable compression depending on storage device (disk/tape)
1350 *Item 20: Cause daemons to use a specific IP address to source communications
1351 *Item 23: "Maximum Concurrent Jobs" for drives when used with changer device
1352 *Item 31: List InChanger flag when doing restore.
1353 *Item 35: Port bat to Win32