Projects:
Bacula Projects Roadmap
- 07 December 2005
- (prioritized by user vote)
+ Prioritized by user vote 07 December 2005
+ Status updated 30 July 2006
Summary:
Item 1: Implement data encryption (as opposed to comm encryption)
Item 1: Implement data encryption (as opposed to comm encryption)
Date: 28 October 2005
Origin: Sponsored by Landon and 13 contributors to EFF.
- Status: Landon Fuller is currently implementing this.
+ Status: Done: Landon Fuller has implemented this in 1.39.x.
What: Currently the data that is stored on the Volume is not
encrypted. For confidentiality, encryption of data at
Origin: Sponsored by Riege Software International GmbH. Contact:
Daniel Holtkamp <holtkamp at riege dot com>
Date: 28 October 2005
- Status: Partially coded in 1.37 -- much more to do. Assigned to
+ Status: 90% complete: Working in 1.39, more to do. Assigned to
Kern.
What: The ability to copy, move, or archive data that is on a
but one additional database record must be written, which does
not need any database change.
+ Kern: see if we can correct restoration of directories if
+ replace=ifnewer is set. Currently, if the directory does not
+ exist, a "dummy" directory is created, then when all the files
+ are updated, the dummy directory is newer so the real values
+ are not updated.
+
Item 4: Implement a Bacula GUI/management tool using Python.
Origin: Kern
Date: 28 October 2005
- Status:
+ Status: Lucus is working on this for Python GTK+.
What: Implement a Bacula console, and management tools
using Python and Qt or GTK.
Item 9: Implement new {Client}Run{Before|After}Job feature.
Date: 26 September 2005
- Origin: Phil Stracchino <phil.stracchino at speakeasy dot net>
- Status:
+ Origin: Phil Stracchino
+ Status: Done. This has been implemented by Eric Bollengier
What: Some time ago, there was a discussion of RunAfterJob and
ClientRunAfterJob, and the fact that they do not run after failed
ClientRunAfterFailedJob directive), but to my knowledge these
were never implemented.
+ The current implementation does not make it easy to add new features.
+
An alternate way of approaching the problem has just occurred to
me. Suppose the RunBeforeJob and RunAfterJob directives were
- expanded in a manner something like this example:
+ expanded in a manner like this example:
- RunBeforeJob {
+ RunScript {
Command = "/opt/bacula/etc/checkhost %c"
- RunsOnClient = No
- RunsAtJobLevels = All # All, Full, Diff, Inc
- AbortJobOnError = Yes
+ RunsOnClient = No # default
+ AbortJobOnError = Yes # default
+ RunsWhen = Before
}
- RunBeforeJob {
+ RunScript {
Command = c:/bacula/systemstate.bat
RunsOnClient = yes
- RunsAtJobLevels = All # All, Full, Diff, Inc
AbortJobOnError = No
+ RunsWhen = After
+ RunsOnFailure = yes
}
- RunAfterJob {
+ RunScript {
Command = c:/bacula/deletestatefile.bat
- RunsOnClient = Yes
- RunsAtJobLevels = All # All, Full, Diff, Inc
- RunsOnSuccess = Yes
- RunsOnFailure = Yes
- }
- RunAfterJob {
- Command = c:/bacula/somethingelse.bat
- RunsOnClient = Yes
- RunsAtJobLevels = All
- RunsOnSuccess = No
- RunsOnFailure = Yes
- }
- RunAfterJob {
- Command = "/opt/bacula/etc/checkhost -v %c"
- RunsOnClient = No
- RunsAtJobLevels = All
- RunsOnSuccess = No
- RunsOnFailure = Yes
+ Target = rico-fd
+ RunsWhen = Always
}
+ It's now possible to specify more than 1 command per Job.
+ (you can stop your database and your webserver without a script)
+
+ Example:
+ Job {
+ Name = "Client1"
+ JobDefs = "DefaultJob"
+ Write Bootstrap = "/tmp/bacula/var/bacula/working/Client1.bsr"
+ FileSet = "Minimal"
+
+ RunBeforeJob = "echo test before ; echo test before2"
+ RunBeforeJob = "echo test before (2nd time)"
+ RunBeforeJob = "echo test before (3rd time)"
+ RunAfterJob = "echo test after"
+ ClientRunAfterJob = "echo test after client"
+
+ RunScript {
+ Command = "echo test RunScript in error"
+ Runsonclient = yes
+ RunsOnSuccess = no
+ RunsOnFailure = yes
+ RunsWhen = After # never by default
+ }
+ RunScript {
+ Command = "echo test RunScript on success"
+ Runsonclient = yes
+ RunsOnSuccess = yes # default
+ RunsOnFailure = no # default
+ RunsWhen = After
+ }
+ }
Why: It would be a significant change to the structure of the
directives, but allows for a lot more flexibility, including
succeeds, or RunBefore tasks that still allow the job to run even
if that specific RunBefore fails.
- Notes: By Kern: I would prefer to have a single new Resource called
- RunScript. More notes from Phil:
+ Notes: (More notes from Phil, Kern, David and Eric)
+ I would prefer to have a single new Resource called
+ RunScript.
- RunBeforeJob = yes|no
- RunAfterJob = yes|no
- RunsAtJobLevels = All|Full|Diff|Inc
+ RunsWhen = After|Before|Always
+ RunsAtJobLevels = All|Full|Diff|Inc # not yet implemented
The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives
- could be optional, and possibly RunsWhen as well.
+ could be optional, and possibly RunsWhen as well.
AbortJobOnError would be ignored unless RunsWhen was set to Before
- (or RunsBefore Job set to Yes), and would default to Yes if
- omitted. If AbortJobOnError was set to No, failure of the script
+ and would default to Yes if omitted.
+ If AbortJobOnError was set to No, failure of the script
would still generate a warning.
RunsOnSuccess would be ignored unless RunsWhen was set to After
Allow having the before/after status on the script command
line so that the same script can be used both before/after.
- David Boyes.
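The defaults and precedence rules discussed in the notes above can be sketched in Python. This is only an illustration of the rules as described in this item, not Bacula's implementation. The `RunScript` dataclass, the `should_run` and `aborts_job` helpers, and the reading of `Always` as "both before and after the job" are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RunScript:
    """Hypothetical model of a RunScript block; defaults follow the example comments."""
    command: str
    runs_when: str = "Never"         # Before | After | Always ("never by default")
    runs_on_client: bool = False     # default No
    runs_on_success: bool = True     # default Yes
    runs_on_failure: bool = False    # default No
    abort_job_on_error: bool = True  # default Yes, only meaningful before the job


def should_run(script: RunScript, phase: str, job_ok: bool = True) -> bool:
    """Decide whether a script runs in the given phase ("Before" or "After")."""
    if script.runs_when not in (phase, "Always"):
        return False
    if phase == "Before":
        # RunsOnSuccess / RunsOnFailure are ignored before the job.
        return True
    return script.runs_on_success if job_ok else script.runs_on_failure


def aborts_job(script: RunScript, phase: str, script_ok: bool) -> bool:
    """AbortJobOnError only matters for a failed script that ran before the job."""
    return phase == "Before" and not script_ok and script.abort_job_on_error


# The "in error" script from the example Job: runs after the job, on failure only.
on_error = RunScript(
    command="echo test RunScript in error",
    runs_when="After",
    runs_on_client=True,
    runs_on_success=False,
    runs_on_failure=True,
)
```

Keeping the run decision and the abort decision in separate functions mirrors the note that a failed script with AbortJobOnError set to No would still only generate a warning.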
Item 10: Merge multiple backups (Synthetic Backup or Consolidation).
Origin: Marc Cousin and Eric Bollengier
Date: 15 November 2005
- Status: Depends on first implementing project Item 1 (Migration).
+ Status: Awaiting implementation. Depends on first implementing
+ project Item 2 (Migration).
What: A merged backup is a backup made without connecting to the Client.
It would be a Merge of existing backups into a single backup.
Date: 11 November 2005
Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
Marek Bajon <mbajon at bimsplus dot com dot pl>
- Status: RFC
+ Status:
What: Currently when a file changes, the entire file will be backed up in
the next incremental or full backup. To save space on the tapes
Item 14: Implement red/black binary tree routines.
Date: 28 October 2005
Origin: Kern
- Status:
+ Status: Class code is complete. Code needs to be integrated into
+ restore tree code.
What: Implement a red/black binary tree class. This could
then replace the current binary insert/search routines
Item 15: Add support for FileSets in user directories CACHEDIR.TAG
Origin: Norbert Kiesel <nkiesel at tbdnetworks dot com>
Date: 21 November 2005
- Status:
+ Status: (I think this is better done using a Python event that I
+ will implement in version 1.39.x).
What: CACHEDIR.TAG is a proposal for identifying directories which
should be ignored for archiving/backup. It works by ignoring
Item 16: Implement extraction of Win32 BackupWrite data.
Origin: Thorsten Engel <thorsten.engel at matrix-computer dot com>
Date: 28 October 2005
- Status: Assigned to Thorsten. Implemented in current CVS
+ Status: Done. Assigned to Thorsten. Implemented in current CVS
What: This provides the Bacula File daemon with code that
can pick apart the stream output that Microsoft writes
Item 22: Permit multiple Media Types in an Autochanger
Origin: Kern
- Status: Now implemented
+ Status: Done. Implemented in 1.38.9 (I think).
What: Modify the Storage daemon so that multiple Media Types
can be specified in an autochanger. This would be somewhat
do a Bacula restore. By excluding the base OS files, the
backup set will be *much* smaller.
+
+============= Empty Feature Request form ===========
+Item n: One line summary ...
+ Date: Date submitted
+ Origin: Name and email of originator.
+ Status:
+
+ What: More detailed explanation ...
+
+ Why: Why it is important ...
+
+ Notes: Additional notes or features (omit if not used)
+============== End Feature Request form ==============
+
+
+===============================================
+Feature requests submitted after cutoff for December 2005 vote
+ and not yet discussed.
===============================================
-Not in Dec 2005 Vote:
Item n: Allow skipping execution of Jobs
Date: 29 November 2005
Origin: Florian Schnabel <florian.schnabel at docufy dot de>
that would be really handy, other jobs could proceed normally
and you won't get errors that way.
-============= Empty Feature Request form ===========
-Item n: One line summary ...
- Date: Date submitted
- Origin: Name and email of originator.
- Status:
+===================================================
+
+Item n: archive data
+
+ Origin: calvin streeting calvin at absentdream dot com
+ Date: 15/5/2006
+
+ What: The ability to archive to media (DVD/CD) in an uncompressed format
+ for dead filing (archiving, not backing up).
+
+ Why: At my workplace, when jobs are finished they are moved off the main
+ file servers (RAID-based systems) onto a simple Linux file server
+ (IDE-based system) so users can find old information without
+ contacting the IT dept.
+
+ So this data doesn't really change, it only gets added to,
+ but it also needs backing up. At the moment it takes
+ about 8 hours to back up our servers (working data), so
+ rather than add more time to existing backups I am trying
+ to implement a system where we back up the archive data to
+ CD/DVD. These disks would only need to be appended to
+ (burn only new/changed files to new disks for off-site
+ storage). Basically, understand the difference between
+ archive data and live data.
+
+ Notes: Scan the data and email me when it needs burning.
+ Divide it into predefined chunks.
+ Keep a record of what is on what disk.
+ Make me a label (simple php->mysql=>pdf stuff); I could do this bit.
+ Ability to save data uncompressed so it can be read in any other
+ system (future-proof data).
+ Save the catalog with the disk as some kind of menu system.
+
+Item : Tray monitor window cleanups
+ Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
+ Date: 24 July 2006
+ Status:
+ What: Resizeable and scrollable windows in the tray monitor.
- What: More detailed explanation ...
+ Why: With multiple clients, or with many jobs running, the displayed
+ window often ends up larger than the available screen, making
+ the trailing items difficult to read.
- Why: Why it is important ...
+ Notes:
- Notes: Additional notes or features (omit if not used)
-============== End Feature Request form ==============
+ Item : Clustered file-daemons
+ Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
+ Date: 24 July 2006
+ Status:
+ What: A "virtual" filedaemon, which is actually a cluster of real ones.
+
+ Why: In the case of clustered filesystems (SAN setups, GFS, OCFS2, etc.),
+ multiple machines may have access to the same set of filesystems.
+
+ For performance reasons, one may wish to initiate backups from
+ several of these machines simultaneously, instead of just using
+ one backup source for the common clustered filesystem.
+
+ For obvious reasons, normally backups of $A-FD/$PATH and
+ $B-FD/$PATH are treated as different backup sets. In this case
+ they are the same communal set.
+
+ Likewise when restoring, it would be easier to just specify
+ one of the cluster machines and let bacula decide which to use.
+
+ This can be faked to some extent using DNS round robin entries
+ and a virtual IP address, however it means "status client" will
+ always give bogus answers. Additionally there is no way of
+ spreading the load evenly among the servers.
+
+ What is required is something similar to the storage daemon
+ autochanger directives, so that Bacula can keep track of
+ operating backups/restores and direct new jobs to a "free"
+ client.
+
+ Notes:
+
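One simple way to direct a new job to a "free" client, as the clustered file-daemon request asks, is to pick the cluster member with the fewest running jobs. The sketch below is purely illustrative. The `pick_free_client` helper, the job-count dictionary, and the client names are hypothetical and not part of Bacula.

```python
from typing import Dict


def pick_free_client(running_jobs: Dict[str, int]) -> str:
    """Return the cluster member with the fewest running jobs.

    Ties are broken alphabetically by client name so the choice is stable.
    """
    if not running_jobs:
        raise ValueError("cluster has no members")
    # min() returns the first minimal element, so iterate names in sorted order.
    return min(sorted(running_jobs), key=lambda name: running_jobs[name])


# Hypothetical cluster: three file daemons sharing one clustered filesystem.
cluster_load = {"b-fd": 1, "a-fd": 1, "c-fd": 3}
chosen = pick_free_client(cluster_load)
```

A least-loaded choice spreads jobs across the members, which DNS round robin cannot do because it has no knowledge of which jobs are still running.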
+Item: Commercial database support
+ Origin: Russell Howe <russell_howe dot wreckage dot org>
+ Date: 26 July 2006
+ Status:
+
+ What: It would be nice for the database backend to support more
+ databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
+ DB2, MaxDB, etc are all candidates. SQL Server would presumably be
+ implemented using FreeTDS or maybe an ODBC library?
+
+ Why: We only really have one database server, which is MS SQL Server
+ 2000. Maintaining a second one just for the backup software is a
+ burden (we grew out of SQLite, which I liked, but which didn't work so
+ well with our database size). We don't really have a machine with the
+ resources to run postgres, and would rather only maintain a single
+ DBMS. We're stuck with
+ SQL Server because pretty much all the company's custom applications
+ (written by consultants) are locked into SQL Server 2000. I can imagine
+ this scenario is fairly common, and it would be nice to use the existing
+ properly specced database server for storing Bacula's catalog, rather
+ than having to run a second DBMS.
+
+
+Item n: Split documentation
+ Origin: Maxx <maxxatworkat gmail dot com>
+ Date: 27th July 2006
+ Status:
+
+ What: Split the documentation into several books.
+
+ Why: The Bacula manual now has more than 600 pages, and looking for
+ implementation details is getting complicated. I think
+ it would be good to split the single volume into two or
+ maybe three parts:
+
+ 1) Introduction, requirements and tutorial, which are typically
+ useful only until first installation time
+
+ 2) Basic installation and configuration, with all the
+ gory details about the directives supported
+
+ 3) Advanced Bacula: testing, troubleshooting, GUI and
+ ancillary programs, security management, scripting,
+ etc.
+
+ Notes:
+
+Item n: Include an option to operate on all pools when doing
+ update vol parameters
+
+ Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
+ Date: 16 August 2006
+ Status:
+
+ What: When I do update -> Volume parameters -> All Volumes
+ from Pool, then I have to select pools one by one. I'd like
+ console to have an option like "0: All Pools" in the list of
+ defined pools.
+
+ Why: I have many pools and am therefore unhappy with manually
+ updating each of them using update -> Volume parameters -> All
+ Volumes from Pool -> pool #.
+
+Item n: Automatic promotion of backup levels
+ Date: 19 January 2006
+ Origin: Adam Thornton <athornton@sinenomine.net>
+ Status: Blue sky
+
+ What: Amanda has a feature whereby it estimates the space that a
+ differential, incremental, and full backup would take. If the
+ difference in space required between the scheduled level and the next
+ level up is beneath some user-defined critical threshold, the backup
+ level is bumped to the next type. Doing this minimizes the number of
+ volumes necessary during a restore, with a fairly minimal cost in
+ backup media space.
+
+ Why: I know at least one (quite sophisticated and smart) user
+ for whom the absence of this feature is a deal-breaker in terms of
+ using Bacula; if we had it it would eliminate the one cool thing
+ Amanda can do and we can't (at least, the one cool thing I know of).
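The promotion rule described under "What" can be sketched as follows. This is an illustration only: the `promote_level` helper, the level ordering, and the size figures are assumptions, and the sketch promotes by at most one level, which is how the request describes it.

```python
from typing import Dict

# Levels ordered from smallest to largest backup (assumed ordering).
LEVELS = ["Incremental", "Differential", "Full"]


def promote_level(scheduled: str, estimates: Dict[str, int], threshold: int) -> str:
    """Bump the scheduled level one step up if the extra space is below threshold.

    estimates maps each level name to its estimated size; threshold is the
    user-defined limit, in the same unit as the estimates.
    """
    index = LEVELS.index(scheduled)
    if index + 1 == len(LEVELS):
        return scheduled  # already a Full backup, nothing to promote to
    next_level = LEVELS[index + 1]
    extra = estimates[next_level] - estimates[scheduled]
    return next_level if extra < threshold else scheduled


# Hypothetical estimates in megabytes.
sizes = {"Incremental": 100, "Differential": 120, "Full": 900}
level = promote_level("Incremental", sizes, threshold=50)
```

The estimates themselves would have to come from something like an estimate pass over the FileSet; how Bacula would obtain them is outside this sketch.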
+
+
+
+
+Item n+1: Incorporation of XACML2/SAML2 parsing
+ Date: 19 January 2006
+ Origin: Adam Thornton <athornton@sinenomine.net>
+ Status: Blue sky
+
+ What: XACML is the "eXtensible Access Control Markup Language" and
+ SAML is the "Security Assertion Markup Language"--an XML standard
+ for making statements about identity and authorization. Having these
+ would give us a framework to approach ACLs in a generic manner, and
+ in a way flexible enough to support the four major sorts of ACLs I
+ see as a concern to Bacula at this point, as well as (probably) to
+ deal with new sorts of ACLs that may appear in the future.
+
+ Why: Bacula is beginning to need to back up systems with ACLs
+ that do not map cleanly onto traditional Unix permissions. I see
+ four sets of ACLs--in general, mutually incompatible with one
+ another--that we're going to need to deal with. These are: NTFS
+ ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs. (Some may question the
+ relevance of AFS; AFS is one of Sine Nomine's core consulting
+ businesses, and having a reputable file-level backup and restore
+ technology for it (as Tivoli is probably going to drop AFS support
+ soon since IBM no longer supports AFS) would be of huge benefit to
+ our customers; we'd most likely create the AFS support at Sine Nomine
+ for inclusion into the Bacula core code (and perhaps some changes to
+ the OpenAFS volserver).)
+
+ Now, obviously, Bacula already handles NTFS just fine. However, I
+ think there's a lot of value in implementing a generic ACL model, so
+ that it's easy to support whatever particular instances of ACLs come
+ down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
+ things arriving in the Linux world in a big way in the near future.
+ XACML, although overcomplicated for our needs, provides this
+ framework, and we should be able to leverage other people's
+ implementations to minimize the amount of work *we* have to do to get
+ a generic ACL framework. Basically, the costs of implementation are
+ high, but they're largely both external to Bacula and already sunk.
+
+Item 1: Add an over-ride in the Schedule configuration to use a
+ different pool for different backup types.
+
+Date: 19 Jan 2005
+Origin: Chad Slater <chad.slater@clickfox.com>
+Status:
+
+ What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
+ would help those of us who use different storage devices for different
+ backup levels cope with the "auto-upgrade" of a backup.
+
+ Why: Assume I add several new devices to be backed up, i.e. several
+ hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
+ stored in a disk set on a 2TB RAID. If you add these devices in the
+ middle of the month, the incrementals are upgraded to "full" backups,
+ but they try to use the same storage device as requested in the
+ incremental job, filling up the RAID holding the differentials. If we
+ could override the Storage parameter for full and/or differential
+ backups, then the Full job would use the proper Storage device, which
+ has more capacity (e.g. an 8TB tape library).
+
+
+Item: Implement multiple numeric backup levels as supported by dump
+Date: 3 April 2006
+Origin: Daniel Rich <drich@employees.org>
+Status:
+What: Dump allows specification of backup levels numerically instead of just
+ "full", "incr", and "diff". In this system, at any given level, all
+ files are backed up that were modified since the last backup of a
+ higher level (with 0 being the highest and 9 being the lowest). A
+ level 0 is therefore equivalent to a full, level 9 an incremental, and
+ the levels 1 through 8 are varying levels of differentials. For
+ bacula's sake, these could be represented as "full", "incr", and
+ "diff1", "diff2", etc.
+
+Why: Support of multiple backup levels would provide for more advanced backup
+ rotation schemes such as "Towers of Hanoi". This would allow better
+ flexibility in performing backups, and can lead to shorter recovery
+ times.
+
+Notes: Legato Networker supports a similar system with full, incr, and 1-9 as
+ levels.
+
+Kern notes: I think this would add very little functionality, but a *lot* of
+ additional overhead to Bacula.
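The dump-style rule in this request (a backup at level N saves everything modified since the most recent backup at a numerically lower level) can be sketched as below. The `reference_time` helper and the history format are hypothetical, chosen only to illustrate the rule.

```python
from typing import List, Optional, Tuple


def reference_time(history: List[Tuple[int, int]], level: int) -> Optional[int]:
    """Return the timestamp a new backup at `level` should compare files against.

    history is a list of (level, timestamp) pairs for earlier backups.
    Level 0 is a full backup, so it has no reference and yields None.
    """
    earlier = [stamp for (lvl, stamp) in history if lvl < level]
    return max(earlier) if earlier else None


# Hypothetical history: (dump level, time the backup ran).
past = [(0, 100), (3, 200), (2, 300), (5, 400)]
since = reference_time(past, level=4)
```

Rotation schemes such as "Towers of Hanoi" are then just a particular sequence of levels fed to this rule.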
+
+Item 1: include JobID in spool file name
+ Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
+ Date: Tue Aug 22 17:13:39 EDT 2006
+ Status:
+
+ What: Change the name of the spool file to include the JobID
+
+ Why: JobIDs are the common key used to refer to jobs, yet the
+ spoolfile name doesn't include that information. The date/time
+ stamp is useful (and should be retained).
+
+
+
+Item 2: include timestamp of job launch in "stat clients" output
+ Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
+ Date: Tue Aug 22 17:13:39 EDT 2006
+ Status:
+
+ What: The "stat clients" command doesn't include any detail on when
+ the active backup jobs were launched.
+
+ Why: Including the timestamp would make it much easier to decide whether
+ a job is running properly.
+
+ Notes: It may be helpful to have the output from "stat clients" formatted
+ more like that from "stat dir" (and other commands), in a column
+ format. The per-client information that's currently shown (level,
+ client name, JobId, Volume, pool, device, Files, etc.) is good, but
+ somewhat hard to parse (both programmatically and visually),
+ particularly when there are many active clients.
+
+Item 1: Filesystemwatch triggered backup.
+ Date: 31 August 2006
+ Origin: Jesper Krogh <jesper@krogh.cc>
+ Status: Unimplemented, depends probably on "client initiated backups"
+
+ What: With inotify and similar filesystem-triggered notification
+ systems, it is possible to have the file daemon monitor
+ filesystem changes and initiate backups.
+
+ Why: There are 2 situations where this is nice to have.
+ 1) It is possible to get a much finer-grained backup than
+ the fixed schedules used now. A file created and deleted
+ a few hours later can automatically be caught.
+
+ 2) The introduced load on the system will probably be
+ distributed more evenly on the system.
+
+ Notes: This can be combined with configuration that specifies
+ something like: "at most every 15 minutes or when changes
+ consumed XX MB".
+
+Item n: Message mailing based on backup types
+Origin: Evan Kaufman <evan.kaufman@gmail.com>
+ Date: January 6, 2006
+Status:
+
+ What: In the "Messages" resource definitions, allowing messages
+ to be mailed based on the type (backup, restore, etc.) and level
+ (full, differential, etc) of job that created the originating
+ message(s).
+
+Why: It would, for example, allow someone's boss to be emailed
+ automatically only when a Full Backup job runs, so he can
+ retrieve the tapes for offsite storage, even if the IT dept.
+ doesn't (or can't) explicitly notify him. At the same time, his
+ mailbox wouldn't be filled by notifications of Verifies, Restores,
+ or Incremental/Differential Backups (which would likely be kept
+ onsite).
+
+Notes:
+ One way this could be done is through additional message types, for example:
+
+ Messages {
+ # email the boss only on full system backups
+ Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
+ !verify, !admin
+ # email us only when something breaks
+ MailOnError = itdept@mycompany.com = all
+ }
+
+
+Item n: Allow inclusion/exclusion of files in a fileset by creation/mod times
+ Origin: Evan Kaufman <evan.kaufman@gmail.com>
+ Date: January 11, 2006
+ Status:
+
+ What: In the vein of the Wild and Regex directives in a Fileset's
+ Options, it would be helpful to allow a user to include or exclude
+ files and directories by creation or modification times.
+
+ You could factor the Exclude=yes|no option in much the same way it
+ affects the Wild and Regex directives. For example, you could exclude
+ all files modified before a certain date:
+
+ Options {
+ Exclude = yes
+ Modified Before = ####
+ }
+
+ Or you could exclude all files created/modified since a certain date:
+
+ Options {
+ Exclude = yes
+ Created Modified Since = ####
+ }
+
+ The format of the time/date could be done several ways, say the number
+ of seconds since the epoch:
+ 1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`
+
+ Or a human readable date in a cryptic form:
+ 20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss
+
+ Why: I imagine a feature like this could have many uses. It would
+ allow a user to do a full backup while excluding the base operating
+ system files, so if I installed a Linux snapshot from a CD yesterday,
+ I'll *exclude* all files modified *before* today. If I need to
+ recover the system, I use the CD I already have, plus the tape backup.
+ Or if, say, a Windows client is hit by a particularly corrosive
+ virus, and I need to *exclude* any files created/modified *since* the
+ time of infection.
+
+ Notes: Of course, this feature would work in concert with other
+ in/exclude rules, and wouldn't override them (or each other).
+
+ Notes: The directives I'd imagine would be along the lines of
+ "[Created] [Modified] [Before|Since] = <date>".
+ So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
+ or 'since'.
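The two proposed date formats and the Before/Since comparison can be sketched in Python. This is a minimal illustration, not a proposed implementation: `parse_when` and `is_excluded` are hypothetical names, and treating the 14-digit YYYYMMDDhhmmss form as UTC is an assumption (the request does not say which time zone applies).

```python
from datetime import datetime, timezone


def parse_when(value: str) -> int:
    """Convert either proposed date format to seconds since the epoch.

    Assumption: a 14-digit value is YYYYMMDDhhmmss in UTC; anything else
    is taken to be epoch seconds already.
    """
    if len(value) == 14 and value.isdigit():
        stamp = datetime.strptime(value, "%Y%m%d%H%M%S")
        return int(stamp.replace(tzinfo=timezone.utc).timestamp())
    return int(value)


def is_excluded(file_time: int, relation: str, when: str) -> bool:
    """Apply an 'Exclude = yes' rule with a Before or Since date directive.

    file_time is the file's ctime or mtime in epoch seconds.
    """
    limit = parse_when(when)
    if relation == "Before":
        return file_time < limit
    if relation == "Since":
        return file_time >= limit
    raise ValueError("relation must be 'Before' or 'Since'")


# Exclude files modified before the cutoff given as epoch seconds.
skip = is_excluded(1137000000, "Before", "1137008553")
```

Normalising both formats to epoch seconds keeps the comparison itself trivial, whichever notation the directive ends up accepting.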