Projects:
Bacula Projects Roadmap
22 November 2005

Below, you will find more information on future projects:

Item 1: Implement a Migration job type that will move the job data from one device to another.
  Origin: Sponsored by Riege Software International GmbH. Contact: Daniel Holtkamp
  Date: 28 October 2005
  Status: Partially coded in 1.37 -- much more to do. Assigned to Kern.
  What: The ability to copy, move, or archive data that is on a device to another device is very important.
  Why: An ISP might want to back up to disk, but after 30 days migrate the data to tape backup and delete it from disk. Bacula should be able to handle this automatically. It needs to know what was put where, and when, and what to migrate -- it is a bit like retention periods. Doing so would allow space to be freed up for current backups while maintaining older data on tape drives.
  Notes: Migration could be triggered by:
    Number of Jobs
    Number of Volumes
    Age of Jobs
    Highwater size (keep total size)
    Lowwater mark

Item 2: Implement extraction of Win32 BackupWrite data.
  Origin: Thorsten Engel
  Date: 28 October 2005
  Status: Assigned to Thorsten. Implemented in current CVS.
  What: This provides the Bacula File daemon with code that can pick apart the stream output that Microsoft writes for BackupWrite data, and thus the data can be read and restored on non-Win32 machines.
  Why: BackupWrite data is the portable=no option in Win32 FileSets, and in previous Bacula versions, this data could only be extracted using a Win32 FD. With this new code, the Windows data can be extracted and restored on any OS.

Item 3: Implement a Bacula GUI/management tool using Python and Qt.
  Origin: Kern
  Date: 28 October 2005
  Status:
  What: Implement a Bacula console and management tools using Python and Qt.
  Why: Don't we already have a wxWidgets GUI? Yes, but it is written in C++ and changes to the user interface must be hand tailored using C++ code.
  By developing the user interface using Qt designer, the interface can be very easily updated and most of the new Python code will be automatically created. The user interface changes become very simple, and only the new features must be implemented. In addition, the code will be in Python, which will give many more users easy (or easier) access to making additions or modifications.

Item 4: Implement a Python interface to the Bacula catalog.
  Date: 28 October 2005
  Origin: Kern
  Status:
  What: Implement an interface for Python scripts to access the catalog through Bacula.
  Why: This will permit users to customize Bacula through Python scripts.

Item 5: Implement more Python events in Bacula.
  Date: 28 October 2005
  Origin:
  Status:
  What: Allow Python scripts to be called at more places within Bacula and provide additional access to Bacula internal variables.
  Why: This will permit users to customize Bacula through Python scripts.
  Notes:
    Recycle event
    Scratch pool event
    NeedVolume event

Item 6: Implement Base jobs.
  Date: 28 October 2005
  Origin: Kern
  Status:
  What: A base job is sort of like a Full save except that you will want the FileSet to contain only files that are unlikely to change in the future (i.e. a snapshot of most of your system after installing it). After the base job has been run, when you are doing a Full save, you specify one or more Base jobs to be used. All files that have been backed up in the Base job/jobs but not modified will then be excluded from the backup. During a restore, the Base jobs will be automatically pulled in where necessary.
  Why: This is something none of the competition does, as far as we know (except perhaps BackupPC, which is a Perl program that saves to disk only). It is a big win for the user; it makes Bacula stand out as offering a unique optimization that immediately saves time and money. Basically, imagine that you have 100 nearly identical Windows or Linux machines containing the OS and user files.
  Now for the OS part, a Base job will be backed up once, and rather than making 100 copies of the OS, there will be only one. If one or more of the systems have some files updated, no problem, they will be automatically restored.
  Notes: Huge savings in tape usage even for a single machine. Will require more resources because the DIR must send FD a list of files/attribs, and the FD must search the list and compare it for each file to be saved.

Item 7: Add Plug-ins to the FileSet Include statements.
  Date: 28 October 2005
  Origin:
  Status: Partially coded in 1.37 -- much more to do.
  What: Allow users to specify wild-card and/or regular expressions to be matched in both the Include and Exclude directives in a FileSet. At the same time, allow users to define plug-ins to be called (based on regular expression/wild-card matching).
  Why: This would give the users the ultimate ability to control how files are backed up/restored. A user could write a plug-in that knows how to back up his Oracle database without stopping/starting it, for example.

Item 8: Implement huge exclude list support using hashing.
  Date: 28 October 2005
  Origin: Kern
  Status:
  What: Allow users to specify very large exclude lists (currently more than about 1000 files is too many).
  Why: This would give the users the ability to exclude all files that are loaded with the OS (e.g. using rpms or debs). If the user can restore the base OS from CDs, there is no need to back up all those files. A complete restore would be to restore the base OS, then do a Bacula restore. By excluding the base OS files, the backup set will be *much* smaller.

Item 9: Implement data encryption (as opposed to communications encryption)
  Date: 28 October 2005
  Origin: Sponsored by Landon and 13 contributors to EFF.
  Status: Landon Fuller is currently implementing this.
  What: Currently the data that is stored on the Volume is not encrypted. For confidentiality, encryption of data at the File daemon level is essential.
  Data encryption encrypts the data in the File daemon during a backup and decrypts the data in the File daemon during a restore.
  Why: Large sites require this.

Item 10: Permit multiple Media Types in an Autochanger
  Origin:
  Status:
  What: Modify the Storage daemon so that multiple Media Types can be specified in an autochanger. This would be a somewhat simplistic implementation in that each drive would still be allowed to have only one Media Type. However, the Storage daemon will ensure that only a drive with the Media Type that matches what the Director specifies is chosen.
  Why: This will permit users with several different drive types to make full use of their autochangers.

Item 11: Allow two different autochanger definitions that refer to the same autochanger.
  Date: 28 October 2005
  Origin: Kern
  Status:
  What: Currently, the autochanger script is locked based on the autochanger. That is, if multiple drives are being simultaneously used, the Storage daemon ensures that only one drive at a time can access the mtx-changer script. This change would base the locking on the control device, rather than the autochanger. It would then permit two autochanger definitions for the same autochanger, but with different drives. Logically, the autochanger could then be "partitioned" for different jobs, clients, or classes of jobs, and if the locking is based on the control device (e.g. /dev/sg0) the mtx-changer script will be locked appropriately.
  Why: This will permit users to partition autochangers for specific use. It would also permit implementation of multiple Media Types with no changes to the Storage daemon.

Item 12: Implement red/black binary tree routines.
  Date: 28 October 2005
  Origin: Kern
  Status:
  What: Implement a red/black binary tree class. This could then replace the current binary insert/search routines used for the in-memory restore tree. This could significantly speed up the creation of the in-memory restore tree.
  Why: Performance enhancement.
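To make the red/black tree proposal in Item 12 more concrete, here is a minimal sketch of insert and search. It is written in Python for brevity rather than in Bacula's own C/C++, it uses the left-leaning red/black variant (an assumption made to keep the example short, not something the item specifies), and every name in it (Node, RedBlackTree, and so on) is hypothetical rather than taken from the Bacula source.

```python
RED, BLACK = True, False


class Node:
    """One tree node; new nodes are always inserted red."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None
        self.color = RED


def is_red(node):
    # Missing children count as black.
    return node is not None and node.color == RED


def rotate_left(h):
    # Turn a right-leaning red link into a left-leaning one.
    x = h.right
    h.right = x.left
    x.left = h
    x.color = h.color
    h.color = RED
    return x


def rotate_right(h):
    # Mirror image of rotate_left.
    x = h.left
    h.left = x.right
    x.right = h
    x.color = h.color
    h.color = RED
    return x


def flip_colors(h):
    # Push the red colour of both children up to the parent.
    h.color = RED
    h.left.color = BLACK
    h.right.color = BLACK


def _insert(h, key, value):
    if h is None:
        return Node(key, value)
    if key < h.key:
        h.left = _insert(h.left, key, value)
    elif key > h.key:
        h.right = _insert(h.right, key, value)
    else:
        h.value = value  # existing key: replace the stored value
    # Restore the red/black invariants on the way back up.
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):
        flip_colors(h)
    return h


class RedBlackTree:
    def __init__(self):
        self.root = None

    def insert(self, key, value):
        self.root = _insert(self.root, key, value)
        self.root.color = BLACK  # the root is always black

    def search(self, key):
        node = self.root
        while node is not None:
            if key < node.key:
                node = node.left
            elif key > node.key:
                node = node.right
            else:
                return node.value
        return None

    def height(self):
        def depth(node):
            if node is None:
                return 0
            return 1 + max(depth(node.left), depth(node.right))
        return depth(self.root)
```

File names in a restore tree tend to arrive in sorted order, which is the worst case for a plain binary tree (it degenerates into a list). With the rebalancing above, inserting keys in sorted order still leaves the tree height logarithmic in the number of entries.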

Item 13: Let Bacula log tape usage and handle drive cleaning cycles.
  Date: November 11, 2005
  Origin: Arno Lehmann
  Status:
  What: Make Bacula manage tape life cycle information and drive cleaning cycles.
  Why: Both parts of this project are important when operating backups. We need to know which tapes need replacement, and we need to make sure the drives are cleaned when necessary. While many tape libraries and even autoloaders can handle all this automatically, support by Bacula can be helpful for smaller (older) libraries and single drives. Also, checking drive status during operation can prevent some failures (as I had to learn the hard way...)
  Notes: First, Bacula could (and even does, to some limited extent) record tape and drive usage. For tapes, the number of mounts, the amount of data, and the time the tape has actually been running could be recorded. Data fields for Read and Write time and Number of mounts already exist in the catalog (I'm not sure if VolBytes is the sum of all bytes ever written to that volume by Bacula). This information can be important when determining which media to replace. For the tape drives known to Bacula, similar information is interesting to determine the device status and expected lifetime: Time it's been Reading and Writing, number of tape Loads / Unloads / Errors. This information is not yet recorded as far as I know.
  The next step would be implementing drive cleaning setup. Bacula already has knowledge about cleaning tapes. Once it has some information about cleaning cycles (measured in drive run time, number of tapes used, or calendar days, for example) it can automatically execute tape cleaning (with an autochanger, obviously) or ask for operator assistance loading a cleaning tape.
  The next step would be to implement TAPEALERT checks not only when changing tapes and only sending the information to the administrator, but rather checking after each tape error, checking on a regular basis (for example after each tape file), and also before unloading and after loading a new tape. Then, depending on the drive's TAPEALERT state and the known drive cleaning state, Bacula could automatically schedule later cleaning, clean immediately, or inform the operator.
  Implementing this would perhaps require another catalog change and perhaps major changes in SD code and the DIR-SD protocol, so I'd only consider this worth implementing if it would actually be used or even needed by many people.

Item 14: Merging of multiple backups into a single one. (Also called Synthetic Backup or Consolidation).
  Origin: Marc Cousin and Eric Bollengier
  Date: 15 November 2005
  Status: Depends on first implementing project Item 1 (Migration).
  What: A merged backup is a backup made without connecting to the Client. It would be a Merge of existing backups into a single backup. In effect, it is like a restore but to the backup medium.
  For instance, say that last Sunday we made a full backup. Then all week long, we created incremental backups, in order to do them fast. Now comes Sunday again, and we need another full. The merged backup makes it possible to do an incremental backup instead (during the night for instance), and then create a merged backup during the day, by using the full and incrementals from the week. The merged backup will be exactly like a full made Sunday night on the tape, but the production interruption on the Client will be minimal, as the Client will only have to send incrementals.
  In fact, if it's done correctly, you could merge all the Incrementals into a single Incremental, or all the Incrementals and the last Differential into a new Differential, or the Full, last Differential and all the Incrementals into a new Full backup.
  And there is no need to involve the Client.
  Why: The benefit is that:
    - the Client just does an incremental;
    - the merged backup on tape is just like a single full backup, and can be restored very fast.
  This is also a way of reducing the backup data since the old data can then be pruned (or not) from the catalog, possibly allowing older volumes to be recycled.

Item 15: Automatic disabling of devices
  Date: 2005-11-11
  Origin: Peter Eriksson
  Status:
  What: After a configurable number of fatal errors with a tape drive, Bacula should automatically disable further use of that tape drive. There should also be "disable"/"enable" commands in the "bconsole" tool.
  Why: On a multi-drive jukebox there is a possibility of tape drives going bad during large backups (needing a cleaning tape run, tapes getting stuck). It would be advantageous if Bacula would automatically disable further use of a problematic tape drive after a configurable number of errors has occurred.
  An example: I have a multi-drive jukebox (6 drives, 380+ slots) where tapes occasionally get stuck inside the drive. Bacula will notice that the "mtx-changer" command fails and then fail any backup jobs trying to use that drive. However, it will still keep on trying to run new jobs using that drive and fail -- forever, thus failing lots and lots of jobs... Since we have many drives, Bacula could have just automatically disabled further use of that drive and used one of the other ones instead.

Item 16: Directive/mode to backup only file changes, not entire file
  Date: 11 November 2005
  Origin: Joshua Kugler, Marek Bajon
  Status: RFC
  What: Currently when a file changes, the entire file will be backed up in the next incremental or full backup. To save space on the tapes it would be nice to have a mode whereby only the changes to the file would be backed up when it is changed.
  Why: This would save lots of space when backing up large files such as logs, mbox files, Outlook PST files and the like.
  Notes: This would require the usage of disk-based volumes as comparing files would not be feasible using a tape drive.

Item 17: Quick release of FD-SD connection
  Origin: Frank Volf (frank at deze dot org)
  Date: 17 November 2005
  Status:
  What: In the Bacula implementation a backup is finished after all data and attributes are successfully written to storage. When using a tape backup it is very annoying that a backup can take a day, simply because the current tape (or whatever) is full and the administrator has not put a new one in. During that time the system cannot be taken off-line, because there is still an open session between the storage daemon and the file daemon on the client.
  Although this is a very good strategy for making "safe backups", it can be annoying for e.g. laptops, which must remain connected until the backup is completed.
  Using a new feature called "migration" it will be possible to spool first to hard disk (using a special 'spool' migration scheme) and then migrate the backup to tape.
  There is still the problem of getting the attributes committed. If it takes a very long time to do, with the current code, the job has not terminated, and the File daemon is not freed up. The Storage daemon should release the File daemon as soon as all the file data and all the attributes have been sent to it (the SD). Currently the SD waits until everything is on tape and all the attributes are transmitted to the Director before signalling completion to the FD. I don't think I would have any problem changing this. The reason is that even if the FD reports back to the Dir that all is OK, the job will not terminate until the SD has done the same thing -- so in a way keeping the SD-FD link open to the very end is not really very productive ...
  Why: Makes backup of laptops much easier.
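The ordering change proposed in Item 17 can be illustrated with a toy model. This is only a sketch under stated assumptions: it is in Python, it does not reflect Bacula's actual FD-SD protocol, and the class and method names (SpoolingStorageDaemon, receive, end_of_data, commit) are hypothetical. The point it shows is that the acknowledgement to the File daemon is sent once all data and attributes have arrived, while the write to the Volume happens later.

```python
class SpoolingStorageDaemon:
    """Toy stand-in for the SD: accepts records into a spool, commits later."""

    def __init__(self):
        self.spool = []           # data/attributes received but not yet on tape
        self.tape = []            # records committed to the (simulated) Volume
        self.fd_released = False  # True once the FD has been told it may go

    def receive(self, record):
        # The FD sends file data and attributes while the session is open.
        if self.fd_released:
            raise RuntimeError("FD session already closed")
        self.spool.append(record)

    def end_of_data(self):
        # Proposed behaviour: acknowledge as soon as everything has arrived,
        # without waiting for the tape write or the attribute hand-off.
        self.fd_released = True
        return "OK"

    def commit(self):
        # Runs later, e.g. once the operator has mounted a new tape.
        self.tape.extend(self.spool)
        self.spool.clear()
        return len(self.tape)
```

In this model a laptop client could call receive() for each record, get "OK" from end_of_data(), and disconnect; commit() can then run hours later without the client being present.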

Item 18: Add support for CACHEDIR.TAG
  Origin: Norbert Kiesel
  Date: 21 November 2005
  Status:
  What: CACHEDIR.TAG is a proposal for identifying directories which should be ignored for archiving/backup. It works by ignoring directory trees which have a file named CACHEDIR.TAG with a specific content. See http://www.brynosaurus.com/cachedir/spec.html for details.
  From Peter Eriksson: I suggest that if this is implemented (I've also asked for this feature some year ago) that it is made compatible with Legato Networker's ".nsr" files where you can specify a lot of options on how to handle files/directories (including denying further parsing of .nsr files lower down into the directory trees). A PDF version of the .nsr man page can be viewed at: http://www.ifm.liu.se/~peter/nsr.pdf
  Why: It's a nice alternative to "exclude" patterns for directories which don't have regular pathnames. Also, it allows users to control backup for themselves. Implementation should be pretty simple. GNU tar >= 1.14 or so supports it, too.
  Notes: I envision this as an optional feature to a fileset specification.

Item 19: Implement new {Client}Run{Before|After}Job feature.
  Date: 26 September 2005
  Origin: Phil Stracchino
  Status:
  What: Some time ago, there was a discussion of RunAfterJob and ClientRunAfterJob, and the fact that they do not run after failed jobs. At the time, there was a suggestion to add a RunAfterFailedJob directive (and, presumably, a matching ClientRunAfterFailedJob directive), but to my knowledge these were never implemented.
  An alternate way of approaching the problem has just occurred to me.
  Suppose the RunBeforeJob and RunAfterJob directives were expanded in a manner something like this example:

    RunBeforeJob {
        Command = "/opt/bacula/etc/checkhost %c"
        RunsOnClient = No
        RunsAtJobLevels = All       # All, Full, Diff, Inc
        AbortJobOnError = Yes
    }
    RunBeforeJob {
        Command = c:/bacula/systemstate.bat
        RunsOnClient = Yes
        RunsAtJobLevels = All       # All, Full, Diff, Inc
        AbortJobOnError = No
    }

    RunAfterJob {
        Command = c:/bacula/deletestatefile.bat
        RunsOnClient = Yes
        RunsAtJobLevels = All       # All, Full, Diff, Inc
        RunsOnSuccess = Yes
        RunsOnFailure = Yes
    }
    RunAfterJob {
        Command = c:/bacula/somethingelse.bat
        RunsOnClient = Yes
        RunsAtJobLevels = All
        RunsOnSuccess = No
        RunsOnFailure = Yes
    }
    RunAfterJob {
        Command = "/opt/bacula/etc/checkhost -v %c"
        RunsOnClient = No
        RunsAtJobLevels = All
        RunsOnSuccess = No
        RunsOnFailure = Yes
    }

  Why: It would be a significant change to the structure of the directives, but allows for a lot more flexibility, including RunAfter commands that will run regardless of whether the job succeeds, or RunBefore tasks that still allow the job to run even if that specific RunBefore fails.
  Notes: By Kern: I would prefer to have a single new Resource called RunScript.
  More notes from Phil:
    RunsWhen = Before|After
    RunsAtJobLevels = All|Full|Diff|Inc
  The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives could be optional, and possibly RunsWhen as well. If omitted, RunsWhen would default to Before.
  AbortJobOnError would be ignored unless RunsWhen was set to Before (or RunsBeforeJob set to Yes), and would default to Yes if omitted. If AbortJobOnError was set to No, failure of the script would still generate a warning.
  RunsOnSuccess would be ignored unless RunsWhen was set to After (or RunsBeforeJob set to No), and would default to Yes.
  RunsOnFailure would be ignored unless RunsWhen was set to After, and would default to No.

============= Empty RFC form ===========
Item n: One line summary ...
  Date: Date submitted
  Origin: Name and email of originator.
  Status:
  What: More detailed explanation ...
  Why: Why it is important ...
  Notes: Additional notes or features (omit if not used)
============== End RFC form ==============

Items completed for release 1.38.0 -- see kernsdone