X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;f=bacula%2Fprojects;h=a7334e20bd7c8960324fcf1859f8e4848ee57b15;hb=30ba4a6f0147e8af1353ed3dbc8a54eb4b25fdee;hp=69b1a556d82f80c09071903ade0225966276d2ee;hpb=d08d3642c66b931918accc895e1440389408fe95;p=bacula%2Fbacula diff --git a/bacula/projects b/bacula/projects index 69b1a556d8..a7334e20bd 100644 --- a/bacula/projects +++ b/bacula/projects @@ -1,191 +1,239 @@ Projects: Bacula Projects Roadmap - Prioritized by user vote 07 December 2005 - Status updated 30 July 2006 + Status updated 25 February 2010 Summary: -Item 1: Implement data encryption (as opposed to comm encryption) -Item 2: Implement Migration that moves Jobs from one Pool to another. -Item 3: Accurate restoration of renamed/deleted files from -Item 4: Implement a Bacula GUI/management tool using Python. -Item 5: Implement Base jobs. -Item 6: Allow FD to initiate a backup -Item 7: Improve Bacula's tape and drive usage and cleaning management. -Item 8: Implement creation and maintenance of copy pools -Item 9: Implement new {Client}Run{Before|After}Job feature. -Item 10: Merge multiple backups (Synthetic Backup or Consolidation). -Item 11: Deletion of Disk-Based Bacula Volumes -Item 12: Directive/mode to backup only file changes, not entire file -Item 13: Multiple threads in file daemon for the same job -Item 14: Implement red/black binary tree routines. -Item 15: Add support for FileSets in user directories CACHEDIR.TAG -Item 16: Implement extraction of Win32 BackupWrite data. -Item 17: Implement a Python interface to the Bacula catalog. -Item 18: Archival (removal) of User Files to Tape -Item 19: Add Plug-ins to the FileSet Include statements. -Item 20: Implement more Python events in Bacula. -Item 21: Quick release of FD-SD connection after backup. -Item 22: Permit multiple Media Types in an Autochanger -Item 23: Allow different autochanger definitions for one autochanger. 
-Item 24: Automatic disabling of devices -Item 25: Implement huge exclude list support using hashing. - - -Below, you will find more information on future projects: - -Item 1: Implement data encryption (as opposed to comm encryption) - Date: 28 October 2005 - Origin: Sponsored by Landon and 13 contributors to EFF. - Status: Done: Landon Fuller has implemented this in 1.39.x. - - What: Currently the data that is stored on the Volume is not - encrypted. For confidentiality, encryption of data at - the File daemon level is essential. - Data encryption encrypts the data in the File daemon and - decrypts the data in the File daemon during a restore. - - Why: Large sites require this. - -Item 2: Implement Migration that moves Jobs from one Pool to another. - Origin: Sponsored by Riege Software International GmbH. Contact: - Daniel Holtkamp - Date: 28 October 2005 - Status: 90% complete: Working in 1.39, more to do. Assigned to - Kern. - - What: The ability to copy, move, or archive data that is on a - device to another device is very important. - - Why: An ISP might want to backup to disk, but after 30 days - migrate the data to tape backup and delete it from - disk. Bacula should be able to handle this - automatically. It needs to know what was put where, - and when, and what to migrate -- it is a bit like - retention periods. Doing so would allow space to be - freed up for current backups while maintaining older - data on tape drives. - - Notes: Riege Software have asked for the following migration - triggers: - Age of Job - Highwater mark (stopped by Lowwater mark?) 
- - Notes: Migration could be additionally triggered by: - Number of Jobs - Number of Volumes - -Item 3: Accurate restoration of renamed/deleted files from - Incremental/Differential backups - Date: 28 November 2005 - Origin: Martin Simmons (martin at lispworks dot com) +* => item complete + +Item 1: Ability to restart failed jobs +Item 2: Scheduling syntax that permits more flexibility and options +Item 3: Data encryption on storage daemon +Item 4: Add ability to Verify any specified Job. +Item 5: Improve Bacula's tape and drive usage and cleaning management +Item 6: Allow FD to initiate a backup +Item 7: Implement Storage daemon compression +Item 8: Reduction of communications bandwidth for a backup +Item 9: Ability to reconnect a disconnected comm line +Item 10: Start spooling even when waiting on tape +Item 11: Include all conf files in specified directory +Item 12: Multiple threads in file daemon for the same job +Item 13: Possibility to schedule Jobs on last Friday of the month +Item 14: Include timestamp of job launch in "stat clients" output +Item 15: Message mailing based on backup types +Item 16: Ability to import/export Bacula database entities +Item 17: Implementation of running Job speed limit. +Item 18: Add an override in Schedule for Pools based on backup types +Item 19: Automatic promotion of backup levels based on backup size +Item 20: Allow FileSet inclusion/exclusion by creation/mod times +Item 21: Archival (removal) of User Files to Tape +Item 22: An option to operate on all pools with update vol parameters +Item 23: Automatic disabling of devices +Item 24: Ability to defer Batch Insert to a later time +Item 25: Add MaxVolumeSize/MaxVolumeBytes to Storage resource +Item 26: Enable persistent naming/number of SQL queries +Item 27: Bacula Dir, FD and SD to support proxies +Item 28: Add Minimum Spool Size directive +Item 29: Handle Windows Encrypted Files using Win raw encryption +Item 30: Implement a Storage device like Amazon's S3. 
+Item 31: Convert tray monitor on Windows to a stand alone program +Item 32: Relabel disk volume after recycling +Item 33: Command that releases all drives in an autochanger +Item 34: Run bscan on a remote storage daemon from within bconsole. +Item 35: Implement a Migration job type that will create a reverse +Item 36: Job migration between different SDs +Item 37: Concurrent spooling and despooling within a single job. +Item 39: Extend the verify code to make it possible to verify +Item 40: Separate "Storage" and "Device" in the bacula-dir.conf +Item 41: Least recently used device selection for tape drives in autochanger. + + +Item 1: Ability to restart failed jobs + Date: 26 April 2009 + Origin: Kern/Eric + Status: + + What: Often jobs fail because of a communications line drop or max run time, + cancel, or some other non-critical problem. Currently any data + saved is lost. This implementation should modify the Storage daemon + so that it saves all the files that it knows are completely backed + up to the Volume. + + The jobs should then be marked as incomplete and a subsequent + Incremental Accurate backup will then take into account all the + previously saved jobs. + + Why: Avoids backing up data already saved. + + Notes: Requires Accurate to restart correctly. Must have a minimum + volume of data or files stored on Volume before enabling. + + +Item 2: Scheduling syntax that permits more flexibility and options + Date: 15 December 2006 + Origin: Gregory Brauer (greg at wildbrain dot com) and + Florian Schnabel Status: - What: When restoring a fileset for a specified date (including "most - recent"), Bacula should give you exactly the files and directories - that existed at the time of the last backup prior to that date. - - Currently this only works if the last backup was a Full backup. 
- When the last backup was Incremental/Differential, files and - directories that have been renamed or deleted since the last Full - backup are not currently restored correctly. Ditto for files with - extra/fewer hard links than at the time of the last Full backup. - - Why: Incremental/Differential would be much more useful if this worked. - - Notes: Item 14 (Merging of multiple backups into a single one) seems to - rely on this working, otherwise the merged backups will not be - truly equivalent to a Full backup. - - Kern: notes shortened. This can be done without the need for - inodes. It is essentially the same as the current Verify job, - but one additional database record must be written, which does - not need any database change. - - Kern: see if we can correct restoration of directories if - replace=ifnewer is set. Currently, if the directory does not - exist, a "dummy" directory is created, then when all the files - are updated, the dummy directory is newer so the real values - are not updated. - -Item 4: Implement a Bacula GUI/management tool using Python. - Origin: Kern - Date: 28 October 2005 - Status: Lucus is working on this for Python GTK+. - - What: Implement a Bacula console, and management tools - using Python and Qt or GTK. - - Why: Don't we already have a wxWidgets GUI? Yes, but - it is written in C++ and changes to the user interface - must be hand tailored using C++ code. By developing - the user interface using Qt designer, the interface - can be very easily updated and most of the new Python - code will be automatically created. The user interface - changes become very simple, and only the new features - must be implement. In addition, the code will be in - Python, which will give many more users easy (or easier) - access to making additions or modifications. - - Notes: This is currently being implemented using Python-GTK by - Lucas Di Pentima - -Item 5: Implement Base jobs. 
- Date: 28 October 2005 - Origin: Kern - Status: - - What: A base job is sort of like a Full save except that you - will want the FileSet to contain only files that are - unlikely to change in the future (i.e. a snapshot of - most of your system after installing it). After the - base job has been run, when you are doing a Full save, - you specify one or more Base jobs to be used. All - files that have been backed up in the Base job/jobs but - not modified will then be excluded from the backup. - During a restore, the Base jobs will be automatically - pulled in where necessary. - - Why: This is something none of the competition does, as far as - we know (except perhaps BackupPC, which is a Perl program that - saves to disk only). It is big win for the user, it - makes Bacula stand out as offering a unique - optimization that immediately saves time and money. - Basically, imagine that you have 100 nearly identical - Windows or Linux machine containing the OS and user - files. Now for the OS part, a Base job will be backed - up once, and rather than making 100 copies of the OS, - there will be only one. If one or more of the systems - have some files updated, no problem, they will be - automatically restored. - - Notes: Huge savings in tape usage even for a single machine. - Will require more resources because the DIR must send - FD a list of files/attribs, and the FD must search the - list and compare it for each file to be saved. - -Item 6: Allow FD to initiate a backup - Origin: Frank Volf (frank at deze dot org) - Date: 17 November 2005 - Status: + What: Currently, Bacula only understands how to deal with weeks of the + month or weeks of the year in schedules. This makes it impossible + to do a true weekly rotation of tapes. There will always be a + discontinuity that will require disruptive manual intervention at + least monthly or yearly because week boundaries never align with + month or year boundaries. 
+ + A solution would be to add a new syntax that defines (at least) + a start timestamp and a repetition period. + + An easy option to skip a certain job on a certain date. + + + Why: Rotated backups done at weekly intervals are useful, and Bacula + cannot currently do them without extensive hacking. + + You could then easily skip tape backups on holidays. Especially + if you have no autochanger and can only fit one backup on a tape, + that would be really handy; other jobs could proceed normally + and you won't get errors that way. + + + Notes: Here is an example syntax showing a 3-week rotation where full + Backups would be performed every week on Saturday, and an + incremental would be performed every week on Tuesday. Each + set of tapes could be removed from the loader for the following + two cycles before coming back and being reused on the third + week. Since the execution times are determined by intervals + from a given point in time, there will never be any issues with + having to adjust to any sort of arbitrary time boundary. In + the example provided, I even define the starting schedule + as crossing both a year and a month boundary, but the run times + would be based on the "Repeat" value and would therefore happen + weekly as desired. + + + Schedule { + Name = "Week 1 Rotation" + #Saturday. Would run Dec 30, Jan 20, Feb 10, etc. + Run { + Options { + Type = Full + Start = 2006-12-30 01:00 + Repeat = 3w + } + } + #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc. + Run { + Options { + Type = Incremental + Start = 2007-01-02 01:00 + Repeat = 3w + } + } + } - What: Provide some means, possibly by a restricted console that - allows a FD to initiate a backup, and that uses the connection - established by the FD to the Director for the backup so that - a Director that is firewalled can do the backup. + Schedule { + Name = "Week 2 Rotation" + #Saturday. Would run Jan 6, Jan 27, Feb 17, etc. 
+ Run { + Options { + Type = Full + Start = 2007-01-06 01:00 + Repeat = 3w + } + } + #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc. + Run { + Options { + Type = Incremental + Start = 2007-01-09 01:00 + Repeat = 3w + } + } + } - Why: Makes backup of laptops much easier. + Schedule { + Name = "Week 3 Rotation" + #Saturday. Would run Jan 13, Feb 3, Feb 24, etc. + Run { + Options { + Type = Full + Start = 2007-01-13 01:00 + Repeat = 3w + } + } + #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc. + Run { + Options { + Type = Incremental + Start = 2007-01-16 01:00 + Repeat = 3w + } + } + } -Item 7: Improve Bacula's tape and drive usage and cleaning management. - Date: 8 November 2005, November 11, 2005 + Notes: Kern: I have merged the previously separate project of skipping + jobs (via Schedule syntax) into this. + + +Item 3: Data encryption on storage daemon + Origin: Tobias Barth + Date: 04 February 2009 + Status: new + + What: The storage daemon should be able to do the data encryption that can + currently be done by the file daemon. + + Why: This would have 2 advantages: + 1) one could encrypt the data of unencrypted tapes by doing a + migration job + 2) the storage daemon would be the only machine that would have + to keep the encryption keys. + + Notes from Landon: + As an addendum to the feature request, here are some crypto + implementation details I wrote up regarding SD-encryption back in Jan + 2008: + http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg28860.html + + +Item 4: Add ability to Verify any specified Job. +Date: 17 January 2008 +Origin: portrix.net Hamburg, Germany. +Contact: Christian Sabelmann +Status: 70% of the required Code is part of the Verify function since v. 2.x + + What: + The ability to tell Bacula which Job it should verify instead of + automatically verifying just the last one. 
+ + Why: + It is sad that such a powerful feature like Verify Jobs + (VolumeToCatalog) is restricted to be used only with the last backup Job + of a client. Currently, users who have to do daily Backups are forced to + also do daily Verify Jobs in order to take advantage of this useful + feature. This Daily Verify after Backup conduct is not always desired + and Verify Jobs have to be sometimes scheduled. (Not necessarily + scheduled in Bacula). With this feature Admins can verify Jobs once a + Week or less per month, selecting the Jobs they want to verify. This + feature is also not too difficult to implement, taking into account older bug + reports about this feature and the selection of the Job to be verified. + + Notes: For the verify Job, the user could select the Job to be verified + from a List of the latest Jobs of a client. It would also be possible to + verify a certain volume. All of these would naturally apply only for + Jobs whose file information is still in the catalog. + + +Item 5: Improve Bacula's tape and drive usage and cleaning management + Date: 8 November 2005, November 11, 2005 Origin: Adam Thornton , Arno Lehmann Status: - What: Make Bacula manage tape life cycle information, tape reuse + What: Make Bacula manage tape life cycle information, tape reuse times and drive cleaning cycles. - Why: All three parts of this project are important when operating + Why: All three parts of this project are important when operating backups. We need to know which tapes need replacement, and we need to make sure the drives are cleaned when necessary. While many @@ -197,7 +245,7 @@ Item 7: Improve Bacula's tape and drive usage and cleaning management. drive status during operation can prevent some failures (as I [Arno] had to learn the hard way...) - Notes: First, Bacula could (and even does, to some limited extent) + Notes: First, Bacula could (and even does, to some limited extent) record tape and drive usage. 
For tapes, the number of mounts, the amount of data, and the time the tape has actually been running could be recorded. Data fields for Read and Write @@ -244,334 +292,425 @@ Item 7: Improve Bacula's tape and drive usage and cleaning management. sub-projects: Measuring Tape and Drive usage, retiring volumes, and handling drive cleaning and TAPEALERTs. -Item 8: Implement creation and maintenance of copy pools - Date: 27 November 2005 - Origin: David Boyes (dboyes at sinenomine dot net) + +Item 6: Allow FD to initiate a backup +Origin: Frank Volf (frank at deze dot org) +Date: 17 November 2005 +Status: + +What: Provide some means, possibly by a restricted console that + allows a FD to initiate a backup, and that uses the connection + established by the FD to the Director for the backup so that + a Director that is firewalled can do the backup. +Why: Makes backup of laptops much easier. +Notes: - The FD already has code for the monitor interface + - It could be nice to have a .job command that lists authorized + jobs. + - Commands need to be restricted on the Director side + (for example by re-using the runscript flag) + - The Client resource can be used to authorize the connection + - Initially, the client can't modify job parameters + - We need a way to run a status command to follow job progression + + This project consists of the following points + 1. Modify the FD to have a "mini-console" interface that + permits it to connect to the Director and start a + backup job of itself. + 2. The list of jobs that can be started by the FD is + defined in the Director (possibly via a restricted + console). + 3. Modify the existing tray monitor code in the Win32 FD + so that it is a separate program from the FD. + 4. The tray monitor program should be extended to permit + initiating a backup. + 5. No new Director directives should be added without + prior consultation with the Bacula developers. + 6. 
The comm line used by the FD to connect to the Director + should be re-used by the Director to do the backup. + This feature is partially implemented in the Director. + 7. The FD may have a new directive that allows it to start + a backup when the FD starts. + 8. The console interface to the FD should be extended to + permit a properly authorized console to initiate a + backup via the FD. + + +Item 7: Implement Storage daemon compression + Date: 18 December 2006 + Origin: Vadim A. Umanski , e-mail umanski@ext.ru Status: + What: The ability to compress backup data on the SD receiving data + instead of doing that on client sending data. + Why: The need is practical. I've got some machines that can send + data to the network 4 or 5 times faster than compressing + them (I've measured that). They're using fast enough SCSI/FC + disk subsystems but rather slow CPUs (ex. UltraSPARC II). + And the backup server has got quite fast CPUs (ex. Dual P4 + Xeons) and quite a low load. When you have 20, 50 or 100 GB + of raw data - running a job 4 to 5 times faster - that + really matters. On the other hand, the data can be + compressed 50% or better - so losing twice as much space for + disk backup is not good at all. And the network is all mine + (I have a dedicated management/provisioning network) and I + can get as high bandwidth as I need - 100Mbps, 1000Mbps... + That's why the server-side compression feature is needed! + Notes: - What: I would like Bacula to have the capability to write copies - of backed-up data on multiple physical volumes selected - from different pools without transferring the data - multiple times, and to accept any of the copy volumes - as valid for restore. - - Why: In many cases, businesses are required to keep offsite - copies of backup volumes, or just wish for simple - protection against a human operator dropping a storage - volume and damaging it. 
The ability to generate multiple - volumes in the course of a single backup job allows - customers to simple check out one copy and send it - offsite, marking it as out of changer or otherwise - unavailable. Currently, the library and magazine - management capability in Bacula does not make this process - simple. - - Restores would use the copy of the data on the first - available volume, in order of copy pool chain definition. - - This is also a major scalability issue -- as the number of - clients increases beyond several thousand, and the volume - of data increases, transferring the data multiple times to - produce additional copies of the backups will become - physically impossible due to transfer speed - issues. Generating multiple copies at server side will - become the only practical option. - - How: I suspect that this will require adding a multiplexing - SD that appears to be a SD to a specific FD, but 1-n FDs - to the specific back end SDs managing the primary and copy - pools. Storage pools will also need to acquire parameters - to define the pools to be used for copies. - - Notes: I would commit some of my developers' time if we can agree - on the design and behavior. - -Item 9: Implement new {Client}Run{Before|After}Job feature. - Date: 26 September 2005 - Origin: Phil Stracchino - Status: Done. This has been implemented by Eric Bollengier - - What: Some time ago, there was a discussion of RunAfterJob and - ClientRunAfterJob, and the fact that they do not run after failed - jobs. At the time, there was a suggestion to add a - RunAfterFailedJob directive (and, presumably, a matching - ClientRunAfterFailedJob directive), but to my knowledge these - were never implemented. - - The current implementation doesn't permit to add new feature easily. - - An alternate way of approaching the problem has just occurred to - me. 
Suppose the RunBeforeJob and RunAfterJob directives were - expanded in a manner like this example: - - RunScript { - Command = "/opt/bacula/etc/checkhost %c" - RunsOnClient = No # default - AbortJobOnError = Yes # default - RunsWhen = Before - } - RunScript { - Command = c:/bacula/systemstate.bat - RunsOnClient = yes - AbortJobOnError = No - RunsWhen = After - RunsOnFailure = yes - } - RunScript { - Command = c:/bacula/deletestatefile.bat - Target = rico-fd - RunsWhen = Always - } +Item 8: Reduction of communications bandwidth for a backup + Date: 14 October 2008 + Origin: Robin O'Leary (Equiinet) + Status: - It's now possible to specify more than 1 command per Job. - (you can stop your database and your webserver without a script) - - ex : - Job { - Name = "Client1" - JobDefs = "DefaultJob" - Write Bootstrap = "/tmp/bacula/var/bacula/working/Client1.bsr" - FileSet = "Minimal" - - RunBeforeJob = "echo test before ; echo test before2" - RunBeforeJob = "echo test before (2nd time)" - RunBeforeJob = "echo test before (3rd time)" - RunAfterJob = "echo test after" - ClientRunAfterJob = "echo test after client" - - RunScript { - Command = "echo test RunScript in error" - Runsonclient = yes - RunsOnSuccess = no - RunsOnFailure = yes - RunsWhen = After # never by default - } - RunScript { - Command = "echo test RunScript on success" - Runsonclient = yes - RunsOnSuccess = yes # default - RunsOnFailure = no # default - RunsWhen = After - } - } + What: Using rdiff techniques, Bacula could significantly reduce + the network data transfer volume to do a backup. - Why: It would be a significant change to the structure of the - directives, but allows for a lot more flexibility, including - RunAfter commands that will run regardless of whether the job - succeeds, or RunBefore tasks that still allow the job to run even - if that specific RunBefore fails. - - Notes: (More notes from Phil, Kern, David and Eric) - I would prefer to have a single new Resource called - RunScript. 
- - RunsWhen = After|Before|Always - RunsAtJobLevels = All|Full|Diff|Inc # not yet implemented - - The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives - could be optional, and possibly RunWhen as well. - - AbortJobOnError would be ignored unless RunsWhen was set to Before - and would default to Yes if omitted. - If AbortJobOnError was set to No, failure of the script - would still generate a warning. - - RunsOnSuccess would be ignored unless RunsWhen was set to After - (or RunsBeforeJob set to No), and default to Yes. - - RunsOnFailure would be ignored unless RunsWhen was set to After, - and default to No. - - Allow having the before/after status on the script command - line so that the same script can be used both before/after. - -Item 10: Merge multiple backups (Synthetic Backup or Consolidation). - Origin: Marc Cousin and Eric Bollengier - Date: 15 November 2005 - Status: Waiting implementation. Depends on first implementing - project Item 2 (Migration). - - What: A merged backup is a backup made without connecting to the Client. - It would be a Merge of existing backups into a single backup. - In effect, it is like a restore but to the backup medium. - - For instance, say that last Sunday we made a full backup. Then - all week long, we created incremental backups, in order to do - them fast. Now comes Sunday again, and we need another full. - The merged backup makes it possible to do instead an incremental - backup (during the night for instance), and then create a merged - backup during the day, by using the full and incrementals from - the week. The merged backup will be exactly like a full made - Sunday night on the tape, but the production interruption on the - Client will be minimal, as the Client will only have to send - incrementals. 
- - In fact, if it's done correctly, you could merge all the - Incrementals into a single Incremental, or all the Incrementals - and the last Differential into a new Differential, or the Full, - last differential and all the Incrementals into a new Full - backup. And there is no need to involve the Client. - - Why: The benefit is that: - - the Client just does an incremental; - - the merged backup on tape is just like a single full backup, - and can be restored very fast. - - This is also a way of reducing the backup data since the old - data can then be pruned (or not) from the catalog, possibly - allowing older volumes to be recycled - -Item 11: Deletion of Disk-Based Bacula Volumes - Date: Nov 25, 2005 - Origin: Ross Boylan (edited - by Kern) - Status: - - What: Provide a way for Bacula to automatically remove Volumes - from the filesystem, or optionally to truncate them. - Obviously, the Volume must be pruned prior to removal. - - Why: This would allow users more control over their Volumes and - prevent disk based volumes from consuming too much space. - - Notes: The following two directives might do the trick: - - Volume Data Retention =