X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;f=bacula%2Fprojects;h=8704fca47120eed53ab29c6a3a9c1b467dd4d597;hb=943ef07717af1afa3b32adb7127fe1b4f8e14671;hp=fdd666486509c567c25a73aca07d03699323588e;hpb=b8224aab234012c2d127b84eceb160e99dd4a14d;p=bacula%2Fbacula diff --git a/bacula/projects b/bacula/projects index fdd6664865..8704fca471 100644 --- a/bacula/projects +++ b/bacula/projects @@ -1,113 +1,29 @@ Projects: Bacula Projects Roadmap - Status updated 26 January 2007 - After re-ordering in vote priority - -Items Completed: -Item: 18 Quick release of FD-SD connection after backup. -Item: 40 Include JobID in spool file name -Item: 25 Implement huge exclude list support using dlist + Status updated 04 February 2009 Summary: -Item: 1 Accurate restoration of renamed/deleted files -Item: 2 Implement a Bacula GUI/management tool. -Item: 3 Allow FD to initiate a backup -Item: 4 Merge multiple backups (Synthetic Backup or Consolidation). -Item: 5 Deletion of Disk-Based Bacula Volumes -Item: 6 Implement Base jobs. -Item: 7 Implement creation and maintenance of copy pools -Item: 8 Directive/mode to backup only file changes, not entire file -Item: 9 Implement a server-side compression feature -Item: 10 Improve Bacula's tape and drive usage and cleaning management. -Item: 11 Allow skipping execution of Jobs -Item: 12 Add a scheduling syntax that permits weekly rotations -Item: 13 Archival (removal) of User Files to Tape -Item: 14 Cause daemons to use a specific IP address to source communications -Item: 15 Multiple threads in file daemon for the same job -Item: 16 Add Plug-ins to the FileSet Include statements. -Item: 17 Restore only file attributes (permissions, ACL, owner, group...) -Item: 18* Quick release of FD-SD connection after backup. -Item: 19 Implement a Python interface to the Bacula catalog. -Item: 20 Archive data -Item: 21 Split documentation -Item: 22 Implement support for stacking arbitrary stream filters, sinks. -Item: 23 Implement from-client and to-client on restore command line. -Item: 24 Add an override in Schedule for Pools based on backup types. -Item: 25* Implement huge exclude list support using hashing. -Item: 26 Implement more Python events in Bacula. -Item: 27 Incorporation of XACML2/SAML2 parsing -Item: 28 Filesystem watch triggered backup. -Item: 29 Allow inclusion/exclusion of files in a fileset by creation/mod times -Item: 30 Tray monitor window cleanups -Item: 31 Implement multiple numeric backup levels as supported by dump -Item: 32 Automatic promotion of backup levels -Item: 33 Clustered file-daemons -Item: 34 Commercial database support -Item: 35 Automatic disabling of devices -Item: 36 An option to operate on all pools with update vol parameters -Item: 37 Add an item to the restore option where you can select a pool -Item: 38 Include timestamp of job launch in "stat clients" output -Item: 39 Message mailing based on backup types -Item: 40* Include JobID in spool file name - - -Item 1: Accurate restoration of renamed/deleted files - Date: 28 November 2005 - Origin: Martin Simmons (martin at lispworks dot com) - Status: Robert Nelson will implement this - - What: When restoring a fileset for a specified date (including "most - recent"), Bacula should give you exactly the files and directories - that existed at the time of the last backup prior to that date. - - Currently this only works if the last backup was a Full backup. 
- When the last backup was Incremental/Differential, files and - directories that have been renamed or deleted since the last Full - backup are not currently restored correctly. Ditto for files with - extra/fewer hard links than at the time of the last Full backup. - - Why: Incremental/Differential would be much more useful if this worked. - - Notes: Merging of multiple backups into a single one seems to - rely on this working, otherwise the merged backups will not be - truly equivalent to a Full backup. - - Kern: notes shortened. This can be done without the need for - inodes. It is essentially the same as the current Verify job, - but one additional database record must be written, which does - not need any database change. - - Kern: see if we can correct restoration of directories if - replace=ifnewer is set. Currently, if the directory does not - exist, a "dummy" directory is created, then when all the files - are updated, the dummy directory is newer so the real values - are not updated. - -Item 2: Implement a Bacula GUI/management tool. - Origin: Kern - Date: 28 October 2005 - Status: In progress - - What: Implement a Bacula console, and management tools - probably using Qt3 and C++. - - Why: Don't we already have a wxWidgets GUI? Yes, but - it is written in C++ and changes to the user interface - must be hand tailored using C++ code. By developing - the user interface using Qt designer, the interface - can be very easily updated and most of the new Python - code will be automatically created. The user interface - changes become very simple, and only the new features - must be implement. In addition, the code will be in - Python, which will give many more users easy (or easier) - access to making additions or modifications. - - Notes: There is a partial Python-GTK implementation - Lucas Di Pentima but - it is no longer being developed. - -Item 3: Allow FD to initiate a backup +Item 2: Allow FD to initiate a backup +Item 6: Deletion of disk Volumes when pruned +Item 7: Implement Base jobs +Item 9: Scheduling syntax that permits more flexibility and options +Item 10: Message mailing based on backup types +Item 11: Cause daemons to use a specific IP address to source communications +Item 14: Add an override in Schedule for Pools based on backup types +Item 15: Implement more Python events and functions --- Abandoned for plugins +Item 16: Allow inclusion/exclusion of files in a fileset by creation/mod times +Item 17: Automatic promotion of backup levels based on backup size +Item 19: Automatic disabling of devices +Item 20: An option to operate on all pools with update vol parameters +Item 21: Include timestamp of job launch in "stat clients" output +Item 22: Implement Storage daemon compression +Item 23: Improve Bacula's tape and drive usage and cleaning management +Item 24: Multiple threads in file daemon for the same job +Item 25: Archival (removal) of User Files to Tape + + +Item 2: Allow FD to initiate a backup Origin: Frank Volf (frank at deze dot org) Date: 17 November 2005 Status: @@ -120,43 +36,8 @@ Item 3: Allow FD to initiate a backup Why: Makes backup of laptops much easier. -Item 4: Merge multiple backups (Synthetic Backup or Consolidation). - Origin: Marc Cousin and Eric Bollengier - Date: 15 November 2005 - Status: Waiting implementation. Depends on first implementing - project Item 2 (Migration) which is now done. - - What: A merged backup is a backup made without connecting to the Client. - It would be a Merge of existing backups into a single backup. 
- In effect, it is like a restore but to the backup medium. - - For instance, say that last Sunday we made a full backup. Then - all week long, we created incremental backups, in order to do - them fast. Now comes Sunday again, and we need another full. - The merged backup makes it possible to do instead an incremental - backup (during the night for instance), and then create a merged - backup during the day, by using the full and incrementals from - the week. The merged backup will be exactly like a full made - Sunday night on the tape, but the production interruption on the - Client will be minimal, as the Client will only have to send - incrementals. - - In fact, if it's done correctly, you could merge all the - Incrementals into single Incremental, or all the Incrementals - and the last Differential into a new Differential, or the Full, - last differential and all the Incrementals into a new Full - backup. And there is no need to involve the Client. - - Why: The benefit is that : - - the Client just does an incremental ; - - the merged backup on tape is just as a single full backup, - and can be restored very fast. - - This is also a way of reducing the backup data since the old - data can then be pruned (or not) from the catalog, possibly - allowing older volumes to be recycled - -Item 5: Deletion of Disk-Based Bacula Volumes + +Item 6: Deletion of disk Volumes when pruned Date: Nov 25, 2005 Origin: Ross Boylan (edited by Kern) @@ -177,7 +58,7 @@ Item 5: Deletion of Disk-Based Bacula Volumes The migration project should also remove a Volume that is migrated. This might also work for tape Volumes. -Item 6: Implement Base jobs. +Item 7: Implement Base jobs Date: 28 October 2005 Origin: Kern Status: @@ -211,168 +92,11 @@ Item 6: Implement Base jobs. FD a list of files/attribs, and the FD must search the list and compare it for each file to be saved. -Item 7: Implement creation and maintenance of copy pools - Date: 27 November 2005 - Origin: David Boyes (dboyes at sinenomine dot net) - Status: - - What: I would like Bacula to have the capability to write copies - of backed-up data on multiple physical volumes selected - from different pools without transferring the data - multiple times, and to accept any of the copy volumes - as valid for restore. - - Why: In many cases, businesses are required to keep offsite - copies of backup volumes, or just wish for simple - protection against a human operator dropping a storage - volume and damaging it. The ability to generate multiple - volumes in the course of a single backup job allows - customers to simple check out one copy and send it - offsite, marking it as out of changer or otherwise - unavailable. Currently, the library and magazine - management capability in Bacula does not make this process - simple. - - Restores would use the copy of the data on the first - available volume, in order of copy pool chain definition. - - This is also a major scalability issue -- as the number of - clients increases beyond several thousand, and the volume - of data increases, transferring the data multiple times to - produce additional copies of the backups will become - physically impossible due to transfer speed - issues. Generating multiple copies at server side will - become the only practical option. - - How: I suspect that this will require adding a multiplexing - SD that appears to be a SD to a specific FD, but 1-n FDs - to the specific back end SDs managing the primary and copy - pools. 
Storage pools will also need to acquire parameters - to define the pools to be used for copies. - - Notes: I would commit some of my developers' time if we can agree - on the design and behavior. - -Item 8: Directive/mode to backup only file changes, not entire file - Date: 11 November 2005 - Origin: Joshua Kugler - Marek Bajon - Status: - - What: Currently when a file changes, the entire file will be backed up in - the next incremental or full backup. To save space on the tapes - it would be nice to have a mode whereby only the changes to the - file would be backed up when it is changed. - - Why: This would save lots of space when backing up large files such as - logs, mbox files, Outlook PST files and the like. - - Notes: This would require the usage of disk-based volumes as comparing - files would not be feasible using a tape drive. - -Item 9: Implement a server-side compression feature - Date: 18 December 2006 - Origin: Vadim A. Umanski , e-mail umanski@ext.ru - Status: - What: The ability to compress backup data on server receiving data - instead of doing that on client sending data. - Why: The need is practical. I've got some machines that can send - data to the network 4 or 5 times faster than compressing - them (I've measured that). They're using fast enough SCSI/FC - disk subsystems but rather slow CPUs (ex. UltraSPARC II). - And the backup server has got a quite fast CPUs (ex. Dual P4 - Xeons) and quite a low load. When you have 20, 50 or 100 GB - of raw data - running a job 4 to 5 times faster - that - really matters. On the other hand, the data can be - compressed 50% or better - so losing twice more space for - disk backup is not good at all. And the network is all mine - (I have a dedicated management/provisioning network) and I - can get as high bandwidth as I need - 100Mbps, 1000Mbps... - That's why the server-side compression feature is needed! - Notes: - -Item 10: Improve Bacula's tape and drive usage and cleaning management. - Date: 8 November 2005, November 11, 2005 - Origin: Adam Thornton , - Arno Lehmann - Status: - - What: Make Bacula manage tape life cycle information, tape reuse - times and drive cleaning cycles. - - Why: All three parts of this project are important when operating - backups. - We need to know which tapes need replacement, and we need to - make sure the drives are cleaned when necessary. While many - tape libraries and even autoloaders can handle all this - automatically, support by Bacula can be helpful for smaller - (older) libraries and single drives. Limiting the number of - times a tape is used might prevent tape errors when using - tapes until the drives can't read it any more. Also, checking - drive status during operation can prevent some failures (as I - [Arno] had to learn the hard way...) - - Notes: First, Bacula could (and even does, to some limited extent) - record tape and drive usage. For tapes, the number of mounts, - the amount of data, and the time the tape has actually been - running could be recorded. Data fields for Read and Write - time and Number of mounts already exist in the catalog (I'm - not sure if VolBytes is the sum of all bytes ever written to - that volume by Bacula). This information can be important - when determining which media to replace. The ability to mark - Volumes as "used up" after a given number of write cycles - should also be implemented so that a tape is never actually - worn out. 
For the tape drives known to Bacula, similar - information is interesting to determine the device status and - expected life time: Time it's been Reading and Writing, number - of tape Loads / Unloads / Errors. This information is not yet - recorded as far as I [Arno] know. A new volume status would - be necessary for the new state, like "Used up" or "Worn out". - Volumes with this state could be used for restores, but not - for writing. These volumes should be migrated first (assuming - migration is implemented) and, once they are no longer needed, - could be moved to a Trash pool. - - The next step would be to implement a drive cleaning setup. - Bacula already has knowledge about cleaning tapes. Once it - has some information about cleaning cycles (measured in drive - run time, number of tapes used, or calender days, for example) - it can automatically execute tape cleaning (with an - autochanger, obviously) or ask for operator assistance loading - a cleaning tape. - - The final step would be to implement TAPEALERT checks not only - when changing tapes and only sending the information to the - administrator, but rather checking after each tape error, - checking on a regular basis (for example after each tape - file), and also before unloading and after loading a new tape. - Then, depending on the drives TAPEALERT state and the known - drive cleaning state Bacula could automatically schedule later - cleaning, clean immediately, or inform the operator. - - Implementing this would perhaps require another catalog change - and perhaps major changes in SD code and the DIR-SD protocol, - so I'd only consider this worth implementing if it would - actually be used or even needed by many people. - - Implementation of these projects could happen in three distinct - sub-projects: Measuring Tape and Drive usage, retiring - volumes, and handling drive cleaning and TAPEALERTs. - -Item 11: Allow skipping execution of Jobs - Date: 29 November 2005 - Origin: Florian Schnabel - Status: - - What: An easy option to skip a certain job on a certain date. - Why: You could then easily skip tape backups on holidays. Especially - if you got no autochanger and can only fit one backup on a tape - that would be really handy, other jobs could proceed normally - and you won't get errors that way. -Item 12: Add a scheduling syntax that permits weekly rotations +Item 9: Scheduling syntax that permits more flexibility and options Date: 15 December 2006 - Origin: Gregory Brauer (greg at wildbrain dot com) + Origin: Gregory Brauer (greg at wildbrain dot com) and + Florian Schnabel Status: What: Currently, Bacula only understands how to deal with weeks of the @@ -385,9 +109,18 @@ Item 12: Add a scheduling syntax that permits weekly rotations A solution would be to add a new syntax that defines (at least) a start timestamp, and repetition period. - Why: Rotated backups done at weekly intervals are useful, and Bacula + An easy option to skip a certain job on a certain date. + + + Why: Rotated backups done at weekly intervals are useful, and Bacula cannot currently do them without extensive hacking. + You could then easily skip tape backups on holidays. Especially + if you got no autochanger and can only fit one backup on a tape + that would be really handy, other jobs could proceed normally + and you won't get errors that way. + + Notes: Here is an example syntax showing a 3-week rotation where full Backups would be performed every week on Saturday, and an incremental would be performed every week on Tuesday. 
Each @@ -462,34 +195,42 @@ Item 12: Add a scheduling syntax that permits weekly rotations } } -Item 13: Archival (removal) of User Files to Tape - Date: Nov. 24/2005 - Origin: Ray Pengelly [ray at biomed dot queensu dot ca - Status: + Notes: Kern: I have merged the previously separate project of skipping + jobs (via Schedule syntax) into this. - What: The ability to archive data to storage based on certain parameters - such as age, size, or location. Once the data has been written to - storage and logged it is then pruned from the originating - filesystem. Note! We are talking about user's files and not - Bacula Volumes. - Why: This would allow fully automatic storage management which becomes - useful for large datastores. It would also allow for auto-staging - from one media type to another. +Item 10: Message mailing based on backup types + Origin: Evan Kaufman + Date: January 6, 2006 + Status: - Example 1) Medical imaging needs to store large amounts of data. - They decide to keep data on their servers for 6 months and then put - it away for long term storage. The server then finds all files - older than 6 months writes them to tape. The files are then removed - from the server. + What: In the "Messages" resource definitions, allowing messages + to be mailed based on the type (backup, restore, etc.) and level + (full, differential, etc) of job that created the originating + message(s). - Example 2) All data that hasn't been accessed in 2 months could be - moved from high-cost, fibre-channel disk storage to a low-cost - large-capacity SATA disk storage pool which doesn't have as quick of - access time. Then after another 6 months (or possibly as one - storage pool gets full) data is migrated to Tape. + Why: It would, for example, allow someone's boss to be emailed + automatically only when a Full Backup job runs, so he can + retrieve the tapes for offsite storage, even if the IT dept. + doesn't (or can't) explicitly notify him. At the same time, his + mailbox wouldnt be filled by notifications of Verifies, Restores, + or Incremental/Differential Backups (which would likely be kept + onsite). + + Notes: One way this could be done is through additional message types, for example: + + Messages { + # email the boss only on full system backups + Mail = boss@mycompany.com = full, !incremental, !differential, !restore, + !verify, !admin + # email us only when something breaks + MailOnError = itdept@mycompany.com = all + } + + Notes: Kern: This should be rather trivial to implement. -Item 14: Cause daemons to use a specific IP address to source communications + +Item 11: Cause daemons to use a specific IP address to source communications Origin: Bill Moran Date: 18 Dec 2006 Status: @@ -518,360 +259,73 @@ Item 14: Cause daemons to use a specific IP address to source communications 10.0.0.1 and zone transfers will always originate from 10.0.0.2. -Item 15: Multiple threads in file daemon for the same job - Date: 27 November 2005 - Origin: Ove Risberg (Ove.Risberg at octocode dot com) - Status: - - What: I want the file daemon to start multiple threads for a backup - job so the fastest possible backup can be made. - - The file daemon could parse the FileSet information and start - one thread for each File entry located on a separate - filesystem. - - A confiuration option in the job section should be used to - enable or disable this feature. The confgutration option could - specify the maximum number of threads in the file daemon. 
- - If the theads could spool the data to separate spool files - the restore process will not be much slower. - - Why: Multiple concurrent backups of a large fileserver with many - disks and controllers will be much faster. - -Item 16: Add Plug-ins to the FileSet Include statements. - Date: 28 October 2005 - Origin: - Status: Partially coded in 1.37 -- much more to do. - - What: Allow users to specify wild-card and/or regular - expressions to be matched in both the Include and - Exclude directives in a FileSet. At the same time, - allow users to define plug-ins to be called (based on - regular expression/wild-card matching). - - Why: This would give the users the ultimate ability to control - how files are backed up/restored. A user could write a - plug-in knows how to backup his Oracle database without - stopping/starting it, for example. - -Item 17: Restore only file attributes (permissions, ACL, owner, group...) - Origin: Eric Bollengier - Date: 30/12/2006 - Status: - What: The goal of this project is to be able to restore only rights - and attributes of files without crushing them. - - Why: Who have never had to repair a chmod -R 777, or a wild update - of recursive right under Windows? At this time, you must have - enough space to restore data, dump attributes (easy with acl, - more complex with unix/windows rights) and apply them to your - broken tree. With this options, it will be very easy to compare - right or ACL over the time. +Item 14: Add an override in Schedule for Pools based on backup types +Date: 19 Jan 2005 +Origin: Chad Slater +Status: + + What: Adding a FullStorage=BigTapeLibrary in the Schedule resource + would help those of us who use different storage devices for different + backup levels cope with the "auto-upgrade" of a backup. - Notes: If the file is here, we skip restore and we change rights. - If the file isn't here, we can create an empty one and apply - rights or do nothing. + Why: Assume I add several new devices to be backed up, i.e. several + hosts with 1TB RAID. To avoid tape switching hassles, incrementals are + stored in a disk set on a 2TB RAID. If you add these devices in the + middle of the month, the incrementals are upgraded to "full" backups, + but they try to use the same storage device as requested in the + incremental job, filling up the RAID holding the differentials. If we + could override the Storage parameter for full and/or differential + backups, then the Full job would use the proper Storage device, which + has more capacity (i.e. a 8TB tape library. -Item 18: Quick release of FD-SD connection after backup. - Origin: Frank Volf (frank at deze dot org) - Date: 17 November 2005 - Status: Done -- implemented by Kern -- in CVS 26Jan07 - - What: In the Bacula implementation a backup is finished after all data - and attributes are successfully written to storage. When using a - tape backup it is very annoying that a backup can take a day, - simply because the current tape (or whatever) is full and the - administrator has not put a new one in. During that time the - system cannot be taken off-line, because there is still an open - session between the storage daemon and the file daemon on the - client. - Although this is a very good strategy for making "safe backups" - This can be annoying for e.g. laptops, that must remain - connected until the backup is completed. - - Using a new feature called "migration" it will be possible to - spool first to harddisk (using a special 'spool' migration - scheme) and then migrate the backup to tape. 
- - There is still the problem of getting the attributes committed. - If it takes a very long time to do, with the current code, the - job has not terminated, and the File daemon is not freed up. The - Storage daemon should release the File daemon as soon as all the - file data and all the attributes have been sent to it (the SD). - Currently the SD waits until everything is on tape and all the - attributes are transmitted to the Director before signaling - completion to the FD. I don't think I would have any problem - changing this. The reason is that even if the FD reports back to - the Dir that all is OK, the job will not terminate until the SD - has done the same thing -- so in a way keeping the SD-FD link - open to the very end is not really very productive ... - - Why: Makes backup of laptops much faster. - -Item 19: Implement a Python interface to the Bacula catalog. +Item 15: Implement more Python events and functions Date: 28 October 2005 Origin: Kern - Status: + Status: Project abandoned in favor of plugins. + + What: Allow Python scripts to be called at more places + within Bacula and provide additional access to Bacula + internal variables. - What: Implement an interface for Python scripts to access - the catalog through Bacula. + Implement an interface for Python scripts to access the + catalog through Bacula. Why: This will permit users to customize Bacula through Python scripts. -Item 20: Archive data - Date: 15/5/2006 - Origin: calvin streeting calvin at absentdream dot com - Status: - - What: The abilty to archive to media (dvd/cd) in a uncompressed format - for dead filing (archiving not backing up) + Notes: Recycle event + Scratch pool event + NeedVolume event + MediaFull event + + Also add a way to get a listing of currently running + jobs (possibly also scheduled jobs). - Why: At my works when jobs are finished and moved off of the main file - servers (raid based systems) onto a simple linux file server (ide based - system) so users can find old information without contacting the IT - dept. - So this data dosn't realy change it only gets added to, - But it also needs backing up. At the moment it takes - about 8 hours to back up our servers (working data) so - rather than add more time to existing backups i am trying - to implement a system where we backup the acrhive data to - cd/dvd these disks would only need to be appended to - (burn only new/changed files to new disks for off site - storage). basialy understand the differnce between - achive data and live data. + to start the appropriate job. - Notes: Scan the data and email me when it needs burning divide - into predifind chunks keep a recored of what is on what - disk make me a label (simple php->mysql=>pdf stuff) i - could do this bit ability to save data uncompresed so - it can be read in any other system (future proof data) - save the catalog with the disk as some kind of menu - system -Item 21: Split documentation - Origin: Maxx - Date: 27th July 2006 +Item 16: Allow inclusion/exclusion of files in a fileset by creation/mod times + Origin: Evan Kaufman + Date: January 11, 2006 Status: - What: Split documentation in several books + What: In the vein of the Wild and Regex directives in a Fileset's + Options, it would be helpful to allow a user to include or exclude + files and directories by creation or modification times. - Why: Bacula manual has now more than 600 pages, and looking for - implementation details is getting complicated. 
I think - it would be good to split the single volume in two or - maybe three parts: + You could factor the Exclude=yes|no option in much the same way it + affects the Wild and Regex directives. For example, you could exclude + all files modified before a certain date: - 1) Introduction, requirements and tutorial, typically - are useful only until first installation time + Options { + Exclude = yes + Modified Before = #### + } - 2) Basic installation and configuration, with all the - gory details about the directives supported 3) - Advanced Bacula: testing, troubleshooting, GUI and - ancillary programs, security managements, scripting, - etc. - - -Item 22: Implement support for stacking arbitrary stream filters, sinks. -Date: 23 November 2006 -Origin: Landon Fuller -Status: Planning. Assigned to landonf. - - What: Implement support for the following: - - Stacking arbitrary stream filters (eg, encryption, compression, - sparse data handling)) - - Attaching file sinks to terminate stream filters (ie, write out - the resultant data to a file) - - Refactor the restoration state machine accordingly - - Why: The existing stream implementation suffers from the following: - - All state (compression, encryption, stream restoration), is - global across the entire restore process, for all streams. There are - multiple entry and exit points in the restoration state machine, and - thus multiple places where state must be allocated, deallocated, - initialized, or reinitialized. This results in exceptional complexity - for the author of a stream filter. - - The developer must enumerate all possible combinations of filters - and stream types (ie, win32 data with encryption, without encryption, - with encryption AND compression, etc). - - Notes: This feature request only covers implementing the stream filters/ - sinks, and refactoring the file daemon's restoration implementation - accordingly. If I have extra time, I will also rewrite the backup - implementation. My intent in implementing the restoration first is to - solve pressing bugs in the restoration handling, and to ensure that - the new restore implementation handles existing backups correctly. - - I do not plan on changing the network or tape data structures to - support defining arbitrary stream filters, but supporting that - functionality is the ultimate goal. - - Assistance with either code or testing would be fantastic. - -Item 23: Implement from-client and to-client on restore command line. - Date: 11 December 2006 - Origin: Discussion on Bacula-users entitled 'Scripted restores to - different clients', December 2006 - Status: New feature request - - What: While using bconsole interactively, you can specify the client - that a backup job is to be restored for, and then you can - specify later a different client to send the restored files - back to. However, using the 'restore' command with all options - on the command line, this cannot be done, due to the ambiguous - 'client' parameter. Additionally, this parameter means different - things depending on if it's specified on the command line or - afterwards, in the Modify Job screens. - - Why: This feature would enable restore jobs to be more completely - automated, for example by a web or GUI front-end. - - Notes: client can also be implied by specifying the jobid on the command - line - -Item 24: Add an override in Schedule for Pools based on backup types. 
-Date: 19 Jan 2005 -Origin: Chad Slater -Status: - - What: Adding a FullStorage=BigTapeLibrary in the Schedule resource - would help those of us who use different storage devices for different - backup levels cope with the "auto-upgrade" of a backup. - - Why: Assume I add several new device to be backed up, i.e. several - hosts with 1TB RAID. To avoid tape switching hassles, incrementals are - stored in a disk set on a 2TB RAID. If you add these devices in the - middle of the month, the incrementals are upgraded to "full" backups, - but they try to use the same storage device as requested in the - incremental job, filling up the RAID holding the differentials. If we - could override the Storage parameter for full and/or differential - backups, then the Full job would use the proper Storage device, which - has more capacity (i.e. a 8TB tape library. - -Item 25: Implement huge exclude list support using hashing (dlists). - Date: 28 October 2005 - Origin: Kern - Status: Done in 2.1.2 but was done with dlists (doubly linked lists - since hashing will not help. The huge list also supports - large include lists). - - What: Allow users to specify very large exclude list (currently - more than about 1000 files is too many). - - Why: This would give the users the ability to exclude all - files that are loaded with the OS (e.g. using rpms - or debs). If the user can restore the base OS from - CDs, there is no need to backup all those files. A - complete restore would be to restore the base OS, then - do a Bacula restore. By excluding the base OS files, the - backup set will be *much* smaller. - -Item 26: Implement more Python events in Bacula. - Date: 28 October 2005 - Origin: Kern - Status: - - What: Allow Python scripts to be called at more places - within Bacula and provide additional access to Bacula - internal variables. - - Why: This will permit users to customize Bacula through - Python scripts. - - Notes: Recycle event - Scratch pool event - NeedVolume event - MediaFull event - - Also add a way to get a listing of currently running - jobs (possibly also scheduled jobs). - - -Item 27: Incorporation of XACML2/SAML2 parsing - Date: 19 January 2006 - Origin: Adam Thornton - Status: Blue sky - - What: XACML is "eXtensible Access Control Markup Language" and - "SAML is the "Security Assertion Markup Language"--an XML standard - for making statements about identity and authorization. Having these - would give us a framework to approach ACLs in a generic manner, and - in a way flexible enough to support the four major sorts of ACLs I - see as a concern to Bacula at this point, as well as (probably) to - deal with new sorts of ACLs that may appear in the future. - - Why: Bacula is beginning to need to back up systems with ACLs - that do not map cleanly onto traditional Unix permissions. I see - four sets of ACLs--in general, mutually incompatible with one - another--that we're going to need to deal with. These are: NTFS - ACLs, POSIX ACLs, NFSv4 ACLS, and AFS ACLS. (Some may question the - relevance of AFS; AFS is one of Sine Nomine's core consulting - businesses, and having a reputable file-level backup and restore - technology for it (as Tivoli is probably going to drop AFS support - soon since IBM no longer supports AFS) would be of huge benefit to - our customers; we'd most likely create the AFS support at Sine Nomine - for inclusion into the Bacula (and perhaps some changes to the - OpenAFS volserver) core code.) - - Now, obviously, Bacula already handles NTFS just fine. 
However, I - think there's a lot of value in implementing a generic ACL model, so - that it's easy to support whatever particular instances of ACLs come - down the pike: POSIX ACLS (think SELinux) and NFSv4 are the obvious - things arriving in the Linux world in a big way in the near future. - XACML, although overcomplicated for our needs, provides this - framework, and we should be able to leverage other people's - implementations to minimize the amount of work *we* have to do to get - a generic ACL framework. Basically, the costs of implementation are - high, but they're largely both external to Bacula and already sunk. - -Item 28: Filesystem watch triggered backup. - Date: 31 August 2006 - Origin: Jesper Krogh - Status: Unimplemented, depends probably on "client initiated backups" - - What: With inotify and similar filesystem triggeret notification - systems is it possible to have the file-daemon to monitor - filesystem changes and initiate backup. - - Why: There are 2 situations where this is nice to have. - 1) It is possible to get a much finer-grained backup than - the fixed schedules used now.. A file created and deleted - a few hours later, can automatically be caught. - - 2) The introduced load on the system will probably be - distributed more even on the system. - - Notes: This can be combined with configration that specifies - something like: "at most every 15 minutes or when changes - consumed XX MB". - -Kern Notes: I would rather see this implemented by an external program - that monitors the Filesystem changes, then uses the console - to start the appropriate job. - -Item 29: Allow inclusion/exclusion of files in a fileset by creation/mod times - Origin: Evan Kaufman - Date: January 11, 2006 - Status: - - What: In the vein of the Wild and Regex directives in a Fileset's - Options, it would be helpful to allow a user to include or exclude - files and directories by creation or modification times. - - You could factor the Exclude=yes|no option in much the same way it - affects the Wild and Regex directives. For example, you could exclude - all files modified before a certain date: - - Options { - Exclude = yes - Modified Before = #### - } - - Or you could exclude all files created/modified since a certain date: + Or you could exclude all files created/modified since a certain date: Options { Exclude = yes @@ -902,40 +356,8 @@ Item 29: Allow inclusion/exclusion of files in a fileset by creation/mod times So one could compare against 'ctime' and/or 'mtime', but ONLY 'before' or 'since'. - -Item 30: Tray monitor window cleanups - Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk - Date: 24 July 2006 - Status: - What: Resizeable and scrollable windows in the tray monitor. - - Why: With multiple clients, or with many jobs running, the displayed - window often ends up larger than the available screen, making - the trailing items difficult to read. - - -Item 31: Implement multiple numeric backup levels as supported by dump -Date: 3 April 2006 -Origin: Daniel Rich -Status: -What: Dump allows specification of backup levels numerically instead of just - "full", "incr", and "diff". In this system, at any given level, all - files are backed up that were were modified since the last backup of a - higher level (with 0 being the highest and 9 being the lowest). A - level 0 is therefore equivalent to a full, level 9 an incremental, and - the levels 1 through 8 are varying levels of differentials. 
For - bacula's sake, these could be represented as "full", "incr", and - "diff1", "diff2", etc. - -Why: Support of multiple backup levels would provide for more advanced backup - rotation schemes such as "Towers of Hanoi". This would allow better - flexibility in performing backups, and can lead to shorter recover - times. - -Notes: Legato Networker supports a similar system with full, incr, and 1-9 as - levels. -Item 32: Automatic promotion of backup levels +Item 17: Automatic promotion of backup levels based on backup size Date: 19 January 2006 Origin: Adam Thornton Status: @@ -953,60 +375,8 @@ Item 32: Automatic promotion of backup levels using Bacula; if we had it it would eliminate the one cool thing Amanda can do and we can't (at least, the one cool thing I know of). -Item 33: Clustered file-daemons - Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk - Date: 24 July 2006 - Status: - What: A "virtual" filedaemon, which is actually a cluster of real ones. - - Why: In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc) - multiple machines may have access to the same set of filesystems - - For performance reasons, one may wish to initate backups from - several of these machines simultaneously, instead of just using - one backup source for the common clustered filesystem. - - For obvious reasons, normally backups of $A-FD/$PATH and - B-FD/$PATH are treated as different backup sets. In this case - they are the same communal set. - - Likewise when restoring, it would be easier to just specify - one of the cluster machines and let bacula decide which to use. - - This can be faked to some extent using DNS round robin entries - and a virtual IP address, however it means "status client" will - always give bogus answers. Additionally there is no way of - spreading the load evenly among the servers. - What is required is something similar to the storage daemon - autochanger directives, so that Bacula can keep track of - operating backups/restores and direct new jobs to a "free" - client. - - Notes: - -Item 34: Commercial database support - Origin: Russell Howe - Date: 26 July 2006 - Status: - - What: It would be nice for the database backend to support more - databases. I'm thinking of SQL Server at the moment, but I guess Oracle, - DB2, MaxDB, etc are all candidates. SQL Server would presumably be - implemented using FreeTDS or maybe an ODBC library? - - Why: We only really have one database server, which is MS SQL Server - 2000. Maintaining a second one for the backup software (we grew out of - SQLite, which I liked, but which didn't work so well with our database - size). We don't really have a machine with the resources to run - postgres, and would rather only maintain a single DBMS. We're stuck with - SQL Server because pretty much all the company's custom applications - (written by consultants) are locked into SQL Server 2000. I can imagine - this scenario is fairly common, and it would be nice to use the existing - properly specced database server for storing Bacula's catalog, rather - than having to run a second DBMS. - -Item 35: Automatic disabling of devices +Item 19: Automatic disabling of devices Date: 2005-11-11 Origin: Peter Eriksson Status: @@ -1032,10 +402,10 @@ Item 35: Automatic disabling of devices further use of that drive and used one of the other ones instead. 
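+
+   Notes: Purely as an illustration, such behaviour might be expressed in
+          the Storage daemon's Device resource roughly like this (these
+          directives are hypothetical, nothing like them exists today):
+
+             Device {
+               Name = Drive-1
+               ...
+               # disable this drive after this many consecutive fatal
+               # I/O errors and fail jobs over to the remaining drives
+               Maximum Consecutive Errors = 3
+             }
+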
-Item 36: An option to operate on all pools with update vol parameters +Item 20: An option to operate on all pools with update vol parameters Origin: Dmitriy Pinchukov Date: 16 August 2006 - Status: + Status: Patch made by Nigel Stepp What: When I do update -> Volume parameters -> All Volumes from Pool, then I have to select pools one by one. I'd like @@ -1046,30 +416,8 @@ Item 36: An option to operate on all pools with update vol parameters updating each of them using update -> Volume parameters -> All Volumes from Pool -> pool #. -Item 37: Add an item to the restore option where you can select a pool - Origin: kshatriyak at gmail dot com - Date: 1/1/2006 - Status: - What: In the restore option (Select the most recent backup for a - client) it would be useful to add an option where you can limit - the selection to a certain pool. - - Why: When using cloned jobs, most of the time you have 2 pools - a - disk pool and a tape pool. People who have 2 pools would like to - select the most recent backup from disk, not from tape (tape - would be only needed in emergency). However, the most recent - backup (which may just differ a second from the disk backup) may - be on tape and would be selected. The problem becomes bigger if - you have a full and differential - the most "recent" full backup - may be on disk, while the most recent differential may be on tape - (though the differential on disk may differ even only a second or - so). Bacula will complain that the backups reside on different - media then. For now the only solution now when restoring things - when you have 2 pools is to manually search for the right - job-id's and enter them by hand, which is a bit fault tolerant. - -Item 38: Include timestamp of job launch in "stat clients" output +Item 21: Include timestamp of job launch in "stat clients" output Origin: Mark Bergman Date: Tue Aug 22 17:13:39 EDT 2006 Status: @@ -1088,120 +436,752 @@ Item 38: Include timestamp of job launch in "stat clients" output particularly when there are many active clients. -Item 39: Message mailing based on backup types - Origin: Evan Kaufman - Date: January 6, 2006 - Status: - What: In the "Messages" resource definitions, allowing messages - to be mailed based on the type (backup, restore, etc.) and level - (full, differential, etc) of job that created the originating - message(s). +Item 22: Implement Storage daemon compression + Date: 18 December 2006 + Origin: Vadim A. Umanski , e-mail umanski@ext.ru + Status: + What: The ability to compress backup data on the SD receiving data + instead of doing that on client sending data. + Why: The need is practical. I've got some machines that can send + data to the network 4 or 5 times faster than compressing + them (I've measured that). They're using fast enough SCSI/FC + disk subsystems but rather slow CPUs (ex. UltraSPARC II). + And the backup server has got a quite fast CPUs (ex. Dual P4 + Xeons) and quite a low load. When you have 20, 50 or 100 GB + of raw data - running a job 4 to 5 times faster - that + really matters. On the other hand, the data can be + compressed 50% or better - so losing twice more space for + disk backup is not good at all. And the network is all mine + (I have a dedicated management/provisioning network) and I + can get as high bandwidth as I need - 100Mbps, 1000Mbps... + That's why the server-side compression feature is needed! 
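+         For illustration only, such a feature might eventually be
+         enabled per device in the SD configuration with something like
+         the following (hypothetical directive, not implemented):
+
+            Device {
+              Name = FileStorage
+              ...
+              # compress on the SD instead of on the client
+              Software Compression = GZIP
+            }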
+ Notes: - Why: It would, for example, allow someone's boss to be emailed - automatically only when a Full Backup job runs, so he can - retrieve the tapes for offsite storage, even if the IT dept. - doesn't (or can't) explicitly notify him. At the same time, his - mailbox wouldnt be filled by notifications of Verifies, Restores, - or Incremental/Differential Backups (which would likely be kept - onsite). +Item 23: Improve Bacula's tape and drive usage and cleaning management + Date: 8 November 2005, November 11, 2005 + Origin: Adam Thornton , + Arno Lehmann + Status: - Notes: One way this could be done is through additional message types, for example: + What: Make Bacula manage tape life cycle information, tape reuse + times and drive cleaning cycles. - Messages { - # email the boss only on full system backups - Mail = boss@mycompany.com = full, !incremental, !differential, !restore, - !verify, !admin - # email us only when something breaks - MailOnError = itdept@mycompany.com = all - } + Why: All three parts of this project are important when operating + backups. + We need to know which tapes need replacement, and we need to + make sure the drives are cleaned when necessary. While many + tape libraries and even autoloaders can handle all this + automatically, support by Bacula can be helpful for smaller + (older) libraries and single drives. Limiting the number of + times a tape is used might prevent tape errors when using + tapes until the drives can't read it any more. Also, checking + drive status during operation can prevent some failures (as I + [Arno] had to learn the hard way...) + Notes: First, Bacula could (and even does, to some limited extent) + record tape and drive usage. For tapes, the number of mounts, + the amount of data, and the time the tape has actually been + running could be recorded. Data fields for Read and Write + time and Number of mounts already exist in the catalog (I'm + not sure if VolBytes is the sum of all bytes ever written to + that volume by Bacula). This information can be important + when determining which media to replace. The ability to mark + Volumes as "used up" after a given number of write cycles + should also be implemented so that a tape is never actually + worn out. For the tape drives known to Bacula, similar + information is interesting to determine the device status and + expected life time: Time it's been Reading and Writing, number + of tape Loads / Unloads / Errors. This information is not yet + recorded as far as I [Arno] know. A new volume status would + be necessary for the new state, like "Used up" or "Worn out". + Volumes with this state could be used for restores, but not + for writing. These volumes should be migrated first (assuming + migration is implemented) and, once they are no longer needed, + could be moved to a Trash pool. -Item 40: Include JobID in spool file name ****DONE**** - Origin: Mark Bergman - Date: Tue Aug 22 17:13:39 EDT 2006 - Status: Done. (patches/testing/project-include-jobid-in-spool-name.patch) - No need to vote for this item. + The next step would be to implement a drive cleaning setup. + Bacula already has knowledge about cleaning tapes. Once it + has some information about cleaning cycles (measured in drive + run time, number of tapes used, or calender days, for example) + it can automatically execute tape cleaning (with an + autochanger, obviously) or ask for operator assistance loading + a cleaning tape. 
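+
+         Purely as an illustration, such cleaning cycles might one day be
+         configured like this (hypothetical syntax, nothing like it is
+         implemented):
+
+            Device {
+              Name = Drive-1
+              ...
+              # clean after whichever limit is reached first
+              Cleaning Interval = 30 days
+              Cleaning After = 250 run-hours
+            }
+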
+ + The final step would be to implement TAPEALERT checks not only + when changing tapes and only sending the information to the + administrator, but rather checking after each tape error, + checking on a regular basis (for example after each tape + file), and also before unloading and after loading a new tape. + Then, depending on the drives TAPEALERT state and the known + drive cleaning state Bacula could automatically schedule later + cleaning, clean immediately, or inform the operator. - What: Change the name of the spool file to include the JobID + Implementing this would perhaps require another catalog change + and perhaps major changes in SD code and the DIR-SD protocol, + so I'd only consider this worth implementing if it would + actually be used or even needed by many people. - Why: JobIDs are the common key used to refer to jobs, yet the - spoolfile name doesn't include that information. The date/time - stamp is useful (and should be retained). + Implementation of these projects could happen in three distinct + sub-projects: Measuring Tape and Drive usage, retiring + volumes, and handling drive cleaning and TAPEALERTs. -============= New Freature Requests after vote of 26 Jan 2007 ======== -Item n: Enable to relocate files and directories when restoring - Date: 2007-03-01 - Origin: Eric Bollengier +Item 24: Multiple threads in file daemon for the same job + Date: 27 November 2005 + Origin: Ove Risberg (Ove.Risberg at octocode dot com) Status: - What: The where= option is not powerful enough. It will be - a great feature if bacula can restore a file in the - same directory, but with a different name, or in - an other directory without recreating the full path. + What: I want the file daemon to start multiple threads for a backup + job so the fastest possible backup can be made. - Why: When i want to restore a production environment to a - development environment, i just want change the first - directory. ie restore /prod/data/file.dat to /rect/data/file.dat. - At this time, i have to move by hand files. You must have a big - dump space to restore and move data after. + The file daemon could parse the FileSet information and start + one thread for each File entry located on a separate + filesystem. - When i use Linux or SAN snapshot, i mount them to /mnt/snap_xxx - so, when a restore a file, i have to move by hand - from /mnt/snap_xxx/file to /xxx/file. I can't replace a file - easily. + A confiuration option in the job section should be used to + enable or disable this feature. The confgutration option could + specify the maximum number of threads in the file daemon. - When a user ask me to restore a file in its personal folder, - (without replace the existing one), i can't restore from - my_file.txt to my_file.txt.old witch is very practical. + If the theads could spool the data to separate spool files + the restore process will not be much slower. - - Notes: I think we can enhance the where= option very easily by - allowing regexp expression. (by replacing bregex by libpcre - see http://en.wikipedia.org/wiki/PCRE and http://www.pcre.org/) + Why: Multiple concurrent backups of a large fileserver with many + disks and controllers will be much faster. - Since, many users think that regexp are not user friendly, i think - that bat, bconsole or brestore must provide a simple way to - configure where= option (i think to something like in - openoffice "search and replace"). +Item 25: Archival (removal) of User Files to Tape + Date: Nov. 
24/2005 + Origin: Ray Pengelly [ray at biomed dot queensu dot ca + Status: - Ie, if user uses where=/tmp/bacula-restore, we keep the old - fashion. + What: The ability to archive data to storage based on certain parameters + such as age, size, or location. Once the data has been written to + storage and logged it is then pruned from the originating + filesystem. Note! We are talking about user's files and not + Bacula Volumes. - If user uses something like where=s!/prod!/test!, files will - be restored from /prod/xxx to /test/xxx. + Why: This would allow fully automatic storage management which becomes + useful for large datastores. It would also allow for auto-staging + from one media type to another. - If user uses something like where=s/$/.old/, files will - be restored from /prod/xxx.txt to /prod/xxx.txt.old. + Example 1) Medical imaging needs to store large amounts of data. + They decide to keep data on their servers for 6 months and then put + it away for long term storage. The server then finds all files + older than 6 months writes them to tape. The files are then removed + from the server. - If user uses something like where=s/txt$/old.txt/, files will - be restored from /prod/xxx.txt to /prod/xxx.old.txt + Example 2) All data that hasn't been accessed in 2 months could be + moved from high-cost, fibre-channel disk storage to a low-cost + large-capacity SATA disk storage pool which doesn't have as quick of + access time. Then after another 6 months (or possibly as one + storage pool gets full) data is migrated to Tape. - if user uses something like where=s/([a-z]+)$/old.$1/, files will - be restored from /prod/xxx.ext to /prod/xxx.old.ext -Item n: Implement Catalog directive for Pool resource in Director -configuration - Origin: Alan Davis adavis@ruckus.com - Date: 6 March 2007 - Status: Submitted - - What: The current behavior is for the director to create all pools - found in the configuration file in all catalogs. Add a - Catalog directive to the Pool resource to specify which - catalog to use for each pool definition. - - Why: This allows different catalogs to have different pool - attributes and eliminates the side-effect of adding - pools to catalogs that don't need/use them. +========= New Items since the last vote ================= + +Item 26: Add a new directive to bacula-dir.conf which permits inclusion of all subconfiguration files in a given directory +Date: 18 October 2008 +Origin: Database, Lda. Maputo, Mozambique +Contact:Cameron Smith / cameron.ord@database.co.mz +Status: New request + +What: A directive something like "IncludeConf = /etc/bacula/subconfs" Every + time Bacula Director restarts or reloads, it will walk the given + directory (non-recursively) and include the contents of any files + therein, as though they were appended to bacula-dir.conf + +Why: Permits simplified and safer configuration for larger installations with + many client PCs. Currently, through judicious use of JobDefs and + similar directives, it is possible to reduce the client-specific part of + a configuration to a minimum. The client-specific directives can be + prepared according to a standard template and dropped into a known + directory. However it is still necessary to add a line to the "master" + (bacula-dir.conf) referencing each new file. This exposes the master to + unnecessary risk of accidental mistakes and makes automation of adding + new client-confs, more difficult (it is easier to automate dropping a + file into a dir, than rewriting an existing file). 
Kern has previously
+     made a convincing argument for NOT including Bacula's core configuration
+     in an RDBMS, but I believe that the present request is a reasonable
+     extension to the current "flat-file-based" configuration philosophy.
+
+Notes: There is NO need for any special syntax in these files.  They should
+       contain standard directives which are simply "inlined" into the parent
+       file, as already happens when you explicitly reference an external file.
+
+Notes: (kes) this can already be done with scripting
+     From: John Jorgensen
+     The bacula-dir.conf at our site contains these lines:
+
+   #
+   # Include subfiles associated with configuration of clients.
+   # They define the bulk of the Clients, Jobs, and FileSets.
+   #
+   @|"sh -c 'for f in /etc/bacula/clientdefs/*.conf ; do echo @${f} ; done'"
+
+   and when we get a new client, we just put its configuration into
+   a new file called something like:
+
+   /etc/bacula/clientdefs/clientname.conf
+
+
+ Item n:    List inChanger flag when doing restore.
+ Origin:    Jesper Krogh
+ Date:      17 October 2008
+ Status:
+
+ What: When doing a restore, the restore selection dialog ends by printing
+       a summary like this:
+
+       The job will require the following
+          Volume(s)                 Storage(s)                SD Device(s)
+       ===========================================================================
+          000741L3                  LTO-4                     LTO3
+          000866L3                  LTO-4                     LTO3
+          000765L3                  LTO-4                     LTO3
+          000764L3                  LTO-4                     LTO3
+          000756L3                  LTO-4                     LTO3
+          001759L3                  LTO-4                     LTO3
+          001763L3                  LTO-4                     LTO3
+          001762L3                  LTO-4                     LTO3
+          001767L3                  LTO-4                     LTO3
+
+       When using an autochanger, it would be really nice to have an
+       inChanger column, so the operator would know whether this restore
+       job will stop and wait for operator intervention. This could be done
+       simply by selecting the inChanger flag from the catalog and printing
+       it in a separate column.
+
+ Why:  This would help get large restores through, by minimizing the time
+       spent waiting for an operator to drop by and change tapes in the
+       library.
+
+ Notes: [Kern] I think it would also be good to have the Slot as well, or
+       some indication that Bacula thinks the volume is in the autochanger,
+       because that depends on both the InChanger flag and the Slot being
+       valid.
+
+
+Item  1:  Implement an interface between Bacula and Amazon's S3.
+  Date:  25 August 2008
+  Origin: Soren Hansen
+  Status: Not started.
+  What:  Enable the storage daemon to store backup data on Amazon's
+         S3 service.
+
+  Why:   Amazon's S3 is a cheap way to store data off-site. Current
+         ways to integrate Bacula and S3 involve storing all the data
+         locally and syncing it to S3, then manually fetching it
+         again when it is needed. This is very cumbersome.
+
+
+Item 1: enable/disable compression depending on storage device (disk/tape)
+  Origin: Ralf Gross ralf-lists@ralfgross.de
+  Date:  2008-01-11
+  Status: Initial Request
+
+  What: Add a new option to the storage resource of the director.  Depending
+        on this option, compression will be enabled/disabled for a device.
+
+  Why: If different devices (disks/tapes) are used for full/diff/incr
+       backups, software compression will be enabled for all backups
+       because of the FileSet compression option.  For backup to tapes
+       which are able to do hardware compression this is not desired.
+
+  Notes:
+    http://news.gmane.org/gmane.comp.sysutils.backup.bacula.devel/cutoff=11124
+    It must be clear to the user that the FileSet compression option
+    must still be enabled to use compression for a backup job at all.
+    Thus a name for the new option in the director must be
+    well-defined.
+
+  Notes: KES I think the Storage definition should probably override what
+         is in the Job definition or vice-versa, but in any case, it must
+         be well defined.
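+
+  Notes: As a sketch, assuming the option ends up in the Director's
+         Storage resource, it might look roughly like this (the directive
+         name and values here are hypothetical, nothing is implemented):
+
+            Storage {
+              Name = LTO4-Library
+              Address = sd.example.org      # illustrative address
+              Device = LTO4-Drive
+              Media Type = LTO-4
+              # the drive compresses in hardware, so turn software
+              # (FileSet-level) compression off for jobs using it
+              Allow Compression = no
+            }
+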
+
+
+Item 1:   Backup and Restore of Windows Encrypted Files through raw
+          encryption functions
+
+  Origin: Michael Mohr, SAG  Mohr.External@infineon.com
+
+  Date:   22 February 2008
+
+  Status:
+
+  What:   Make it possible to back up and restore encrypted files from
+          and to Windows systems without the need to decrypt them, by
+          using the raw encryption functions API (see:
+          http://msdn2.microsoft.com/en-us/library/aa363783.aspx)
+          that is provided for that reason by Microsoft.
+          Whether a file is encrypted can be determined by evaluating the
+          FILE_ATTRIBUTE_ENCRYPTED flag returned by the GetFileAttributes
+          function.
+
+  Why:    Without this interface, the fd-daemon running under the system
+          account can't read encrypted files because it lacks the key
+          needed for decryption.  As a result, encrypted files are
+          currently not backed up by bacula, and no error is shown for
+          these missed files.
+
+  Notes:  ./.
+
+Item 1:   Possibility to schedule Jobs on last Friday of the month
+  Origin: Carsten Menke
+  Date:   02 March 2008
+  Status:
+
+  What:   Currently, if you want to run your monthly backups on the last
+          Friday of each month, this is only possible with workarounds
+          (e.g. scripting), as some months have 4 Fridays and some have
+          5.  The same is true if you plan to run your yearly backups on
+          the last Friday of the year.  It would be nice to have the
+          ability to use the builtin scheduler for this.
+
+  Why:    In many companies the last working day of the week is Friday
+          (or Saturday), so to get the most data of the month onto the
+          monthly tape, the employees are advised to insert the tape for
+          the monthly backups on the last Friday of the month.
+
+  Notes:  To give this complete functionality, it would be nice if the
+          "first" and "last" keywords could be implemented in the
+          scheduler, so it would also be possible to run monthly backups
+          on the first Friday of the month, and more.  If the syntax were
+          expanded to {first|last} {Month|Week|Day|Mo-Fri} of the
+          {Year|Month|Week}, you would be able to define really flexible
+          jobs.
+
+          To get a certain Job to run on the last Friday of the month,
+          for example, one could then write
+
+             Run = pool=Monthly last Fri of the Month at 23:50
+
+             ## Yearly Backup
+
+             Run = pool=Yearly last Fri of the Year at 23:50
+
+             ## Certain Jobs the last Week of a Month
+
+             Run = pool=LastWeek last Week of the Month at 23:50
+
+             ## Monthly Backup on the last day of the month
+
+             Run = pool=Monthly last Day of the Month at 23:50
+
+Item  n:  Add a "minimum spool size" directive to the Storage daemon
+
+  Date:   20 March 2008
+
+  Origin: Frank Sweetser
+
+  What:   Add a new SD directive, "minimum spool size" (or similar).
+          This directive would specify a minimum level of free space
+          available for spooling.  If the unused spool space is less than
+          this level, any new spooling requests would be blocked as if
+          the "maximum spool size" threshold had been reached.  Jobs that
+          are already spooling would be unaffected by this directive.
+
+  Why:    I've been bitten by this scenario a couple of times:
+
+          Assume a maximum spool size of 100M.  Two concurrent jobs, A
+          and B, are both running.  Due to timing quirks and previously
+          running jobs, job A has used 99.9M of space in the spool
+          directory.  While A is busy despooling, B is happily using the
+          remaining 0.1M of spool space.  This ends up in a spool/despool
+          sequence for every 0.1M of data.  In addition to fragmenting
+          the data on the volume far more than was necessary, in larger
+          data sets (i.e., tens or hundreds of gigabytes) it can easily
+          produce multi-megabyte report emails!
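+
+  Notes:  A sketch of how the proposed directive might sit alongside the
+          existing spool directives in the SD's Device resource; the
+          "Minimum Spool Size" name is the requester's suggestion, not an
+          existing option:
+
+             Device {
+               Name = LTO4-Drive-1
+               ...
+               Spool Directory = /var/spool/bacula
+               Maximum Spool Size = 100 MB   # existing directive
+               Minimum Spool Size = 20 MB    # proposed: block new
+                                             # spooling requests while
+                                             # free spool space is below
+                                             # this level
+             }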
+
+Item  n:  Expand the Verify Job capability to verify Jobs older than the
+          last one (for VolumeToCatalog Jobs)
+  Date:   17 January 2008
+  Origin: portrix.net Hamburg, Germany.
+  Contact: Christian Sabelmann
+  Status: 70% of the required code is part of the Verify function since
+          v. 2.x
+
+  What:
+  The ability to tell Bacula which Job should be verified instead of
+  automatically verifying just the last one.
+
+  Why:
+  It is a pity that such a powerful feature as Verify Jobs
+  (VolumeToCatalog) is restricted to the last backup Job of a client.
+  Users who have to do daily backups are forced to also run daily Verify
+  Jobs in order to take advantage of this useful feature.  This daily
+  verify-after-backup routine is not always desired, and Verify Jobs
+  sometimes have to be scheduled elsewhere (not necessarily in Bacula).
+  With this feature, admins could verify jobs once a week or a few times
+  a month, selecting the Jobs they want to verify.  This feature should
+  also not be too difficult to implement, taking into account older bug
+  reports about this feature and the selection of the Job to be
+  verified.
+
+  Notes: For the verify Job, the user could select the Job to be
+  verified from a list of the latest Jobs of a client.  It would also be
+  possible to verify a certain volume.  All of this would naturally
+  apply only to Jobs whose file information is still in the catalog.
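+
+  Notes: In bconsole this could look something like the sketch below;
+  the jobid= keyword on the run command is hypothetical:
+
+     # verify a specific older job instead of the most recent one
+     *run job=VerifyClient1 level=VolumeToCatalog jobid=1234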
+
+Item X:   Add EFS support on Windows
+  Origin: Alex Ehrlich (Alex.Ehrlich-at-mail.ee)
+  Date:   05 August 2008
+  Status:
+
+  What:   For each file backed up or restored by the FD on Windows, check
+          if the file is encrypted; if so, then use OpenEncryptedFileRaw,
+          ReadEncryptedFileRaw, WriteEncryptedFileRaw and
+          CloseEncryptedFileRaw instead of the BackupRead and BackupWrite
+          API calls.
+
+  Why:    Many laptop users utilize the EFS functionality today; so do
+          some non-laptop ones, too.
+          Currently, files encrypted by means of EFS cannot be backed up.
+          It means a Windows shop cannot rely on Bacula as its backup
+          solution, at least when using Windows 2K, XP Pro, Vista, etc.
+          on workstations, unless EFS is forbidden by policies.
+          The current situation might result in a "false sense of
+          security" among the end-users.
+
+  Notes:  Using the xxxEncryptedFileRaw API would allow backing up and
+          restoring EFS-encrypted files without decrypting their data.
+          Note that such files cannot be restored "portably" (at least,
+          easily), but they would be restorable to a different (or
+          reinstalled) Win32 machine; the restore would require setup
+          of an EFS recovery agent in advance, of course, and this shall
+          be clearly reflected in the documentation, but this is the
+          normal Windows SysAdmin's business.
+          When a "portable" backup is requested, the EFS-encrypted files
+          shall be clearly reported as errors.
+          See MSDN on the "Backup and Restore of Encrypted Files" topic:
+          http://msdn.microsoft.com/en-us/library/aa363783.aspx
+          Maybe the EFS support requires a new flag in the database for
+          each file, too?
+          Unfortunately, the implementation is not as straightforward as
+          a 1-to-1 replacement of BackupRead with ReadEncryptedFileRaw,
+          requiring some FD code rewrite to work with
+          encrypted-file-related callback functions.
+
+Item  n:  Data encryption on storage daemon
+  Origin: Tobias Barth
+  Date:   04 February 2009
+  Status: new
+
+  What:   The storage daemon should be able to do the data encryption
+          that can currently be done by the file daemon.
+
+  Why:    This would have 2 advantages: 1) one could encrypt the data of
+          unencrypted tapes by doing a migration job, and 2) the storage
+          daemon would be the only machine that would have to keep the
+          encryption keys.
+
+
+Item 1:   "Maximum Concurrent Jobs" for drives when used with changer
+          device
+  Origin: Ralf Gross ralf-lists ralfgross.de
+  Date:   2008-12-12
+  Status: Initial Request
+
+  What:   Respect the "Maximum Concurrent Jobs" directive in the _drives_
+          Storage section in addition to the changer section.
+
+  Why:    I have a 3-drive changer where I want to be able to let 3
+          concurrent jobs run in parallel, but only one job per drive at
+          the same time.  Right now I don't see how I could limit the
+          number of concurrent jobs per drive in this situation.
+
+  Notes:  Using different priorities for these jobs leads to the problem
+          that other jobs are blocked.  On the user list I got the advice
+          to use the "Prefer Mounted Volumes" directive, but Kern advised
+          against using "Prefer Mounted Volumes" in another thread:
+          http://article.gmane.org/gmane.comp.sysutils.backup.bacula.devel/11876/
+
+          In addition, I'm not sure if this would be the same as
+          respecting the drive's "Maximum Concurrent Jobs" setting.
+
+          Example:
+
+          Storage {
+            Name = Neo4100
+            Address = ....
+            SDPort = 9103
+            Password = "wiped"
+            Device = Neo4100
+            Media Type = LTO4
+            Autochanger = yes
+            Maximum Concurrent Jobs = 3
+          }
+
+          Storage {
+            Name = Neo4100-LTO4-D1
+            Address = ....
+            SDPort = 9103
+            Password = "wiped"
+            Device = ULTRIUM-TD4-D1
+            Media Type = LTO4
+            Maximum Concurrent Jobs = 1
+          }
+
+          [2 more drives]
+
+          The "Maximum Concurrent Jobs = 1" directive in the drive's
+          section is ignored.
+
+Item  n:  Add MaxVolumeSize/MaxVolumeBytes statement to Storage resource
+  Origin: Bastian Friedrich
+  Date:   2008-07-09
+  Status: -
+
+  What:   The SD has a "Maximum Volume Size" statement, which is
+          deprecated and superseded by the Pool resource statement
+          "Maximum Volume Bytes".  It would be good if either statement
+          could be used in Storage resources.
+
+  Why:    Pools do not have to be restricted to a single storage
+          type/device; thus, it may be impossible to define Maximum
+          Volume Bytes in the Pool resource.  The old MaxVolSize
+          statement is deprecated, as it is SD-side only.
+          I am using the same pool for different devices.
+
+  Notes:  State of the idea currently unknown.  Storage resources in the
+          dir config currently translate to very slim catalog entries;
+          these entries would require extensions to implement what is
+          described here.  Quite possibly, numerous other statements that
+          are currently available in Pool resources could be used in
+          Storage resources as well.
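+
+  Notes:  The proposal amounts to also accepting the Pool-level directive
+          in a Director Storage resource, roughly as sketched below
+          (hypothetical placement, not currently valid):
+
+             Storage {
+               Name = LTO4-changer
+               ...
+               Maximum Volume Bytes = 800G   # proposed per-storage limit
+             }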
+
+Item 1:   Start spooling even when waiting on tape
+  Origin: Tobias Barth
+  Date:   25 April 2008
+  Status:
+
+  What:   If a job can be spooled to disk before writing it to tape, it
+          should be spooled immediately.  Currently, bacula waits until
+          the correct tape is inserted into the drive.
+
+  Why:    It could save hours.  While bacula is waiting on the operator
+          who must insert the correct tape (e.g. a new tape or a tape
+          from another media pool), bacula could already prepare the
+          spooled data in the spooling directory and immediately start
+          despooling when the tape is inserted by the operator.
+
+          2nd step: Use 2 or more spooling directories.  When one
+          directory is currently despooling, the next (on different disk
+          drives) could already be spooling the next data.
+
+  Notes:  I am using bacula 2.2.8, which has none of those features
+          implemented.
+
+Item 1:   Enable persistent naming/numbering of SQL queries
+
+  Date:   24 Jan, 2007
+  Origin: Mark Bergman
+  Status:
+
+  What:
+        Change the parsing of the query.sql file and the query command so
+        that queries are named/numbered by a fixed value, not their order
+        in the file.
+
+  Why:
+        One of the real strengths of bacula is the ability to query the
+        database, and the fact that complex queries can be saved and
+        referenced from a file is very powerful.  However, the choice
+        of query (both for interactive use, and by scripting input
+        to the bconsole command) is completely dependent on the order
+        within the query.sql file.  The descriptive labels are helpful
+        for interactive use, but users become used to calling a
+        particular query "by number", or may use scripts to execute
+        queries.  This presents a problem if the number or order of
+        queries in the file changes.
+
+        If the query.sql file used the numeric tags as a real value
+        (rather than a comment), then users could have higher confidence
+        that they are executing the intended query, and that their local
+        changes wouldn't conflict with future bacula upgrades.
+
+        For scripting, it's very important that the intended query is
+        what's actually executed.  The current method of parsing the
+        query.sql file discourages scripting because the addition or
+        deletion of queries within the file will require corresponding
+        changes to scripts.  It may not be obvious to users that deleting
+        query "17" in the query.sql file will require changing all
+        references to higher numbered queries.  Similarly, when new
+        bacula distributions change the number of "official" queries,
+        user-developed queries cannot simply be appended to the file
+        without also changing any references to those queries in scripts
+        or procedural documentation, etc.
+
+        In addition, using fixed numbers for queries would encourage more
+        user-initiated development of queries, by supporting conventions
+        such as:
+
+              queries numbered 1-50 are supported/developed/distributed
+              with official bacula releases
+
+              queries numbered 100-200 are community contributed, and are
+              related to media management
+
+              queries numbered 201-300 are community contributed, and are
+              related to checksums, finding duplicated files across
+              different backups, etc.
+
+              queries numbered 301-400 are community contributed, and are
+              related to backup statistics (average file size, size per
+              client per backup level, time for all clients by backup
+              level, storage capacity by media type, etc.)
+
+              queries numbered 500-999 are locally created
+
+  Notes:
+        Alternatively, queries could be called by keyword (tag), rather
+        than by number.
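+
+        As a sketch (the tag syntax below is hypothetical, not the
+        current query.sql format), each entry could carry a fixed tag
+        that the query command resolves regardless of its position in
+        the file:
+
+           # query.sql -- the number is a fixed tag, not a position
+           :101 List Volumes Bacula thinks are in the autochanger
+           SELECT VolumeName,Slot,InChanger FROM Media WHERE InChanger=1;
+
+        In bconsole, "query 101" would then always run the query tagged
+        101, even after entries are added or removed.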
+
+Item 1:   Implementation of running Job speed limit.
+  Origin: Alex F, alexxzell at yahoo dot com
+  Date:   29 January 2009
+
+  What:   I noticed the need for an integrated bandwidth limiter for
+          running jobs.  It would be very useful just to specify another
+          field in bacula-dir.conf, like speed = how much speed you wish
+          for that specific job to run at.
+
+  Why:    Because of a couple of reasons.  First, it's very hard to
+          implement a traffic shaping utility and also make it reliable.
+          Second, it is very uncomfortable to have to deploy such apps
+          to, let's say, 50 clients (including desktops and servers).
+          This would also be unreliable because you have to make sure
+          that the apps are properly working when needed; users could
+          also disable them (accidentally or not).  It would be very
+          useful to provide Bacula with this ability.  All information
+          would be centralized; you would not have to go to 50 different
+          clients in 10 different locations for configuration.
+          Eliminating 3rd-party additions also helps in establishing
+          efficiency, and it would avoid bandwidth congestion, especially
+          where there is little available.
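+
+  Notes:  Conceptually this could be a per-job directive as sketched
+          below; the directive name is hypothetical and does not exist
+          today:
+
+             Job {
+               Name = "BigNightlyBackup"
+               ...
+               Maximum Bandwidth = 2MB/s   # proposed cap for this job's
+                                           # data stream
+             }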
+
+
+============= Empty Feature Request form ===========
+Item  n:  One line summary ...
+  Date:   Date submitted
+  Origin: Name and email of originator.
+  Status:
+
+  What:   More detailed explanation ...
+
+  Why:    Why it is important ...
+
+  Notes:  Additional notes or features (omit if not used)
+============== End Feature Request form ==============
+
+
+========== Items put on hold by Kern ============================
+
+Item h2:  Implement support for stacking arbitrary stream filters, sinks.
+  Date:   23 November 2006
+  Origin: Landon Fuller
+  Status: Planning.  Assigned to landonf.
+
+  What:   Implement support for the following:
+          - Stacking arbitrary stream filters (eg, encryption,
+            compression, sparse data handling)
+          - Attaching file sinks to terminate stream filters (ie, write
+            out the resultant data to a file)
+          - Refactor the restoration state machine accordingly
+
+  Why:    The existing stream implementation suffers from the following:
+          - All state (compression, encryption, stream restoration) is
+            global across the entire restore process, for all streams.
+            There are multiple entry and exit points in the restoration
+            state machine, and thus multiple places where state must be
+            allocated, deallocated, initialized, or reinitialized.  This
+            results in exceptional complexity for the author of a stream
+            filter.
+          - The developer must enumerate all possible combinations of
+            filters and stream types (ie, win32 data with encryption,
+            without encryption, with encryption AND compression, etc).
+
+  Notes:  This feature request only covers implementing the stream
+          filters/sinks, and refactoring the file daemon's restoration
+          implementation accordingly.  If I have extra time, I will also
+          rewrite the backup implementation.  My intent in implementing
+          the restoration first is to solve pressing bugs in the
+          restoration handling, and to ensure that the new restore
+          implementation handles existing backups correctly.
+
+          I do not plan on changing the network or tape data structures
+          to support defining arbitrary stream filters, but supporting
+          that functionality is the ultimate goal.
+
+          Assistance with either code or testing would be fantastic.
+
+  Notes:  Kern: this project has a lot of merit, and we need to do it,
+          but it is really an issue for developers rather than a new
+          feature for users, so I have removed it from the voting list,
+          but kept it here; at some point, it will be implemented.
+
+Item h3:  Filesystem watch triggered backup.
+  Date:   31 August 2006
+  Origin: Jesper Krogh
+  Status:
+
+  What:   With inotify and similar filesystem-triggered notification
+          systems, it is possible to have the file daemon monitor
+          filesystem changes and initiate a backup.
+
+  Why:    There are 2 situations where this is nice to have.
+          1) It is possible to get a much finer-grained backup than
+             the fixed schedules used now.  A file created and deleted
+             a few hours later can automatically be caught.
+
+          2) The load introduced on the system will probably be
+             distributed more evenly across the system.
+
+  Notes:  This can be combined with configuration that specifies
+          something like: "at most every 15 minutes or when changes
+          consumed XX MB".
+
+  Kern Notes: I would rather see this implemented by an external program
+          that monitors the filesystem changes, then uses the console
+          to start a backup.
-
-Item  n:  Implement NDMP protocol support
+
+Item h4:  Directive/mode to backup only file changes, not entire file
+  Date:   11 November 2005
+  Origin: Joshua Kugler
+          Marek Bajon
+  Status:
+
+  What:   Currently when a file changes, the entire file will be backed
+          up in the next incremental or full backup.  To save space on
+          the tapes it would be nice to have a mode whereby only the
+          changes to the file would be backed up when it is changed.
+
+  Why:    This would save lots of space when backing up large files such
+          as logs, mbox files, Outlook PST files and the like.
+
+  Notes:  This would require the usage of disk-based volumes, as
+          comparing files would not be feasible using a tape drive.
+
+  Notes:  Kern: I don't know how to implement this.  Put on hold until
+          someone provides a detailed implementation plan.
+
+
+Item h5:  Implement multiple numeric backup levels as supported by dump
+  Date:   3 April 2006
+  Origin: Daniel Rich
+  Status:
+
+  What:   Dump allows specification of backup levels numerically instead
+          of just "full", "incr", and "diff".  In this system, at any
+          given level, all files are backed up that were modified since
+          the last backup of a higher level (with 0 being the highest and
+          9 being the lowest).  A level 0 is therefore equivalent to a
+          full, level 9 an incremental, and the levels 1 through 8 are
+          varying levels of differentials.  For bacula's sake, these
+          could be represented as "full", "incr", and "diff1", "diff2",
+          etc.
+
+  Why:    Support of multiple backup levels would provide for more
+          advanced backup rotation schemes such as "Towers of Hanoi".
+          This would allow better flexibility in performing backups, and
+          can lead to shorter recovery times.
+
+  Notes:  Legato Networker supports a similar system with full, incr,
+          and 1-9 as levels.
+
+  Notes:  Kern: I don't see the utility of this, and it would be a *huge*
+          modification to existing code.
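+
+  Notes:  As a sketch, with the proposed "diff1"/"diff2" naming a
+          multi-level rotation might be written like this (the diff1 and
+          diff2 levels are hypothetical):
+
+             Schedule {
+               Name = "MultiLevelCycle"
+               Run = Level=Full 1st sun at 23:05
+               Run = Level=diff1 2nd sun at 23:05
+               Run = Level=diff2 3rd sun at 23:05
+               Run = Level=diff1 4th sun at 23:05
+             }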
+
+Item h6:  Implement NDMP protocol support
  Origin: Alan Davis
  Date:   06 March 2007
-  Status: Submitted
+  Status:
 
  What:   Network Data Management Protocol is implemented by a number of
  NAS filer vendors to enable backups using third-party
@@ -1222,46 +1202,149 @@ Item  n:  Implement NDMP protocol support
  reference implementation from Traakan is known to compile on
  Solaris 10.
 
-  Notes (Kern): I am not at all in favor of this until NDMP becomes
+  Notes:  Kern: I am not at all in favor of this until NDMP becomes
          an Open Standard or until there are Open Source libraries
         that interface to it.
 
-Item  n:  make changing "spooldata=yes|no" possible for
-          manual/interactive jobs
+Item h7:  Commercial database support
+  Origin: Russell Howe
+  Date:   26 July 2006
+  Status:
+
+  What:   It would be nice for the database backend to support more
+          databases.  I'm thinking of SQL Server at the moment, but I
+          guess Oracle, DB2, MaxDB, etc are all candidates.  SQL Server
+          would presumably be implemented using FreeTDS or maybe an ODBC
+          library?
-Origin: Marc Schiffbauer
+
+  Why:    We only really have one database server, which is MS SQL Server
+          2000.  Maintaining a second one just for the backup software is
+          a burden (we grew out of SQLite, which I liked, but which
+          didn't work so well with our database size).  We don't really
+          have a machine with the resources to run postgres, and would
+          rather only maintain a single DBMS.  We're stuck with SQL
+          Server because pretty much all the company's custom
+          applications (written by consultants) are locked into SQL
+          Server 2000.  I can imagine this scenario is fairly common, and
+          it would be nice to use the existing properly specced database
+          server for storing Bacula's catalog, rather than having to run
+          a second DBMS.
+
+  Notes:  This might be nice, but someone other than me will probably
+          need to implement it, and at the moment, proprietary code
+          cannot legally be mixed with Bacula GPLed code.  This would be
+          possible only if the vendors provide GPLed (or OpenSource)
+          interface code.
+
+Item h8:  Incorporation of XACML2/SAML2 parsing
+  Date:   19 January 2006
+  Origin: Adam Thornton
+  Status: Blue sky
-Date: 12 April 2007)
+
+  What:   XACML is the "eXtensible Access Control Markup Language" and
+          SAML is the "Security Assertion Markup Language"--an XML
+          standard for making statements about identity and
+          authorization.  Having these would give us a framework to
+          approach ACLs in a generic manner, and in a way flexible enough
+          to support the four major sorts of ACLs I see as a concern to
+          Bacula at this point, as well as (probably) to deal with new
+          sorts of ACLs that may appear in the future.
+
+  Why:    Bacula is beginning to need to back up systems with ACLs that
+          do not map cleanly onto traditional Unix permissions.  I see
+          four sets of ACLs--in general, mutually incompatible with one
+          another--that we're going to need to deal with.  These are:
+          NTFS ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs.  (Some may
+          question the relevance of AFS; AFS is one of Sine Nomine's core
+          consulting businesses, and having a reputable file-level backup
+          and restore technology for it (as Tivoli is probably going to
+          drop AFS support soon since IBM no longer supports AFS) would
+          be of huge benefit to our customers; we'd most likely create
+          the AFS support at Sine Nomine for inclusion into the Bacula
+          (and perhaps some changes to the OpenAFS volserver) core code.)
-Status: NEW
+
+          Now, obviously, Bacula already handles NTFS just fine.
+          However, I think there's a lot of value in implementing a
+          generic ACL model, so that it's easy to support whatever
+          particular instances of ACLs come down the pike: POSIX ACLs
+          (think SELinux) and NFSv4 are the obvious things arriving in
+          the Linux world in a big way in the near future.  XACML,
+          although overcomplicated for our needs, provides this
+          framework, and we should be able to leverage other people's
+          implementations to minimize the amount of work *we* have to do
+          to get a generic ACL framework.  Basically, the costs of
+          implementation are high, but they're largely both external to
+          Bacula and already sunk.
+
+  Notes:  As you indicate, this is a bit of "blue sky", or in other
+          words, at the moment it is a bit esoteric to consider for
+          Bacula.
+
+Item h9:  Archive data
+  Date:   15/5/2006
+  Origin: calvin streeting calvin at absentdream dot com
+  Status:
-What: Make it possible to modify the spooldata option
-      for a job when being run from within the console.
-      Currently it is possible to modify the backup level
-      and the spooldata setting in a Schedule resource.
-      It is also possible to modify the backup level when using
-      the "run" command in the console.
-      But it is currently not possible to to the same
-      with "spooldata=yes|no" like:
+
+  What:   The ability to archive to media (dvd/cd) in an uncompressed
+          format for dead filing (archiving, not backing up)
-         run job=MyJob level=incremental spooldata=yes
+
+  Why:    At work, when jobs are finished they are moved off of the main
+          file servers (raid based systems) onto a simple Linux file
+          server (ide based system) so users can find old information
+          without contacting the IT dept.
-Why: In some situations it would be handy to be able to switch
-     spooldata on or off for interactive/manual jobs based on
-     which data the admin expects or how fast the LAN/WAN
-     connection currently is.
+
+          So this data doesn't really change, it only gets added to, but
+          it also needs backing up.  At the moment it takes about 8 hours
+          to back up our servers (working data), so rather than add more
+          time to existing backups I am trying to implement a system
+          where we back up the archive data to cd/dvd.  These disks would
+          only need to be appended to (burn only new/changed files to new
+          disks for off-site storage).  Basically, the system should
+          understand the difference between archive data and live data.
-Notes: ./.
+
+  Notes:  Scan the data and email me when it needs burning; divide it
+          into predefined chunks; keep a record of what is on what disk;
+          make me a label (simple php->mysql=>pdf stuff--I could do this
+          bit); the ability to save data uncompressed so it can be read
+          on any other system (to future-proof the data); save the
+          catalog with the disk as some kind of menu system.
+
+  Notes:  Kern: I don't understand this item, and in any case, if it is
+          specific to DVD/CDs, which we do not recommend using, it is
+          unlikely to be implemented except as a user submitted patch.
+
+
+Item h10: Clustered file-daemons
+  Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
+  Date:   24 July 2006
+  Status:
+
+  What:   A "virtual" filedaemon, which is actually a cluster of real
+          ones.
+
+  Why:    In the case of clustered filesystems (SAN setups, GFS, or
+          OCFS2, etc) multiple machines may have access to the same set
+          of filesystems.
+
+          For performance reasons, one may wish to initiate backups from
+          several of these machines simultaneously, instead of just using
+          one backup source for the common clustered filesystem.
+
+          For obvious reasons, normally backups of $A-FD/$PATH and
+          $B-FD/$PATH are treated as different backup sets.  In this case
+          they are the same communal set.
+
+          Likewise, when restoring, it would be easier to just specify
+          one of the cluster machines and let bacula decide which to use.
+
+          This can be faked to some extent using DNS round robin entries
+          and a virtual IP address; however, it means "status client"
+          will always give bogus answers.  Additionally, there is no way
+          of spreading the load evenly among the servers.
+
+          What is required is something similar to the storage daemon
+          autochanger directives, so that Bacula can keep track of
+          operating backups/restores and direct new jobs to a "free"
+          client.
+
+  Notes:  Kern: I don't understand the request enough to be able to
+          implement it.  A lot more design detail should be presented
+          before voting on this project.
+
+  Feature Request Form