   1
   2 Projects:
   3                      Bacula Projects Roadmap
   4                 Prioritized by user vote 07 December 2005
   5                     Status updated 30 July 2006
   6
   7 Summary:
   8 Item  1:  Implement data encryption (as opposed to comm encryption)
   9 Item  2:  Implement Migration that moves Jobs from one Pool to another.
  10 Item  3:  Accurate restoration of renamed/deleted files from Incremental/Differential backups
  11 Item  4:  Implement a Bacula GUI/management tool using Python.
  12 Item  5:  Implement Base jobs.
  13 Item  6:  Allow FD to initiate a backup
  14 Item  7:  Improve Bacula's tape and drive usage and cleaning management.
  15 Item  8:  Implement creation and maintenance of copy pools
  16 Item  9:  Implement new {Client}Run{Before|After}Job feature.
  17 Item 10:  Merge multiple backups (Synthetic Backup or Consolidation).
  18 Item 11:  Deletion of Disk-Based Bacula Volumes
  19 Item 12:  Directive/mode to backup only file changes, not entire file
  20 Item 13:  Multiple threads in file daemon for the same job
  21 Item 14:  Implement red/black binary tree routines.
  22 Item 15:  Add support for FileSets in user directories  CACHEDIR.TAG
  23 Item 16:  Implement extraction of Win32 BackupWrite data.
  24 Item 17:  Implement a Python interface to the Bacula catalog.
  25 Item 18:  Archival (removal) of User Files to Tape
  26 Item 19:  Add Plug-ins to the FileSet Include statements.
  27 Item 20:  Implement more Python events in Bacula.
  28 Item 21:  Quick release of FD-SD connection after backup.
  29 Item 22:  Permit multiple Media Types in an Autochanger
  30 Item 23:  Allow different autochanger definitions for one autochanger.
  31 Item 24:  Automatic disabling of devices
  32 Item 25:  Implement huge exclude list support using hashing.
  33
  34
  35 Below, you will find more information on future projects:
  36
  37 Item  1:  Implement data encryption (as opposed to comm encryption)
  38   Date:   28 October 2005
  39   Origin: Sponsored by Landon and 13 contributors to EFF.
  40   Status: Done: Landon Fuller has implemented this in 1.39.x.
  41
  42   What:   Currently the data that is stored on the Volume is not
  43           encrypted. For confidentiality, encryption of data at
  44           the File daemon level is essential.
  45           Data encryption encrypts the data in the File daemon and
  46           decrypts the data in the File daemon during a restore.
  47
  48   Why:    Large sites require this.
  49
  50 Item 2:   Implement Migration that moves Jobs from one Pool to another.
  51   Origin: Sponsored by Riege Software International GmbH. Contact:
  52           Daniel Holtkamp <holtkamp at riege dot com>
  53   Date:   28 October 2005
  54   Status: 90% complete: Working in 1.39, more to do. Assigned to
  55           Kern.
  56
  57   What:   The ability to copy, move, or archive data that is on a
  58           device to another device is very important.
  59
  60   Why:    An ISP might want to backup to disk, but after 30 days
  61           migrate the data to tape backup and delete it from
  62           disk.  Bacula should be able to handle this
  63           automatically.  It needs to know what was put where,
  64           and when, and what to migrate -- it is a bit like
  65           retention periods.  Doing so would allow space to be
  66           freed up for current backups while maintaining older
  67           data on tape drives.
  68
  69   Notes:   Riege Software have asked for the following migration
  70            triggers:
  71            Age of Job
  72            Highwater mark (stopped by Lowwater mark?)
  73
  74   Notes:  Migration could be additionally triggered by:
  75            Number of Jobs
  76            Number of Volumes
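
The migration triggers listed above (Age of Job, Highwater/Lowwater mark) could be sketched as follows. This is only an illustration under stated assumptions: the Job fields, function names, and byte-based water marks are hypothetical and are not taken from the Bacula code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    job_id: int
    end_time: datetime
    size_bytes: int


def select_by_age(jobs, now, max_age):
    """Return the jobs that finished more than max_age before now."""
    return [job for job in jobs if now - job.end_time > max_age]


def select_by_watermark(jobs, pool_bytes, high_water, low_water):
    """Once pool usage exceeds high_water, pick the oldest jobs until the
    usage would drop to low_water or below (assumed Lowwater semantics)."""
    if pool_bytes <= high_water:
        return []
    selected = []
    for job in sorted(jobs, key=lambda j: j.end_time):
        if pool_bytes <= low_water:
            break
        selected.append(job)
        pool_bytes -= job.size_bytes
    return selected
```

Triggers based on the number of Jobs or Volumes would follow the same pattern with a count in place of the byte totals.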
  77
  78 Item  3:  Accurate restoration of renamed/deleted files from
  79           Incremental/Differential backups
  80   Date:   28 November 2005
  81   Origin: Martin Simmons (martin at lispworks dot com)
  82   Status:
  83
  84   What:   When restoring a fileset for a specified date (including "most
  85           recent"), Bacula should give you exactly the files and directories
  86           that existed at the time of the last backup prior to that date.
  87
  88           Currently this only works if the last backup was a Full backup.
  89           When the last backup was Incremental/Differential, files and
  90           directories that have been renamed or deleted since the last Full
  91           backup are not currently restored correctly.  Ditto for files with
  92           extra/fewer hard links than at the time of the last Full backup.
  93
  94   Why:    Incremental/Differential would be much more useful if this worked.
  95
  96   Notes:  Item 10 (Merging of multiple backups into a single one) seems to
  97           rely on this working, otherwise the merged backups will not be
  98           truly equivalent to a Full backup.
  99
 100           Kern: notes shortened. This can be done without the need for
 101           inodes. It is essentially the same as the current Verify job,
 102           but one additional database record must be written, which does
 103           not need any database change.
 104
 105           Kern: see if we can correct restoration of directories if
 106           replace=ifnewer is set.  Currently, if the directory does not
 107           exist, a "dummy" directory is created, then when all the files
 108           are updated, the dummy directory is newer so the real values
 109           are not updated.
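
A minimal sketch of the selection logic, assuming that every backup records the complete set of path names that existed when it ran (one possible reading of the "additional database record" mentioned above). The data layout and function name are hypothetical.

```python
def files_to_restore(backups):
    """Compute what a restore should contain.

    backups is ordered oldest to newest and starts with a Full.  Each entry
    is a pair (saved, present): saved maps path -> id of the job holding the
    data written by that backup, and present is the set of all paths that
    existed when the backup ran.  Returns path -> job id to restore from.
    """
    latest = {}
    for saved, present in backups:
        latest.update(saved)
        # Drop every path that had been deleted or renamed by this backup.
        latest = {path: job for path, job in latest.items() if path in present}
    return latest
```

With a Full containing /a, /b and /c and an Incremental that saved /a and /d while /b no longer existed, the result holds /a and /d from the Incremental and /c from the Full, and /b is not restored.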
 110
 111 Item 4:   Implement a Bacula GUI/management tool using Python.
 112   Origin: Kern
 113   Date:   28 October 2005
 114   Status: Lucas is working on this for Python GTK+.
 115
 116   What:   Implement a Bacula console, and management tools
 117           using Python and Qt or GTK.
 118
 119   Why:    Don't we already have a wxWidgets GUI?  Yes, but
 120           it is written in C++ and changes to the user interface
 121           must be hand tailored using C++ code. By developing
 122           the user interface using Qt designer, the interface
 123           can be very easily updated and most of the new Python
 124           code will be automatically created.  The user interface
 125           changes become very simple, and only the new features
 126           must be implemented.  In addition, the code will be in
 127           Python, which will give many more users easy (or easier)
 128           access to making additions or modifications.
 129
 130  Notes:   This is currently being implemented using Python-GTK by
 131           Lucas Di Pentima <lucas at lunix dot com dot ar>
 132
 133 Item 5:   Implement Base jobs.
 134   Date:   28 October 2005
 135   Origin: Kern
 136   Status:
 137
 138   What:   A base job is sort of like a Full save except that you
 139           will want the FileSet to contain only files that are
 140           unlikely to change in the future (i.e.  a snapshot of
 141           most of your system after installing it).  After the
 142           base job has been run, when you are doing a Full save,
 143           you specify one or more Base jobs to be used.  All
 144           files that have been backed up in the Base job/jobs but
 145           not modified will then be excluded from the backup.
 146           During a restore, the Base jobs will be automatically
 147           pulled in where necessary.
 148
 149   Why:    This is something none of the competition does, as far as
 150           we know (except perhaps BackupPC, which is a Perl program that
 151           saves to disk only).  It is a big win for the user: it
 152           makes Bacula stand out as offering a unique
 153           optimization that immediately saves time and money.
 154           Basically, imagine that you have 100 nearly identical
 155           Windows or Linux machines containing the OS and user
 156           files.  Now for the OS part, a Base job will be backed
 157           up once, and rather than making 100 copies of the OS,
 158           there will be only one.  If one or more of the systems
 159           have some files updated, no problem, they will be
 160           automatically restored.
 161
 162   Notes:  Huge savings in tape usage even for a single machine.
 163           Will require more resources because the DIR must send
 164           FD a list of files/attribs, and the FD must search the
 165           list and compare it for each file to be saved.
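
The comparison the FD would have to make can be sketched like this, assuming the DIR sends the Base job's file list as a mapping from path to attributes. The (size, mtime) tuple is an illustrative stand-in for whatever attributes would really be compared.

```python
def files_to_save(current, base):
    """Return the paths a Full save must still back up.

    current and base map path -> (size, mtime).  A file is skipped only
    when it is present in the Base job with identical attributes.
    """
    return sorted(path for path, attrs in current.items()
                  if base.get(path) != attrs)
```

Holding the Base list in a hash table keeps the per-file lookup cheap, which matters because the list must be consulted for each file to be saved.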
 166
 167 Item  6:  Allow FD to initiate a backup
 168   Origin: Frank Volf (frank at deze dot org)
 169   Date:   17 November 2005
 170   Status:
 171
 172    What:  Provide some means, possibly by a restricted console that
 173           allows a FD to initiate a backup, and that uses the connection
 174           established by the FD to the Director for the backup so that
 175           a Director that is firewalled can do the backup.
 176
 177    Why:   Makes backup of laptops much easier.
 178
 179 Item  7:  Improve Bacula's tape and drive usage and cleaning management.
 180   Date:   8 November 2005, November 11, 2005
 181   Origin: Adam Thornton <athornton at sinenomine dot net>,
 182           Arno Lehmann <al at its-lehmann dot de>
 183   Status:
 184
 185   What:   Make Bacula manage tape life cycle information, tape reuse
 186           times and drive cleaning cycles.
 187
 188   Why:    All three parts of this project are important when operating
 189           backups.
 190           We need to know which tapes need replacement, and we need to
 191           make sure the drives are cleaned when necessary.  While many
 192           tape libraries and even autoloaders can handle all this
 193           automatically, support by Bacula can be helpful for smaller
 194           (older) libraries and single drives.  Limiting the number of
 195           times a tape is used might prevent the errors that occur when
 196           tapes are used until the drives can't read them any more.  Also, checking
 197           drive status during operation can prevent some failures (as I
 198           [Arno] had to learn the hard way...)
 199
 200   Notes:  First, Bacula could (and even does, to some limited extent)
 201           record tape and drive usage.  For tapes, the number of mounts,
 202           the amount of data, and the time the tape has actually been
 203           running could be recorded.  Data fields for Read and Write
 204           time and Number of mounts already exist in the catalog (I'm
 205           not sure if VolBytes is the sum of all bytes ever written to
 206           that volume by Bacula).  This information can be important
 207           when determining which media to replace.  The ability to mark
 208           Volumes as "used up" after a given number of write cycles
 209           should also be implemented so that a tape is never actually
 210           worn out.  For the tape drives known to Bacula, similar
 211           information is interesting to determine the device status and
 212           expected life time: Time it's been Reading and Writing, number
 213           of tape Loads / Unloads / Errors.  This information is not yet
 214           recorded as far as I [Arno] know.  A new volume status would
 215           be necessary for the new state, like "Used up" or "Worn out".
 216           Volumes with this state could be used for restores, but not
 217           for writing. These volumes should be migrated first (assuming
 218           migration is implemented) and, once they are no longer needed,
 219           could be moved to a Trash pool.
 220
 221           The next step would be to implement a drive cleaning setup.
 222           Bacula already has knowledge about cleaning tapes.  Once it
 223           has some information about cleaning cycles (measured in drive
 224           run time, number of tapes used, or calendar days, for example)
 225           it can automatically execute tape cleaning (with an
 226           autochanger, obviously) or ask for operator assistance loading
 227           a cleaning tape.
 228
 229           The final step would be to implement TAPEALERT checks not only
 230           when changing tapes and only sending the information to the
 231           administrator, but rather checking after each tape error,
 232           checking on a regular basis (for example after each tape
 233           file), and also before unloading and after loading a new tape.
 234           Then, depending on the drive's TAPEALERT state and the known
 235           drive cleaning state Bacula could automatically schedule later
 236           cleaning, clean immediately, or inform the operator.
 237
 238           Implementing this would perhaps require another catalog change
 239           and perhaps major changes in SD code and the DIR-SD protocol,
 240           so I'd only consider this worth implementing if it would
 241           actually be used or even needed by many people.
 242
 243           Implementation of these projects could happen in three distinct
 244           sub-projects: Measuring Tape and Drive usage, retiring
 245           volumes, and handling drive cleaning and TAPEALERTs.
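
Two of the three sub-projects, retiring volumes and deciding when a drive needs cleaning, could look roughly like the sketch below. The "Used up" status name comes from the notes above; the Volume fields, thresholds, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    write_cycles: int
    status: str = "Append"


def retire_worn_volumes(volumes, max_write_cycles):
    """Mark volumes that reached the write-cycle limit as "Used up" and
    return their names.  Such volumes stay readable for restores."""
    retired = []
    for vol in volumes:
        if vol.write_cycles >= max_write_cycles and vol.status != "Used up":
            vol.status = "Used up"
            retired.append(vol.name)
    return retired


def cleaning_due(run_hours, tapes_used, max_hours, max_tapes):
    """True when either cleaning-cycle limit since the last cleaning is hit."""
    return run_hours >= max_hours or tapes_used >= max_tapes
```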
 246
 247 Item  8:  Implement creation and maintenance of copy pools
 248   Date:   27 November 2005
 249   Origin: David Boyes (dboyes at sinenomine dot net)
 250   Status:
 251
 252   What:   I would like Bacula to have the capability to write copies
 253           of backed-up data on multiple physical volumes selected
 254           from different pools without transferring the data
 255           multiple times, and to accept any of the copy volumes
 256           as valid for restore.
 257
 258   Why:    In many cases, businesses are required to keep offsite
 259           copies of backup volumes, or just wish for simple
 260           protection against a human operator dropping a storage
 261           volume and damaging it. The ability to generate multiple
 262           volumes in the course of a single backup job allows
 263           customers to simply check out one copy and send it
 264           offsite, marking it as out of changer or otherwise
 265           unavailable. Currently, the library and magazine
 266           management capability in Bacula does not make this process
 267           simple.
 268
 269           Restores would use the copy of the data on the first
 270           available volume, in order of copy pool chain definition.
 271
 272           This is also a major scalability issue -- as the number of
 273           clients increases beyond several thousand, and the volume
 274           of data increases, transferring the data multiple times to
 275           produce additional copies of the backups will become
 276           physically impossible due to transfer speed
 277           issues. Generating multiple copies at server side will
 278           become the only practical option.
 279
 280   How:    I suspect that this will require adding a multiplexing
 281           SD that appears to be a SD to a specific FD, but 1-n FDs
 282           to the specific back end SDs managing the primary and copy
 283           pools.  Storage pools will also need to acquire parameters
 284           to define the pools to be used for copies.
 285
 286   Notes:  I would commit some of my developers' time if we can agree
 287           on the design and behavior.
 288
 289 Item  9:  Implement new {Client}Run{Before|After}Job feature.
 290   Date:   26 September 2005
 291   Origin: Phil Stracchino
 292   Status: Done. This has been implemented by Eric Bollengier
 293
 294   What:   Some time ago, there was a discussion of RunAfterJob and
 295           ClientRunAfterJob, and the fact that they do not run after failed
 296           jobs.  At the time, there was a suggestion to add a
 297           RunAfterFailedJob directive (and, presumably, a matching
 298           ClientRunAfterFailedJob directive), but to my knowledge these
 299           were never implemented.
 300
 301           The current implementation does not make it easy to add new features.
 302
 303           An alternate way of approaching the problem has just occurred to
 304           me.  Suppose the RunBeforeJob and RunAfterJob directives were
 305           expanded in a manner like this example:
 306
 307           RunScript {
 308               Command = "/opt/bacula/etc/checkhost %c"
 309               RunsOnClient = No          # default
 310               AbortJobOnError = Yes      # default
 311               RunsWhen = Before
 312           }
 313           RunScript {
 314               Command = c:/bacula/systemstate.bat
 315               RunsOnClient = yes
 316               AbortJobOnError = No
 317               RunsWhen = After
 318               RunsOnFailure = yes
 319           }
 320
 321           RunScript {
 322               Command = c:/bacula/deletestatefile.bat
 323               Target = rico-fd
 324               RunsWhen = Always
 325           }
 326
 327           It's now possible to specify more than 1 command per Job.
 328           (you can stop your database and your webserver without a script)
 329
 330           Example:
 331           Job {
 332               Name = "Client1"
 333               JobDefs = "DefaultJob"
 334               Write Bootstrap = "/tmp/bacula/var/bacula/working/Client1.bsr"
 335               FileSet = "Minimal"
 336
 337               RunBeforeJob = "echo test before ; echo test before2"
 338               RunBeforeJob = "echo test before (2nd time)"
 339               RunBeforeJob = "echo test before (3rd time)"
 340               RunAfterJob = "echo test after"
 341               ClientRunAfterJob = "echo test after client"
 342
 343               RunScript {
 344                 Command = "echo test RunScript in error"
 345                 Runsonclient = yes
 346                 RunsOnSuccess = no
 347                 RunsOnFailure = yes
 348                 RunsWhen = After            # never by default
 349               }
 350               RunScript {
 351                 Command = "echo test RunScript on success"
 352                 Runsonclient = yes
 353                 RunsOnSuccess = yes # default
 354                 RunsOnFailure = no  # default
 355                 RunsWhen = After
 356               }
 357           }
 358
 359   Why:    It would be a significant change to the structure of the
 360           directives, but allows for a lot more flexibility, including
 361           RunAfter commands that will run regardless of whether the job
 362           succeeds, or RunBefore tasks that still allow the job to run even
 363           if that specific RunBefore fails.
 364
 365   Notes:  (More notes from Phil, Kern, David and Eric)
 366           I would prefer to have a single new Resource called
 367           RunScript.
 368
 369             RunsWhen = After|Before|Always
 370             RunsAtJobLevels = All|Full|Diff|Inc # not yet implemented
 371
 372           The AbortJobOnError, RunsOnSuccess and RunsOnFailure directives
 373           could be optional, and possibly RunsWhen as well.
 374
 375           AbortJobOnError would be ignored unless RunsWhen was set to Before
 376           and would default to Yes if omitted.
 377           If AbortJobOnError was set to No, failure of the script
 378           would still generate a warning.
 379
 380           RunsOnSuccess would be ignored unless RunsWhen was set to After
 381           (or RunsBeforeJob set to No), and default to Yes.
 382
 383           RunsOnFailure would be ignored unless RunsWhen was set to After,
 384           and default to No.
 385
 386           Allow having the before/after status on the script command
 387           line so that the same script can be used both before/after.
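
The defaults described in these notes can be summarized in a small decision function. This is a sketch of the rules as stated above, not the implemented code. Reading RunsWhen = Always as "run both before and after, regardless of the job result" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunScript:
    command: str
    runs_when: str = "Never"         # Before | After | Always | Never
    runs_on_client: bool = False
    runs_on_success: bool = True
    runs_on_failure: bool = False
    abort_job_on_error: bool = True


def should_run(script, phase, job_ok=True):
    """phase is "Before" or "After"; job_ok matters only for "After"."""
    if script.runs_when == "Always":
        return True
    if script.runs_when != phase:
        return False
    if phase == "Before":
        return True
    # RunsOnSuccess and RunsOnFailure only apply when RunsWhen = After.
    return script.runs_on_success if job_ok else script.runs_on_failure


def job_continues(script, exit_code):
    """After a Before script: does the job go ahead?"""
    return exit_code == 0 or not script.abort_job_on_error
```

For example, the "echo test RunScript in error" block shown earlier runs only after a failed job, because RunsOnSuccess = no and RunsOnFailure = yes.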
 388
 389 Item 10:  Merge multiple backups (Synthetic Backup or Consolidation).
 390   Origin: Marc Cousin and Eric Bollengier
 391   Date:   15 November 2005
 392   Status: Awaiting implementation. Depends on first implementing
 393           project Item 2 (Migration).
 394
 395   What:   A merged backup is a backup made without connecting to the Client.
 396           It would be a Merge of existing backups into a single backup.
 397           In effect, it is like a restore but to the backup medium.
 398
 399           For instance, say that last Sunday we made a full backup.  Then
 400           all week long, we created incremental backups, in order to do
 401           them fast.  Now comes Sunday again, and we need another full.
 402           The merged backup makes it possible to do instead an incremental
 403           backup (during the night for instance), and then create a merged
 404           backup during the day, by using the full and incrementals from
 405           the week.  The merged backup will be exactly like a full made
 406           Sunday night on the tape, but the production interruption on the
 407           Client will be minimal, as the Client will only have to send
 408           incrementals.
 409
 410           In fact, if it's done correctly, you could merge all the
 411           Incrementals into a single Incremental, or all the Incrementals
 412           and the last Differential into a new Differential, or the Full,
 413           last differential and all the Incrementals into a new Full
 414           backup.  And there is no need to involve the Client.
 415
 416   Why:    The benefits are that:
 417           - the Client just does an incremental;
 418           - the merged backup on tape is just like a single full backup,
 419             and can be restored very fast.
 420
 421           This is also a way of reducing the backup data since the old
 422           data can then be pruned (or not) from the catalog, possibly
 423           allowing older volumes to be recycled.
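
Which existing jobs feed each kind of merged backup, as described under "What", can be sketched as follows. The history layout, the one-letter level codes, and the function name are illustrative assumptions.

```python
def jobs_to_merge(history, target):
    """Pick the job ids to merge into a new backup of level target.

    history is a list of (job_id, level) ordered oldest to newest, with
    level one of "F", "D" or "I".  target is "Full", "Differential" or
    "Incremental".
    """
    levels = [level for _, level in history]

    def last_index(level, start):
        for i in range(len(history) - 1, start - 1, -1):
            if levels[i] == level:
                return i
        return None

    full = last_index("F", 0)
    if full is None:
        raise ValueError("no Full backup in history")
    diff = last_index("D", full + 1)
    base = diff if diff is not None else full
    incrementals = [job for job, level in history[base + 1:] if level == "I"]
    differential = [history[diff][0]] if diff is not None else []

    if target == "Incremental":
        return incrementals
    if target == "Differential":
        return differential + incrementals
    if target == "Full":
        return [history[full][0]] + differential + incrementals
    raise ValueError("unknown target level: %s" % target)
```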
 424
 425 Item 11:  Deletion of Disk-Based Bacula Volumes
 426   Date:   Nov 25, 2005
 427   Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
 428           by Kern)
 429   Status:
 430
 431    What:  Provide a way for Bacula to automatically remove Volumes
 432           from the filesystem, or optionally to truncate them.
 433           Obviously, the Volume must be pruned prior to removal.
 434
 435   Why:    This would allow users more control over their Volumes and
 436           prevent disk based volumes from consuming too much space.
 437
 438   Notes:  The following two directives might do the trick:
 439
 440           Volume Data Retention = <time period>
 441           Remove Volume After = <time period>
 442
 443           The migration project should also remove a Volume that is
 444           migrated. This might also work for tape Volumes.
 445
 446 Item 12:  Directive/mode to backup only file changes, not entire file
 447   Date:   11 November 2005
 448   Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
 449           Marek Bajon <mbajon at bimsplus dot com dot pl>
 450   Status:
 451
 452   What:   Currently when a file changes, the entire file will be backed up in
 453           the next incremental or full backup.  To save space on the tapes
 454           it would be nice to have a mode whereby only the changes to the
 455           file would be backed up when it is changed.
 456
 457   Why:    This would save lots of space when backing up large files such as
 458           logs, mbox files, Outlook PST files and the like.
 459
 460   Notes:  This would require the usage of disk-based volumes as comparing
 461           files would not be feasible using a tape drive.
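
One way to find the changed parts of a file is to compare per-block digests. The sketch below assumes the digests of the previous version were kept somewhere readable (for instance alongside the catalog), which is not something the request specifies. The block size and the function names are illustrative.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative value


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-1 hex digest per fixed-size block of data."""
    return [hashlib.sha1(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks that are new or differ."""
    changes = {}
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(old_digests) or old_digests[index] != digest:
            start = index * block_size
            changes[index] = new_data[start:start + block_size]
    return changes
```

For files that mostly grow at the end, such as logs and mbox files, only the last old block and the appended blocks would be saved. A real implementation would also have to record the new file length so that truncation can be restored.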
 462
 463 Item 13:  Multiple threads in file daemon for the same job
 464   Date:   27 November 2005
 465   Origin: Ove Risberg (Ove.Risberg at octocode dot com)
 466   Status:
 467
 468   What:   I want the file daemon to start multiple threads for a backup
 469           job so the fastest possible backup can be made.
 470
 471           The file daemon could parse the FileSet information and start
 472           one thread for each File entry located on a separate
 473           filesystem.
 474
 475           A configuration option in the job section should be used to
 476           enable or disable this feature. The configuration option could
 477           specify the maximum number of threads in the file daemon.
 478
 479           If the threads could spool the data to separate spool files
 480           the restore process will not be much slower.
 481
 482   Why:    Multiple concurrent backups of a large fileserver with many
 483           disks and controllers will be much faster.
 484
 485   Notes:  I am willing to try to implement this but I will probably
 486           need some help and advice.  (No problem -- Kern)
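
The grouping of File entries by filesystem and the thread limit can be sketched as below. The real File daemon is written in C++, so this only illustrates the scheduling idea. save_group is a hypothetical callback that backs up one group of paths.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def device_of_path(path):
    """Identify the filesystem holding path by its device number."""
    return os.stat(path).st_dev


def group_by_filesystem(paths, device_of=device_of_path):
    """Group File entries so each group lives on a single filesystem."""
    groups = defaultdict(list)
    for path in paths:
        groups[device_of(path)].append(path)
    return list(groups.values())


def backup_in_parallel(paths, save_group, max_threads,
                       device_of=device_of_path):
    """Run save_group once per filesystem, using at most max_threads."""
    groups = group_by_filesystem(paths, device_of)
    workers = max(1, min(max_threads, len(groups)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(save_group, groups))
```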
 487
 488 Item 14:  Implement red/black binary tree routines.
 489   Date:   28 October 2005
 490   Origin: Kern
 491   Status: Class code is complete. Code needs to be integrated into
 492           restore tree code.
 493
 494   What:   Implement a red/black binary tree class. This could
 495           then replace the current binary insert/search routines
 496           used in the in-memory restore tree.  This could significantly
 497           speed up the creation of the in-memory restore tree.
 498
 499   Why:    Performance enhancement.
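
For reference, insertion and lookup in a red/black tree can be sketched compactly using the left-leaning variant. This is an illustration of the data structure only. It is not the Bacula class, and it omits deletion.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None
        self.red = True               # new nodes are always red


def _is_red(node):
    return node is not None and node.red


def _rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.red, h.red = h.red, True
    return x


def _rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.red, h.red = h.red, True
    return x


def _insert(h, key, value):
    if h is None:
        return Node(key, value)
    if key < h.key:
        h.left = _insert(h.left, key, value)
    elif key > h.key:
        h.right = _insert(h.right, key, value)
    else:
        h.value = value
    # Restore the left-leaning red/black invariants on the way back up.
    if _is_red(h.right) and not _is_red(h.left):
        h = _rotate_left(h)
    if _is_red(h.left) and _is_red(h.left.left):
        h = _rotate_right(h)
    if _is_red(h.left) and _is_red(h.right):
        h.red, h.left.red, h.right.red = True, False, False
    return h


def _height(node):
    return 0 if node is None else 1 + max(_height(node.left), _height(node.right))


class RedBlackTree:
    def __init__(self):
        self.root = None

    def insert(self, key, value):
        self.root = _insert(self.root, key, value)
        self.root.red = False         # the root is always black

    def search(self, key):
        node = self.root
        while node is not None:
            if key == node.key:
                return node.value
            node = node.left if key < node.key else node.right
        return None

    def height(self):
        return _height(self.root)
```

Sorted input is the worst case for a plain binary tree, and path names often arrive in sorted order. A red/black tree keeps the height logarithmic even then.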
 500
 501 Item 15:  Add support for FileSets in user directories  CACHEDIR.TAG
 502   Origin: Norbert Kiesel <nkiesel at tbdnetworks dot com>
 503   Date:   21 November 2005
 504   Status: (I think this is better done using a Python event that I
 505            will implement in version 1.39.x).
 506
 507   What:   CACHEDIR.TAG is a proposal for identifying directories which
 508           should be ignored for archiving/backup.  It works by ignoring
 509           directory trees which have a file named CACHEDIR.TAG with a
 510           specific content.  See
 511           http://www.brynosaurus.com/cachedir/spec.html
 512           for details.
 513
 514           From Peter Eriksson:
 515           I suggest that if this is implemented (I've also asked for this
 516           feature some years ago) that it be made compatible with Legato
 517           Networker's ".nsr" files where you can specify a lot of options on
 518           how to handle files/directories (including denying further
 519           parsing of .nsr files lower down into the directory trees).  A
 520           PDF version of the .nsr man page can be viewed at:
 521
 522           http://www.ifm.liu.se/~peter/nsr.pdf
 523
 524   Why:    It's a nice alternative to "exclude" patterns for directories
 525           which don't have regular pathnames.  Also, it allows users to
 526           control backup for themselves.  Implementation should be pretty
 527           simple.  GNU tar >= 1.14 or so supports it, too.
 528
 529   Notes:  I envision this as an optional feature to a fileset
 530           specification.
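
A sketch of the check itself, using the signature string given in the CACHEDIR.TAG specification linked above. The function names are hypothetical, and this is only an illustration of how a FileSet walk might skip tagged trees.

```python
import os

# The tag file must begin with this exact header, per the specification.
SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_cache_dir(directory):
    """True if directory holds a CACHEDIR.TAG starting with the signature."""
    tag = os.path.join(directory, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as handle:
            return handle.read(len(SIGNATURE)) == SIGNATURE
    except OSError:
        return False


def walk_without_caches(top):
    """Yield the files under top, skipping tagged directory trees."""
    for dirpath, dirnames, filenames in os.walk(top):
        if is_cache_dir(dirpath):
            dirnames[:] = []          # do not descend any further
            continue
        for name in filenames:
            yield os.path.join(dirpath, name)
```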
 531
 532
 533 Item 16:  Implement extraction of Win32 BackupWrite data.
 534   Origin: Thorsten Engel <thorsten.engel at matrix-computer dot com>
 535   Date:   28 October 2005
 536   Status: Done. Assigned to Thorsten. Implemented in current CVS
 537
 538   What:   This provides the Bacula File daemon with code that
 539           can pick apart the stream output that Microsoft writes
 540           for BackupWrite data, and thus the data can be read
 541           and restored on non-Win32 machines.
 542
 543   Why:    BackupWrite data is the portable=no option in Win32
 544           FileSets, and in previous Baculas, this data could
 545           only be extracted using a Win32 FD. With this new code,
 546           the Windows data can be extracted and restored on
 547           any OS.
 548
 549
 550 Item 17:  Implement a Python interface to the Bacula catalog.
 551   Date:   28 October 2005
 552   Origin: Kern
 553   Status:
 554
 555   What:   Implement an interface for Python scripts to access
 556           the catalog through Bacula.
 557
 558   Why:    This will permit users to customize Bacula through
 559           Python scripts.
 560
 561 Item 18:  Archival (removal) of User Files to Tape
 562
 563   Date:   Nov. 24/2005
 564
 565   Origin: Ray Pengelly <ray at biomed dot queensu dot ca>
 566   Status:
 567
 568   What:   The ability to archive data to storage based on certain parameters
 569           such as age, size, or location.  Once the data has been written to
 570           storage and logged it is then pruned from the originating
 571           filesystem. Note! We are talking about user's files and not
 572           Bacula Volumes.
 573
 574   Why:    This would allow fully automatic storage management which becomes
 575           useful for large datastores.  It would also allow for auto-staging
 576           from one media type to another.
 577
 578           Example 1) Medical imaging needs to store large amounts of data.
 579           They decide to keep data on their servers for 6 months and then put
 580           it away for long term storage.  The server then finds all files
 581           older than 6 months and writes them to tape.  The files are then removed
 582           from the server.
 583
 584           Example 2) All data that hasn't been accessed in 2 months could be
 585           moved from high-cost, fibre-channel disk storage to a low-cost
 586           large-capacity SATA disk storage pool that has slower
 587           access times.  Then after another 6 months (or possibly as one
 588           storage pool gets full) data is migrated to Tape.
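
Selecting the files for archival by age, as in Example 1, might be sketched like this. The function name and parameters are hypothetical. Removal from the originating filesystem would happen only after the data has been written to storage and logged, and is deliberately left out.

```python
import os
import time


def archive_candidates(top, max_age_days, now=None, use_atime=False):
    """Return the files under top older than max_age_days.

    Age is measured from the modification time, or from the last access
    time when use_atime is set (as in Example 2).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    result = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            stamp = info.st_atime if use_atime else info.st_mtime
            if stamp < cutoff:
                result.append(path)
    return sorted(result)
```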
 589
 590 Item 19:  Add Plug-ins to the FileSet Include statements.
 591   Date:   28 October 2005
 592   Origin:
 593   Status: Partially coded in 1.37 -- much more to do.
 594
 595   What:   Allow users to specify wild-card and/or regular
 596           expressions to be matched in both the Include and
 597           Exclude directives in a FileSet.  At the same time,
 598           allow users to define plug-ins to be called (based on
 599           regular expression/wild-card matching).
 600
 601   Why:    This would give the users the ultimate ability to control
 602           how files are backed up/restored.  A user could write a
 603           plug-in that knows how to back up his Oracle database without
 604           stopping/starting it, for example.
 605
 606 Item 20:  Implement more Python events in Bacula.
 607   Date:   28 October 2005
 608   Origin:
 609   Status:
 610
 611   What:   Allow Python scripts to be called at more places
 612           within Bacula and provide additional access to Bacula
 613           internal variables.
 614
 615   Why:    This will permit users to customize Bacula through
 616           Python scripts.
 617
 618   Notes:  Recycle event
 619           Scratch pool event
 620           NeedVolume event
 621           MediaFull event
 622
 623           Also add a way to get a listing of currently running
 624           jobs (possibly also scheduled jobs).
 625
 626
 627 Item 21:  Quick release of FD-SD connection after backup.
 628   Origin: Frank Volf (frank at deze dot org)
 629   Date:   17 November 2005
 630   Status:
 631
 632    What:  In the Bacula implementation a backup is finished after all data
 633           and attributes are successfully written to storage.  When using a
 634           tape backup it is very annoying that a backup can take a day,
 635           simply because the current tape (or whatever) is full and the
 636           administrator has not put a new one in.  During that time the
 637           system cannot be taken off-line, because there is still an open
 638           session between the storage daemon and the file daemon on the
 639           client.
 640
 641           Although this is a very good strategy for making "safe backups",
 642           it can be annoying for laptops, for example, which must remain
 643           connected until the backup is completed.
 644
 645           Using a new feature called "migration" it will be possible to
 646           spool first to hard disk (using a special 'spool' migration
 647           scheme) and then migrate the backup to tape.
 648
 649           There is still the problem of getting the attributes committed.
 650           If it takes a very long time to do, with the current code, the
 651           job has not terminated, and the File daemon is not freed up.  The
 652           Storage daemon should release the File daemon as soon as all the
 653           file data and all the attributes have been sent to it (the SD).
 654           Currently the SD waits until everything is on tape and all the
 655           attributes are transmitted to the Director before signaling
 656           completion to the FD. I don't think I would have any problem
 657           changing this.  The reason is that even if the FD reports back to
 658           the Dir that all is OK, the job will not terminate until the SD
 659           has done the same thing -- so in a way keeping the SD-FD link
 660           open to the very end is not really very productive ...
 661
 662    Why:   Makes backup of laptops much easier.
 663
 664 Item 22:  Permit multiple Media Types in an Autochanger
 665   Origin: Kern
 666   Status: Done. Implemented in 1.38.9 (I think).
 667
 668   What:   Modify the Storage daemon so that multiple Media Types
 669           can be specified in an autochanger. This would be somewhat
 670           of a simplistic implementation in that each drive would
 671           still be allowed to have only one Media Type.  However,
 672           the Storage daemon will ensure that only a drive with
 673           the Media Type that matches what the Director specifies
 674           is chosen.
 675
 676   Why:    This will permit users with several different drive types
 677           to make full use of their autochangers.
 678
 679 Item 23:  Allow different autochanger definitions for one autochanger.
 680   Date:   28 October 2005
 681   Origin: Kern
 682   Status:
 683
 684   What:   Currently, the autochanger script is locked based on
 685           the autochanger. That is, if multiple drives are being
 686           simultaneously used, the Storage daemon ensures that only
 687           one drive at a time can access the mtx-changer script.
 688           This change would base the locking on the control device,
 689           rather than the autochanger. It would then permit two autochanger
 690           definitions for the same autochanger, but with different
 691           drives. Logically, the autochanger could then be "partitioned"
 692           for different jobs, clients, or classes of jobs, and if the locking
 693           is based on the control device (e.g. /dev/sg0) the mtx-changer
 694           script will be locked appropriately.
 695
 696   Why:    This will permit users to partition autochangers for specific
 697           use. It would also permit implementation of multiple Media
 698           Types with no changes to the Storage daemon.
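
  Sketch:  For illustration only (Python, not Bacula code), locking on the
           control device rather than on the autochanger could look like the
           following.  The autochanger names, the lock registry and
           run_changer() are hypothetical; only the idea of keying the lock
           on the control device (e.g. /dev/sg0) comes from the text above.

```python
import threading
from collections import defaultdict

# Hypothetical registry: one lock per changer control device, shared by
# every autochanger definition that names that device.
_changer_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def lock_for(control_device):
    """Return the lock shared by all definitions using this control device."""
    with _registry_guard:
        return _changer_locks[control_device]


# Two hypothetical autochanger definitions partitioning one physical library.
autochangers = {
    "Library-Servers": {"control_device": "/dev/sg0", "drives": [0, 1]},
    "Library-Laptops": {"control_device": "/dev/sg0", "drives": [2]},
}


def run_changer(autochanger, command):
    """Serialize changer commands per control device, not per definition."""
    device = autochangers[autochanger]["control_device"]
    with lock_for(device):
        # A real implementation would invoke the mtx-changer script here.
        return f"{command} via {device}"
```

           Both definitions resolve to the same lock because they share
           /dev/sg0, so the changer script is still never run twice at once.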
 699
 700 Item 24:  Automatic disabling of devices
 701    Date:   2005-11-11
 702    Origin: Peter Eriksson <peter at ifm.liu dot se>
 703    Status:
 704
 705    What:  After a configurable number of fatal errors with a tape drive,
 706           Bacula should automatically disable further use of that
 707           tape drive. There should also be "disable"/"enable" commands in
 708           the "bconsole" tool.
 709
 710    Why:   On a multi-drive jukebox there is a possibility of tape drives
 711           going bad during large backups (needing a cleaning tape run,
 712           tapes getting stuck). It would be advantageous if Bacula would
 713           automatically disable further use of a problematic tape drive
 714           after a configurable number of errors has occurred.
 715
 716           An example: I have a multi-drive jukebox (6 drives, 380+ slots)
 717           where tapes occasionally get stuck inside the drive. Bacula will
 718           notice that the "mtx-changer" command fails and then fail
 719           any backup jobs trying to use that drive. However, it will still
 720           keep on trying to run new jobs using that drive and fail,
 721           forever, thus failing lots and lots of jobs. Since we have
 722           many drives, Bacula could have just automatically disabled
 723           further use of that drive and used one of the other ones
 724           instead.
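
   Sketch: A minimal illustration in Python (not Bacula code) of the
           requested behaviour.  The class, the max_fatal_errors setting and
           pick_drive() are hypothetical names; resetting the counter on
           "enable" is an assumption, not something the request specifies.

```python
class DriveState:
    """Tracks fatal errors for one tape drive and disables it at a limit."""

    def __init__(self, name, max_fatal_errors=3):
        self.name = name
        self.max_fatal_errors = max_fatal_errors  # hypothetical setting
        self.fatal_errors = 0
        self.enabled = True

    def record_fatal_error(self):
        """Count one fatal error; disable the drive once the limit is hit."""
        self.fatal_errors += 1
        if self.fatal_errors >= self.max_fatal_errors:
            self.enabled = False

    def enable(self):
        """Operator re-enables the drive; the counter starts over."""
        self.fatal_errors = 0
        self.enabled = True

    def disable(self):
        """Operator takes the drive out of service by hand."""
        self.enabled = False


def pick_drive(drives):
    """Return the first enabled drive, or None if all are disabled."""
    for drive in drives:
        if drive.enabled:
            return drive
    return None
```

           With this, a drive with a stuck tape stops being offered to new
           jobs after the configured number of failures, and the remaining
           drives are used instead.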
 725
 726 Item 25:  Implement huge exclude list support using hashing.
 727   Date:   28 October 2005
 728   Origin: Kern
 729   Status:
 730
 731   What:   Allow users to specify a very large exclude list (currently
 732           more than about 1000 files is too many).
 733
 734   Why:    This would give the users the ability to exclude all
 735           files that are loaded with the OS (e.g. using rpms
 736           or debs). If the user can restore the base OS from
 737           CDs, there is no need to backup all those files. A
 738           complete restore would be to restore the base OS, then
 739           do a Bacula restore. By excluding the base OS files, the
 740           backup set will be *much* smaller.
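
  Sketch:  To illustrate why hashing helps (Python, not Bacula code): with
           the exclude list held in a hash set, each candidate file costs
           one average constant-time lookup instead of a scan of the whole
           list.  The function names are hypothetical, and exact full-path
           matching is an assumption.

```python
def load_exclude_set(lines):
    """Build a hash set of paths from an iterable of text lines."""
    return {line.strip() for line in lines if line.strip()}


def files_to_back_up(candidates, excluded):
    """Yield candidates that are not in the exclude set."""
    for path in candidates:
        if path not in excluded:  # average O(1) hash lookup
            yield path
```

           The list itself could be generated from the package manager's
           file lists, as suggested above.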
 741
 742
 743 ============= Empty Feature Request form ===========
 744 Item n:   One line summary ...
 745   Date:   Date submitted
 746   Origin: Name and email of originator.
 747   Status:
 748
 749   What:   More detailed explanation ...
 750
 751   Why:    Why it is important ...
 752
 753   Notes:  Additional notes or features (omit if not used)
 754 ============== End Feature Request form ==============
 755
 756
 757 ===============================================
 758 Feature requests submitted after cutoff for December 2005 vote
 759   and not yet discussed.
 760 ===============================================
 761 Item n:   Allow skipping execution of Jobs
 762   Date:   29 November 2005
 763   Origin: Florian Schnabel <florian.schnabel at docufy dot de>
 764   Status:
 765
 766      What: An easy option to skip a certain job on a certain date.
 767      Why:  You could then easily skip tape backups on holidays.  This would
 768            be really handy, especially if you have no autochanger and can
 769            only fit one backup on a tape.  Other jobs could proceed normally
 770            and you would not get errors.
 771
 772 ===================================================
 773
 774 Item n: archive data
 775
 776   Origin: calvin streeting calvin at absentdream dot com
 777   Date:   15/5/2006
 778
 779   What:   The ability to archive to media (DVD/CD) in an uncompressed format
 780           for dead filing (archiving, not backing up).
 781
 782   Why:  At my workplace, when jobs are finished they are moved off the main file
 783         servers (RAID-based systems) onto a simple Linux file server (IDE-based
 784         system) so users can find old information without contacting the IT
 785         dept.
 786
 787         So this data doesn't really change, it only gets added to,
 788         but it also needs backing up.  At the moment it takes
 789         about 8 hours to back up our servers (working data), so
 790         rather than add more time to existing backups I am trying
 791         to implement a system where we back up the archive data to
 792         CD/DVD.  These disks would only need to be appended to
 793         (burn only new/changed files to new disks for off-site
 794         storage).  Basically, understand the difference between
 795         archive data and live data.
 796
 797   Notes: Scan the data and email me when it needs burning; divide
 798           into predefined chunks; keep a record of what is on what
 799           disk; make me a label (simple php->mysql=>pdf stuff, I
 800           could do this bit); ability to save data uncompressed so
 801           it can be read in any other system (future-proof data);
 802           save the catalog with the disk as some kind of menu
 803           system.
 804
 805 Item :  Tray monitor window cleanups
 806   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 807   Date:   24 July 2006
 808   Status:
 809   What:   Resizeable and scrollable windows in the tray monitor.
 810
 811   Why:    With multiple clients, or with many jobs running, the displayed
 812           window often ends up larger than the available screen, making
 813           the trailing items difficult to read.
 814
 815    Notes:
 816
 817 Item :  Clustered file-daemons
 818   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 819   Date:   24 July 2006
 820   Status:
 821   What:   A "virtual" filedaemon, which is actually a cluster of real ones.
 822
 823   Why:    In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
 824           multiple machines may have access to the same set of filesystems.
 825
 826           For performance reasons, one may wish to initiate backups from
 827           several of these machines simultaneously, instead of just using
 828           one backup source for the common clustered filesystem.
 829
 830           For obvious reasons, normally backups of $A-FD/$PATH and
 831           $B-FD/$PATH are treated as different backup sets. In this case
 832           they are the same communal set.
 833
 834           Likewise when restoring, it would be easier to just specify
 835           one of the cluster machines and let Bacula decide which to use.
 836
 837           This can be faked to some extent using DNS round robin entries
 838           and a virtual IP address, however it means "status client" will
 839           always give bogus answers. Additionally there is no way of
 840           spreading the load evenly among the servers.
 841
 842           What is required is something similar to the storage daemon
 843           autochanger directives, so that Bacula can keep track of
 844           operating backups/restores and direct new jobs to a "free"
 845           client.
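
  Sketch:  One possible way to direct a new job to a "free" client, shown
           in Python for illustration only.  The running-job counts and the
           least-loaded rule are assumptions; the request only asks that
           Bacula track running backups/restores per cluster member.

```python
def pick_client(running_jobs):
    """Pick the cluster member with the fewest running jobs.

    running_jobs maps a file daemon name to its number of running jobs.
    Ties are broken by name so the choice is deterministic.
    """
    return min(sorted(running_jobs), key=lambda name: running_jobs[name])
```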
 846
 847    Notes:
 848
 861 Item:    Commercial database support
 862   Origin: Russell Howe <russell_howe dot wreckage dot org>
 863   Date:   26 July 2006
 864   Status:
 865
 866   What:   It would be nice for the database backend to support more
 867           databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
 868           DB2, MaxDB, etc are all candidates. SQL Server would presumably be
 869           implemented using FreeTDS or maybe an ODBC library?
 870
 871   Why:    We only really have one database server, which is MS SQL Server
 872           2000. Maintaining a second one for the backup software is a burden (we
 873           grew out of SQLite, which I liked, but which didn't work so well with our
 874           database size). We don't really have a machine with the resources to run
 875           postgres, and would rather only maintain a single DBMS. We're stuck with
 876           SQL Server because pretty much all the company's custom applications
 877           (written by consultants) are locked into SQL Server 2000. I can imagine
 878           this scenario is fairly common, and it would be nice to use the existing
 879           properly specced database server for storing Bacula's catalog, rather
 880           than having to run a second DBMS.
 881
 882
 883 Item n:   Split documentation
 884   Origin: Maxx <maxxatworkat gmail dot com>
 885   Date:   27th July 2006
 886   Status:
 887
 888   What:   Split documentation in several books
 889
 890   Why:    The Bacula manual now has more than 600 pages, and looking for
 891           implementation details is getting complicated.  I think
 892           it would be good to split the single volume in two or
 893           maybe three parts:
 894
 895           1) Introduction, requirements and tutorial, which typically
 896              are useful only until first installation time
 897
 898           2) Basic installation and configuration, with all the
 899              gory details about the directives supported
 900
 901           3) Advanced Bacula: testing, troubleshooting, GUI and ancillary
 902              programs, security management, scripting, etc.
 903
 904   Notes:
 905
 906 Item n: Include an option to operate on all pools when doing
 907         update vol parameters
 908
 909    Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
 910    Date:   16 August 2006
 911    Status:
 912
 913    What: When I do update -> Volume parameters -> All Volumes
 914          from Pool, then I have to select pools one by one.  I'd like
 915          console to have an option like "0: All Pools" in the list of
 916          defined pools.
 917
 918    Why: I have many pools and am therefore unhappy with manually
 919         updating each of them using update -> Volume parameters -> All
 920         Volumes from Pool -> pool #.
 921
 922 Item n:   Automatic promotion of backup levels
 923    Date:   19 January 2006
 924    Origin: Adam Thornton <athornton@sinenomine.net>
 925    Status: Blue sky
 926
 927    What: Amanda has a feature whereby it estimates the space that a
 928          differential, incremental, and full backup would take.  If the
 929          difference in space required between the scheduled level and the next
 930          level up is beneath some user-defined critical threshold, the backup
 931          level is bumped to the next type.  Doing this minimizes the number of
 932          volumes necessary during a restore, with a fairly minimal cost in
 933          backup media space.
 934
 935    Why:   I know at least one (quite sophisticated and smart) user
 936           for whom the absence of this feature is a deal-breaker in terms of
 937           using Bacula; if we had it it would eliminate the one cool thing
 938           Amanda can do and we can't (at least, the one cool thing I know of).
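
   Sketch: The promotion rule described above, in Python for illustration
           only.  The level names, the estimates dictionary (bytes per
           level) and promote_level() are hypothetical; how the estimates
           would be obtained is not covered here.

```python
LEVELS = ["Incremental", "Differential", "Full"]  # lowest to highest


def promote_level(scheduled, estimates, threshold):
    """Bump the scheduled level by one step if that is cheap enough.

    estimates maps each level name to its estimated size in bytes.
    The level is promoted when the extra space needed by the next level
    up is below the user-defined threshold.
    """
    index = LEVELS.index(scheduled)
    if index + 1 == len(LEVELS):
        return scheduled  # Full has no higher level
    higher = LEVELS[index + 1]
    if estimates[higher] - estimates[scheduled] < threshold:
        return higher
    return scheduled
```

           For example, with estimates of 100, 150 and 1000 bytes and a
           threshold of 100, a scheduled Incremental becomes a Differential,
           while a scheduled Differential stays as it is.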
 939
 940
 941
 942
 943 Item n+1:   Incorporation of XACML2/SAML2 parsing
 944    Date:   19 January 2006
 945    Origin: Adam Thornton <athornton@sinenomine.net>
 946    Status: Blue sky
 947
 948    What:   XACML is the "eXtensible Access Control Markup Language" and
 949           SAML is the "Security Assertion Markup Language"--an XML standard
 950           for making statements about identity and authorization.  Having these
 951           would give us a framework to approach ACLs in a generic manner, and
 952           in a way flexible enough to support the four major sorts of ACLs I
 953           see as a concern to Bacula at this point, as well as (probably) to
 954           deal with new sorts of ACLs that may appear in the future.
 955
 956    Why:    Bacula is beginning to need to back up systems with ACLs
 957           that do not map cleanly onto traditional Unix permissions.  I see
 958           four sets of ACLs--in general, mutually incompatible with one
 959           another--that we're going to need to deal with.  These are: NTFS
 960           ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs.  (Some may question the
 961           relevance of AFS; AFS is one of Sine Nomine's core consulting
 962           businesses, and having a reputable file-level backup and restore
 963           technology for it (as Tivoli is probably going to drop AFS support
 964           soon since IBM no longer supports AFS) would be of huge benefit to
 965           our customers; we'd most likely create the AFS support at Sine Nomine
 966           for inclusion into the Bacula core code (and perhaps some changes to
 967           the OpenAFS volserver).)
 968
 969           Now, obviously, Bacula already handles NTFS just fine.  However, I
 970           think there's a lot of value in implementing a generic ACL model, so
 971           that it's easy to support whatever particular instances of ACLs come
 972           down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
 973           things arriving in the Linux world in a big way in the near future.
 974           XACML, although overcomplicated for our needs, provides this
 975           framework, and we should be able to leverage other people's
 976           implementations to minimize the amount of work *we* have to do to get
 977           a generic ACL framework.  Basically, the costs of implementation are
 978           high, but they're largely both external to Bacula and already sunk.
 979
 980 Item 1:   Add an over-ride in the Schedule configuration to use a
 981           different pool for different backup types.
 982
 983 Date:   19 Jan 2005
 984 Origin: Chad Slater <chad.slater@clickfox.com>
 985 Status:
 986
 987   What:   Adding a FullStorage=BigTapeLibrary in the Schedule resource
 988           would help those of us who use different storage devices for different
 989           backup levels cope with the "auto-upgrade" of a backup.
 990
 991   Why:    Assume I add several new devices to be backed up, i.e. several
 992           hosts with 1TB RAID.  To avoid tape switching hassles, incrementals are
 993           stored in a disk set on a 2TB RAID.  If you add these devices in the
 994           middle of the month, the incrementals are upgraded to "full" backups,
 995           but they try to use the same storage device as requested in the
 996           incremental job, filling up the RAID holding the differentials.  If we
 997           could override the Storage parameter for full and/or differential
 998           backups, then the Full job would use the proper Storage device, which
 999           has more capacity (i.e. an 8TB tape library).
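
  Sketch:  The requested override, in Python for illustration only.
           FullStorage is the directive name proposed above; the
           <Level>Storage naming for other levels, the storage names and
           storage_for() are hypothetical.  The key point is that the
           lookup uses the level the job actually runs at, after any
           automatic upgrade to Full.

```python
def storage_for(job_storage, effective_level, overrides):
    """Return the storage for a job at the level it actually runs at.

    overrides maps names such as "FullStorage" to a storage resource
    name; without a matching override the job's own storage is used.
    """
    return overrides.get(effective_level + "Storage", job_storage)
```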
1000
1001
1002 Item:     Implement multiple numeric backup levels as supported by dump
1003 Date:     3 April 2006
1004 Origin:   Daniel Rich <drich@employees.org>
1005 Status:
1006 What:     Dump allows specification of backup levels numerically instead of just
1007           "full", "incr", and "diff".  In this system, at any given level, all
1008           files are backed up that were modified since the last backup of a
1009           higher level (with 0 being the highest and 9 being the lowest).  A
1010           level 0 is therefore equivalent to a full, level 9 an incremental, and
1011           the levels 1 through 8 are varying levels of differentials.  For
1012           bacula's sake, these could be represented as "full", "incr", and
1013           "diff1", "diff2", etc.
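
Sketch:   How a numeric level picks its reference backup, in Python for
          illustration only.  The history list of (timestamp, level) tuples
          and reference_backup() are hypothetical.  "Higher level" in the
          text above means a lower number, since 0 is the highest.

```python
def reference_backup(history, level):
    """Return the most recent earlier backup with a lower level number.

    history is a list of (timestamp, level) tuples, oldest first.
    Returns None for level 0 (a full backup) or when no such backup
    exists, in which case everything has to be backed up.
    """
    for timestamp, past_level in reversed(history):
        if past_level < level:
            return (timestamp, past_level)
    return None
```

          A backup at the given level then saves every file modified since
          the timestamp that is returned.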
1014
1015 Why:      Support of multiple backup levels would provide for more advanced backup
1016           rotation schemes such as "Towers of Hanoi".  This would allow better
1017           flexibility in performing backups, and can lead to shorter recovery
1018           times.
1019
1020 Notes:    Legato Networker supports a similar system with full, incr, and 1-9 as
1021           levels.
1022
1023 Kern notes: I think this would add very little functionality, but a *lot* of
1024           additional overhead to Bacula.
1025
1026 Item 1:   include JobID in spool file name
1027   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1028   Date:  Tue Aug 22 17:13:39 EDT 2006
1029   Status:
1030
1031   What:   Change the name of the spool file to include the JobID
1032
1033   Why:    JobIDs are the common key used to refer to jobs, yet the
1034         spoolfile name doesn't include that information. The date/time
1035         stamp is useful (and should be retained).
1036
1037
1038
1039 Item 2:   include timestamp of job launch in "stat clients" output
1040   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1041   Date:  Tue Aug 22 17:13:39 EDT 2006
1042   Status:
1043
1044   What:   The "stat clients" command doesn't include any detail on when
1045         the active backup jobs were launched.
1046
1047   Why:   Including the timestamp would make it much easier to decide whether
1048         a job is running properly.
1049
1050   Notes: It may be helpful to have the output from "stat clients" formatted
1051         more like that from "stat dir" (and other commands), in a column
1052         format. The per-client information that's currently shown (level,
1053         client name, JobId, Volume, pool, device, Files, etc.) is good, but
1054         somewhat hard to parse (both programmatically and visually),
1055         particularly when there are many active clients.
1056
1057 Item 1:   Filesystemwatch triggered backup.
1058   Date:   31 August 2006
1059   Origin: Jesper Krogh <jesper@krogh.cc>
1060   Status: Unimplemented, depends probably on "client initiated backups"
1061
1062   What:   With inotify and similar filesystem-triggered notification
1063           systems, it is possible to have the file daemon monitor
1064           filesystem changes and initiate a backup.
1065
1066   Why:    There are 2 situations where this is nice to have.
1067           1) It is possible to get a much finer-grained backup than
1068              the fixed schedules used now. A file created and deleted
1069              a few hours later can automatically be caught.
1070
1071           2) The introduced load on the system will probably be
1072              distributed more evenly.
1073
1074   Notes:  This can be combined with configuration that specifies
1075           something like: "at most every 15 minutes or when changes
1076           consumed XX MB".
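
  Sketch:  One reading of the rule in the note above, in Python for
           illustration only: run when the pending changes reach a size
           limit, and otherwise no more often than the minimum interval.
           The class, its parameter names and this interpretation are
           assumptions; the change events would come from inotify or a
           similar mechanism, which is not shown.

```python
class BackupTrigger:
    """Decides when accumulated filesystem changes warrant a backup."""

    def __init__(self, min_interval_s=15 * 60,
                 max_pending_bytes=100 * 1024 * 1024):
        self.min_interval_s = min_interval_s
        self.max_pending_bytes = max_pending_bytes
        self.pending_bytes = 0
        self.last_backup = None  # time of the last triggered backup

    def record_change(self, size_bytes):
        """Account for one changed file reported by the watcher."""
        self.pending_bytes += size_bytes

    def should_run(self, now):
        """Return True if a backup should be started at time `now`."""
        if self.pending_bytes == 0:
            return False  # nothing changed, nothing to back up
        if self.pending_bytes >= self.max_pending_bytes:
            return True   # size limit reached, do not wait
        if self.last_backup is None:
            return True   # first backup since start-up
        return now - self.last_backup >= self.min_interval_s

    def mark_run(self, now):
        """Reset the counters after a backup has been started."""
        self.last_backup = now
        self.pending_bytes = 0
```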
1077
1078 Item n:  Message mailing based on backup types
1079 Origin:  Evan Kaufman <evan.kaufman@gmail.com>
1080   Date:  January 6, 2006
1081 Status:
1082
1083   What:  In the "Messages" resource definitions, allowing messages
1084          to be mailed based on the type (backup, restore, etc.) and level
1085          (full, differential, etc) of job that created the originating
1086          message(s).
1087
1088 Why:     It would, for example, allow someone's boss to be emailed
1089          automatically only when a Full Backup job runs, so he can
1090          retrieve the tapes for offsite storage, even if the IT dept.
1091          doesn't (or can't) explicitly notify him.  At the same time, his
1092          mailbox wouldn't be filled by notifications of Verifies, Restores,
1093          or Incremental/Differential Backups (which would likely be kept
1094          onsite).
1095
1096 Notes:
1097         One way this could be done is through additional message types, for example:
1098
1099    Messages {
1100      # email the boss only on full system backups
1101      Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
1102             !verify, !admin
1103      # email us only when something breaks
1104      MailOnError = itdept@mycompany.com = all
1105    }
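
        How such a list could be evaluated is sketched below in Python, for
        illustration only.  The keywords are the ones proposed in the
        example above, not existing Bacula message types, and the rule that
        any "!" entry vetoes the mail is an assumption.

```python
def wants_mail(spec, job_tags):
    """Decide whether a job's messages go to a destination.

    spec is the comma-separated list after the address, for example
    "full, !incremental".  job_tags holds the job's type and level.
    """
    tokens = [t.strip().lower() for t in spec.split(",") if t.strip()]
    include = {t for t in tokens if not t.startswith("!")}
    exclude = {t[1:] for t in tokens if t.startswith("!")}
    tags = {t.lower() for t in job_tags}
    if tags & exclude:
        return False  # a negated keyword matched
    return "all" in include or bool(tags & include)
```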
1106
1107
1108 Item n:   Allow inclusion/exclusion of files in a fileset by creation/mod times
1109   Origin: Evan Kaufman <evan.kaufman@gmail.com>
1110   Date:   January 11, 2006
1111   Status:
1112
1113   What:   In the vein of the Wild and Regex directives in a Fileset's
1114           Options, it would be helpful to allow a user to include or exclude
1115           files and directories by creation or modification times.
1116
1117           You could factor the Exclude=yes|no option in much the same way it
1118           affects the Wild and Regex directives.  For example, you could exclude
1119           all files modified before a certain date:
1120
1121    Options {
1122      Exclude = yes
1123      Modified Before = ####
1124    }
1125
1126            Or you could exclude all files created/modified since a certain date:
1127
1128    Options {
1129      Exclude = yes
1130      Created Modified Since = ####
1131    }
1132
1133            The format of the time/date could be done several ways, say the number
1134            of seconds since the epoch:
1135            1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`
1136
1137            Or a human readable date in a cryptic form:
1138            20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss
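
           Both forms can be parsed by one routine, sketched here in Python
           for illustration only.  parse_cutoff() is a hypothetical name,
           and telling the forms apart by length (14 digits) is an
           assumption.  The UTC-6 zone is also an assumption, chosen
           because the two example values above agree in that zone.

```python
from datetime import datetime, timedelta, timezone


def parse_cutoff(value, tz):
    """Parse epoch seconds or YYYYMMDDhhmmss into an aware datetime."""
    if len(value) == 14:
        parsed = datetime.strptime(value, "%Y%m%d%H%M%S")
        return parsed.replace(tzinfo=tz)
    return datetime.fromtimestamp(int(value), tz)


# Assumed local zone of the examples (UTC-6).
EXAMPLE_TZ = timezone(timedelta(hours=-6))
from_epoch = parse_cutoff("1137008553", EXAMPLE_TZ)
from_text = parse_cutoff("20060111134233", EXAMPLE_TZ)
```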
1139
1140   Why:    I imagine a feature like this could have many uses. It would
1141           allow a user to do a full backup while excluding the base operating
1142           system files, so if I installed a Linux snapshot from a CD yesterday,
1143           I'll *exclude* all files modified *before* today.  If I need to
1144           recover the system, I use the CD I already have, plus the tape backup.
1145           Or if, say, a Windows client is hit by a particularly corrosive
1146           virus, and I need to *exclude* any files created/modified *since* the
1147           time of infection.
1148
1149   Notes:  Of course, this feature would work in concert with other
1150           in/exclude rules, and wouldn't override them (or each other).
1151
1152   Notes:  The directives I'd imagine would be along the lines of
1153           "[Created] [Modified] [Before|Since] = <date>".
1154           So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
1155            or 'since'.
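
  Sketch:  The comparison itself, in Python for illustration only.
           matches() and its arguments are hypothetical.  Note that on
           Unix 'ctime' is the time of the last inode change, not a true
           creation time, so "Created" can only be approximated there.

```python
from types import SimpleNamespace


def matches(st, fields, relation, cutoff):
    """Return True if a selected timestamp is before/since the cutoff.

    st is an os.stat()-like object, fields is a subset of
    {"created", "modified"}, relation is "before" or "since", and
    cutoff is in seconds since the epoch.
    """
    stamps = []
    if "created" in fields:
        stamps.append(st.st_ctime)  # inode change time on Unix
    if "modified" in fields:
        stamps.append(st.st_mtime)
    if relation == "before":
        return any(t < cutoff for t in stamps)
    if relation == "since":
        return any(t >= cutoff for t in stamps)
    raise ValueError("relation must be 'before' or 'since'")


# Stand-in for an os.stat() result, used only for the example.
sample = SimpleNamespace(st_ctime=100.0, st_mtime=200.0)
```

           With Exclude = yes, a file for which matches() returns True
           would be left out of the backup.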