   1
   2 Projects:
   3                      Bacula Projects Roadmap
   4                     Status updated 12 January 2007
   5
   6 Summary:
   7 Item  1:  Accurate restoration of renamed/deleted files
   8 Item  2:  Implement a Bacula GUI/management tool.
   9 Item  3:  Implement Base jobs.
  10 Item  4:  Implement from-client and to-client on restore command line.
  11 Item  5:  Implement creation and maintenance of copy pools
  12 Item  6:  Merge multiple backups (Synthetic Backup or Consolidation).
  13 Item  7:  Deletion of Disk-Based Bacula Volumes
  14 Item  8:  Implement a Python interface to the Bacula catalog.
  15 Item  9:  Archival (removal) of User Files to Tape
  16 Item 10:  Add Plug-ins to the FileSet Include statements.
  17 Item 11:  Implement more Python events in Bacula.
  18 Item 12:  Quick release of FD-SD connection after backup.
  19 Item 13:  Implement huge exclude list support using hashing.
  20 Item 14:  Allow skipping execution of Jobs
  21 Item 15:  Tray monitor window cleanups
  22 Item 16:  Split documentation
  23 Item 17:  Automatic promotion of backup levels
  24 Item 18:  Add an override in Schedule for Pools based on backup types.
   25 Item 19:  An option to operate on all pools with update vol parameters
  26 Item 20:  Include JobID in spool file name
  27 Item 21:  Include timestamp of job launch in "stat clients" output
  28 Item 22:  Message mailing based on backup types
  29 Item 23:  Allow inclusion/exclusion of files in a fileset by creation/mod times
  30 Item 24:  Add a scheduling syntax that permits weekly rotations
  31 Item 25:  Improve Bacula's tape and drive usage and cleaning management.
  32 Item 26:  Implement support for stacking arbitrary stream filters, sinks.
  33 Item 27:  Allow FD to initiate a backup
  34 Item 28:  Directive/mode to backup only file changes, not entire file
  35 Item 29:  Automatic disabling of devices
  36 Item 30:  Incorporation of XACML2/SAML2 parsing
  37 Item 31:  Clustered file-daemons
  38 Item 32:  Commercial database support
  39 Item 33:  Archive data
  40 Item 34:  Filesystem watch triggered backup.
  41 Item 35:  Implement multiple numeric backup levels as supported by dump
  42 Item 36:  Implement a server-side compression feature
  43 Item 37:  Cause daemons to use a specific IP address to source communications
  44 Item 38:  Multiple threads in file daemon for the same job
  45 Item 39:  Restore only file attributes (permissions, ACL, owner, group...)
  46 Item 40:  Add an item to the restore option where you can select a pool
  47
  48 Below, you will find more information on future projects:
  49
  50 Item  1:  Accurate restoration of renamed/deleted files
  51   Date:   28 November 2005
  52   Origin: Martin Simmons (martin at lispworks dot com)
  53   Status: Robert Nelson will implement this
  54
  55   What:   When restoring a fileset for a specified date (including "most
  56           recent"), Bacula should give you exactly the files and directories
  57           that existed at the time of the last backup prior to that date.
  58
  59           Currently this only works if the last backup was a Full backup.
  60           When the last backup was Incremental/Differential, files and
  61           directories that have been renamed or deleted since the last Full
  62           backup are not currently restored correctly.  Ditto for files with
  63           extra/fewer hard links than at the time of the last Full backup.
  64
  65   Why:    Incremental/Differential would be much more useful if this worked.
  66
  67   Notes:  Merging of multiple backups into a single one seems to
  68           rely on this working, otherwise the merged backups will not be
  69           truly equivalent to a Full backup.
  70
  71           Kern: notes shortened. This can be done without the need for
  72           inodes. It is essentially the same as the current Verify job,
  73           but one additional database record must be written, which does
  74           not need any database change.
  75
  76           Kern: see if we can correct restoration of directories if
  77           replace=ifnewer is set.  Currently, if the directory does not
  78           exist, a "dummy" directory is created, then when all the files
  79           are updated, the dummy directory is newer so the real values
  80           are not updated.
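
  Sketch: The behaviour requested above can be illustrated in Python.
          This is a minimal sketch, not Bacula code.  It assumes each
          Incremental/Differential job also records the complete list of
          paths that existed on the client when it ran (the `present`
          set below, a hypothetical name); how Bacula would actually
          store that information is not specified here.

```python
def files_at_restore_point(full, later_backups):
    """Return {path: job_id} describing what a restore should produce.

    full          -- (job_id, saved) for the Full backup, where `saved`
                     maps path -> attributes of each file written
    later_backups -- list of (job_id, saved, present) in time order;
                     `present` is the set of all paths that existed on
                     the client when that job ran (assumed extra record)
    """
    job_id, saved = full
    restore = {path: job_id for path in saved}
    for job_id, saved, present in later_backups:
        # Files written by a later job supersede earlier copies.
        for path in saved:
            restore[path] = job_id
        # Files that no longer existed were deleted or renamed away.
        restore = {p: j for p, j in restore.items() if p in present}
    return restore


# /home/old.txt was renamed to /home/new.txt after the Full backup.
full = (1, {"/etc/passwd": "attrs", "/home/old.txt": "attrs"})
incr = (2, {"/home/new.txt": "attrs"}, {"/etc/passwd", "/home/new.txt"})
result = files_at_restore_point(full, [incr])
```

          Without the `present` set, /home/old.txt would be restored
          alongside /home/new.txt, which is the problem described above.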
  81
  82 Item  2:  Implement a Bacula GUI/management tool.
  83   Origin: Kern
  84   Date:   28 October 2005
  85   Status:
  86
  87   What:   Implement a Bacula console, and management tools
  88           probably using Qt3 and C++.
  89
  90   Why:    Don't we already have a wxWidgets GUI?  Yes, but
  91           it is written in C++ and changes to the user interface
  92           must be hand tailored using C++ code. By developing
  93           the user interface using Qt designer, the interface
  94           can be very easily updated and most of the new Python
  95           code will be automatically created.  The user interface
  96           changes become very simple, and only the new features
   97           must be implemented.  In addition, the code will be in
  98           Python, which will give many more users easy (or easier)
  99           access to making additions or modifications.
 100
  101   Notes:  There is a partial Python-GTK implementation by
 102           Lucas Di Pentima <lucas at lunix dot com dot ar> but
 103           it is no longer being developed.
 104
 105
 106 Item  3:  Implement Base jobs.
 107   Date:   28 October 2005
 108   Origin: Kern
 109   Status:
 110
 111   What:   A base job is sort of like a Full save except that you
 112           will want the FileSet to contain only files that are
 113           unlikely to change in the future (i.e.  a snapshot of
 114           most of your system after installing it).  After the
 115           base job has been run, when you are doing a Full save,
 116           you specify one or more Base jobs to be used.  All
 117           files that have been backed up in the Base job/jobs but
 118           not modified will then be excluded from the backup.
 119           During a restore, the Base jobs will be automatically
 120           pulled in where necessary.
 121
 122   Why:    This is something none of the competition does, as far as
 123           we know (except perhaps BackupPC, which is a Perl program that
  124           saves to disk only).  It is a big win for the user; it
 125           makes Bacula stand out as offering a unique
 126           optimization that immediately saves time and money.
 127           Basically, imagine that you have 100 nearly identical
  128           Windows or Linux machines containing the OS and user
 129           files.  Now for the OS part, a Base job will be backed
 130           up once, and rather than making 100 copies of the OS,
 131           there will be only one.  If one or more of the systems
 132           have some files updated, no problem, they will be
 133           automatically restored.
 134
 135   Notes:  Huge savings in tape usage even for a single machine.
 136           Will require more resources because the DIR must send
 137           FD a list of files/attribs, and the FD must search the
 138           list and compare it for each file to be saved.
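
  Sketch: The comparison described in the Notes can be illustrated in
          Python.  This is a minimal sketch, not Bacula code.  It
          assumes the list sent by the DIR maps each path to a
          (size, mtime) pair; which attributes would really be
          compared is an assumption made for illustration.

```python
def files_to_save(current, base):
    """Return the paths a Full backup must still save.

    current -- {path: (size, mtime)} as seen by the FD right now
    base    -- {path: (size, mtime)} recorded by the Base job(s)

    A file is skipped only if it is in the Base job with identical
    attributes; new or modified files are saved as usual.
    """
    return sorted(p for p, attrs in current.items() if base.get(p) != attrs)


base = {"/bin/ls": (1000, 1130000000), "/etc/hosts": (200, 1130000000)}
current = {
    "/bin/ls": (1000, 1130000000),      # unchanged, covered by the Base job
    "/etc/hosts": (250, 1136000000),    # modified since the Base job
    "/home/u/notes": (10, 1136000000),  # not in the Base job at all
}
to_save = files_to_save(current, base)
```

          Holding the Base list in a dictionary keeps the per-file
          lookup cheap even when the list is large.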
 139
 140 Item  4:  Implement from-client and to-client on restore command line.
 141    Date:  11 December 2006
 142   Origin: Discussion on Bacula-users entitled 'Scripted restores to
 143           different clients', December 2006
 144   Status: New feature request
 145
 146   What:   While using bconsole interactively, you can specify the client
 147           that a backup job is to be restored for, and then you can
 148           specify later a different client to send the restored files
 149           back to. However, using the 'restore' command with all options
 150           on the command line, this cannot be done, due to the ambiguous
 151           'client' parameter. Additionally, this parameter means different
 152           things depending on if it's specified on the command line or
 153           afterwards, in the Modify Job screens.
 154
 155      Why: This feature would enable restore jobs to be more completely
 156           automated, for example by a web or GUI front-end.
 157
 158    Notes: client can also be implied by specifying the jobid on the command
 159           line
 160
 161 Item  5:  Implement creation and maintenance of copy pools
 162   Date:   27 November 2005
 163   Origin: David Boyes (dboyes at sinenomine dot net)
 164   Status:
 165
 166   What:   I would like Bacula to have the capability to write copies
 167           of backed-up data on multiple physical volumes selected
 168           from different pools without transferring the data
 169           multiple times, and to accept any of the copy volumes
 170           as valid for restore.
 171
 172   Why:    In many cases, businesses are required to keep offsite
 173           copies of backup volumes, or just wish for simple
 174           protection against a human operator dropping a storage
 175           volume and damaging it. The ability to generate multiple
 176           volumes in the course of a single backup job allows
  177           customers to simply check out one copy and send it
 178           offsite, marking it as out of changer or otherwise
 179           unavailable. Currently, the library and magazine
 180           management capability in Bacula does not make this process
 181           simple.
 182
 183           Restores would use the copy of the data on the first
 184           available volume, in order of copy pool chain definition.
 185
 186           This is also a major scalability issue -- as the number of
 187           clients increases beyond several thousand, and the volume
 188           of data increases, transferring the data multiple times to
 189           produce additional copies of the backups will become
 190           physically impossible due to transfer speed
 191           issues. Generating multiple copies at server side will
 192           become the only practical option.
 193
 194   How:    I suspect that this will require adding a multiplexing
 195           SD that appears to be a SD to a specific FD, but 1-n FDs
 196           to the specific back end SDs managing the primary and copy
 197           pools.  Storage pools will also need to acquire parameters
 198           to define the pools to be used for copies.
 199
 200   Notes:  I would commit some of my developers' time if we can agree
 201           on the design and behavior.
 202
 203 Item  6:  Merge multiple backups (Synthetic Backup or Consolidation).
 204   Origin: Marc Cousin and Eric Bollengier
 205   Date:   15 November 2005
  206   Status: Awaiting implementation. Depends on first implementing
 207           project Item 2 (Migration) which is now done.
 208
 209   What:   A merged backup is a backup made without connecting to the Client.
 210           It would be a Merge of existing backups into a single backup.
 211           In effect, it is like a restore but to the backup medium.
 212
 213           For instance, say that last Sunday we made a full backup.  Then
 214           all week long, we created incremental backups, in order to do
 215           them fast.  Now comes Sunday again, and we need another full.
 216           The merged backup makes it possible to do instead an incremental
 217           backup (during the night for instance), and then create a merged
 218           backup during the day, by using the full and incrementals from
 219           the week.  The merged backup will be exactly like a full made
 220           Sunday night on the tape, but the production interruption on the
 221           Client will be minimal, as the Client will only have to send
 222           incrementals.
 223
 224           In fact, if it's done correctly, you could merge all the
 225           Incrementals into single Incremental, or all the Incrementals
 226           and the last Differential into a new Differential, or the Full,
 227           last differential and all the Incrementals into a new Full
 228           backup.  And there is no need to involve the Client.
 229
  230   Why:    The benefits are:
  231           - the Client only has to do an incremental;
  232           - the merged backup on tape is just like a single full backup,
  233             and can be restored very fast.
  234
  235           This is also a way of reducing the backup data since the old
  236           data can then be pruned (or not) from the catalog, possibly
  237           allowing older volumes to be recycled.
 238
 239 Item  7:  Deletion of Disk-Based Bacula Volumes
 240   Date:   Nov 25, 2005
 241   Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
 242           by Kern)
 243   Status:
 244
 245    What:  Provide a way for Bacula to automatically remove Volumes
 246           from the filesystem, or optionally to truncate them.
  247           Obviously, the Volume must be pruned prior to removal.
 248
 249   Why:    This would allow users more control over their Volumes and
 250           prevent disk based volumes from consuming too much space.
 251
 252   Notes:  The following two directives might do the trick:
 253
 254           Volume Data Retention = <time period>
 255           Remove Volume After = <time period>
 256
 257           The migration project should also remove a Volume that is
 258           migrated. This might also work for tape Volumes.
 259
 260 Item  8:  Implement a Python interface to the Bacula catalog.
 261   Date:   28 October 2005
 262   Origin: Kern
 263   Status:
 264
 265   What:   Implement an interface for Python scripts to access
 266           the catalog through Bacula.
 267
 268   Why:    This will permit users to customize Bacula through
 269           Python scripts.
 270
 271 Item  9:  Archival (removal) of User Files to Tape
 272
 273   Date:   Nov. 24/2005
 274
  275   Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
 276   Status:
 277
 278   What:   The ability to archive data to storage based on certain parameters
 279           such as age, size, or location.  Once the data has been written to
 280           storage and logged it is then pruned from the originating
 281           filesystem. Note! We are talking about user's files and not
 282           Bacula Volumes.
 283
 284   Why:    This would allow fully automatic storage management which becomes
 285           useful for large datastores.  It would also allow for auto-staging
 286           from one media type to another.
 287
 288           Example 1) Medical imaging needs to store large amounts of data.
 289           They decide to keep data on their servers for 6 months and then put
 290           it away for long term storage.  The server then finds all files
  291           older than 6 months and writes them to tape.  The files are then removed
 292           from the server.
 293
 294           Example 2) All data that hasn't been accessed in 2 months could be
 295           moved from high-cost, fibre-channel disk storage to a low-cost
 296           large-capacity SATA disk storage pool which doesn't have as quick of
 297           access time.  Then after another 6 months (or possibly as one
 298           storage pool gets full) data is migrated to Tape.
 299
 300 Item 10:  Add Plug-ins to the FileSet Include statements.
 301   Date:   28 October 2005
 302   Origin:
 303   Status: Partially coded in 1.37 -- much more to do.
 304
 305   What:   Allow users to specify wild-card and/or regular
 306           expressions to be matched in both the Include and
 307           Exclude directives in a FileSet.  At the same time,
 308           allow users to define plug-ins to be called (based on
 309           regular expression/wild-card matching).
 310
 311   Why:    This would give the users the ultimate ability to control
 312           how files are backed up/restored.  A user could write a
  313           plug-in that knows how to back up his Oracle database without
 314           stopping/starting it, for example.
 315
 316 Item 11:  Implement more Python events in Bacula.
 317   Date:   28 October 2005
 318   Origin: Kern
 319   Status:
 320
 321   What:   Allow Python scripts to be called at more places
 322           within Bacula and provide additional access to Bacula
 323           internal variables.
 324
 325   Why:    This will permit users to customize Bacula through
 326           Python scripts.
 327
 328   Notes:  Recycle event
 329           Scratch pool event
 330           NeedVolume event
 331           MediaFull event
 332
 333           Also add a way to get a listing of currently running
 334           jobs (possibly also scheduled jobs).
 335
 336
 337 Item 12:  Quick release of FD-SD connection after backup.
 338   Origin: Frank Volf (frank at deze dot org)
 339   Date:   17 November 2005
 340   Status:
 341
 342    What:  In the Bacula implementation a backup is finished after all data
 343           and attributes are successfully written to storage.  When using a
 344           tape backup it is very annoying that a backup can take a day,
 345           simply because the current tape (or whatever) is full and the
 346           administrator has not put a new one in.  During that time the
 347           system cannot be taken off-line, because there is still an open
 348           session between the storage daemon and the file daemon on the
 349           client.
 350
  351           Although this is a very good strategy for making "safe backups",
  352           this can be annoying for e.g. laptops, which must remain
 353           connected until the backup is completed.
 354
 355           Using a new feature called "migration" it will be possible to
 356           spool first to harddisk (using a special 'spool' migration
 357           scheme) and then migrate the backup to tape.
 358
 359           There is still the problem of getting the attributes committed.
 360           If it takes a very long time to do, with the current code, the
 361           job has not terminated, and the File daemon is not freed up.  The
 362           Storage daemon should release the File daemon as soon as all the
 363           file data and all the attributes have been sent to it (the SD).
 364           Currently the SD waits until everything is on tape and all the
 365           attributes are transmitted to the Director before signaling
 366           completion to the FD. I don't think I would have any problem
 367           changing this.  The reason is that even if the FD reports back to
 368           the Dir that all is OK, the job will not terminate until the SD
 369           has done the same thing -- so in a way keeping the SD-FD link
 370           open to the very end is not really very productive ...
 371
 372    Why:   Makes backup of laptops much faster.
 373
 374
 375
 376 Item 13:  Implement huge exclude list support using hashing.
 377   Date:   28 October 2005
 378   Origin: Kern
 379   Status:
 380
  381   What:   Allow users to specify a very large exclude list (currently
 382           more than about 1000 files is too many).
 383
 384   Why:    This would give the users the ability to exclude all
 385           files that are loaded with the OS (e.g. using rpms
 386           or debs). If the user can restore the base OS from
 387           CDs, there is no need to backup all those files. A
 388           complete restore would be to restore the base OS, then
 389           do a Bacula restore. By excluding the base OS files, the
 390           backup set will be *much* smaller.
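
  Sketch: The hashing idea can be illustrated in Python, whose built-in
          set is a hash table.  This is a minimal sketch, not Bacula
          code; the function names are hypothetical and the exclude
          list is assumed to hold one absolute path per line.

```python
def load_exclude_set(lines):
    """Build a hashed exclude set from an iterable of path lines."""
    return {line.strip() for line in lines if line.strip()}


def files_to_back_up(candidates, excluded):
    """Keep only the candidate paths that are not in the exclude set.

    Each membership test is a hash lookup, so the cost per file does
    not grow with the length of the exclude list.
    """
    return [path for path in candidates if path not in excluded]


excluded = load_exclude_set(["/bin/ls\n", "\n", "/usr/bin/perl\n"])
kept = files_to_back_up(["/bin/ls", "/home/u/report.txt"], excluded)
```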
 391
 392
 393 Item 14:  Allow skipping execution of Jobs
 394   Date:   29 November 2005
 395   Origin: Florian Schnabel <florian.schnabel at docufy dot de>
 396   Status:
 397
  398     What: An easy option to skip a certain job on a certain date.
  399      Why: You could then easily skip tape backups on holidays.  This
  400           would be especially handy if you have no autochanger and can
  401           only fit one backup on a tape.  Other jobs could proceed
  402           normally, and you would not get errors that way.
 403
 404
 405 Item 15:  Tray monitor window cleanups
 406   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 407   Date:   24 July 2006
 408   Status:
 409   What:   Resizeable and scrollable windows in the tray monitor.
 410
 411   Why:    With multiple clients, or with many jobs running, the displayed
 412           window often ends up larger than the available screen, making
 413           the trailing items difficult to read.
 414
 415
 416 Item 16:  Split documentation
 417   Origin: Maxx <maxxatworkat gmail dot com>
 418   Date:   27th July 2006
 419   Status:
 420
  421   What:   Split documentation into several books
 422
  423   Why:    The Bacula manual now has more than 600 pages, and looking for
 424           implementation details is getting complicated.  I think
 425           it would be good to split the single volume in two or
 426           maybe three parts:
 427
  428           1) Introduction, requirements and tutorial, typically
  429              are useful only until first installation time
  430
  431           2) Basic installation and configuration, with all the
  432              gory details about the directives supported
  433
  434           3) Advanced Bacula: testing, troubleshooting, GUI and
  435              ancillary programs, security management, scripting, etc.
 436
 437
 438
 439 Item 17:  Automatic promotion of backup levels
 440    Date:  19 January 2006
 441   Origin: Adam Thornton <athornton@sinenomine.net>
 442   Status:
 443
 444     What: Amanda has a feature whereby it estimates the space that a
 445           differential, incremental, and full backup would take.  If the
 446           difference in space required between the scheduled level and the next
 447           level up is beneath some user-defined critical threshold, the backup
 448           level is bumped to the next type.  Doing this minimizes the number of
 449           volumes necessary during a restore, with a fairly minimal cost in
 450           backup media space.
 451
 452     Why:  I know at least one (quite sophisticated and smart) user
 453           for whom the absence of this feature is a deal-breaker in terms of
 454           using Bacula; if we had it it would eliminate the one cool thing
 455           Amanda can do and we can't (at least, the one cool thing I know of).
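
  Sketch: The promotion rule described above can be illustrated in
          Python.  This is a minimal sketch, not Amanda's or Bacula's
          implementation.  It assumes the threshold is an absolute
          number of bytes and that a level is bumped by at most one
          step; both are assumptions made for illustration.

```python
LEVELS = ["Incremental", "Differential", "Full"]


def promote_level(scheduled, estimates, threshold_bytes):
    """Return the level to run, bumping `scheduled` up one step if cheap.

    estimates       -- {level: estimated backup size in bytes}
    threshold_bytes -- promote when the next level up costs less than
                       this many extra bytes
    """
    idx = LEVELS.index(scheduled)
    if idx == len(LEVELS) - 1:
        return scheduled  # a Full backup cannot be promoted further
    next_level = LEVELS[idx + 1]
    if estimates[next_level] - estimates[scheduled] < threshold_bytes:
        return next_level
    return scheduled


estimates = {"Incremental": 900, "Differential": 1000, "Full": 5000}
level = promote_level("Incremental", estimates, threshold_bytes=200)
```

          Here the Differential costs only 100 bytes more than the
          Incremental, so the job is promoted; the Full would cost
          4000 bytes more than the Differential, so a scheduled
          Differential stays as it is.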
 456
 457
 458 Item 18:  Add an override in Schedule for Pools based on backup types.
  459   Date:   19 Jan 2005
  460   Origin: Chad Slater <chad.slater@clickfox.com>
  461   Status:
 462
 463   What:   Adding a FullStorage=BigTapeLibrary in the Schedule resource
 464           would help those of us who use different storage devices for different
 465           backup levels cope with the "auto-upgrade" of a backup.
 466
  467   Why:    Assume I add several new devices to be backed up, i.e. several
 468           hosts with 1TB RAID.  To avoid tape switching hassles, incrementals are
 469           stored in a disk set on a 2TB RAID.  If you add these devices in the
 470           middle of the month, the incrementals are upgraded to "full" backups,
 471           but they try to use the same storage device as requested in the
 472           incremental job, filling up the RAID holding the differentials.  If we
 473           could override the Storage parameter for full and/or differential
 474           backups, then the Full job would use the proper Storage device, which
  475           has more capacity (i.e. an 8TB tape library).
 476
 477 Item 19:  An option to operate on all pools with update vol parameters
 478   Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
 479    Date:  16 August 2006
 480   Status:
 481
 482    What:  When I do update -> Volume parameters -> All Volumes
 483           from Pool, then I have to select pools one by one.  I'd like
 484           console to have an option like "0: All Pools" in the list of
 485           defined pools.
 486
  487    Why:   I have many pools and am therefore unhappy with manually
 488           updating each of them using update -> Volume parameters -> All
 489           Volumes from Pool -> pool #.
 490
 491
 492
 493 Item 20:  Include JobID in spool file name ****DONE****
 494   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
 495   Date:   Tue Aug 22 17:13:39 EDT 2006
 496   Status: Done. (patches/testing/project-include-jobid-in-spool-name.patch)
 497           No need to vote for this item.
 498
 499   What:   Change the name of the spool file to include the JobID
 500
 501   Why:    JobIDs are the common key used to refer to jobs, yet the
 502           spoolfile name doesn't include that information. The date/time
 503           stamp is useful (and should be retained).
 504
 505
 506
 507 Item 21:  Include timestamp of job launch in "stat clients" output
 508   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
 509   Date:   Tue Aug 22 17:13:39 EDT 2006
 510   Status:
 511
 512   What:   The "stat clients" command doesn't include any detail on when
 513           the active backup jobs were launched.
 514
 515   Why:    Including the timestamp would make it much easier to decide whether
 516           a job is running properly.
 517
 518   Notes:  It may be helpful to have the output from "stat clients" formatted
 519           more like that from "stat dir" (and other commands), in a column
 520           format. The per-client information that's currently shown (level,
 521           client name, JobId, Volume, pool, device, Files, etc.) is good, but
 522           somewhat hard to parse (both programmatically and visually),
 523           particularly when there are many active clients.
 524
 525
 526
 527 Item 22:  Message mailing based on backup types
 528  Origin:  Evan Kaufman <evan.kaufman@gmail.com>
 529    Date:  January 6, 2006
 530  Status:
 531
 532    What:  In the "Messages" resource definitions, allowing messages
 533           to be mailed based on the type (backup, restore, etc.) and level
 534           (full, differential, etc) of job that created the originating
 535           message(s).
 536
 537  Why:     It would, for example, allow someone's boss to be emailed
 538           automatically only when a Full Backup job runs, so he can
 539           retrieve the tapes for offsite storage, even if the IT dept.
 540           doesn't (or can't) explicitly notify him.  At the same time, his
  541           mailbox wouldn't be filled by notifications of Verifies, Restores,
 542           or Incremental/Differential Backups (which would likely be kept
 543           onsite).
 544
 545  Notes:   One way this could be done is through additional message types, for example:
 546
 547    Messages {
 548      # email the boss only on full system backups
 549      Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
 550             !verify, !admin
 551      # email us only when something breaks
 552      MailOnError = itdept@mycompany.com = all
 553    }
 554
 555
 556 Item 23:  Allow inclusion/exclusion of files in a fileset by creation/mod times
 557   Origin: Evan Kaufman <evan.kaufman@gmail.com>
 558   Date:   January 11, 2006
 559   Status:
 560
 561   What:   In the vein of the Wild and Regex directives in a Fileset's
 562           Options, it would be helpful to allow a user to include or exclude
 563           files and directories by creation or modification times.
 564
 565           You could factor the Exclude=yes|no option in much the same way it
 566           affects the Wild and Regex directives.  For example, you could exclude
 567           all files modified before a certain date:
 568
 569    Options {
 570      Exclude = yes
 571      Modified Before = ####
 572    }
 573
 574            Or you could exclude all files created/modified since a certain date:
 575
 576    Options {
  577      Exclude = yes
 578      Created Modified Since = ####
 579    }
 580
 581            The format of the time/date could be done several ways, say the number
 582            of seconds since the epoch:
 583            1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`
 584
 585            Or a human readable date in a cryptic form:
 586            20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss
 587
 588   Why:    I imagine a feature like this could have many uses. It would
 589           allow a user to do a full backup while excluding the base operating
 590           system files, so if I installed a Linux snapshot from a CD yesterday,
 591           I'll *exclude* all files modified *before* today.  If I need to
 592           recover the system, I use the CD I already have, plus the tape backup.
 593           Or if, say, a Windows client is hit by a particularly corrosive
 594           virus, and I need to *exclude* any files created/modified *since* the
 595           time of infection.
 596
 597   Notes:  Of course, this feature would work in concert with other
  598           in/exclude rules, and wouldn't override them (or each other).
 599
 600   Notes:  The directives I'd imagine would be along the lines of
 601           "[Created] [Modified] [Before|Since] = <date>".
 602           So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
 603            or 'since'.
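
  Sketch: The two date formats and the before/since test proposed
          above can be illustrated in Python.  This is a minimal
          sketch, not Bacula code.  The function names are
          hypothetical, and a 14-digit value is assumed to always mean
          YYYYMMDDhhmmss.  The two example values in the text denote
          the same instant only if local time is UTC-6, so that offset
          is assumed here.

```python
from datetime import datetime, timedelta, timezone


def parse_cutoff(value, tz):
    """Parse epoch seconds or YYYYMMDDhhmmss into an aware datetime."""
    if len(value) == 14:
        # Human-readable form, interpreted in the given time zone.
        return datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=tz)
    # Otherwise treat the value as seconds since the epoch.
    return datetime.fromtimestamp(int(value), tz=tz)


def is_excluded(file_time, cutoff, when):
    """Apply an Exclude = yes rule; `when` is 'before' or 'since'."""
    if when == "before":
        return file_time < cutoff
    return file_time >= cutoff


local = timezone(timedelta(hours=-6))  # assumed local zone (UTC-6)
from_epoch = parse_cutoff("1137008553", local)
from_text = parse_cutoff("20060111134233", local)
older_file = from_text - timedelta(days=1)
```

          With these values, a file modified a day before the cutoff
          is excluded by a "before" rule and kept by a "since" rule.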
 604
 605
 606 Item 24:  Add a scheduling syntax that permits weekly rotations
 607    Date:  15 December 2006
 608   Origin: Gregory Brauer (greg at wildbrain dot com)
 609   Status:
 610
 611    What:  Currently, Bacula only understands how to deal with weeks of the
 612           month or weeks of the year in schedules.  This makes it impossible
 613           to do a true weekly rotation of tapes.  There will always be a
 614           discontinuity that will require disruptive manual intervention at
 615           least monthly or yearly because week boundaries never align with
 616           month or year boundaries.
 617
 618           A solution would be to add a new syntax that defines (at least)
 619           a start timestamp, and repetition period.
 620
 621    Why:   Rotated backups done at weekly intervals are useful, and Bacula
 622           cannot currently do them without extensive hacking.
 623
 624    Notes: Here is an example syntax showing a 3-week rotation where full
 625           Backups would be performed every week on Saturday, and an
 626           incremental would be performed every week on Tuesday.  Each
 627           set of tapes could be removed from the loader for the following
 628           two cycles before coming back and being reused on the third
 629           week.  Since the execution times are determined by intervals
 630           from a given point in time, there will never be any issues with
 631           having to adjust to any sort of arbitrary time boundary.  In
 632           the example provided, I even define the starting schedule
 633           as crossing both a year and a month boundary, but the run times
 634           would be based on the "Repeat" value and would therefore happen
 635           weekly as desired.
 636
 637
 638           Schedule {
 639               Name = "Week 1 Rotation"
 640               #Saturday.  Would run Dec 30, Jan 20, Feb 10, etc.
 641               Run {
 642                   Options {
 643                       Type   = Full
 644                       Start  = 2006-12-30 01:00
 645                       Repeat = 3w
 646                   }
 647               }
 648               #Tuesday.  Would run Jan 2, Jan 23, Feb 13, etc.
 649               Run {
 650                   Options {
 651                       Type   = Incremental
 652                       Start  = 2007-01-02 01:00
 653                       Repeat = 3w
 654                   }
 655               }
 656           }
 657
 658           Schedule {
 659               Name = "Week 2 Rotation"
 660               #Saturday.  Would run Jan 6, Jan 27, Feb 17, etc.
 661               Run {
 662                   Options {
 663                       Type   = Full
 664                       Start  = 2007-01-06 01:00
 665                       Repeat = 3w
 666                   }
 667               }
 668               #Tuesday.  Would run Jan 9, Jan 30, Feb 20, etc.
 669               Run {
 670                   Options {
 671                       Type   = Incremental
 672                       Start  = 2007-01-09 01:00
 673                       Repeat = 3w
 674                   }
 675               }
 676           }
 677
 678           Schedule {
 679               Name = "Week 3 Rotation"
 680               #Saturday.  Would run Jan 13, Feb 3, Feb 24, etc.
 681               Run {
 682                   Options {
 683                       Type   = Full
 684                       Start  = 2007-01-13 01:00
 685                       Repeat = 3w
 686                   }
 687               }
 688               #Tuesday.  Would run Jan 16, Feb 6, Feb 27, etc.
 689               Run {
 690                   Options {
 691                       Type   = Incremental
 692                       Start  = 2007-01-16 01:00
 693                       Repeat = 3w
 694                   }
 695               }
 696           }
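
                A minimal sketch of how run times could be derived from
                the proposed "Start" and "Repeat" values.  The function
                name run_times is hypothetical, and naive local times are
                assumed (no daylight-saving handling).

```python
from datetime import datetime, timedelta


def run_times(start, repeat_weeks, count):
    """Return the first `count` run times: start, start + period, ..."""
    period = timedelta(weeks=repeat_weeks)
    return [start + i * period for i in range(count)]


# "Week 1 Rotation" full backup: Start = 2006-12-30 01:00, Repeat = 3w
full_runs = run_times(datetime(2006, 12, 30, 1, 0), 3, 3)
for t in full_runs:
    print(t.strftime("%Y-%m-%d %H:%M"))
```

                Because each run is a fixed interval after the previous
                one, month and year boundaries play no role.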
 697
 698
 699 Item 25:  Improve Bacula's tape and drive usage and cleaning management.
 700   Date:   8 November 2005, November 11, 2005
 701   Origin: Adam Thornton <athornton at sinenomine dot net>,
 702           Arno Lehmann <al at its-lehmann dot de>
 703   Status:
 704
 705   What:   Make Bacula manage tape life cycle information, tape reuse
 706           times and drive cleaning cycles.
 707
 708   Why:    All three parts of this project are important when operating
 709           backups.
 710           We need to know which tapes need replacement, and we need to
 711           make sure the drives are cleaned when necessary.  While many
 712           tape libraries and even autoloaders can handle all this
 713           automatically, support by Bacula can be helpful for smaller
 714           (older) libraries and single drives.  Limiting the number of
 715           times a tape is used might prevent tape errors when using
 716           tapes until the drives can't read them any more.  Also, checking
 717           drive status during operation can prevent some failures (as I
 718           [Arno] had to learn the hard way...)
 719
 720   Notes:  First, Bacula could (and even does, to some limited extent)
 721           record tape and drive usage.  For tapes, the number of mounts,
 722           the amount of data, and the time the tape has actually been
 723           running could be recorded.  Data fields for Read and Write
 724           time and Number of mounts already exist in the catalog (I'm
 725           not sure if VolBytes is the sum of all bytes ever written to
 726           that volume by Bacula).  This information can be important
 727           when determining which media to replace.  The ability to mark
 728           Volumes as "used up" after a given number of write cycles
 729           should also be implemented so that a tape is never actually
 730           worn out.  For the tape drives known to Bacula, similar
 731           information is interesting to determine the device status and
 732           expected life time: Time it's been Reading and Writing, number
 733           of tape Loads / Unloads / Errors.  This information is not yet
 734           recorded as far as I [Arno] know.  A new volume status would
 735           be necessary for the new state, like "Used up" or "Worn out".
 736           Volumes with this state could be used for restores, but not
 737           for writing. These volumes should be migrated first (assuming
 738           migration is implemented) and, once they are no longer needed,
 739           could be moved to a Trash pool.
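
                The volume retirement rule could be sketched roughly as
                follows.  The Volume class, the write_cycles counter and
                the limit are assumptions made for illustration; they are
                not existing catalog fields.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    write_cycles: int        # hypothetical counter, not an existing catalog field
    status: str = "Append"


def retire_if_worn(vol, max_write_cycles):
    """Mark the volume "Used up" once it reaches the write-cycle limit."""
    if vol.write_cycles >= max_write_cycles:
        vol.status = "Used up"
    return vol.status


def can_write(vol):
    """A "Used up" volume may still be read for restores, but not written."""
    return vol.status != "Used up"
```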
 740
 741           The next step would be to implement a drive cleaning setup.
 742           Bacula already has knowledge about cleaning tapes.  Once it
 743           has some information about cleaning cycles (measured in drive
 744           run time, number of tapes used, or calendar days, for example)
 745           it can automatically execute tape cleaning (with an
 746           autochanger, obviously) or ask for operator assistance loading
 747           a cleaning tape.
 748
 749           The final step would be to implement TAPEALERT checks not only
 750           when changing tapes and only sending the information to the
 751           administrator, but rather checking after each tape error,
 752           checking on a regular basis (for example after each tape
 753           file), and also before unloading and after loading a new tape.
 754           Then, depending on the drive's TAPEALERT state and the known
 755           drive cleaning state, Bacula could automatically schedule later
 756           cleaning, clean immediately, or inform the operator.
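
                One possible shape for that decision, as a sketch only.
                The two boolean inputs stand in for whatever the drive's
                TAPEALERT state reports, and the run-time limit is an
                assumed configuration value; none of these names exist
                in Bacula.

```python
def cleaning_action(clean_now, clean_periodic, hours_since_clean,
                    max_hours, has_autochanger):
    """Decide what to do about drive cleaning.

    clean_now / clean_periodic: hypothetical flags derived from TAPEALERT.
    hours_since_clean / max_hours: drive run time since the last cleaning
    and the configured cleaning cycle.
    """
    urgent = clean_now
    due = clean_periodic or hours_since_clean >= max_hours
    if not urgent and not due:
        return "none"
    if not has_autochanger:
        # Without an autochanger the operator must load the cleaning tape.
        return "inform operator"
    return "clean immediately" if urgent else "schedule cleaning"
```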
 757
 758           Implementing this would perhaps require another catalog change
 759           and perhaps major changes in SD code and the DIR-SD protocol,
 760           so I'd only consider this worth implementing if it would
 761           actually be used or even needed by many people.
 762
 763           Implementation of these projects could happen in three distinct
 764           sub-projects: Measuring Tape and Drive usage, retiring
 765           volumes, and handling drive cleaning and TAPEALERTs.
 766
 767 Item 26:  Implement support for stacking arbitrary stream filters, sinks.
 768 Date:     23 November 2006
 769 Origin:   Landon Fuller <landonf@threerings.net>
 770 Status:   Planning. Assigned to landonf.
 771
 772   What:   Implement support for the following:
 773           - Stacking arbitrary stream filters (eg, encryption, compression,
 774             sparse data handling)
 775           - Attaching file sinks to terminate stream filters (ie, write out
 776             the resultant data to a file)
 777           - Refactor the restoration state machine accordingly
 778
 779    Why:   The existing stream implementation suffers from the following:
 780            - All state (compression, encryption, stream restoration) is
 781              global across the entire restore process, for all streams. There are
 782              multiple entry and exit points in the restoration state machine, and
 783              thus multiple places where state must be allocated, deallocated,
 784              initialized, or reinitialized. This results in exceptional complexity
 785              for the author of a stream filter.
 786            - The developer must enumerate all possible combinations of filters
 787              and stream types (ie, win32 data with encryption, without encryption,
 788              with encryption AND compression, etc).
 789
 790   Notes:  This feature request only covers implementing the stream filters/
 791           sinks, and refactoring the file daemon's restoration implementation
 792           accordingly. If I have extra time, I will also rewrite the backup
 793           implementation. My intent in implementing the restoration first is to
 794           solve pressing bugs in the restoration handling, and to ensure that
 795           the new restore implementation handles existing backups correctly.
 796
 797           I do not plan on changing the network or tape data structures to
 798           support defining arbitrary stream filters, but supporting that
 799           functionality is the ultimate goal.
 800
 801           Assistance with either code or testing would be fantastic.
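
                To illustrate the idea (not the planned implementation),
                here is a small sketch of stacked filters ending in a
                file sink.  All class and function names are hypothetical,
                and the XOR filter is only a stand-in for decryption.
                Each filter keeps its own state, so nothing is global to
                the restore process.

```python
import io
import zlib


class FileSink:
    """Terminates a filter stack by writing the data to a file object."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def write(self, data):
        self.fileobj.write(data)

    def close(self):
        self.fileobj.flush()


class DecompressFilter:
    """Inflates zlib data and passes the result downstream."""

    def __init__(self, downstream):
        self.downstream = downstream
        self.z = zlib.decompressobj()   # per-filter state

    def write(self, data):
        self.downstream.write(self.z.decompress(data))

    def close(self):
        self.downstream.write(self.z.flush())
        self.downstream.close()


class XorFilter:
    """Placeholder for a decryption filter; XOR is NOT real encryption."""

    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key

    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))

    def close(self):
        self.downstream.close()


def restore(stored, key, chunk_size=7):
    """Push stored bytes through decrypt -> decompress -> sink in chunks."""
    out = io.BytesIO()
    stack = XorFilter(DecompressFilter(FileSink(out)), key)
    for i in range(0, len(stored), chunk_size):
        stack.write(stored[i:i + chunk_size])
    stack.close()
    return out.getvalue()
```

                Adding another filter means wrapping the stack once more;
                no combination of filters has to be enumerated by hand.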
 802
 803 Item 27:  Allow FD to initiate a backup
 804   Origin: Frank Volf (frank at deze dot org)
 805   Date:   17 November 2005
 806   Status:
 807
 808    What:  Provide some means, possibly by a restricted console that
 809           allows a FD to initiate a backup, and that uses the connection
 810           established by the FD to the Director for the backup so that
 811           a Director that is firewalled can do the backup.
 812
 813    Why:   Makes backup of laptops much easier.
 814
 815 Item 28:  Directive/mode to backup only file changes, not entire file
 816   Date:   11 November 2005
 817   Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
 818           Marek Bajon <mbajon at bimsplus dot com dot pl>
 819   Status:
 820
 821   What:   Currently when a file changes, the entire file will be backed up in
 822           the next incremental or full backup.  To save space on the tapes
 823           it would be nice to have a mode whereby only the changes to the
 824           file would be backed up when it is changed.
 825
 826   Why:    This would save lots of space when backing up large files such as
 827           logs, mbox files, Outlook PST files and the like.
 828
 829   Notes:  This would require the usage of disk-based volumes as comparing
 830           files would not be feasible using a tape drive.
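
                One way to find the changed parts, sketched under the
                assumption that a digest per fixed-size block was kept
                from the previous backup.  The function names, the block
                size and the use of SHA-256 are illustrative choices, not
                part of the request.

```python
import hashlib


def block_digests(data, block_size):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_digests, new_data, block_size):
    """Return (index, bytes) for blocks that are new or differ from before."""
    changed = []
    for idx, off in enumerate(range(0, len(new_data), block_size)):
        block = new_data[off:off + block_size]
        digest = hashlib.sha256(block).digest()
        if idx >= len(old_digests) or old_digests[idx] != digest:
            changed.append((idx, block))
    return changed
```

                This works well for files that are appended to, such as
                logs.  An insertion near the start of a file shifts every
                later block, so that case would need a smarter scheme,
                and a file that shrinks would also need its new length
                recorded.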
 831
 832 Item 29:  Automatic disabling of devices
 833    Date:  2005-11-11
 834   Origin: Peter Eriksson <peter at ifm.liu dot se>
 835   Status:
 836
 837    What:  After a configurable number of fatal errors with a tape drive
 838           Bacula should automatically disable further use of a certain
 839           tape drive. There should also be "disable"/"enable" commands in
 840           the "bconsole" tool.
 841
 842    Why:   On a multi-drive jukebox there is a possibility of tape drives
 843           going bad during large backups (needing a cleaning tape run,
 844           tapes getting stuck). It would be advantageous if Bacula would
 845           automatically disable further use of a problematic tape drive
 846           after a configurable number of errors has occurred.
 847
 848           An example: I have a multi-drive jukebox (6 drives, 380+ slots)
 849           where tapes occasionally get stuck inside the drive. Bacula will
 850           notice that the "mtx-changer" command will fail and then fail
 851           any backup jobs trying to use that drive. However, it will still
 852           keep on trying to run new jobs using that drive and fail -
 853           forever, and thus failing lots and lots of jobs... Since we have
 854           many drives Bacula could have just automatically disabled
 855           further use of that drive and used one of the other ones
 856           instead.
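
                The requested behaviour might look roughly like this.
                The Drive class, its method names and pick_drive are
                hypothetical and only illustrate the error counter and
                the enable/disable switch.

```python
class Drive:
    def __init__(self, name, max_errors):
        self.name = name
        self.max_errors = max_errors   # configurable error limit
        self.errors = 0
        self.enabled = True

    def record_fatal_error(self):
        """Count a fatal error and disable the drive at the limit."""
        self.errors += 1
        if self.errors >= self.max_errors:
            self.enabled = False

    def enable(self):
        """Operator command: put the drive back into service."""
        self.errors = 0
        self.enabled = True

    def disable(self):
        """Operator command: take the drive out of service."""
        self.enabled = False


def pick_drive(drives):
    """Return the first enabled drive, or None if all are disabled."""
    for drive in drives:
        if drive.enabled:
            return drive
    return None
```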
 857
 858 Item 30:  Incorporation of XACML2/SAML2 parsing
 859    Date:   19 January 2006
 860    Origin: Adam Thornton <athornton@sinenomine.net>
 861    Status: Blue sky
 862
 863    What:   XACML is the "eXtensible Access Control Markup Language" and
 864           SAML is the "Security Assertion Markup Language"--an XML standard
 865           for making statements about identity and authorization.  Having these
 866           would give us a framework to approach ACLs in a generic manner, and
 867           in a way flexible enough to support the four major sorts of ACLs I
 868           see as a concern to Bacula at this point, as well as (probably) to
 869           deal with new sorts of ACLs that may appear in the future.
 870
 871    Why:    Bacula is beginning to need to back up systems with ACLs
 872           that do not map cleanly onto traditional Unix permissions.  I see
 873           four sets of ACLs--in general, mutually incompatible with one
 874           another--that we're going to need to deal with.  These are: NTFS
 875           ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs.  (Some may question the
 876           relevance of AFS; AFS is one of Sine Nomine's core consulting
 877           businesses, and having a reputable file-level backup and restore
 878           technology for it (as Tivoli is probably going to drop AFS support
 879           soon since IBM no longer supports AFS) would be of huge benefit to
 880           our customers; we'd most likely create the AFS support at Sine Nomine
 881           for inclusion into the Bacula core code (and perhaps make some
 882           changes to the OpenAFS volserver).)
 883
 884           Now, obviously, Bacula already handles NTFS just fine.  However, I
 885           think there's a lot of value in implementing a generic ACL model, so
 886           that it's easy to support whatever particular instances of ACLs come
 887           down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
 888           things arriving in the Linux world in a big way in the near future.
 889           XACML, although overcomplicated for our needs, provides this
 890           framework, and we should be able to leverage other people's
 891           implementations to minimize the amount of work *we* have to do to get
 892           a generic ACL framework.  Basically, the costs of implementation are
 893           high, but they're largely both external to Bacula and already sunk.
 894
 895
 896 Item 31:  Clustered file-daemons
 897   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 898   Date:   24 July 2006
 899   Status:
 900   What:   A "virtual" filedaemon, which is actually a cluster of real ones.
 901
 902   Why:    In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
 903           multiple machines may have access to the same set of filesystems.
 904
 905           For performance reasons, one may wish to initiate backups from
 906           several of these machines simultaneously, instead of just using
 907           one backup source for the common clustered filesystem.
 908
 909           For obvious reasons, normally backups of A-FD/$PATH and
 910           B-FD/$PATH are treated as different backup sets. In this case
 911           they are the same communal set.
 912
 913           Likewise when restoring, it would be easier to just specify
 914           one of the cluster machines and let Bacula decide which to use.
 915
 916           This can be faked to some extent using DNS round robin entries
 917           and a virtual IP address, however it means "status client" will
 918           always give bogus answers. Additionally there is no way of
 919           spreading the load evenly among the servers.
 920
 921           What is required is something similar to the storage daemon
 922           autochanger directives, so that Bacula can keep track of
 923           operating backups/restores and direct new jobs to a "free"
 924           client.
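
                A sketch of the selection step, assuming the Director
                keeps a count of running jobs per cluster member.  The
                running_jobs mapping and the function name are
                hypothetical.

```python
def pick_client(running_jobs):
    """Return the cluster member with the fewest running jobs.

    running_jobs maps a client name to its number of active
    backups/restores.  Ties are broken by name so the choice is stable.
    """
    return min(sorted(running_jobs), key=lambda name: running_jobs[name])
```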
 925
 926    Notes:
 927
 928 Item 32:  Commercial database support
 929   Origin: Russell Howe <russell_howe dot wreckage dot org>
 930   Date:   26 July 2006
 931   Status:
 932
 933   What:   It would be nice for the database backend to support more
 934           databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
 935           DB2, MaxDB, etc are all candidates. SQL Server would presumably be
 936           implemented using FreeTDS or maybe an ODBC library?
 937
 938   Why:    We only really have one database server, which is MS SQL Server
 939           2000. Maintaining a second one for the backup software is a burden (we
 940           grew out of SQLite, which I liked, but which didn't work so well with our
 941           database size). We don't really have a machine with the resources to run
 942           PostgreSQL, and would rather only maintain a single DBMS. We're stuck with
 943           SQL Server because pretty much all the company's custom applications
 944           (written by consultants) are locked into SQL Server 2000. I can imagine
 945           this scenario is fairly common, and it would be nice to use the existing
 946           properly specced database server for storing Bacula's catalog, rather
 947           than having to run a second DBMS.
 948
 949
 950 Item 33:  Archive data
 951   Date:   15/5/2006
 952   Origin: calvin streeting calvin at absentdream dot com
 953   Status:
 954
 955   What:   The ability to archive to media (DVD/CD) in an uncompressed format
 956           for dead filing (archiving, not backing up).
 957
 958     Why:  At my workplace, finished jobs are moved off the main file
 959           servers (RAID-based systems) onto a simple Linux file server (IDE-based
 960           system) so users can find old information without contacting the IT
 961           dept.
 962
 963           This data doesn't really change; it only gets added to,
 964           but it also needs backing up.  At the moment it takes
 965           about 8 hours to back up our servers (working data), so
 966           rather than add more time to existing backups I am trying
 967           to implement a system where we back up the archive data to
 968           CD/DVD.  These disks would only need to be appended to
 969           (burn only new/changed files to new disks for off-site
 970           storage).  Basically: understand the difference between
 971           archive data and live data.
 972
 973   Notes:  - Scan the data and email me when it needs burning.
 974           - Divide it into predefined chunks.
 975           - Keep a record of what is on what disk.
 976           - Make me a label (simple php->mysql=>pdf stuff); I could do this bit.
 977           - Ability to save data uncompressed so it can be read in any other
 978             system (future-proof data).
 979           - Save the catalog with the disk as some kind of menu system.
 980
 981 Item 34:  Filesystem watch triggered backup.
 982   Date:   31 August 2006
 983   Origin: Jesper Krogh <jesper@krogh.cc>
 984   Status: Unimplemented, depends probably on "client initiated backups"
 985
 986   What:   With inotify and similar filesystem-triggered notification
 987           systems, it is possible to have the file daemon monitor
 988           filesystem changes and initiate a backup.
 989
 990   Why:    There are 2 situations where this is nice to have.
 991           1) It is possible to get a much finer-grained backup than
 992              the fixed schedules used now.  A file created and deleted
 993              a few hours later can automatically be caught.
 994
 995           2) The introduced load on the system will probably be
 996              distributed more evenly.
 997
 998   Notes:  This can be combined with configuration that specifies
 999           something like: "at most every 15 minutes or when changes
1000           consumed XX MB".
1001
1002 Kern Notes: I would rather see this implemented by an external program
1003           that monitors the Filesystem changes, then uses the console
1004           to start the appropriate job.
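
                A sketch of the trigger policy from the Notes, as it
                might live in such an external program.  It reads "at
                most every 15 minutes or when changes consumed XX MB" as:
                run no more often than the interval, unless the pending
                changes exceed the size limit.  That reading, and every
                name below, is an assumption.

```python
class ChangeTrigger:
    def __init__(self, min_interval_s, max_pending_bytes):
        self.min_interval_s = min_interval_s
        self.max_pending_bytes = max_pending_bytes
        self.last_run = None       # time of the last started job
        self.pending_bytes = 0     # size of changes seen since then

    def record_change(self, nbytes):
        """Called for every filesystem change notification."""
        self.pending_bytes += nbytes

    def should_run(self, now):
        """Return True if a backup job should be started at time `now`."""
        if self.pending_bytes == 0:
            return False
        if self.pending_bytes >= self.max_pending_bytes:
            return True
        return (self.last_run is None
                or now - self.last_run >= self.min_interval_s)

    def mark_run(self, now):
        """Called after the job has been started through the console."""
        self.last_run = now
        self.pending_bytes = 0
```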
1005
1006 Item 35:  Implement multiple numeric backup levels as supported by dump
1007 Date:     3 April 2006
1008 Origin:   Daniel Rich <drich@employees.org>
1009 Status:
1010 What:     Dump allows specification of backup levels numerically instead of just
1011           "full", "incr", and "diff".  In this system, at any given level, all
1012           files are backed up that were modified since the last backup of a
1013           higher level (with 0 being the highest and 9 being the lowest).  A
1014           level 0 is therefore equivalent to a full, level 9 an incremental, and
1015           the levels 1 through 8 are varying levels of differentials.  For
1016           bacula's sake, these could be represented as "full", "incr", and
1017           "diff1", "diff2", etc.
1018
1019 Why:      Support of multiple backup levels would provide for more advanced backup
1020           rotation schemes such as "Towers of Hanoi".  This would allow better
1021           flexibility in performing backups, and can lead to shorter recovery
1022           times.
1023
1024 Notes:    Legato Networker supports a similar system with full, incr, and 1-9 as
1025           levels.
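
          To make the rule concrete, here is a sketch that finds the
          reference point for a backup at a given numeric level.  The
          history format and the function name are hypothetical.

```python
def reference_time(history, level):
    """Return the time of the backup that a new backup at `level` is based on.

    history is a list of (timestamp, level) pairs for completed backups.
    A backup at level N saves files modified since the most recent backup
    with a smaller level number.  None means no such backup exists, so
    everything is saved (as for level 0).
    """
    candidates = [t for t, lvl in history if lvl < level]
    return max(candidates) if candidates else None
```

          With this rule, level 9 always refers to the most recent
          backup at any level from 0 to 8.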
1026
1027 Item 36:  Implement a server-side compression feature
1028   Date:   18 December 2006
1029   Origin: Vadim A. Umanski , e-mail umanski@ext.ru
1030   Status:
1031   What:   The ability to compress backup data on server receiving data
1032           instead of doing that on client sending data.
1033   Why:    The need is practical. I've got some machines that can send
1034           data to the network 4 or 5 times faster than they can compress
1035           it (I've measured that). They're using fast enough SCSI/FC
1036           disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
1037           And the backup server has quite fast CPUs (e.g. dual P4
1038           Xeons) and quite a low load. When you have 20, 50 or 100 GB
1039           of raw data, running a job 4 to 5 times faster
1040           really matters. On the other hand, the data can be
1041           compressed 50% or better, so using twice as much space for
1042           disk backup is not good at all. And the network is all mine
1043           (I have a dedicated management/provisioning network) and I
1044           can get as high bandwidth as I need - 100Mbps, 1000Mbps...
1045           That's why the server-side compression feature is needed!
1046   Notes:
1047
1048 Item 37:  Cause daemons to use a specific IP address to source communications
1049  Origin:  Bill Moran <wmoran@collaborativefusion.com>
1050  Date:    18 Dec 2006
1051  Status:
1052  What:    Cause Bacula daemons (dir, fd, sd) to always use the IP address
1053           specified in the [DIR|FD|SD]Addr directive as the source IP
1054           for initiating communication.
1055  Why:     On complex networks, as well as extremely secure networks, it's
1056           not unusual to have multiple possible routes through the network.
1057           Often, each of these routes is secured by different policies
1058           (effectively, firewalls allow or deny different traffic depending
1059           on the source address).
1060           Unfortunately, it can sometimes be difficult or impossible to
1061           represent this in a system routing table, as the result is
1062           excessive subnetting that quickly exhausts available IP space.
1063           The best available workaround is to provide multiple IPs to
1064           a single machine that are all on the same subnet.  In order
1065           for this to work properly, applications must support the ability
1066           to bind outgoing connections to a specified address, otherwise
1067           the operating system will always choose the first IP that
1068           matches the required route.
1069  Notes:   Many other programs support this.  For example, the following
1070           can be configured in BIND:
1071           query-source address 10.0.0.1;
1072           transfer-source 10.0.0.2;
1073           Which means queries from this server will always come from
1074           10.0.0.1 and zone transfers will always originate from
1075           10.0.0.2.
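
          As an illustration of binding an outgoing connection to a
          chosen source address (shown in Python, not Bacula code; the
          function name connect_from is hypothetical):

```python
import socket


def connect_from(source_ip, dest_host, dest_port, timeout=5.0):
    """Open a TCP connection whose source address is source_ip.

    Port 0 lets the operating system choose the outgoing port.
    """
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout,
        source_address=(source_ip, 0),
    )
```

          Underneath, this is a bind() on the socket before connect(),
          which is what the daemons would need to do with the
          configured address.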
1076
1077 Item 38:  Multiple threads in file daemon for the same job
1078   Date:   27 November 2005
1079   Origin: Ove Risberg (Ove.Risberg at octocode dot com)
1080   Status:
1081
1082   What:   I want the file daemon to start multiple threads for a backup
1083           job so the fastest possible backup can be made.
1084
1085           The file daemon could parse the FileSet information and start
1086           one thread for each File entry located on a separate
1087           filesystem.
1088
1089           A configuration option in the job section should be used to
1090           enable or disable this feature. The configuration option could
1091           specify the maximum number of threads in the file daemon.
1092
1093           If the threads could spool the data to separate spool files
1094           the restore process will not be much slower.
1095
1096   Why:    Multiple concurrent backups of a large fileserver with many
1097           disks and controllers will be much faster.
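
          A rough sketch of the grouping step: File entries are grouped
          by the device they live on and each group is handled by its
          own worker, up to a configured maximum.  All names here are
          hypothetical, and backup_group stands in for the real work.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def device_id(path):
    """Identify the filesystem a path lives on."""
    return os.stat(path).st_dev


def group_by_filesystem(paths, device_of=device_id):
    """Group paths by filesystem, keeping the order of first appearance."""
    groups = defaultdict(list)
    for path in paths:
        groups[device_of(path)].append(path)
    return list(groups.values())


def backup_in_parallel(paths, backup_group, max_threads, device_of=device_id):
    """Run backup_group once per filesystem, at most max_threads at a time."""
    groups = group_by_filesystem(paths, device_of)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(backup_group, groups))
```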
1098
1099 Item 39:  Restore only file attributes (permissions, ACL, owner, group...)
1100   Origin: Eric Bollengier
1101   Date:   30/12/2006
1102   Status:
1103
1104   What:   The goal of this project is to be able to restore only rights
1105           and attributes of files without overwriting their contents.
1106
1107   Why:    Who has never had to repair a chmod -R 777, or a wild recursive
1108           update of rights under Windows? At this time, you must have
1109           enough space to restore the data, dump the attributes (easy with
1110           ACLs, more complex with Unix/Windows rights) and apply them to your
1111           broken tree. With this option, it will also be very easy to compare
1112           rights or ACLs over time.
1113
1114   Notes:  If the file is present, we skip restoring its data and only
1115           change its rights.  If the file isn't present, we can either
1116           create an empty one and apply the rights, or do nothing.
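
          The two cases in the Notes can be sketched as follows, for
          POSIX permission bits only.  Owner, group and ACLs are left
          out, and the function name and create_missing flag are
          hypothetical.

```python
import os
import tempfile


def restore_attributes(path, mode, create_missing=False):
    """Apply saved permission bits to path without touching its data.

    Returns True if the attributes were applied, False if the file is
    missing and create_missing is off.
    """
    if not os.path.exists(path):
        if not create_missing:
            return False
        open(path, "wb").close()   # create an empty placeholder file
    os.chmod(path, mode)
    return True


# Usage: undo an accidental "chmod 777" on a file, keeping its contents.
workdir = tempfile.mkdtemp()
damaged = os.path.join(workdir, "report.txt")
with open(damaged, "wb") as f:
    f.write(b"payload")
os.chmod(damaged, 0o777)
applied = restore_attributes(damaged, 0o640)
```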
1117
1118 Item 40:  Add an item to the restore option where you can select a pool
1119   Origin: kshatriyak at gmail dot com
1120     Date: 1/1/2006
1121   Status:
1122
1123     What: In the restore option (Select the most recent backup for a
1124           client) it would be useful to add an option where you can limit
1125           the selection to a certain pool.
1126
1127      Why: When using cloned jobs, most of the time you have 2 pools - a
1128           disk pool and a tape pool.  People who have 2 pools would like to
1129           select the most recent backup from disk, not from tape (tape
1130           would be only needed in emergency).  However, the most recent
1131           backup (which may just differ a second from the disk backup) may
1132           be on tape and would be selected.  The problem becomes bigger if
1133           you have a full and differential - the most "recent" full backup
1134           may be on disk, while the most recent differential may be on tape
1135           (though the differential on disk may differ even only a second or
1136           so).  Bacula will complain that the backups reside on different
1137           media then.  For now, the only solution when restoring things
1138           when you have 2 pools is to manually search for the right
1139           job IDs and enter them by hand, which is rather error prone.
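
          The selection could be sketched like this, assuming job
          records with a pool name and an end time are at hand.  The
          record layout and the function name are hypothetical and do
          not reflect the catalog schema.

```python
def most_recent_job(jobs, client, pool=None):
    """Return the newest job for client, optionally limited to one pool.

    jobs is a list of dicts with the keys 'jobid', 'client', 'pool'
    and 'end_time'.  Returns None if nothing matches.
    """
    candidates = [
        job for job in jobs
        if job["client"] == client and (pool is None or job["pool"] == pool)
    ]
    return max(candidates, key=lambda job: job["end_time"], default=None)
```

          With a disk copy that finished one second before the tape
          copy, the unrestricted call picks the tape job, while
          pool="Disk" picks the disk job.
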
1140
1141 ============= Empty Feature Request form ===========
1142 Item  n:  One line summary ...
1143   Date:   Date submitted
1144   Origin: Name and email of originator.
1145   Status:
1146
1147   What:   More detailed explanation ...
1148
1149   Why:    Why it is important ...
1150
1151   Notes:  Additional notes or features (omit if not used)
1152 ============== End Feature Request form ==============