   1
   2 Projects:
   3                      Bacula Projects Roadmap
   4                     Status updated 15 December 2006
   5
   6
   7 Summary:
   8 Item  1:  Accurate restoration of renamed/deleted files
   9 Item  2:  Implement a Bacula GUI/management tool.
  10 Item  3:  Implement Base jobs.
  11 Item  4:  Implement from-client and to-client on restore command line.
  12 Item  5:  Implement creation and maintenance of copy pools
  13 Item  6:  Merge multiple backups (Synthetic Backup or Consolidation).
  14 Item  8:  Deletion of Disk-Based Bacula Volumes
  15 Item  9:  Implement a Python interface to the Bacula catalog.
  16 Item 10:  Archival (removal) of User Files to Tape
  17 Item 11:  Add Plug-ins to the FileSet Include statements.
  18 Item 12:  Implement more Python events in Bacula.
  19 Item 13:  Quick release of FD-SD connection after backup.
  20 Item 14:  Implement huge exclude list support using hashing.
  21 Item 15:  Allow skipping execution of Jobs
  22 Item 16:  Tray monitor window cleanups
  23 Item 17:  Split documentation
  24 Item 18:  Automatic promotion of backup levels
  25 Item 19:  Add an override in Schedule for Pools based on backup types.
  26 Item 20:  An option to operate on all pools with update vol parameters
  27 Item 21:  Include JobID in spool file name
  28 Item 22:  Include timestamp of job launch in "stat clients" output
  29 Item 23:  Message mailing based on backup types
  30 Item 24:  Allow inclusion/exclusion of files in a fileset by creation/mod times
  31 Item 25:  Add a scheduling syntax that permits weekly rotations
  32 Item 26:  Improve Bacula's tape and drive usage and cleaning management.
  33 Item 27:  Implement support for stacking arbitrary stream filters, sinks.
  34 Item 28:  Allow FD to initiate a backup
  35 Item 29:  Directive/mode to backup only file changes, not entire file
  36 Item 30:  Automatic disabling of devices
  37 Item 31:  Incorporation of XACML2/SAML2 parsing
  38 Item 32:  Clustered file-daemons
  39 Item 33:  Commercial database support
  40 Item 34:  Archive data
  41 Item 35:  Filesystem watch triggered backup.
  42 Item 36:  Implement multiple numeric backup levels as supported by dump
  43
  44
  45 Below, you will find more information on future projects:
  46
  47 Item  1:  Accurate restoration of renamed/deleted files
  48   Date:   28 November 2005
  49   Origin: Martin Simmons (martin at lispworks dot com)
  50   Status: Robert Nelson will implement this
  51
  52   What:   When restoring a fileset for a specified date (including "most
  53           recent"), Bacula should give you exactly the files and directories
  54           that existed at the time of the last backup prior to that date.
  55
  56           Currently this only works if the last backup was a Full backup.
  57           When the last backup was Incremental/Differential, files and
  58           directories that have been renamed or deleted since the last Full
  59           backup are not currently restored correctly.  Ditto for files with
  60           extra/fewer hard links than at the time of the last Full backup.
  61
  62   Why:    Incremental/Differential would be much more useful if this worked.
  63
  64   Notes:  Merging of multiple backups into a single one seems to
  65           rely on this working, otherwise the merged backups will not be
  66           truly equivalent to a Full backup.
  67
  68           Kern: notes shortened. This can be done without the need for
  69           inodes. It is essentially the same as the current Verify job,
  70           but one additional database record must be written, which does
  71           not need any database change.
  72
  73           Kern: see if we can correct restoration of directories if
  74           replace=ifnewer is set.  Currently, if the directory does not
  75           exist, a "dummy" directory is created, then when all the files
  76           are updated, the dummy directory is newer so the real values
  77           are not updated.
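
          A minimal sketch of the selection logic in Python.  It assumes
          (this is not confirmed above) that the extra database record
          lets each Incremental/Differential job supply the complete list
          of paths present on the client at backup time.  The names Job,
          saved, present and files_at_restore_point are hypothetical and
          are not Bacula APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    jobid: int
    level: str                                  # "Full", "Differential" or "Incremental"
    saved: set = field(default_factory=set)     # paths whose data this job stored
    present: set = field(default_factory=set)   # ASSUMPTION: every path seen on the client


def files_at_restore_point(jobs):
    """Map each path that existed at the last job to the JobId holding its newest copy.

    `jobs` must be in chronological order and start with a Full backup.
    """
    newest = {}
    for job in jobs:
        for path in job.saved:
            newest[path] = job.jobid            # later jobs override earlier ones
    existing = jobs[-1].present                 # only what existed at the last backup
    return {path: jobid for path, jobid in newest.items() if path in existing}


jobs = [
    Job(1, "Full", saved={"/a", "/b", "/c"}, present={"/a", "/b", "/c"}),
    # Between the two jobs /c was renamed to /d and /b was modified.
    Job(2, "Incremental", saved={"/b", "/d"}, present={"/a", "/b", "/d"}),
]
restore = files_at_restore_point(jobs)
```

          With this input the renamed file /c is left out of the restore,
          while /a is still taken from the Full backup.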
  78
  79 Item  2:  Implement a Bacula GUI/management tool.
  80   Origin: Kern
  81   Date:   28 October 2005
  82   Status:
  83
  84   What:   Implement a Bacula console, and management tools
  85           probably using Qt3 and C++.
  86
  87   Why:    Don't we already have a wxWidgets GUI?  Yes, but
  88           it is written in C++ and changes to the user interface
  89           must be hand tailored using C++ code. By developing
  90           the user interface using Qt designer, the interface
  91           can be very easily updated and most of the new Python
  92           code will be automatically created.  The user interface
  93           changes become very simple, and only the new features
  94           must be implemented.  In addition, the code will be in
  95           Python, which will give many more users easy (or easier)
  96           access to making additions or modifications.
  97
  98  Notes:   There is a partial Python-GTK implementation by
  99           Lucas Di Pentima <lucas at lunix dot com dot ar>, but
 100           it is no longer being developed.
 101
 102
 103 Item  3:  Implement Base jobs.
 104   Date:   28 October 2005
 105   Origin: Kern
 106   Status:
 107
 108   What:   A base job is sort of like a Full save except that you
 109           will want the FileSet to contain only files that are
 110           unlikely to change in the future (i.e.  a snapshot of
 111           most of your system after installing it).  After the
 112           base job has been run, when you are doing a Full save,
 113           you specify one or more Base jobs to be used.  All
 114           files that have been backed up in the Base job/jobs but
 115           not modified will then be excluded from the backup.
 116           During a restore, the Base jobs will be automatically
 117           pulled in where necessary.
 118
 119   Why:    This is something none of the competition does, as far as
 120           we know (except perhaps BackupPC, which is a Perl program that
 121           saves to disk only).  It is a big win for the user; it
 122           makes Bacula stand out as offering a unique
 123           optimization that immediately saves time and money.
 124           Basically, imagine that you have 100 nearly identical
 125           Windows or Linux machines containing the OS and user
 126           files.  Now for the OS part, a Base job will be backed
 127           up once, and rather than making 100 copies of the OS,
 128           there will be only one.  If one or more of the systems
 129           have some files updated, no problem, they will be
 130           automatically restored.
 131
 132   Notes:  Huge savings in tape usage even for a single machine.
 133           Will require more resources because the DIR must send
 134           FD a list of files/attribs, and the FD must search the
 135           list and compare it for each file to be saved.
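
          The FD-side comparison described in the Notes could look
          roughly like the following Python sketch.  It assumes that
          "attribs" means a (size, mtime) pair and that the list from the
          DIR is held in a dictionary, so each lookup is cheap; the
          function and variable names are hypothetical.

```python
def select_files_for_full(candidates, base_attrs):
    """Return the paths that must be stored in the Full save.

    candidates: dict path -> (size, mtime) as found on the client now.
    base_attrs: dict path -> (size, mtime) as sent by the DIR for the Base job(s).
    """
    to_backup = []
    for path, attrs in sorted(candidates.items()):
        if base_attrs.get(path) == attrs:
            continue                 # unchanged since the Base job: skip it
        to_backup.append(path)       # new or modified: store it in this Full
    return to_backup


base = {
    "/bin/ls": (1000, 1130000000),
    "/etc/passwd": (500, 1130000000),
}
now = {
    "/bin/ls": (1000, 1130000000),       # identical to the Base job
    "/etc/passwd": (520, 1131000000),    # modified after the Base job
    "/home/u/f": (10, 1131000000),       # not part of the Base job
}
changed = select_files_for_full(now, base)
```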
 136
 137 Item  4:   Implement from-client and to-client on restore command line.
 138    Date:   11 December 2006
 139    Origin: Discussion on Bacula-users entitled 'Scripted restores to
 140            different clients', December 2006
 141    Status: New feature request
 142
 143    What:   While using bconsole interactively, you can specify the client
 144            that a backup job is to be restored for, and then you can
 145            specify later a different client to send the restored files
 146            back to. However, using the 'restore' command with all options
 147            on the command line, this cannot be done, due to the ambiguous
 148            'client' parameter. Additionally, this parameter means different
 149            things depending on whether it's specified on the command line or
 150            afterwards, in the Modify Job screens.
 151
 152    Why: This feature would enable restore jobs to be more completely
 153            automated, for example by a web or GUI front-end.
 154
 155    Notes: client can also be implied by specifying the jobid on the command
 156            line
 157
 158 Item  5:  Implement creation and maintenance of copy pools
 159   Date:   27 November 2005
 160   Origin: David Boyes (dboyes at sinenomine dot net)
 161   Status:
 162
 163   What:   I would like Bacula to have the capability to write copies
 164           of backed-up data on multiple physical volumes selected
 165           from different pools without transferring the data
 166           multiple times, and to accept any of the copy volumes
 167           as valid for restore.
 168
 169   Why:    In many cases, businesses are required to keep offsite
 170           copies of backup volumes, or just wish for simple
 171           protection against a human operator dropping a storage
 172           volume and damaging it. The ability to generate multiple
 173           volumes in the course of a single backup job allows
 174           customers to simply check out one copy and send it
 175           offsite, marking it as out of changer or otherwise
 176           unavailable. Currently, the library and magazine
 177           management capability in Bacula does not make this process
 178           simple.
 179
 180           Restores would use the copy of the data on the first
 181           available volume, in order of copy pool chain definition.
 182
 183           This is also a major scalability issue -- as the number of
 184           clients increases beyond several thousand, and the volume
 185           of data increases, transferring the data multiple times to
 186           produce additional copies of the backups will become
 187           physically impossible due to transfer speed
 188           issues. Generating multiple copies at server side will
 189           become the only practical option.
 190
 191   How:    I suspect that this will require adding a multiplexing
 192           SD that appears to be a SD to a specific FD, but 1-n FDs
 193           to the specific back end SDs managing the primary and copy
 194           pools.  Storage pools will also need to acquire parameters
 195           to define the pools to be used for copies.
 196
 197   Notes:  I would commit some of my developers' time if we can agree
 198           on the design and behavior.
 199
 200 Item  6:  Merge multiple backups (Synthetic Backup or Consolidation).
 201   Origin: Marc Cousin and Eric Bollengier
 202   Date:   15 November 2005
 203   Status: Awaiting implementation. Depends on first implementing
 204           project Item 2 (Migration) which is now done.
 205
 206   What:   A merged backup is a backup made without connecting to the Client.
 207           It would be a Merge of existing backups into a single backup.
 208           In effect, it is like a restore but to the backup medium.
 209
 210           For instance, say that last Sunday we made a full backup.  Then
 211           all week long, we created incremental backups, in order to do
 212           them fast.  Now comes Sunday again, and we need another full.
 213           The merged backup makes it possible to do instead an incremental
 214           backup (during the night for instance), and then create a merged
 215           backup during the day, by using the full and incrementals from
 216           the week.  The merged backup will be exactly like a full made
 217           Sunday night on the tape, but the production interruption on the
 218           Client will be minimal, as the Client will only have to send
 219           incrementals.
 220
 221           In fact, if it's done correctly, you could merge all the
 222           Incrementals into a single Incremental, or all the Incrementals
 223           and the last Differential into a new Differential, or the Full,
 224           last differential and all the Incrementals into a new Full
 225           backup.  And there is no need to involve the Client.
 226
 227   Why:    The benefits are:
 228           - the Client just does an incremental;
 229           - the merged backup on tape is just like a single full backup,
 230             and can be restored very fast.
 231
 232           This is also a way of reducing the backup data since the old
 233           data can then be pruned (or not) from the catalog, possibly
 234           allowing older volumes to be recycled.
 235
 236 Item  8:  Deletion of Disk-Based Bacula Volumes
 237   Date:   Nov 25, 2005
 238   Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
 239           by Kern)
 240   Status:
 241
 242    What:  Provide a way for Bacula to automatically remove Volumes
 243           from the filesystem, or optionally to truncate them.
 244           Obviously, the Volume must be pruned prior to removal.
 245
 246   Why:    This would allow users more control over their Volumes and
 247           prevent disk based volumes from consuming too much space.
 248
 249   Notes:  The following two directives might do the trick:
 250
 251           Volume Data Retention = <time period>
 252           Remove Volume After = <time period>
 253
 254           The migration project should also remove a Volume that is
 255           migrated. This might also work for tape Volumes.
 256
 257 Item  9:  Implement a Python interface to the Bacula catalog.
 258   Date:   28 October 2005
 259   Origin: Kern
 260   Status:
 261
 262   What:   Implement an interface for Python scripts to access
 263           the catalog through Bacula.
 264
 265   Why:    This will permit users to customize Bacula through
 266           Python scripts.
 267
 268 Item 10:  Archival (removal) of User Files to Tape
 269
 270   Date:   Nov. 24/2005
 271
 272   Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
 273   Status:
 274
 275   What:   The ability to archive data to storage based on certain parameters
 276           such as age, size, or location.  Once the data has been written to
 277           storage and logged it is then pruned from the originating
 278           filesystem. Note! We are talking about user's files and not
 279           Bacula Volumes.
 280
 281   Why:    This would allow fully automatic storage management which becomes
 282           useful for large datastores.  It would also allow for auto-staging
 283           from one media type to another.
 284
 285           Example 1) Medical imaging needs to store large amounts of data.
 286           They decide to keep data on their servers for 6 months and then put
 287           it away for long term storage.  The server then finds all files
 288           older than 6 months and writes them to tape.  The files are then removed
 289           from the server.
 290
 291           Example 2) All data that hasn't been accessed in 2 months could be
 292           moved from high-cost, fibre-channel disk storage to a low-cost
 293           large-capacity SATA disk storage pool which doesn't have as quick an
 294           access time.  Then after another 6 months (or possibly as one
 295           storage pool gets full) data is migrated to Tape.
 296
 297 Item 11:  Add Plug-ins to the FileSet Include statements.
 298   Date:   28 October 2005
 299   Origin:
 300   Status: Partially coded in 1.37 -- much more to do.
 301
 302   What:   Allow users to specify wild-card and/or regular
 303           expressions to be matched in both the Include and
 304           Exclude directives in a FileSet.  At the same time,
 305           allow users to define plug-ins to be called (based on
 306           regular expression/wild-card matching).
 307
 308   Why:    This would give the users the ultimate ability to control
 309           how files are backed up/restored.  A user could write a
 310           plug-in that knows how to back up his Oracle database without
 311           stopping/starting it, for example.
 312
 313 Item 12:  Implement more Python events in Bacula.
 314   Date:   28 October 2005
 315   Origin: Kern
 316   Status:
 317
 318   What:   Allow Python scripts to be called at more places
 319           within Bacula and provide additional access to Bacula
 320           internal variables.
 321
 322   Why:    This will permit users to customize Bacula through
 323           Python scripts.
 324
 325   Notes:  Recycle event
 326           Scratch pool event
 327           NeedVolume event
 328           MediaFull event
 329
 330           Also add a way to get a listing of currently running
 331           jobs (possibly also scheduled jobs).
 332
 333
 334 Item 13:  Quick release of FD-SD connection after backup.
 335   Origin: Frank Volf (frank at deze dot org)
 336   Date:   17 November 2005
 337   Status:
 338
 339    What:  In the Bacula implementation a backup is finished after all data
 340           and attributes are successfully written to storage.  When using a
 341           tape backup it is very annoying that a backup can take a day,
 342           simply because the current tape (or whatever) is full and the
 343           administrator has not put a new one in.  During that time the
 344           system cannot be taken off-line, because there is still an open
 345           session between the storage daemon and the file daemon on the
 346           client.
 347
 348           Although this is a very good strategy for making "safe backups",
 349           it can be annoying for e.g. laptops, which must remain
 350           connected until the backup is completed.
 351
 352           Using a new feature called "migration" it will be possible to
 353           spool first to harddisk (using a special 'spool' migration
 354           scheme) and then migrate the backup to tape.
 355
 356           There is still the problem of getting the attributes committed.
 357           If it takes a very long time to do, with the current code, the
 358           job has not terminated, and the File daemon is not freed up.  The
 359           Storage daemon should release the File daemon as soon as all the
 360           file data and all the attributes have been sent to it (the SD).
 361           Currently the SD waits until everything is on tape and all the
 362           attributes are transmitted to the Director before signaling
 363           completion to the FD. I don't think I would have any problem
 364           changing this.  The reason is that even if the FD reports back to
 365           the Dir that all is OK, the job will not terminate until the SD
 366           has done the same thing -- so in a way keeping the SD-FD link
 367           open to the very end is not really very productive ...
 368
 369    Why:   Makes backup of laptops much faster.
 370
 371
 372
 373 Item 14:  Implement huge exclude list support using hashing.
 374   Date:   28 October 2005
 375   Origin: Kern
 376   Status:
 377
 378   What:   Allow users to specify a very large exclude list (currently
 379           more than about 1000 files is too many).
 380
 381   Why:    This would give the users the ability to exclude all
 382           files that are loaded with the OS (e.g. using rpms
 383           or debs). If the user can restore the base OS from
 384           CDs, there is no need to backup all those files. A
 385           complete restore would be to restore the base OS, then
 386           do a Bacula restore. By excluding the base OS files, the
 387           backup set will be *much* smaller.
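
          As an illustration of why hashing helps, the Python sketch
          below keeps the exclude list in a hash set, so each file is
          checked in roughly constant time regardless of how long the list
          is.  The helper names and the generated sample paths are
          hypothetical; a real list might come from the package manager.

```python
def load_exclude_set(lines):
    """Build a hash set of excluded paths from an iterable of text lines."""
    return {line.strip() for line in lines if line.strip()}


def files_to_backup(paths, excluded):
    """Keep only the paths that are not in the exclude set (O(1) average lookup)."""
    return [path for path in paths if path not in excluded]


# 100,000 entries, far beyond the ~1000-file limit mentioned above.
excluded = load_exclude_set(f"/usr/lib/pkg/file{i}\n" for i in range(100_000))
kept = files_to_backup(["/usr/lib/pkg/file42", "/home/u/notes.txt"], excluded)
```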
 388
 389
 390 Item 15:  Allow skipping execution of Jobs
 391   Date:   29 November 2005
 392   Origin: Florian Schnabel <florian.schnabel at docufy dot de>
 393   Status:
 394
 395      What: An easy option to skip a certain job on a certain date.
 396      Why:  You could then easily skip tape backups on holidays.  Especially
 397            if you have no autochanger and can only fit one backup on a tape,
 398            that would be really handy; other jobs could proceed normally
 399            and you won't get errors that way.
 400
 401
 402 Item 16:  Tray monitor window cleanups
 403   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 404   Date:   24 July 2006
 405   Status:
 406   What:   Resizeable and scrollable windows in the tray monitor.
 407
 408   Why:    With multiple clients, or with many jobs running, the displayed
 409           window often ends up larger than the available screen, making
 410           the trailing items difficult to read.
 411
 412
 413 Item 17:  Split documentation
 414   Origin: Maxx <maxxatwork at gmail dot com>
 415   Date:   27th July 2006
 416   Status:
 417
 418   What:   Split the documentation into several books.
 419
 420   Why:    The Bacula manual now has more than 600 pages, and looking for
 421           implementation details is getting complicated.  I think
 422           it would be good to split the single volume in two or
 423           maybe three parts:
 424
 425           1) Introduction, requirements and tutorial, which typically
 426              are useful only until first installation time
 427
 428           2) Basic installation and configuration, with all the
 429              gory details about the directives supported
 430
 431           3) Advanced Bacula: testing, troubleshooting, GUI and
 432              ancillary programs, security management, scripting, etc.
 433
 434
 435
 436 Item 18:  Automatic promotion of backup levels
 437    Date:   19 January 2006
 438    Origin: Adam Thornton <athornton@sinenomine.net>
 439    Status: Blue sky
 440
 441    What: Amanda has a feature whereby it estimates the space that a
 442          differential, incremental, and full backup would take.  If the
 443          difference in space required between the scheduled level and the next
 444          level up is beneath some user-defined critical threshold, the backup
 445          level is bumped to the next type.  Doing this minimizes the number of
 446          volumes necessary during a restore, with a fairly minimal cost in
 447          backup media space.
 448
 449    Why:  I know at least one (quite sophisticated and smart) user
 450          for whom the absence of this feature is a deal-breaker in terms of
 451          using Bacula; if we had it, it would eliminate the one cool thing
 452          Amanda can do and we can't (at least, the one cool thing I know of).
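
         The promotion rule can be sketched in Python as follows.  The
         ordering Incremental -> Differential -> Full for "the next level
         up", the byte estimates and the function name are assumptions
         made for illustration; this is not how Amanda or Bacula
         implements it.

```python
LEVELS = ["Incremental", "Differential", "Full"]   # ASSUMED order of "next level up"


def promote_level(scheduled, estimates, threshold):
    """Return the level to run, bumping it one step if the extra space is small.

    estimates: dict level -> estimated backup size in bytes.
    threshold: user-defined limit on the extra bytes a promotion may cost.
    """
    idx = LEVELS.index(scheduled)
    if idx + 1 == len(LEVELS):
        return scheduled                            # a Full cannot be promoted
    next_level = LEVELS[idx + 1]
    if estimates[next_level] - estimates[scheduled] < threshold:
        return next_level
    return scheduled


estimates = {"Incremental": 900, "Differential": 1000, "Full": 50000}
chosen = promote_level("Incremental", estimates, threshold=200)
```

         Here the Differential costs only 100 bytes more than the
         Incremental, so the job is promoted; a Differential would not be
         promoted to Full because the difference is far above the threshold.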
 453
 454
 455 Item 19:  Add an override in Schedule for Pools based on backup types.
 456 Date:     19 Jan 2005
 457 Origin:   Chad Slater <chad.slater@clickfox.com>
 458 Status:
 459
 460   What:   Adding a FullStorage=BigTapeLibrary in the Schedule resource
 461           would help those of us who use different storage devices for different
 462           backup levels cope with the "auto-upgrade" of a backup.
 463
 464   Why:    Assume I add several new devices to be backed up, i.e. several
 465           hosts with 1TB RAID.  To avoid tape switching hassles, incrementals are
 466           stored in a disk set on a 2TB RAID.  If you add these devices in the
 467           middle of the month, the incrementals are upgraded to "full" backups,
 468           but they try to use the same storage device as requested in the
 469           incremental job, filling up the RAID holding the differentials.  If we
 470           could override the Storage parameter for full and/or differential
 471           backups, then the Full job would use the proper Storage device, which
 472           has more capacity (i.e. an 8TB tape library).
 473
 474 Item 20:  An option to operate on all pools with update vol parameters
 475    Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
 476    Date:   16 August 2006
 477    Status:
 478
 479    What:  When I do update -> Volume parameters -> All Volumes
 480           from Pool, then I have to select pools one by one.  I'd like
 481           console to have an option like "0: All Pools" in the list of
 482           defined pools.
 483
 484    Why:   I have many pools and am therefore unhappy with manually
 485           updating each of them using update -> Volume parameters -> All
 486           Volumes from Pool -> pool #.
 487
 488
 489
 490 Item 21:  Include JobID in spool file name
 491   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
 492   Date:   Tue Aug 22 17:13:39 EDT 2006
 493   Status:
 494
 495   What:   Change the name of the spool file to include the JobID
 496
 497   Why:    JobIDs are the common key used to refer to jobs, yet the
 498           spoolfile name doesn't include that information. The date/time
 499           stamp is useful (and should be retained).
 500
 501
 502
 503 Item 22:  Include timestamp of job launch in "stat clients" output
 504   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
 505   Date:   Tue Aug 22 17:13:39 EDT 2006
 506   Status:
 507
 508   What:   The "stat clients" command doesn't include any detail on when
 509           the active backup jobs were launched.
 510
 511   Why:    Including the timestamp would make it much easier to decide whether
 512           a job is running properly.
 513
 514   Notes:  It may be helpful to have the output from "stat clients" formatted
 515           more like that from "stat dir" (and other commands), in a column
 516           format. The per-client information that's currently shown (level,
 517           client name, JobId, Volume, pool, device, Files, etc.) is good, but
 518           somewhat hard to parse (both programmatically and visually),
 519           particularly when there are many active clients.
 520
 521
 522
 523 Item 23:  Message mailing based on backup types
 524 Origin:  Evan Kaufman <evan.kaufman@gmail.com>
 525   Date:  January 6, 2006
 526 Status:
 527
 528   What:  In the "Messages" resource definitions, allowing messages
 529          to be mailed based on the type (backup, restore, etc.) and level
 530          (full, differential, etc) of job that created the originating
 531          message(s).
 532
 533 Why:     It would, for example, allow someone's boss to be emailed
 534          automatically only when a Full Backup job runs, so he can
 535          retrieve the tapes for offsite storage, even if the IT dept.
 536          doesn't (or can't) explicitly notify him.  At the same time, his
 537          mailbox wouldn't be filled by notifications of Verifies, Restores,
 538          or Incremental/Differential Backups (which would likely be kept
 539          onsite).
 540
 541 Notes:   One way this could be done is through additional message types, for example:
 542
 543    Messages {
 544      # email the boss only on full system backups
 545      Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
 546             !verify, !admin
 547      # email us only when something breaks
 548      MailOnError = itdept@mycompany.com = all
 549    }
 550
 551
 552 Item 24:  Allow inclusion/exclusion of files in a fileset by creation/mod times
 553   Origin: Evan Kaufman <evan.kaufman@gmail.com>
 554   Date:   January 11, 2006
 555   Status:
 556
 557   What:   In the vein of the Wild and Regex directives in a Fileset's
 558           Options, it would be helpful to allow a user to include or exclude
 559           files and directories by creation or modification times.
 560
 561           You could factor the Exclude=yes|no option in much the same way it
 562           affects the Wild and Regex directives.  For example, you could exclude
 563           all files modified before a certain date:
 564
 565    Options {
 566      Exclude = yes
 567      Modified Before = ####
 568    }
 569
 570            Or you could exclude all files created/modified since a certain date:
 571
 572    Options {
 573      Exclude = yes
 574      Created Modified Since = ####
 575    }
 576
 577            The format of the time/date could be done several ways, say the number
 578            of seconds since the epoch:
 579            1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`
 580
 581            Or a human readable date in a cryptic form:
 582            20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss
 583
 584   Why:    I imagine a feature like this could have many uses. It would
 585           allow a user to do a full backup while excluding the base operating
 586           system files, so if I installed a Linux snapshot from a CD yesterday,
 587           I'll *exclude* all files modified *before* today.  If I need to
 588           recover the system, I use the CD I already have, plus the tape backup.
 589           Or if, say, a Windows client is hit by a particularly corrosive
 590           virus, and I need to *exclude* any files created/modified *since* the
 591           time of infection.
 592
 593   Notes:  Of course, this feature would work in concert with other
 594           in/exclude rules, and wouldn't override them (or each other).
 595
 596   Notes:  The directives I'd imagine would be along the lines of
 597           "[Created] [Modified] [Before|Since] = <date>".
 598           So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
 599            or 'since'.
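
          A small Python sketch of how the two proposed date formats
          and the Before/Since tests might be evaluated.  It assumes that
          a 14-digit value is always YYYYMMDDhhmmss and that it is
          interpreted as UTC (the example above appears to use local
          time); parse_when and is_excluded are hypothetical names.

```python
from datetime import datetime, timezone


def parse_when(value):
    """Accept epoch seconds or YYYYMMDDhhmmss; return seconds since the epoch."""
    if len(value) == 14:
        # ASSUMPTION: the human-readable form is taken to be UTC.
        stamp = datetime.strptime(value, "%Y%m%d%H%M%S")
        return int(stamp.replace(tzinfo=timezone.utc).timestamp())
    return int(value)


def is_excluded(mtime, modified_before=None, modified_since=None):
    """Apply "Exclude = yes" with Modified Before and/or Modified Since."""
    if modified_before is not None and mtime < parse_when(modified_before):
        return True
    if modified_since is not None and mtime >= parse_when(modified_since):
        return True
    return False


# A file last modified one day before the cut-off is excluded by "Before".
cutoff = "1137008553"
old_file_excluded = is_excluded(1137008553 - 86400, modified_before=cutoff)
new_file_excluded = is_excluded(1137008553 + 86400, modified_before=cutoff)
```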
 600
 601
 602 Item 25:  Add a scheduling syntax that permits weekly rotations
 603    Date:  15 December 2006
 604   Origin: Gregory Brauer (greg at wildbrain dot com)
 605   Status:
 606
 607    What:  Currently, Bacula only understands how to deal with weeks of the
 608           month or weeks of the year in schedules.  This makes it impossible
 609           to do a true weekly rotation of tapes.  There will always be a
 610           discontinuity that will require disruptive manual intervention at
 611           least monthly or yearly because week boundaries never align with
 612           month or year boundaries.
 613
 614           A solution would be to add a new syntax that defines (at least)
 615           a start timestamp, and repetition period.
 616
 617    Why:   Rotated backups done at weekly intervals are useful, and Bacula
 618           cannot currently do them without extensive hacking.
 619
 620    Notes: Here is an example syntax showing a 3-week rotation where full
 621           Backups would be performed every week on Saturday, and an
 622           incremental would be performed every week on Tuesday.  Each
 623           set of tapes could be removed from the loader for the following
 624           two cycles before coming back and being reused on the third
 625           week.  Since the execution times are determined by intervals
 626           from a given point in time, there will never be any issues with
 627           having to adjust to any sort of arbitrary time boundary.  In
 628           the example provided, I even define the starting schedule
 629           as crossing both a year and a month boundary, but the run times
 630           would be based on the "Repeat" value and would therefore happen
 631           weekly as desired.
 632
 633
 634           Schedule {
 635               Name = "Week 1 Rotation"
 636               #Saturday.  Would run Dec 30, Jan 20, Feb 10, etc.
 637               Run {
 638                   Options {
 639                       Type   = Full
 640                       Start  = 2006-12-30 01:00
 641                       Repeat = 3w
 642                   }
 643               }
 644               #Tuesday.  Would run Jan 2, Jan 23, Feb 13, etc.
 645               Run {
 646                   Options {
 647                       Type   = Incremental
 648                       Start  = 2007-01-02 01:00
 649                       Repeat = 3w
 650                   }
 651               }
 652           }
 653
 654           Schedule {
 655               Name = "Week 2 Rotation"
 656               #Saturday.  Would run Jan 6, Jan 27, Feb 17, etc.
 657               Run {
 658                   Options {
 659                       Type   = Full
 660                       Start  = 2007-01-06 01:00
 661                       Repeat = 3w
 662                   }
 663               }
 664               #Tuesday.  Would run Jan 9, Jan 30, Feb 20, etc.
 665               Run {
 666                   Options {
 667                       Type   = Incremental
 668                       Start  = 2007-01-09 01:00
 669                       Repeat = 3w
 670                   }
 671               }
 672           }
 673
 674           Schedule {
 675               Name = "Week 3 Rotation"
 676               #Saturday.  Would run Jan 13, Feb 3, Feb 24, etc.
 677               Run {
 678                   Options {
 679                       Type   = Full
 680                       Start  = 2007-01-13 01:00
 681                       Repeat = 3w
 682                   }
 683               }
 684               #Tuesday.  Would run Jan 16, Feb 6, Feb 27, etc.
 685               Run {
 686                   Options {
 687                       Type   = Incremental
 688                       Start  = 2007-01-16 01:00
 689                       Repeat = 3w
 690                   }
 691               }
 692           }
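
The dates in the comments above can be checked with a small sketch. The
"Start" and "Repeat" directives are only proposed in this item, so the
parsing below (a count followed by h, d, or w) and the helper names
parse_repeat and run_times are assumptions made for illustration, not
existing Bacula behaviour.

```python
from datetime import datetime, timedelta

def parse_repeat(spec):
    """Turn a hypothetical repeat spec such as '3w' or '10d' into a timedelta."""
    units = {"h": "hours", "d": "days", "w": "weeks"}
    count, unit = int(spec[:-1]), spec[-1]
    return timedelta(**{units[unit]: count})

def run_times(start, repeat, count):
    """Return the first `count` run times: start, start+repeat, start+2*repeat..."""
    first = datetime.strptime(start, "%Y-%m-%d %H:%M")
    step = parse_repeat(repeat)
    return [first + i * step for i in range(count)]

# "Week 1 Rotation": full on Saturday, incremental on the following Tuesday.
week1_full = run_times("2006-12-30 01:00", "3w", 3)
week1_incr = run_times("2007-01-02 01:00", "3w", 3)

for when in week1_full + week1_incr:
    print(when.isoformat(sep=" "))
```

Because every run time is an offset from a fixed starting point, the year
and month boundary between 2006-12-30 and 2007-01-02 needs no special
handling.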
 693
 694
 695 Item 26:  Improve Bacula's tape and drive usage and cleaning management.
 696   Date:   8 November 2005, November 11, 2005
 697   Origin: Adam Thornton <athornton at sinenomine dot net>,
 698           Arno Lehmann <al at its-lehmann dot de>
 699   Status:
 700
 701   What:   Make Bacula manage tape life cycle information, tape reuse
 702           times and drive cleaning cycles.
 703
 704   Why:    All three parts of this project are important when operating
 705           backups.
 706           We need to know which tapes need replacement, and we need to
 707           make sure the drives are cleaned when necessary.  While many
 708           tape libraries and even autoloaders can handle all this
 709           automatically, support by Bacula can be helpful for smaller
 710           (older) libraries and single drives.  Limiting the number of
 711           times a tape is used might prevent the errors that occur when
 712           tapes are used until the drives can't read them any more.  Also,
 713           checking drive status during operation can prevent some failures
 714           (as I [Arno] had to learn the hard way...).
 715
 716   Notes:  First, Bacula could (and even does, to some limited extent)
 717           record tape and drive usage.  For tapes, the number of mounts,
 718           the amount of data, and the time the tape has actually been
 719           running could be recorded.  Data fields for Read and Write
 720           time and Number of mounts already exist in the catalog (I'm
 721           not sure if VolBytes is the sum of all bytes ever written to
 722           that volume by Bacula).  This information can be important
 723           when determining which media to replace.  The ability to mark
 724           Volumes as "used up" after a given number of write cycles
 725           should also be implemented so that a tape is never actually
 726           worn out.  For the tape drives known to Bacula, similar
 727           information is interesting to determine the device status and
 728           expected life time: Time it's been Reading and Writing, number
 729           of tape Loads / Unloads / Errors.  This information is not yet
 730           recorded as far as I [Arno] know.  A new volume status would
 731           be necessary for the new state, like "Used up" or "Worn out".
 732           Volumes with this state could be used for restores, but not
 733           for writing. These volumes should be migrated first (assuming
 734           migration is implemented) and, once they are no longer needed,
 735           could be moved to a Trash pool.
 736
 737           The next step would be to implement a drive cleaning setup.
 738           Bacula already has knowledge about cleaning tapes.  Once it
 739           has some information about cleaning cycles (measured in drive
 740           run time, number of tapes used, or calendar days, for example)
 741           it can automatically execute tape cleaning (with an
 742           autochanger, obviously) or ask for operator assistance loading
 743           a cleaning tape.
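
A minimal sketch of such a cleaning-cycle check follows. The three limits
mirror the measures named above (drive run time, tapes used, calendar
days); the class names, field names, and threshold values are
hypothetical and do not correspond to existing catalog fields.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CleaningPolicy:
    # Hypothetical limits; a real setup would take these from configuration.
    max_run_hours: float
    max_tapes_used: int
    max_days: int

@dataclass
class DriveUsage:
    # Usage counters accumulated since the last cleaning (assumed to be tracked).
    run_hours: float
    tapes_used: int
    last_cleaned: date

def cleaning_reasons(usage, policy, today):
    """Return the list of limits the drive has reached (empty if none)."""
    reasons = []
    if usage.run_hours >= policy.max_run_hours:
        reasons.append("run time")
    if usage.tapes_used >= policy.max_tapes_used:
        reasons.append("tapes used")
    if (today - usage.last_cleaned).days >= policy.max_days:
        reasons.append("calendar days")
    return reasons

policy = CleaningPolicy(max_run_hours=100.0, max_tapes_used=25, max_days=90)
usage = DriveUsage(run_hours=42.5, tapes_used=30, last_cleaned=date(2006, 11, 1))
print(cleaning_reasons(usage, policy, today=date(2006, 12, 15)))
```

A non-empty result would lead either to loading a cleaning tape through
the autochanger or to a request for operator assistance.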
 744
 745           The final step would be to perform TAPEALERT checks not only
 746           when changing tapes, merely sending the information to the
 747           administrator, but also after each tape error, on a regular
 748           basis (for example after each tape file), and before
 749           unloading and after loading a new tape.
 750           Then, depending on the drive's TAPEALERT state and the known
 751           drive cleaning state, Bacula could automatically schedule later
 752           cleaning, clean immediately, or inform the operator.
 753
 754           Implementing this would perhaps require another catalog change
 755           and perhaps major changes in SD code and the DIR-SD protocol,
 756           so I'd only consider this worth implementing if it would
 757           actually be used or even needed by many people.
 758
 759           Implementation of these projects could happen in three distinct
 760           sub-projects: Measuring Tape and Drive usage, retiring
 761           volumes, and handling drive cleaning and TAPEALERTs.
 762
 763 Item 27:  Implement support for stacking arbitrary stream filters, sinks.
 764 Date:     23 November 2006
 765 Origin:   Landon Fuller <landonf@threerings.net>
 766 Status:   Planning. Assigned to landonf.
 767
 768 What:
 769         Implement support for the following:
 770         - Stacking arbitrary stream filters (e.g., encryption, compression,
 771           sparse data handling)
 772         - Attaching file sinks to terminate stream filters (i.e., write out
 773           the resultant data to a file)
 774         - Refactor the restoration state machine accordingly
 775
 776 Why:
 777         The existing stream implementation suffers from the following:
 778          - All state (compression, encryption, stream restoration) is
 779            global across the entire restore process, for all streams. There are
 780            multiple entry and exit points in the restoration state machine, and
 781            thus multiple places where state must be allocated, deallocated,
 782            initialized, or reinitialized. This results in exceptional complexity
 783            for the author of a stream filter.
 784          - The developer must enumerate all possible combinations of filters
 785            and stream types (i.e., win32 data with encryption, without encryption,
 786            with encryption AND compression, etc).
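
As an illustration of the idea only (the real work would be in the file
daemon's C code), the sketch below stacks two filters on top of a file
sink. Each filter keeps its own state and only knows the next stage, so
combinations do not have to be enumerated. All class and function names
are invented for this example, and the XOR filter is a toy stand-in for
decryption, not real cryptography.

```python
import io
import zlib

class FileSink:
    """Terminates a filter chain by writing the final bytes to a file object."""
    def __init__(self, fileobj):
        self.fileobj = fileobj
    def write(self, data):
        self.fileobj.write(data)
    def close(self):
        self.fileobj.flush()

class DecompressFilter:
    """Inflates zlib data; the inflater state is private to this one stream."""
    def __init__(self, downstream):
        self.downstream = downstream
        self._inflater = zlib.decompressobj()
    def write(self, data):
        self.downstream.write(self._inflater.decompress(data))
    def close(self):
        self.downstream.write(self._inflater.flush())
        self.downstream.close()

class XorFilter:
    """Toy stand-in for a decryption filter (NOT real cryptography)."""
    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key
    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))
    def close(self):
        self.downstream.close()

def build_stack(sink, *factories):
    """Wrap `sink` in filters; the first factory listed sees the data first."""
    stage = sink
    for factory in reversed(factories):
        stage = factory(stage)
    return stage

original = b"restored file contents " * 100
stored = bytes(b ^ 0x5A for b in zlib.compress(original))  # compress, then "encrypt"

out = io.BytesIO()
stack = build_stack(FileSink(out), lambda s: XorFilter(s, 0x5A), DecompressFilter)
for offset in range(0, len(stored), 64):  # feed the stream in small records
    stack.write(stored[offset:offset + 64])
stack.close()
print(out.getvalue() == original)
```

Adding or removing a stage changes only the arguments to build_stack; no
other filter needs to know about it.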
 787
 788 Notes:
 789         This feature request only covers implementing the stream filters/
 790         sinks, and refactoring the file daemon's restoration implementation
 791         accordingly. If I have extra time, I will also rewrite the backup
 792         implementation. My intent in implementing the restoration first is to
 793         solve pressing bugs in the restoration handling, and to ensure that
 794         the new restore implementation handles existing backups correctly.
 795
 796         I do not plan on changing the network or tape data structures to
 797         support defining arbitrary stream filters, but supporting that
 798         functionality is the ultimate goal.
 799
 800         Assistance with either code or testing would be fantastic.
 801
 802 Item 28:  Allow FD to initiate a backup
 803   Origin: Frank Volf (frank at deze dot org)
 804   Date:   17 November 2005
 805   Status:
 806
 807    What:  Provide some means, possibly by a restricted console that
 808           allows a FD to initiate a backup, and that uses the connection
 809           established by the FD to the Director for the backup so that
 810           a Director that is firewalled can do the backup.
 811
 812    Why:   Makes backup of laptops much easier.
 813
 814 Item 29:  Directive/mode to backup only file changes, not entire file
 815   Date:   11 November 2005
 816   Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
 817           Marek Bajon <mbajon at bimsplus dot com dot pl>
 818   Status:
 819
 820   What:   Currently when a file changes, the entire file will be backed up in
 821           the next incremental or full backup.  To save space on the tapes
 822           it would be nice to have a mode whereby only the changes to the
 823           file would be backed up when it is changed.
 824
 825   Why:    This would save lots of space when backing up large files such as
 826           logs, mbox files, Outlook PST files and the like.
 827
 828   Notes:  This would require the usage of disk-based volumes as comparing
 829           files would not be feasible using a tape drive.
 830
 831 Item 30:  Automatic disabling of devices
 832    Date:   2005-11-11
 833    Origin: Peter Eriksson <peter at ifm.liu dot se>
 834    Status:
 835
 836    What:  After a configurable number of fatal errors with a tape drive,
 837           Bacula should automatically disable further use of that
 838           tape drive. There should also be "disable"/"enable" commands in
 839           the "bconsole" tool.
 840
 841    Why:   On a multi-drive jukebox there is a possibility of tape drives
 842           going bad during large backups (needing a cleaning tape run,
 843           tapes getting stuck). It would be advantageous if Bacula would
 844           automatically disable further use of a problematic tape drive
 845           after a configurable number of errors has occurred.
 846
 847           An example: I have a multi-drive jukebox (6 drives, 380+ slots)
 848           where tapes occasionally get stuck inside the drive. Bacula will
 849           notice that the "mtx-changer" command fails and then fail
 850           any backup jobs trying to use that drive. However, it will still
 851           keep on trying to run new jobs using that drive and fail -
 852           forever, thus failing lots and lots of jobs... Since we have
 853           many drives, Bacula could have just automatically disabled
 854           further use of that drive and used one of the other ones
 855           instead.
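
The requested behaviour could look roughly like the sketch below. The
DrivePool class, its method names, and the choice to reset the error
count on "enable" are assumptions for illustration; nothing here reflects
existing storage daemon code.

```python
class DrivePool:
    """Tracks fatal errors per drive and disables a drive past a limit."""

    def __init__(self, drive_names, max_fatal_errors):
        self.max_fatal_errors = max_fatal_errors  # the configurable limit
        self.errors = {name: 0 for name in drive_names}
        self.enabled = {name: True for name in drive_names}

    def record_fatal_error(self, name):
        """Count one fatal error; disable the drive once the limit is reached."""
        self.errors[name] += 1
        if self.errors[name] >= self.max_fatal_errors:
            self.enabled[name] = False

    def disable(self, name):
        """Manual counterpart of the proposed console 'disable' command."""
        self.enabled[name] = False

    def enable(self, name):
        """Manual 'enable': clear the error count (an assumption) and re-enable."""
        self.errors[name] = 0
        self.enabled[name] = True

    def pick_drive(self):
        """Return the first enabled drive, or None if every drive is disabled."""
        for name, is_enabled in self.enabled.items():
            if is_enabled:
                return name
        return None

pool = DrivePool(["drive-0", "drive-1"], max_fatal_errors=3)
for _ in range(3):
    pool.record_fatal_error("drive-0")  # e.g. a tape stuck in the drive
print(pool.pick_drive())
```

With the stuck drive disabled, new jobs are directed to the remaining
drive instead of failing repeatedly.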
 856
 857 Item 31:  Incorporation of XACML2/SAML2 parsing
 858    Date:   19 January 2006
 859    Origin: Adam Thornton <athornton@sinenomine.net>
 860    Status: Blue sky
 861
 862    What:   XACML is the "eXtensible Access Control Markup Language" and
 863           SAML is the "Security Assertion Markup Language"--an XML standard
 864           for making statements about identity and authorization.  Having these
 865           would give us a framework to approach ACLs in a generic manner, and
 866           in a way flexible enough to support the four major sorts of ACLs I
 867           see as a concern to Bacula at this point, as well as (probably) to
 868           deal with new sorts of ACLs that may appear in the future.
 869
 870    Why:    Bacula is beginning to need to back up systems with ACLs
 871           that do not map cleanly onto traditional Unix permissions.  I see
 872           four sets of ACLs--in general, mutually incompatible with one
 873           another--that we're going to need to deal with.  These are: NTFS
 874           ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs.  (Some may question the
 875           relevance of AFS; AFS is one of Sine Nomine's core consulting
 876           businesses, and having a reputable file-level backup and restore
 877           technology for it (as Tivoli is probably going to drop AFS support
 878           soon since IBM no longer supports AFS) would be of huge benefit to
 879           our customers; we'd most likely create the AFS support at Sine Nomine
 880           for inclusion into the Bacula core code (and perhaps make some
 881           changes to the OpenAFS volserver).)
 882
 883           Now, obviously, Bacula already handles NTFS just fine.  However, I
 884           think there's a lot of value in implementing a generic ACL model, so
 885           that it's easy to support whatever particular instances of ACLs come
 886           down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
 887           things arriving in the Linux world in a big way in the near future.
 888           XACML, although overcomplicated for our needs, provides this
 889           framework, and we should be able to leverage other people's
 890           implementations to minimize the amount of work *we* have to do to get
 891           a generic ACL framework.  Basically, the costs of implementation are
 892           high, but they're largely both external to Bacula and already sunk.
 893
 894
 895 Item 32:  Clustered file-daemons
 896   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 897   Date:   24 July 2006
 898   Status:
 899   What:   A "virtual" filedaemon, which is actually a cluster of real ones.
 900
 901   Why:    In the case of clustered filesystems (SAN setups, GFS, OCFS2, etc.),
 902           multiple machines may have access to the same set of filesystems.
 903
 904           For performance reasons, one may wish to initiate backups from
 905           several of these machines simultaneously, instead of just using
 906           one backup source for the common clustered filesystem.
 907
 908           For obvious reasons, normally backups of $A-FD/$PATH and
 909           $B-FD/$PATH are treated as different backup sets. In this case
 910           they are the same communal set.
 911
 912           Likewise when restoring, it would be easier to just specify
 913           one of the cluster machines and let Bacula decide which to use.
 914
 915           This can be faked to some extent using DNS round robin entries
 916           and a virtual IP address, however it means "status client" will
 917           always give bogus answers. Additionally there is no way of
 918           spreading the load evenly among the servers.
 919
 920           What is required is something similar to the storage daemon
 921           autochanger directives, so that Bacula can keep track of
 922           operating backups/restores and direct new jobs to a "free"
 923           client.
 924
 925    Notes:
 926
 927 Item 33:  Commercial database support
 928   Origin: Russell Howe <russell_howe dot wreckage dot org>
 929   Date:   26 July 2006
 930   Status:
 931
 932   What:   It would be nice for the database backend to support more
 933           databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
 934           DB2, MaxDB, etc are all candidates. SQL Server would presumably be
 935           implemented using FreeTDS or maybe an ODBC library?
 936
 937   Why:    We only really have one database server, which is MS SQL Server
 938           2000. Maintaining a second one for the backup software is a burden (we
 939           grew out of SQLite, which I liked, but which didn't work so well with our
 940           database size). We don't really have a machine with the resources to run
 941           PostgreSQL, and would rather only maintain a single DBMS. We're stuck with
 942           SQL Server because pretty much all the company's custom applications
 943           (written by consultants) are locked into SQL Server 2000. I can imagine
 944           this scenario is fairly common, and it would be nice to use the existing
 945           properly specced database server for storing Bacula's catalog, rather
 946           than having to run a second DBMS.
 947
 948
 949 Item 34:  Archive data
 950   Date:   15/5/2006
 951   Origin: calvin streeting calvin at absentdream dot com
 952   Status:
 953
 954   What:   The ability to archive to media (DVD/CD) in an uncompressed format
 955           for dead filing (archiving, not backing up).
 956
 957   Why:  At my workplace, finished jobs are moved off the main file
 958         servers (RAID-based systems) onto a simple Linux file server (IDE-based
 959         system) so users can find old information without contacting the IT
 960         dept.
 961
 962         This data doesn't really change; it only gets added to,
 963         but it also needs backing up.  At the moment it takes
 964         about 8 hours to back up our servers (working data), so
 965         rather than add more time to existing backups I am trying
 966         to implement a system where we back up the archive data to
 967         CD/DVD.  These disks would only need to be appended to
 968         (burn only new/changed files to new disks for off-site
 969         storage).  Basically, understand the difference between
 970         archive data and live data.
 971
 972  Notes: Scan the data and email me when it needs burning; divide
 973         it into predefined chunks; keep a record of what is on what
 974         disk; make me a label (simple php->mysql=>pdf stuff, I
 975         could do this bit); add the ability to save data uncompressed so
 976         it can be read in any other system (future-proof data);
 977         save the catalog with the disk as some kind of menu
 978         system.
 979
 980 Item 35:  Filesystem watch triggered backup.
 981   Date:   31 August 2006
 982   Origin: Jesper Krogh <jesper@krogh.cc>
 983   Status: Unimplemented, depends probably on "client initiated backups"
 984
 985   What:   With inotify and similar filesystem-triggered notification
 986           systems, it is possible to have the file daemon monitor
 987           filesystem changes and initiate backups.
 988
 989   Why:    There are 2 situations where this is nice to have.
 990           1) It is possible to get a much finer-grained backup than
 991              the fixed schedules used now. A file created and deleted
 992              a few hours later can automatically be caught.
 993
 994           2) The introduced load on the system will probably be
 995              distributed more evenly.
 996
 997   Notes:  This can be combined with configuration that specifies
 998           something like: "at most every 15 minutes or when changes
 999           consumed XX MB".
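
One possible reading of that rule, as a sketch: start a backup as soon as
the pending changes reach a size limit, and otherwise no more often than
once per interval. The BackupTrigger class and its parameter names are
hypothetical, and the change events are assumed to come from whatever
watcher (inotify or similar) is in use.

```python
class BackupTrigger:
    """Decides when accumulated filesystem changes justify starting a backup."""

    def __init__(self, min_interval_s, max_pending_bytes):
        self.min_interval_s = min_interval_s        # e.g. 15 minutes
        self.max_pending_bytes = max_pending_bytes  # e.g. "XX MB"
        self.pending_bytes = 0
        self.last_backup = None  # time of the last backup start, in seconds

    def on_change(self, nbytes):
        """Called by the filesystem watcher for every reported change."""
        self.pending_bytes += nbytes

    def should_start(self, now):
        """True if a backup should be started at time `now` (seconds)."""
        if self.pending_bytes == 0:
            return False  # nothing changed, nothing to back up
        if self.pending_bytes >= self.max_pending_bytes:
            return True   # size limit reached, do not wait for the interval
        if self.last_backup is None:
            return True   # first change ever seen
        return now - self.last_backup >= self.min_interval_s

    def backup_started(self, now):
        """Reset the counters once the job has been started."""
        self.last_backup = now
        self.pending_bytes = 0

trigger = BackupTrigger(min_interval_s=15 * 60, max_pending_bytes=50 * 1024 * 1024)
trigger.on_change(4096)
print(trigger.should_start(now=0))
```

The same policy works whether it lives in the file daemon or in an
external monitoring program that starts the job through the console.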
1000
1001 Kern Notes: I would rather see this implemented by an external program
1002           that monitors the Filesystem changes, then uses the console
1003           to start the appropriate job.
1004
1005 Item 36:  Implement multiple numeric backup levels as supported by dump
1006 Date:     3 April 2006
1007 Origin:   Daniel Rich <drich@employees.org>
1008 Status:
1009 What:     Dump allows specification of backup levels numerically instead of just
1010           "full", "incr", and "diff".  In this system, at any given level, all
1011           files are backed up that were modified since the last backup of a
1012           higher level (with 0 being the highest and 9 being the lowest).  A
1013           level 0 is therefore equivalent to a full, level 9 an incremental, and
1014           the levels 1 through 8 are varying levels of differentials.  For
1015           Bacula's sake, these could be represented as "full", "incr", and
1016           "diff1", "diff2", etc.
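
The level rule can be sketched as follows. "Higher level" in the text
means a numerically lower level, so the reference point for a level N
backup is the most recent earlier backup whose level number is below N.
The function name and the (level, date) history format are made up for
this example.

```python
from datetime import date

def reference_backup(history, level):
    """Return the date a new backup at `level` is taken relative to.

    `history` is a list of (level, date) pairs for earlier backups.
    Level 0 is a full backup and has no reference (None).
    """
    if level == 0:
        return None
    earlier = [when for lvl, when in history if lvl < level]
    return max(earlier) if earlier else None

history = [
    (0, date(2006, 4, 1)),  # full
    (5, date(2006, 4, 2)),
    (9, date(2006, 4, 3)),
    (3, date(2006, 4, 4)),
]

for level in (0, 3, 5, 9):
    print(level, reference_backup(history, level))
```

Here a new level 3 backup picks up everything changed since the full on
2006-04-01, while a new level 5 or level 9 backup only needs changes
since the level 3 backup on 2006-04-04.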
1017
1018 Why:      Support of multiple backup levels would provide for more advanced backup
1019           rotation schemes such as "Towers of Hanoi".  This would allow better
1020           flexibility in performing backups, and can lead to shorter recovery
1021           times.
1022
1023 Notes:    Legato Networker supports a similar system with full, incr, and 1-9 as
1024           levels.
1025 Item 1:   Implement a server-side compression feature
1026   Date:   18 December 2006
1027   Origin: Vadim A. Umanski, e-mail umanski@ext.ru
1028   Status:
1029   What:   The ability to compress backup data on the server receiving
1030           the data instead of on the client sending it.
1031   Why:    The need is practical. I've got some machines that can send
1032           data to the network 4 or 5 times faster than they can compress
1033           it (I've measured that). They're using fast enough SCSI/FC
1034           disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
1035           And the backup server has quite fast CPUs (e.g. Dual P4
1036           Xeons) and quite a low load. When you have 20, 50 or 100 GB
1037           of raw data, running a job 4 to 5 times faster
1038           really matters. On the other hand, the data can be
1039           compressed 50% or better, so using twice as much space for
1040           disk backup is not good at all. And the network is all mine
1041           (I have a dedicated management/provisioning network) and I
1042           can get as high bandwidth as I need - 100Mbps, 1000Mbps...
1043           That's why the server-side compression feature is needed!
1044   Notes:
1045
1046 Item 1:  Cause daemons to use a specific IP address to source communications
1047  Origin: Bill Moran <wmoran@collaborativefusion.com>
1048  Date:   18 Dec 2006
1049  Status:
1050  What:   Cause Bacula daemons (dir, fd, sd) to always use the IP address
1051          specified in the [DIR|FD|SD]Addr directive as the source IP
1052          for initiating communication.
1053  Why:    On complex networks, as well as extremely secure networks, it's
1054          not unusual to have multiple possible routes through the network.
1055          Often, each of these routes is secured by different policies
1056          (effectively, firewalls allow or deny different traffic depending
1057          on the source address).
1058          Unfortunately, it can sometimes be difficult or impossible to
1059          represent this in a system routing table, as the result is
1060          excessive subnetting that quickly exhausts available IP space.
1061          The best available workaround is to provide multiple IPs to
1062          a single machine that are all on the same subnet.  In order
1063          for this to work properly, applications must support the ability
1064          to bind outgoing connections to a specified address; otherwise
1065          the operating system will always choose the first IP that
1066          matches the required route.
1067  Notes:  Many other programs support this.  For example, the following
1068          can be configured in BIND:
1069          query-source address 10.0.0.1;
1070          transfer-source 10.0.0.2;
1071          Which means queries from this server will always come from
1072          10.0.0.1 and zone transfers will always originate from
1073          10.0.0.2.
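
At the socket level, the request amounts to binding the local end of an
outgoing connection before connecting. The sketch below shows this with
Python's standard library, purely as an illustration of the technique;
the connect_from helper is invented for this example and the Bacula
daemons themselves are not written in Python.

```python
import socket

def connect_from(source_ip, dest_host, dest_port, timeout=5.0):
    """Open a TCP connection whose local address is `source_ip`.

    Port 0 lets the operating system choose the local port; only the
    source address is pinned.
    """
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout,
        source_address=(source_ip, 0),
    )
```

A daemon would pass the address from its configured directive as
source_ip, so that firewalls along the route see the expected source
address regardless of which address the routing table would have picked.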
1074
1075 Kern notes: I think this would add very little functionality, but a *lot* of
1076           additional overhead to Bacula.
1077
1078
1079
1080 ============= Empty Feature Request form ===========
1081 Item  n:  One line summary ...
1082   Date:   Date submitted
1083   Origin: Name and email of originator.
1084   Status:
1085
1086   What:   More detailed explanation ...
1087
1088   Why:    Why it is important ...
1089
1090   Notes:  Additional notes or features (omit if not used)
1091 ============== End Feature Request form ==============