   1
   2 Projects:
   3                      Bacula Projects Roadmap
   4                     Status updated 26 January 2007
   5                    After re-ordering in vote priority
   6
   7 Items Completed:
   8 Item:  18   Quick release of FD-SD connection after backup.
   9 Item:  40   Include JobID in spool file name
  10
  11 Summary:
  12 Item:   1   Accurate restoration of renamed/deleted files
  13 Item:   2   Implement a Bacula GUI/management tool.
  14 Item:   3   Allow FD to initiate a backup
  15 Item:   4   Merge multiple backups (Synthetic Backup or Consolidation).
  16 Item:   5   Deletion of Disk-Based Bacula Volumes
  17 Item:   6   Implement Base jobs.
  18 Item:   7   Implement creation and maintenance of copy pools
  19 Item:   8   Directive/mode to backup only file changes, not entire file
  20 Item:   9   Implement a server-side compression feature
  21 Item:  10   Improve Bacula's tape and drive usage and cleaning management.
  22 Item:  11   Allow skipping execution of Jobs
  23 Item:  12   Add a scheduling syntax that permits weekly rotations
  24 Item:  13   Archival (removal) of User Files to Tape
  25 Item:  14   Cause daemons to use a specific IP address to source communications
  26 Item:  15   Multiple threads in file daemon for the same job
  27 Item:  16   Add Plug-ins to the FileSet Include statements.
  28 Item:  17   Restore only file attributes (permissions, ACL, owner, group...)
  29 Item:  18*  Quick release of FD-SD connection after backup.
  30 Item:  19   Implement a Python interface to the Bacula catalog.
  31 Item:  20   Archive data
  32 Item:  21   Split documentation
  33 Item:  22   Implement support for stacking arbitrary stream filters, sinks.
  34 Item:  23   Implement from-client and to-client on restore command line.
  35 Item:  24   Add an override in Schedule for Pools based on backup types.
  36 Item:  25   Implement huge exclude list support using hashing.
  37 Item:  26   Implement more Python events in Bacula.
  38 Item:  27   Incorporation of XACML2/SAML2 parsing
  39 Item:  28   Filesystem watch triggered backup.
  40 Item:  29   Allow inclusion/exclusion of files in a fileset by creation/mod times
  41 Item:  30   Tray monitor window cleanups
  42 Item:  31   Implement multiple numeric backup levels as supported by dump
  43 Item:  32   Automatic promotion of backup levels
  44 Item:  33   Clustered file-daemons
  45 Item:  34   Commercial database support
  46 Item:  35   Automatic disabling of devices
  47 Item:  36   An option to operate on all pools with update vol parameters
  48 Item:  37   Add an item to the restore option where you can select a pool
  49 Item:  38   Include timestamp of job launch in "stat clients" output
  50 Item:  39   Message mailing based on backup types
  51 Item:  40*  Include JobID in spool file name
  52
  53
  54 Item  1:  Accurate restoration of renamed/deleted files
  55   Date:   28 November 2005
  56   Origin: Martin Simmons (martin at lispworks dot com)
  57   Status: Robert Nelson will implement this
  58
  59   What:   When restoring a fileset for a specified date (including "most
  60           recent"), Bacula should give you exactly the files and directories
  61           that existed at the time of the last backup prior to that date.
  62
  63           Currently this only works if the last backup was a Full backup.
  64           When the last backup was Incremental/Differential, files and
  65           directories that have been renamed or deleted since the last Full
  66           backup are not currently restored correctly.  Ditto for files with
  67           extra/fewer hard links than at the time of the last Full backup.
  68
  69   Why:    Incremental/Differential would be much more useful if this worked.
  70
  71   Notes:  Merging of multiple backups into a single one seems to
  72           rely on this working, otherwise the merged backups will not be
  73           truly equivalent to a Full backup.
  74
  75           Kern: notes shortened. This can be done without the need for
  76           inodes. It is essentially the same as the current Verify job,
  77           but one additional database record must be written, which does
  78           not need any database change.
  79
  80           Kern: see if we can correct restoration of directories if
  81           replace=ifnewer is set.  Currently, if the directory does not
  82           exist, a "dummy" directory is created, then when all the files
  83           are updated, the dummy directory is newer so the real values
  84           are not updated.
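
               A minimal sketch of the restore rule described under What,
               assuming (this is not stated above) that the one additional
               record per Incremental/Differential is a list of paths found
               to be deleted since the previous job.  The job dictionaries
               and the function name are hypothetical, not Bacula structures.

```python
def files_at_restore_point(full, incrementals):
    """Return {path: job_id} naming the job each file should be restored from.

    `full` and each entry of `incrementals` are hypothetical dicts with
    "job_id", "files" (paths saved by the job) and, for incrementals,
    "deleted" (paths that no longer existed when the job ran).
    `incrementals` must be ordered oldest first.
    """
    state = {path: full["job_id"] for path in full["files"]}
    for job in incrementals:
        # Drop files that had disappeared (deleted, or the old name of a rename).
        for path in job["deleted"]:
            state.pop(path, None)
        # Files saved by a later job supersede older copies.
        for path in job["files"]:
            state[path] = job["job_id"]
    return state


full = {"job_id": 1, "files": {"/etc/hosts", "/home/a.txt", "/home/old.txt"}}
incrementals = [
    # /home/old.txt was renamed to /home/new.txt before job 2.
    {"job_id": 2, "files": {"/home/new.txt"}, "deleted": {"/home/old.txt"}},
    {"job_id": 3, "files": {"/home/a.txt"}, "deleted": set()},
]
result = files_at_restore_point(full, incrementals)
```

               Here a rename is treated as a deletion of the old name plus a
               new file, so /home/old.txt is not restored.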
  85
  86 Item  2:  Implement a Bacula GUI/management tool.
  87   Origin: Kern
  88   Date:   28 October 2005
  89   Status:
  90
  91   What:   Implement a Bacula console, and management tools
  92           probably using Qt3 and C++.
  93
  94   Why:    Don't we already have a wxWidgets GUI?  Yes, but
  95           it is written in C++ and changes to the user interface
  96           must be hand tailored using C++ code. By developing
  97           the user interface using Qt designer, the interface
  98           can be very easily updated and most of the new Python
  99           code will be automatically created.  The user interface
 100           changes become very simple, and only the new features
 101           must be implemented.  In addition, the code will be in
 102           Python, which will give many more users easy (or easier)
 103           access to making additions or modifications.
 104
 105   Notes:  There is a partial Python-GTK implementation by
 106           Lucas Di Pentima <lucas at lunix dot com dot ar> but
 107           it is no longer being developed.
 108
 109 Item  3:  Allow FD to initiate a backup
 110   Origin: Frank Volf (frank at deze dot org)
 111   Date:   17 November 2005
 112   Status:
 113
 114    What:  Provide some means, possibly by a restricted console that
 115           allows a FD to initiate a backup, and that uses the connection
 116           established by the FD to the Director for the backup so that
 117           a Director that is firewalled can do the backup.
 118
 119    Why:   Makes backup of laptops much easier.
 120
 121
 122 Item  4:  Merge multiple backups (Synthetic Backup or Consolidation).
 123   Origin: Marc Cousin and Eric Bollengier
 124   Date:   15 November 2005
 125   Status: Awaiting implementation.  Depends on first implementing
 126           project Item 2 (Migration) which is now done.
 127
 128   What:   A merged backup is a backup made without connecting to the Client.
 129           It would be a Merge of existing backups into a single backup.
 130           In effect, it is like a restore but to the backup medium.
 131
 132           For instance, say that last Sunday we made a full backup.  Then
 133           all week long, we created incremental backups, in order to do
 134           them fast.  Now comes Sunday again, and we need another full.
 135           The merged backup makes it possible to do instead an incremental
 136           backup (during the night for instance), and then create a merged
 137           backup during the day, by using the full and incrementals from
 138           the week.  The merged backup will be exactly like a full made
 139           Sunday night on the tape, but the production interruption on the
 140           Client will be minimal, as the Client will only have to send
 141           incrementals.
 142
 143           In fact, if it's done correctly, you could merge all the
 144           Incrementals into a single Incremental, or all the Incrementals
 145           and the last Differential into a new Differential, or the Full,
 146           last differential and all the Incrementals into a new Full
 147           backup.  And there is no need to involve the Client.
 148
 149   Why:    The benefits are:
 150           - the Client only has to do an incremental;
 151           - the merged backup on tape is just like a single full backup,
 152             and can be restored very fast.
 153
 154           This is also a way of reducing the backup data since the old
 155           data can then be pruned (or not) from the catalog, possibly
 156           allowing older volumes to be recycled.
 157
 158 Item  5:  Deletion of Disk-Based Bacula Volumes
 159   Date:   Nov 25, 2005
 160   Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
 161           by Kern)
 162   Status:
 163
 164    What:  Provide a way for Bacula to automatically remove Volumes
 165           from the filesystem, or optionally to truncate them.
 166           Obviously, the Volume must be pruned prior to removal.
 167
 168   Why:    This would allow users more control over their Volumes and
 169           prevent disk based volumes from consuming too much space.
 170
 171   Notes:  The following two directives might do the trick:
 172
 173           Volume Data Retention = <time period>
 174           Remove Volume After = <time period>
 175
 176           The migration project should also remove a Volume that is
 177           migrated. This might also work for tape Volumes.
 178
 179 Item  6:  Implement Base jobs.
 180   Date:   28 October 2005
 181   Origin: Kern
 182   Status:
 183
 184   What:   A base job is sort of like a Full save except that you
 185           will want the FileSet to contain only files that are
 186           unlikely to change in the future (i.e.  a snapshot of
 187           most of your system after installing it).  After the
 188           base job has been run, when you are doing a Full save,
 189           you specify one or more Base jobs to be used.  All
 190           files that have been backed up in the Base job/jobs but
 191           not modified will then be excluded from the backup.
 192           During a restore, the Base jobs will be automatically
 193           pulled in where necessary.
 194
 195   Why:    This is something none of the competition does, as far as
 196           we know (except perhaps BackupPC, which is a Perl program that
 197           saves to disk only).  It is a big win for the user: it
 198           makes Bacula stand out as offering a unique
 199           optimization that immediately saves time and money.
 200           Basically, imagine that you have 100 nearly identical
 201           Windows or Linux machines containing the OS and user
 202           files.  Now for the OS part, a Base job will be backed
 203           up once, and rather than making 100 copies of the OS,
 204           there will be only one.  If one or more of the systems
 205           have some files updated, no problem, they will be
 206           automatically restored.
 207
 208   Notes:  Huge savings in tape usage even for a single machine.
 209           Will require more resources because the DIR must send
 210           FD a list of files/attribs, and the FD must search the
 211           list and compare it for each file to be saved.
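
               The comparison the FD would have to make can be sketched as
               follows.  This is only an illustration: the attribute tuples
               (size, mtime) and the function name are assumptions, and the
               real list sent by the DIR would carry whatever attributes
               Bacula actually compares.

```python
def files_to_save(current, base_jobs):
    """Return the sorted paths a Full save must still back up.

    `current` maps path -> (size, mtime) as seen on the client now.
    `base_jobs` is a list of such maps, one per Base job; a later Base
    job overrides an earlier one for the same path.
    """
    known = {}
    for job in base_jobs:
        known.update(job)
    # A file is skipped only if a Base job holds it with identical attributes.
    return sorted(path for path, attrs in current.items()
                  if known.get(path) != attrs)


base = [{"/bin/ls": (100, 1), "/etc/hosts": (10, 1)}]
current = {
    "/bin/ls": (100, 1),        # unchanged since the Base job: skipped
    "/etc/hosts": (12, 5),      # modified: saved
    "/home/u/doc": (7, 9),      # not in any Base job: saved
}
to_save = files_to_save(current, base)
```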
 212
 213 Item  7:  Implement creation and maintenance of copy pools
 214   Date:   27 November 2005
 215   Origin: David Boyes (dboyes at sinenomine dot net)
 216   Status:
 217
 218   What:   I would like Bacula to have the capability to write copies
 219           of backed-up data on multiple physical volumes selected
 220           from different pools without transferring the data
 221           multiple times, and to accept any of the copy volumes
 222           as valid for restore.
 223
 224   Why:    In many cases, businesses are required to keep offsite
 225           copies of backup volumes, or just wish for simple
 226           protection against a human operator dropping a storage
 227           volume and damaging it. The ability to generate multiple
 228           volumes in the course of a single backup job allows
 229           customers to simply check out one copy and send it
 230           offsite, marking it as out of changer or otherwise
 231           unavailable. Currently, the library and magazine
 232           management capability in Bacula does not make this process
 233           simple.
 234
 235           Restores would use the copy of the data on the first
 236           available volume, in order of copy pool chain definition.
 237
 238           This is also a major scalability issue -- as the number of
 239           clients increases beyond several thousand, and the volume
 240           of data increases, transferring the data multiple times to
 241           produce additional copies of the backups will become
 242           physically impossible due to transfer speed
 243           issues. Generating multiple copies at server side will
 244           become the only practical option.
 245
 246   How:    I suspect that this will require adding a multiplexing
 247           SD that appears to be a SD to a specific FD, but 1-n FDs
 248           to the specific back end SDs managing the primary and copy
 249           pools.  Storage pools will also need to acquire parameters
 250           to define the pools to be used for copies.
 251
 252   Notes:  I would commit some of my developers' time if we can agree
 253           on the design and behavior.
 254
 255 Item  8:  Directive/mode to backup only file changes, not entire file
 256   Date:   11 November 2005
 257   Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
 258           Marek Bajon <mbajon at bimsplus dot com dot pl>
 259   Status:
 260
 261   What:   Currently when a file changes, the entire file will be backed up in
 262           the next incremental or full backup.  To save space on the tapes
 263           it would be nice to have a mode whereby only the changes to the
 264           file would be backed up when it is changed.
 265
 266   Why:    This would save lots of space when backing up large files such as
 267           logs, mbox files, Outlook PST files and the like.
 268
 269   Notes:  This would require the usage of disk-based volumes as comparing
 270           files would not be feasible using a tape drive.
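
               One possible way to find "only the changes" is to compare
               fixed-size blocks by digest.  The sketch below assumes that
               the block digests of the previously saved version are
               available somewhere (for example in the catalog); that
               storage, the tiny block size and the function names are all
               assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, so the example is easy to follow


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of each fixed-size block of `data`."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or differ."""
    delta = {}
    for index, offset in enumerate(range(0, len(new_data), block_size)):
        block = new_data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            delta[index] = block
    return delta


previous = block_hashes(b"aaaabbbbcccc")
delta = changed_blocks(previous, b"aaaaXbbbccccdd")
```

               Only blocks 1 and 3 would be written to the Volume here.  A
               file that shrinks would also need its new length recorded,
               which this sketch does not do.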
 271
 272 Item  9:  Implement a server-side compression feature
 273   Date:   18 December 2006
 274   Origin: Vadim A. Umanski, e-mail umanski@ext.ru
 275   Status:
 276   What:   The ability to compress backup data on the server receiving
 277           the data instead of on the client sending it.
 278   Why:    The need is practical.  I've got some machines that can send
 279           data to the network 4 or 5 times faster than they can compress
 280           it (I've measured that).  They're using fast enough SCSI/FC
 281           disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
 282           And the backup server has quite fast CPUs (e.g. dual P4
 283           Xeons) and quite a low load.  When you have 20, 50 or 100 GB
 284           of raw data, running a job 4 to 5 times faster really
 285           matters.  On the other hand, the data can be compressed by
 286           50% or better, so using twice as much space for disk backup
 287           is not good at all.  And the network is all mine (I have a
 288           dedicated management/provisioning network) and I can get as
 289           much bandwidth as I need: 100Mbps, 1000Mbps...  That's why
 290           the server-side compression feature is needed!
 291   Notes:
 292
 293 Item 10:  Improve Bacula's tape and drive usage and cleaning management.
 294   Date:   8 November 2005, 11 November 2005
 295   Origin: Adam Thornton <athornton at sinenomine dot net>,
 296           Arno Lehmann <al at its-lehmann dot de>
 297   Status:
 298
 299   What:   Make Bacula manage tape life cycle information, tape reuse
 300           times and drive cleaning cycles.
 301
 302   Why:    All three parts of this project are important when operating
 303           backups.
 304           We need to know which tapes need replacement, and we need to
 305           make sure the drives are cleaned when necessary.  While many
 306           tape libraries and even autoloaders can handle all this
 307           automatically, support by Bacula can be helpful for smaller
 308           (older) libraries and single drives.  Limiting the number of
 309           times a tape is used might prevent the errors that occur
 310           when tapes are used until drives can't read them.  Also, checking
 311           drive status during operation can prevent some failures (as I
 312           [Arno] had to learn the hard way...)
 313
 314   Notes:  First, Bacula could (and even does, to some limited extent)
 315           record tape and drive usage.  For tapes, the number of mounts,
 316           the amount of data, and the time the tape has actually been
 317           running could be recorded.  Data fields for Read and Write
 318           time and Number of mounts already exist in the catalog (I'm
 319           not sure if VolBytes is the sum of all bytes ever written to
 320           that volume by Bacula).  This information can be important
 321           when determining which media to replace.  The ability to mark
 322           Volumes as "used up" after a given number of write cycles
 323           should also be implemented so that a tape is never actually
 324           worn out.  For the tape drives known to Bacula, similar
 325           information is interesting to determine the device status and
 326           expected life time: Time it's been Reading and Writing, number
 327           of tape Loads / Unloads / Errors.  This information is not yet
 328           recorded as far as I [Arno] know.  A new volume status would
 329           be necessary for the new state, like "Used up" or "Worn out".
 330           Volumes with this state could be used for restores, but not
 331           for writing. These volumes should be migrated first (assuming
 332           migration is implemented) and, once they are no longer needed,
 333           could be moved to a Trash pool.
 334
 335           The next step would be to implement a drive cleaning setup.
 336           Bacula already has knowledge about cleaning tapes.  Once it
 337           has some information about cleaning cycles (measured in drive
 338           run time, number of tapes used, or calendar days, for example)
 339           it can automatically execute tape cleaning (with an
 340           autochanger, obviously) or ask for operator assistance loading
 341           a cleaning tape.
 342
 343           The final step would be to implement TAPEALERT checks not only
 344           when changing tapes and only sending the information to the
 345           administrator, but rather checking after each tape error,
 346           checking on a regular basis (for example after each tape
 347           file), and also before unloading and after loading a new tape.
 348           Then, depending on the drive's TAPEALERT state and the known
 349           drive cleaning state Bacula could automatically schedule later
 350           cleaning, clean immediately, or inform the operator.
 351
 352           Implementing this would perhaps require another catalog change
 353           and perhaps major changes in SD code and the DIR-SD protocol,
 354           so I'd only consider this worth implementing if it would
 355           actually be used or even needed by many people.
 356
 357           Implementation of these projects could happen in three distinct
 358           sub-projects: Measuring Tape and Drive usage, retiring
 359           volumes, and handling drive cleaning and TAPEALERTs.
 360
 361 Item 11:  Allow skipping execution of Jobs
 362   Date:   29 November 2005
 363   Origin: Florian Schnabel <florian.schnabel at docufy dot de>
 364   Status:
 365
 366   What:   An easy option to skip a certain job on a certain date.
 367   Why:    You could then easily skip tape backups on holidays.  This
 368           would be especially handy if you have no autochanger and can
 369           only fit one backup on a tape.  Other jobs could proceed
 370           normally, and you won't get errors that way.
 371
 372 Item 12:  Add a scheduling syntax that permits weekly rotations
 373    Date:  15 December 2006
 374   Origin: Gregory Brauer (greg at wildbrain dot com)
 375   Status:
 376
 377    What:  Currently, Bacula only understands how to deal with weeks of the
 378           month or weeks of the year in schedules.  This makes it impossible
 379           to do a true weekly rotation of tapes.  There will always be a
 380           discontinuity that will require disruptive manual intervention at
 381           least monthly or yearly because week boundaries never align with
 382           month or year boundaries.
 383
 384           A solution would be to add a new syntax that defines (at least)
 385           a start timestamp, and repetition period.
 386
 387    Why:   Rotated backups done at weekly intervals are useful, and Bacula
 388           cannot currently do them without extensive hacking.
 389
 390    Notes: Here is an example syntax showing a 3-week rotation where full
 391           backups would be performed every week on Saturday, and an
 392           incremental would be performed every week on Tuesday.  Each
 393           set of tapes could be removed from the loader for the following
 394           two cycles before coming back and being reused on the third
 395           week.  Since the execution times are determined by intervals
 396           from a given point in time, there will never be any issues with
 397           having to adjust to any sort of arbitrary time boundary.  In
 398           the example provided, I even define the starting schedule
 399           as crossing both a year and a month boundary, but the run times
 400           would be based on the "Repeat" value and would therefore happen
 401           weekly as desired.
 402
 403
 404           Schedule {
 405               Name = "Week 1 Rotation"
 406               #Saturday.  Would run Dec 30, Jan 20, Feb 10, etc.
 407               Run {
 408                   Options {
 409                       Type   = Full
 410                       Start  = 2006-12-30 01:00
 411                       Repeat = 3w
 412                   }
 413               }
 414               #Tuesday.  Would run Jan 2, Jan 23, Feb 13, etc.
 415               Run {
 416                   Options {
 417                       Type   = Incremental
 418                       Start  = 2007-01-02 01:00
 419                       Repeat = 3w
 420                   }
 421               }
 422           }
 423
 424           Schedule {
 425               Name = "Week 2 Rotation"
 426               #Saturday.  Would run Jan 6, Jan 27, Feb 17, etc.
 427               Run {
 428                   Options {
 429                       Type   = Full
 430                       Start  = 2007-01-06 01:00
 431                       Repeat = 3w
 432                   }
 433               }
 434               #Tuesday.  Would run Jan 9, Jan 30, Feb 20, etc.
 435               Run {
 436                   Options {
 437                       Type   = Incremental
 438                       Start  = 2007-01-09 01:00
 439                       Repeat = 3w
 440                   }
 441               }
 442           }
 443
 444           Schedule {
 445               Name = "Week 3 Rotation"
 446               #Saturday.  Would run Jan 13, Feb 3, Feb 24, etc.
 447               Run {
 448                   Options {
 449                       Type   = Full
 450                       Start  = 2007-01-13 01:00
 451                       Repeat = 3w
 452                   }
 453               }
 454               #Tuesday.  Would run Jan 16, Feb 6, Feb 27, etc.
 455               Run {
 456                   Options {
 457                       Type   = Incremental
 458                       Start  = 2007-01-16 01:00
 459                       Repeat = 3w
 460                   }
 461               }
 462           }
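
               The run dates quoted in the comments of the example follow
               directly from the Start and Repeat values.  A small sketch
               of that arithmetic, in which the function name is
               hypothetical and "3w" is taken to mean exactly 21 days:

```python
from datetime import datetime, timedelta


def run_times(start, repeat_weeks, count):
    """Return the first `count` run times for a Start/Repeat pair."""
    step = timedelta(weeks=repeat_weeks)
    return [start + i * step for i in range(count)]


# "Week 1 Rotation", Full: Start = 2006-12-30 01:00, Repeat = 3w
saturdays = run_times(datetime(2006, 12, 30, 1, 0), 3, 3)
# -> 2006-12-30, 2007-01-20, 2007-02-10 (all Saturdays)

# "Week 1 Rotation", Incremental: Start = 2007-01-02 01:00, Repeat = 3w
tuesdays = run_times(datetime(2007, 1, 2, 1, 0), 3, 3)
# -> 2007-01-02, 2007-01-23, 2007-02-13 (all Tuesdays)
```

               Because every run is a fixed offset from Start, month and
               year boundaries never enter the calculation.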
 463
 464 Item 13:  Archival (removal) of User Files to Tape
 465   Date:   24 November 2005
 466   Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
 467   Status:
 468
 469   What:   The ability to archive data to storage based on certain parameters
 470           such as age, size, or location.  Once the data has been written to
 471           storage and logged it is then pruned from the originating
 472           filesystem.  Note! We are talking about users' files and not
 473           Bacula Volumes.
 474
 475   Why:    This would allow fully automatic storage management which becomes
 476           useful for large datastores.  It would also allow for auto-staging
 477           from one media type to another.
 478
 479           Example 1) Medical imaging needs to store large amounts of data.
 480           They decide to keep data on their servers for 6 months and then put
 481           it away for long term storage.  The server then finds all files
 482           older than 6 months and writes them to tape.  The files are then
 483           removed from the server.
 484
 485           Example 2) All data that hasn't been accessed in 2 months could be
 486           moved from high-cost, fibre-channel disk storage to a low-cost,
 487           large-capacity SATA disk storage pool with slower access times.
 488           Then after another 6 months (or possibly as one storage pool gets
 489           full) data is migrated to Tape.
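
               The selection step in Example 1 can be sketched as below.
               It assumes that age is judged by modification time and only
               lists candidates; writing them to storage and removing them
               afterwards is left out.  The function name is hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 86400


def archive_candidates(root, max_age_days, now=None):
    """Return sorted paths under `root` not modified in `max_age_days` days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: "older than" means last modified before the cutoff.
            if os.stat(path).st_mtime < cutoff:
                found.append(path)
    return sorted(found)
```

               Example 2 would use the access time (st_atime) instead,
               where the filesystem records it reliably.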
 490
 491 Item 14:  Cause daemons to use a specific IP address to source communications
 492  Origin:  Bill Moran <wmoran@collaborativefusion.com>
 493  Date:    18 Dec 2006
 494  Status:
 495  What:    Cause Bacula daemons (dir, fd, sd) to always use the IP address
 496           specified in the [DIR|FD|SD]Addr directive as the source IP
 497           for initiating communication.
 498  Why:     On complex networks, as well as extremely secure networks, it's
 499           not unusual to have multiple possible routes through the network.
 500           Often, each of these routes is secured by different policies
 501           (effectively, firewalls allow or deny different traffic depending
 502           on the source address).
 503           Unfortunately, it can sometimes be difficult or impossible to
 504           represent this in a system routing table, as the result is
 505           excessive subnetting that quickly exhausts available IP space.
 506           The best available workaround is to provide multiple IPs to
 507           a single machine that are all on the same subnet.  In order
 508           for this to work properly, applications must support the ability
 509           to bind outgoing connections to a specified address, otherwise
 510           the operating system will always choose the first IP that
 511           matches the required route.
 512  Notes:   Many other programs support this.  For example, the following
 513           can be configured in BIND:
 514           query-source address 10.0.0.1;
 515           transfer-source 10.0.0.2;
 516           Which means queries from this server will always come from
 517           10.0.0.1 and zone transfers will always originate from
 518           10.0.0.2.
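
               At the socket level, binding an outgoing connection to a
               chosen source address means calling bind() before
               connect().  A minimal sketch using the standard Python
               socket module; the function name is hypothetical and this
               is not Bacula code:

```python
import socket


def connect_from(source_ip, dest_host, dest_port, timeout=5.0):
    """Open a TCP connection whose local (source) address is `source_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        # Port 0 lets the operating system pick an ephemeral source port;
        # only the source address is pinned.
        sock.bind((source_ip, 0))
        sock.connect((dest_host, dest_port))
    except OSError:
        sock.close()
        raise
    return sock
```

               Without the bind() call the operating system picks the
               source address from the routing table, which is the
               behaviour described under Why.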
 519
 520 Item 15:  Multiple threads in file daemon for the same job
 521   Date:   27 November 2005
 522   Origin: Ove Risberg (Ove.Risberg at octocode dot com)
 523   Status:
 524
 525   What:   I want the file daemon to start multiple threads for a backup
 526           job so the fastest possible backup can be made.
 527
 528           The file daemon could parse the FileSet information and start
 529           one thread for each File entry located on a separate
 530           filesystem.
 531
 532           A configuration option in the job section should be used to
 533           enable or disable this feature.  The configuration option could
 534           specify the maximum number of threads in the file daemon.
 535
 536           If the threads could spool the data to separate spool files,
 537           the restore process would not be much slower.
 538
 539   Why:    Multiple concurrent backups of a large fileserver with many
 540           disks and controllers will be much faster.
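
               The idea of one thread per filesystem, capped by a
               configured maximum, can be sketched as follows.  Grouping
               by the st_dev device number is an assumption about how
               "separate filesystem" would be detected, and save_group
               stands in for whatever actually backs up a group of File
               entries; all names are hypothetical.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def device_of_path(path):
    """Identify the filesystem holding `path` by its device number."""
    return os.stat(path).st_dev


def group_by_filesystem(paths, device_of=device_of_path):
    """Group File entries so that each group lives on one filesystem."""
    groups = defaultdict(list)
    for path in paths:
        groups[device_of(path)].append(path)
    return list(groups.values())


def backup_in_parallel(paths, save_group, max_threads=4,
                       device_of=device_of_path):
    """Run `save_group` on each filesystem group, at most `max_threads` at once."""
    groups = group_by_filesystem(paths, device_of)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in the same order as `groups`.
        return list(pool.map(save_group, groups))
```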
 541
 542 Item 16:  Add Plug-ins to the FileSet Include statements.
 543   Date:   28 October 2005
 544   Origin:
 545   Status: Partially coded in 1.37 -- much more to do.
 546
 547   What:   Allow users to specify wild-card and/or regular
 548           expressions to be matched in both the Include and
 549           Exclude directives in a FileSet.  At the same time,
 550           allow users to define plug-ins to be called (based on
 551           regular expression/wild-card matching).
 552
 553   Why:    This would give the users the ultimate ability to control
 554           how files are backed up/restored.  A user could write a
 555           plug-in that knows how to back up his Oracle database without
 556           stopping/starting it, for example.
 557
 558 Item 17:  Restore only file attributes (permissions, ACL, owner, group...)
 559   Origin: Eric Bollengier
 560   Date:   30 December 2006
 561   Status:
 562
 563   What:   The goal of this project is to be able to restore only the rights
 564           and attributes of files without overwriting their contents.
 565
 566   Why:    Who has never had to repair a chmod -R 777, or a runaway
 567           recursive permissions update under Windows?  At this time, you
 568           must have enough space to restore the data, dump the attributes
 569           (easy with ACLs, more complex with Unix/Windows rights) and apply
 570           them to your broken tree.  With this option, it would be very easy
 571           to compare rights or ACLs over time.
 572
 573   Notes:  If the file exists, we skip the restore and change its rights.
 574           If the file doesn't exist, we can create an empty one and apply
 575           the rights, or do nothing.
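
               The behaviour in the Notes can be sketched for Unix
               permission bits only (owner, group and ACLs are left out).
               The function name and the create_missing switch are
               hypothetical; saved_mode stands for the mode recorded at
               backup time.

```python
import os
import stat


def restore_attributes(path, saved_mode, create_missing=False):
    """Apply saved permission bits to `path` without touching its data.

    Returns True if the attributes were applied, False if the file was
    missing and nothing was done.
    """
    if not os.path.exists(path):
        if not create_missing:
            return False
        # Create an empty placeholder so the rights can be applied.
        open(path, "wb").close()
    # S_IMODE keeps only the permission bits of the saved st_mode value.
    os.chmod(path, stat.S_IMODE(saved_mode))
    return True
```
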
 576 Item 18:  Quick release of FD-SD connection after backup.
 577   Origin: Frank Volf (frank at deze dot org)
 578   Date:   17 November 2005
 579   Status: Done -- implemented by Kern -- in CVS 26Jan07
 580
 581    What:  In the Bacula implementation a backup is finished after all data
 582           and attributes are successfully written to storage.  When using a
 583           tape backup it is very annoying that a backup can take a day,
 584           simply because the current tape (or whatever) is full and the
 585           administrator has not put a new one in.  During that time the
 586           system cannot be taken off-line, because there is still an open
 587           session between the storage daemon and the file daemon on the
 588           client.
 589
 590           Although this is a very good strategy for making "safe backups",
 591           it can be annoying for, e.g., laptops that must remain
 592           connected until the backup is completed.
 593
 594           Using a new feature called "migration" it will be possible to
 595           spool first to harddisk (using a special 'spool' migration
 596           scheme) and then migrate the backup to tape.
 597
 598           There is still the problem of getting the attributes committed.
 599           If it takes a very long time to do, with the current code, the
 600           job has not terminated, and the File daemon is not freed up.  The
 601           Storage daemon should release the File daemon as soon as all the
 602           file data and all the attributes have been sent to it (the SD).
 603           Currently the SD waits until everything is on tape and all the
 604           attributes are transmitted to the Director before signaling
 605           completion to the FD. I don't think I would have any problem
 606           changing this.  The reason is that even if the FD reports back to
 607           the Dir that all is OK, the job will not terminate until the SD
 608           has done the same thing -- so in a way keeping the SD-FD link
 609           open to the very end is not really very productive ...
 610
 611    Why:   Makes backup of laptops much faster.
 612
 613 Item 19:  Implement a Python interface to the Bacula catalog.
 614   Date:   28 October 2005
 615   Origin: Kern
 616   Status:
 617
 618   What:   Implement an interface for Python scripts to access
 619           the catalog through Bacula.
 620
 621   Why:    This will permit users to customize Bacula through
 622           Python scripts.
 623
 624 Item 20:  Archive data
 625   Date:   15/5/2006
 626   Origin: calvin streeting calvin at absentdream dot com
 627   Status:
 628
 629   What:   The ability to archive to media (DVD/CD) in an uncompressed format
 630           for dead filing (archiving, not backing up).
 631
 632     Why:  At my workplace, when jobs are finished they are moved off the
 633           main file servers (RAID-based systems) onto a simple Linux file
 634           server (IDE-based system) so users can find old information
 635           without contacting the IT dept.
 636
 637           So this data doesn't really change; it only gets added to,
 638           but it also needs backing up.  At the moment it takes
 639           about 8 hours to back up our servers (working data), so
 640           rather than add more time to existing backups I am trying
 641           to implement a system where we back up the archive data to
 642           CD/DVD.  These disks would only need to be appended to
 643           (burn only new/changed files to new disks for off-site
 644           storage).  Basically, understand the difference between
 645           archive data and live data.
 646
 647   Notes:  - Scan the data and email me when it needs burning.
 648           - Divide it into predefined chunks.
 649           - Keep a record of what is on which disk.
 650           - Make me a label (simple php->mysql=>pdf stuff; I could do this bit).
 651           - Save data uncompressed so it can be read on any other system
 652             (future-proof data).
 653           - Save the catalog with the disk as some kind of menu system.
 654
 655 Item 21:  Split documentation
 656   Origin: Maxx <maxxatworkat gmail dot com>
 657   Date:   27th July 2006
 658   Status:
 659
 660   What:   Split documentation in several books
 661
 662   Why:    The Bacula manual now has more than 600 pages, and looking for
 663           implementation details is getting complicated.  I think
 664           it would be good to split the single volume into two or
 665           maybe three parts:
 666
 667           1) Introduction, requirements and tutorial, which are
 668              typically useful only until first installation
 669
 670           2) Basic installation and configuration, with all the
 671              gory details about the directives supported
 672
 673           3) Advanced Bacula: testing, troubleshooting, GUI and
 674              ancillary programs, security management, scripting,
 675              etc.
 676
 677 Item 22:  Implement support for stacking arbitrary stream filters, sinks.
 678 Date:     23 November 2006
 679 Origin:   Landon Fuller <landonf@threerings.net>
 680 Status:   Planning. Assigned to landonf.
 681
 682   What:   Implement support for the following:
 683           - Stacking arbitrary stream filters (eg, encryption, compression,
 684             sparse data handling)
 685           - Attaching file sinks to terminate stream filters (ie, write out
 686             the resultant data to a file)
 687           - Refactor the restoration state machine accordingly
 688
 689    Why:   The existing stream implementation suffers from the following:
 690            - All state (compression, encryption, stream restoration) is
 691              global across the entire restore process, for all streams. There are
 692              multiple entry and exit points in the restoration state machine, and
 693              thus multiple places where state must be allocated, deallocated,
 694              initialized, or reinitialized. This results in exceptional complexity
 695              for the author of a stream filter.
 696            - The developer must enumerate all possible combinations of filters
 697              and stream types (ie, win32 data with encryption, without encryption,
 698              with encryption AND compression, etc).
 699
 700   Notes:  This feature request only covers implementing the stream filters/
 701           sinks, and refactoring the file daemon's restoration implementation
 702           accordingly. If I have extra time, I will also rewrite the backup
 703           implementation. My intent in implementing the restoration first is to
 704           solve pressing bugs in the restoration handling, and to ensure that
 705           the new restore implementation handles existing backups correctly.
 706
 707           I do not plan on changing the network or tape data structures to
 708           support defining arbitrary stream filters, but supporting that
 709           functionality is the ultimate goal.
 710
 711           Assistance with either code or testing would be fantastic.
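
  Notes:  A minimal sketch of the stacking idea, assuming a filter is
          simply a function from bytes to bytes and a sink is any callable
          that consumes the final bytes.  All names here (FilterStack,
          xor_filter, restore) are hypothetical and are not taken from the
          Bacula source; the XOR step is only a toy stand-in for
          decryption.  The point is that each stack carries its own
          filters instead of relying on restore-wide global state.

```python
import zlib
from typing import Callable, List

# A filter maps one chunk of bytes to another (hypothetical type alias).
Filter = Callable[[bytes], bytes]


def xor_filter(key: int) -> Filter:
    """Toy stand-in for a decryption filter (NOT real cryptography)."""
    return lambda chunk: bytes(b ^ key for b in chunk)


class FilterStack:
    """Applies filters in order, then hands the result to a sink."""

    def __init__(self, filters: List[Filter], sink: Callable[[bytes], None]):
        self.filters = filters
        self.sink = sink

    def write(self, chunk: bytes) -> None:
        for f in self.filters:
            chunk = f(chunk)
        self.sink(chunk)


def restore(stored: bytes) -> bytes:
    """Undo the toy 'encryption', then decompress, into an in-memory sink."""
    out = bytearray()  # stands in for a file sink
    stack = FilterStack([xor_filter(0x5A), zlib.decompress], out.extend)
    stack.write(stored)
    return bytes(out)
```

          Adding or removing a filter changes only the list passed to
          FilterStack, so combinations such as "encrypted and compressed"
          need no dedicated code path.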
 712
 713 Item 23:  Implement from-client and to-client on restore command line.
 714    Date:  11 December 2006
 715   Origin: Discussion on Bacula-users entitled 'Scripted restores to
 716           different clients', December 2006
 717   Status: New feature request
 718
 719   What:   While using bconsole interactively, you can specify the client
 720           that a backup job is to be restored for, and then you can
 721           specify later a different client to send the restored files
 722           back to. However, using the 'restore' command with all options
 723           on the command line, this cannot be done, due to the ambiguous
 724           'client' parameter. Additionally, this parameter means different
 725           things depending on whether it's specified on the command line or
 726           afterwards, in the Modify Job screens.
 727
 728      Why: This feature would enable restore jobs to be more completely
 729           automated, for example by a web or GUI front-end.
 730
 731    Notes: client can also be implied by specifying the jobid on the command
 732           line
 733
 734 Item 24:  Add an override in Schedule for Pools based on backup types.
 735 Date:     19 Jan 2005
 736 Origin:   Chad Slater <chad.slater@clickfox.com>
 737 Status:
 738
 739   What:   Adding a FullStorage=BigTapeLibrary in the Schedule resource
 740           would help those of us who use different storage devices for different
 741           backup levels cope with the "auto-upgrade" of a backup.
 742
 743   Why:    Assume I add several new devices to be backed up, i.e. several
 744           hosts with 1TB RAID.  To avoid tape switching hassles, incrementals are
 745           stored in a disk set on a 2TB RAID.  If you add these devices in the
 746           middle of the month, the incrementals are upgraded to "full" backups,
 747           but they try to use the same storage device as requested in the
 748           incremental job, filling up the RAID holding the differentials.  If we
 749           could override the Storage parameter for full and/or differential
 750           backups, then the Full job would use the proper Storage device, which
 751           has more capacity (i.e. an 8TB tape library).
 752
 753 Item 25:  Implement huge exclude list support using hashing.
 754   Date:   28 October 2005
 755   Origin: Kern
 756   Status:
 757
 758   What:   Allow users to specify very large exclude lists (currently
 759           more than about 1000 files is too many).
 760
 761   Why:    This would give the users the ability to exclude all
 762           files that are loaded with the OS (e.g. using rpms
 763           or debs). If the user can restore the base OS from
 764           CDs, there is no need to backup all those files. A
 765           complete restore would be to restore the base OS, then
 766           do a Bacula restore. By excluding the base OS files, the
 767           backup set will be *much* smaller.
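
  Notes:  A minimal sketch of the hashing idea, assuming the exclude
          list is a plain sequence of absolute paths (for example, one
          generated from the package database).  The function names are
          hypothetical.  A hash set gives roughly constant-time lookups,
          so the cost per candidate file does not grow with the size of
          the exclude list.

```python
from typing import Iterable, List, Set


def load_exclude_set(paths: Iterable[str]) -> Set[str]:
    """Build a hash set of excluded paths, skipping blank entries."""
    return {p.strip() for p in paths if p.strip()}


def files_to_back_up(candidates: Iterable[str], excluded: Set[str]) -> List[str]:
    """Keep only the candidates that are not in the exclude set."""
    return [path for path in candidates if path not in excluded]
```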
 768
 769 Item 26:  Implement more Python events in Bacula.
 770   Date:   28 October 2005
 771   Origin: Kern
 772   Status:
 773
 774   What:   Allow Python scripts to be called at more places
 775           within Bacula and provide additional access to Bacula
 776           internal variables.
 777
 778   Why:    This will permit users to customize Bacula through
 779           Python scripts.
 780
 781   Notes:  Recycle event
 782           Scratch pool event
 783           NeedVolume event
 784           MediaFull event
 785
 786           Also add a way to get a listing of currently running
 787           jobs (possibly also scheduled jobs).
 788
 789
 790 Item 27:  Incorporation of XACML2/SAML2 parsing
 791    Date:   19 January 2006
 792    Origin: Adam Thornton <athornton@sinenomine.net>
 793    Status: Blue sky
 794
 795    What:   XACML is the "eXtensible Access Control Markup Language" and
 796           SAML is the "Security Assertion Markup Language"--an XML standard
 797           for making statements about identity and authorization.  Having these
 798           would give us a framework to approach ACLs in a generic manner, and
 799           in a way flexible enough to support the four major sorts of ACLs I
 800           see as a concern to Bacula at this point, as well as (probably) to
 801           deal with new sorts of ACLs that may appear in the future.
 802
 803    Why:    Bacula is beginning to need to back up systems with ACLs
 804           that do not map cleanly onto traditional Unix permissions.  I see
 805           four sets of ACLs--in general, mutually incompatible with one
 806           another--that we're going to need to deal with.  These are: NTFS
 807           ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs.  (Some may question the
 808           relevance of AFS; AFS is one of Sine Nomine's core consulting
 809           businesses, and having a reputable file-level backup and restore
 810           technology for it (as Tivoli is probably going to drop AFS support
 811           soon since IBM no longer supports AFS) would be of huge benefit to
 812           our customers; we'd most likely create the AFS support at Sine Nomine
 813           for inclusion into the Bacula core code (and perhaps some changes
 814           to the OpenAFS volserver).)
 815
 816           Now, obviously, Bacula already handles NTFS just fine.  However, I
 817           think there's a lot of value in implementing a generic ACL model, so
 818           that it's easy to support whatever particular instances of ACLs come
 819           down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
 820           things arriving in the Linux world in a big way in the near future.
 821           XACML, although overcomplicated for our needs, provides this
 822           framework, and we should be able to leverage other people's
 823           implementations to minimize the amount of work *we* have to do to get
 824           a generic ACL framework.  Basically, the costs of implementation are
 825           high, but they're largely both external to Bacula and already sunk.
 826
 827 Item 28:  Filesystem watch triggered backup.
 828   Date:   31 August 2006
 829   Origin: Jesper Krogh <jesper@krogh.cc>
 830   Status: Unimplemented, probably depends on "client initiated backups"
 831
 832   What:   With inotify and similar filesystem-triggered notification
 833           systems, it is possible to have the file daemon monitor
 834           filesystem changes and initiate a backup.
 835
 836   Why:    There are 2 situations where this is nice to have.
 837           1) It is possible to get a much finer-grained backup than
 838              with the fixed schedules used now.  A file created and
 839              deleted a few hours later can automatically be caught.
 840
 841           2) The introduced load on the system will probably be
 842              distributed more evenly on the system.
 843
 844   Notes:  This can be combined with configuration that specifies
 845           something like: "at most every 15 minutes or when changes
 846           consumed XX MB".
 847
 848 Kern Notes: I would rather see this implemented by an external program
 849           that monitors the Filesystem changes, then uses the console
 850           to start the appropriate job.
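
  Notes:  A minimal sketch of one possible reading of the "at most
          every 15 minutes or when changes consumed XX MB" rule: start a
          backup once changes exist and either the interval has elapsed
          or the volume of changed data has reached a limit.  The
          function name, parameters and default values are assumptions
          made for illustration; an external monitor could evaluate such
          a rule before starting a job through the console.

```python
def should_trigger(now: float, last_backup: float, changed_bytes: int,
                   interval: float = 15 * 60,
                   max_changed_bytes: int = 100 * 1024 * 1024) -> bool:
    """Decide whether accumulated changes justify starting a backup.

    Times are epoch seconds.  Nothing is triggered while no changes
    have been observed.
    """
    if changed_bytes <= 0:
        return False
    interval_elapsed = (now - last_backup) >= interval
    too_much_changed = changed_bytes >= max_changed_bytes
    return interval_elapsed or too_much_changed
```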
 851
 852 Item 29:  Allow inclusion/exclusion of files in a fileset by creation/mod times
 853   Origin: Evan Kaufman <evan.kaufman@gmail.com>
 854   Date:   January 11, 2006
 855   Status:
 856
 857   What:   In the vein of the Wild and Regex directives in a Fileset's
 858           Options, it would be helpful to allow a user to include or exclude
 859           files and directories by creation or modification times.
 860
 861           You could factor the Exclude=yes|no option in much the same way it
 862           affects the Wild and Regex directives.  For example, you could exclude
 863           all files modified before a certain date:
 864
 865    Options {
 866      Exclude = yes
 867      Modified Before = ####
 868    }
 869
 870            Or you could exclude all files created/modified since a certain date:
 871
 872    Options {
 873      Exclude = yes
 874      Created Modified Since = ####
 875    }
 876
 877            The format of the time/date could be done several ways, say the number
 878            of seconds since the epoch:
 879            1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`
 880
 881            Or a human readable date in a cryptic form:
 882            20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss
 883
 884   Why:    I imagine a feature like this could have many uses. It would
 885           allow a user to do a full backup while excluding the base operating
 886           system files, so if I installed a Linux snapshot from a CD yesterday,
 887           I'll *exclude* all files modified *before* today.  If I need to
 888           recover the system, I use the CD I already have, plus the tape backup.
 889           Or, say, a Windows client is hit by a particularly corrosive
 890           virus, and I need to *exclude* any files created/modified *since* the
 891           time of infection.
 892
 893   Notes:  Of course, this feature would work in concert with other
 894           in/exclude rules, and wouldn't override them (or each other).
 895
 896   Notes:  The directives I'd imagine would be along the lines of
 897           "[Created] [Modified] [Before|Since] = <date>".
 898           So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
 899            or 'since'.
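
  Notes:  A minimal sketch showing that the two date formats above
          describe the same instant, and how a "Modified Before" test
          could be evaluated.  It assumes the example time is local time
          in a UTC-6 zone, which is what makes 1137008553 come out as
          1:42:33PM; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the example timestamp in the text is local time at UTC-6.
LOCAL = timezone(timedelta(hours=-6))


def from_epoch(seconds: int) -> datetime:
    """Interpret seconds since the epoch (the `date +%s` form)."""
    return datetime.fromtimestamp(seconds, tz=LOCAL)


def from_compact(stamp: str) -> datetime:
    """Interpret the YYYYMMDDhhmmss form as local time."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=LOCAL)


def excluded_by_modified_before(mtime: int, cutoff: datetime) -> bool:
    """True if a file with this mtime matches 'Modified Before = cutoff'."""
    return from_epoch(mtime) < cutoff
```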
 900
 901
 902 Item 30:  Tray monitor window cleanups
 903   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 904   Date:   24 July 2006
 905   Status:
 906   What:   Resizeable and scrollable windows in the tray monitor.
 907
 908   Why:    With multiple clients, or with many jobs running, the displayed
 909           window often ends up larger than the available screen, making
 910           the trailing items difficult to read.
 911
 912
 913 Item 31:  Implement multiple numeric backup levels as supported by dump
 914 Date:     3 April 2006
 915 Origin:   Daniel Rich <drich@employees.org>
 916 Status:
 917 What:     Dump allows specification of backup levels numerically instead of just
 918           "full", "incr", and "diff".  In this system, at any given level, all
 919           files are backed up that were modified since the last backup of a
 920           higher level (with 0 being the highest and 9 being the lowest).  A
 921           level 0 is therefore equivalent to a full, level 9 an incremental, and
 922           the levels 1 through 8 are varying levels of differentials.  For
 923           bacula's sake, these could be represented as "full", "incr", and
 924           "diff1", "diff2", etc.
 925
 926 Why:      Support of multiple backup levels would provide for more advanced backup
 927           rotation schemes such as "Towers of Hanoi".  This would allow better
 928           flexibility in performing backups, and can lead to shorter recovery
 929           times.
 930
 931 Notes:    Legato Networker supports a similar system with full, incr, and 1-9 as
 932           levels.
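
Notes:    A minimal sketch of the level rule described above, assuming the
          job history is a list of (level, completion time) pairs.  A backup
          at level N is taken relative to the most recent backup with a
          numerically lower level (0 being a full).  The names used here
          are hypothetical.

```python
from typing import List, Optional, Tuple

# (level, completion time in epoch seconds); 0 = full, 9 = incremental.
Backup = Tuple[int, int]


def reference_backup(history: List[Backup], level: int) -> Optional[Backup]:
    """Most recent earlier backup with a numerically lower level.

    Returns None for level 0 (a full backup needs no reference) or when
    no suitable earlier backup exists.
    """
    candidates = [b for b in history if b[0] < level]
    return max(candidates, key=lambda b: b[1]) if candidates else None
```

          Files modified after the returned backup's completion time are
          the ones the new backup would need to include.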
 933
 934 Item 32:  Automatic promotion of backup levels
 935    Date:  19 January 2006
 936   Origin: Adam Thornton <athornton@sinenomine.net>
 937   Status:
 938
 939     What: Amanda has a feature whereby it estimates the space that a
 940           differential, incremental, and full backup would take.  If the
 941           difference in space required between the scheduled level and the next
 942           level up is beneath some user-defined critical threshold, the backup
 943           level is bumped to the next type.  Doing this minimizes the number of
 944           volumes necessary during a restore, with a fairly minimal cost in
 945           backup media space.
 946
 947     Why:  I know at least one (quite sophisticated and smart) user
 948           for whom the absence of this feature is a deal-breaker in terms of
 949           using Bacula; if we had it it would eliminate the one cool thing
 950           Amanda can do and we can't (at least, the one cool thing I know of).
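
  Notes:  A minimal sketch of the promotion rule described above,
          assuming size estimates (in bytes) are already available for
          each level.  How Amanda actually computes or applies its
          estimates is not shown here; the names and the single-step
          promotion are assumptions for illustration.

```python
from typing import Dict, List

# Levels ordered from smallest to largest scope.
ORDER: List[str] = ["incremental", "differential", "full"]


def promote(scheduled: str, estimates: Dict[str, int], threshold: int) -> str:
    """Bump to the next level up if it costs less than `threshold` more bytes."""
    idx = ORDER.index(scheduled)
    if idx + 1 < len(ORDER):
        next_level = ORDER[idx + 1]
        if estimates[next_level] - estimates[scheduled] < threshold:
            return next_level
    return scheduled
```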
 951
 952 Item 33:  Clustered file-daemons
 953   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 954   Date:   24 July 2006
 955   Status:
 956   What:   A "virtual" filedaemon, which is actually a cluster of real ones.
 957
 958   Why:    In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
 959           multiple machines may have access to the same set of filesystems.
 960
 961           For performance reasons, one may wish to initiate backups from
 962           several of these machines simultaneously, instead of just using
 963           one backup source for the common clustered filesystem.
 964
 965           For obvious reasons, normally backups of $A-FD/$PATH and
 966           $B-FD/$PATH are treated as different backup sets. In this case
 967           they are the same communal set.
 968
 969           Likewise when restoring, it would be easier to just specify
 970           one of the cluster machines and let bacula decide which to use.
 971
 972           This can be faked to some extent using DNS round robin entries
 973           and a virtual IP address, however it means "status client" will
 974           always give bogus answers. Additionally there is no way of
 975           spreading the load evenly among the servers.
 976
 977           What is required is something similar to the storage daemon
 978           autochanger directives, so that Bacula can keep track of
 979           operating backups/restores and direct new jobs to a "free"
 980           client.
 981
 982    Notes:
 983
 984 Item 34:  Commercial database support
 985   Origin: Russell Howe <russell_howe dot wreckage dot org>
 986   Date:   26 July 2006
 987   Status:
 988
 989   What:   It would be nice for the database backend to support more
 990           databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
 991           DB2, MaxDB, etc are all candidates. SQL Server would presumably be
 992           implemented using FreeTDS or maybe an ODBC library?
 993
 994   Why:    We only really have one database server, which is MS SQL Server
 995           2000. We'd rather not run a second one for the backup software (we grew
 996           out of SQLite, which I liked, but which didn't work so well with our
 997           database size). We don't really have a machine with the resources to run
 998           postgres, and would rather only maintain a single DBMS. We're stuck with
 999           SQL Server because pretty much all the company's custom applications
1000           (written by consultants) are locked into SQL Server 2000. I can imagine
1001           this scenario is fairly common, and it would be nice to use the existing
1002           properly specced database server for storing Bacula's catalog, rather
1003           than having to run a second DBMS.
1004
1005 Item 35:  Automatic disabling of devices
1006    Date:  2005-11-11
1007   Origin: Peter Eriksson <peter at ifm.liu dot se>
1008   Status:
1009
1010    What:  After a configurable number of fatal errors with a tape drive,
1011           Bacula should automatically disable further use of that
1012           tape drive. There should also be "disable"/"enable" commands in
1013           the "bconsole" tool.
1014
1015    Why:   On a multi-drive jukebox there is a possibility of tape drives
1016           going bad during large backups (needing a cleaning tape run,
1017           tapes getting stuck). It would be advantageous if Bacula would
1018           automatically disable further use of a problematic tape drive
1019           after a configurable number of errors has occurred.
1020
1021           An example: I have a multi-drive jukebox (6 drives, 380+ slots)
1022           where tapes occasionally get stuck inside the drive. Bacula will
1023           notice that the "mtx-changer" command will fail and then fail
1024           any backup jobs trying to use that drive. However, it will still
1025           keep on trying to run new jobs using that drive and fail -
1026           forever, and thus failing lots and lots of jobs... Since we have
1027           many drives Bacula could have just automatically disabled
1028           further use of that drive and used one of the other ones
1029           instead.
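
   Notes: A minimal sketch of the requested behaviour, assuming fatal
          errors are reported per drive name.  The class and method
          names are hypothetical and do not correspond to existing
          Bacula code or console commands.

```python
from typing import Dict, Iterable, Optional, Set


class DriveMonitor:
    """Counts fatal errors per drive and disables a drive at a threshold."""

    def __init__(self, max_errors: int) -> None:
        self.max_errors = max_errors  # the configurable number of errors
        self.errors: Dict[str, int] = {}
        self.disabled: Set[str] = set()

    def record_fatal_error(self, drive: str) -> None:
        self.errors[drive] = self.errors.get(drive, 0) + 1
        if self.errors[drive] >= self.max_errors:
            self.disabled.add(drive)

    def disable(self, drive: str) -> None:
        """Manual counterpart of a console 'disable' command."""
        self.disabled.add(drive)

    def enable(self, drive: str) -> None:
        """Manual counterpart of a console 'enable' command; resets the count."""
        self.disabled.discard(drive)
        self.errors[drive] = 0

    def pick_drive(self, drives: Iterable[str]) -> Optional[str]:
        """Return the first drive that is still enabled, or None."""
        return next((d for d in drives if d not in self.disabled), None)
```

          With max_errors set to 2, a drive that fails twice is skipped
          by pick_drive until it is enabled again.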
1030
1031 Item 36:  An option to operate on all pools with update vol parameters
1032   Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
1033    Date:  16 August 2006
1034   Status:
1035
1036    What:  When I do update -> Volume parameters -> All Volumes
1037           from Pool, then I have to select pools one by one.  I'd like
1038           console to have an option like "0: All Pools" in the list of
1039           defined pools.
1040
1041    Why:   I have many pools and am therefore unhappy with manually
1042           updating each of them using update -> Volume parameters -> All
1043           Volumes from Pool -> pool #.
1044
1045 Item 37:  Add an item to the restore option where you can select a pool
1046   Origin: kshatriyak at gmail dot com
1047     Date: 1/1/2006
1048   Status:
1049
1050     What: In the restore option (Select the most recent backup for a
1051           client) it would be useful to add an option where you can limit
1052           the selection to a certain pool.
1053
1054      Why: When using cloned jobs, most of the time you have 2 pools - a
1055           disk pool and a tape pool.  People who have 2 pools would like to
1056           select the most recent backup from disk, not from tape (tape
1057           would be only needed in emergency).  However, the most recent
1058           backup (which may just differ a second from the disk backup) may
1059           be on tape and would be selected.  The problem becomes bigger if
1060           you have a full and differential - the most "recent" full backup
1061           may be on disk, while the most recent differential may be on tape
1062           (though the differential on disk may differ even only a second or
1063           so).  Bacula will then complain that the backups reside on
1064           different media.  For now the only solution when restoring things
1065           when you have 2 pools is to manually search for the right
1066           job-id's and enter them by hand, which is a bit error-prone.
1067
1068 Item 38:  Include timestamp of job launch in "stat clients" output
1069   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1070   Date:   Tue Aug 22 17:13:39 EDT 2006
1071   Status:
1072
1073   What:   The "stat clients" command doesn't include any detail on when
1074           the active backup jobs were launched.
1075
1076   Why:    Including the timestamp would make it much easier to decide whether
1077           a job is running properly.
1078
1079   Notes:  It may be helpful to have the output from "stat clients" formatted
1080           more like that from "stat dir" (and other commands), in a column
1081           format. The per-client information that's currently shown (level,
1082           client name, JobId, Volume, pool, device, Files, etc.) is good, but
1083           somewhat hard to parse (both programmatically and visually),
1084           particularly when there are many active clients.
1085
1086
1087 Item 39:  Message mailing based on backup types
1088  Origin:  Evan Kaufman <evan.kaufman@gmail.com>
1089    Date:  January 6, 2006
1090  Status:
1091
1092    What:  In the "Messages" resource definitions, allowing messages
1093           to be mailed based on the type (backup, restore, etc.) and level
1094           (full, differential, etc) of job that created the originating
1095           message(s).
1096
1097  Why:     It would, for example, allow someone's boss to be emailed
1098           automatically only when a Full Backup job runs, so he can
1099           retrieve the tapes for offsite storage, even if the IT dept.
1100           doesn't (or can't) explicitly notify him.  At the same time, his
1101           mailbox wouldn't be filled by notifications of Verifies, Restores,
1102           or Incremental/Differential Backups (which would likely be kept
1103           onsite).
1104
1105  Notes:   One way this could be done is through additional message types, for example:
1106
1107    Messages {
1108      # email the boss only on full system backups
1109      Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
1110             !verify, !admin
1111      # email us only when something breaks
1112      MailOnError = itdept@mycompany.com = all
1113    }
1114
1115
1116 Item 40:  Include JobID in spool file name ****DONE****
1117   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1118   Date:   Tue Aug 22 17:13:39 EDT 2006
1119   Status: Done. (patches/testing/project-include-jobid-in-spool-name.patch)
1120           No need to vote for this item.
1121
1122   What:   Change the name of the spool file to include the JobID
1123
1124   Why:    JobIDs are the common key used to refer to jobs, yet the
1125           spoolfile name doesn't include that information. The date/time
1126           stamp is useful (and should be retained).
1127
1128 ============= Empty Feature Request form ===========
1129 Item  n:  One line summary ...
1130   Date:   Date submitted
1131   Origin: Name and email of originator.
1132   Status:
1133
1134   What:   More detailed explanation ...
1135
1136   Why:    Why it is important ...
1137
1138   Notes:  Additional notes or features (omit if not used)
1139 ============== End Feature Request form ==============