
   1
   2 Projects:
   3                      Bacula Projects Roadmap
   4                     Status updated 26 January 2007
   5                    After re-ordering in vote priority
   6
   7 Items Completed:
   8 Item:  18   Quick release of FD-SD connection after backup.
   9 Item:  40   Include JobID in spool file name
  10 Item:  25   Implement huge exclude list support using dlist
  11
  12 Summary:
  13 Item:   1   Accurate restoration of renamed/deleted files
  14 Item:   2   Implement a Bacula GUI/management tool.
  15 Item:   3   Allow FD to initiate a backup
  16 Item:   4   Merge multiple backups (Synthetic Backup or Consolidation).
  17 Item:   5   Deletion of Disk-Based Bacula Volumes
  18 Item:   6   Implement Base jobs.
  19 Item:   7   Implement creation and maintenance of copy pools
  20 Item:   8   Directive/mode to backup only file changes, not entire file
  21 Item:   9   Implement a server-side compression feature
  22 Item:  10   Improve Bacula's tape and drive usage and cleaning management.
  23 Item:  11   Allow skipping execution of Jobs
  24 Item:  12   Add a scheduling syntax that permits weekly rotations
  25 Item:  13   Archival (removal) of User Files to Tape
  26 Item:  14   Cause daemons to use a specific IP address to source communications
  27 Item:  15   Multiple threads in file daemon for the same job
  28 Item:  16   Add Plug-ins to the FileSet Include statements.
  29 Item:  17   Restore only file attributes (permissions, ACL, owner, group...)
  30 Item:  18*  Quick release of FD-SD connection after backup.
  31 Item:  19   Implement a Python interface to the Bacula catalog.
  32 Item:  20   Archive data
  33 Item:  21   Split documentation
  34 Item:  22   Implement support for stacking arbitrary stream filters, sinks.
  35 Item:  23   Implement from-client and to-client on restore command line.
  36 Item:  24   Add an override in Schedule for Pools based on backup types.
  37 Item:  25*  Implement huge exclude list support using hashing.
  38 Item:  26   Implement more Python events in Bacula.
  39 Item:  27   Incorporation of XACML2/SAML2 parsing
  40 Item:  28   Filesystem watch triggered backup.
  41 Item:  29   Allow inclusion/exclusion of files in a fileset by creation/mod times
  42 Item:  30   Tray monitor window cleanups
  43 Item:  31   Implement multiple numeric backup levels as supported by dump
  44 Item:  32   Automatic promotion of backup levels
  45 Item:  33   Clustered file-daemons
  46 Item:  34   Commercial database support
  47 Item:  35   Automatic disabling of devices
  48 Item:  36   An option to operate on all pools with update vol parameters
  49 Item:  37   Add an item to the restore option where you can select a pool
  50 Item:  38   Include timestamp of job launch in "stat clients" output
  51 Item:  39   Message mailing based on backup types
  52 Item:  40*  Include JobID in spool file name
  53
  54
  55 Item  1:  Accurate restoration of renamed/deleted files
  56   Date:   28 November 2005
  57   Origin: Martin Simmons (martin at lispworks dot com)
  58   Status: Robert Nelson will implement this
  59
  60   What:   When restoring a fileset for a specified date (including "most
  61           recent"), Bacula should give you exactly the files and directories
  62           that existed at the time of the last backup prior to that date.
  63
  64           Currently this only works if the last backup was a Full backup.
  65           When the last backup was Incremental/Differential, files and
  66           directories that have been renamed or deleted since the last Full
  67           backup are not currently restored correctly.  Ditto for files with
  68           extra/fewer hard links than at the time of the last Full backup.
  69
  70   Why:    Incremental/Differential would be much more useful if this worked.
  71
  72   Notes:  Merging of multiple backups into a single one seems to
  73           rely on this working, otherwise the merged backups will not be
  74           truly equivalent to a Full backup.
  75
  76           Kern: notes shortened. This can be done without the need for
  77           inodes. It is essentially the same as the current Verify job,
  78           but one additional database record must be written, which does
  79           not need any database change.
  80
  81           Kern: see if we can correct restoration of directories if
  82           replace=ifnewer is set.  Currently, if the directory does not
  83           exist, a "dummy" directory is created, then when all the files
  84           are updated, the dummy directory is newer so the real values
  85           are not updated.
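
          The rule in "What" can be illustrated with a minimal Python
          sketch.  It assumes, hypothetically, that every backup also
          records the complete list of paths that existed when it ran.
          The key names "saved" and "present" are invented for this
          example and are not Bacula catalog fields.

```python
def files_to_restore(backups):
    """Return {path: job_id} describing the tree as of the last backup.

    `backups` is ordered oldest to newest and starts with a Full.
    Each entry is a dict with hypothetical keys:
      "job_id"  - identifier of the job
      "saved"   - set of paths whose data this job saved
      "present" - set of all paths that existed when the job ran
    """
    restore = {}
    for backup in backups:
        # A newer copy of a file replaces the older one.
        for path in backup["saved"]:
            restore[path] = backup["job_id"]
        # Paths missing from this backup's list were deleted or renamed.
        for path in list(restore):
            if path not in backup["present"]:
                del restore[path]
    return restore
```

          With a Full that saved /a and /b, followed by an Incremental
          in which /b was renamed to /c, only /a and /c are selected.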
  86
  87 Item  2:  Implement a Bacula GUI/management tool.
  88   Origin: Kern
  89   Date:   28 October 2005
  90   Status: In progress
  91
  92   What:   Implement a Bacula console, and management tools
  93           probably using Qt3 and C++.
  94
  95   Why:    Don't we already have a wxWidgets GUI?  Yes, but
  96           it is written in C++ and changes to the user interface
  97           must be hand tailored using C++ code. By developing
  98           the user interface using Qt designer, the interface
  99           can be very easily updated and most of the new Python
 100           code will be automatically created.  The user interface
 101           changes become very simple, and only the new features
  102           must be implemented.  In addition, the code will be in
 103           Python, which will give many more users easy (or easier)
 104           access to making additions or modifications.
 105
  106  Notes:   There is a partial Python-GTK implementation by
 107           Lucas Di Pentima <lucas at lunix dot com dot ar> but
 108           it is no longer being developed.
 109
 110 Item  3:  Allow FD to initiate a backup
 111   Origin: Frank Volf (frank at deze dot org)
 112   Date:   17 November 2005
 113   Status:
 114
 115    What:  Provide some means, possibly by a restricted console that
 116           allows a FD to initiate a backup, and that uses the connection
 117           established by the FD to the Director for the backup so that
 118           a Director that is firewalled can do the backup.
 119
 120    Why:   Makes backup of laptops much easier.
 121
 122
 123 Item  4:  Merge multiple backups (Synthetic Backup or Consolidation).
 124   Origin: Marc Cousin and Eric Bollengier
 125   Date:   15 November 2005
  126   Status: Awaiting implementation. Depends on first implementing
 127           project Item 2 (Migration) which is now done.
 128
 129   What:   A merged backup is a backup made without connecting to the Client.
 130           It would be a Merge of existing backups into a single backup.
 131           In effect, it is like a restore but to the backup medium.
 132
 133           For instance, say that last Sunday we made a full backup.  Then
 134           all week long, we created incremental backups, in order to do
 135           them fast.  Now comes Sunday again, and we need another full.
 136           The merged backup makes it possible to do instead an incremental
 137           backup (during the night for instance), and then create a merged
 138           backup during the day, by using the full and incrementals from
 139           the week.  The merged backup will be exactly like a full made
 140           Sunday night on the tape, but the production interruption on the
 141           Client will be minimal, as the Client will only have to send
 142           incrementals.
 143
 144           In fact, if it's done correctly, you could merge all the
  145           Incrementals into a single Incremental, or all the Incrementals
 146           and the last Differential into a new Differential, or the Full,
 147           last differential and all the Incrementals into a new Full
 148           backup.  And there is no need to involve the Client.
 149
  150   Why:    The benefits are that:
  151           - the Client just does an incremental;
  152           - the merged backup on tape is just like a single full backup,
 153             and can be restored very fast.
 154
 155           This is also a way of reducing the backup data since the old
 156           data can then be pruned (or not) from the catalog, possibly
  157           allowing older volumes to be recycled.
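
          The three merge combinations described under "What" can be
          sketched as a job selection step.  This is only an
          illustration in Python; the (job_id, level) tuples and the
          function name are assumptions, not an existing Bacula
          interface.

```python
def jobs_to_merge(jobs, target):
    """Pick the jobs to merge into a new backup of level `target`.

    `jobs` is a list of (job_id, level) ordered oldest to newest,
    with level one of "F", "D" or "I".  `target` uses the same codes.
    """
    if target not in ("F", "D", "I"):
        raise ValueError("unknown target level: %r" % (target,))
    levels = [level for _, level in jobs]
    if "F" not in levels:
        raise ValueError("no Full backup to start from")
    # Index of the most recent Full.
    last_full = len(levels) - 1 - levels[::-1].index("F")
    # Most recent Differential made after that Full, if any.
    last_diff = None
    for i in range(last_full + 1, len(jobs)):
        if levels[i] == "D":
            last_diff = i
    start = last_diff if last_diff is not None else last_full
    incrementals = [jobs[i][0] for i in range(start + 1, len(jobs))
                    if levels[i] == "I"]
    selected = []
    if target == "F":
        selected.append(jobs[last_full][0])
    if target in ("F", "D") and last_diff is not None:
        selected.append(jobs[last_diff][0])
    return selected + incrementals
```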
 158
 159 Item  5:  Deletion of Disk-Based Bacula Volumes
 160   Date:   Nov 25, 2005
 161   Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
 162           by Kern)
 163   Status:
 164
 165    What:  Provide a way for Bacula to automatically remove Volumes
 166           from the filesystem, or optionally to truncate them.
  167           Obviously, the Volume must be pruned prior to removal.
 168
 169   Why:    This would allow users more control over their Volumes and
 170           prevent disk based volumes from consuming too much space.
 171
 172   Notes:  The following two directives might do the trick:
 173
 174           Volume Data Retention = <time period>
 175           Remove Volume After = <time period>
 176
 177           The migration project should also remove a Volume that is
 178           migrated. This might also work for tape Volumes.
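
          How a "Remove Volume After" period might be applied can be
          sketched as follows.  The function and its arguments are
          hypothetical; in particular, measuring the period from the
          last write is an assumption, since the proposal does not say
          when the clock starts.

```python
from datetime import datetime, timedelta


def volume_action(pruned, last_written, now, remove_after, truncate=False):
    """Decide what to do with a disk Volume: "keep", "truncate" or "remove".

    pruned       - True once the Volume has been pruned
    last_written - datetime of the last write (assumed start of the period)
    remove_after - timedelta standing in for "Remove Volume After"
    truncate     - truncate the file instead of deleting it
    """
    if not pruned:
        # The Volume must be pruned before it may be removed.
        return "keep"
    if now - last_written < remove_after:
        return "keep"
    return "truncate" if truncate else "remove"
```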
 179
 180 Item  6:  Implement Base jobs.
 181   Date:   28 October 2005
 182   Origin: Kern
 183   Status:
 184
 185   What:   A base job is sort of like a Full save except that you
 186           will want the FileSet to contain only files that are
 187           unlikely to change in the future (i.e.  a snapshot of
 188           most of your system after installing it).  After the
 189           base job has been run, when you are doing a Full save,
 190           you specify one or more Base jobs to be used.  All
 191           files that have been backed up in the Base job/jobs but
 192           not modified will then be excluded from the backup.
 193           During a restore, the Base jobs will be automatically
 194           pulled in where necessary.
 195
 196   Why:    This is something none of the competition does, as far as
 197           we know (except perhaps BackupPC, which is a Perl program that
  198           saves to disk only).  It is a big win for the user; it
  199           makes Bacula stand out as offering a unique
  200           optimization that immediately saves time and money.
  201           Basically, imagine that you have 100 nearly identical
  202           Windows or Linux machines containing the OS and user
 203           files.  Now for the OS part, a Base job will be backed
 204           up once, and rather than making 100 copies of the OS,
 205           there will be only one.  If one or more of the systems
 206           have some files updated, no problem, they will be
 207           automatically restored.
 208
 209   Notes:  Huge savings in tape usage even for a single machine.
 210           Will require more resources because the DIR must send
 211           FD a list of files/attribs, and the FD must search the
 212           list and compare it for each file to be saved.
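
          The comparison mentioned in the Notes might look roughly like
          this Python sketch.  The attribute tuples (size, mtime) and
          the function name are assumptions chosen for illustration.
          A dictionary keyed by path keeps the per-file search cheap.

```python
def files_to_save(current, base):
    """Return the paths a Full must still save when Base jobs are used.

    current - {path: attrs} as seen on the client now
    base    - {path: attrs} sent by the Director for the Base job(s)
    A file is skipped only if the Base job holds it with identical attrs.
    """
    return sorted(path for path, attrs in current.items()
                  if base.get(path) != attrs)
```

          For example, a file whose size or modification time differs
          from the Base job record, or which is absent from it, is
          saved; all others are left to the Base job.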
 213
 214 Item  7:  Implement creation and maintenance of copy pools
 215   Date:   27 November 2005
 216   Origin: David Boyes (dboyes at sinenomine dot net)
 217   Status:
 218
 219   What:   I would like Bacula to have the capability to write copies
 220           of backed-up data on multiple physical volumes selected
 221           from different pools without transferring the data
 222           multiple times, and to accept any of the copy volumes
 223           as valid for restore.
 224
 225   Why:    In many cases, businesses are required to keep offsite
 226           copies of backup volumes, or just wish for simple
 227           protection against a human operator dropping a storage
 228           volume and damaging it. The ability to generate multiple
 229           volumes in the course of a single backup job allows
  230           customers to simply check out one copy and send it
 231           offsite, marking it as out of changer or otherwise
 232           unavailable. Currently, the library and magazine
 233           management capability in Bacula does not make this process
 234           simple.
 235
 236           Restores would use the copy of the data on the first
 237           available volume, in order of copy pool chain definition.
 238
 239           This is also a major scalability issue -- as the number of
 240           clients increases beyond several thousand, and the volume
 241           of data increases, transferring the data multiple times to
 242           produce additional copies of the backups will become
 243           physically impossible due to transfer speed
 244           issues. Generating multiple copies at server side will
 245           become the only practical option.
 246
 247   How:    I suspect that this will require adding a multiplexing
 248           SD that appears to be a SD to a specific FD, but 1-n FDs
 249           to the specific back end SDs managing the primary and copy
 250           pools.  Storage pools will also need to acquire parameters
 251           to define the pools to be used for copies.
 252
 253   Notes:  I would commit some of my developers' time if we can agree
 254           on the design and behavior.
 255
 256 Item  8:  Directive/mode to backup only file changes, not entire file
 257   Date:   11 November 2005
 258   Origin: Joshua Kugler <joshua dot kugler at uaf dot edu>
 259           Marek Bajon <mbajon at bimsplus dot com dot pl>
 260   Status:
 261
 262   What:   Currently when a file changes, the entire file will be backed up in
 263           the next incremental or full backup.  To save space on the tapes
 264           it would be nice to have a mode whereby only the changes to the
 265           file would be backed up when it is changed.
 266
 267   Why:    This would save lots of space when backing up large files such as
 268           logs, mbox files, Outlook PST files and the like.
 269
 270   Notes:  This would require the usage of disk-based volumes as comparing
 271           files would not be feasible using a tape drive.
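
          One possible way to find the changed parts of a file is to
          compare digests of fixed-size blocks.  The proposal does not
          specify a method, so the Python sketch below is only an
          assumption.  It works well for appended data such as logs,
          but an insertion in the middle of a file shifts every later
          block.

```python
import hashlib

BLOCK_SIZE = 4096  # arbitrary block size chosen for this example


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-1 digest per fixed-size block of `data` (bytes)."""
    return [hashlib.sha1(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old_digests, new_data, block_size=BLOCK_SIZE):
    """Return [(block_index, block_bytes)] for blocks that are new or differ.

    `old_digests` are the digests recorded at the previous backup.
    A shrinking file is not handled here; a real design would also
    have to record the new file length.
    """
    changes = []
    for index, digest in enumerate(block_digests(new_data, block_size)):
        if index >= len(old_digests) or old_digests[index] != digest:
            start = index * block_size
            changes.append((index, new_data[start:start + block_size]))
    return changes
```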
 272
 273 Item  9:  Implement a server-side compression feature
 274   Date:   18 December 2006
  275   Origin: Vadim A. Umanski, e-mail umanski@ext.ru
 276   Status:
 277   What:   The ability to compress backup data on server receiving data
 278           instead of doing that on client sending data.
 279   Why:    The need is practical. I've got some machines that can send
 280           data to the network 4 or 5 times faster than compressing
 281           them (I've measured that). They're using fast enough SCSI/FC
 282           disk subsystems but rather slow CPUs (ex. UltraSPARC II).
  283           And the backup server has quite fast CPUs (ex. Dual P4
  284           Xeons) and quite a low load. When you have 20, 50 or 100 GB
  285           of raw data - running a job 4 to 5 times faster - that
  286           really matters. On the other hand, the data can be
  287           compressed 50% or better - so using twice as much space for
 288           disk backup is not good at all. And the network is all mine
 289           (I have a dedicated management/provisioning network) and I
 290           can get as high bandwidth as I need - 100Mbps, 1000Mbps...
 291           That's why the server-side compression feature is needed!
 292   Notes:
 293
 294 Item 10:  Improve Bacula's tape and drive usage and cleaning management.
 295   Date:   8 November 2005, November 11, 2005
 296   Origin: Adam Thornton <athornton at sinenomine dot net>,
 297           Arno Lehmann <al at its-lehmann dot de>
 298   Status:
 299
 300   What:   Make Bacula manage tape life cycle information, tape reuse
 301           times and drive cleaning cycles.
 302
 303   Why:    All three parts of this project are important when operating
 304           backups.
 305           We need to know which tapes need replacement, and we need to
 306           make sure the drives are cleaned when necessary.  While many
 307           tape libraries and even autoloaders can handle all this
 308           automatically, support by Bacula can be helpful for smaller
 309           (older) libraries and single drives.  Limiting the number of
 310           times a tape is used might prevent tape errors when using
  311           tapes until the drives can't read them any more.  Also, checking
 312           drive status during operation can prevent some failures (as I
 313           [Arno] had to learn the hard way...)
 314
 315   Notes:  First, Bacula could (and even does, to some limited extent)
 316           record tape and drive usage.  For tapes, the number of mounts,
 317           the amount of data, and the time the tape has actually been
 318           running could be recorded.  Data fields for Read and Write
 319           time and Number of mounts already exist in the catalog (I'm
 320           not sure if VolBytes is the sum of all bytes ever written to
 321           that volume by Bacula).  This information can be important
 322           when determining which media to replace.  The ability to mark
 323           Volumes as "used up" after a given number of write cycles
 324           should also be implemented so that a tape is never actually
 325           worn out.  For the tape drives known to Bacula, similar
 326           information is interesting to determine the device status and
 327           expected life time: Time it's been Reading and Writing, number
 328           of tape Loads / Unloads / Errors.  This information is not yet
 329           recorded as far as I [Arno] know.  A new volume status would
 330           be necessary for the new state, like "Used up" or "Worn out".
 331           Volumes with this state could be used for restores, but not
 332           for writing. These volumes should be migrated first (assuming
 333           migration is implemented) and, once they are no longer needed,
 334           could be moved to a Trash pool.
 335
 336           The next step would be to implement a drive cleaning setup.
 337           Bacula already has knowledge about cleaning tapes.  Once it
 338           has some information about cleaning cycles (measured in drive
  339           run time, number of tapes used, or calendar days, for example)
 340           it can automatically execute tape cleaning (with an
 341           autochanger, obviously) or ask for operator assistance loading
 342           a cleaning tape.
 343
 344           The final step would be to implement TAPEALERT checks not only
 345           when changing tapes and only sending the information to the
 346           administrator, but rather checking after each tape error,
 347           checking on a regular basis (for example after each tape
 348           file), and also before unloading and after loading a new tape.
  349           Then, depending on the drive's TAPEALERT state and the known
 350           drive cleaning state Bacula could automatically schedule later
 351           cleaning, clean immediately, or inform the operator.
 352
 353           Implementing this would perhaps require another catalog change
 354           and perhaps major changes in SD code and the DIR-SD protocol,
 355           so I'd only consider this worth implementing if it would
 356           actually be used or even needed by many people.
 357
 358           Implementation of these projects could happen in three distinct
 359           sub-projects: Measuring Tape and Drive usage, retiring
 360           volumes, and handling drive cleaning and TAPEALERTs.
 361
 362 Item 11:  Allow skipping execution of Jobs
 363   Date:   29 November 2005
 364   Origin: Florian Schnabel <florian.schnabel at docufy dot de>
 365   Status:
 366
  367     What: An easy option to skip a certain job on a certain date.
  368      Why: You could then easily skip tape backups on holidays.  Especially
  369           if you have no autochanger and can only fit one backup on a tape,
  370           that would be really handy; other jobs could proceed normally
 371           and you won't get errors that way.
 372
 373 Item 12:  Add a scheduling syntax that permits weekly rotations
 374    Date:  15 December 2006
 375   Origin: Gregory Brauer (greg at wildbrain dot com)
 376   Status:
 377
 378    What:  Currently, Bacula only understands how to deal with weeks of the
 379           month or weeks of the year in schedules.  This makes it impossible
 380           to do a true weekly rotation of tapes.  There will always be a
 381           discontinuity that will require disruptive manual intervention at
 382           least monthly or yearly because week boundaries never align with
 383           month or year boundaries.
 384
 385           A solution would be to add a new syntax that defines (at least)
 386           a start timestamp, and repetition period.
 387
 388    Why:   Rotated backups done at weekly intervals are useful, and Bacula
 389           cannot currently do them without extensive hacking.
 390
 391    Notes: Here is an example syntax showing a 3-week rotation where full
 392           Backups would be performed every week on Saturday, and an
 393           incremental would be performed every week on Tuesday.  Each
 394           set of tapes could be removed from the loader for the following
 395           two cycles before coming back and being reused on the third
 396           week.  Since the execution times are determined by intervals
 397           from a given point in time, there will never be any issues with
 398           having to adjust to any sort of arbitrary time boundary.  In
 399           the example provided, I even define the starting schedule
 400           as crossing both a year and a month boundary, but the run times
 401           would be based on the "Repeat" value and would therefore happen
 402           weekly as desired.
 403
 404
 405           Schedule {
 406               Name = "Week 1 Rotation"
 407               #Saturday.  Would run Dec 30, Jan 20, Feb 10, etc.
 408               Run {
 409                   Options {
 410                       Type   = Full
 411                       Start  = 2006-12-30 01:00
 412                       Repeat = 3w
 413                   }
 414               }
 415               #Tuesday.  Would run Jan 2, Jan 23, Feb 13, etc.
 416               Run {
 417                   Options {
 418                       Type   = Incremental
 419                       Start  = 2007-01-02 01:00
 420                       Repeat = 3w
 421                   }
 422               }
 423           }
 424
 425           Schedule {
 426               Name = "Week 2 Rotation"
 427               #Saturday.  Would run Jan 6, Jan 27, Feb 17, etc.
 428               Run {
 429                   Options {
 430                       Type   = Full
 431                       Start  = 2007-01-06 01:00
 432                       Repeat = 3w
 433                   }
 434               }
 435               #Tuesday.  Would run Jan 9, Jan 30, Feb 20, etc.
 436               Run {
 437                   Options {
 438                       Type   = Incremental
 439                       Start  = 2007-01-09 01:00
 440                       Repeat = 3w
 441                   }
 442               }
 443           }
 444
 445           Schedule {
 446               Name = "Week 3 Rotation"
 447               #Saturday.  Would run Jan 13, Feb 3, Feb 24, etc.
 448               Run {
 449                   Options {
 450                       Type   = Full
 451                       Start  = 2007-01-13 01:00
 452                       Repeat = 3w
 453                   }
 454               }
 455               #Tuesday.  Would run Jan 16, Feb 6, Feb 27, etc.
 456               Run {
 457                   Options {
 458                       Type   = Incremental
 459                       Start  = 2007-01-16 01:00
 460                       Repeat = 3w
 461                   }
 462               }
 463           }
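
          The run times implied by the proposed "Start" and "Repeat"
          values can be checked with a short Python sketch.  The
          parsing of "3w" (only "d" and "w" suffixes are handled) is an
          assumption about the proposed syntax, not an existing Bacula
          feature.

```python
from datetime import datetime, timedelta
from itertools import islice


def parse_repeat(text):
    """Turn a value such as "3w" or "10d" into a timedelta."""
    units = {"d": "days", "w": "weeks"}
    value, unit = int(text[:-1]), text[-1]
    return timedelta(**{units[unit]: value})


def run_times(start, repeat):
    """Yield run times forever: start, start + repeat, start + 2*repeat..."""
    when = datetime.strptime(start, "%Y-%m-%d %H:%M")
    step = parse_repeat(repeat)
    while True:
        yield when
        when += step


# First three Full runs of the "Week 1 Rotation" schedule above.
first_runs = list(islice(run_times("2006-12-30 01:00", "3w"), 3))
print([t.strftime("%b %d") for t in first_runs])
# prints ['Dec 30', 'Jan 20', 'Feb 10']
```

          Because each run is computed from the start timestamp, month
          and year boundaries play no role in the result.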
 464
 465 Item 13:  Archival (removal) of User Files to Tape
 466   Date:   Nov. 24/2005
  467   Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
 468   Status:
 469
 470   What:   The ability to archive data to storage based on certain parameters
 471           such as age, size, or location.  Once the data has been written to
 472           storage and logged it is then pruned from the originating
 473           filesystem. Note! We are talking about user's files and not
 474           Bacula Volumes.
 475
 476   Why:    This would allow fully automatic storage management which becomes
 477           useful for large datastores.  It would also allow for auto-staging
 478           from one media type to another.
 479
 480           Example 1) Medical imaging needs to store large amounts of data.
 481           They decide to keep data on their servers for 6 months and then put
 482           it away for long term storage.  The server then finds all files
  483           older than 6 months and writes them to tape.  The files are then removed
 484           from the server.
 485
 486           Example 2) All data that hasn't been accessed in 2 months could be
 487           moved from high-cost, fibre-channel disk storage to a low-cost
  488           large-capacity SATA disk storage pool which doesn't have as quick an
 489           access time.  Then after another 6 months (or possibly as one
 490           storage pool gets full) data is migrated to Tape.
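
          Selecting files by age, as in Example 1, might be sketched
          like this in Python.  The function name and the choice of
          modification time as the age criterion are assumptions.  The
          sketch only lists candidates; it neither writes them to tape
          nor removes them.

```python
import os
import time


def archive_candidates(root, max_age_days, now=None):
    """Yield files under `root` not modified within `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat avoids following (possibly broken) symbolic links.
            if os.lstat(path).st_mtime < cutoff:
                yield path
```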
 491
 492 Item 14:  Cause daemons to use a specific IP address to source communications
 493  Origin:  Bill Moran <wmoran@collaborativefusion.com>
 494  Date:    18 Dec 2006
 495  Status:
  496  What:    Cause Bacula daemons (dir, fd, sd) to always use the IP address
  497           specified in the [DIR|FD|SD]Addr directive as the source IP
 498           for initiating communication.
 499  Why:     On complex networks, as well as extremely secure networks, it's
 500           not unusual to have multiple possible routes through the network.
 501           Often, each of these routes is secured by different policies
 502           (effectively, firewalls allow or deny different traffic depending
 503           on the source address)
 504           Unfortunately, it can sometimes be difficult or impossible to
 505           represent this in a system routing table, as the result is
 506           excessive subnetting that quickly exhausts available IP space.
 507           The best available workaround is to provide multiple IPs to
 508           a single machine that are all on the same subnet.  In order
 509           for this to work properly, applications must support the ability
 510           to bind outgoing connections to a specified address, otherwise
 511           the operating system will always choose the first IP that
 512           matches the required route.
 513  Notes:   Many other programs support this.  For example, the following
 514           can be configured in BIND:
 515           query-source address 10.0.0.1;
 516           transfer-source 10.0.0.2;
 517           Which means queries from this server will always come from
 518           10.0.0.1 and zone transfers will always originate from
 519           10.0.0.2.
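
          Binding an outgoing connection to a chosen source address can
          be shown with Python's standard socket module.  This is only
          an illustration of the technique; the helper name is invented
          and nothing here is Bacula code.

```python
import socket


def connect_from(source_ip, host, port, timeout=10.0):
    """Open a TCP connection to (host, port) that originates at source_ip.

    Source port 0 lets the operating system pick an ephemeral port,
    so only the source address is pinned.
    """
    return socket.create_connection((host, port), timeout=timeout,
                                    source_address=(source_ip, 0))
```

          The peer then sees the connection coming from source_ip
          rather than from whichever address the routing table would
          have selected.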
 520
 521 Item 15:  Multiple threads in file daemon for the same job
 522   Date:   27 November 2005
 523   Origin: Ove Risberg (Ove.Risberg at octocode dot com)
 524   Status:
 525
 526   What:   I want the file daemon to start multiple threads for a backup
 527           job so the fastest possible backup can be made.
 528
 529           The file daemon could parse the FileSet information and start
 530           one thread for each File entry located on a separate
 531           filesystem.
 532
  533           A configuration option in the job section should be used to
  534           enable or disable this feature. The configuration option could
  535           specify the maximum number of threads in the file daemon.
  536
  537           If the threads could spool the data to separate spool files
 538           the restore process will not be much slower.
 539
 540   Why:    Multiple concurrent backups of a large fileserver with many
 541           disks and controllers will be much faster.
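
          The idea of one thread per filesystem, capped by a
          configurable maximum, can be sketched in Python.  Grouping by
          the st_dev device number and the backup_one callback are
          assumptions made for this example.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_filesystem(paths):
    """Group File entries by the device (filesystem) they live on."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.stat(path).st_dev].append(path)
    return list(groups.values())


def backup_in_parallel(paths, backup_one, max_threads=4):
    """Run backup_one(path) for every path, one worker per filesystem.

    Entries on the same filesystem are processed sequentially by a
    single worker; max_threads caps the number of workers.
    """
    groups = group_by_filesystem(paths)
    workers = max(1, min(max_threads, len(groups)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda group: [backup_one(p) for p in group],
                             groups))
```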
 542
 543 Item 16:  Add Plug-ins to the FileSet Include statements.
 544   Date:   28 October 2005
 545   Origin:
 546   Status: Partially coded in 1.37 -- much more to do.
 547
 548   What:   Allow users to specify wild-card and/or regular
 549           expressions to be matched in both the Include and
 550           Exclude directives in a FileSet.  At the same time,
 551           allow users to define plug-ins to be called (based on
 552           regular expression/wild-card matching).
 553
 554   Why:    This would give the users the ultimate ability to control
 555           how files are backed up/restored.  A user could write a
  556           plug-in that knows how to backup his Oracle database without
 557           stopping/starting it, for example.
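
          Calling a plug-in based on regular expression matching could
          work roughly as in the Python sketch below.  The registry
          class and handler signature are hypothetical and do not
          describe the code partially written in 1.37.

```python
import re


class PluginRegistry:
    """Map regular expressions to plug-in handlers (first match wins)."""

    def __init__(self):
        self._rules = []

    def register(self, pattern, handler):
        self._rules.append((re.compile(pattern), handler))

    def handler_for(self, path, default):
        """Return the handler for `path`, or `default` if nothing matches."""
        for regex, handler in self._rules:
            if regex.search(path):
                return handler
        return default


def plain_backup(path):
    return "file:" + path


def database_backup(path):
    # Stand-in for a plug-in that can save a live database file.
    return "db-plugin:" + path


registry = PluginRegistry()
registry.register(r"\.dbf$", database_backup)
```

          A call such as registry.handler_for(path, plain_backup)(path)
          then routes matching files to the plug-in and everything else
          to the normal backup code.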
 558
 559 Item 17:  Restore only file attributes (permissions, ACL, owner, group...)
 560   Origin: Eric Bollengier
 561   Date:   30/12/2006
 562   Status:
 563
 564   What:   The goal of this project is to be able to restore only rights
  565           and attributes of files without overwriting them.
  566
  567   Why:    Who has never had to repair a chmod -R 777, or a wild update
  568           of recursive rights under Windows? At this time, you must have
  569           enough space to restore data, dump attributes (easy with acl,
  570           more complex with unix/windows rights) and apply them to your
  571           broken tree. With this option, it will be very easy to compare
  572           rights or ACLs over time.
 573
  574   Notes:  If the file exists, we skip the restore and change rights.
  575           If the file doesn't exist, we can create an empty one and apply
 576           rights or do nothing.
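
          The behaviour in the Notes can be sketched in Python for the
          permission bits alone.  Owner, group and ACLs are left out,
          and the function name and return values are assumptions.

```python
import os


def restore_mode(path, mode, create_missing=False):
    """Apply saved permission bits to `path` without touching its data.

    Returns "updated", "created" or "skipped".
    """
    if os.path.exists(path):
        os.chmod(path, mode)
        return "updated"
    if not create_missing:
        return "skipped"
    # Create an empty file, then apply the saved rights to it.
    open(path, "w").close()
    os.chmod(path, mode)
    return "created"
```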
 577
 578 Item 18:  Quick release of FD-SD connection after backup.
 579   Origin: Frank Volf (frank at deze dot org)
 580   Date:   17 November 2005
 581   Status: Done -- implemented by Kern -- in CVS 26Jan07
 582
 583    What:  In the Bacula implementation a backup is finished after all data
 584           and attributes are successfully written to storage.  When using a
 585           tape backup it is very annoying that a backup can take a day,
 586           simply because the current tape (or whatever) is full and the
 587           administrator has not put a new one in.  During that time the
 588           system cannot be taken off-line, because there is still an open
 589           session between the storage daemon and the file daemon on the
 590           client.
 591
  592           Although this is a very good strategy for making "safe backups",
  593           this can be annoying for e.g.  laptops, that must remain
 594           connected until the backup is completed.
 595
 596           Using a new feature called "migration" it will be possible to
 597           spool first to harddisk (using a special 'spool' migration
 598           scheme) and then migrate the backup to tape.
 599
 600           There is still the problem of getting the attributes committed.
 601           If it takes a very long time to do, with the current code, the
 602           job has not terminated, and the File daemon is not freed up.  The
 603           Storage daemon should release the File daemon as soon as all the
 604           file data and all the attributes have been sent to it (the SD).
 605           Currently the SD waits until everything is on tape and all the
 606           attributes are transmitted to the Director before signaling
 607           completion to the FD. I don't think I would have any problem
 608           changing this.  The reason is that even if the FD reports back to
 609           the Dir that all is OK, the job will not terminate until the SD
 610           has done the same thing -- so in a way keeping the SD-FD link
 611           open to the very end is not really very productive ...
 612
 613    Why:   Makes backup of laptops much faster.
 614
 615 Item 19:  Implement a Python interface to the Bacula catalog.
 616   Date:   28 October 2005
 617   Origin: Kern
 618   Status:
 619
 620   What:   Implement an interface for Python scripts to access
 621           the catalog through Bacula.
 622
 623   Why:    This will permit users to customize Bacula through
 624           Python scripts.
 625
 626 Item 20:  Archive data
 627   Date:   15/5/2006
 628   Origin: calvin streeting calvin at absentdream dot com
 629   Status:
 630
 631   What:   The ability to archive to media (DVD/CD) in an uncompressed format
 632           for dead filing (archiving, not backing up).
 633
 634     Why:  At my workplace, when jobs are finished they are moved off the
 635           main file servers (RAID-based systems) onto a simple Linux file
 636           server (IDE-based system) so users can find old information
 637           without contacting the IT dept.
 638
 639           So this data doesn't really change; it only gets added to.
 640           But it also needs backing up.  At the moment it takes
 641           about 8 hours to back up our servers (working data), so
 642           rather than add more time to existing backups I am trying
 643           to implement a system where we back up the archive data to
 644           CD/DVD.  These disks would only need to be appended to
 645           (burn only new/changed files to new disks for off-site
 646           storage).  Basically, understand the difference between
 647           archive data and live data.
 648
 649   Notes:  - Scan the data and email me when it needs burning.
 650           - Divide it into predefined chunks.
 651           - Keep a record of what is on what disk.
 652           - Make me a label (simple php->mysql=>pdf stuff; I could do this bit).
 653           - Ability to save data uncompressed so it can be read on any
 654             other system (future-proof data).
 655           - Save the catalog with the disk as some kind of menu system.
 656
 657 Item 21:  Split documentation
 658   Origin: Maxx <maxxatworkat gmail dot com>
 659   Date:   27th July 2006
 660   Status:
 661
 662   What:   Split documentation in several books
 663
 664   Why:    Bacula manual has now more than 600 pages, and looking for
 665           implementation details is getting complicated.  I think
 666           it would be good to split the single volume in two or
 667           maybe three parts:
 668
 669           1) Introduction, requirements and tutorial, typically
 670              are useful only until first installation time
 671
 672           2) Basic installation and configuration, with all the
 673              gory details about the directives supported
 674
 675           3) Advanced Bacula: testing, troubleshooting, GUI and
 676              ancillary programs, security management, scripting, etc.
 677
 678
 679 Item 22:  Implement support for stacking arbitrary stream filters, sinks.
 680 Date:     23 November 2006
 681 Origin:   Landon Fuller <landonf@threerings.net>
 682 Status:   Planning. Assigned to landonf.
 683
 684   What:   Implement support for the following:
 685           - Stacking arbitrary stream filters (e.g., encryption, compression,
 686             sparse data handling)
 687           - Attaching file sinks to terminate stream filters (i.e., write out
 688             the resultant data to a file)
 689           - Refactor the restoration state machine accordingly
 690
 691    Why:   The existing stream implementation suffers from the following:
 692            - All state (compression, encryption, stream restoration) is
 693              global across the entire restore process, for all streams. There are
 694              multiple entry and exit points in the restoration state machine, and
 695              thus multiple places where state must be allocated, deallocated,
 696              initialized, or reinitialized. This results in exceptional complexity
 697              for the author of a stream filter.
 698            - The developer must enumerate all possible combinations of filters
 699              and stream types (ie, win32 data with encryption, without encryption,
 700              with encryption AND compression, etc).
 701
 702   Notes:  This feature request only covers implementing the stream filters/
 703           sinks, and refactoring the file daemon's restoration implementation
 704           accordingly. If I have extra time, I will also rewrite the backup
 705           implementation. My intent in implementing the restoration first is to
 706           solve pressing bugs in the restoration handling, and to ensure that
 707           the new restore implementation handles existing backups correctly.
 708
 709           I do not plan on changing the network or tape data structures to
 710           support defining arbitrary stream filters, but supporting that
 711           functionality is the ultimate goal.
 712
 713           Assistance with either code or testing would be fantastic.
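
               A minimal Python sketch of the stacking idea described in
               this item, not the planned file daemon code.  Every class
               and function name here is hypothetical, and XorFilter is only
               a toy stand-in for a real decryption filter.  The point it
               illustrates is that each filter keeps its own state and
               simply writes to whatever stage sits downstream, so filters
               can be combined without enumerating the combinations.

```python
import io
import zlib


class FileSink:
    """Terminates a filter chain by writing the resulting bytes to a file object."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def write(self, data):
        self.fileobj.write(data)

    def close(self):
        self.fileobj.flush()


class DecompressFilter:
    """Inflates zlib data; all state lives in this instance, not in globals."""

    def __init__(self, downstream):
        self.downstream = downstream
        self.inflater = zlib.decompressobj()

    def write(self, data):
        self.downstream.write(self.inflater.decompress(data))

    def close(self):
        self.downstream.write(self.inflater.flush())
        self.downstream.close()


class XorFilter:
    """Toy stand-in for a decryption filter (XOR with a one-byte key)."""

    def __init__(self, downstream, key):
        self.downstream = downstream
        self.key = key

    def write(self, data):
        self.downstream.write(bytes(b ^ self.key for b in data))

    def close(self):
        self.downstream.close()


def build_stack(sink, filter_factories):
    """Wrap the sink in filters; the first factory ends up outermost."""
    stage = sink
    for factory in reversed(filter_factories):
        stage = factory(stage)
    return stage


# Restore a stream that was compressed and then "encrypted" at backup time.
original = b"file contents " * 100
stored = bytes(b ^ 0x5A for b in zlib.compress(original))

restored = io.BytesIO()
stack = build_stack(FileSink(restored),
                    [lambda d: XorFilter(d, 0x5A), DecompressFilter])
for offset in range(0, len(stored), 64):  # feed the stream in small chunks
    stack.write(stored[offset:offset + 64])
stack.close()
```

               Adding another filter type under this scheme means writing
               one more class with write() and close(), and listing it in
               the call to build_stack().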
 714
 715 Item 23:  Implement from-client and to-client on restore command line.
 716    Date:  11 December 2006
 717   Origin: Discussion on Bacula-users entitled 'Scripted restores to
 718           different clients', December 2006
 719   Status: New feature request
 720
 721   What:   While using bconsole interactively, you can specify the client
 722           that a backup job is to be restored for, and then you can
 723           specify later a different client to send the restored files
 724           back to. However, using the 'restore' command with all options
 725           on the command line, this cannot be done, due to the ambiguous
 726           'client' parameter. Additionally, this parameter means different
 727           things depending on if it's specified on the command line or
 728           afterwards, in the Modify Job screens.
 729
 730      Why: This feature would enable restore jobs to be more completely
 731           automated, for example by a web or GUI front-end.
 732
 733    Notes: client can also be implied by specifying the jobid on the command
 734           line
 735
 736 Item 24:  Add an override in Schedule for Pools based on backup types.
 737 Date:     19 Jan 2005
 738 Origin:   Chad Slater <chad.slater@clickfox.com>
 739 Status:
 740
 741   What:   Adding a FullStorage=BigTapeLibrary in the Schedule resource
 742           would help those of us who use different storage devices for different
 743           backup levels cope with the "auto-upgrade" of a backup.
 744
 745   Why:    Assume I add several new devices to be backed up, i.e. several
 746           hosts with 1TB RAID.  To avoid tape switching hassles, incrementals are
 747           stored in a disk set on a 2TB RAID.  If you add these devices in the
 748           middle of the month, the incrementals are upgraded to "full" backups,
 749           but they try to use the same storage device as requested in the
 750           incremental job, filling up the RAID holding the differentials.  If we
 751           could override the Storage parameter for full and/or differential
 752           backups, then the Full job would use the proper Storage device, which
 753           has more capacity (i.e. an 8TB tape library).
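
               The requested behaviour can be sketched in a few lines of
               Python.  This is only an illustration under assumptions: the
               function name, the dictionary standing in for a
               FullStorage=BigTapeLibrary directive, and the storage name
               "DiskRAID" are all made up.  The key point is that the lookup
               uses the level the job actually runs at (after any
               auto-upgrade), not the level that was scheduled.

```python
def pick_storage(job_storage, level_overrides, effective_level):
    """Return the storage for the level the job actually runs at.

    level_overrides maps a level name to a storage name; levels without
    an override fall back to the storage configured for the job.
    """
    return level_overrides.get(effective_level, job_storage)


# Stands in for a hypothetical FullStorage=BigTapeLibrary in the Schedule.
overrides = {"Full": "BigTapeLibrary"}

# An incremental that was auto-upgraded to Full goes to the tape library.
upgraded = pick_storage("DiskRAID", overrides, "Full")
# A normal incremental keeps using the disk storage.
normal = pick_storage("DiskRAID", overrides, "Incremental")
```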
 754
 755 Item 25:  Implement huge exclude list support using hashing (dlists).
 756   Date:   28 October 2005
 757   Origin: Kern
 758   Status: Done in 2.1.2, but implemented with dlists (doubly linked
 759           lists), since hashing will not help. The huge list support
 760           also covers large include lists.
 761
 762   What:   Allow users to specify very large exclude lists (currently
 763           more than about 1000 files is too many).
 764
 765   Why:    This would give the users the ability to exclude all
 766           files that are loaded with the OS (e.g. using rpms
 767           or debs). If the user can restore the base OS from
 768           CDs, there is no need to backup all those files. A
 769           complete restore would be to restore the base OS, then
 770           do a Bacula restore. By excluding the base OS files, the
 771           backup set will be *much* smaller.
 772
 773 Item 26:  Implement more Python events in Bacula.
 774   Date:   28 October 2005
 775   Origin: Kern
 776   Status:
 777
 778   What:   Allow Python scripts to be called at more places
 779           within Bacula and provide additional access to Bacula
 780           internal variables.
 781
 782   Why:    This will permit users to customize Bacula through
 783           Python scripts.
 784
 785   Notes:  Recycle event
 786           Scratch pool event
 787           NeedVolume event
 788           MediaFull event
 789
 790           Also add a way to get a listing of currently running
 791           jobs (possibly also scheduled jobs).
 792
 793
 794 Item 27:  Incorporation of XACML2/SAML2 parsing
 795    Date:   19 January 2006
 796    Origin: Adam Thornton <athornton@sinenomine.net>
 797    Status: Blue sky
 798
 799    What:   XACML is the "eXtensible Access Control Markup Language" and
 800           SAML is the "Security Assertion Markup Language"--an XML standard
 801           for making statements about identity and authorization.  Having these
 802           would give us a framework to approach ACLs in a generic manner, and
 803           in a way flexible enough to support the four major sorts of ACLs I
 804           see as a concern to Bacula at this point, as well as (probably) to
 805           deal with new sorts of ACLs that may appear in the future.
 806
 807    Why:    Bacula is beginning to need to back up systems with ACLs
 808           that do not map cleanly onto traditional Unix permissions.  I see
 809           four sets of ACLs--in general, mutually incompatible with one
 810           another--that we're going to need to deal with.  These are: NTFS
 811           ACLs, POSIX ACLs, NFSv4 ACLs, and AFS ACLs.  (Some may question the
 812           relevance of AFS; AFS is one of Sine Nomine's core consulting
 813           businesses, and having a reputable file-level backup and restore
 814           technology for it (as Tivoli is probably going to drop AFS support
 815           soon since IBM no longer supports AFS) would be of huge benefit to
 816           our customers; we'd most likely create the AFS support at Sine Nomine
 817           for inclusion into the Bacula (and perhaps some changes to the
 818           OpenAFS volserver) core code.)
 819
 820           Now, obviously, Bacula already handles NTFS just fine.  However, I
 821           think there's a lot of value in implementing a generic ACL model, so
 822           that it's easy to support whatever particular instances of ACLs come
 823           down the pike: POSIX ACLs (think SELinux) and NFSv4 are the obvious
 824           things arriving in the Linux world in a big way in the near future.
 825           XACML, although overcomplicated for our needs, provides this
 826           framework, and we should be able to leverage other people's
 827           implementations to minimize the amount of work *we* have to do to get
 828           a generic ACL framework.  Basically, the costs of implementation are
 829           high, but they're largely both external to Bacula and already sunk.
 830
 831 Item 28:  Filesystem watch triggered backup.
 832   Date:   31 August 2006
 833   Origin: Jesper Krogh <jesper@krogh.cc>
 834   Status: Unimplemented, depends probably on "client initiated backups"
 835
 836   What:   With inotify and similar filesystem-triggered notification
 837           systems, it is possible to have the file daemon monitor
 838           filesystem changes and initiate a backup.
 839
 840   Why:    There are 2 situations where this is nice to have.
 841           1) It is possible to get a much finer-grained backup than
 842              the fixed schedules used now. A file created and deleted
 843              a few hours later, can automatically be caught.
 844
 845           2) The introduced load on the system will probably be
 846              distributed more evenly on the system.
 847
 848   Notes:  This can be combined with configuration that specifies
 849           something like: "at most every 15 minutes or when changes
 850           consumed XX MB".
 851
 852 Kern Notes: I would rather see this implemented by an external program
 853           that monitors the Filesystem changes, then uses the console
 854           to start the appropriate job.
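
               A rough Python sketch of the trigger policy such an external
               program might apply.  The class and its parameters are
               hypothetical, and the reading of the rule in the Notes is an
               assumption: run when the pending changes reach a size limit,
               or when there are pending changes and the minimum interval
               since the last run has passed.  Collecting the filesystem
               events and starting the job through the console are left out.

```python
class BackupTrigger:
    """Decides when accumulated filesystem changes justify a backup run."""

    def __init__(self, min_interval_s, max_pending_bytes):
        self.min_interval_s = min_interval_s
        self.max_pending_bytes = max_pending_bytes
        self.pending_bytes = 0
        self.last_run = None

    def record_change(self, nbytes):
        """Called by the watcher for every change notification."""
        self.pending_bytes += nbytes

    def should_run(self, now):
        if self.pending_bytes == 0:
            return False  # nothing changed, nothing to back up
        if self.pending_bytes >= self.max_pending_bytes:
            return True  # size limit reached, do not wait for the interval
        return self.last_run is None or now - self.last_run >= self.min_interval_s

    def mark_run(self, now):
        """Called after the job has been started."""
        self.last_run = now
        self.pending_bytes = 0


# 15 minutes or 100 MB, whichever comes first (times in seconds).
trigger = BackupTrigger(15 * 60, 100 * 1024 * 1024)
trigger.mark_run(0)
trigger.record_change(4096)
early = trigger.should_run(60)       # small change, interval not yet over
late = trigger.should_run(15 * 60)   # same change, interval has passed
trigger.record_change(200 * 1024 * 1024)
burst = trigger.should_run(61)       # large change, size limit exceeded
```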
 855
 856 Item 29:  Allow inclusion/exclusion of files in a fileset by creation/mod times
 857   Origin: Evan Kaufman <evan.kaufman@gmail.com>
 858   Date:   January 11, 2006
 859   Status:
 860
 861   What:   In the vein of the Wild and Regex directives in a Fileset's
 862           Options, it would be helpful to allow a user to include or exclude
 863           files and directories by creation or modification times.
 864
 865           You could factor the Exclude=yes|no option in much the same way it
 866           affects the Wild and Regex directives.  For example, you could exclude
 867           all files modified before a certain date:
 868
 869    Options {
 870      Exclude = yes
 871      Modified Before = ####
 872    }
 873
 874            Or you could exclude all files created/modified since a certain date:
 875
 876    Options {
 877      Exclude = yes
 878      Created Modified Since = ####
 879    }
 880
 881            The format of the time/date could be done several ways, say the number
 882            of seconds since the epoch:
 883            1137008553 = Jan 11 2006, 1:42:33PM   # result of `date +%s`
 884
 885            Or a human readable date in a cryptic form:
 886            20060111134233 = Jan 11 2006, 1:42:33PM   # YYYYMMDDhhmmss
 887
 888   Why:    I imagine a feature like this could have many uses. It would
 889           allow a user to do a full backup while excluding the base operating
 890           system files, so if I installed a Linux snapshot from a CD yesterday,
 891           I'll *exclude* all files modified *before* today.  If I need to
 892           recover the system, I use the CD I already have, plus the tape backup.
 893           Or if, say, a Windows client is hit by a particularly corrosive
 894           virus, and I need to *exclude* any files created/modified *since* the
 895           time of infection.
 896
 897   Notes:  Of course, this feature would work in concert with other
 898           in/exclude rules, and wouldn't override them (or each other).
 899
 900   Notes:  The directives I'd imagine would be along the lines of
 901           "[Created] [Modified] [Before|Since] = <date>".
 902           So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
 903            or 'since'.
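
               As an illustration only, the two date formats and the
               Before/Since comparison proposed in this item could be handled
               along these lines in Python.  The function names are
               hypothetical.  The YYYYMMDDhhmmss form needs a time zone to be
               turned into epoch seconds; the example values in this item
               match each other at UTC-6, which is assumed below.

```python
from datetime import datetime, timedelta, timezone


def parse_when(value, tz):
    """Accept epoch seconds or YYYYMMDDhhmmss (read in tz); return epoch seconds."""
    if len(value) == 14 and value.isdigit():
        stamp = datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=tz)
        return int(stamp.timestamp())
    return int(value)


def is_excluded(file_time, before=None, since=None):
    """True if the file time is earlier than `before` or at/after `since`."""
    if before is not None and file_time < before:
        return True
    if since is not None and file_time >= since:
        return True
    return False


# Assumption: the example time in this item was written in UTC-6.
central = timezone(timedelta(hours=-6))
cutoff = parse_when("20060111134233", central)
same_cutoff = parse_when("1137008553", central)
```

               With "Exclude = yes" and a Before date, a file is skipped when
               is_excluded(mtime, before=cutoff) is true; a Since date works
               the same way through the since argument.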
 904
 905
 906 Item 30:  Tray monitor window cleanups
 907   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 908   Date:   24 July 2006
 909   Status:
 910   What:   Resizeable and scrollable windows in the tray monitor.
 911
 912   Why:    With multiple clients, or with many jobs running, the displayed
 913           window often ends up larger than the available screen, making
 914           the trailing items difficult to read.
 915
 916
 917 Item 31:  Implement multiple numeric backup levels as supported by dump
 918 Date:     3 April 2006
 919 Origin:   Daniel Rich <drich@employees.org>
 920 Status:
 921 What:     Dump allows specification of backup levels numerically instead of just
 922           "full", "incr", and "diff".  In this system, at any given level, all
 923           files are backed up that were modified since the last backup of a
 924           higher level (with 0 being the highest and 9 being the lowest).  A
 925           level 0 is therefore equivalent to a full, level 9 an incremental, and
 926           the levels 1 through 8 are varying levels of differentials.  For
 927           bacula's sake, these could be represented as "full", "incr", and
 928           "diff1", "diff2", etc.
 929
 930 Why:      Support of multiple backup levels would provide for more advanced backup
 931           rotation schemes such as "Towers of Hanoi".  This would allow better
 932           flexibility in performing backups, and can lead to shorter recovery
 933           times.
 934
 935 Notes:    Legato Networker supports a similar system with full, incr, and 1-9 as
 936           levels.
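
          The level rule described under "What" can be sketched in Python
          as follows.  The function name and the (timestamp, level) history
          format are assumptions made for the example; a "higher" level in
          the text above means a numerically lower one, with 0 as the full
          backup.

```python
def reference_backup(history, level):
    """Return the most recent earlier backup with a numerically lower level.

    history is a list of (timestamp, level) tuples in chronological order.
    A backup at `level` saves everything modified since the backup returned
    here; None means there is no such backup, as for a level 0 (full).
    """
    for when, past_level in reversed(history):
        if past_level < level:
            return (when, past_level)
    return None


# Timestamps are just sequence numbers in this example.
history = [(1, 0), (2, 3), (3, 2), (4, 5), (5, 4)]

for_level_9 = reference_backup(history, 9)  # behaves like an incremental
for_level_4 = reference_backup(history, 4)  # skips the earlier level 4 and 5
for_level_2 = reference_backup(history, 2)  # only the full is lower
for_level_0 = reference_backup(history, 0)  # a full has no reference
```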
 937
 938 Item 32:  Automatic promotion of backup levels
 939    Date:  19 January 2006
 940   Origin: Adam Thornton <athornton@sinenomine.net>
 941   Status:
 942
 943     What: Amanda has a feature whereby it estimates the space that a
 944           differential, incremental, and full backup would take.  If the
 945           difference in space required between the scheduled level and the next
 946           level up is beneath some user-defined critical threshold, the backup
 947           level is bumped to the next type.  Doing this minimizes the number of
 948           volumes necessary during a restore, with a fairly minimal cost in
 949           backup media space.
 950
 951     Why:  I know at least one (quite sophisticated and smart) user
 952           for whom the absence of this feature is a deal-breaker in terms of
 953           using Bacula; if we had it it would eliminate the one cool thing
 954           Amanda can do and we can't (at least, the one cool thing I know of).
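
               A small Python sketch of the promotion rule as described
               above.  It is not Amanda's code; the function name, the level
               ordering and the size estimates (in MB) are made up for the
               example, and only a single step up is considered.

```python
LEVELS = ["Incremental", "Differential", "Full"]  # lowest to highest


def choose_level(scheduled, estimates, threshold):
    """Bump the level by one step if the extra space needed is below threshold."""
    index = LEVELS.index(scheduled)
    if index + 1 < len(LEVELS):
        next_level = LEVELS[index + 1]
        if estimates[next_level] - estimates[scheduled] < threshold:
            return next_level
    return scheduled


# Estimated backup sizes in MB (invented numbers).
estimates = {"Incremental": 900, "Differential": 1000, "Full": 5000}

promoted = choose_level("Incremental", estimates, 200)   # only 100 MB more
unchanged = choose_level("Differential", estimates, 200)  # 4000 MB more
already_full = choose_level("Full", estimates, 200)       # nothing above Full
```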
 955
 956 Item 33:  Clustered file-daemons
 957   Origin: Alan Brown ajb2 at mssl dot ucl dot ac dot uk
 958   Date:   24 July 2006
 959   Status:
 960   What:   A "virtual" filedaemon, which is actually a cluster of real ones.
 961
 962   Why:    In the case of clustered filesystems (SAN setups, GFS, or OCFS2, etc)
 963           multiple machines may have access to the same set of filesystems.
 964
 965           For performance reasons, one may wish to initiate backups from
 966           several of these machines simultaneously, instead of just using
 967           one backup source for the common clustered filesystem.
 968
 969           For obvious reasons, normally backups of $A-FD/$PATH and
 970           $B-FD/$PATH are treated as different backup sets. In this case
 971           they are the same communal set.
 972
 973           Likewise when restoring, it would be easier to just specify
 974           one of the cluster machines and let bacula decide which to use.
 975
 976           This can be faked to some extent using DNS round robin entries
 977           and a virtual IP address, however it means "status client" will
 978           always give bogus answers. Additionally there is no way of
 979           spreading the load evenly among the servers.
 980
 981           What is required is something similar to the storage daemon
 982           autochanger directives, so that Bacula can keep track of
 983           operating backups/restores and direct new jobs to a "free"
 984           client.
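
               One possible shape for the bookkeeping, sketched in Python.
               The class and the client names are hypothetical; the sketch
               only shows the idea of counting running jobs per cluster
               member and handing new jobs to the least busy one.

```python
class ClientCluster:
    """Tracks running jobs per real file daemon behind one virtual client."""

    def __init__(self, members):
        self.running = {name: 0 for name in members}

    def acquire(self):
        """Pick the member with the fewest running jobs and count the new job."""
        name = min(self.running, key=self.running.get)
        self.running[name] += 1
        return name

    def release(self, name):
        """Call when a job on that member has finished."""
        self.running[name] -= 1


cluster = ClientCluster(["a-fd", "b-fd"])
first = cluster.acquire()    # both idle, the first member is chosen
second = cluster.acquire()   # the other member is now the free one
cluster.release(first)
third = cluster.acquire()    # the released member is free again
```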
 985
 986    Notes:
 987
 988 Item 34:  Commercial database support
 989   Origin: Russell Howe <russell_howe dot wreckage dot org>
 990   Date:   26 July 2006
 991   Status:
 992
 993   What:   It would be nice for the database backend to support more
 994           databases. I'm thinking of SQL Server at the moment, but I guess Oracle,
 995           DB2, MaxDB, etc are all candidates. SQL Server would presumably be
 996           implemented using FreeTDS or maybe an ODBC library?
 997
 998   Why:    We only really have one database server, which is MS SQL Server
 999           2000. We would rather not maintain a second one for the backup software
1000           (we grew out of SQLite, which I liked, but which didn't work so well with
1001           our database size). We don't really have a machine with the resources to
1002           run postgres, and would rather only maintain a single DBMS. We're stuck with
1003           SQL Server because pretty much all the company's custom applications
1004           (written by consultants) are locked into SQL Server 2000. I can imagine
1005           this scenario is fairly common, and it would be nice to use the existing
1006           properly specced database server for storing Bacula's catalog, rather
1007           than having to run a second DBMS.
1008
1009 Item 35:  Automatic disabling of devices
1010    Date:  2005-11-11
1011   Origin: Peter Eriksson <peter at ifm.liu dot se>
1012   Status:
1013
1014    What:  After a configurable number of fatal errors with a tape drive,
1015           Bacula should automatically disable further use of that
1016           tape drive. There should also be "disable"/"enable" commands in
1017           the "bconsole" tool.
1018
1019    Why:   On a multi-drive jukebox there is a possibility of tape drives
1020           going bad during large backups (needing a cleaning tape run,
1021           tapes getting stuck). It would be advantageous if Bacula would
1022           automatically disable further use of a problematic tape drive
1023           after a configurable number of errors has occurred.
1024
1025           An example: I have a multi-drive jukebox (6 drives, 380+ slots)
1026           where tapes occasionally get stuck inside the drive. Bacula will
1027           notice that the "mtx-changer" command will fail and then fail
1028           any backup jobs trying to use that drive. However, it will still
1029           keep on trying to run new jobs using that drive and fail -
1030           forever, and thus failing lots and lots of jobs... Since we have
1031           many drives Bacula could have just automatically disabled
1032           further use of that drive and used one of the other ones
1033           instead.
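
               The requested behaviour amounts to an error counter per
               drive.  A minimal Python sketch, with hypothetical names and
               no connection to the actual storage daemon code:

```python
class DrivePool:
    """Disables a drive after a configurable number of fatal errors."""

    def __init__(self, drives, max_errors):
        self.max_errors = max_errors
        self.errors = {drive: 0 for drive in drives}
        self.enabled = {drive: True for drive in drives}

    def report_fatal_error(self, drive):
        self.errors[drive] += 1
        if self.errors[drive] >= self.max_errors:
            self.enabled[drive] = False  # stop scheduling jobs on this drive

    def enable(self, drive):
        """Manual re-enable, e.g. after the stuck tape has been removed."""
        self.errors[drive] = 0
        self.enabled[drive] = True

    def disable(self, drive):
        """Manual disable by the operator."""
        self.enabled[drive] = False

    def usable_drives(self):
        return [drive for drive, ok in self.enabled.items() if ok]


pool = DrivePool(["drive0", "drive1"], max_errors=3)
for _ in range(3):
    pool.report_fatal_error("drive0")
after_errors = pool.usable_drives()   # drive0 has been taken out of use
pool.enable("drive0")
after_enable = pool.usable_drives()   # operator put it back
```

               The enable() and disable() methods correspond to the console
               commands asked for under "What".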
1034
1035 Item 36:  An option to operate on all pools with update vol parameters
1036   Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
1037    Date:  16 August 2006
1038   Status:
1039
1040    What:  When I do update -> Volume parameters -> All Volumes
1041           from Pool, then I have to select pools one by one.  I'd like
1042           console to have an option like "0: All Pools" in the list of
1043           defined pools.
1044
1045    Why:   I have many pools and am therefore unhappy with manually
1046           updating each of them using update -> Volume parameters -> All
1047           Volumes from Pool -> pool #.
1048
1049 Item 37:  Add an item to the restore option where you can select a pool
1050   Origin: kshatriyak at gmail dot com
1051     Date: 1/1/2006
1052   Status:
1053
1054     What: In the restore option (Select the most recent backup for a
1055           client) it would be useful to add an option where you can limit
1056           the selection to a certain pool.
1057
1058      Why: When using cloned jobs, most of the time you have 2 pools - a
1059           disk pool and a tape pool.  People who have 2 pools would like to
1060           select the most recent backup from disk, not from tape (tape
1061           would be only needed in emergency).  However, the most recent
1062           backup (which may just differ a second from the disk backup) may
1063           be on tape and would be selected.  The problem becomes bigger if
1064           you have a full and differential - the most "recent" full backup
1065           may be on disk, while the most recent differential may be on tape
1066           (though the differential on disk may differ even only a second or
1067           so).  Bacula will complain that the backups reside on different
1068           media then.  For now, the only solution when restoring with
1069           2 pools is to manually search for the right job IDs and
1070           enter them by hand, which is rather error-prone.
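
               In Python terms, the requested option boils down to filtering
               the candidate jobs by pool before picking the most recent
               one.  The job records and field names below are invented for
               the example and do not reflect the catalog schema.

```python
def most_recent_in_pool(jobs, pool):
    """Return the latest job whose data is in the given pool, or None."""
    candidates = [job for job in jobs if job["pool"] == pool]
    return max(candidates, key=lambda job: job["end_time"], default=None)


# A cloned job: the tape copy finished one second after the disk copy.
jobs = [
    {"jobid": 101, "pool": "Disk", "end_time": 1000},
    {"jobid": 102, "pool": "Tape", "end_time": 1001},
]

from_disk = most_recent_in_pool(jobs, "Disk")
from_nowhere = most_recent_in_pool(jobs, "Offsite")
```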
1071
1072 Item 38:  Include timestamp of job launch in "stat clients" output
1073   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1074   Date:   Tue Aug 22 17:13:39 EDT 2006
1075   Status:
1076
1077   What:   The "stat clients" command doesn't include any detail on when
1078           the active backup jobs were launched.
1079
1080   Why:    Including the timestamp would make it much easier to decide whether
1081           a job is running properly.
1082
1083   Notes:  It may be helpful to have the output from "stat clients" formatted
1084           more like that from "stat dir" (and other commands), in a column
1085           format. The per-client information that's currently shown (level,
1086           client name, JobId, Volume, pool, device, Files, etc.) is good, but
1087           somewhat hard to parse (both programmatically and visually),
1088           particularly when there are many active clients.
1089
1090
1091 Item 39:  Message mailing based on backup types
1092  Origin:  Evan Kaufman <evan.kaufman@gmail.com>
1093    Date:  January 6, 2006
1094  Status:
1095
1096    What:  In the "Messages" resource definitions, allowing messages
1097           to be mailed based on the type (backup, restore, etc.) and level
1098           (full, differential, etc) of job that created the originating
1099           message(s).
1100
1101  Why:     It would, for example, allow someone's boss to be emailed
1102           automatically only when a Full Backup job runs, so he can
1103           retrieve the tapes for offsite storage, even if the IT dept.
1104           doesn't (or can't) explicitly notify him.  At the same time, his
1105           mailbox wouldn't be filled by notifications of Verifies, Restores,
1106           or Incremental/Differential Backups (which would likely be kept
1107           onsite).
1108
1109  Notes:   One way this could be done is through additional message types, for example:
1110
1111    Messages {
1112      # email the boss only on full system backups
1113      Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
1114             !verify, !admin
1115      # email us only when something breaks
1116      MailOnError = itdept@mycompany.com = all
1117    }
1118
1119
1120 Item 40:  Include JobID in spool file name ****DONE****
1121   Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
1122   Date:   Tue Aug 22 17:13:39 EDT 2006
1123   Status: Done. (patches/testing/project-include-jobid-in-spool-name.patch)
1124           No need to vote for this item.
1125
1126   What:   Change the name of the spool file to include the JobID
1127
1128   Why:    JobIDs are the common key used to refer to jobs, yet the
1129           spoolfile name doesn't include that information. The date/time
1130           stamp is useful (and should be retained).
1131
1132 ============= Empty Feature Request form ===========
1133 Item  n:  One line summary ...
1134   Date:   Date submitted
1135   Origin: Name and email of originator.
1136   Status:
1137
1138   What:   More detailed explanation ...
1139
1140   Why:    Why it is important ...
1141
1142   Notes:  Additional notes or features (omit if not used)
1143 ============== End Feature Request form ==============