Kern's ToDo List
- 02 May 2008
+ 17 July 2009
+
+Rescue:
+Add to USB key:
+ gftp sshfs kile kate lsscsi m4 mtx nfs-common nfs-server
+ patch squashfs-tools strace sg3-utils screen scsiadd
+ system-tools-backend telnet dpkg traceroute unrar usbutils
+ whois apt-file autofs busybox chkrootkit clamav dmidecode
+ manpages-dev manpages-posix manpages-posix-dev
Document:
+- package sg3-utils, program sg_map
- !!! Cannot restore two jobs at the same time that were
written simultaneously unless they were totally spooled.
- Document cleaning up the spool files:
for disaster recovery.
Professional Needs:
+- Nexenta (zfs + hardy + iscsi + nas + smf support)
+- NDMP
+ - For NAS OpenNAS
+ - ndmfs -- File Server extension in NDMPv4.
+ - ndmjob -- NDMP backup/restore NDMPv2, NDMPv3, and NDMPv4
+- Base jobs
- Migration from other vendors
- Date change
- Path change
- Detect state change of system (verify)
- Synthetic Full, Diff, Inc (Virtual, Reconstructed)
- SD to SD
-- Modules for Databases, Exchange, ...
- Novell NSS backup http://www.novell.com/coolsolutions/tools/18952.html
- Compliance norms that compare restored code hash code.
- When glibc crash, get address with
info symbol 0x809780c
- How to sync remote offices.
-- Exchange backup:
- http://www.microsoft.com/technet/itshowcase/content/exchbkup.mspx
- David's priorities
Copypools
Extract capability (#25)
Priority:
================
-- Deadlock detection, watchdog sees if counter advances when jobs are
- running. With debug on, can do a "status" command.
+
+- Why no error message if restore has no permission on the where
+ directory?
+- Possibly allow manual "purge" to purge a Volume that has not
+ yet been written (even if FirstWritten time is zero) see ua_purge.c
+ is_volume_purged().
+- Add disk block detection bsr code (make it work).
+- Remove done bsrs.
- User options for plugins.
+- Pool Storage override precedence over command line.
- Autolabel only if Volume catalog information indicates tape not
written. This will avoid overwriting a tape that gets an I/O
error on reading the volume label.
+- I/O error, SD thinks it is not the right Volume; it should check the slot
+ and then disable the Volume, but instead it asks for a mount.
- Is it possible to modify the package to create and use configuration files in
the Debian manner?
not exist, back it up, then try a full restore. It fails.
- Softlinks that point to non-existent file are not restored in restore all,
but are restored if the file is individually selected. BUG!
-- New directive "Delete purged Volumes"
- Prune by Job
- Prune by Job Level (Full, Differential, Incremental)
- Strict automatic pruning
-- Implement unmount of USB volumes.
- Use "./config no-idea no-mdc2 no-rc5" on building OpenSSL for
Win32 to avoid patent problems.
- Implement multiple jobid specification for the cancel command,
similar to what is permitted on the update slots command.
-- Implement Bacula plugins -- design API
+ - Better yet allow wild-cards or regexes.
+- Add Group resource for grouping Jobs so they can all be
+ run at the same time or canceled at the same time.
- modify pruning to keep a fixed number of versions of a file,
if requested.
- the cd-command should allow complete paths
it's faster to enter the specified directory
- Make tree walk routines like cd, ls, ... more user friendly
by handling spaces better.
+- When doing a restore, if the user does an "update slots"
+ after the job started in order to add a restore volume, the
+ values prior to the update slots will be put into the catalog.
+ Must retrieve the catalog record, merge it, then write it back at the
+ end of the restore job, if we want to do this right.
=== rate design
jcr->last_rate
jcr->last_runtime
VolSessionId and VolSessionTime.
=========================================================
+=========================================================
+ Preliminary design of Deletion of disk volumes
+
+Item 5: Deletion of disk Volumes when pruned
+ Date: Nov 25, 2005
+ Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
+ by Kern)
+ Status:
+
+ What: Provide a way for Bacula to automatically remove Volumes
+ from the filesystem, or optionally to truncate them.
+ Obviously, the Volume must be pruned prior to removal.
+
+ Why: This would allow users more control over their Volumes and
+ prevent disk based volumes from consuming too much space.
+
+ Notes: The following two directives might do the trick:
+
+ Volume Data Retention = <time period>
+ Remove Volume After = <time period>
+
+ The migration project should also remove a Volume that is
+ migrated. This might also work for tape Volumes.
+
+ Notes: (Kern). The data fields to control this have been added
+ to the new 3.0.0 database table structure.
+
+As noted above, in version 3.0.0, we added a new Media column
+named ActionOnPurge, which is a TINYINT (smallint in PostgreSQL).
+The purpose of this field is to have a flag set with each Volume
+that determines certain actions that will be performed when a
+Volume is being marked Purged (i.e. when there are no longer any
+Job records pointing to that Volume).
+
+We have envisioned that ActionOnPurge could take on the following
+values (some are exclusive and others inclusive):
+
+ Flag Value Comments
+ Delete Delete the Volume from the catalog and disk
+ What delete means for a tape is unclear.
+ Truncate Truncate the Volume
+ Erase Erase the Volume (overwrite data) could be
+ very time consuming. Erase could be specified
+ with either Truncate or Delete.
+
+Implementation details:
+- ActionOnPurge is probably a bit mask.
+- There needs to be a new Directive in the Pool resource that allows
+ setting of this flag.
+- The flag must be passed to the SD along with the current Volume information.
+- There needs to be a new command sent from the Director to the SD
+ that indicates that a Purge was done, the Volume name, and that it
+ should be handled.
+- For security reasons the SD must very carefully check that it actually
+ can find the correct volume. This means, it must mount it, read the label
+ or already have done so, and verify that the Volume is really there.
+ Then the SD can perform the requested function (delete or truncate).
+- Doing an Erase could be implemented later.
+- In the above Feature Request, the proposed Volume Data Retention
+ directive is already implemented with Volume Retention Interval.
+- In the above Feature Request, the proposed Remove Volume After is
+ a bit problematic as it means that some action must occur some time
+ later, and currently Bacula has no mechanism to handle such events.
+ This will probably be considered as a feature to be added later
+ if there is sufficient demand.
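
The ActionOnPurge notes above can be illustrated with a small bit-mask sketch. This is not Bacula code: the numeric values, the class name `ActionOnPurge`, and the helper `is_valid_action` are assumptions made only for illustration, and the rule that Delete and Truncate exclude each other is one reading of "some are exclusive and others inclusive".

```python
from enum import IntFlag


class ActionOnPurge(IntFlag):
    """Hypothetical flag values; the real values are not fixed by the notes."""
    NONE = 0
    DELETE = 1      # delete the Volume from the catalog and disk
    TRUNCATE = 2    # truncate the Volume
    ERASE = 4       # overwrite the data (may be very slow)


def is_valid_action(action):
    """Return True if the flag combination is allowed.

    Assumption: Delete and Truncate are mutually exclusive, while Erase
    may be combined with either of them.
    """
    if (action & ActionOnPurge.DELETE) and (action & ActionOnPurge.TRUNCATE):
        return False
    return True
```

Because every combination of these three flags is at most 7, the mask fits in the TINYINT/smallint column described above.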
+
+=========================================================
+
+Item 1: Ability to restart failed jobs
+ Date: 26 April 2009
+ Origin: Kern/Eric
+ Status:
+
+ What: Often jobs fail because of a communications line drop or max run time,
+ cancel, or some other non-critical problem. Currently any data
+ saved is lost. This implementation should modify the Storage daemon
+ so that it saves all the files that it knows are completely backed
+ up to the Volume.
+
+ The jobs should then be marked as incomplete and a subsequent
+ Incremental Accurate backup will then take into account all the
+ previously saved data.
+
+ Why: Avoids backing up data already saved.
+
+ Notes: Requires Accurate to restart correctly. Must have completed a minimum
+ volume of data or files stored on Volume before enabling.
+
+ Implementation notes:
+ - Must define new I job termination code for incomplete Jobs -- Done
+ - In the SD must track the position of the attributes being spooled
+ when data is actually written to the Volume -- Done
+ - In the SD, truncate the attributes to the last valid file written
+ to the Volume
+ - The Dir must pass the restart flag to SD -- Done
+ - If restart flag is sent in SD, and Job fails, must truncate attribute
+ file and send it to Dir marking the job as I (incomplete).
+ - In Dir when a Job is restarted, if there is an Incomplete job, must
+ send Accurate information to FD.
+ - In FD must use accurate information
+ - If Incomplete job finishes, must mark it T.
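
The attribute-tracking steps in the implementation notes above might look roughly like the following sketch. It is a simplified model, not the Storage daemon's actual code: the class `AttributeSpool` and its method names are hypothetical, and an in-memory buffer stands in for the real attribute spool file.

```python
import io


class AttributeSpool:
    """Toy model of an attribute spool with a 'last valid file' position."""

    def __init__(self):
        self.buf = io.BytesIO()
        self.committed = 0  # spool offset covered by data known to be on the Volume

    def add(self, record):
        # Append the attributes of one file to the end of the spool.
        self.buf.seek(0, io.SEEK_END)
        self.buf.write(record)

    def mark_written(self):
        # Call when the data for every file spooled so far has actually
        # been written to the Volume; remember the spool position.
        self.committed = self.buf.seek(0, io.SEEK_END)

    def truncate_to_committed(self):
        # On job failure, drop attributes of files that never fully reached
        # the Volume and return what would be sent to the Director.
        self.buf.truncate(self.committed)
        self.buf.seek(0)
        return self.buf.read()
```

In this model a failed job would call `truncate_to_committed()` and then be marked I (incomplete), so that a later Accurate run only has to pick up the files that are missing.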
+
+
====
Handling removable disks
===
- Fix bpipe.c so that it does not modify results pointer.
***FIXME*** calling sequence should be changed.
+
+- When reserving a device to read, check to see if the Volume
+ is already in use, if so wait. Probably will need to pass the
+ Volume. See bug #1313. Create a regression test to simulate
+ this problem and see if VolumePollInterval fixes it. Possibly turn
+ it on by default.
+
+- Fix restore of acls and extended attributes to count ERROR
+ messages and make errors non-fatal.
+- Put save/restore various platform acl/xattrs on a pointer to simplify
+ the code.
+- Add blast attributes to DIR to SD.
+- Detect deadlocks in reservations.
+- Plugins:
+ - Add list during dump
+ - Add in plugin code flag
+ - Add bRC_EndJob -- stops more calls to plugin this job
+ - Add bRC_Term (unload plugin)
+ - remove time_t from Jmsg and use utime_t?
+- Deadlock detection, watchdog sees if counter advances when jobs are
+ running. With debug on, can do a "status" command.
+- New directive "Delete purged Volumes"
+- Implement unmount of USB volumes.
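
The deadlock-detection item in the list above (a watchdog that checks whether a counter advances while jobs are running) could be sketched as follows. The class `ProgressWatchdog`, the `max_stalled_checks` threshold, and the idea of a single progress counter are assumptions for illustration; the list does not say how the counter is kept or how often it is polled.

```python
class ProgressWatchdog:
    """Flags a suspected deadlock when a progress counter stops advancing."""

    def __init__(self, max_stalled_checks=3):
        self.max_stalled_checks = max_stalled_checks
        self.last_counter = None
        self.stalled = 0

    def check(self, counter, jobs_running):
        """Call periodically; return True when a deadlock is suspected."""
        if not jobs_running or counter != self.last_counter:
            # Either nothing is running or progress was made: reset.
            self.last_counter = counter
            self.stalled = 0
            return False
        # Jobs are running but the counter did not move since the last check.
        self.stalled += 1
        return self.stalled >= self.max_stalled_checks
```

A caller would invoke `check()` from a periodic timer and, when it returns True, dump state in the way a "status" command would.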
+