+++ /dev/null
- Kern's ToDo List
- 21 September 2009
-
-Rescue:
-Add to USB key:
-    gftp sshfs kile kate lsscsi m4 mtx nfs-common nfs-server
-    patch squashfs-tools strace sg3-utils screen scsiadd
-    system-tools-backend telnet dpkg traceroute unrar usbutils
- whois apt-file autofs busybox chkrootkit clamav dmidecode
- manpages-dev manpages-posix manpages-posix-dev
-
-
-Document:
-- package sg3-utils, program sg_map
-- !!! Cannot restore two jobs at the same time that were
-  written simultaneously unless they were totally spooled.
-- Document cleaning up the spool files:
- db, pid, state, bsr, mail, conmsg, spool
-- Document the multiple-drive-changer.txt script.
-- Pruning with Admin job.
-- Does WildFile match against the full name? Document it.
-- %d and %v only valid on Director, not for ClientRunBefore/After.
-- During tests with the 260 char fix code, I found one problem:
-  if the system "sees" a long path once, it seems to forget its
-  working drive (e.g. c:\), which will lead to a problem during
- the next job (create bootstrap file will fail). Here is the
- workaround: specify absolute working and pid directory in
- bacula-fd.conf (e.g. c:\bacula\working instead of
- \bacula\working).
-- Document techniques for restoring large numbers of files.
-- Document setting my.cnf to big file usage.
-- Correct the Include syntax in the m4.xxx files in examples/conf
-- Document all the little details of setting up certificates for
- the Bacula data encryption code.
-- Document more precisely how to use master keys -- especially
- for disaster recovery.
-
-Priority:
-================
-24-Jul 09:56 rufus-fd JobId 1: VSS Writer (BackupComplete): "System Writer", State: 0x1 (VSS_WS_STABLE)
-24-Jul 09:56 rufus-fd JobId 1: Warning: VSS Writer (BackupComplete): "ASR Writer", State: 0x8 (VSS_WS_FAILED_AT_PREPARE_SNAPSHOT)
-24-Jul 09:56 rufus-fd JobId 1: VSS Writer (BackupComplete): "WMI Writer", State: 0x1 (VSS_WS_STABLE)
-- Add an external command to look up a hostname (e.g. nmblookup timmy-win7)
-nmblookup gato
-querying gato on 127.255.255.255
-querying gato on 192.168.1.255
- 192.168.1.8 gato<00>
- 192.168.1.11 gato<00>
- 192.168.1.8 gato<00>
- 192.168.1.11 gato<00>
-- Possibly allow SD to spool even if a tape is not mounted.
-- How to sync remote offices.
-- Windows Bare Metal
-- Back up the Windows system state
-- Complete Job restart
-- Look at rsync for incremental updates and deduplication
-- Implement rwlock() for SD that takes why and can_steal to replace
- existing block/lock mechanism. rlock() would allow multiple readers
- wlock would allow only one writer.
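The rlock/wlock semantics described in the item above can be sketched in a few lines of Python (a minimal illustrative sketch only; the real SD lock would be in C and would also carry the "why" and can_steal bookkeeping, which is omitted here):

```python
import threading

class RWLock:
    """Minimal reader-writer lock: rlock() admits many concurrent
    readers, wlock() admits a single writer, and a writer waits
    until all readers have released. Names are illustrative."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def rlock(self):
        with self._cond:
            while self._writer:          # readers wait out any writer
                self._cond.wait()
            self._readers += 1

    def runlock(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake a waiting writer

    def wlock(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()        # one writer, no readers
            self._writer = True

    def wunlock(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

The key property over the existing block/lock mechanism would be that status-style readers no longer serialize against each other.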
-- For Windows disaster recovery see http://unattended.sf.net/
-- Add "before=" "olderthan=" to FileSet for doing Base of
- unchanged files.
-- Show files/second in client status output.
-- Don't attempt to restore from "Disabled" Volumes.
-- Have SD compute MD5 or SHA1 and compare to what FD computes.
-- Make VolumeToCatalog calculate an MD5 or SHA1 from the
- actual data on the Volume and compare it.
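The two digest items above amount to: recompute an MD5/SHA1 over the actual Volume data and compare it with the digest the FD reported. A minimal Python sketch of that comparison (function names and the chunk interface are illustrative, not Bacula APIs):

```python
import hashlib

def stream_digest(chunks, algo="sha1"):
    """Digest a stream of data blocks incrementally, as the SD could
    while despooling or while scanning a Volume."""
    h = hashlib.new(algo)
    for block in chunks:
        h.update(block)
    return h.hexdigest()

def digests_match(fd_digest, volume_chunks, algo="sha1"):
    """Compare the FD-reported digest with one recomputed from the
    Volume data; a mismatch would flag corruption."""
    return stream_digest(volume_chunks, algo) == fd_digest
```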
-- Remove queue.c code.
-- Implement multiple jobid specification for the cancel command,
- similar to what is permitted on the update slots command.
-- Ensure that the SD re-reads the Media record if the JobFiles
- does not match -- it may have been updated by another job.
-- Add MD5 or SHA1 check in SD for data validation
-- When reserving a device to read, check to see if the Volume
- is already in use, if so wait. Probably will need to pass the
- Volume. See bug #1313. Create a regression test to simulate
- this problem and see if VolumePollInterval fixes it. Possibly turn
- it on by default.
-
-- Page hash tables
-- Deduplication
-- Why no error message if restore has no permission on the where
- directory?
-- Possibly allow manual "purge" to purge a Volume that has not
- yet been written (even if FirstWritten time is zero) see ua_purge.c
- is_volume_purged().
-- Add disk block detection bsr code (make it work).
-- Remove done bsrs.
-- Detect deadlocks in reservations.
-- Plugins:
- - Add list during dump
- - Add in plugin code flag
- - Add bRC_EndJob -- stops more calls to plugin this job
- - Add bRC_Term (unload plugin)
- - remove time_t from Jmsg and use utime_t?
-- Deadlock detection, watchdog sees if counter advances when jobs are
- running. With debug on, can do a "status" command.
-- User options for plugins.
-- Pool Storage override precedence over command line.
-- Autolabel only if Volume catalog information indicates tape not
- written. This will avoid overwriting a tape that gets an I/O
- error on reading the volume label.
-- On an I/O error, the SD thinks it is not the right Volume; it should check
-  the slot and then disable the volume, but instead it asks for a mount.
-- Would it be possible to modify the package to create and use configuration
-  files in the Debian manner?
-
- For example:
-
- /etc/bacula/bacula-dir.conf
- /etc/bacula/conf.d/pools.conf
- /etc/bacula/conf.d/clients.conf
- /etc/bacula/conf.d/storages.conf
-
- and into bacula-dir.conf file include
-
- @/etc/bacula/conf.d/pools.conf
- @/etc/bacula/conf.d/clients.conf
- @/etc/bacula/conf.d/storages.conf
-- Possibly add an Inconsistent state when a Volume is in error
- for non I/O reasons.
-- Fix #ifdefing so that smartalloc can be disabled. Check manual
- -- the default is enabled.
-- Dangling softlinks are not restored properly. For example, take a
- soft link such as src/testprogs/install-sh, which points to /usr/share/autoconf...
- move the directory to another machine where the file /usr/share/autoconf does
- not exist, back it up, then try a full restore. It fails.
-- Softlinks that point to non-existent file are not restored in restore all,
- but are restored if the file is individually selected. BUG!
-- Prune by Job
-- Prune by Job Level (Full, Differential, Incremental)
-- modify pruning to keep a fixed number of versions of a file,
- if requested.
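The keep-N-versions pruning idea above can be sketched as follows (an illustrative Python sketch only; in the catalog this would be an SQL delete keyed on File/Job records, and the record layout here is invented for the example):

```python
def prune_versions(records, keep=3):
    """Given (filename, jobid) records with jobid increasing over time,
    return the records to delete so that only the `keep` newest versions
    of each file remain."""
    by_file = {}
    for name, jobid in records:
        by_file.setdefault(name, []).append(jobid)
    doomed = []
    for name, jobids in by_file.items():
        # everything but the newest `keep` versions is prunable
        old = sorted(jobids)[:-keep] if keep > 0 else sorted(jobids)
        for jobid in old:
            doomed.append((name, jobid))
    return doomed
```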
-- the cd-command should allow complete paths
- i.e. cd /foo/bar/foo/bar
- -> if a customer mails me the path to a certain file,
-   it's faster to enter the specified directory
-- Make tree walk routines like cd, ls, ... more user friendly
- by handling spaces better.
-- When doing a restore, if the user does an "update slots"
- after the job started in order to add a restore volume, the
- values prior to the update slots will be put into the catalog.
-  Must retrieve the catalog record, merge it, then write it back at the
-  end of the restore job, if we want to do this right.
-=== rate design
- jcr->last_rate
- jcr->last_runtime
-   rate = (bytes - last_bytes) / (runtime - last_runtime)
-   MA = (last_MA * 3 + rate) / 4
-===
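The rate design above (instantaneous rate over the last interval, folded into a 3/4-weighted moving average) can be sketched in Python; the jcr field names come from the note, the function itself is illustrative:

```python
def update_rate(bytes_done, runtime, last_bytes, last_runtime, last_ma):
    """Compute the transfer rate over the last interval and fold it
    into the moving average MA = (last_MA * 3 + rate) / 4."""
    if runtime <= last_runtime:   # no time elapsed yet; keep old average
        return last_ma
    rate = (bytes_done - last_bytes) / (runtime - last_runtime)
    return (last_ma * 3 + rate) / 4
```

The 3/4 weighting smooths out per-interval jitter while still tracking sustained rate changes within a few intervals.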
-- Add a recursive mark command (rmark) to restore.
-- Look at simplifying File exclusions.
-- Scripts
-- Separate Files and Directories in catalog
-- Create FileVersions table
-- finish implementation of fdcalled -- see ua_run.c:105
-- Fix problem in postgresql.c in my_postgresql_query, where the
- generation of the error message doesn't differentiate result==NULL
- and a bad status from that result. Not only that, the result is
- cleared on a bail_out without having generated the error message.
-- Implement SDErrors (must return from SD)
-- Implement continue spooling while despooling.
-- Remove all install temp files in Win32 PLUGINSDIR.
-- Restore with no "where" specified causes a crash.
-- Performance: multiple spool files for a single job.
-- Performance: despool attributes when despooling data (problem
- multiplexing Dir connection).
-- Implement wait_for_sysop() message display in wait_for_device(), which
- now prints warnings too often.
-- Ensure that each device in an Autochanger has a different
- Device Index.
-- Look at sg_logs -a /dev/sg0 for getting soft errors.
-- btape "test" command with Offline on Unmount = yes
-
- This test is essential to Bacula.
-
- I'm going to write one record in file 0,
- two records in file 1,
- and three records in file 2
-
- 02-Feb 11:00 btape: ABORTING due to ERROR in dev.c:715
- dev.c:714 Bad call to rewind. Device "LTO" (/dev/nst0) not open
- 02-Feb 11:00 btape: Fatal Error because: Bacula interrupted by signal 11: Segmentation violation
- Kaboom! btape, btape got signal 11. Attempting traceback.
-
-- Encryption -- email from Landon
- > The backup encryption algorithm is currently not configurable, and is
- > set to AES_128_CBC in src/filed/backup.c. The encryption code
- > supports a number of different ciphers (as well as adding arbitrary
- > new ones) -- only a small bit of code would be required to map a
- > configuration string value to a CRYPTO_CIPHER_* value, if anyone is
- > interested in implementing this functionality.
-
-- Add the OS version back to the Win32 client info.
-- Restarted jobs have a NULL in the from field.
-- Modify SD status command to indicate when the SD is writing
- to a DVD (the device is not open -- see bug #732).
-- Look at the possibility of adding "SET NAMES UTF8" for MySQL,
- and possibly changing the blobs into varchar.
-- Test Volume compatibility between machine architectures
-- Encryption documentation
-
-Professional Needs:
-- Migration from other vendors
- - Date change
- - Path change
-- Filesystem types
-- Backup conf/exe (all daemons)
-- Detect state change of system (verify)
-- SD to SD
-- Novell NSS backup http://www.novell.com/coolsolutions/tools/18952.html
-- Compliance norms that compare hash codes of restored files.
-- David's priorities
- Copypools
- Extract capability (#25)
- Threshold triggered migration jobs (not currently in list, but will be
- needed ASAP)
- Client triggered backups
- Complete rework of the scheduling system (not in list)
- Performance and usage instrumentation (not in list)
- See email of 21Aug2007 for details.
-- Look at: http://tech.groups.yahoo.com/group/cfg2html
- and http://www.openeyet.nl/scc/ for managing customer changes
-
-Projects:
-- Pool enhancements
- - Access Mode = Read-Only, Read-Write, Unavailable, Destroyed, Offsite
- - Pool Type = Copy
- - Maximum number of scratch volumes
- - Maximum File size
- - Next Pool (already have)
- - Reclamation threshold
- - Reclamation Pool
- - Reuse delay (after all files purged from volume before it can be used)
- - Copy Pool = xx, yyy (or multiple lines).
- - Catalog = xxx
- - Allow pool selection during restore.
-
-- Average tape size from Eric
-  SELECT COALESCE(media_avg_size.volavg,0) * count(Media.MediaId) AS volmax,
- count(Media.MediaId) AS volnum,
- sum(Media.VolBytes) AS voltotal,
- Media.PoolId AS PoolId,
- Media.MediaType AS MediaType
- FROM Media
- LEFT JOIN (SELECT avg(Media.VolBytes) AS volavg,
- Media.MediaType AS MediaType
- FROM Media
- WHERE Media.VolStatus = 'Full'
- GROUP BY Media.MediaType
- ) AS media_avg_size ON (Media.MediaType = media_avg_size.MediaType)
- GROUP BY Media.MediaType, Media.PoolId, media_avg_size.volavg
-- Performance
- - Despool attributes in separate thread
- - Database speedups
- - Embedded MySQL
- - Check why restore repeatedly sends Rechdrs between
- each data chunk -- according to James Harper 9Jan07.
-- Features
- - Better scheduling
- - More intelligent re-run
- - Incremental backup -- rsync, Stow
-
-- Make Bacula by default not backup tmpfs, procfs, sysfs, ...
-- Fix hardlinked immutable files: when linking a second file, the
-  immutable flag must be removed prior to trying to link it.
-- Change dbcheck to tell users to use native tools for fixing
- broken databases, and to ensure they have the proper indexes.
-- add udev rules for Bacula devices.
-- If a job terminates, the DIR connection can close before the
- Volume info is updated, leaving the File count wrong.
-- Look at why SIGPIPE during connection can cause seg fault in
- writing the daemon message, when Dir dropped to bacula:bacula
-- Look at zlib 32 => 64 problems.
-- Fix bextract to restore ACLs, or better yet, use common routines.
-- New dot commands from Arno.
- .show device=xxx lists information from one storage device, including
- devices (I'm not even sure that information exists in the DIR...)
- .move eject device=xxx mostly the same as 'unmount xxx' but perhaps with
- better machine-readable output like "Ok" or "Error busy"
- .move eject device=xxx toslot=yyy the same as above, but with a new
- target slot. The catalog should be updated accordingly.
- .move transfer device=xxx fromslot=yyy toslot=zzz
-
-Low priority:
-- Article: http://www.heise.de/open/news/meldung/83231
-- Article: http://www.golem.de/0701/49756.html
-- Article: http://lwn.net/Articles/209809/
-- Article: http://www.onlamp.com/pub/a/onlamp/2004/01/09/bacula.html
-- Article: http://www.linuxdevcenter.com/pub/a/linux/2005/04/07/bacula.html
-- Article: http://www.osreviews.net/reviews/admin/bacula
-- Article: http://www.debianhelp.co.uk/baculaweb.htm
-- Article:
-- Wikis mentioning Bacula
- http://wiki.finkproject.org/index.php/Admin:Backups
- http://wiki.linuxquestions.org/wiki/Bacula
- http://www.openpkg.org/product/packages/?package=bacula
- http://www.iterating.com/products/Bacula
- http://net-snmp.sourceforge.net/wiki/index.php/Net-snmp_extensions
- http://www.section6.net/wiki/index.php/Using_Bacula_for_Tape_Backups
- http://bacula.darwinports.com/
- http://wiki.mandriva.com/en/Releases/Corporate/Server_4/Notes#Bacula
- http://en.wikipedia.org/wiki/Bacula
-
-- Bacula Wikis
- http://www.devco.net/pubwiki/Bacula/
- http://paramount.ind.wpi.edu/wiki/doku.php
- http://gentoo-wiki.com/HOWTO_Backup
- http://www.georglutz.de/wiki/Bacula
- http://www.clarkconnect.com/wiki/index.php?title=Modules_-_LAN_Backup/Recovery
- http://linuxwiki.de/Bacula (in German)
-
-- Figure out how to configure query.sql. Suggestion to use m4:
- == changequote.m4 ===
- changequote(`[',`]')dnl
- ==== query.sql.in ===
- :List next 20 volumes to expire
- SELECT
- Pool.Name AS PoolName,
- Media.VolumeName,
- Media.VolStatus,
- Media.MediaType,
- ifdef([MySQL],
-    [ FROM_UNIXTIME(UNIX_TIMESTAMP(Media.LastWritten) + Media.VolRetention) AS Expire, ])dnl
- ifdef([PostgreSQL],
- [ media.lastwritten + interval '1 second' * media.volretention as expire, ])dnl
- Media.LastWritten
- FROM Pool
- LEFT JOIN Media
- ON Media.PoolId=Pool.PoolId
- WHERE Media.LastWritten>0
- ORDER BY Expire
- LIMIT 20;
- ====
-   Command: m4 -DMySQL changequote.m4 query.sql.in >query.sql
-
- The problem is that it requires m4, which is not present on all machines
- at ./configure time.
-
-==== SQL
-# get null file
-select FilenameId from Filename where Name='';
-# Get list of all directories referenced in a Backup.
-select Path.Path from Path,File where File.JobId=nnn and
- File.FilenameId=(FilenameId-from-above) and File.PathId=Path.PathId
- order by Path.Path ASC;
-
-- Mount on an Autochanger with no tape in the drive causes:
- Automatically selected Storage: LTO-changer
- Enter autochanger drive[0]: 0
- 3301 Issuing autochanger "loaded drive 0" command.
- 3302 Autochanger "loaded drive 0", result: nothing loaded.
- 3301 Issuing autochanger "loaded drive 0" command.
- 3302 Autochanger "loaded drive 0", result: nothing loaded.
- 3902 Cannot mount Volume on Storage Device "LTO-Drive1" (/dev/nst0) because:
- Couldn't rewind device "LTO-Drive1" (/dev/nst0): ERR=dev.c:678 Rewind error on "LTO-Drive1" (/dev/nst0). ERR=No medium found.
- 3905 Device "LTO-Drive1" (/dev/nst0) open but no Bacula volume is mounted.
- If this is not a blank tape, try unmounting and remounting the Volume.
-- If Drive 0 is blocked, and drive 1 is set "Autoselect=no", drive 1 will
- be used.
-- Autochanger did not change volumes.
- select * from Storage;
- +-----------+-------------+-------------+
- | StorageId | Name | AutoChanger |
- +-----------+-------------+-------------+
- | 1 | LTO-changer | 0 |
- +-----------+-------------+-------------+
- 05-May 03:50 roxie-sd: 3302 Autochanger "loaded drive 0", result is Slot 11.
- 05-May 03:50 roxie-sd: Tibs.2006-05-05_03.05.02 Warning: Director wanted Volume "LT
- Current Volume "LT0-002" not acceptable because:
- 1997 Volume "LT0-002" not in catalog.
- 05-May 03:50 roxie-sd: Tibs.2006-05-05_03.05.02 Error: Autochanger Volume "LT0-002"
- Setting InChanger to zero in catalog.
- 05-May 03:50 roxie-dir: Tibs.2006-05-05_03.05.02 Error: Unable to get Media record
-
- 05-May 03:50 roxie-sd: Tibs.2006-05-05_03.05.02 Fatal error: Error getting Volume i
- 05-May 03:50 roxie-sd: Tibs.2006-05-05_03.05.02 Fatal error: Job 530 canceled.
- 05-May 03:50 roxie-sd: Tibs.2006-05-05_03.05.02 Fatal error: spool.c:249 Fatal appe
- 05-May 03:49 Tibs: Tibs.2006-05-05_03.05.02 Fatal error: c:\cygwin\home\kern\bacula
- , got
- (missing)
- llist volume=LTO-002
- MediaId: 6
- VolumeName: LTO-002
- Slot: 0
- PoolId: 1
- MediaType: LTO-2
- FirstWritten: 2006-05-05 03:11:54
- LastWritten: 2006-05-05 03:50:23
- LabelDate: 2005-12-26 16:52:40
- VolJobs: 1
- VolFiles: 0
- VolBlocks: 1
- VolMounts: 0
- VolBytes: 206
- VolErrors: 0
- VolWrites: 0
- VolCapacityBytes: 0
- VolStatus:
- Recycle: 1
- VolRetention: 31,536,000
- VolUseDuration: 0
- MaxVolJobs: 0
- MaxVolFiles: 0
- MaxVolBytes: 0
- InChanger: 0
- EndFile: 0
- EndBlock: 0
- VolParts: 0
- LabelType: 0
- StorageId: 1
-
- Note VolStatus is blank!!!!!
- llist volume=LTO-003
- MediaId: 7
- VolumeName: LTO-003
- Slot: 12
- PoolId: 1
- MediaType: LTO-2
- FirstWritten: 0000-00-00 00:00:00
- LastWritten: 0000-00-00 00:00:00
- LabelDate: 2005-12-26 16:52:40
- VolJobs: 0
- VolFiles: 0
- VolBlocks: 0
- VolMounts: 0
- VolBytes: 1
- VolErrors: 0
- VolWrites: 0
- VolCapacityBytes: 0
- VolStatus: Append
- Recycle: 1
- VolRetention: 31,536,000
- VolUseDuration: 0
- MaxVolJobs: 0
- MaxVolFiles: 0
- MaxVolBytes: 0
- InChanger: 0
- EndFile: 0
- EndBlock: 0
- VolParts: 0
- LabelType: 0
- StorageId: 1
-===
- mount
- Automatically selected Storage: LTO-changer
- Enter autochanger drive[0]: 0
- 3301 Issuing autochanger "loaded drive 0" command.
- 3302 Autochanger "loaded drive 0", result: nothing loaded.
- 3301 Issuing autochanger "loaded drive 0" command.
- 3302 Autochanger "loaded drive 0", result: nothing loaded.
- 3902 Cannot mount Volume on Storage Device "LTO-Drive1" (/dev/nst0) because:
- Couldn't rewind device "LTO-Drive1" (/dev/nst0): ERR=dev.c:678 Rewind error on "LTO-Drive1" (/dev/nst0). ERR=No medium found.
-
- 3905 Device "LTO-Drive1" (/dev/nst0) open but no Bacula volume is mounted.
- If this is not a blank tape, try unmounting and remounting the Volume.
-
-- http://www.dwheeler.com/essays/commercial-floss.html
-- Add VolumeLock to prevent all but lock holder (SD) from updating
- the Volume data (with the exception of VolumeState).
-- The btape fill command does not seem to use the Autochanger
-- Make Windows installer default to system disk drive.
-- Look at using ioctl(FIOBMAP, ...) on Linux, and
- DeviceIoControl(..., FSCTL_QUERY_ALLOCATED_RANGES, ...) on
- Win32 for sparse files.
- http://www.flexhex.com/docs/articles/sparse-files.phtml
- http://www.informatik.uni-frankfurt.de/~loizides/reiserfs/fibmap.html
-- Directive: at <event> "command"
-- Command: pycmd "command" generates "command" event. How to
- attach to a specific job?
-- run_cmd() returns int should return JobId_t
-- get_next_jobid_from_list() returns int should return JobId_t
-- Document export LDFLAGS=-L/usr/lib64
-- Network error on Win32 should set Win32 error code.
-- What happens when you rename a Disk Volume?
-- Job retention period in a Pool (and hence Volume). The job would
- then be migrated.
-- Add Win32 FileSet definition somewhere
-- Look at fixing restore status stats in SD.
-- Look at using ioctl(FIMAP) and FIGETBSZ for sparse files.
- http://www.informatik.uni-frankfurt.de/~loizides/reiserfs/fibmap.html
-- Implement a mode that says when a hard read error is
- encountered, read many times (as it currently does), and if the
- block cannot be read, skip to the next block, and try again. If
- that fails, skip to the next file and try again, ...
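The retry-then-skip read mode described above can be sketched as follows (an illustrative Python sketch; `read_block` is a hypothetical stand-in for the device read and is assumed to raise IOError on a bad block):

```python
def read_with_skip(read_block, positions, retries=3):
    """Try each position up to `retries` times; on persistent failure,
    skip ahead to the next block rather than aborting, collecting
    whatever could be read."""
    good, skipped = [], []
    for pos in positions:
        for attempt in range(retries):
            try:
                good.append(read_block(pos))
                break                 # this block read successfully
            except IOError:
                continue              # retry the same block
        else:
            skipped.append(pos)       # give up on this block, move on
    return good, skipped
```

The same escalation (block, then file) would apply at the next level up: if skipping blocks within a file keeps failing, skip to the next file.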
-- Add level table:
- create table LevelType (LevelType binary(1), LevelTypeLong tinyblob);
- insert into LevelType (LevelType,LevelTypeLong) values
- ("F","Full"),
- ("D","Diff"),
- ("I","Inc");
-- new pool XXX with ScratchPoolId = MyScratchPool's PoolId and
- let it fill itself, and RecyclePoolId = XXX's PoolId so I can
- see if it become stable and I just have to supervise
- MyScratchPool
-- If I want to remove this pool, I set RecyclePoolId = MyScratchPool's
- PoolId, and when it is empty remove it.
-- Add Volume=SCRTCH
-- Allow Check Labels to be used with Bacula labels.
-- "Resuming" a failed backup (lost line for example) by using the
- failed backup as a sort of "base" job.
-- Look at NDMP
-- Email the user x days before a tape will need changing.
-- Command to show next tape that will be used for a job even
- if the job is not scheduled.
-- From: Arunav Mandal <amandal@trolltech.com>
- 1. When jobs are running and bacula for some reason crashes or is
-   restarted, it should remember the jobs it was running before it crashed
-   or restarted; as of now I lose all jobs if I restart it.
-
- 2. When spooling, if the client (for instance a laptop) is disconnected
-   midway, bacula completely discards the spool. It would be nice if it
-   could write that spool to tape so there would be some backup for that
-   client, if not all of it.
-
- 3. We have around 150 client machines; it would be nice to have an option
-   to upgrade the bacula version on all the client machines automatically.
-
- 4. At least one connection should be reserved for bconsole, so that under
-   heavy load I can still connect to the director via bconsole, which at
-   times I currently can't.
-
- 5. Another important missing feature: say at 10am I manually started a
-   backup of client abc, and it was a full backup since client abc had no
-   backup history; at 10:30am bacula again automatically started a backup of
-   client abc because that was in the schedule. Now we have two Full backups
-   of the same client, and if we again try to start a full backup of client
-   abc, bacula won't complain. That should be fixed.
-
-- regardless of the retention period, Bacula will not prune the
- last Full, Diff, or Inc File data until a month after the
- retention period for the last Full backup that was done.
-- update volume=xxx --- add status=Full
-- Remove old spool files on startup.
-- Exclude SD spool/working directory.
-- Refuse to prune last valid Full backup. Same goes for Catalog.
-- Python:
- - Make a callback when Rerun failed levels is called.
- - Give Python program access to Scheduled jobs.
- - Add setting Volume State via Python.
- - Python script to save with Python, not save, save with Bacula.
- - Python script to do backup.
- - What events?
- - Change the Priority, Client, Storage, JobStatus (error)
- at the start of a job.
-- Why is SpoolDirectory = /home/bacula/spool; not reported
- as an error when writing a DVD?
-- Make bootstrap file handle multiple MediaTypes (SD)
-- Remove all old Device resource code in Dir and code to pass it
- back in SD -- better, rework it to pass back device statistics.
-- Check locking of resources -- be sure to lock devices where previously
- resources were locked.
-- The last part is left in the spool dir.
-
-
-- In restore don't compare byte count on a raw device -- directory
- entry does not contain bytes.
-
-
-- Max Vols limit in Pool off by one?
-- Implement Files/Bytes,... stats for restore job.
-- Implement Total Bytes Written, ... for restore job.
-- Despool attributes simultaneously with data in a separate
- thread, rejoined at end of data spooling.
-- Implement new Console commands to allow offlining/reserving drives,
-  and possibly manipulating the autochanger (much asked for).
-- Add start/end date editing in messages (%t %T, %e?) ...
-- Add ClientDefs similar to JobDefs.
-- Print more info when bextract -p accepts a bad block.
-- Fix FD JobType to be set before RunBeforeJob in FD.
-- Look at adding full Volume and Pool information to a Volume
- label so that bscan can get *all* the info.
-- If the user puts "Purge Oldest Volume = yes" or "Recycle Oldest Volume = yes"
- and there is only one volume in the pool, refuse to do it -- otherwise
- he fills the Volume, then immediately starts reusing it.
-- Implement copies and stripes.
-- Add history file to console.
-- Each file on tape creates a JobMedia record. Peter has 4 million
- files spread over 10000 tape files and four tapes. A restore takes
- 16 hours to build the restore list.
-- Add an option to check if the file size changed during backup.
-- Make sure SD deletes spool files on error exit.
-- Delete old spool files when SD starts.
-- When labeling tapes, if you enter 000026, Bacula uses
- the tape index rather than the Volume name 000026.
-- Add offline tape command to Bacula console.
-- Bug:
- Enter MediaId or Volume name: 32
- Enter new Volume name: DLT-20Dec04
- Automatically selected Pool: Default
- Connecting to Storage daemon DLTDrive at 192.168.68.104:9103 ...
- Sending relabel command from "DLT-28Jun03" to "DLT-20Dec04" ...
- block.c:552 Write error at 0:0 on device /dev/nst0. ERR=Bad file descriptor.
- Error writing final EOF to tape. This tape may not be readable.
- dev.c:1207 ioctl MTWEOF error on /dev/nst0. ERR=Permission denied.
- askdir.c:219 NULL Volume name. This shouldn't happen!!!
- 3912 Failed to label Volume: ERR=dev.c:1207 ioctl MTWEOF error on /dev/nst0. ERR=Permission denied.
- Label command failed for Volume DLT-20Dec04.
- Do not forget to mount the drive!!!
-- Bug: if a job is manually scheduled to run later, it does not appear
- in any status report and cannot be cancelled.
-
-
-====
-From David:
-How about introducing a Type = MgmtPolicy job type? That job type would
-be responsible for scanning the Bacula environment looking for specific
-conditions, and submitting the appropriate jobs for implementing said
-policy, eg:
-
-Job {
- Name = "Migration-Policy"
- Type = MgmtPolicy
- Policy Selection Job Type = Migrate
- Scope = "<keyword> <operator> <regexp>"
- Threshold = "<keyword> <operator> <regexp>"
- Job Template = <template-name>
-}
-
-Where <keyword> is any legal job keyword, <operator> is a comparison
-operator (=,<,>,!=, logical operators AND/OR/NOT) and <regexp> is a
-appropriate regexp. I could see an argument for Scope and Threshold
-being SQL queries if we want to support full flexibility. The
-Migration-Policy job would then get scheduled as frequently as a site
-felt necessary (suggested default: every 15 minutes).
-
-Example:
-
-Job {
- Name = "Migration-Policy"
- Type = MgmtPolicy
- Policy Selection Job Type = Migration
- Scope = "Pool=*"
- Threshold = "Migration Selection Type = LowestUtil"
- Job Template = "MigrationTemplate"
-}
-
-would select all pools for examination and generate a job based on
-MigrationTemplate to automatically select the volume with the lowest
-usage and migrate its contents to the nextpool defined for that pool.
-
-This policy abstraction would be really handy for adjusting the behavior
-of Bacula according to site-selectable criteria (one thing that pops
-into mind is Amanda's ability to automatically adjust backup levels
-depending on various criteria).
-
-
-=====
-
-Regression tests:
-- Add Pool/Storage override regression test.
-- Add delete JobId to regression.
-- Add a regression test for dbcheck.
-- New test to add bscan to four-concurrent-jobs regression,
- i.e. after the four-concurrent jobs zap the
- database as is done in the bscan-test, then use bscan to
- restore the database, do a restore and compare with the
- original.
-- Add restore of specific JobId to regression (item 3
- on the restore prompt)
-- Add IPv6 to regression
-- Add database test to regression. Test each function like delete,
- purge, ...
-
-- AntiVir can slow down backups on Win32 systems.
-- Win32 systems with FAT32 can be much slower than NTFS for
- more than 1000 files per directory.
-
-
-1.37 Possibilities:
-- A HOLD command to stop all jobs from starting.
-- A PAUSE command to pause all running jobs ==> release the
- drive.
-- Media Type = LTO,LTO-2,LTO-3
- Media Type Read = LTO,LTO2,LTO3
- Media Type Write = LTO2, LTO3
-
-=== From Carsten Menke <bootsy52@gmx.net>
-
-Following is a list of what I think, in the situations I'm faced with,
-could be useful enhancements to bacula, which I'm certain other users will
-benefit from as well.
-
-1. NextJob/NextJobs Directive within a Job Resource in the form of
- NextJobs = job1,job2.
-
-  Why:
-  I currently solve the problem of running multiple jobs one after another
-  by setting the Max Wait Time for a job to 8 hours and giving
-  the jobs different Priorities. However, there are scenarios where
-  one job directly depends on another, so if the former job fails,
-  the job after it need not run,
-  while other jobs should perhaps still run despite that.
-
-Example:
-  A Backup job and a Verify job: if the backup job fails, there is no need to
-  run the verify job, as the backup job already failed. However, one may still
-  like to back up the Catalog to disk even though the main backup job failed.
-
-Notes:
-  I see that this is related to the Event Handlers which are on the ToDo
-  list; it may also be a good idea to check the return value and
-  execute different actions based on it.
-
-
-3. offline capability to bconsole
-
- Why:
- Currently I use a script which I execute within the last Job via the
- RunAfterJob Directive, to release and eject the tape.
- So I have to call bconsole "release=Storage-Name" and afterwards
- mt -f /dev/nst0 eject to get the tape out.
-
-  If I have multiple Storage Devices, then these may not be /dev/nst0 and
-  I have to modify the script or call it with parameters, etc.
-  This would actually not be needed, as everything is already defined
-  in bacula-sd.conf, and if I can invoke bconsole with the
-  storage name via $1 in the script, then I'm done and information is
-  not duplicated.
-
-4. %s for Storage Name added to the chars being substituted in "RunAfterJob"
-
- Why:
-
- For the reason mentioned in 3. to have the ability to call a
- script with /scripts/foobar %s and in the script use $1
- to pass the Storage Name to bconsole
-
-5. Setting Volume State within a Job Resource
-
- Why:
- Instead of using "Maximum Volume Jobs" in the Pool Resource,
-  I would have the possibility to define
-  in a Job Resource that after this certain job is run, the Volume State
-  should be set to "Volume State = Used"; this gives more flexibility (IMHO).
-
-7. OK, this is evil, probably bound to security risks and maybe not possible
- due to the design of bacula.
-
-  Implementation of backticks ( `command` ) for shell command execution in
-  the "Label Format" Directive.
-
-Why:
-
- Currently I have defined BACULA_DAY_OF_WEEK="day1|day2..." resulting in
- Label Format = "HolyBackup-${BACULA_DAY_OF_WEEK[${WeekDay}]}". If I could
- use backticks than I could use "Label Format = HolyBackup-`date +%A` to have
- the localized name for the day of the week appended to the
- format string. Then I have the tape labeled automatically with weekday
- name in the correct language.
-==========
-- Make output from status use html table tags for nicely
- presenting in a browser.
-- Browse generations of files.
-- I've seen an error when my catalog's File table fills up. I
- then have to recreate the File table with a larger maximum row
- size. Relevant information is at
- http://dev.mysql.com/doc/mysql/en/Full_table.html ; I think the
- "Installing and Configuring MySQL" chapter should talk a bit
- about this potential problem, and recommend a solution.
-- Want speed of writing to tape while despooling.
-- Supported autochanger:
-OS: Linux
-Man.: HP
-Media: LTO-2
-Model: SSL1016
-Slots: 16
-Cap: 200GB
-- Supported drive:
- Wangtek 6525ES (SCSI-1 QIC drive, 525MB), under Linux 2.4.something,
- bacula 1.36.0/1 works with blocksize 16k INSIDE bacula-sd.conf.
-- Add regex from http://www.pcre.org to Bacula for Win32.
-- Use only shell tools, not make, in the CDROM package.
-- Does an Include within an Include work? Check.
-- Implement a Pool of type Cleaning?
-- Think about making certain database errors fatal.
-- Look at correcting the time jump in the scheduler for daylight
- savings time changes.
-- Check dates entered by user for correctness (month/day/... ranges)
-- Compress restore Volume listing by date and first file.
-- Look at patches/bacula_db.b2z postgresql that loops during restore.
- See Gregory Wright.
-- Perhaps add read/write programs and/or plugins to FileSets.
-- How to handle backing up portables ...
-- Limit bandwidth
-
-Documentation to do: (any release a little bit at a time)
-- Document the need to unmount before removing a magazine.
-- Alternative to static linking: run "ldd prog", save all the libraries
-  listed, restore them, and point LD_LIBRARY_PATH at them.
-- Document add "</dev/null >/dev/null 2>&1" to the bacula-fd command line
-- Document query file format.
-- Add more documentation for bsr files.
-- Document problems with Verify and pruning.
-- Document how to use multiple databases.
-- VXA drives have a "cleaning required"
- indicator, but Exabyte recommends preventive cleaning after every 75
- hours of operation.
- From Phil:
- In this context, it should be noted that Exabyte has a command-line
- vxatool utility available for free download. (The current version is
- vxatool-3.72.) It can get diagnostic info, read, write and erase tapes,
- test the drive, unload tapes, change drive settings, flash new firmware,
- etc.
- Of particular interest in this context is that vxatool <device> -i will
- report, among other details, the time since last cleaning in tape motion
- minutes. This information can be retrieved (and settings changed, for
- that matter) through the generic-SCSI device even when Bacula has the
- regular tape device locked. (Needless to say, I don't recommend
- changing tape settings while a job is running.)
-- Lookup HP cleaning recommendations.
-- Lookup HP tape replacement recommendations (see trouble shooting autochanger)
-- Document doing table repair
-
-
-===================================
-- Add macro expansions in JobDefs.
- Run Before Job = "SomeFile %{Level} %{Client}"
- Write Bootstrap="/some/dir/%{JobName}_%{Client}.bsr"
-- Use non-blocking network I/O but if no data is available, use
- select().
-- Use gather write() for network I/O.
-- Autorestart on crash.
-- Add bandwidth limiting.
-- When an error in input occurs and conio beeps, you can back
- up through the prompt.
-- Detect fixed tape block mode during positioning by looking at
- block numbers in btape "test". Possibly adjust in Bacula.
-- Fix list volumes to output volume retention in some other
- units, perhaps via a directive.
-- If you use restore replace=never, the directory attributes for
- non-existent directories will not be restored properly.
-
-- see lzma401.zip in others directory for new compression
- algorithm/library.
-- Allow the user to select JobType for manual pruning/purging.
-- bscan does not put first of two volumes back with all info in
- bscan-test.
-- Figure out how to make named console messages go only to that
- console and to the non-restricted console (new console class?).
-- Make restricted console prompt for password if *ask* is set or
- perhaps if password is undefined.
-- Implement "from ISO-date/time every x hours/days/weeks/months" in
- schedules.
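The proposed schedule form could be interpreted as in this small Python sketch (function and variable names are mine, not Bacula code): given a base ISO date/time and an interval, find the next run at or after the current time.

```python
# Hedged sketch of "from ISO-date/time every x hours" schedule semantics;
# names are illustrative only, not Bacula's.
from datetime import datetime, timedelta

def next_run(base: datetime, every: timedelta, now: datetime) -> datetime:
    """Return the first base + k*every (k >= 0) that is not before now."""
    if now <= base:
        return base
    elapsed = now - base
    periods = -(-elapsed // every)   # ceiling division on timedeltas
    return base + periods * every
```

For example, with a base of 2009-09-01 00:00 and every = 6 hours, a check at 07:00 yields a next run of 12:00.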
-
-==== from Marc Schoechlin
-- The help command should be more verbose:
-  it should explain the parameters of the different
-  commands in detail.
-  -> it is time-consuming to consult the manual every
-     time you need a special parameter
-  -> this might be easier to maintain if the command
-     descriptions were kept in a separate file
-- If the password is not configured in bconsole.conf,
-  the user should be asked for it.
-  -> sometimes you want to do a restore on a customer machine
-     which shouldn't know the password for Bacula
-  -> keeping the password in the file makes it easy for admins
-     to forget to remove it after use
-  -> security aspects:
-     the protection of that file becomes less critical
-- Long listing output of commands should be scrollable,
-  like the Unix more/less commands.
-  -> if someone runs 200 or more machines, the lists can
-     get long and complex
-- Command output should be shown column by column
-  to reduce scrolling and increase clarity.
-  -> see the previous item
-- lsmark should list the selected files with full
-  paths.
-- Wildcards for selecting files and directories would be nice.
-- Any action should be interruptible with Ctrl+C.
-- Command expansion would be pretty cool.
-====
-- When the replace Never option is set, new directory permissions
- are not restored. See bug 213. To fix this requires creating a
- list of newly restored directories so that those directory
- permissions *can* be restored.
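A minimal Python sketch of that list-of-newly-created-directories approach (all names are mine; this is not Bacula code). Create shallow directories first so each is tracked, then apply the saved modes deepest first so a restored read-only parent cannot block its children:

```python
# Hedged sketch of the fix suggested for bug 213: track directories the
# restore creates, then restore their saved permissions afterward.
import os

def restore_dir_tree(root, saved_modes):
    """saved_modes maps relative dir path -> mode, e.g. {"a": 0o750}."""
    created = []
    # Create shallow directories first so each one is tracked individually.
    for rel in sorted(saved_modes, key=lambda r: r.count("/")):
        path = os.path.join(root, rel)
        if not os.path.isdir(path):
            os.mkdir(path)
            created.append(path)
    # Now apply the saved permissions, deepest directory first.
    for path in sorted(created, key=lambda p: p.count(os.sep), reverse=True):
        os.chmod(path, saved_modes[os.path.relpath(path, root)])
    return created
```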
-- Add prune all command
-- Document fact that purge can destroy a part of a restore by purging
- one volume while others remain valid -- perhaps mark Jobs.
-- Add multiple-media-types.txt
-- look at mxt-changer.html
-- Make ? do a help command (no return needed).
-- Implement restore directory.
-- Document streams and how to implement them.
-- Try not to re-backup a file if a new hard link is added.
-- Add feature to backup hard links only, but not the data.
-- Fix stream handling to be simpler.
-- Add Priority and Bootstrap to Run a Job.
-- Eliminate the Restore "Run Restore Job" prompt by allowing a new
-  "run" command to be issued.
-- Remove View FileSet button from Run a Job dialog.
-- Handle prompt for restore job at end of Restore command.
-- Add display of total selected files to Restore window.
-- Add tree pane to left of window.
-- Add progress meter.
-- Max wait time or max run time causes seg fault -- see runtime-bug.txt
-- Add message to user to check for fixed block size when the forward
- space test fails in btape.
-- When unmarking a directory check if all files below are unmarked and
- then remove the + flag -- in the restore tree.
-- Possibly implement: Action = Unmount Device="TapeDrive1" in Admin jobs.
-- Setup lrrd graphs: (http://www.linpro.no/projects/lrrd/) Mike Acar.
-- Revisit the question of multiple Volumes (disk) on a single device.
-- Add a block copy option to bcopy.
-- Fix "llist jobid=xx" where no fileset or client exists.
-- For each job type (Admin, Restore, ...) require only the really necessary
-  fields.
-- Pass the Director resource name as an option to the Console.
-- Add a "batch" mode to the Console (no unsolicited queries, ...).
-- Allow browsing the catalog to see all versions of a file (with
- stat data on each file).
-- Restore attributes of directory if replace=never set but directory
- did not exist.
-- Use SHA1 on authentication if possible.
-- See comtest-xxx.zip for Windows code to talk to USB.
-- Add John's appended files:
-   Appended = { /files/server/logs/http/*log }
-  and such files would be treated as follows. On a FULL backup, they would
-  be backed up like any other file. On an INCREMENTAL backup, where a
-  previous INCREMENTAL or FULL was already in the catalogue and the length
-  of the file was greater than the length of the last backup, only the data
-  added since the last backup will be dumped. On an INCREMENTAL backup, if
-  the length of the file is less than the length of the file with the same
-  name last backed up, the complete file is dumped. On Windows systems, with
-  creation dates of files, we can be even smarter about this and not count
-  entirely upon the length. On a restore, the full and all incrementals
-  since it will be applied in sequence to restore the file.
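The length-based rules above can be sketched in Python (illustrative names; the real implementation would live in the File daemon):

```python
# Hedged sketch of the proposed "Appended" FileSet handling: on an
# incremental, dump only the bytes added since the last backup; if the
# file shrank, dump it whole again.
import os

def appended_backup(path, last_size):
    """Return (offset, data) to store for this file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if 0 < last_size <= size:
            f.seek(last_size)          # only the data appended since last time
            return last_size, f.read()
        return 0, f.read()             # first backup, or file shrank: dump all
```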
-- Check new HAVE_WIN32 open bits.
-- Check if the tape has moved before writing.
-- Handling removable disks -- see below:
-- Add FromClient and ToClient keywords on restore command (or
- BackupClient RestoreClient).
-- Implement a JobSet, which groups any number of jobs. If the
- JobSet is started, all the jobs are started together.
- Allow Pool, Level, and Schedule overrides.
-- Look at updating Volume Jobs so that Max Volume Jobs = 1 will work
- correctly for multiple simultaneous jobs.
-- Implement the Media record flag that indicates that the Volume does disk
- addressing.
-- Fix fast block rejection (stored/read_record.c:118). It passes a null
- pointer (rec) to try_repositioning().
-- Implement RestoreJobRetention? Maybe better "JobRetention" in a Job,
-  which would take precedence over the Catalog "JobRetention".
-- Implement Label Format in Add and Label console commands.
-- Put email tape request delays on one or more variables. User wants
- to cancel the job after a certain time interval. Maximum Mount Wait?
-- Job, Client, Device, Pool, or Volume?
- Is it possible to make this a directive which is *optional* in multiple
- resources, like Level? If so, I think I'd make it an optional directive
- in Job, Client, and Pool, with precedence such that Job overrides Client
- which in turn overrides Pool.
-
-- New Storage specifications:
- - Want to write to multiple storage devices simultaneously
- - Want to write to multiple storage devices sequentially (in one job)
- - Want to read/write simultaneously
- - Key is MediaType -- it must match
-
- Passed to SD as a sort of BSR record called Storage Specification
- Record or SSR.
- SSR
- Next -> Next SSR
- MediaType -> Next MediaType
- Pool -> Next Pool
- Device -> Next Device
- Job Resource
- Allow multiple Storage specifications
- New flags
- One Archive = yes
- One Device = yes
- One Storage = yes
- One MediaType = yes
- One Pool = yes
- Storage
- Allow Multiple Pool specifications (note, Pool currently
- in Job resource).
- Allow Multiple MediaType specifications in Dir conf
- Allow Multiple Device specifications in Dir conf
- Perhaps keep this in a single SSR
- Tie a Volume to a specific device by using a MediaType that
- is contained in only one device.
- In SD allow Device to have Multiple MediaTypes
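The SSR chain could be sketched as a linked structure like this (a hedged Python sketch; field and function names are assumptions, not actual Bacula code):

```python
# Hedged sketch of the Storage Specification Record (SSR) chain passed
# from the Director to the SD, as described above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SSR:
    media_type: str                 # key constraint: MediaType must match
    pool: str
    device: str
    next: Optional["SSR"] = None    # next alternative specification

def find_ssr(ssr: Optional[SSR], media_type: str) -> Optional[SSR]:
    """Walk the chain until an SSR whose MediaType matches is found."""
    while ssr is not None:
        if ssr.media_type == media_type:
            return ssr
        ssr = ssr.next
    return None
```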
-
-- Ideas from Jerry Scharf:
- First let's point out some big pluses that bacula has for this
- it's open source
- more importantly it's active. Thank you so much for that
- even more important, it's not flaky
- it has an open access catalog, opening many possibilities
- it's pushing toward heterogeneous systems capability
- big things:
- Macintosh file client
- working bare iron recovery for windows
- the option for inc/diff backups not reset on fileset revision
- a) use both change and inode update time against base time
- b) do the full catalog check (expensive but accurate)
- sizing guide (how much system is needed to back up N systems/files)
- consultants on using bacula in building a disaster recovery system
- an integration guide
- or how to get at fancy things that one could do with bacula
- logwatch code for bacula logs (or similar)
- support for Oracle database ??
-===
-- Look at adding SQL server and Exchange support for Windows.
-- Add progress of files/bytes to SD and FD.
-- do a "messages" before the first prompt in Console
-- Client does not show busy during Estimate command.
-- Implement Console mtx commands.
-- Implement a Mount Command and an Unmount Command where
-  the user could specify a system command to be performed
-  to do the mount, after which Bacula could attempt to
-  read the device. This is for removable media such as a CDROM.
- - Most likely, this mount command would be invoked explicitly
- by the user using the current Console "mount" and "unmount"
- commands -- the Storage Daemon would do the right thing
- depending on the exact nature of the device.
- - As with tape drives, when Bacula wanted a new removable
- disk mounted, it would unmount the old one, and send a message
- to the user, who would then use "mount" as described above
- once he had actually inserted the disk.
-- Implement dump/print label to UA
-- Spool to disk only when the tape is full, then when a tape is hung move
- it to tape.
-- bextract is sending everything to the log file ****FIXME****
-- Allow multiple Storage specifications (or multiple names on
- a single Storage specification) in the Job record. Thus a job
- can be backed up to a number of storage devices.
-- Implement some way for the File daemon to contact the Director
- to start a job or pass its DHCP obtained IP number.
-- Implement a query tape prompt/replace feature for a console
-- Make sure that Bacula rechecks the tape after the 20 min wait.
-- Set IO_NOWAIT on Bacula TCP/IP packets.
-- Try doing a raw partition backup and restore by mounting a
- Windows partition.
-- From Lars Kellers:
-  Yes, it would make the request for new tapes highly automatic. If a
-  tape is empty, Bacula reads the barcodes (native or simulated), and if
-  an unused tape is found, it runs the label command with all the
-  necessary parameters.
-
-  By the way, can Bacula automatically "move" an empty/purged volume, say,
-  from the "short" pool to the "long" pool if the latter runs out of
-  volume space?
-- What to do about "list files job=xxx".
-- Look at how fuser works and /proc/PID/fd that is how Nic found the
- file descriptor leak in Bacula.
-- Can we dynamically change FileSets?
-- If pool specified to label command and Label Format is specified,
- automatically generate the Volume name.
-- Add ExhaustiveRestoreSearch
-- Look at the possibility of loading only the necessary
- data into the restore tree (i.e. do it one directory at a
- time as the user walks through the tree).
-- Possibly use the hash code if the user selects all for a restore command.
-- Fix "restore all" to bypass building the tree.
-- Prohibit backing up archive device (findlib/find_one.c:128)
-- Implement Release Device in the Job resource to unmount a drive.
-- Implement Acquire Device in the Job resource to mount a drive,
- be sure this works with admin jobs so that the user can get
- prompted to insert the correct tape. Possibly some way to say to
- run the job but don't save the files.
-- Make things like listing where a file is saved case-independent for
-  Windows.
-- Implement a Recycle command
-- From Phil Stracchino:
- It would probably be a per-client option, and would be called
- something like, say, "Automatically purge obsoleted jobs". What it
- would do is, when you successfully complete a Differential backup of a
- client, it would automatically purge all Incremental backups for that
- client that are rendered redundant by that Differential. Likewise,
- when a Full backup on a client completed, it would automatically purge
- all Differential and Incremental jobs obsoleted by that Full backup.
- This would let people minimize the number of tapes they're keeping on
- hand without having to master the art of retention times.
-- When doing a Backup send all attributes back to the Director, who
- would then figure out what files have been deleted.
-- Currently in mount.c:236 the SD simply creates a Volume. It should have
- explicit permission to do so. It should also mark the tape in error
- if there is an error.
-- Cancel waiting for Client connect in SD if FD goes away.
-
-- Implement timeout in response() when it should come quickly.
-- Implement a Slot priority (loaded/not loaded).
-- Implement "vacation" Incremental only saves.
-- Implement create "FileSet"?
-- Add prefixlinks to where or not where absolute links to FD.
-- Issue message to mount a new tape before the rewind.
-- Simplified client job initiation for portables.
-- If SD cannot open a drive, make it periodically retry.
-- Add more of the config info to the tape label.
-
-- Refine SD waiting output:
- Device is being positioned
- > Device is being positioned for append
- > Device is being positioned to file x
- >
-- Figure out some way to estimate output size and to avoid splitting
- a backup across two Volumes -- this could be useful for writing CDROMs
- where you really prefer not to have it split -- not serious.
-- Make bcopy read through bad tape records.
-- Program files (i.e. execute a program to read/write files).
- Pass read date of last backup, size of file last time.
-- Add Signature type to File DB record.
-- CD into subdirectory when open()ing files for backup to
- speed up things. Test with testfind().
-- Priority job to go to top of list.
-- Why are save/restore of device different sizes (sparse?) Yup! Fix it.
-- Implement some way for the Console to dynamically create a job.
-- Solaris -I on tar for include list
-- Need a verbose mode in restore, perhaps to bsr.
-- bscan without -v is too quiet -- perhaps show jobs.
-- Add code to reject whole blocks if not wanted on restore.
-- Check if we can increase the Bacula FD priority on Win2000
-- Check if both CatalogFiles and UseCatalog are set to SD.
-- Possibly add email to Watchdog if drive is unmounted too
- long and a job is waiting on the drive.
-- After unmount, if restore job started, ask to mount.
-- Add UA rc and history files.
-- Put termcap (used by the console) in ./configure and
-  allow --with-termcap-dir.
-- Fix Autoprune for Volumes to respect need for full save.
-- Compare tape to Client files (attributes, or attributes and data)
-- Make all database Ids 64 bit.
-- Allow console commands to detach or run in background.
-- Add SD message variables to control operator wait time
- - Maximum Operator Wait
- - Minimum Message Interval
- - Maximum Message Interval
-- Send Operator message when cannot read tape label.
-- Verify level=Volume (scan only), level=Data (compare of data to file).
- Verify level=Catalog, level=InitCatalog
-- Events file
-- Add keyword search to show command in Console.
-- Events : tape has more than xxx bytes.
-- Complete code in Bacula Resources -- this will permit
- reading a new config file at any time.
-- Handle ctl-c in Console
-- Implement script driven addition of File daemon to config files.
-- Think about how to make Bacula work better with File (non-tape) archives.
-- Write Unix emulator for Windows.
-- Make database type selectable by .conf files i.e. at runtime
-- Set flag for uname -a. Add to Volume label.
-- Restore files modified after date
-- SET LD_RUN_PATH=$HOME/mysql/lib/mysql
-- Remove duplicate fields from jcr (e.g. jcr.level and jcr.jr.Level, ...).
-- Time out a job or terminate it if the link goes down, or reopen the link and query.
-- Concept of precious tapes (cannot be reused).
-- Make bcopy copy with a single tape drive.
-- Permit changing ownership during restore.
-
-- From Phil:
- > My suggestion: Add a feature on the systray menu-icon menu to request
- > an immediate backup now. This would be useful for laptop users who may
- > not be on the network when the regular scheduled backup is run.
- >
- > My wife's suggestion: Add a setting to the win32 client to allow it to
- > shut down the machine after backup is complete (after, of course,
- > displaying a "System will shut down in one minute, click here to cancel"
- > warning dialog). This would be useful for sites that want user
- > workstations to be shut down overnight to save power.
- >
-
-- Autolabel should be specified by DIR instead of SD.
-- Storage daemon
- - Add media capacity
- - AutoScan (check checksum of tape)
- - Format command = "format /dev/nst0"
- - MaxRewindTime
- - MinRewindTime
- - MaxBufferSize
- - Seek resolution (usually corresponds to buffer size)
- - EODErrorCode=ENOSPC or code
- - Partial Read error code
- - Partial write error code
- - Nonformatted read error
- - Nonformatted write error
- - WriteProtected error
- - IOTimeout
- - OpenRetries
- - OpenTimeout
- - IgnoreCloseErrors=yes
- - Tape=yes
- - NoRewind=yes
-- Pool
- - Maxwrites
- - Recycle period
-- Job
- - MaxWarnings
- - MaxErrors (job?)
-=====
-- Write a Storage daemon that uses pipes and
- standard Unix programs to write to the tape.
- See afbackup.
-- Need something that monitors the JCR queue and
-  times out jobs by asking the daemons where they are.
-- Verify from Volume
-- Need report class for messages. Perhaps
- report resource where report=group of messages
-- enhance scan_attrib and rename scan_jobtype, and
- fill in code for "since" option
-- Director needs a time after which the report status is sent
- anyway -- or better yet, a retry time for the job.
-- Don't reschedule a job if previous incarnation is still running.
-- Some way to automatically back up everything is needed????
-- Need a structure for pending actions:
- - buffered messages
- - termination status (part of buffered msgs?)
-- Drive management
- Read, Write, Clean, Delete
-- Login to Bacula; Bacula users with different permissions:
- owner, group, user, quotas
-- Store info on each file system type (probably in the job header on tape).
-  This could be the output of df, or perhaps some sort of /etc/mtab record.
-
-========= ideas ===============
-From: "Jerry K. Schieffer" <jerry@skylinetechnology.com>
-To: <kern@sibbald.com>
-Subject: RE: [Bacula-users] future large programming jobs
-Date: Thu, 26 Feb 2004 11:34:54 -0600
-
-I noticed the subject thread and thought I would offer the following
-merely as sources of ideas, i.e. something to think about, not even as
-strong as a request. In my former life (before retiring) I often
-dealt with backups and storage management issues/products as a
-developer and as a consultant. I am currently migrating my personal
-network from amanda to bacula specifically because of the ability to
-cross media boundaries during storing backups.
-Are you familiar with the commercial product called ADSM (I think IBM
-now sells it under the Tivoli label)? It has a couple of interesting
-ideas that may apply to the following topics.
-
-1. Migration: Consider that when you need to restore a system, there
-may be pressure to hurry. If all the information for a single client
-can eventually end up on the same media (and in chronological order),
-the restore is facilitated by not having to search past information
-from other clients. ADSM has the concept of "client affinity" that
-may be associated with its storage pools. It seems to me that this
-concept (as an optional feature) might fit in your architecture for
-migration.
-
-ADSM also has the concept of defining one or more storage pools as
-"copy pools" (almost mirrors, but only in the sense of contents).
-These pools provide the ability to have duplicate data stored both
-onsite and offsite. The copy process can be scheduled to be handled
-by their storage manager during periods when there is no backup
-activity. Again, the migration process might be a place to consider
-implementing something like this.
-
->
-> It strikes me that it would be very nice to be able to do things like
-> have the Job(s) backing up the machines run, and once they have all
-> completed, start a migration job to copy the data from disk Volumes to
-> a tape library and then to offsite storage. Maybe this can already be
-> done with some careful scheduling and Job prioritization; the events
-> mechanism described below would probably make it very easy.
-
-This is the goal. In the first step (before events), you simply
-schedule the Migration to tape later.
-
-2. Base jobs: In ADSM, each copy of each stored file is tracked in
-the database. Once a file (unique by path and metadata such as dates,
-size, ownership, etc.) is in a copy pool, no more copies are made. In
-other words, when you start ADSM, it begins like your concept of a
-base job. After that it is in the "incremental" mode. You can
-configure the number of "generations" of files to be retained, plus a
-retention date after which even old generations are purged. The
-database tracks the contents of media and projects the percentage of
-each volume that is valid. When the valid content of a volume drops
-below a configured percentage, the valid data are migrated to another
-volume and the old volume is marked as empty. Note, this requires
-ADSM to have an idea of the contents of a client, i.e. marking the
-database when an existing file was deleted, but this would solve your
-issue of restoring a client without restoring deleted files.
-
-This is pretty far from what bacula now does, but if you are going to
-rip things up for Base jobs,.....
-Also, the benefits of this are huge for very large shops, especially
-with media robots, but are a pain for shops with manual media
-mounting.
-
-Regards,
-Jerry Schieffer
-
-==============================
-
-Longer term to do:
-- Audit M_ error codes to ensure they are correct and consistent.
-- Add variable break characters to lex analyzer.
- Either a bit mask or a string of chars so that
- the caller can change the break characters.
-- Make a single T_BREAK to replace T_COMMA, etc.
-- Ensure that File daemon and Storage daemon can
- continue a save if the Director goes down (this
- is NOT currently the case). Must detect socket error,
- buffer messages for later.
-- Add ability to backup to two Storage devices (two SD sessions) at
- the same time -- e.g. onsite, offsite.
-
-======================================================
-
-====
- Handling removable disks
-
- From: Karl Cunningham <karlc@keckec.com>
-
- My backups are only to hard disk these days, in removable bays. This is my
- idea of how a backup to hard disk would work more smoothly. Some of these
- things Bacula does already, but I mention them for completeness. If others
- have better ways to do this, I'd like to hear about it.
-
- 1. Accommodate several disks, rotated similar to how tapes are. Identified
- by partition volume ID or perhaps by the name of a subdirectory.
- 2. Abort & notify the admin if the wrong disk is in the bay.
- 3. Write backups to different subdirectories for each machine to be backed
- up.
- 4. Volumes (files) get created as needed in the proper subdirectory, one
- for each backup.
- 5. When a disk is recycled, remove or zero all old backup files. This is
- important as the disk being recycled may be close to full. This may be
- better done manually since the backup files for many machines may be
- scattered in many subdirectories.
-====
-
-
-=== Done
-
-===
- Base Jobs design
-It is somewhat as if a Full save becomes an incremental: a restore
-consists of the Base job (or jobs) plus the other, non-base files.
-Need:
-- A Base backup is same as Full backup, just different type.
-- New BaseFiles table that contains:
- BaseId - index
- BaseJobId - Base JobId referenced for this FileId (needed ???)
- JobId - JobId currently running
- FileId - File not backed up, exists in Base Job
- FileIndex - FileIndex from Base Job.
- i.e. for each base file that exists but is not saved because
- it has not changed, the File daemon sends the JobId, BaseId,
- FileId, FileIndex back to the Director who creates the DB entry.
-- To initiate a Base save, the Director sends the FD
- the FileId, and full filename for each file in the Base.
-- When the FD finds a Base file, it requests the Director to
-  send it the full File entry (stat packet plus MD5); or,
-  conversely, the FD sends it to the Director and the Director
-  says yes or no. This can be quite rapid if the FileId is kept
-  by the FD for each Base filename.
-- It is probably better to have the comparison done by the FD
- despite the fact that the File entry must be sent across the
- network.
-- An alternative would be to send the FD the whole File entry
- from the start. The disadvantage is that it requires a lot of
- space. The advantage is that it requires less communications
- during the save.
-- The Job record must be updated to indicate that one or more
- Bases were used.
-- At end of Job, the FD returns:
-   1. Count of base files/bytes not written to tape (i.e. matches)
-   2. Count of base files that were saved, i.e. had changed.
-- No tape record would be written for a Base file that matches, in the
- same way that no tape record is written for Incremental jobs where
- the file is not saved because it is unchanged.
-- On a restore, all the Base file records must explicitly be
-  found from the BaseFile table. I.e. for each Full save that is marked
-  to have one or more Base Jobs, search the BaseFile table for all
-  occurrences of its JobId.
-- An optimization might be to make the BaseFile have:
- JobId
- BaseId
- FileId
- plus
- FileIndex
- This would avoid the need to explicitly fetch each File record for
- the Base job. The Base Job record will be fetched to get the
- VolSessionId and VolSessionTime.
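The FD-side matching described in this design might be sketched like this (hedged Python; the names and the shape of the records are assumptions, not Bacula code): the FD keeps the Base job's (stat, MD5) per filename and reports matches back to the Director instead of re-sending data.

```python
# Hedged sketch of Base-file matching in the FD: unchanged files are
# reported as (FileId, FileIndex) for the BaseFiles table; changed or
# new files are backed up normally.
def classify_against_base(current, base_index):
    """current: {name: (stat, md5)}; base_index: {name: record dict}."""
    matches, to_backup = [], []
    for name, (st, md5) in current.items():
        base = base_index.get(name)
        if base and base["stat"] == st and base["md5"] == md5:
            matches.append((name, base["FileId"], base["FileIndex"]))
        else:
            to_backup.append(name)
    return matches, to_backup
```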
-- Fix bpipe.c so that it does not modify results pointer.
- ***FIXME*** calling sequence should be changed.
-- Fix restore of acls and extended attributes to count ERROR
- messages and make errors non-fatal.
-- Put save/restore various platform acl/xattrs on a pointer to simplify
- the code.
-- Add blast attributes to DIR to SD.
-- Implement unmount of USB volumes.
-- Look into using Dart for testing
- http://public.kitware.com/Dart/HTML/Index.shtml