Kern's ToDo List
- 13 January 2004
+ 28 February 2004
Documentation to do: (any release a little bit at a time)
+- DB upgrade to version 5 in bacula-1.27b, DB upgrade to
+ version 6 in 1.31; DB upgrade to version 7 in 1.33/4.
- Document running a test version.
- Document query file format.
- Document static linking
- Test cancel at EOM.
For 1.33 Testing/Documentation:
+- Newly labeled tapes are chosen before ones already in use.
- Document new alias records in Director: SDAddress, SDDeviceName, SDPassword,
  FDPassword, FDAddress, DBAddress, DBPort, DBPassword.
- Document new Include/Exclude ...
- Document Dan's new --with-dir-user, ... options.
See userid.txt
- Figure out how to use ssh or stunnel to protect Bacula communications.
- Add Dan's work to manual
- See ssl.txt
+ Add Dan's work to the manual; see ssl.txt.
- Add db check test to regression. Test each function like delete,
purge, ...
- Add subsections to the Disaster Recovery index section.
- Document Pool keyword for restore.
-
+- If you use restore replace=never, the directory attributes for
+ non-existent directories will not be restored properly.
+- In the Bacula User Guide you write: "Note, one major disadvantage of
+  writing to an NFS mounted volume as I do is that if the other machine goes
+  down, the OS will wait forever on the fopen() call that Bacula makes. As
+  a consequence, Bacula will completely stall until the machine exporting
+  the NFS mounts comes back up. If someone knows a way around this, please
+  let me know." I haven't tried using NFS in years, but I think that the
+  "soft" and "intr" remount options may well help you. The only way of
+  being sure would be to try it. See, for example,
+  http://howtos.linux.com/guides/nag2/x-087-2-nfs.mountd.shtml
+
For 1.33
+- Look at installation file permissions with Scott so that make install
+ and the rpms agree.
+- Add a regression test for dbcheck.
+- Add disk seeking on restore.
+- Add atime preservation.
+- Allow for optional cancelling of SD and FD in case DIR
+ gets a fatal error. Requested by Jesse Guardiani <jesse@wingnet.net>
+- Do not error the job if the bootstrap file could not be written.
+- Bizarre message: Error: Could not open WriteBootstrap file:
+- Build console in client only build.
+- For "list jobs" order by EndTime.
+- Add "limit=n" for "list jobs"
+- Check new HAVE_WIN32 open bits.
+- Make column listing for running jobs.
+ JobId Level Type Started Name Status
+- Make two tape fill test work.
+- Check if the tape has moved before writing.
+- Save and restore last_job across executions.
+- Implement restart of daemon.
+- Handling removable disks -- see below:
+- Multiple drive autochanger support -- see below.
+- Keep track of tape use time, and report when cleaning is necessary.
+- Add Events and Perl scripting.
+- See comtest-xxx.zip for Windows code to talk to USB.
+- Fix FreeBSD mt_count problem.
+- During install, copy any console.conf to bconsole.conf.
+- Have each daemon save the last_jobs structure when exiting and
+ read it back in when starting up.
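  A minimal sketch of the save/read idea, assuming a simple fixed-size
  record written to a state file on exit and read back on startup (the
  struct fields and file name are invented here, not Bacula's actual
  last_jobs layout):

```c
#include <stdio.h>

/* Hypothetical last-job record; illustrative only. */
struct last_job {
    int  job_id;
    long job_bytes;
    int  job_status;
};

/* On daemon exit: write the record to a state file. */
int save_last_job(const char *path, const struct last_job *lj)
{
    FILE *fp = fopen(path, "wb");
    if (!fp) return -1;
    size_t ok = fwrite(lj, sizeof(*lj), 1, fp);
    fclose(fp);
    return ok == 1 ? 0 : -1;
}

/* On daemon startup: read the record back, if present. */
int read_last_job(const char *path, struct last_job *lj)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;
    size_t ok = fread(lj, sizeof(*lj), 1, fp);
    fclose(fp);
    return ok == 1 ? 0 : -1;
}
```

  A real implementation would want a versioned, endian-safe format, but
  the round trip above captures the idea.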
+- "restore jobid=1 select" calls get_storage_xxx, which prints "JobId 1 is
+ not running."
+- Add FromClient and ToClient keywords on restore command (or
+ BackupClient RestoreClient).
+- Automatic "update slots" on user configuration directive when a
+ slot error occurs.
+- Allow "delete job jobid=xx jobid=xxx".
+- Allow "delete job jobid=xxx,yyy,aaa-bbb" i.e. list + ranges.
+- Implement multiple Volume in "purge jobs volume=".
+- Implement a JobSet, which groups any number of jobs. If the
+ JobSet is started, all the jobs are started together.
+ Allow Pool, Level, and Schedule overrides.
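  A hypothetical sketch of what a JobSet resource could look like (the
  resource name and keywords are invented for illustration; no such
  directives exist yet):

```
JobSet {
  Name = "nightly-set"
  Job = client1-backup
  Job = client2-backup
  Pool = WeeklyPool        # override applied to every job in the set
  Level = Full
  Schedule = WeeklyCycle
}
```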
+- Enhance cancel to timeout BSOCK packets after a specific delay.
+- When I restore to Windows the Created, Accessed and Modifiedtimes are
+ those of the time of the restore, not those of the originalfile.
+ The dates you will find in your restore log seem to be the original
+ creation dates
+- Rescue builds incorrect script files on Rufus.
+- Write a Qmsg() to be used in bnet.c to prevent recursion. Queue the
+  message. If dequeuing, toss the messages. Lock while dequeuing so that
+  it cannot be called recursively, and set a dequeuing flag.
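  A rough sketch of the queue-and-toss idea described above (the names
  and structure are illustrative, not the actual bnet.c code):

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define QMAX 32

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static bool dequeuing = false;
static char queue[QMAX][128];
static int q_count = 0;

/* Queue a message; if we are currently dequeuing, toss it so that
   message delivery cannot recurse back into itself. */
void Qmsg(const char *msg)
{
    pthread_mutex_lock(&q_lock);
    if (dequeuing || q_count >= QMAX) {
        pthread_mutex_unlock(&q_lock);
        return;
    }
    strncpy(queue[q_count], msg, sizeof(queue[0]) - 1);
    queue[q_count][sizeof(queue[0]) - 1] = '\0';
    q_count++;
    pthread_mutex_unlock(&q_lock);
}

/* Dispatch queued messages; returns how many were sent. */
int dequeue_messages(void (*dispatch)(const char *))
{
    pthread_mutex_lock(&q_lock);
    if (dequeuing) {                 /* already dispatching */
        pthread_mutex_unlock(&q_lock);
        return 0;
    }
    dequeuing = true;
    int sent = q_count;
    q_count = 0;
    pthread_mutex_unlock(&q_lock);

    /* Dispatch outside the lock; any Qmsg() made from dispatch()
       sees the dequeuing flag and is tossed instead of recursing. */
    for (int i = 0; i < sent; i++) {
        dispatch(queue[i]);
    }

    pthread_mutex_lock(&q_lock);
    dequeuing = false;
    pthread_mutex_unlock(&q_lock);
    return sent;
}
```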
+- Add all pools in the Dir conf to the DB; also update them to catch
+  changed LabelFormats and such.
+- Symbolic link a directory to another one, then backup the symbolic
+ link.
+- Restore attributes of directory if replace=never set but directory
+ did not exist.
+- Check why Phil's Verify exclude does not work.
+- Phil says that Windows file sizes mismatch in Verify when they should,
+ and that either the file size or the catalog size was zero.
+- Fix option 2 of restore -- list where file is backed up -- require Client,
+ then list last 20 backups.
+- Allow browsing the catalog to see all versions of a file (with
+ stat data on each file).
- Finish code passing files=nnn to restore start.
-- Add Console usr permissions -- do by adding regex filters for
- jobs, clients, storage, ...
-- Put max network buffer size on a directive.
-- Why does "mark cygwin" take so long!!!!!!!!
-- When a file is set for restore, walk back up the chain of
- directories, setting them to be restored.
-- Figure out a way to set restore on a directory without recursively
- decending. (recurse off?).
- Add level to estimate command.
- Check time/dates printed during restore when using Win32 API.
- Volume "add"ed to Pool gets recycled in first use. VolBytes=0
F Number Number of filenames to follow
<file-name>
...
+
+- Spooling ideas taken from Volker Sauer's and other's emails:
+ > IMHO job spooling should be turned on
+ >
+ > 1) by job
+ > 2) by schedule
+ > 3) by sd
+ >
+ > where 2) overrides 1), and 3) is independent.
+
+ Yes, this is the minimum that I think is necessary.
+
+ >
+ > Reason(s):
+ > It should be switched by job, because the job that backs up the machine
+ > with the bacula-sd on doesn't need spooling.
+ > It should be switched by schedule, because for full-backups I don't need
+ > spooling, so I can switch it off (because the network is faster than
+ > the tape drive)
+
+ True, with the exception that if you have enough disk spool space,
+ and you want to run concurrent jobs, spooling can eliminate the block
+ interleaving restore inefficiencies.
+
+ > And you should be able to turn it of by sd for sd-machines with low disk
+ > capacity or if you just don't need or want this feature.
+ >
+ > There should be:
+ > - definitely the possibility for multiple spool directories
+
+ Having multiple directories is no problem -- having different maximum
+ sizes creates specification problems. At some point, I will probably
+ have a common SD pool of spool directories as well as a set of
+ private spool directories for each device. The first implementation
+ will be a set of private spool directories for each device since
+ managing a global pool with a bunch of threads writing into the same
+ directory is *much* more complicated and prone to error.
+
+ > - the ability to spool parts of a backup (not the whole client)
+
+ This may change in the future, but for the moment, it will spool
+ either to a job high water mark, or until the directory is full
+ (reaches max spool size or I/O error). It will then write to tape,
+ truncate the spool file, and begin spooling again.
+
+ > - spooling while writing to tape
+
+ Not within a job, but yes, if you run concurrent jobs -- each is a
+ different thread. Within a job could be a feature, but *much* later.
+
+ > - parallel spooling (like parallel jobs/ concurrent jobs) of clients
+
+  Yes, this is one of my main motivations for doing it (aside from
+  eliminating tape "shoe shine" during incremental backups).
+
+ > - flushing a backup that only went to disk (like amflush in amanda)
+
+ This will be a future feature, since spooling is different from backing
+ up to disk. The future feature will be "migration" which will move a job
+ from one backup Volume to another.
+
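  The spool-to-high-water-mark cycle described above could be sketched
  roughly like this (the names and bookkeeping are illustrative only,
  not the planned SD implementation):

```c
/* Hypothetical spool bookkeeping: spool blocks to disk until the
   high-water mark (max spool size) is hit, then write the spool file
   to tape, truncate it, and resume spooling. */
typedef struct {
    long spooled;      /* bytes currently in the spool file */
    long max_spool;    /* high-water mark */
    long to_tape;      /* bytes despooled to tape so far */
    int  despools;     /* completed despool cycles */
} Spool;

/* Accept one block; despool first if it would cross the mark. */
void spool_block(Spool *s, long block_size)
{
    if (s->spooled + block_size > s->max_spool) {
        /* high-water mark reached: flush spool to tape, truncate */
        s->to_tape += s->spooled;
        s->spooled = 0;
        s->despools++;
    }
    s->spooled += block_size;
}

/* At job end, flush whatever remains in the spool file. */
void spool_finish(Spool *s)
{
    s->to_tape += s->spooled;
    s->spooled = 0;
}
```

  With concurrent jobs, each job would run this cycle in its own
  thread against its own spool file, which is what eliminates the
  block-interleaving inefficiency on restore.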
- New Storage specifications:
Passed to SD as a sort of BSR record called Storage Specification
Record or SSR.
In SD allow Device to have Multiple MediaTypes
After 1.33:
+- Look at www.nu2.nu/pebuilder as a helper for full windows
+ bare metal restore.
+From Chris Hull:
+ it seems to be complaining about 12:00pm which should be a valid 12
+ hour time. I changed the time to 11:59am and everything works fine.
+  Also 12:00am works fine. 0:00pm also works (which I don't think it
+  should). None of the values 12:00pm - 12:59pm work for that matter.
Ideas from Jerry Scharf:
  First, let's point out some big pluses that Bacula has for this:
it's open source
Job report (Volker Sauer).
- Client does not show busy during Estimate command.
- Implement Console mtx commands.
-- Implement 3 Pools for a Job:
- Job {
- Name = ...
- Full Backup Pool = xxx
- Incremental Backup Pool = yyy
- Differential Backup Pool = zzz
- }
- Add a default DB password to MySQL.
GRANT all privileges ON bacula.* TO bacula@localhost IDENTIFIED BY
'bacula_password';
- Implement forward spacing block/file: position_device(bsr) --
just before read_block_from_device();
+=====
+ Multiple drive autochanger data: see Alan Brown
+  mtx -f xxx unload
+  Storage Element 1 is Already Full (drive 0 was empty)
+  Unloading Data Transfer Element into Storage Element 1...
+  source Element Address 480 is Empty
+
+ (drive 0 was empty and so was slot 1)
+ > mtx -f xxx load 15 0
+ no response, just returns to the command prompt when complete.
+ > mtx -f xxx status
+  Storage Changer /dev/changer:2 Drives, 60 Slots ( 2 Import/Export )
+ Data Transfer Element 0:Full (Storage Element 15 Loaded):VolumeTag = HX001
+ Data Transfer Element 1:Empty
+ Storage Element 1:Empty
+ Storage Element 2:Full :VolumeTag=HX002
+ Storage Element 3:Full :VolumeTag=HX003
+ Storage Element 4:Full :VolumeTag=HX004
+ Storage Element 5:Full :VolumeTag=HX005
+ Storage Element 6:Full :VolumeTag=HX006
+ Storage Element 7:Full :VolumeTag=HX007
+ Storage Element 8:Full :VolumeTag=HX008
+ Storage Element 9:Full :VolumeTag=HX009
+ Storage Element 10:Full :VolumeTag=HX010
+ Storage Element 11:Empty
+ Storage Element 12:Empty
+ Storage Element 13:Empty
+ Storage Element 14:Empty
+ Storage Element 15:Empty
+ Storage Element 16:Empty....
+ Storage Element 28:Empty
+ Storage Element 29:Full :VolumeTag=CLNU01L1
+ Storage Element 30:Empty....
+ Storage Element 57:Empty
+ Storage Element 58:Full :VolumeTag=NEX261L2
+ Storage Element 59 IMPORT/EXPORT:Empty
+ Storage Element 60 IMPORT/EXPORT:Empty
+ $ mtx -f xxx unload
+ Unloading Data Transfer Element into Storage Element 15...done
+
+  (just to verify it remembers where it came from; however, it can be
+  overridden with mtx unload {slotnumber} to go to any storage slot.)
+ Configuration wise:
+  There needs to be a table of drive # to devices somewhere - if there are
+  multiple changers or drives there may not be a 1:1 correspondence between
+  changer drive number and system device name - and depending on the way the
+  drives are hooked up to scsi busses, they may not be linearly numbered
+  from an offset point either. Something like:
+
+ Autochanger drives = 2
+ Autochanger drive 0 = /dev/nst1
+ Autochanger drive 1 = /dev/nst2
+ IMHO, it would be _safest_ to use explicit mtx unload commands at all
+ times, not just for multidrive changers. For a 1 drive changer, that's
+ just:
+
+ mtx load xx 0
+ mtx unload xx 0
+
+ MTX's manpage (1.2.15):
+ unload [<slotnum>] [ <drivenum> ]
+ Unloads media from drive <drivenum> into slot
+ <slotnum>. If <drivenum> is omitted, defaults to
+ drive 0 (as do all commands). If <slotnum> is
+ omitted, defaults to the slot that the drive was
+ loaded from. Note that there's currently no way
+ to say 'unload drive 1's media to the slot it
+ came from', other than to explicitly use that
+  slot number as the destination. (AB)
+====
+
+====
+SCSI info:
+FreeBSD
+undef# camcontrol devlist
+<WANGTEK 51000 SCSI M74H 12B3> at scbus0 target 2 lun 0 (pass0,sa0)
+<ARCHIVE 4586XX 28887-XXX 4BGD> at scbus0 target 4 lun 0 (pass1,sa1)
+<ARCHIVE 4586XX 28887-XXX 4BGD> at scbus0 target 4 lun 1 (pass2)
+
+tapeinfo -f /dev/sg0 with a bad tape in drive 1:
+[kern@rufus mtx-1.2.17kes]$ ./tapeinfo -f /dev/sg0
+Product Type: Tape Drive
+Vendor ID: 'HP '
+Product ID: 'C5713A '
+Revision: 'H107'
+Attached Changer: No
+TapeAlert[3]: Hard Error: Uncorrectable read/write error.
+TapeAlert[20]: Clean Now: The tape drive needs cleaning NOW.
+MinBlock:1
+MaxBlock:16777215
+SCSI ID: 5
+SCSI LUN: 0
+Ready: yes
+BufferedMode: yes
+Medium Type: Not Loaded
+Density Code: 0x26
+BlockSize: 0
+DataCompEnabled: yes
+DataCompCapable: yes
+DataDeCompEnabled: yes
+CompType: 0x20
+DeCompType: 0x0
+Block Position: 0
+=====
+
+====
+ Handling removable disks
+
+ From: Karl Cunningham <karlc@keckec.com>
+
+ My backups are only to hard disk these days, in removable bays. This is my
+ idea of how a backup to hard disk would work more smoothly. Some of these
+ things Bacula does already, but I mention them for completeness. If others
+ have better ways to do this, I'd like to hear about it.
+
+ 1. Accommodate several disks, rotated similar to how tapes are. Identified
+ by partition volume ID or perhaps by the name of a subdirectory.
+ 2. Abort & notify the admin if the wrong disk is in the bay.
+ 3. Write backups to different subdirectories for each machine to be backed
+ up.
+ 4. Volumes (files) get created as needed in the proper subdirectory, one
+ for each backup.
+ 5. When a disk is recycled, remove or zero all old backup files. This is
+ important as the disk being recycled may be close to full. This may be
+ better done manually since the backup files for many machines may be
+ scattered in many subdirectories.
+====
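Step 2 above (abort if the wrong disk is in the bay) could be sketched as a
simple identity-file check (the file name and layout are invented for
illustration; Bacula does not currently do this):

```c
#include <stdio.h>
#include <string.h>

/* Read a hypothetical identity file from the mounted bay and compare
   it to the volume expected for this rotation. Returns 0 on match,
   -1 on mismatch or missing/blank disk. */
int check_disk_identity(const char *mount_point, const char *expected_id)
{
    char path[512], id[128];
    snprintf(path, sizeof(path), "%s/.bacula-disk-id", mount_point);
    FILE *fp = fopen(path, "r");
    if (!fp) return -1;                  /* no identity file present */
    if (!fgets(id, sizeof(id), fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    id[strcspn(id, "\r\n")] = '\0';      /* trim trailing newline */
    return strcmp(id, expected_id) == 0 ? 0 : -1;
}
```

On mismatch the job would be aborted and the admin notified, per item 2.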
+
+
=== for 1.33
- Change console to bconsole.
- Change smtp to bsmtp.
- Something is not right in last block of fill command.
- Add FileSet to command line arguments for restore.
- Enhance time and size scanning routines.
+- Add Console usr permissions -- do by adding filters for
+ jobs, clients, storage, ...
+- Put max network buffer size on a directive.
+- Why does "mark cygwin" take so long!!!!!!!!
+- Implement alist processing for ACLs from Console.
+- When a file is set for restore, walk back up the chain of
+ directories, setting them to be restored.
+- Figure out a way to set restore on a directory without recursively
+  descending (recurse off?).
+- Fix restore to only pull in last Differential and later Incrementals.
+- Implement 3 Pools for a Job:
+ Job {
+ Name = ...
+ Full Backup Pool = xxx
+ Incremental Backup Pool = yyy
+ Differential Backup Pool = zzz
+ }
+- Look at the ASSERT() at line 384 of src/lib/bnet.c.
+- Dates are wrong in restore list from Win32 FD.
+- Dates are wrong in catalog from Win32 FD.
+- Remove h_errno from bnet.c by including proper header.
+