Kern's ToDo List
- 3 September 2003
+ 30 September 2003
Documentation to do: (any release a little bit at a time)
- Document running a test version.
hours of operation.
- Lookup HP cleaning recommendations.
- Lookup HP tape replacement recommendations (see trouble shooting autochanger)
-- Document FInclude ...
-- Document all the status codes JobLevel, JobType, JobStatus.
-- Document SDConnectTimeout (in FD).
Testing to do: (painful)
- that ALL console command line options work and are always implemented
- Test if rewind at end of tape waits for tape to rewind.
- Test cancel at EOM.
- Test not zeroing Autochanger slot when it is wrong.
-- Test recycling and purging (code changed in db_find_next_volume and
- in recycle.c).
- Figure out how to use ssh or stunnel to protect Bacula communications.
-For 1.32:
-- Specify list of files to restore
+For 1.33 Testing/Documentation:
+- Document starting higher priority jobs before lower priority ones.
+- suppress "Do not forget to mount the drive!!!" if error
+- Document new records in Director: SDAddress, SDDeviceName, SDPassword,
+ FDPassword, FDAddress, DBAddress, DBPort, DBPassword.
+- Document new Include/Exclude ...
+- Add test of exclusion, test multiple Include {} statements.
+- Add counter variable test.
+- Document ln -sf /usr/lib/libncurses.so /usr/lib/libtermcap.so
+ and install the esound-dev package for compiling Console on
+ SuSE.
+
+For 1.33
+- Implement a RunAfterFailedJob
+- Zap illegal characters in job name for mail files (e.g. /).
+- From Lars Köllers:
+ Yes, it would allow highly automating the request for new tapes. If a
+ tape is empty, bacula reads the barcodes (native or simulated), and if
+ an unused tape is found, it runs the label command with all the
+ necessary parameters.
+
+ By the way, can bacula automatically "move" an empty/purged volume say
+ in the "short" pool to the "long" pool if this pool runs out of volume
+ space?
+- Implement a move Volume from one pool to another.
+- Either restrict the characters in a name, or fix the problem
+ emailing with names containing / (smtp command line breaks).
+- Eliminate ua_retention.c (retentioncmd) if possible.
+- Eliminate orphaned jobs: dbcheck, normal pruning, delete job command.
+ Hm. Well, there are the remaining orphaned job records:
+
+ | 105 | Llioness Save | 0000-00-00 00:00:00 | B | D | 0 | 0 | f |
+ | 110 | Llioness Save | 0000-00-00 00:00:00 | B | I | 0 | 0 | f |
+ | 115 | Llioness Save | 2003-09-10 02:22:03 | B | I | 0 | 0 | A |
+ | 128 | Catalog Save | 2003-09-11 03:53:32 | B | I | 0 | 0 | C |
+ | 131 | Catalog Save | 0000-00-00 00:00:00 | B | I | 0 | 0 | f |
+
+ As you can see, three of the five are failures. I already deleted the
+ one restore and one other failure using the by-client option. Deciding
+ what is an orphaned job is a tricky problem though, I agree. All these
+ records have or had 0 files/ 0 bytes, except for the restore. With no
+ files, of course, I don't know if the job ever actually becomes
+ associated with a Volume.
+
+ (I'm not sure if this is documented anywhere -- what are the meanings of
+ all the possible JobStatus codes?)
+
+ Looking at my database, it appears to me as though all the "orphaned"
+ jobs fit into one of two categories:
+
+ 1) The Job record has a StartTime but no EndTime, and the job is not
+ currently running;
+ or
+ 2) The Job record has an EndTime, indicating that it completed, but
+ it has no associated JobMedia record.
+
+
+ This does suggest an approach. If failed jobs (or jobs that, for some
+ other reason, write no files) are associated with a volume via a
+ JobMedia record, then they should be purged when the associated volume
+ is purged. I see two ways to handle jobs that are NOT associated with a
+ specific volume:
+
+ 1) purge them automatically whenever any volume is manually purged;
+ or
+ 2) add an option to the purge command to manually purge all jobs with
+ no associated volume.
+
+ I think Restore jobs also fall into category 2 above .... so one might
+ want to make that "The Job record has an EndTime, but no associated
+ JobMedia record, and is not a Restore job."
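The two orphan categories above, with the Restore exception, can be sketched as a small catalog check. This is only an illustration, not Bacula code: the two-table schema is a simplified stand-in for the real catalog, the helper names (`find_orphaned_jobs`, `build_demo_db`) are hypothetical, the job type code "R" for Restore is an assumption, and the list of running jobs is assumed to be supplied by the caller.

```python
import sqlite3

ZERO_TIME = "0000-00-00 00:00:00"  # how an unset time appears in the listing above


def _is_set(timestamp):
    """True if the timestamp column holds a real time."""
    return timestamp not in (None, "", ZERO_TIME)


def find_orphaned_jobs(conn, running_job_ids=()):
    """Return (JobId, reason) pairs for jobs in either orphan category."""
    running = set(running_job_ids)
    rows = conn.execute(
        """
        SELECT j.JobId, j.Type, j.StartTime, j.EndTime,
               (SELECT COUNT(*) FROM JobMedia m WHERE m.JobId = j.JobId)
        FROM Job j
        ORDER BY j.JobId
        """
    ).fetchall()
    orphans = []
    for job_id, job_type, start, end, media_count in rows:
        if _is_set(start) and not _is_set(end) and job_id not in running:
            # Category 1: started, never ended, and not currently running.
            orphans.append((job_id, "StartTime but no EndTime"))
        elif _is_set(end) and media_count == 0 and job_type != "R":
            # Category 2: ended, no JobMedia record, and not a Restore job.
            orphans.append((job_id, "EndTime but no JobMedia record"))
    return orphans


def build_demo_db():
    """Build an in-memory catalog with made-up rows for experimenting."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Job (JobId INTEGER PRIMARY KEY, Type TEXT,
                          StartTime TEXT, EndTime TEXT);
        CREATE TABLE JobMedia (JobId INTEGER, MediaId INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO Job VALUES (?, ?, ?, ?)",
        [
            (1, "B", "2003-09-10 02:00:00", "2003-09-10 02:30:00"),  # normal
            (2, "B", "2003-09-10 03:00:00", ZERO_TIME),              # category 1
            (3, "B", "2003-09-11 03:00:00", "2003-09-11 03:01:00"),  # category 2
            (4, "R", "2003-09-12 04:00:00", "2003-09-12 04:10:00"),  # restore
            (5, "B", "2003-09-13 05:00:00", ZERO_TIME),              # running
        ],
    )
    conn.execute("INSERT INTO JobMedia VALUES (1, 7)")
    return conn


if __name__ == "__main__":
    for job_id, reason in find_orphaned_jobs(build_demo_db(), running_job_ids=[5]):
        print(job_id, reason)
```

Whether such jobs are then purged automatically or only through a purge option is left open here, as in the discussion above.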
+- Implement RestoreJobRetention? Maybe better "JobRetention" in a Job,
+ which would take precedence over the Catalog "JobRetention".
+- Implement Label Format in Add and Label console commands.
+- make "btape /tmp" work.
+- Make sure a rescheduled job is properly reported by status.
+- Walk through the Pool records rather than the Job records
+ in dird.c to create/update pools.
+- Figure out a way to move Volumes from one pool to another.
+- What to do about "list files job=xxx".
+- Implement delete Job.
+- Document need to put LabelFormat in quotes.
+- Implement scan: for every slot it finds, zero the slot of
+ any other Volume having that slot.
+- When a job is rescheduled, status reports that it is waiting for
+ Client Rufus to connect to Storage File. Dir needs to inform SD
+ that the job is rescheduled.
+- Fix get_storage_from_media_type (ua_restore) to use command line
+ storage=
- Enhance "update slots" to include a "scan" feature
scan 1; scan 1-5; scan 1,2,4 ... to update the catalog
- Allow a slot or range of slots on the label barcodes command.
- when the magazine is changed.
-- Implement ClientRunBeforeJob and ClientRunAfterJob.
-- Figure out what is interrupting sql command in console.
-- Don't print "Warning: Wrong Volume mounted ..." if mounting second
+- Don't print "Warning: Wrong Volume mounted ..." if mounting second volume.
+- Make Dmsg look at global before calling subroutine.
+- Enable trace output at runtime for Win32
+- Make sure that Volumes are recycled based on "Least recently used"
+ rather than lowest MediaId.
+- Available volumes for autochangers (see patrick@baanboard.com 3 Sep 03
+ and 4 Sep) scan slots.
+- Upgrade to cygwin 1.5
+- Get MySQL 3.23.58
+- Get and test MySQL 4.0
+- Do a complete audit of all pthreads_mutex, cond, ... to ensure that
+ any that are dynamically initialized are destroyed when no longer used.
+- Write a mini-readline with history and editing.
+- Look at how fuser works and at /proc/PID/fd; that is how Nic found the
+ file descriptor leak in Bacula.
+- Implement WrapCounters in Counters.
+- Turn on SIGHUP in dird.c and test.
+- Use system dependent calls to get more precise info on tape errors.
+- Add heartbeat from FD to SD if hb interval expires.
+- Suppress read error on blank tape when doing a label.
+- Can we dynamically change FileSets?
+- If pool specified to label command and Label Format is specified,
+ automatically generate the Volume name.
+- Take a careful look at the basic recycling algorithm. When Bacula
+ chooses, the order should be:
+ - Look for Append
+ - Look for Recycle or Purged
+ - Prune volumes
+ - Look for purged
+ Instead of using lowest media Id, find the least recently used
volume.
-- Implement List Volume Job=xxx or List scheduled volumes or Status Director
-- Make | and < work on FD side.
-For 1.33
+ When the tape is mounted and Bacula requests the status
+ - Do everything possible to use it.
+
+ Define an "available" status, which is the currently mounted
+ Volume and all volumes that are currently in the autochanger.
+
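The selection order proposed above (Append, then Recycle or Purged, then prune and look for Purged again, always preferring the least recently used volume over the lowest MediaId) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in recycle.c: the `Volume` record, its `last_written` field, and the `prune` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Volume:
    name: str
    status: str          # e.g. "Append", "Recycle", "Purged", "Full"
    last_written: float  # time of last write; smaller means older


def _least_recently_used(volumes: List[Volume], statuses: Set[str]) -> Optional[Volume]:
    """Pick the oldest-written volume whose status is in `statuses`."""
    candidates = [v for v in volumes if v.status in statuses]
    return min(candidates, key=lambda v: v.last_written, default=None)


def choose_volume(volumes: List[Volume],
                  prune: Callable[[List[Volume]], None]) -> Optional[Volume]:
    """Apply the proposed order; return None if nothing is usable."""
    vol = _least_recently_used(volumes, {"Append"})
    if vol is None:
        vol = _least_recently_used(volumes, {"Recycle", "Purged"})
    if vol is None:
        prune(volumes)  # may change some volumes to "Purged"
        vol = _least_recently_used(volumes, {"Purged"})
    return vol


def prune_older_than_100(volumes: List[Volume]) -> None:
    """Toy pruning rule for the demo: purge Full volumes written before t=100."""
    for v in volumes:
        if v.status == "Full" and v.last_written < 100:
            v.status = "Purged"


if __name__ == "__main__":
    pool = [Volume("Vol1", "Full", 50), Volume("Vol2", "Full", 200)]
    print(choose_volume(pool, prune_older_than_100))
```

Keeping pruning as a callback keeps the ordering logic separate from the retention rules, which the text above does not specify.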
- Why can't SQL do the filename sort for restore?
- Is a pool specification really needed for a restore? Yes, and
you may want to exclude archive Pools.
> > prod4-sd: End of medium on Volume "REU007" Bytes=16,303,521,933
- Use autochanger to handle multiple devices.
-- Fix packet too big problem.
+- Fix packet too big problem. This is most likely a Windows TCP stack
+ problem.
- Add SuSE install doc to list.
- Check and recheck "Invalid block number"
- Make bextract release the drive properly between tapes
- Figure out some way to estimate output size and to avoid splitting
a backup across two Volumes -- this could be useful for writing CDROMs
where you really prefer not to have it split -- not serious.
-- Add RunBeforeJob and RunAfterJob to the Client program.
- Have SD compute MD5 or SHA1 and compare to what FD computes.
- Make VolumeToCatalog calculate an MD5 or SHA1 from the
actual data on the Volume and compare it.
- Check if we can increase Bacula FD priority in Win2000
- Make sure the MaxVolFiles is fully implemented in SD
- Check if both CatalogFiles and UseCatalog are set to SD.
-- Need return status on read_cb() from read_records(). Need multiple
- records -- one per Job, maybe a JCR or some other structure with
- a block and a record.
- Figure out how to do a bare metal Windows restore
- Possibly add email to Watchdog if drive is unmounted too
long and a job is waiting on the drive.
Proposed Implementation:
To solve this problem, I propose the following:
- - Add a new Director resource type called FileOptions.
+ - Add a new Director resource type called Options.
- - The FileOptions resource will have records for all
+ - The Options resource will have records for all
options that can currently be specified on the Include record
(in a FileSet). Examples below.
- - The FileOptions resource will permit an exclude option as well
+ - The Options resource will permit an exclude option as well
as a number of additional options.
- - The heart of the FileOptions resource is the ability to
- supply any number of ApplyTo records which specify POSIX
- regular expressions. These ApplyTo regular expressions are
+ - The heart of the Options resource is the ability to
+ supply any number of Match records which specify POSIX
+ regular expressions. These Match regular expressions are
applied to the fully qualified filename (path and all). If
- one matches, then the FileOptions will be used.
+ one matches, then the Options will be used.
- - When an ApplyTo specification matches an included file, the
- options specified in the FileOptions resource will override
+ - When a Match specification matches an included file, the
+ options specified in the Options resource will override
the default options specified on the Include record.
- Include records will be modified to permit referencing one or
- more FileOptions resources. The FileOptions will be used
+ more Options resources. The Options will be used
in the order listed on the Include record and the first
one that matches will be applied.
year or so from now).
- The Exclude record will be deprecated as the same functionality
- can be obtained by using an Exclude = yes in the FileOptions.
+ can be obtained by using an Exclude = yes in the Options.
-FileOptions records:
- The following records can appear in the FileOptions resource. An
+Options records:
+ The following records can appear in the Options resource. An
asterisk preceding the name indicates a feature not currently
implemented.
For Restore Jobs:
- replace= (always/ifnewer/ifolder/never) - replace options currently
- implemented in 1.27
+ implemented in 1.31
- *Writer= (filename) - external write (restore) program
Implementation:
Currently options specifying compression, MD5 signatures, recursion,
... of a FileSet are supplied on the Include record. These will now
- all be collected into a FileOptions resource, which will be
- specified on the Include in place of the options. Multiple FileOptions
- may be specified. Since the FileOptions contain regular expressions
+ all be collected into an Options resource, which will be
+ specified in the Include in place of the options. Multiple Options
+ may be specified. Since the Options may contain regular expressions
that are applied to the full filename, this will give the ability
to specify backup options on a file by file basis to whatever level
of detail you wish.
FileSet {
Name = "FullSet"
- FInclude {
+ Include {
Compression = GZIP;
Signature = MD5
Match = /*.?*/ # matches all files.
That's a lot more to do the same thing, but it gives the ability to
apply options on a file by file basis. For example, suppose you
want to compress all files but not any file with extensions .gz or .Z.
- You could do so as follows:
+ In that case, you will need to group two sets of options using
+ the Options resource as follows:
+
FileSet {
Name = "FullSet"
- FInclude {
- FileOptions {
+ Include {
+ Options {
Signature = MD5
# Note multiple Matches are ORed
Match = /*.gz/ # matches .gz files */
Match = /*.Z/ # matches .Z files */
}
- FileOptions {
+ Options {
Compression = GZIP
Signature = MD5
Match = /*.?*/ # matches all files
}
}
- Now, since the NoCompress FileOptions is specified first on the
- Include line, any *.gz or *.Z file will have an MD5 signature computed,
- but will not be compressed. For all other files, the NoCompress will not
- match, so the Opts options will be used which will include GZIP
+ Now, since no Compression option is specified in the first group
+ of Options, any *.gz or *.Z file will have an MD5 signature computed,
+ but will not be compressed. For all other files, the *.gz and *.Z patterns
+ will not match, so the second group of Options will be used, which includes GZIP
compression.
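The first-match rule described above can be sketched in a few lines. This is an illustration only, not the Director's implementation: the `Options` class and `options_for` helper are hypothetical, Python's `re` module stands in for POSIX regular expressions, the patterns are written as real regular expressions rather than the wildcard-like forms in the example, and the file paths are made up.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Options:
    settings: Dict[str, str]
    matches: List[str] = field(default_factory=list)  # multiple Matches are ORed

    def applies_to(self, path: str) -> bool:
        """True if any Match expression matches the fully qualified filename."""
        return any(re.search(pattern, path) for pattern in self.matches)


def options_for(path: str, groups: List[Options],
                defaults: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return the settings of the first Options group that matches `path`."""
    for group in groups:
        if group.applies_to(path):
            return group.settings
    return defaults or {}


# Same intent as the FileSet above: no compression for .gz and .Z files.
EXAMPLE_GROUPS = [
    Options({"Signature": "MD5"}, [r"\.gz$", r"\.Z$"]),
    Options({"Compression": "GZIP", "Signature": "MD5"}, [r".*"]),
]

if __name__ == "__main__":
    for name in ("/home/user/archive.tar.gz", "/etc/hosts"):
        print(name, options_for(name, EXAMPLE_GROUPS))
```

Because the groups are tried in order, putting the catch-all group first would hide the .gz and .Z group entirely, which is why the more specific Options must be listed first.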
Questions:
Look at src/host.h
- Use repositioning at the beginning of the tape.
- Do a full check of the command line args in update (e.g. VolStatus ...).
+- Specify list of files to restore
+- Implement ClientRunBeforeJob and ClientRunAfterJob.
+- Make | and < work on FD side.
+- Check to see if "blocked" is set during restore.
+- Figure out what is interrupting sql command in console.
+- Make new job print warning User Unmounted Tape.
+- Test recycling and purging (code changed in db_find_next_volume and
+ in recycle.c).
+- Document SDConnectTimeout (in FD).
+- Add restore by filename test.
+- Document restore by files.
+- Make variable expansion work correctly.
+- Implement List Volume Job=xxx or List scheduled volumes or Status Director
+- Copy static programs into install directory.
+- Think about changing Storage resource Device record to be
+ SDDeviceName.
+- Add RunBeforeJob and RunAfterJob to the Client program.
+- Need return status on read_cb() from read_records(). Need multiple
+ records -- one per Job, maybe a JCR or some other structure with
+ a block and a record.
+- LabelFormat on tape volume apparently creates the db record but
+ never actually labels the volume.
+- Recycling a volume when two jobs are using it is going to break. Fixed.
+- Document list nextvol and new format status dir.
+- Client files in Win32 with Unix eol conventions don't work.
+- Either fix or document that fill command in btape can be
+ compressed enormously by the hardware - a 36GB tape wrote 750GB!
+- Add multiple character duration qualifiers.
+- Require some modifier.
+- Restrict characters permitted in a Resource name, and don't permit
+ duplicate names.
+- Figure out some way to ignore or get past checksum errors in
+ reading.
+- The SD spooling file gets created even if it is not used.
+- Look at Cleaning tape in ua_label.c for media create/update
+- Add regression testing to the manual
+- End time: in job output of rescheduled job is time of first run.
+- Document list nextvol and status output.
+- Separate Dir heartbeat in FD from the SD heartbeat.
+- Fix sparse file handling so that it always reads a multiple
+ of 512. Currently, it subtracts 8 bytes (for faddr).
+ Kludged with #ifdef for FreeBSD.
+- Document that Volume pruning can delete last Full backup and
+ hence you will not have a valid backup.
+- Clarify the fact that having the Bacula cygwin1.dll loaded
+ is not the same as having cygwin installed.
+- Document that it is safe to use the drive when the lights stop flashing.
+- Document all the status codes JobLevel, JobType, JobStatus.
+- Add GUI interface to manual
+- Combine the 3 places that search run records for the next
+ job. Use find_job_pool() modified in ua_output.c
+- Test connect timeouts.
+- Fix FreeBSD build with tcp_wrapper -- should not have -lnsl
+