Kern's ToDo List
- 29 December 2002
+ 11 May 2003
-Documentation to do: (a little bit at a time)
+Documentation to do: (any release a little bit at a time)
- Document running a test version.
-- Make sure restore options are documented
- Document query file format.
- Document static linking
+- Document problems with Verify and pruning.
+- Document how to use multiple databases.
+- Add a section to the doc on Manual cycling of Volumes.
+
Testing to do: (painful)
-- that console command line options work
+- that ALL console command line options work and are always implemented
- blocksize recognition code.
+- multiple simultaneous Volumes
+
+- Figure out how to use ssh or stunnel to protect Bacula communications.
+
+
+For 1.31 release:
+- Merge SQLite, MySQL, and Rel spec into a single file.
+- Implement "Reschedule OnError=yes interval=nnn times=xxx"
+- Fix config of "console"
+- Shell character expansion is failing occasionally.
+- One block was orphaned in the SD probably after cancel.
+- Test if rewind at end of tape waits for tape to rewind.
+- Check if cancel works with FD.
+- Error labeling tape from console gets Jmsg error because of no Job.
+- Fix the following:
+ rufus-dir: Max configured use duration exceeded. Marking Volume "MatouBackup" as Used.
+ rufus-sd: Volume "" previously written, moving to end of data.
+ rufus-sd: Matou.2003-05-10_10.39.18 Error: I canot write on this volume because:
+ The number of files mismatch! Volume=1 Catalog=0
+ rufus-sd: Matou.2003-05-10_10.39.18 Error: askdir.c:155 NULL Volume name. This shouldn't happen!!!
+- Properly configure console and gconsole (currently for source not
+ configured for installation).
+- Fix "access not allowed" for backup of files on WinXP.
+- Check for existence of all new Win32 API's. See LoadLibrary in
+ winservice.cpp
+- Add Progress command that periodically reports the progress of
+ a job or all jobs.
+- Fix problem reported by Christopher McCurdy <cmccurdy@eecis.udel.edu>
+ xeon-fd: Could not stat c:/Documents and Settings/All
+ Users/Application Data/Humc:\Documents and Settings\All User98_AIX.kbf:
+ ERR=No such file or directory
+- Implement argv/argk in place of sscanf in the daemon protocol.
+- Examine Bare Metal restore problem.
+- Test multiple simultaneous Volumes
+- Document FInclude ...
+- Test and implement get_pint and get_yesno.
+- Implement timeout in response() when it should come quickly.
+- Check if Job/File retentions apply to multivolume jobs.
+- Remove subsysdir from conf files (used only in autostart scripts).
+- Implement console @echo command.
+- Implement global with DB name and add to btraceback.gdb
+- Bug: fix access problems on files restored on WinXP.
+- Implement a Slot priority (loaded/not loaded).
+- Implement "vacation" Incremental only saves.
+- Implement single pane restore (much like the Gftp panes).
+- Implement Automatic Mount even in operator wait.
+- Implement create "FileSet"?
+- Implement Release Device in the Job resource to unmount a drive.
+- Implement Acquire Device in the Job resource to mount a drive,
+ be sure this works with admin jobs so that the user can get
+ prompted to insert the correct tape. Possibly some way to say to
+ run the job but don't save the files.
+- Implement all command line args on run.
+- Implement command line "restore" args.
+- Implement "restore current select=no"
+- Fix watchdog pthread crash on Win32 (this is pthread_kill() Cygwin bug)
+- Implement "scratch pool" where tapes are defined and can be
+ taken by any pool that needs them.
+- Implement restore "current system", but take all files without
+ doing selection tree -- so that jobs without File records can
+ be restored.
+- Make | and < work on FD side.
+- Pass prefix_links to FD.
+- Implement a M_SECURITY message class.
+- Implement disk spooling. Two parts: 1. Spool to disk then
+ immediately to tape to speed up tape operations. 2. Spool to
+ disk only when the tape is full, then when a tape is hung move
+ it to tape.
+- From Phil Stracchino:
+ It would probably be a per-client option, and would be called
+ something like, say, "Automatically purge obsoleted jobs". What it
+ would do is, when you successfully complete a Differential backup of a
+ client, it would automatically purge all Incremental backups for that
+ client that are rendered redundant by that Differential. Likewise,
+ when a Full backup on a client completed, it would automatically purge
+ all Differential and Incremental jobs obsoleted by that Full backup.
+ This would let people minimize the number of tapes they're keeping on
+ hand without having to master the art of retention times.
-For 1.28 release:
-- Look at ua_prune.c in detail. Why did JobType work at all??????
-- Figure out how to allow multiple simultaneous file Volumes on
- a single device.
-
-For 1.29 release:
-- Enable avoid backing up archive device (findlib/find_one.c:128)
+- Allow multiple Storage specifications (or multiple names on
+ a single Storage specification) in the Job record. Thus a job
+ can be backed up to a number of storage devices.
+- Implement dump/print label to UA
+- Add prefixlinks to where or not where absolute links to FD.
+- Look at Python for a Bacula scripting language -- www.python.org
+- Issue message to mount a new tape before the rewind.
+- Simplified client job initiation for portables.
+- If SD cannot open a drive, make it periodically retry.
+- Implement LabelTemplate (at least first cut).
+- Add more of the config info to the tape label.
+- Implement a Mount Command and an Unmount Command where
+ the users could specify a system command to be performed
+ to do the mount, after which Bacula could attempt to
+ read the device. This is for Removable media such as a CDROM.
+ - Most likely, this mount command would be invoked explicitly
+ by the user using the current Console "mount" and "unmount"
+ commands -- the Storage Daemon would do the right thing
+ depending on the exact nature of the device.
+ - As with tape drives, when Bacula wanted a new removable
+ disk mounted, it would unmount the old one, and send a message
+ to the user, who would then use "mount" as described above
+ once he had actually inserted the disk.
+
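The automatic purge suggested by Phil Stracchino earlier in this list could be sketched roughly as follows. This is only an illustration in Python, not Bacula code: the `Job` record, the single-letter level codes, and the rule "purge every earlier job of a lower level for the same client" are assumptions made for the example.

```python
from typing import NamedTuple

class Job(NamedTuple):
    # Hypothetical job record; field names are illustrative only.
    job_id: int
    client: str
    level: str      # assumed codes: "F" Full, "D" Differential, "I" Incremental
    end_time: int   # completion time, any monotonically increasing value

def obsoleted_by(completed, jobs):
    """Return ids of earlier jobs for the same client made redundant by `completed`."""
    if completed.level == "F":
        purge_levels = {"D", "I"}   # a Full obsoletes Differentials and Incrementals
    elif completed.level == "D":
        purge_levels = {"I"}        # a Differential obsoletes Incrementals
    else:
        return []                   # an Incremental obsoletes nothing
    return [
        j.job_id
        for j in jobs
        if j.client == completed.client
        and j.level in purge_levels
        and j.end_time < completed.end_time
    ]
```

The function only computes the candidate list; actually purging the jobs would only happen after the new backup terminated successfully, as the proposal requires.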
+- Make some way so that if a machine is skipped because it is not up
+ that Bacula will continue retrying for a specified period of time --
+ periodically.
+- If tape is marked read-only, then try opening it read-only rather than
+ failing, and remember that it cannot be written.
+- Refine SD waiting output:
+ Device is being positioned
+ > Device is being positioned for append
+ > Device is being positioned to file x
+ >
+- Figure out some way to estimate output size and to avoid splitting
+ a backup across two Volumes -- this could be useful for writing CDROMs
+ where you really prefer not to have it split -- not serious.
+- Add RunBeforeJob and RunAfterJob to the Client program.
+- Have SD compute MD5 or SHA1 and compare to what FD computes.
+- Make VolumeToCatalog calculate an MD5 or SHA1 from the
+ actual data on the Volume and compare it.
- Implement FileOptions (see end of this document)
- Implement Bacula plugins -- design API
-- Make hash table for linked files in findlib/find_one.c:161
- Make bcopy read through bad tape records.
+- Fix read_record to handle multiple sessions.
+- Program files (i.e. execute a program to read/write files).
+ Pass read date of last backup, size of file last time.
+- Add Signature type to File DB record.
+- Make Restore report an error if FD or SD term codes are not OK.
+- CD into subdirectory when open()ing files for backup to
+ speed up things. Test with testfind().
+- Priority job to go to top of list.
+- Find out why Full saves run slower and slower (hashing?)
+- Why are save/restore of device different sizes (sparse?) Yup! Fix it.
+- Implement some way for the Console to dynamically create a job.
+- Restore to a particular time -- e.g. before date, after date.
+- Solaris -I on tar for include list
+- Prohibit backing up archive device (findlib/find_one.c:128)
- Need a verbose mode in restore, perhaps to bsr.
-- Should we dump a SOS when starting a new tape?
- bscan without -v is too quiet -- perhaps show jobs.
- Add code to reject whole blocks if not wanted on restore.
-- Implement multiple simultaneous file Volumes on a single device.
- Start working on Base jobs.
+- Check if we can increase Bacula FD priority in Win2000
- Make sure the MaxVolFiles is fully implemented in SD
-- Flush all the daemon messages at the end of every job.
- Check if both CatalogFiles and UseCatalog are set to SD.
-- Check if we can increase Bacula FD priorty in Win2000
- Need return status on read_cb() from read_records(). Need multiple
records -- one per Job, maybe a JCR or some other structure with
a block and a record.
- Figure out how to do a bare metal Windows restore
-- Fix read_record to handle multiple sessions.
-- Program files (i.e. execute a program to read/write files).
- Pass read date of last backup, size of file last time.
- Put system type returned by FD into catalog.
- Possibly add email to Watchdog if drive is unmounted too
long and a job is waiting on the drive.
-- Strip trailing slashes from Include directory names in the FD.
- Use read_record.c in SD code.
- Why don't we get an error message from Win32 FD when bootstrap
file cannot be created for restore command?
-- Need to specify MaximumConcurrentJobs in the Job resource.
- When Marking a file in Restore that is a hard link, also
mark the link so that the data will be reloaded.
- Restore program that errors in SD due to no tape reports
OK incorrectly in output.
- After unmount, if restore job started, ask to mount.
-- Fix db_get_fileset in cats/sql_get.c for multiple records.
-- Fix catalog filename truncation in sql_get and sql_create. Use
- only a single filename split routine.
-- Make Restore report an error if FD or SD term codes are not OK.
- Convert all %x substitution variables, which are hard to remember
and read to %(variable-name). Idea from TMDA.
- Add JobLevel in FD status (but make sure it is defined).
- Make Pool resource handle Counter resources.
- Remove NextId for SQLite. Optimize.
-- Fix gethostbyname() to use gethostbyname_r()
-- Strip trailing / from Include
- Move all SQL statements into a single location.
-- Cleanup db_update_media and db_update_pool
- Add UA rc and history files.
- put termcap (used by console) in ./configure and
allow -with-termcap-dir.
- Fix Autoprune for Volumes to respect need for full save.
- Fix Win32 config file definition name on /install
- No READLINE_SRC if found in alternate directory.
-- Add Client FS/OS id (Linux, Win95/98, ...).
- Test a second language e.g. french.
- Compare tape to Client files (attributes, or attributes and data)
-- Restore to a particular time -- e.g. before date, after date.
- Make all database Ids 64 bit.
- Write an applet for Linux.
- Add estimate to Console commands
- Fix Win2000 error with no messages during startup.
- Events : tape has more than xxx bytes.
- Restrict characters permitted in a Resource name.
-- Complete code in Bacula Resources -- this will permit
+- Complete code in Bacula Resources -- this will permit
reading a new config file at any time.
- Handle ctl-c in Console
-- Implement LabelTemplate (at least first cut).
- Implement script driven addition of File daemon to config files.
- Think about how to make Bacula work better with File (non-tape) archives.
-
-- see setgroup and user for Bacula p4-5 of stunnel.c
+- Write Unix emulator for Windows.
- Implement new serialize subroutines
send(socket, "string", &Vol, "uint32", &i, NULL)
- Audit all UA commands to ensure that we always prompt where possible.
- Put memory utilization in Status output of each daemon
if full status requested or if some level of debug on.
- Make database type selectable by .conf files i.e. at runtime
-- gethostbyname failure in bnet_connect() continues
- generating errors -- should stop.
-- Add HOST to Volume label.
- Set flag for uname -a. Add to Volume label.
- Implement throttled work queue.
- Check for EOT at ENOSPC or EIO or ENXIO (unix Pc)
-- Allow multiple Storage specifications (or multiple names on
- a single Storage specification) in the Job record. Thus a job
- can be backed up to a number of storage devices.
-- Implement dump label to UA
-- Concept of VolumeSet during restore which is a list
- of Volume names needed.
- Restore files modified after date
- Restore file modified before date
- Emergency restore info:
- Implement Restore FileSet=
- Create a protocol.h and protocol.c where all protocol messages
are concentrated.
-- If SD cannot open a drive, make it periodically retry.
- Remove duplicate fields from jcr (e.g. jcr.level and jcr.jr.Level, ...).
- Timeout a job or terminate if link goes down, or reopen link and query.
- Find general solution for sscanf size problems (as well
- Make bcopy copy with a single tape drive.
- Permit changing ownership during restore.
-- Restore should get Device and Pool information from
- job record rather than from config.
- Autolabel should be specified by DIR instead of SD.
- Find out how to get the system tape block limits, e.g.:
Apr 22 21:22:10 polymatou kernel: st1: Block limits 1 - 245760 bytes.
=====
- FD sends unsaved file list to Director at end of job (see
RFC below).
+- File daemon should build list of files skipped, and then
+ at end of save retry and report any errors.
- Write a Storage daemon that uses pipes and
standard Unix programs to write to the tape.
See afbackup.
if tape is not valid.
- Verify from Volume
- Ensure that /dev/null works
-- File daemon should build list of files skipped, and then
- at end of save retry and report any errors.
- Need report class for messages. Perhaps
report resource where report=group of messages
- enhance scan_attrib and rename scan_jobtype, and
fill in code for "since" option
-- Need to save contents of FileSet to tape?
- Director needs a time after which the report status is sent
anyway -- or better yet, a retry time for the job.
Don't reschedule a job if previous incarnation is still running.
-- Figure out how to save the catalog (possibly a special FileSet).
-- Figure out how to restore the catalog.
- Some way to automatically backup everything is needed????
- Need a structure for pending actions:
- buffered messages
- termination status (part of buffered msgs?)
-- Concept of grouping Storage devices and job can use
- any of a number of devices
- Drive management
Read, Write, Clean, Delete
- Login to Bacula; Bacula users with different permissions:
This could be the output of df; or perhaps some sort of /etc/mtab record.
Longer term to do:
-- Design at hierarchial storage for Bacula.
+- Design hierarchical storage for Bacula. Migration and Clone.
- Implement FSM (File System Modules).
- Identify unchanged or "system" files and save them to a
special tape thus removing them from the standard
backup FileSet -- BASE backup.
-- Turn virutally all sprintfs into snprintfs.
-- Heartbeat between daemons.
- Audit M_ error codes to ensure they are correct and consistent.
- Add variable break characters to lex analyzer.
Either a bit mask or a string of chars so that
is NOT currently the case). Must detect socket error,
buffer messages for later.
- Enhance time/duration input to allow multiple qualifiers e.g. 3d2h
-
+- Add ability to backup to two Storage devices (two SD sessions) at
+ the same time -- e.g. onsite, offsite.
+- Add the ability to consolidate old backup sets (basically do a restore
+ to tape and appropriately update the catalog). Compress Volume sets.
+ Might need to spool via file if only one drive is available.
+- Compress or consolidate Volumes of old possibly deleted files. Perhaps
+ some way to do so with every volume that has less than x% valid
+ files.
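
The "multiple qualifiers e.g. 3d2h" item in this list can be illustrated with a small parser. This is a minimal sketch in Python, not the Bacula implementation; the unit letters in `UNIT_SECONDS` are an assumption chosen for the example and may not match the modifiers Bacula actually accepts.

```python
import re

# Assumed unit table (seconds per unit); illustrative only.
UNIT_SECONDS = {"s": 1, "h": 3600, "d": 86400, "w": 604800}

# One qualifier: a run of digits followed by a single unit letter.
_TOKEN = re.compile(r"(\d+)([shdw])")

def parse_duration(text):
    """Return the total number of seconds for input such as '3d2h'."""
    pos = 0
    total = 0
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"bad duration at offset {pos}: {text!r}")
        total += int(m.group(1)) * UNIT_SECONDS[m.group(2)]
        pos = m.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total
```

With this table, `parse_duration("3d2h")` adds three days and two hours; any text that is not a sequence of number and unit pairs is rejected rather than silently truncated.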
+
+
+Migration: Move a backup from one Volume to another
+Clone: Copy a backup -- two Volumes
+
+Bacula Migration is based on Jobs (apparently Networker is file by file).
+
+Migration triggered by:
+ Number of Jobs
+ Number of Volumes
+ Age of Jobs
+ Highwater mark (keep total size)
+ Lowwater mark
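
One possible reading of the trigger list above, sketched in Python. Everything here is hypothetical: the `PoolState` and `MigrationLimits` names, their fields, and the interpretation that the Highwater mark starts a migration while the Lowwater mark is the size to migrate down to.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PoolState:
    # Hypothetical snapshot of what is currently stored; illustrative only.
    num_jobs: int
    num_volumes: int
    oldest_job_age_days: float
    total_bytes: int

@dataclass
class MigrationLimits:
    # Hypothetical limits; None disables the corresponding trigger.
    max_jobs: Optional[int] = None
    max_volumes: Optional[int] = None
    max_job_age_days: Optional[float] = None
    highwater_bytes: Optional[int] = None
    lowwater_bytes: Optional[int] = None

def migration_reasons(state, limits):
    """Return the names of all triggers that currently fire."""
    checks = [
        ("Number of Jobs", state.num_jobs, limits.max_jobs),
        ("Number of Volumes", state.num_volumes, limits.max_volumes),
        ("Age of Jobs", state.oldest_job_age_days, limits.max_job_age_days),
        ("Highwater mark", state.total_bytes, limits.highwater_bytes),
    ]
    return [name for name, value, limit in checks
            if limit is not None and value > limit]

def bytes_to_migrate(state, limits):
    """Assumed Lowwater semantics: once over Highwater, migrate down to Lowwater."""
    high = limits.highwater_bytes
    if high is None or state.total_bytes <= high:
        return 0
    low = limits.lowwater_bytes if limits.lowwater_bytes is not None else high
    return state.total_bytes - low
```

Since migration is based on Jobs, a real implementation would then pick whole Jobs until at least `bytes_to_migrate` bytes have been moved.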
Projects:
Bacula Projects Roadmap
17 August 2002
- last update 27 November 2002
+ last update 8 May 2003
Item 1: Multiple simultaneous Jobs. (done)
-Done
+Done -- Restore part needs better implementation to work correctly
+ Also, it needs considerable testing
What: Permit multiple simultaneous jobs in Bacula.
Item 6: Write a regression script.
-Started
+Done -- Continue to expand its testing.
What: This is an automatic script that runs and tests as many features
of Bacula as possible. The output is compared to previous
Item 9: Add SSL to daemon communications.
+In progress as of version 1.31.
What: This provides for secure communications between the daemons.
VolSessionId and VolSessionTime.
=========================================================
-==========================================================
- Unsaved File design
-For each Incremental job that is run, there may be files that
-were found but not saved because they were locked (this applies
-only to Windows). Such a system could send back to the Director
-a list of Unsaved files.
-Need:
-- New UnSavedFiles table that contains:
- JobId
- PathId
- FilenameId
-- Then in the next Incremental job, the list of Unsaved Files will be
- feed to the FD, who will ensure that they are explicitly chosen even
- if standard date/time check would not have selected them.
-=============================================================
-
=============================================================
After implementing the above, the user will be able to specify
on a file by file basis (using regular expressions) what options are
applied for the backup.
-====================================
+
+
+=============================================
+
+==========================================================
+ Unsaved File design
+For each Incremental job that is run, there may be files that
+were found but not saved because they were locked (this applies
+only to Windows). Such a system could send back to the Director
+a list of Unsaved files.
+Need:
+- New UnSavedFiles table that contains:
+ JobId
+ PathId
+ FilenameId
+- Then in the next Incremental job, the list of Unsaved Files will be
+ fed to the FD, who will ensure that they are explicitly chosen even
+ if standard date/time check would not have selected them.
+=============================================================
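
The selection rule in the Unsaved File design above can be sketched as follows. This is a hedged illustration in Python, not Bacula code: only the three column names come from the design; the tuple layout of `candidates` and the function name are assumptions, and a real FD would presumably match on path names rather than catalog ids.

```python
from typing import NamedTuple

class UnsavedFile(NamedTuple):
    # Columns of the proposed UnSavedFiles table.
    JobId: int
    PathId: int
    FilenameId: int

def select_for_incremental(candidates, since, unsaved):
    """Pick files for the next Incremental.

    candidates: iterable of (PathId, FilenameId, mtime) tuples (assumed layout).
    since:      time of the previous backup; the standard date/time check.
    unsaved:    UnsavedFile rows recorded by the previous job.
    """
    forced = {(u.PathId, u.FilenameId) for u in unsaved}
    return [
        (path_id, filename_id)
        for (path_id, filename_id, mtime) in candidates
        # Chosen if modified since the last backup, or explicitly listed as unsaved.
        if mtime > since or (path_id, filename_id) in forced
    ]
```

A file that was locked during the previous job is therefore picked up even though its modification time alone would not have selected it.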
+
Done: (see kernsdone for more)
-- Add EOM records? No, not at this time. The current system works and
- above all is simple.
-- Add VolumeUseDuration and MaximumVolumeJobs to Pool db record and
- to Media db record.
-- Add VOLUME_CAT_INFO to the EOS tape record (as
- well as to the EOD record). -- No, not at this time.
-- Put MaximumVolumeSize in Director (MaximumVolumeJobs, MaximumVolumeFiles,
- MaximumFileSize).
-- Enhance schedule to have 1stSat, ...
-- Make sure catalog doesn't keep growing.
-- On I/O error, write EOF, then try to write again ? No, keep it simple.
-- Figure out how compress everything except .gz,... files.
- Implement FileOptions.
-- Put Bacula version somewhere in Job stream, probably Start Session Labels.
-- Fix start/end blocks for File devices
-- Make Job err if WriteBootstrap fails.
-- Test that mod of restore options works.
-- Test that week position schedule code works.
-- Make BSR accept count (total files to be restored).
-- Add code to fast seek to proper place on tape/file when doing Restore.
-- Replace popen() and pclose() -- fail safe and timeout, no SIG dep.
-- Add code to put VolFile in bsr for restore command.
-- Volumes can be listed multiple times in Restore volume list.
-- Add watchdog timeout for child processes start_child_timer()
- end_child_timer();
-- Get rid of bscan.c:534 error message (one time only).
-- Print some statistics when get EOF on device in bscan -- feedback
- to let user know it is working.
-- DateWritten field on tape may be wrong.
-- Ensure that restore of differential jobs works (check SQL).
-- Count number of ignored messages in bscan and print when first SOS is found.
-- Test that EndFile/Block are correctly updated at end of tape
- (in view of new block reading code).
-- Test watchdog child timer code.
-- Test new BSR code (mostly done).
-- Work more on how to to a Bacula restore beginning with
- just a Bacula tape and a boot floppy (bare metal recovery).
-- Restore options (overwrite, overwrite if older,
- overwrite if newer, never overwrite, ...)
-- Fill all fields in Vol/Job Header -- ensure that everything
- needed is written to tape. Think about restore to Catalog
- from tape. Client record needs improving.
-- Implement ./configure --with-client-only
-- Finish up static linking
-- that restore options work in FD
-
+- Heartbeat between daemons.
+- Fix Dir heartbeat in restore and verify vol. Be sure to make
+ bnet_recv() ignore BNET_HEARTBEAT.
+- Implement HEART_BEAT while SD waiting for tapes.
+- Include RunBeforeJob and RunAfterJob output in the message
+ stream.
+- Change M_INFO to M_RESTORED for all restored files.
+- Fix command prompt in gnome-console by checking on Ready.