Kern's ToDo List
- 25 August 2006
+ 12 November 2006
Major development:
Project Developer
======= =========
Document:
+- !!! Cannot restore two jobs at the same time that were
+ written simultaneously unless they were totally spooled.
- Document cleaning up the spool files:
db, pid, state, bsr, mail, conmsg, spool
- Document the multiple-drive-changer.txt script.
Priority:
-
-For 1.39:
+- Ensure that each device in an Autochanger has a different
+ Device Index.
+- Add Catalog = to Pool resource so that pools will exist
+ in only one catalog -- currently Pools are "global".
+- Look at sg_logs -a /dev/sg0 for getting soft errors.
+- btape "test" command with Offline on Unmount = yes
+
+ This test is essential to Bacula.
+
+ I'm going to write one record in file 0,
+ two records in file 1,
+ and three records in file 2
+
+ 02-Feb 11:00 btape: ABORTING due to ERROR in dev.c:715
+ dev.c:714 Bad call to rewind. Device "LTO" (/dev/nst0) not open
+ 02-Feb 11:00 btape: Fatal Error because: Bacula interrupted by signal 11: Segmentation violation
+ Kaboom! btape, btape got signal 11. Attempting traceback.
+
+- Why the heck doesn't bacula drop root privileges before connecting to
+ the DB?
+- Ensure that moving a purged Volume in ua_purge.c to the RecyclePool
+ does the right thing.
+- Why doesn't @"xxx abc" work in a conf file?
+- Figure out some way to "automatically" backup conf changes.
+- Look at using posix_fadvise(2) for backups -- see bug #751.
+ Possibly add the code at findlib/bfile.c:795
+- Add the OS version back to the Win32 client info.
+- Restarted jobs have a NULL in the from field.
+- Modify SD status command to indicate when the SD is writing
+ to a DVD (the device is not open -- see bug #732).
+- Look at the possibility of adding "SET NAMES UTF8" for MySQL,
+ and possibly changing the blobs into varchar.
+- Check if gnome-console works with TLS.
+- Ensure that the SD re-reads the Media record if the JobFiles
+ does not match -- it may have been updated by another job.
+- Look at moving the Storage directive from the Job to the
+ Pool in the default conf files.
+- Test FIFO backup/restore -- make regression
+- Doc items
+- Test Volume compatibility between machine architectures
+- Encryption documentation
+- Wrong jobbytes with query 12 (todo)
+- bacula-1.38.2-ssl.patch
+- Bare-metal recovery Windows (todo)
+
+
+Projects:
+- GUI
+ - Admin
+ - Management reports
+ - Add doc for bweb -- especially Installation
+ - Look at Webmin
+ http://www.orangecrate.com/modules.php?name=News&file=article&sid=501
+- Performance
+ - FD-SD quick disconnect
+ - Despool attributes in separate thread
+ - Database speedups
+ - Embedded MySQL
+ - Check why restore repeatedly sends Rechdrs between
+ each data chunk -- according to James Harper 9Jan07.
+ - Building the in memory restore tree is slow.
+- Features
+ - Better scheduling
+ - Full at least once a month, ...
+ - Cancel Inc if Diff/Full running
+ - More intelligent re-run
+ - New/deleted file backup
+ - FD plugins
+ - Incremental backup -- rsync, Stow
+
+
+
+
+For next release:
+- Look at mondo/mindi
+- Don't restore Solaris Door files:
+ #define S_IFDOOR in st_mode.
+ see: http://docs.sun.com/app/docs/doc/816-5173/6mbb8ae23?a=view#indexterm-360
+- Make Bacula by default not backup tmpfs, procfs, sysfs, ...
+- Fix hardlinked immutable files: when linking a second file, the
+ immutable flag must be removed prior to trying to link it.
- Implement Python event for backing up/restoring a file.
- Change dbcheck to tell users to use native tools for fixing
broken databases, and to ensure they have the proper indexes.
- Look at why SIGPIPE during connection can cause seg fault in
writing the daemon message, when Dir dropped to bacula:bacula
- Look at zlib 32 => 64 problems.
-- Try turning on disk seek code.
- Possibly turn on St. Bernard code.
- Fix bextract to restore ACLs, or better yet, use common routines.
- Do we migrate appendable Volumes?
- Remove queue.c code.
-- Some users claim that they must do two prune commands to get a
- Volume marked as purged.
- Print warning message if LANG environment variable does not specify
UTF-8.
- New dot commands from Arno.
.move transfer device=xxx fromslot=yyy toslot=zzz
Low priority:
-- Check to see if jcr->stime is lost during rescheduling of
- jobs in jobq.c
+- Article: http://www.heise.de/open/news/meldung/83231
+- Article: http://www.golem.de/0701/49756.html
+- Article: http://lwn.net/Articles/209809/
+- Article: http://www.onlamp.com/pub/a/onlamp/2004/01/09/bacula.html
+- Article: http://www.linuxdevcenter.com/pub/a/linux/2005/04/07/bacula.html
+- Article: http://www.osreviews.net/reviews/admin/bacula
+- Article: http://www.debianhelp.co.uk/baculaweb.htm
+- Article:
+- It appears to me that you have run into some sort of race
+ condition where two threads want to use the same Volume and they
+ were both given access. Normally that is no problem. However,
+ one thread wanted the particular Volume in drive 0, but it was
+ loaded into drive 1 so it decided to unload it from drive 1 and
+ then load it into drive 0, while the second thread went on
+ thinking that the Volume could be used in drive 1, not realizing
+ that in the meantime it had been loaded in drive 0.
+ I'll look at the code to see if there is some way we can avoid
+ this kind of problem. Probably the best solution is to make the
+ first thread simply start using the Volume in drive 1 rather than
+ transferring it to drive 0.
- Fix re-read of last block to check if job has actually written
a block, and check if block was written by a different job
(i.e. multiple simultaneous jobs writing).
The problem is that it requires m4, which is not present on all machines
at ./configure time.
-- Get Perl replacement for bregex.c
- Given all the problems with FIFOs, I think the solution is to do something a
little different, though I will look at the code and see if there is not some
simple solution (i.e. some bug that was introduced). What might be a better
3905 Device "LTO-Drive1" (/dev/nst0) open but no Bacula volume is mounted.
If this is not a blank tape, try unmounting and remounting the Volume.
-- Add VolumeState (enable, disable, archive)
+- http://www.dwheeler.com/essays/commercial-floss.html
- Add VolumeLock to prevent all but lock holder (SD) from updating
the Volume data (with the exception of VolumeState).
- The btape fill command does not seem to use the Autochanger
- What happens when you rename a Disk Volume?
- Job retention period in a Pool (and hence Volume). The job would
then be migrated.
-- Detect resource deadlock in Migrate when same job wants to read
- and write the same device.
-- Queue warning/error messages during restore so that they
- are reported at the end of the report rather than being
- hidden in the file listing ...
- Look at -D_FORTIFY_SOURCE=2
- Add Win32 FileSet definition somewhere
- Look at fixing restore status stats in SD.
-- Make selection of Database used in restore correspond to
- client.
- Look at using ioctl(FIBMAP) and FIGETBSZ for sparse files.
http://www.informatik.uni-frankfurt.de/~loizides/reiserfs/fibmap.html
- Implement a mode that says when a hard read error is
("F","Full"),
("D","Diff"),
("I","Inc");
-- Add ACL to restore only to original location.
- Show files/second in client status output.
- Add a recursive mark command (rmark) to restore.
- "Minimum Job Interval = nnn" sets minimum interval between Jobs
block numbers in btape "test". Possibly adjust in Bacula.
- Fix list volumes to output volume retention in some other
units, perhaps via a directive.
-- If opening a tape in read/write mode fails attempt to open
- it in read-only mode, and mark the tape for read only.
- Allow Simultaneous Priorities = yes => run up to Max concurrent jobs even
with multiple priorities.
- If you use restore replace=never, the directory attributes for
- see lzma401.zip in others directory for new compression
algorithm/library.
-- Minimal autochanger handling in Bacula and in btape.
-- Look into how tar does not save sockets and the possibility of
- not saving them in Bacula (Martin Simmons reported this).
-- Fix restore jobs so that multiple jobs can run if they
- are not using the same tape(s).
- Allow the user to select JobType for manual pruning/purging.
- bscan does not put first of two volumes back with all info in
bscan-test.
are not restored. See bug 213. To fix this requires creating a
list of newly restored directories so that those directory
permissions *can* be restored.
-- Compaction of Disk space by "migrating" Volumes that have pruned
- Jobs (what criteria? size, #jobs, time).
- Add prune all command
- Document fact that purge can destroy a part of a restore by purging
one volume while others remain valid -- perhaps mark Jobs.
- Add tree pane to left of window.
- Add progress meter.
- Max wait time or max run time causes seg fault -- see runtime-bug.txt
-- Document writing to a CD/DVD with Bacula.
-- Add a "base" package to the window installer for pthreadsVCE.dll
- which is needed by all packages.
- Add message to user to check for fixed block size when the forward
space test fails in btape.
- When unmarking a directory check if all files below are unmarked and
- Setup lrrd graphs: (http://www.linpro.no/projects/lrrd/) Mike Acar.
- Revisit the question of multiple Volumes (disk) on a single device.
- Add a block copy option to bcopy.
-- Investigate adding Mac Resource Forks.
- Finish work on Gnome restore GUI.
- Fix "llist jobid=xx" where no fileset or client exists.
- For each job type (Admin, Restore, ...) require only the really necessary
to start a job or pass its DHCP obtained IP number.
- Implement a query tape prompt/replace feature for a console
- Copy console @ code to gnome2-console
-- Make AES the only encryption algorithm (see
- http://csrc.nist.gov/CryptoToolkit/aes/). It's
- an officially adopted standard, has survived peer
- review, and provides keys up to 256 bits.
-- Take a careful look at SetACL http://setacl.sourceforge.net
- Make tree walk routines like cd, ls, ... more user friendly
by handling spaces better.
- Make sure that Bacula rechecks the tape after the 20 min wait.
in the "short" pool to the "long" pool if this pool runs out of volume
space?
- What to do about "list files job=xxx".
-- Get and test MySQL 4.0
- Look at how fuser works and /proc/PID/fd that is how Nic found the
file descriptor leak in Bacula.
- Implement WrapCounters in Counters.
run the job but don't save the files.
- Make things like list where a file is saved case independent for
Windows.
-- Implement migrate
- Use autochanger to handle multiple devices.
-- On Windows with very long path names, it may be impossible to create
- a file (and thus restore it) because the total length is too long.
- We must cd into the directory then create the file without the
- full path name.
- Implement a Recycle command
-- Test a second language e.g. french.
- Start working on Base jobs.
- Implement UnsavedFiles DB record.
- From Phil Stracchino:
- If SD cannot open a drive, make it periodically retry.
- Add more of the config info to the tape label.
-- If tape is marked read-only, then try opening it read-only rather than
- failing, and remember that it cannot be written.
- Refine SD waiting output:
Device is being positioned
> Device is being positioned for append
- Compare tape to Client files (attributes, or attributes and data)
- Make all database Ids 64 bit.
- Allow console commands to detach or run in background.
-- Fix status delay on storage daemon during rewind.
- Add SD message variables to control operator wait time
- Maximum Operator Wait
- Minimum Message Interval
Migration: Move a backup from one Volume to another
Clone: Copy a backup -- two Volumes
-Bacula Migration is based on Jobs (apparently Networker is file by file).
-
-Migration triggered by:
- Number of Jobs
- Number of Volumes
- Age of Jobs
- Highwater mark (keep total size)
- Lowwater mark
-
-
======================================================
Base Jobs design
- Restore of a raw drive should not try to check the volume size.
- Lock tape drive door when open()
- Make release unload any autochanger.
+- Arno's reservation deadlock.
+- Eric's SD patch
+- Make sure the new level=Full syntax is used in all
+ example conf files (especially in the manual).
+- Fix prog copyright (SD) all other files.
+- Document need for UTF-8 format
+- Try turning on disk seek code.
+- Some users claim that they must do two prune commands to get a
+ Volume marked as purged.
+- Document fact that CatalogACL now needed for Tray monitor (fixed).
+- If you have two Catalogs, it will take the first one.
+- Migration Volume span bug
+- Rescue release
+- Bug reports