Kern's ToDo List
- 26 October 2004
+ 28 April 2005
Major development:
Project Developer
======= =========
-IPv6_2 Meno Abels
-Data encryption Meno Abels (see projects)
-Communication encryption Meno Abels
-Version 1.35 Kern (see below)
+TLS Landon Fuller
+Unicode in Win32 Thorsten Engel
+VSS Thorsten Engel (under consideration)
+Version 1.37 Kern (see below)
========================================================
-1.37 Items to do for release:
-
+1.37 Major Projects:
+#3 Migration (Move, Copy, Archive Jobs)
+ (probably not this version)
+#7 Single Job Writing to Multiple Storage Devices
+ (probably not this version)
-1.37 Items:
-- Tell the "restore" user when browsing is no longer possible.
-- Write non-optimized bsrs from the JobMedia and Media records,
- even after Files are pruned.
+## Integrate web-bacula into a new Bacula project with
+ bimagemgr.
+## Consider moving docs to their own project.
+## Move rescue to its own project.
+## Create a new GUI chapter explaining all the GUI programs.
+
+Autochangers:
+- 3. Prevent two drives requesting the same Volume in any given
+ autochanger, by checking if a Volume is mounted on another drive
+ in an Autochanger.
+- 7. Implement new Console commands to allow offlining/reserving drives,
+ and possibly manipulating the autochanger (much asked for).
+- Make "update slots", when pointing to an Autochanger, remove
+ all Volumes from other drives. "update slots all-drives"?
+
+Document:
+- Pruning with Admin job.
+- Add better documentation on how restores can be done
+- OS linux 2.4
+ 1) ADIC, DLT, FastStor 4000, 7*20GB
+ 2) Sun, DDS, (Sun's name unknown - Archive Python DDS drive), 1.2GB
+ 3) Wangtek, QIC, 6525ES, 525MB (fixed block size 1k, block size etc.
+ driver dependent - aic7xxx works, ncr53c8xx with problems)
+ 4) HP, DDS-2, C1553A, 6*4GB
+- Doc the following
+ To activate, check, or disable the hardware compression feature on my
+ exb-8900 I use the Exabyte "MammothTool", which you can get here:
+ http://www.exabyte.com/support/online/downloads/index.cfm
+ There is a Solaris version of this tool. With option -C 0 or 1 you can
+ disable or activate compression. Start this tool without any options for
+ a small reference.
+- Linux Sony LIB-D81, AIT-3 library works.
+- Document PostgreSQL performance problems bug 131.
+- Document testing
+- Document that ChangerDevice is used for Alert command.
+
+For 1.37:
+- --without-openssl breaks at least on Solaris.
+- Python:
+ - Make a callback when Rerun failed levels is called.
+ - Give Python program access to Scheduled jobs.
+ - Python script to save with Python, not save, save with Bacula.
+ - Python script to do backup.
+ - What events?
+ - Change the Priority, Client, Storage, JobStatus (error)
+ at the start of a job.
+ - Make sure that Python has access to Client address/port so that
+ it can check if Clients are alive.
+
+- Implement "NewVolumeEachJob = yes|no" in Dir.
+- Implement Maximum Job Spool Size
+- Remove all old Device resource code in Dir and code to pass it
+ back in SD -- better, rework it to pass back device statistics.
+- Check locking of resources -- be sure to lock devices where previously
+ resources were locked.
+- Add global lock on all devices when creating a device structure.
+- Fix the Rescue CDROM.
+
+Maybe in 1.37:
+- Print more info when bextract -p accepts a bad block.
+- To mark files as deleted, run essentially a Verify to disk, and
+ when a file is found missing (MarkId != JobId), then create
+ a new File record with FileIndex == -1. This could be done
+ by the FD at the same time as the backup.
+- Fix FD JobType to be set before RunBeforeJob in FD.
+- Look at adding full Volume and Pool information to a Volume
+ label so that bscan can get *all* the info.
+- If the user puts "Purge Oldest Volume = yes" or "Recycle Oldest Volume = yes"
+ and there is only one volume in the pool, refuse to do it -- otherwise
+ he fills the Volume, then immediately starts reusing it.
+- Implement copies and stripes.
+- Add history file to console.
+- Each file on tape creates a JobMedia record. Peter has 4 million
+ files spread over 10000 tape files and four tapes. A restore takes
+ 16 hours to build the restore list.
+- By the way: on page http://www.bacula.org/?page=tapedrives , at the
+ bottom, the link to "Tape Testing Chapter" is broken. It goes to
+ /html-manual/... while the others point to /rel-manual/...
+- Device resource needs the "name" of the SD.
+- Add an option to see if the file size changed during backup.
+- Make sure SD deletes spool files on error exit.
+- Delete old spool files when SD starts.
+- When labeling tapes, if you enter 000026, Bacula uses
+ the tape index rather than the Volume name 000026.
+- Max Vols limit in Pool off by one?
+- Require restore via the restore command or make a restore Job
+ get the bootstrap file.
+- Make bootstrap file handle multiple MediaTypes (SD)
+- Add offline tape command to Bacula console.
- Document that Bootstrap files can be written with cataloging
turned off.
-- Add dump of VolSessionId/Time and FileIndex with bls.
-- Look at adding full Volume and Pool information to a Volume.
+- Upgrade to MySQL 4.1.1 See:
+ http://dev.mysql.com/doc/mysql/en/Server_SQL_mode.html
+- Add client version to the Client name line that prints in
+ the Job report.
+- Bug:
+ Enter MediaId or Volume name: 32
+ Enter new Volume name: DLT-20Dec04
+ Automatically selected Pool: Default
+ Connecting to Storage daemon DLTDrive at 192.168.68.104:9103 ...
+ Sending relabel command from "DLT-28Jun03" to "DLT-20Dec04" ...
+ block.c:552 Write error at 0:0 on device /dev/nst0. ERR=Bad file descriptor.
+ Error writing final EOF to tape. This tape may not be readable.
+ dev.c:1207 ioctl MTWEOF error on /dev/nst0. ERR=Permission denied.
+ askdir.c:219 NULL Volume name. This shouldn't happen!!!
+ 3912 Failed to label Volume: ERR=dev.c:1207 ioctl MTWEOF error on /dev/nst0. ERR=Permission denied.
+ Label command failed for Volume DLT-20Dec04.
+ Do not forget to mount the drive!!!
+- Bug: if a job is manually scheduled to run later, it does not appear
+ in any status report and cannot be cancelled.
+
+Regression tests (Scott):
+- Add Pool/Storage override regression test.
+- Add delete JobId to regression.
+- Add a regression test for dbcheck.
+- New test to add bscan to four-concurrent-jobs regression,
+ i.e. after the four-concurrent jobs zap the
+ database as is done in the bscan-test, then use bscan to
+ restore the database, do a restore and compare with the
+ original.
+- Add restore of specific JobId to regression (item 3
+ on the restore prompt)
+- Add IPv6 to regression
+- Add database test to regression. Test each function like delete,
+ purge, ...
+
+- AntiVir can slow down backups on Win32 systems.
+- Win32 systems with FAT32 can be much slower than NTFS for
+ more than 1000 files per directory.
+
+
+1.37 Possibilities:
+- A HOLD command to stop all jobs from starting.
+- A PAUSE command to pause all running jobs ==> release the
+ drive.
+- Media Type = LTO,LTO-2,LTO-3
+ Media Type Read = LTO,LTO2,LTO3
+ Media Type Write = LTO2, LTO3
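One possible reading of the three directives above is that a drive lists the media types it can read separately from the ones it can write. The Python sketch below illustrates that reading only. The names `parse_types` and `can_use` and the dictionary layout are hypothetical and are not part of Bacula.

```python
def parse_types(value):
    """Split a comma-separated directive value into a set of media type names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def can_use(drive, volume_type, writing):
    """Return True if the drive may read (or write, if writing=True) the volume type."""
    allowed = drive["write"] if writing else drive["read"]
    return volume_type in allowed


# Values copied from the proposed directives; the dict keys are an assumption.
drive = {
    "read": parse_types("LTO,LTO2,LTO3"),
    "write": parse_types("LTO2, LTO3"),
}

print(can_use(drive, "LTO", writing=False))   # an LTO volume can be read
print(can_use(drive, "LTO", writing=True))    # but not written
```

Under this reading, a restore could be sent to any drive whose read list contains the Volume's media type, while a backup would need a drive whose write list contains it.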
+
+=== From Carsten Menke <bootsy52@gmx.net>
+
+The following is a list of what I think, in the situations I am faced with,
+could be useful enhancements to bacula, which I'm certain other users will
+benefit from as well.
+
+1. NextJob/NextJobs Directive within a Job Resource in the form of
+ NextJobs = job1,job2.
+
+ Why:
+ I currently solve the problem of running multiple jobs one after another
+ by setting the Max Wait Time for a job to 8 hours, and giving
+ the jobs different Priorities. However, there are scenarios where
+ one Job directly depends on another job, so if the former job fails,
+ the job after it need not be run,
+ while other jobs should perhaps run regardless.
+
+Example:
+ A Backup Job and a Verify job: if the backup job fails there is no need to run
+ the verify job, as the backup job already failed. However, one may still like
+ to back up the Catalog to disk even though the main backup job failed.
+
+Notes:
+ I see that this is related to the Event Handlers which are on the ToDo
+ list. It may also be a good idea to check the return value and
+ execute different actions based on it.
+
+
+3. offline capability to bconsole
+
+ Why:
+ Currently I use a script which I execute within the last Job via the
+ RunAfterJob Directive, to release and eject the tape.
+ So I have to call bconsole "release=Storage-Name" and afterwards
+ mt -f /dev/nst0 eject to get the tape out.
+
+ If I have multiple Storage Devices, then these may not be /dev/nst0 and
+ I have to modify the script or call it with parameters etc.
+ This would actually not be needed, as everything is already defined
+ in bacula-sd.conf, and if I can invoke bconsole with the
+ storage name via $1 in the script then I'm done and information is
+ not duplicated.
+
+4. %s for Storage Name added to the chars being substituted in "RunAfterJob"
+
+ Why:
+
+ For the reason mentioned in 3. to have the ability to call a
+ script with /scripts/foobar %s and in the script use $1
+ to pass the Storage Name to bconsole
+
+5. Setting Volume State within a Job Resource
+
+ Why:
+ Instead of using "Maximum Volume Jobs" in the Pool Resource,
+ I would have the possibility to define
+ in a Job Resource that after this certain job is run, the Volume State
+ should be set to "Volume State = Used". This gives more flexibility (IMHO).
+
+6. Localization of Bacula Messages
+
+ Why:
+ Unfortunately many, many people I work with don't speak English very well.
+ So if at least the Reporting messages were localized then they
+ would understand that they have to change the tape, etc.
+
+ I volunteer to do the German translations, and if I can convince my wife, also
+ French and Morre (a West African language).
+
+7. OK, this is evil, probably bound to security risks and maybe not possible
+ due to the design of bacula.
+
+ Implementation of Backticks ( `command` ) for shell command execution in
+ the "Label Format" Directive.
+
+Why:
+
+ Currently I have defined BACULA_DAY_OF_WEEK="day1|day2..." resulting in
+ Label Format = "HolyBackup-${BACULA_DAY_OF_WEEK[${WeekDay}]}". If I could
+ use backticks then I could use "Label Format = HolyBackup-`date +%A`" to have
+ the localized name for the day of the week appended to the
+ format string. Then I have the tape labeled automatically with weekday
+ name in the correct language.
+==========
+- Yes, that is surely the case. I probably should turn those into Warning
+ errors. In addition, you just made me think that it might not be bad to
+ add an option to check the file size after backing up the file and
+ report if it changes. This would be done as an option because it would
+ add extra overhead.
+
+ Kern, good idea. If you do do that, mention in the output: file
+ shrunk, or file expanded, just to make it obvious to the user
+ (without having to refer to the file size) just how the file size
+ changed.
+
+ Would this option be for all files, or just one file? Or a fileset?
+- Make output from status use html table tags for nicely
+ presenting in a browser.
+- Can one write tapes faster with 8192 byte block sizes?
+- Specify a single directory to restore.
+- Document security problems with the same password for everyone in
+ rpm and Win32 releases.
+- Browse generations of files.
+- I've seen an error when my catalog's File table fills up. I
+ then have to recreate the File table with a larger maximum row
+ size. Relevant information is at
+ http://dev.mysql.com/doc/mysql/en/Full_table.html ; I think the
+ "Installing and Configuring MySQL" chapter should talk a bit
+ about this potential problem, and recommend a solution.
+- For Solaris must use POSIX awk.
+- Want speed of writing to tape while despooling.
+- Supported autochanger:
+OS: Linux
+Man.: HP
+Media: LTO-2
+Model: SSL1016
+Slots: 16
+Cap: 200GB
+- Supported drive:
+ Wangtek 6525ES (SCSI-1 QIC drive, 525MB), under Linux 2.4.something,
+ bacula 1.36.0/1 works with blocksize 16k INSIDE bacula-sd.conf.
+- Add regex from http://www.pcre.org to Bacula for Win32.
+- Use only shell tools no make in CDROM package.
+- Include within include does it work?
+- Implement a Pool of type Cleaning?
+- Implement VolReadTime and VolWriteTime in SD
+- Modify Backing up Your Database to include a bootstrap file.
+- Think about making certain database errors fatal.
+- Look at correcting the time jump in the scheduler for daylight
+ savings time changes.
+- Add a "real" timer to network connections.
- Promote to Full = Time period
-- Scratch Pool where the volumes can be re-assigned to any Pool.
-- Update StartTime if job held in Job Queue.
- Despool attributes simultaneously with data in a separate
thread, rejoined at end of data spooling.
- Implement Files/Bytes,... stats for restore job.
- Implement Total Bytes Written, ... for restore job.
- Check dates entered by user for correctness (month/day/... ranges)
- Compress restore Volume listing by date and first file.
-- Add Pool/Storage override regression test.
-- Add delete JobId to regression.
-- Add bscan to four-concurrent-jobs regression.
- Look at patches/bacula_db.b2z postgresql that loops during restore.
See Gregory Wright.
-- Add IPv6 to regression
- Perhaps add read/write programs and/or plugins to FileSets.
- How to handle backing up portables ...
-- Add "Rerun failed levels = yes/no" to Job resource.
- Add some sort of guaranteed Interval for upgrading jobs.
- Can we write the state file after every job terminates? On Win32
the system crashes and the state file is not updated.
hours of operation.
From Phil:
In this context, it should be noted that Exabyte has a command-line
- vxatool utility available for free download. (The current version is
+ vxatool utility available for free download. (The current version is
vxatool-3.72.) It can get diagnostic info, read, write and erase tapes,
test the drive, unload tapes, change drive settings, flash new firmware,
etc.
Of particular interest in this context is that vxatool <device> -i will
report, among other details, the time since last cleaning in tape motion
- minutes. This information can be retrieved (and settings changed, for
+ minutes. This information can be retrieved (and settings changed, for
that matter) through the generic-SCSI device even when Bacula has the
- regular tape device locked. (Needless to say, I don't recommend
+ regular tape device locked. (Needless to say, I don't recommend
changing tape settings while a job is running.)
- Lookup HP cleaning recommendations.
- Lookup HP tape replacement recommendations (see trouble shooting autochanger)
- Document doing table repair
-
-
-Testing to do: (painful)
-For 1.37 Testing/Documentation:
+===================================
+
+- Use non-blocking network I/O but if no data is available, use
+ select().
+- Use gather write() for network I/O.
+- Autorestart on crash.
- Add bandwidth limiting.
- Add acks every once in a while from the SD to keep
the line from timing out.
units, perhaps via a directive.
- If opening a tape in read/write mode fails attempt to open
it in read-only mode, and mark the tape for read only.
-- Add a read-only mode to the mount option.
- Allow Simultaneous Priorities = yes => run up to Max concurrent jobs even
with multiple priorities.
-- Add db check test to regression. Test each function like delete,
- purge, ...
- If you use restore replace=never, the directory attributes for
non-existent directories will not be restored properly.
- Minimal autochanger handling in Bacula and in btape.
- Look into how tar does not save sockets and the possibility of
not saving them in Bacula (Martin Simmons reported this).
-- Add All Local Partitions = yes to new style saves.
-- localmounts=`awk '/ext/ { print $2 }' /proc/mounts` # or whatever
- find $localmounts -xdev -type s -ls
- Fix restore jobs so that multiple jobs can run if they
are not using the same tape(s).
- Allow the user to select JobType for manual pruning/purging.
-- Look at adding Client run command that will use the
- port opened by the client.
- bscan does not put first of two volumes back with all info in
bscan-test.
- Implement the FreeBSD nodump flag in chflags.
perhaps if password is undefined.
- Implement "from ISO-date/time every x hours/days/weeks/months" in
schedules.
+=== rate design
+ jcr->last_rate
+ jcr->last_runtime
+ MA = (last_MA * 3 + rate) / 4
+ rate = (bytes - last_bytes) / (runtime - last_runtime)
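The rate design above could be sketched as follows in Python. This is only an illustration, not Bacula code: the function name `update_rate` and its argument names are assumptions, and only the two formulas come from the notes. Note that the instantaneous rate has to be computed before the moving average (MA) that uses it, and that the divide-by-zero guard is an addition not mentioned in the notes.

```python
def update_rate(last_ma, last_bytes, last_runtime, bytes_now, runtime_now):
    """Return (new moving average, instantaneous rate) for one sampling step.

    rate = (bytes - last_bytes) / (runtime - last_runtime)
    MA   = (last_MA * 3 + rate) / 4
    """
    elapsed = runtime_now - last_runtime
    if elapsed <= 0:
        # No time has passed: keep the old average and report no new rate.
        return last_ma, None
    rate = (bytes_now - last_bytes) / elapsed
    ma = (last_ma * 3 + rate) / 4
    return ma, rate


# Hypothetical samples: 400 bytes in the first second, 800 more in the next.
ma, rate = update_rate(0.0, 0, 0, 400, 1)
ma, rate = update_rate(ma, 400, 1, 1200, 2)
print(ma, rate)
```

Weighting the previous average by 3/4 smooths out short bursts, so the reported figure follows sustained throughput rather than the last sample alone.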
+
==== from Marc Schoechlin
- the help-command should be more verbose
(it should explain the parameters of the different
commands in detail)
- -> it´s time-comsuming to consult the manual anytime
+ -> it's time-consuming to consult the manual anytime
you need a special parameter
- -> maybe it´s more easy to maintain this, if the
+ -> maybe it's easier to maintain this, if the
descriptions of those commands are outsourced to
a certain file
- the cd-command should allow complete paths
i.e. cd /foo/bar/foo/bar
-> if a customer mails me the path to a certain file,
- it´s faster to enter the specified directory
+ it's faster to enter the specified directory
- if the password is not configured in bconsole.conf
you should be asked for it.
-> sometimes you like to do restore on a customer-machine
- which shouldn´t know the password for bacula.
+ which shouldn't know the password for bacula.
-> adding the password to the file favours admins
to forget to remove the password after usage
-> security-aspects
- any actions should be interruptible with Ctrl+C
- command-expansion would be pretty cool
====
+- When the replace Never option is set, new directory permissions
+ are not restored. See bug 213. To fix this requires creating a
+ list of newly restored directories so that those directory
+ permissions *can* be restored.
- Compaction of Disk space by "migrating" Volumes that have pruned
Jobs (what criteria? size, #jobs, time).
- Add prune all command
should). None of the values 12:00pm - 12:59pm work for that matter.
- Add level to estimate command.
- For each job type (Admin, Restore, ...) require only the really necessary
- fields.
-- Fix option 2 of restore -- list where file is backed up -- require Client,
- then list last 20 backups.
-- Pass Director resource name as an option to the Console.
+ fields.
+- Pass Director resource name as an option to the Console.
- Add a "batch" mode to the Console (no unsolicited queries, ...).
- Add a .list all files in the restore tree (probably also a list all files)
Do both a long and short form.
creation date of files, we can be even smarter about this and not count
entirely upon the length. On a restore, the full and all incrementals
since it will be applied in sequence to restore the file.
-- Add a regression test for dbcheck.
-- Add disk seeking on restore. - Allow
- for optional cancelling of SD and FD in case DIR
+- Allow for optional cancelling of SD and FD in case DIR
gets a fatal error. Requested by Jesse Guardiani <jesse@wingnet.net>
- Add "limit=n" for "list jobs"
- Check new HAVE_WIN32 open bits.
is contained in only one device.
In SD allow Device to have Multiple MediaTypes
-- Look at www.nu2.nu/pebuilder as a helper for full windows
- bare metal restore.
- Ideas from Jerry Scharf:
First let's point out some big pluses that bacula has for this
it's open source
- Look at adding SQL server and Exchange support for Windows.
- Each DVD-RAM disk would be a volume, just like each tape is
a volume. It's a 4.7GB media with random access, but there's nothing about
- it that I can see that makes it so different than a tape from bacula's
+ it that I can see that makes it so different than a tape from bacula's
perspective. Why couldn't I back up to a bare floppy as a volume (ignoring
the media capacity?)
- Make dev->file and dev->block_num signed integers so that -1 can
=== Done
+- Save mount point for directories not traversed with onefs=yes.
+- Add seconds to start and end times in the Job report output.
+- if 2 concurrent backups are attempted on the same tape
+ drive (autoloader) into different tape pools, one of them will exit
+ fatally instead of halting until the drive is idle
+- Update StartTime if job held in Job Queue.
+- Look at www.nu2.nu/pebuilder as a helper for full windows
+ bare metal restore. (done by Scott)
+- Fix orphaned buffers:
+ Orphaned buffer: 24 bytes allocated at line 808 of rufus-dir job.c
+ Orphaned buffer: 40 bytes allocated at line 45 of rufus-dir alist.c
+- Implement Preben's suggestion to add
+ File System Types = ext2, ext3
+ to FileSets, thus simplifying backup of *all* local partitions.
+- Try to open a device on each Job if it was not opened
+ when the SD started.
+- Add dump of VolSessionId/Time and FileIndex with bls.
+- If Bacula does not find the right tape in the Autochanger,
+ then mark the tape in error and move on rather than asking
+ for operator intervention.
+- Cancel command should include JobId in list of Jobs.
+- Add performance testing hooks
+- Bootstrap from JobMedia records.
+- Implement WildFile and WildDir to solve problem of
+ saving only *.doc files.
+- Fix
+ Please use the "label" command to create a new Volume for:
+ Storage: DDS-4-changer
+ Media type:
+ Pool: Default
+ label
+ The defined Storage resources are:
+- Copy Changer Device and Changer Command from Autochanger
+ to Device resource in SD if none given in Device resource.
+- 1. Automatic use of more than one drive in an autochanger (done)
+- 2. Automatic selection of the correct drive for each Job (i.e.
+ selects a drive with an appropriate Volume for the Job) (done)
+- 6. Allow multiple simultaneous Jobs referencing the same pool write
+ to several tapes (some new directive(s) are probably needed for
+ this) (done)
+- Locking (done)
+- Key on Storage rather than Pool (done)
+- Allow multiple drives to use same Pool (change jobq.c DIR) (done).
+- Synchronize multiple drives so that no more
+ than one loads a tape at any time (done)
+- 4. Use Changer Device and Changer Command specified in the
+ Autochanger resource, if none is found in the Device resource.
+ You can continue to specify them in the Device resource if you want
+ or need them to be different for each device.
+- 5. Implement a new Device directive (perhaps "Autoselect = yes/no")
+ that can allow a Device to be part of an Autochanger, and hence the changer
+ script protected, but if set to no, will prevent the Device from being
+ automatically selected from the changer. This allows the device to
+ be directly accessed through its Device name, but not through the
+ AutoChanger name.
+#6 Select one from among Multiple Storage Devices for Job
+#5 Events that call a Python program
+ (Implemented in Dir/SD)
+- Make sure the Device name is in the Query packet returned.
+- Don't start a second file job if one is already running.
+- Implement EOF/EOV labels for ANSI labels
+- Implement IBM labels.
+- When Python creates a new label, the tape is immediately
+ recycled and no label created. This happens when using
+ autolabeling -- even when Python doesn't generate the name.
+- Scratch Pool where the volumes can be re-assigned to any Pool.
+- 28-Mar 23:19 rufus-sd: acquire.c:379 Device "DDS-4" (/dev/nst0)
+ is busy reading. Job 6 canceled.
+- Remove separate thread for opening devices in SD. On the other
+ hand, don't block waiting for open() for devices.
+- Fix code to either handle updating NumVol or to calculate it in
+ Dir next_vol.c
+- Ensure that you cannot exclude a directory or a file explicitly
+ Included with File.
+#4 Embedded Python Scripting
+ (Implemented in Dir/SD/FD)
+- Add Python writable variable for changing the Priority,
+ Client, Storage, JobStatus (error), ...
+- SD Python
+ - Solicit Events
+- Add disk seeking on restore; turn off seek on tapes.
+ stored/match_bsr.c
+- Look at dird_conf.c:1000: warning: `int size'
+ might be used uninitialized in this function
+- Indicate when a Job is purged/pruned during restore.
+- Implement some way to turn off automatic pruning in Jobs.
+- Implement a way an Admin Job can prune, possibly multiple
+ clients -- Python script?
+- Look at Preben's acl.c error handling code.
+- SD crashes after a tape restore then doing a backup.
+- If drive is opened read/write, close it and re-open
+ read-only if doing a restore, and vice-versa.
+- Windows restore:
+ data-fd: RestoreFiles.2004-12-07_15.56.42 Error:
+ > ..\findlib\../../findlib/create_file.c:275 Could not open e:/: ERR=Der
+ > Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen
+ > Prozess verwendet wird.
+ Restore restores all files, but then fails at the end trying
+ to set the attributes of e:
+ from failed jobs.
+- Resolve the problem between Device name and Archive name,
+ and fix SD messages.
+- Tell the "restore" user when browsing is no longer possible.
+- Add a restore directory-x
+- Write non-optimized bsrs from the JobMedia and Media records,
+ even after Files are pruned.
+- Delete Stripe and Copy from VolParams to save space.
+- Fix option 2 of restore -- list where file is backed up -- require Client,
+ then list last 20 backups.
+- Finish implementation of passing all Storage and Device needs to
+ the SD.
+- Move test for max wait time exceeded in job.c up -- Peter's idea.