Kern's ToDo List
- 20 July 2005
+ 07 January 2006
Major development:
Project Developer
======= =========
-Version 1.37 Kern (see below)
-========================================================
-
-Final items for 1.37 before release:
-1. Fix bugs
-- --without-openssl breaks at least on Solaris.
-3. Document all the new features (about half done).
- - VSS. Shall I write "Include Writer 'WMI Writer", "MSDEWriter"
- Let me explain this: A Windows application can (but need not)
- register as a VSS writer. This means that the application opts to
- be notified if a backup (or restore) occurs. If it then gets
- this message, it will immediately store a consistent state to
- disk. Examples for these writers are "MSDE" (Microsoft database
- engine), "Event Log Writer", "Registry Writer" plus 3rd
- party-writers. If you have a non-vss aware application (e.g.
- SQL Anywhere or probably MySQL), a shadow copy is still generated
- and the open files can be backed up, but there is no guarantee
- that the file is consistent.
-
- At least the Microsoft example makes a significant effort to
- determine which writers may be involved when a drive or file is
- to be shadow copied. Of course, every single writer may fail....
- So they offer a user interface to explicitly include or exclude a
- writer when creating a VSS shadow copy. I personally would not
- like to bother the user with this - at least not right now
- (efforts for exchanging lists between fd and director + efforts
- for selecting, etc.).
-
- But I personally would like to have an information message about
- the individual writers involved in the backup-process ("vssadmin
- list writers" produced 4 entries on my xp-client and 7 on my w2k3
- server, please try this on your machine to understand the system
- a little bit better).
-
- - Multiple drive autochanger support
- - Support for ANSI/IBM labels.
- - Seven new options keywords in a FileSet resource:
- ignorecase, fstype, hfsplussupport, wilddir, wildfile, regexdir,
- and regexfile (thanks to Preben Guldberg). See below for details.
- - Restore of all files for a Job or set of jobs even if the file
- records have been removed from the catalog.
- - Restore of a directory (non-recursive, i.e. only one level).
- - Support for TLS (ssl) between all the daemon connections thanks
- to Landon Fuller.
- - Any Volume in the Pool named Scratch may be reassigned to any
- other Pool when a new Volume is needed.
- - Unicode filename support for Win32 (thanks to Thorsten Engel)
- - Volume Shadow Copy support for Win32 thus the capability to
- back up exclusively opened files (thanks to Thorsten Engel).
- A VSS enabled Win32 FD is available. You must explicitly
- turn on VSS with "Enable VSS = yes" in your FileSet resource.
- - SQLite3 support, but it seems to run at 1/2 to 1/4 the speed of
- SQLite2.
- - New Job directive "Prefer Mounted Volumes = yes|no" causes the
- SD to select either an Autochanger or a drive with a valid
- Volume already mounted in preference. If none is available,
- it will select the first available drive.
- - New Run directive in Job resource of DIR. It permits
- cloning of jobs. To clone a copy of the current job, use
- Run = "job-name level=%l since=\"%s\""
- Note, job-name is normally the same name as the job that
- is running but there is no restriction on what you put. If you
- want to start the job by hand and use job overrides such as
- storage=xxx, realize that the job will be started with the
- default storage values not the overrides. The level=%l guarantees
- that the chosen level of the job is the same, and the since=...
- ensures that the job uses *exactly* the same time/date for incremental
- and differential jobs. The since=... is ignored when level=Full.
- A cloned job will not start additional clones, so it is not possible
- to recurse.
- - New Options keywords in a FileSet directive:
- - WildDir xxx
- Will do a wild card match against directories (files will not
- be matched).
- - WildFile xxx
- Will do a wild card match against files (directories will not
- be matched).
- - RegexDir xxx
- Will do a regular expression match against directories (files
- will not be matched).
- - RegexFile xxx
- Will do a regular expression match against files (directories
- will not be matched).
- - IgnoreCase = yes | no
- Will ignore case in wild card and regular expression matches.
- This is handy for Windows where filename case is not significant.
- - FsType = string
- where string is a filesystem type: ext2, jfs, ntfs, proc,
- reiserfs, xfs, usbdevfs, sysfs, smbfs, iso9660. For ext3
- systems, use ext2. You may have multiple fstype directives
- and thus permit multiple filesystem types. If the type
- specified on the fstype directive does not match the
- filesystem for a particular directory, that directory will
- not be backed up. This directive can be used to prevent
- backing up non-local filesystems.
- - HFS Plus Support = yes | no
- If set, Mac OS X resource forks will be saved and restored.
- - Label Type = ANSI | IBM | Bacula
- Implemented in Director Pool resource and in SD Device resource.
- If it is specified in the SD Device resource, it will take
- precedence over the value passed from the Director to the SD.
- - Check Labels = yes | no
- Implemented in the SD Device resource. If you intend to read
- ANSI or IBM labels, this *must* be set. Even if the volume
- is not ANSI labeled, you can set this to yes, and Bacula will
- check the label type.
- - Scripts Directory = <directory> name. Defines the directory from
- which Bacula scripts will be called for events. In fact, Bacula
- appends this name to the standard Python list of search directories,
- so the script could also be in any of the Python system directories.
- - In FileSet, you can exclude backing up of hardlinks (if you have
- a lot, it can be very expensive), by using:
- HardLinks = no
- in the Options section. Patch supplied by David R Bosso. Thanks.
- - MaximumPartSize = bytes (SD, Device resource)
- Defines the maximum part size.
- - Requires Mount = Yes/No (SD, Device resource)
- Defines whether the device must be mounted to be read, and whether it
- must be written in a special way. If it is set, the following directives
- must be defined in the same Device resource:
- + Mount Point = directory
- Directory where the device must be mounted.
- + Mount Command = name-string
- Command that must be executed to mount the device. Before the command
- is executed, %a is replaced with the Archive Device, and %m with the
- Mount Point.
- + Unmount Command = name-string
- Command that must be executed to unmount the device. Before the
- command is executed, %a is replaced with the Archive Device, and
- %m with the Mount Point.
- + Write Part Command = name-string
- Command that must be executed to write a part to the device. Before
- the command is executed, %a is replaced with the Archive Device, %m
- with the Mount Point, %n with the current part number (0-based),
- and %v with the current part filename.
- + Free Space Command = name-string
- Command that must be executed to check how much free space is left
- on the device. Before the command is executed, %a is replaced with
- the Archive Device, %m with the Mount Point, %n with the current part
- number (0-based), and %v with the current part filename.
- - Write Part After Job = Yes/No (DIR, Job Resource, and Schedule Resource)
- If this directive is set to yes (default no), a new part file will be
- created after the job is finished.
- - A pile of new Directives to support TLS. Please see the TLS chapter
- of the manual.
-
- - "python restart" restarts the Python interpreter. Rather brutal, make
- sure no Python scripts are running. This permits you to change
- a Python script and ge
- - With Python 2.3, there are a few compiler warnings.
- - You must add --with-openssl to the configure command line if
- you want TLS communications encryption support.
-7. Write a bacula-web document
-9. Run the regression scripts on Solaris and FreeBSD
-- Figure out how to package gui, and rescue programs.
-- Test TLS.
Document:
+- Does ClientRunAfterJob fail the job on a bad return code?
- Document cleaning up the spool files:
db, pid, state, bsr, mail, conmsg, spool
- Document the multiple-drive-changer.txt script.
- Pruning with Admin job.
-
+- Does WildFile match against full name? Doc.
+- %d and %v only valid on Director, not for ClientRunBefore/After.
+
+Priority:
+- Implement status that shows why a job is being held in reserve, or
+ rather why none of the drives are suitable.
+- Implement a way to disable a drive (so you can use the second
+ drive of an autochanger, and the first one will not be used or
+ even defined).
+- Implement code that makes the Dir aware that a drive is an
+ autochanger (so the user doesn't need to use the Autochanger = yes
+ directive).
For 1.39:
+- Make hardlink code at line 240 of find_one.c use binary search.
+- Queue warning/error messages during restore so that they
+ are reported at the end of the report rather than being
+ hidden in the file listing ...
+- A Volume taken from Scratch should take on the retention period
+ of the new pool.
+- Correct doc for Maximum Changer Wait (and others) accepting only
+ integers.
+- Fix Maximum Changer Wait (and others) to accept qualifiers.
+- Look at -D_FORTIFY_SOURCE=2
+- Add Win32 FileSet definition somewhere
+- Look at fixing restore status stats in SD.
+- Make selection of Database used in restore correspond to
+ client.
+- Implement a mode that says when a hard read error is
+ encountered, read many times (as it currently does), and if the
+ block cannot be read, skip to the next block, and try again. If
+ that fails, skip to the next file and try again, ...
+- Add level table:
+ create table LevelType (LevelType binary(1), LevelTypeLong tinyblob);
+ insert into LevelType (LevelType,LevelTypeLong) values
+ ("F","Full"),
+ ("D","Diff"),
+ ("I","Inc");
+- Add ACL to restore only to original location.
+- Add a recursive mark command (rmark) to restore.
+- "Minimum Job Interval = nnn" sets minimum interval between Jobs
+ of the same level and does not permit multiple simultaneous
+ running of that Job (i.e. lets any previous invocation finish
+ before doing Interval testing).
+- Look at simplifying File exclusions.
+- Fix store_yesno to be store_bitmask.
+- New directive "Delete purged Volumes"
+- new pool XXX with ScratchPoolId = MyScratchPool's PoolId and
+ let it fill itself, and RecyclePoolId = XXX's PoolId so I can
+ see if it becomes stable and I just have to supervise
+ MyScratchPool
+- If I want to remove this pool, I set RecyclePoolId = MyScratchPool's
+ PoolId, and when it is empty remove it.
+- Figure out how to recycle Scratch volumes back to the Scratch
+ Pool.
+- Add Volume=SCRTCH
+- Allow Check Labels to be used with Bacula labels.
+- "Resuming" a failed backup (lost line for example) by using the
+ failed backup as a sort of "base" job.
+- Look at NDMP
+- Email the user x days before the tape needs changing.
+- Command to show next tape that will be used for a job even
+ if the job is not scheduled.
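For the hardlink item in the list above (binary search in find_one.c), a rough C sketch of what such a lookup could look like. This does not reproduce the code in find_one.c. The names demo_link, demo_sort_links and demo_lookup_link are hypothetical, and the sketch assumes the known links are kept in an array sorted by (device, inode).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for a file already seen with more than one link. */
typedef struct {
    unsigned long long dev;   /* device number */
    unsigned long long ino;   /* inode number */
    const char *name;         /* first name under which the file was saved */
} demo_link;

/* Order entries by device first, then by inode. */
static int demo_link_cmp(const void *a, const void *b)
{
    const demo_link *x = a;
    const demo_link *y = b;
    if (x->dev != y->dev) {
        return x->dev < y->dev ? -1 : 1;
    }
    if (x->ino != y->ino) {
        return x->ino < y->ino ? -1 : 1;
    }
    return 0;
}

/* Sort the table so that bsearch() can be used on it. */
void demo_sort_links(demo_link *table, size_t count)
{
    qsort(table, count, sizeof *table, demo_link_cmp);
}

/* Return the saved name for (dev, ino), or NULL if it was not seen before. */
const char *demo_lookup_link(const demo_link *table, size_t count,
                             unsigned long long dev, unsigned long long ino)
{
    demo_link key = { dev, ino, NULL };
    const demo_link *hit;

    assert(table != NULL || count == 0);
    if (count == 0) {
        return NULL;
    }
    hit = bsearch(&key, table, count, sizeof *table, demo_link_cmp);
    return hit ? hit->name : NULL;
}
```

A sorted array only pays off if inserts are rare or batched. If links are discovered one at a time during the file walk, keeping the array sorted costs a shift on every insert, so a balanced tree or hash table may be the better fit.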
+--- create_file.c.orig Fri Jul 8 12:13:05 2005
++++ create_file.c Fri Jul 8 12:13:07 2005
+@@ -195,6 +195,8 @@
+ attr->ofname, be.strerror());
+ return CF_ERROR;
+ }
++ } else if(S_ISSOCK(attr->statp.st_mode)) {
++ Dmsg1(200, "Skipping socket: %s\n", attr->ofname);
+ } else {
+ Dmsg1(200, "Restore node: %s\n", attr->ofname);
+ if (mknod(attr->ofname, attr->statp.st_mode, attr->statp.st_rdev) != 0 && errno != EEXIST) {
+- From: Arunav Mandal <amandal@trolltech.com>
+ 1. When jobs are running and Bacula crashes for some reason, or if I
+ restart it, it should remember the jobs it was running before the crash
+ or restart. As of now I lose all jobs if I restart it.
+
+ 2. When spooling, if the client is disconnected midway (a laptop, for
+ instance), Bacula completely discards the spool. It would be nice if it
+ could write that spool to tape so there would be some backups for that
+ client, if not all.
+
+ 3. We have around 150 client machines. It would be nice to have an option
+ to upgrade the Bacula version on all the client machines automatically.
+
+ 4. At least one connection should be reserved for bconsole so that under
+ heavy load I can still connect to the Director via bconsole, which I
+ sometimes cannot.
+
+ 5. Another important feature is missing. Say at 10am I manually started a
+ backup of client abc, and it was a Full backup since client abc has no
+ backup history; at 10.30am Bacula automatically started another backup of
+ client abc because it was in the schedule. Now we have 2 Full backups of
+ the same client, and if we again try to start a Full backup of client abc,
+ Bacula won't complain. That should be fixed.
+
- Fix bpipe.c so that it does not modify results pointer.
***FIXME*** calling sequence should be changed.
1.xx Major Projects:
=== Done
-- Save mount point for directories not traversed with onefs=yes.
-- Add seconds to start and end times in the Job report output.
-- if 2 concurrent backups are attempted on the same tape
- drive (autoloader) into different tape pools, one of them will exit
- fatally instead of halting until the drive is idle
-- Update StartTime if job held in Job Queue.
-- Look at www.nu2.nu/pebuilder as a helper for full windows
- bare metal restore. (done by Scott)
-- Fix orphaned buffers:
- Orphaned buffer: 24 bytes allocated at line 808 of rufus-dir job.c
- Orphaned buffer: 40 bytes allocated at line 45 of rufus-dir alist.c
-- Implement Preben's suggestion to add
- File System Types = ext2, ext3
- to FileSets, thus simplifying backup of *all* local partitions.
-- Try to open a device on each Job if it was not opened
- when the SD started.
-- Add dump of VolSessionId/Time and FileIndex with bls.
-- If Bacula does not find the right tape in the Autochanger,
- then mark the tape in error and move on rather than asking
- for operator intervention.
-- Cancel command should include JobId in list of Jobs.
-- Add performance testing hooks
-- Bootstrap from JobMedia records.
-- Implement WildFile and WildDir to solve problem of
- saving only *.doc files.
-- Fix
- Please use the "label" command to create a new Volume for:
- Storage: DDS-4-changer
- Media type:
- Pool: Default
- label
- The defined Storage resources are:
-- Copy Changer Device and Changer Command from Autochanger
- to Device resource in SD if none given in Device resource.
-- 1. Automatic use of more than one drive in an autochanger (done)
-- 2. Automatic selection of the correct drive for each Job (i.e.
- selects a drive with an appropriate Volume for the Job) (done)
-- 6. Allow multiple simultaneous Jobs referencing the same pool write
- to several tapes (some new directive(s) are probably needed for
- this) (done)
-- Locking (done)
-- Key on Storage rather than Pool (done)
-- Allow multiple drives to use same Pool (change jobq.c DIR) (done).
-- Synchronize multiple drives so that no more
- than one loads a tape at any time (done)
-- 4. Use Changer Device and Changer Command specified in the
- Autochanger resource, if none is found in the Device resource.
- You can continue to specify them in the Device resource if you want
- or need them to be different for each device.
-- 5. Implement a new Device directive (perhaps "Autoselect = yes/no")
- that can allow a Device to be part of an Autochanger, and hence the changer
- script protected, but if set to no, will prevent the Device from being
- automatically selected from the changer. This allows the device to
- be directly accessed through its Device name, but not through the
- AutoChanger name.
-#6 Select one from among Multiple Storage Devices for Job
-#5 Events that call a Python program
- (Implemented in Dir/SD)
-- Make sure the Device name is in the Query packet returned.
-- Don't start a second file job if one is already running.
-- Implement EOF/EOV labels for ANSI labels
-- Implement IBM labels.
-- When Python creates a new label, the tape is immediately
- recycled and no label created. This happens when using
- autolabeling -- even when Python doesn't generate the name.
-- Scratch Pool where the volumes can be re-assigned to any Pool.
-- 28-Mar 23:19 rufus-sd: acquire.c:379 Device "DDS-4" (/dev/nst0)
- is busy reading. Job 6 canceled.
-- Remove separate thread for opening devices in SD. On the other
- hand, don't block waiting for open() for devices.
-- Fix code to either handle updating NumVol or to calculate it in
- Dir next_vol.c
-- Ensure that you cannot exclude a directory or a file explicitly
- Included with File.
-#4 Embedded Python Scripting
- (Implemented in Dir/SD/FD)
-- Add Python writable variable for changing the Priority,
- Client, Storage, JobStatus (error), ...
-- SD Python
- - Solicit Events
-- Add disk seeking on restore; turn off seek on tapes.
- stored/match_bsr.c
-- Look at dird_conf.c:1000: warning: `int size'
- might be used uninitialized in this function
-- Indicate when a Job is purged/pruned during restore.
-- Implement some way to turn off automatic pruning in Jobs.
-- Implement a way an Admin Job can prune, possibly multiple
- clients -- Python script?
-- Look at Preben's acl.c error handling code.
-- SD crashes after a tape restore then doing a backup.
-- If drive is opened read/write, close it and re-open
- read-only if doing a restore, and vice-versa.
-- Windows restore:
- data-fd: RestoreFiles.2004-12-07_15.56.42 Error:
- > ..\findlib\../../findlib/create_file.c:275 Could not open e:/: ERR=Der
- > Prozess kann nicht auf die Datei zugreifen, da sie von einem anderen
- > Prozess verwendet wird.
- Restore restores all files, but then fails at the end trying
- to set the attributes of e:
- from failed jobs.
-- Resolve the problem between Device name and Archive name,
- and fix SD messages.
-- Tell the "restore" user when browsing is no longer possible.
-- Add a restore directory-x
-- Write non-optimized bsrs from the JobMedia and Media records,
- even after Files are pruned.
-- Delete Stripe and Copy from VolParams to save space.
-- Fix option 2 of restore -- list where file is backed up -- require Client,
- then list last 20 backups.
-- Finish implementation of passing all Storage and Device needs to
- the SD.
-- Move test for max wait time exceeded in job.c up -- Peter's idea.
-## Consider moving docs to their own project.
-## Move rescue to its own project.
-- Add client version to the Client name line that prints in
- the Job report.
-- Fix the Rescue CDROM.
-- By the way: on page http://www.bacula.org/?page=tapedrives , at the
- bottom, the link to "Tape Testing Chapter" is broken. It goes to
- /html-manual/... while the others point to /rel-manual/...
-- Device resource needs the "name" of the SD.
-- Specify a single directory to restore.
-- Implement MediaType keyword in bsr?
-- Add a date and time stamp at the beginning of every line in the
- Job report (Volker Sauer).
-- Add level to estimate command.
-- Add "limit=n" for "list jobs"
-- Make bootstrap filename unique.
-- Make Dmsg look at global before calling subroutine.
-- From Chris Hull:
- it seems to be complaining about 12:00pm which should be a valid 12
- hour time. I changed the time to 11:59am and everything works fine.
- Also 12:00am works fine. 0:00pm also works (which I don't think
- should). None of the values 12:00pm - 12:59pm work for that matter.
-- Require restore via the restore command or make a restore Job
- get the bootstrap file.
-- Implement Maximum Job Spool Size
-- Fix 3993 error in SD. It forgets to look at autochanger
- resource for device command, ...
-- 3. Prevent two drives requesting the same Volume in any given
- autochanger, by checking if a Volume is mounted on another drive
- in an Autochanger.
-- Upgrade to MySQL 4.1.12 See:
- http://dev.mysql.com/doc/mysql/en/Server_SQL_mode.html
-- Add # Job Level date to bsr file
-- Implement "PreferMountedVolumes = yes|no" in Job resource.
-## Integrate web-bacula into a new Bacula project with
- bimagemgr.
-- Cleaning tapes should have Status "Cleaning" rather than append.
-- Make sure that Python has access to Client address/port so that
- it can check if Clients are alive.
-- Review all items in "restore".
-- Fix PostgreSQL GROUP BY problems in restore.
-- Fix PostgreSQL sql problems in bugs.
-- After rename
- 04-Jul 13:01 MainSD: Rufus.2005-07-04_01.05.02 Warning: Director wanted Volume
- "DLT-13Feb04".
- Current Volume "DLT-04Jul05" not acceptable because:
- 1997 Volume "DLT-13Feb04" not in catalog.
- 04-Jul 13:01 MainSD: Please mount Volume "DLT-04Jul05" on Storage Device
- "HP DLT 80" (/dev/nst0) for Job Rufus.2005-07-04_01.05.02
-## Create a new GUI chapter explaining all the GUI programs.
-- Make "update slots" when pointing to Autochanger, remove
- all Volumes from other drives. "update slots all-drives"?
- No, this is done by modifying mtx-changer to list what is
- in the drives.
-- Finish TLS implementation.
-- Port limiting -m in iptables to prevent DoS attacks
- could cause broken pipes on Bacula.
-6. Build and test the Volume Shadow Copy (VSS) for Win32.
-- Allow cancel of unknown Job
-- State not saved when closing Win32 FD by icon
-- bsr-opt-test fails. bsr deleted. Fix.
-- Move Python daemon variables from Job to Bacula object.
- WorkingDir, ConfigFile
-- Document that Bootstrap files can be written with cataloging
- turned off.
-- Document details of ANSI/IBM labels
-- OS linux 2.4
- 1) ADIC, DLT, FastStor 4000, 7*20GB
-- Linux Sony LIB-D81, AIT-3 library works.
-- Doc the following
- To activate, check or disable the hardware compression feature on my
- exb-8900 I use the Exabyte "MammothTool". You can get it here:
- http://www.exabyte.com/support/online/downloads/index.cfm
- There is a solaris version of this tool. With option -C 0 or 1 you can
- disable or activate compression. Start this tool without any options for
- a small reference.
-- Document Heartbeat Interval in the dealing with firewalls section.
-- Document new CDROM directory.
-- On Win32 working directory must have drive letter ????
-- On Win32 working directory must be writable by SYSTEM to
- do restores.
-- Document that ChangerDevice is used for Alert command.
-- Add better documentation on how restores can be done
-8. Take one more try at making DVD writing work (no go)
+- Make sure that all do_prompt() calls in Dir check for
+ -1 (error) and -2 (cancel) returns.
+- Fix foreach_jcr() to have free_jcr() inside next().
+ jcr=jcr_walk_start();
+ for ( ; jcr; (jcr=jcr_walk_next(jcr)) )
+ ...
+ jcr_walk_end(jcr);
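One possible shape for the walk pattern above, as a minimal self-contained C sketch. It is not Bacula's implementation: demo_jcr, demo_chain and the demo_* functions are hypothetical stand-ins, the "free" is modelled as a plain reference-count decrement, and locking of the chain is left out entirely. The point is that next() releases the previous entry itself, so the loop body never calls free, and end() releases whatever entry is still held after an early break.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a job control record. */
typedef struct demo_jcr {
    int job_id;
    int use_count;             /* reference count */
    struct demo_jcr *next;     /* next record in the chain */
} demo_jcr;

demo_jcr *demo_chain = NULL;   /* head of the (unlocked) demo chain */

/* Take a reference; NULL passes through unchanged. */
static demo_jcr *demo_acquire(demo_jcr *jcr)
{
    if (jcr) {
        jcr->use_count++;
    }
    return jcr;
}

/* Drop a reference; a real implementation would destroy at zero. */
static void demo_free_jcr(demo_jcr *jcr)
{
    if (jcr) {
        assert(jcr->use_count > 0);
        jcr->use_count--;
    }
}

demo_jcr *demo_walk_start(void)
{
    return demo_acquire(demo_chain);
}

/* Acquire the successor first, then release the entry being left. */
demo_jcr *demo_walk_next(demo_jcr *prev)
{
    demo_jcr *next = demo_acquire(prev ? prev->next : NULL);
    demo_free_jcr(prev);
    return next;
}

/* Release the entry still held, if the loop was left early. */
void demo_walk_end(demo_jcr *jcr)
{
    demo_free_jcr(jcr);
}

/* Full traversal: the walk ends with jcr == NULL, so end() is a no-op. */
int demo_count_jobs(void)
{
    int count = 0;
    demo_jcr *jcr = demo_walk_start();
    for ( ; jcr; (jcr = demo_walk_next(jcr))) {
        count++;
    }
    demo_walk_end(jcr);
    return count;
}

/* Early exit: the matching entry is still referenced until end(). */
int demo_find_job(int job_id)
{
    int found = 0;
    demo_jcr *jcr = demo_walk_start();
    for ( ; jcr; (jcr = demo_walk_next(jcr))) {
        if (jcr->job_id == job_id) {
            found = 1;
            break;
        }
    }
    demo_walk_end(jcr);
    return found;
}
```

With this arrangement every reference taken during a walk is returned whether the loop runs to completion or breaks early, which seems to be the intent of moving free_jcr() inside next().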
+