Kern's ToDo List
- 06 March 2008
+ 17 July 2009
+
+Rescue:
+Add to USB key:
+ gftp sshfs kile kate lsscsi m4 mtx nfs-common nfs-server
+ patch squashfs-tools strace sg3-utils screen scsiadd
+ system-tools-backend telnet dpkg traceroute urar usbutils
+ whois apt-file autofs busybox chkrootkit clamav dmidecode
+ manpages-dev manpages-posix manpages-posix-dev
Document:
+- package sg3-utils, program sg_map
- !!! Cannot restore two jobs at the same time that were
written simultaneously unless they were totally spooled.
- Document cleaning up the spool files:
for disaster recovery.
Professional Needs:
+- Nexenta (zfs + hardy + iscsi + nas + smf support)
+- NDMP
+ - For NAS OpenNAS
+ - ndmfs -- File Server extension in NDMPv4.
+ - ndmjob -- NDMP backup/restore NDMPv2, NDMPv3, and NDMPv4
+- Base jobs
- Migration from other vendors
- Date change
- Path change
- Detect state change of system (verify)
- Synthetic Full, Diff, Inc (Virtual, Reconstructed)
- SD to SD
-- Modules for Databases, Exchange, ...
- Novell NSS backup http://www.novell.com/coolsolutions/tools/18952.html
- Compliance norms that compare restored code hash code.
- When glibc crash, get address with
info symbol 0x809780c
- How to sync remote offices.
-- Exchange backup:
- http://www.microsoft.com/technet/itshowcase/content/exchbkup.mspx
- David's priorities
Copypools
Extract capability (#25)
and http://www.openeyet.nl/scc/ for managing customer changes
Priority:
-- Doc Duplicate Jobs.
-- New directive "Delete purged Volumes"
+================
+
+- Why no error message if restore has no permission on the where
+ directory?
+- Possibly allow manual "purge" to purge a Volume that has not
+ yet been written (even if FirstWritten time is zero) see ua_purge.c
+ is_volume_purged().
+- Add disk block detection bsr code (make it work).
+- Remove done bsrs.
+- User options for plugins.
+- Pool Storage override precedence over command line.
+- Autolabel only if Volume catalog information indicates tape not
+ written. This will avoid overwriting a tape that gets an I/O
+ error on reading the volume label.
+- On an I/O error, the SD thinks it is not the right Volume. It should check
+ the slot and then disable the Volume, but instead it asks for a mount.
+- Is it possible to modify the package to create and use configuration files in
+ the Debian manner?
+
+ For example:
+
+ /etc/bacula/bacula-dir.conf
+ /etc/bacula/conf.d/pools.conf
+ /etc/bacula/conf.d/clients.conf
+ /etc/bacula/conf.d/storages.conf
+
+ and in the bacula-dir.conf file include
+
+ @/etc/bacula/conf.d/pools.conf
+ @/etc/bacula/conf.d/clients.conf
+ @/etc/bacula/conf.d/storages.conf
+- Possibly add an Inconsistent state when a Volume is in error
+ for non I/O reasons.
+- Fix #ifdefing so that smartalloc can be disabled. Check manual
+ -- the default is enabled.
+- Change calling sequence to delete_job_id_range() in ua_cmds.c
+ the preceding strtok() is done inside the subroutine only once.
+- Dangling softlinks are not restored properly. For example, take a
+ soft link such as src/testprogs/install-sh, which points to /usr/share/autoconf...
+ move the directory to another machine where the file /usr/share/autoconf does
+ not exist, back it up, then try a full restore. It fails.
+- Softlinks that point to a non-existent file are not restored in restore all,
+ but are restored if the file is individually selected. BUG!
- Prune by Job
- Prune by Job Level (Full, Differential, Incremental)
- Strict automatic pruning
-- Implement unmount of USB volumes.
- Use "./config no-idea no-mdc2 no-rc5" on building OpenSSL for
Win32 to avoid patent problems.
-- Implement Bacula plugins -- design API
+- Implement multiple jobid specification for the cancel command,
+ similar to what is permitted on the update slots command.
+ - Better yet allow wild-cards or regexes.
+- Add Group resource for grouping Jobs so they can all be
+ run at the same time or canceled at the same time.
- modify pruning to keep a fixed number of versions of a file,
if requested.
- the cd-command should allow complete paths
i.e. cd /foo/bar/foo/bar
-> if a customer mails me the path to a certain file,
its faster to enter the specified directory
-- Fix bpipe.c so that it does not modify results pointer.
- ***FIXME*** calling sequence should be changed.
- Make tree walk routines like cd, ls, ... more user friendly
by handling spaces better.
+- When doing a restore, if the user does an "update slots"
+ after the job started in order to add a restore volume, the
+ values prior to the update slots will be put into the catalog.
+ Must retrieve the catalog record, merge it, then write it back at the
+ end of the restore job, if we want to do this right.
=== rate design
jcr->last_rate
jcr->last_runtime
02-Nov 12:58 rufus-sd JobId 10: Wrote label to prelabeled Volume "Vol001" on device "DDS-4" (/dev/nst0)
02-Nov 12:58 rufus-sd JobId 10: Alert: TapeAlert[7]: Media Life: The tape has reached the end of its useful life.
02-Nov 12:58 rufus-dir JobId 10: Bacula rufus-dir 2.3.6 (26Oct07): 02-Nov-2007 12:58:51
-- Eliminate: /var is a different filesystem. Will not descend from / into /var
- Separate Files and Directories in catalog
- Create FileVersions table
- Look at rsync for incremental updates and dedupping
each data chunk -- according to James Harper 9Jan07.
- Features
- Better scheduling
- - Full at least once a month, ...
- - Cancel Inc if Diff/Full running
- More intelligent re-run
- - New/deleted file backup
- FD plugins
- Incremental backup -- rsync, Stow
-
For next release:
- Try to fix bscan not working with multiple DVD volumes bug #912.
- Look at mondo/mindi
- Make Bacula by default not backup tmpfs, procfs, sysfs, ...
- Fix hardlinked immutable files when linking a second file, the
immutable flag must be removed prior to trying to link it.
-- Implement Python event for backing up/restoring a file.
- Change dbcheck to tell users to use native tools for fixing
broken databases, and to ensure they have the proper indexes.
- add udev rules for Bacula devices.
http://linuxwiki.de/Bacula (in German)
- Possibly allow SD to spool even if a tape is not mounted.
-- Fix re-read of last block to check if job has actually written
- a block, and check if block was written by a different job
- (i.e. multiple simultaneous jobs writing).
- Figure out how to configure query.sql. Suggestion to use m4:
== changequote.m4 ===
changequote(`[',`]')dnl
The problem is that it requires m4, which is not present on all machines
at ./configure time.
-- Given all the problems with FIFOs, I think the solution is to do something a
- little different, though I will look at the code and see if there is not some
- simple solution (i.e. some bug that was introduced). What might be a better
- solution would be to use a FIFO as a sort of "key" to tell Bacula to read and
- write data to a program rather than the FIFO. For example, suppose you
- create a FIFO named:
-
- /home/kern/my-fifo
-
- Then, I could imagine if you backup and restore this file with a direct
- reference as is currently done for fifos, instead, during backup Bacula will
- execute:
-
- /home/kern/my-fifo.backup
-
- and read the data that my-fifo.backup writes to stdout. For restore, Bacula
- will execute:
-
- /home/kern/my-fifo.restore
-
- and send the data backed up to stdout. These programs can either be an
- executable or a shell script and they need only read/write to stdin/stdout.
-
- I think this would give a lot of flexibility to the user without making any
- significant changes to Bacula.
-
==== SQL
# get null file
- Bug: if a job is manually scheduled to run later, it does not appear
in any status report and cannot be cancelled.
-==== Keeping track of deleted/new files ====
-- To mark files as deleted, run essentially a Verify to disk, and
- when a file is found missing (MarkId != JobId), then create
- a new File record with FileIndex == -1. This could be done
- by the FD at the same time as the backup.
-
- My "trick" for keeping track of deletions is the following.
- Assuming the user turns on this option, after all the files
- have been backed up, but before the job has terminated, the
- FD will make a pass through all the files and send their
- names to the DIR (*exactly* the same as what a Verify job
- currently does). This will probably be done at the same
- time the files are being sent to the SD avoiding a second
- pass. The DIR will then compare that to what is stored in
- the catalog. Any files in the catalog but not in what the
- FD sent will receive a catalog File entry that indicates
- that at that point in time the file was deleted. This
- either transmitted to the FD or simultaneously computed in
- the FD, so that the FD can put a record on the tape that
- indicates that the file has been deleted at this point.
- A delete file entry could potentially be one with a FileIndex
- of 0 or perhaps -1 (need to check if FileIndex is used for
- some other thing as many of the Bacula fields are "overloaded"
- in the SD).
-
- During a restore, any file initially picked up by some
- backup (Full, ...) then subsequently having a File entry
- marked "delete" will be removed from the tree, so will not
- be restored. If a file with the same name is later OK it
- will be inserted in the tree -- this already happens. All
- will be consistent except for possible changes during the
- running of the FD.
-
- Since I'm on the subject, some of you may be wondering what
- the utility of the in memory tree is if you are going to
- restore everything (at least it comes up from time to time
- on the list). Well, it is still *very* useful because it
- allows only the last item found for a particular filename
- (full path) to be entered into the tree, and thus if a file
- is backed up 10 times, only the last copy will be restored.
- I recently (last Friday) restored a complete directory, and
- the Full and all the Differential and Incremental backups
- spanned 3 Volumes. The first Volume was not even mounted
- because all the files had been updated and hence backed up
- since the Full backup was made. In this case, the tree
- saved me a *lot* of time.
-
- Make sure this information is stored on the tape too so
- that it can be restored directly from the tape.
-
- All the code (with the exception of formally generating and
- saving the delete file entries) already exists in the Verify
- Catalog command. It explicitly recognizes added/deleted files since
- the last InitCatalog. It is more or less a "simple" matter of
- taking that code and adapting it slightly to work for backups.
-
- Comments from Martin Simmons (I think they are all covered):
- Ok, that should cover the basics. There are few issues though:
-
- - Restore will depend on the catalog. I think it is better to include the
- extra data in the backup as well, so it can be seen by bscan and bextract.
-
- - I'm not sure if it will preserve multiple hard links to the same inode. Or
- maybe adding or removing links will cause the data to be dumped again?
-
- - I'm not sure if it will handle renamed directories. Possibly it will work
- by dumping the whole tree under a renamed directory?
-
- - It remains to be seen how the backup performance of the DIR's will be
- affected when comparing the catalog for a large filesystem.
-
- 1. Use the current Director in-memory tree code (very fast), but currently in
- memory. It probably could be paged.
-
- 2. Use some DB such as Berkeley DB or SQLite. SQLite is already compiled and
- built for Win32, and it is something we could compile into the program.
-
- 3. Implement our own custom DB code.
-
- Note, by appropriate use of Directives in the Director, we can dynamically
- decide if the work is done in the Director or in the FD, and we can even
- allow the user to choose.
-
-=== most recent accurate file backup/restore ===
- Here is a sketch (i.e. more details must be filled in later) that I recently
- made of an algorithm for doing Accurate Backup.
-
- 1. Dir informs FD that it is doing an Accurate backup and lookup done by
- Director.
-
- 2. FD passes through the file system doing a normal backup based on normal
- conditions, recording the names of all files and their attributes, and
- indicating which files were backed up. This is very similar to what Verify
- does.
-
- 3. The Director receives the two lists of files at the end of the FD backup.
- One, files backed up, and one files not backed up. It then looks up all the
- files not backed up (using Verify style code).
-
- 4. The Dir sends the FD a list of:
- a. Additional files to backup (based on user specified criteria, name, size
- inode date, hash, ...).
- b. Files to delete.
-
- 5. Dir deletes list of file not backed up.
-
- 6. FD backs up additional files generates a list of those backed up and sends
- it to the Director, which adds it to the list of files backed up. The list
- is now complete and current.
-
- 7. The FD generates delete records for all the files that were deleted and
- sends to the SD.
-
- 8. The Dir deletes the previous CurrentBackup list, and then does a
- transaction insert of the new list that it has.
-
- 9. The rest works as before ...
-
- That is it.
-
- Two new tables needed.
- 1. CurrentBackupId table that contains Client, JobName, FileSet, and a unique
- BackupId. This is created during a Full save, and the BackupId can be set to
- the JobId of the Full save. It will remain the same until another Full
- backup is done. That is when new records are added during a Differential or
- Incremental, they must use the same BackupId.
-
- 2. CurrentBackup table that contains essentially a File record (less a number
- of fields, but with a few extra fields) -- e.g. a flag that the File was
- backed up by a Full save (this permits doing a Differential). The unique
- BackupId allows us to look up the CurrentBackup for a particular Client,
- Jobname, FileSet using that unique BackupId as the key, so this table must be
- indexed by the BackupId.
-
- Note any time a file is saved by the FD other than during a Full save, the
- Full save flag is cleared. When doing a Differential backup, if a file has
- the Full save flag set, it is skipped, otherwise it is backed up. For an
- Incremental backup, we check to see if the file has changed since the last
- time we backed it up.
-
- Deleted files should have FileIndex == 0
====
From David:
==========
- Make output from status use html table tags for nicely
presenting in a browser.
-- Can one write tapes faster with 8192 byte block sizes?
-- Document security problems with the same password for everyone in
- rpm and Win32 releases.
- Browse generations of files.
- I've seen an error when my catalog's File table fills up. I
then have to recreate the File table with a larger maximum row
- Use gather write() for network I/O.
- Autorestart on crash.
- Add bandwidth limiting.
-- Add acks every once and a while from the SD to keep
- the line from timing out.
- When an error in input occurs and conio beeps, you can back
up through the prompt.
- Detect fixed tape block mode during positioning by looking at
- Allow the user to select JobType for manual pruning/purging.
- bscan does not put first of two volumes back with all info in
bscan-test.
-- Implement the FreeBSD nodump flag in chflags.
- Figure out how to make named console messages go only to that
console and to the non-restricted console (new console class?).
- Make restricted console prompt for password if *ask* is set or
- Setup lrrd graphs: (http://www.linpro.no/projects/lrrd/) Mike Acar.
- Revisit the question of multiple Volumes (disk) on a single device.
- Add a block copy option to bcopy.
-- Finish work on Gnome restore GUI.
- Fix "llist jobid=xx" where no fileset or client exists.
- For each job type (Admin, Restore, ...) require only the really necessary
fields.- Pass Director resource name as an option to the Console.
- Add a "batch" mode to the Console (no unsolicited queries, ...).
-- Add a .list all files in the restore tree (probably also a list all files)
- Do both a long and short form.
- Allow browsing the catalog to see all versions of a file (with
stat data on each file).
- Restore attributes of directory if replace=never set but directory
- Check new HAVE_WIN32 open bits.
- Check if the tape has moved before writing.
- Handling removable disks -- see below:
-- Keep track of tape use time, and report when cleaning is necessary.
- Add FromClient and ToClient keywords on restore command (or
BackupClient RestoreClient).
- Implement a JobSet, which groups any number of jobs. If the
JobSet is started, all the jobs are started together.
Allow Pool, Level, and Schedule overrides.
-- Enhance cancel to timeout BSOCK packets after a specific delay.
-- Do scheduling by UTC using gmtime_r() in run_conf, scheduler, and
- ua_status.!!! Thanks to Alan Brown for this tip.
- Look at updating Volume Jobs so that Max Volume Jobs = 1 will work
correctly for multiple simultaneous jobs.
-- Correct code so that FileSet MD5 is calculated for < and | filename
- generation.
- Implement the Media record flag that indicates that the Volume does disk
addressing.
- Implement VolAddr, which is used when Volume is addressed like a disk,
and form it from VolFile and VolBlock.
-- Make multiple restore jobs for multiple media types specifying
- the proper storage type.
- Fix fast block rejection (stored/read_record.c:118). It passes a null
pointer (rec) to try_repositioning().
-- Look at extracting Win data from BackupRead.
- Implement RestoreJobRetention? Maybe better "JobRetention" in a Job,
which would take precedence over the Catalog "JobRetention".
- Implement Label Format in Add and Label console commands.
-- Possibly up network buffers to 65K. Put on variable.
- Put email tape request delays on one or more variables. User wants
to cancel the job after a certain time interval. Maximum Mount Wait?
- Job, Client, Device, Pool, or Volume?
support for Oracle database ??
===
- Look at adding SQL server and Exchange support for Windows.
-- Make dev->file and dev->block_num signed integers so that -1 can
- be an invalid value which happens with BSR.
- Create VolAddr for disk files in place of VolFile and VolBlock. This
is needed to properly specify ranges.
- Add progress of files/bytes to SD and FD.
- Implement some way for the File daemon to contact the Director
to start a job or pass its DHCP obtained IP number.
- Implement a query tape prompt/replace feature for a console
-- Copy console @ code to gnome2-console
- Make sure that Bacula rechecks the tape after the 20 min wait.
- Set IO_NOWAIT on Bacula TCP/IP packets.
- Try doing a raw partition backup and restore by mounting a
- What to do about "list files job=xxx".
- Look at how fuser works and /proc/PID/fd that is how Nic found the
file descriptor leak in Bacula.
-- Implement WrapCounters in Counters.
-- Add heartbeat from FD to SD if hb interval expires.
- Can we dynamically change FileSets?
- If pool specified to label command and Label Format is specified,
automatically generate the Volume name.
-- Why can't SQL do the filename sort for restore?
- Add ExhautiveRestoreSearch
- Look at the possibility of loading only the necessary
data into the restore tree (i.e. do it one directory at a
run the job but don't save the files.
- Make things like list where a file is saved case independent for
Windows.
-- Use autochanger to handle multiple devices.
- Implement a Recycle command
- Start working on Base jobs.
-- Implement UnsavedFiles DB record.
- From Phil Stracchino:
It would probably be a per-client option, and would be called
something like, say, "Automatically purge obsoleted jobs". What it
- bscan without -v is too quiet -- perhaps show jobs.
- Add code to reject whole blocks if not wanted on restore.
- Check if we can increase Bacula FD priority in Win2000
-- Make sure the MaxVolFiles is fully implemented in SD
- Check if both CatalogFiles and UseCatalog are set to SD.
- Possibly add email to Watchdog if drive is unmounted too
long and a job is waiting on the drive.
- Implement script driven addition of File daemon to config files.
- Think about how to make Bacula work better with File (non-tape) archives.
- Write Unix emulator for Windows.
-- Put memory utilization in Status output of each daemon
- if full status requested or if some level of debug on.
- Make database type selectable by .conf files i.e. at runtime
- Set flag for uname -a. Add to Volume label.
- Restore files modified after date
- MaxWarnings
- MaxErrors (job?)
=====
-- FD sends unsaved file list to Director at end of job (see
- RFC below).
-- File daemon should build list of files skipped, and then
- at end of save retry and report any errors.
- Write a Storage daemon that uses pipes and
standard Unix programs to write to the tape.
See afbackup.
- Need something that monitors the JCR queue and
times out jobs by asking the daemons where they are.
- Enhance Jmsg code to permit buffering and saving to disk.
-- device driver = "xxxx" for drives.
- Verify from Volume
-- Ensure that /dev/null works
- Need report class for messages. Perhaps
report resource where report=group of messages
- enhance scan_attrib and rename scan_jobtype, and
Nobody is dying for them, but when you see what it does, you will die
without it.
-3. Restoring deleted files: Since I think my comments in (2) above
-have low probability of implementation, I'll also suggest that you
-could approach the issue of deleted files by a mechanism of having the
-fd report to the dir, a list of all files on the client for every
-backup job. The dir could note in the database entry for each file
-the date that the file was seen. Then if a restore as of date X takes
-place, only files that exist from before X until after X would be
-restored. Probably the major cost here is the extra date container in
-each row of the files table.
-
-Thanks for "listening". I hope some of this helps. If you want to
-contact me, please send me an email - I read some but not all of the
-mailing list traffic and might miss a reply there.
-
-Please accept my compliments for bacula. It is doing a great job for
-me!! I sympathize with you in the need to wrestle with excelence in
-execution vs. excelence in feature inclusion.
-
Regards,
Jerry Schieffer
==============================
Longer term to do:
-- Design at hierarchial storage for Bacula. Migration and Clone.
- Implement FSM (File System Modules).
- Audit M_ error codes to ensure they are correct and consistent.
- Add variable break characters to lex analyzer.
continue a save if the Director goes down (this
is NOT currently the case). Must detect socket error,
buffer messages for later.
-- Enhance time/duration input to allow multiple qualifiers e.g. 3d2h
- Add ability to backup to two Storage devices (two SD sessions) at
the same time -- e.g. onsite, offsite.
-- Compress or consolidate Volumes of old possibly deleted files. Perhaps
- someway to do so with every volume that has less than x% valid
- files.
-
-
-Migration: Move a backup from one Volume to another
-Clone: Copy a backup -- two Volumes
-
======================================================
Base Jobs design
VolSessionId and VolSessionTime.
=========================================================
+=========================================================
+ Preliminary design of Deletion of disk volumes
+
+Item 5: Deletion of disk Volumes when pruned
+ Date: Nov 25, 2005
+ Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
+ by Kern)
+ Status:
+
+ What: Provide a way for Bacula to automatically remove Volumes
+ from the filesystem, or optionally to truncate them.
+ Obviously, the Volume must be pruned prior to removal.
+
+ Why: This would allow users more control over their Volumes and
+ prevent disk based volumes from consuming too much space.
+
+ Notes: The following two directives might do the trick:
+
+ Volume Data Retention = <time period>
+ Remove Volume After = <time period>
+
+ The migration project should also remove a Volume that is
+ migrated. This might also work for tape Volumes.
+
+ Notes: (Kern). The data fields to control this have been added
+ to the new 3.0.0 database table structure.
+
+As noted above, in version 3.0.0, we added a new Media column
+named ActionOnPurge, which is a TINYINT (smallint in PostgreSQL).
+The purpose of this field is to have a flag set with each Volume
+that determines certain actions that will be performed when a
+Volume is being marked Purged (i.e. when there are no longer any
+Job records pointing to that Volume).
+
+We have envisioned that ActionOnPurge could take on the following
+values (some are exclusive and others inclusive):
+
+ Flag Value   Comments
+ Delete       Delete the Volume from the catalog and disk.
+              What delete means for a tape is unclear.
+ Truncate     Truncate the Volume.
+ Erase        Erase the Volume (overwrite data); could be
+              very time consuming. Erase could be specified
+              with either Truncate or Delete.
+
+Implementation details:
+- ActionOnPurge is probably a bit mask.
+- There needs to be a new Directive in the Pool resource that allows
+ setting of this flag.
+- The flag must be passed to the SD along with the current Volume information.
+- There needs to be a new command sent from the Director to the SD
+ that indicates that a Purge was done, the Volume name, and that it
+ should be handled.
+- For security reasons the SD must very carefully check that it actually
+ can find the correct volume. This means, it must mount it, read the label
+ or already have done so, and verify that the Volume is really there.
+ Then the SD can perform the requested function (delete or truncate).
+- Doing an Erase could be implemented later.
+- In the above Feature Request, the proposed Volume Data Retention
+ directive is already implemented with Volume Retention Interval.
+- In the above Feature Request, the proposed Remove Volume After is
+ a bit problematic as it means that some action must occur some time
+ later, and currently Bacula has no mechanism to handle such events.
+ This will probably be considered as a feature to be added later
+ if there is sufficient demand.
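
The implementation details above say ActionOnPurge is probably a bit mask. The following is a minimal sketch of that idea, not Bacula code. The bit values, the class and function names, and the rule that Delete and Truncate exclude each other are assumptions made for illustration; the notes only name the three flags and say that Erase could be combined with either of the other two. Python is used only for brevity.

```python
from enum import IntFlag


class ActionOnPurge(IntFlag):
    # Hypothetical bit values; the notes do not assign any numbers.
    NONE = 0
    DELETE = 1
    TRUNCATE = 2
    ERASE = 4


def is_valid(flags):
    """Reject Delete+Truncate together (assumed to be mutually exclusive).

    Erase is allowed alongside either of them, as the notes describe.
    """
    both = ActionOnPurge.DELETE | ActionOnPurge.TRUNCATE
    return (flags & both) != both


def fits_tinyint(flags):
    """Check that the mask fits a signed TINYINT column (0..127)."""
    return 0 <= int(flags) <= 127


def describe(flags):
    """Return a readable name for a mask, e.g. 'DELETE+ERASE'."""
    order = (ActionOnPurge.DELETE, ActionOnPurge.TRUNCATE, ActionOnPurge.ERASE)
    names = [flag.name for flag in order if flags & flag]
    return "+".join(names) or "NONE"
```

For example, `describe(ActionOnPurge.DELETE | ActionOnPurge.ERASE)` yields `DELETE+ERASE`, while `is_valid(ActionOnPurge.DELETE | ActionOnPurge.TRUNCATE)` is false under the exclusivity assumption stated above.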
+
+=========================================================
+
+Item 1: Ability to restart failed jobs
+ Date: 26 April 2009
+ Origin: Kern/Eric
+ Status:
+
+ What: Often jobs fail because of a communications line drop or max run time,
+ cancel, or some other non-critical problem. Currently any data
+ saved is lost. This implementation should modify the Storage daemon
+ so that it saves all the files that it knows are completely backed
+ up to the Volume.
+
+ The jobs should then be marked as incomplete, and a subsequent
+ Incremental Accurate backup will then take into account the
+ previously saved job.
+
+ Why: Avoids backing up data already saved.
+
+ Notes: Requires Accurate to restart correctly. The job must have stored a
+ minimum volume of data or files on the Volume before restart is enabled.
+
+ Implementation notes:
+ - Must define new I job termination code for incomplete Jobs -- Done
+ - In the SD must track the position of the attributes being spooled
+ when data is actually written to the Volume -- Done
+ - In the SD, truncate the attributes to the last valid file written
+ to the Volume
+ - The Dir must pass the restart flag to the SD -- Done
+ - If restart flag is sent in SD, and Job fails, must truncate attribute
+ file and send it to Dir marking the job as I (incomplete).
+ - In Dir when a Job is restarted, if there is an Incomplete job, must
+ send Accurate information to FD.
+ - In FD must use accurate information
+ - If Incomplete job finishes, must mark it T.
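
The SD-side bookkeeping in the implementation notes above can be sketched as follows. This is an illustrative model only, not the Storage daemon's actual data structures: the `AttributeSpool` class, its method names, and the in-memory list are hypothetical. Only the termination codes `I` (incomplete) and `T` come from the notes.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeSpool:
    # Spooled attribute records, in arrival order: (file_index, attrs).
    records: list = field(default_factory=list)
    # Number of leading records whose file data is fully on the Volume.
    committed: int = 0

    def spool(self, file_index, attrs):
        """Spool the attributes of one file as they arrive from the FD."""
        self.records.append((file_index, attrs))

    def mark_written(self, file_index):
        """Note that all files up to file_index are completely on the Volume."""
        for pos, (idx, _attrs) in enumerate(self.records):
            if idx == file_index:
                self.committed = max(self.committed, pos + 1)
                return

    def finish(self, job_ok):
        """Return (termination code, attribute records to send to the Dir).

        On failure the spool is truncated to the last valid file written,
        and the job is reported as 'I' so that it can be restarted later.
        """
        if job_ok:
            return "T", list(self.records)
        return "I", self.records[: self.committed]
```

With three files spooled and only the first two confirmed on the Volume, `finish(job_ok=False)` returns code `I` and the first two records, which is the set a later Accurate restart could treat as already saved.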
-=====
- Multiple drive autochanger data: see Alan Brown
- mtx -f xxx unloadStorage Element 1 is Already Full(drive 0 was empty)
- Unloading Data Transfer Element into Storage Element 1...source Element
- Address 480 is Empty
-
- (drive 0 was empty and so was slot 1)
- > mtx -f xxx load 15 0
- no response, just returns to the command prompt when complete.
- > mtx -f xxx status Storage Changer /dev/changer:2 Drives, 60 Slots ( 2 Import/Export )
- Data Transfer Element 0:Full (Storage Element 15 Loaded):VolumeTag = HX001
- Data Transfer Element 1:Empty
- Storage Element 1:Empty
- Storage Element 2:Full :VolumeTag=HX002
- Storage Element 3:Full :VolumeTag=HX003
- Storage Element 4:Full :VolumeTag=HX004
- Storage Element 5:Full :VolumeTag=HX005
- Storage Element 6:Full :VolumeTag=HX006
- Storage Element 7:Full :VolumeTag=HX007
- Storage Element 8:Full :VolumeTag=HX008
- Storage Element 9:Full :VolumeTag=HX009
- Storage Element 10:Full :VolumeTag=HX010
- Storage Element 11:Empty
- Storage Element 12:Empty
- Storage Element 13:Empty
- Storage Element 14:Empty
- Storage Element 15:Empty
- Storage Element 16:Empty....
- Storage Element 28:Empty
- Storage Element 29:Full :VolumeTag=CLNU01L1
- Storage Element 30:Empty....
- Storage Element 57:Empty
- Storage Element 58:Full :VolumeTag=NEX261L2
- Storage Element 59 IMPORT/EXPORT:Empty
- Storage Element 60 IMPORT/EXPORT:Empty
- $ mtx -f xxx unload
- Unloading Data Transfer Element into Storage Element 15...done
-
- (just to verify it remembers where it came from, however it can be
- overrriden with mtx unload {slotnumber} to go to any storage slot.)
- Configuration wise:
- There needs to be a table of drive # to devices somewhere - If there are
- multiple changers or drives there may not be a 1:1 correspondance between
- changer drive number and system device name - and depending on the way the
- drives are hooked up to scsi busses, they may not be linearly numbered
- from an offset point either.something like
-
- Autochanger drives = 2
- Autochanger drive 0 = /dev/nst1
- Autochanger drive 1 = /dev/nst2
- IMHO, it would be _safest_ to use explicit mtx unload commands at all
- times, not just for multidrive changers. For a 1 drive changer, that's
- just:
-
- mtx load xx 0
- mtx unload xx 0
-
- MTX's manpage (1.2.15):
- unload [<slotnum>] [ <drivenum> ]
- Unloads media from drive <drivenum> into slot
- <slotnum>. If <drivenum> is omitted, defaults to
- drive 0 (as do all commands). If <slotnum> is
- omitted, defaults to the slot that the drive was
- loaded from. Note that there's currently no way
- to say 'unload drive 1's media to the slot it
- came from', other than to explicitly use that
- slot number as the destination.AB
-====
-
-====
-SCSI info:
-FreeBSD
-undef# camcontrol devlist
-<WANGTEK 51000 SCSI M74H 12B3> at scbus0 target 2 lun 0 (pass0,sa0)
-<ARCHIVE 4586XX 28887-XXX 4BGD> at scbus0 target 4 lun 0 (pass1,sa1)
-<ARCHIVE 4586XX 28887-XXX 4BGD> at scbus0 target 4 lun 1 (pass2)
-
-tapeinfo -f /dev/sg0 with a bad tape in drive 1:
-[kern@rufus mtx-1.2.17kes]$ ./tapeinfo -f /dev/sg0
-Product Type: Tape Drive
-Vendor ID: 'HP '
-Product ID: 'C5713A '
-Revision: 'H107'
-Attached Changer: No
-TapeAlert[3]: Hard Error: Uncorrectable read/write error.
-TapeAlert[20]: Clean Now: The tape drive neads cleaning NOW.
-MinBlock:1
-MaxBlock:16777215
-SCSI ID: 5
-SCSI LUN: 0
-Ready: yes
-BufferedMode: yes
-Medium Type: Not Loaded
-Density Code: 0x26
-BlockSize: 0
-DataCompEnabled: yes
-DataCompCapable: yes
-DataDeCompEnabled: yes
-CompType: 0x20
-DeCompType: 0x0
-Block Position: 0
-=====
-
====
Handling removable disks
=== Done
-- Why the heck doesn't bacula drop root priviledges before connecting to
- the DB?
-- Look at using posix_fadvise(2) for backups -- see bug #751.
- Possibly add the code at findlib/bfile.c:795
-/* TCP socket options */
-#define TCP_KEEPIDLE 4 /* Start keeplives after this period */
-- Fix bnet_connect() code to set a timer and to use time to
- measure the time.
-- Implement 4th argument to make_catalog_backup that passes hostname.
-- Test FIFO backup/restore -- make regression
-- Please mount volume "xxx" on Storage device ... should also list
- Pool and MediaType in case user needs to create a new volume.
-- On restore add Restore Client, Original Client.
-01-Apr 00:42 rufus-dir: Start Backup JobId 55, Job=kernsave.2007-04-01_00.42.48
-01-Apr 00:42 rufus-sd: Python SD JobStart: JobId=55 Client=Rufus
-01-Apr 00:42 rufus-dir: Created new Volume "Full0001" in catalog.
-01-Apr 00:42 rufus-dir: Using Device "File"
-01-Apr 00:42 rufus-sd: kernsave.2007-04-01_00.42.48 Warning: Device "File" (/tmp) not configured to autolabel Volumes.
-01-Apr 00:42 rufus-sd: kernsave.2007-04-01_00.42.48 Warning: Device "File" (/tmp) not configured to autolabel Volumes.
-01-Apr 00:42 rufus-sd: Please mount Volume "Full0001" on Storage Device "File" (/tmp) for Job kernsave.2007-04-01_00.42.48
-01-Apr 00:44 rufus-sd: Wrote label to prelabeled Volume "Full0001" on device "File" (/tmp)
-- Check if gnome-console works with TLS.
-- The director seg faulted when I omitted the pool directive from a
- job resource. I was experimenting and thought it redundant that I had
- specified Pool, Full Backup Pool, and Differential Backup Pool, but
- apparently not. This happened when I removed the pool directive and
- started the director.
-- Add Where: client:/.... to restore job report.
-- Ensure that moving a purged Volume in ua_purge.c to the RecyclePool
- does the right thing.
-- FD-SD quick disconnect
-- Building the in memory restore tree is slow.
-- Abort if min_block_size > max_block_size
-- Add the ability to consolidate old backup sets (basically do a restore
- to tape and appropriately update the catalog). Compress Volume sets.
- Might need to spool via file if only one drive is available.
-- Why doesn't @"xxx abc" work in a conf file?
-- Don't restore Solaris Door files:
- #define S_IFDOOR in st_mode.
- see: http://docs.sun.com/app/docs/doc/816-5173/6mbb8ae23?a=view#indexterm-360
-- Figure out how to recycle Scratch volumes back to the Scratch Pool.
-- Implement Despooling data status.
-- Use E'xxx' to escape PostgreSQL strings.
-- Look at mincore: http://insights.oetiker.ch/linux/fadvise.html
-- Unicode input http://en.wikipedia.org/wiki/Byte_Order_Mark
-- Look at moving the Storage directive from the Job to the
- Pool in the default conf files.
-- Look at the following in src/filed/backup.c:
-> pm_strcpy(ff_pkt->fname, ff_pkt->fname_save);
-> pm_strcpy(ff_pkt->link, ff_pkt->link_save);
-- Add Catalog = to Pool resource so that pools will exist
- in only one catalog -- currently Pools are "global".
-- Add TLS to bat (should be done).
-=== Duplicate jobs ===
- These apply only to backup jobs.
-
- 1. Allow Duplicate Jobs = Yes | No | Higher (Yes)
-
- 2. Duplicate Job Interval = <time-interval> (0)
-
- The defaults are in parentheses and would produce the same behavior as today.
-
- If Allow Duplicate Jobs is set to No, then any job starting while a job of the
- same name is running will be canceled.
-
- If Allow Duplicate Jobs is set to Higher, then any job starting with the same
- or lower level will be canceled, but any job with a Higher level will start.
- The Levels are from High to Low: Full, Differential, Incremental
-
- Finally, if you have Duplicate Job Interval set to a non-zero value, any job
- of the same name that starts more than <time-interval> after a previous job
- of the same name would run; any one that starts within <time-interval> would be
- subject to the above rules. Another way of looking at it is that the Allow
- Duplicate Jobs directive will only apply after <time-interval> of when the
- previous job finished (i.e. it is the minimum interval between jobs).
-
- So in summary:
-
- Allow Duplicate Jobs = Yes | No | HigherLevel | CancelLowerLevel (Yes)
-
- Where HigherLevel cancels any waiting job but not any running job.
- Where CancelLowerLevel is same as HigherLevel but cancels any running job or
- waiting job.
-
- Duplicate Job Proximity = <time-interval> (0)
-
- My suggestion was to define it as the minimum guard time between
- executions of a specific job -- i.e., if a job was scheduled within Job
- Proximity number of seconds, it would be considered a duplicate and
- consolidated.
-
- Skip = Do not allow two or more jobs with the same name to run
- simultaneously within the proximity interval. The second and subsequent
- jobs are skipped without further processing (other than to note the job
- and exit immediately), and are not considered errors.
-
- Fail = The second and subsequent jobs that attempt to run during the
- proximity interval are cancelled and treated as error-terminated jobs.
-
- Promote = If a job is running, and a second/subsequent job of higher
- level attempts to start, the running job is promoted to the higher level
- of processing using the resources already allocated, and the subsequent
- job is treated as in Skip above.
-
-
-DuplicateJobs {
- Name = "xxx"
- Description = "xxx"
- Allow = yes|no (no = default)
-
- AllowHigherLevel = yes|no (no)
-
- AllowLowerLevel = yes|no (no)
-
- AllowSameLevel = yes|no
-
- Cancel = Running | New (no)
-
- CancelledStatus = Fail | Skip (fail)
-
- Job Proximity = <time-interval> (0)
- My suggestion was to define it as the minimum guard time between
- executions of a specific job -- i.e., if a job was scheduled within Job
- Proximity number of seconds, it would be considered a duplicate and
- consolidated.
-
-}
===
+- Fix bpipe.c so that it does not modify results pointer.
+ ***FIXME*** calling sequence should be changed.
+
+- When reserving a device to read, check to see if the Volume
+ is already in use, if so wait. Probably will need to pass the
+ Volume. See bug #1313. Create a regression test to simulate
+ this problem and see if VolumePollInterval fixes it. Possibly turn
+ it on by default.
+
+- Fix restore of acls and extended attributes to count ERROR
+ messages and make errors non-fatal.
+- Put save/restore various platform acl/xattrs on a pointer to simplify
+ the code.
+- Add blast attributes to DIR to SD.
+- Detect deadlocks in reservations.
+- Plugins:
+ - Add list during dump
+ - Add in plugin code flag
+ - Add bRC_EndJob -- stops more calls to plugin this job
+ - Add bRC_Term (unload plugin)
+ - remove time_t from Jmsg and use utime_t?
+- Deadlock detection, watchdog sees if counter advances when jobs are
+ running. With debug on, can do a "status" command.
+- New directive "Delete purged Volumes"
+- Implement unmount of USB volumes.
+