Kern's ToDo List
- 02 January 2008
+ 02 May 2008
Document:
+- This patch will give Bacula the option to specify, in a FileSet,
+  a file name that users can drop into an Included directory to
+  cause that directory not to be backed up.
+
+ For example, my FileSet contains:
+ # List of files to be backed up
+ FileSet {
+ Name = "Remote Specified1"
+ Include {
+ Options {
+ signature = MD5
+ }
+ File = "\\</etc/bacula-include"
+ IgnoreDir = .notthisone
+ }
+ Exclude {
+ File = "\\</etc/bacula-exclude"
+ }
+ }
+
+ And /etc/bacula-include contains:
+
+ /home
+
+ But in /home, there are hundreds of directories of users and some
+ people want to indicate that they don't want to have certain
+ directories backed-up:
+
+ /home/edwin/www/cache
+ /home/edwin/temp
+
+  So I could put them in /etc/bacula-exclude, but that is a system
+  file and not editable by mortal users. To make it possible for
+  users to tell the system that certain directories don't need to
+  be backed up, they can now create a file called .notthisone in
+  those directories:
+
+ /home/edwin/www/cache/.notthisone
+ /home/edwin/temp/.notthisone
+
+  so that the backup will be kept clear of the rubbish in these two
+  directories, but I, as administrator of the system, don't have to
+  be involved in it.
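+
+  On the implementation side, a minimal sketch of the check the FD
+  could make while walking the tree is shown below. This is only an
+  illustration, not the actual Bacula code; the helper name and its
+  calling convention are invented here. The idea is simply that a
+  directory is skipped when it contains the file named by IgnoreDir:
+
+   #include <string>
+   #include <sys/stat.h>
+
+   /* Return true if "dir" contains the IgnoreDir marker file
+    * (e.g. ".notthisone").  If so, the directory and everything
+    * below it should not be backed up. */
+   static bool has_ignoredir_marker(const std::string &dir,
+                                    const std::string &marker)
+   {
+      struct stat st;
+      std::string path = dir + "/" + marker;
+      return stat(path.c_str(), &st) == 0;
+   }
+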
- !!! Cannot restore two jobs at the same time that were
written simultaneously unless they were totally spooled.
- Document cleaning up the spool files:
Complete rework of the scheduling system (not in list)
Performance and usage instrumentation (not in list)
See email of 21Aug2007 for details.
-- Implement Diff,Inc Retention Periods
- Look at: http://tech.groups.yahoo.com/group/cfg2html
and http://www.openeyet.nl/scc/ for managing customer changes
Priority:
+================
+- Change the calling sequence of delete_job_id_range() in ua_cmds.c
+  so that the preceding strtok() is done inside the subroutine only once.
+- Dangling softlinks are not restored properly. For example, take a
+ soft link such as src/testprogs/install-sh, which points to /usr/share/autoconf...
+ move the directory to another machine where the file /usr/share/autoconf does
+ not exist, back it up, then try a full restore. It fails.
+- Check for FD compatibility -- eg .nobackup ...
+- Re-check new dcr->reserved_volume
+- Softlinks that point to a non-existent file are not restored by
+  "restore all", but are restored if the file is individually selected. BUG!
+- Doc Duplicate Jobs.
+- New directive "Delete purged Volumes"
+- Prune by Job
+- Prune by Job Level (Full, Differential, Incremental)
+- Strict automatic pruning
+- Implement unmount of USB volumes.
- Use "./config no-idea no-mdc2 no-rc5" on building OpenSSL for
Win32 to avoid patent problems.
-- Plugins
-- Implement Despooling data status.
-=== Duplicate jobs ===
- hese apply only to backup jobs.
-
- 1. Allow Duplicate Jobs = Yes | No | Higher (Yes)
-
- 2. Duplicate Job Interval = <time-interval> (0)
-
- The defaults are in parenthesis and would produce the same behavior as today.
-
- If Allow Duplicate Jobs is set to No, then any job starting while a job of the
- same name is running will be canceled.
-
- If Allow Duplicate Jobs is set to Higher, then any job starting with the same
- or lower level will be canceled, but any job with a Higher level will start.
- The Levels are from High to Low: Full, Differential, Incremental
-
- Finally, if you have Duplicate Job Interval set to a non-zero value, any job
- of the same name which starts <time-interval> after a previous job of the
- same name would run, any one that starts within <time-interval> would be
- subject to the above rules. Another way of looking at it is that the Allow
- Duplicate Jobs directive will only apply after <time-interval> of when the
- previous job finished (i.e. it is the minimum interval between jobs).
-
- So in summary:
-
- Allow Duplicate Jobs = Yes | No | HigherLevel | CancelLowerLevel (Yes)
-
- Where HigherLevel cancels any waiting job but not any running job.
- Where CancelLowerLevel is same as HigherLevel but cancels any running job or
- waiting job.
-
- Duplicate Job Proximity = <time-interval> (0)
-
- Skip = Do not allow two or more jobs with the same name to run
- simultaneously within the proximity interval. The second and subsequent
- jobs are skipped without further processing (other than to note the job
- and exit immediately), and are not considered errors.
-
- Fail = The second and subsequent jobs that attempt to run during the
- proximity interval are cancelled and treated as error-terminated jobs.
-
- Promote = If a job is running, and a second/subsequent job of higher
- level attempts to start, the running job is promoted to the higher level
- of processing using the resources already allocated, and the subsequent
- job is treated as in Skip above.
-===
+- Implement multiple jobid specification for the cancel command,
+ similar to what is permitted on the update slots command.
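+  A rough sketch of the kind of range expansion this needs is below;
+  the function name and the example syntax ("cancel jobid=1-3,5") are
+  only assumptions for illustration, and the real console code would
+  presumably share the range parsing already used by update slots:
+
+   #include <cstdint>
+   #include <cstdlib>
+   #include <cstring>
+   #include <vector>
+
+   /* Expand a specification such as "1-3,5" into individual JobIds. */
+   static std::vector<uint32_t> parse_jobid_list(const char *spec)
+   {
+      std::vector<uint32_t> ids;
+      char *copy = strdup(spec);
+      for (char *tok = strtok(copy, ","); tok; tok = strtok(NULL, ",")) {
+         char *dash = strchr(tok, '-');
+         uint32_t lo = strtoul(tok, NULL, 10);
+         uint32_t hi = dash ? strtoul(dash + 1, NULL, 10) : lo;
+         for (uint32_t id = lo; id <= hi; id++) {
+            ids.push_back(id);
+         }
+      }
+      free(copy);
+      return ids;
+   }
+
+  With such an expansion, "cancel jobid=1-3,5" would cancel JobIds
+  1, 2, 3, and 5.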
+- Implement Bacula plugins -- design API
+- modify pruning to keep a fixed number of versions of a file,
+ if requested.
- the cd-command should allow complete paths
i.e. cd /foo/bar/foo/bar
-> if a customer mails me the path to a certain file,
   it's faster to enter the specified directory
-- Fix bpipe.c so that it does not modify results pointer.
- ***FIXME*** calling sequence should be changed.
- Make tree walk routines like cd, ls, ... more user friendly
by handling spaces better.
=== rate design
running of that Job (i.e. lets any previous invocation finish
before doing Interval testing).
- Look at simplifying File exclusions.
-- New directive "Delete purged Volumes"
-- It appears to me that you have run into some sort of race
- condition where two threads want to use the same Volume and they
- were both given access. Normally that is no problem. However,
- one thread wanted the particular Volume in drive 0, but it was
- loaded into drive 1 so it decided to unload it from drive 1 and
- then loaded it into drive 0, while the second thread went on
- thinking that the Volume could be used in drive 1 not realizing
- that in between time, it was loaded in drive 0.
- I'll look at the code to see if there is some way we can avoid
- this kind of problem. Probably the best solution is to make the
- first thread simply start using the Volume in drive 1 rather than
- transferring it to drive 0.
->>>>>>> .r6288
-- Complete Catalog in Pool
-- Implement Bacula plugins -- design API
->>>>>>> .r6079
- Scripts
-- Prune by Job
-- Prune by Job Level
-- True automatic pruning
-- Duplicate Jobs
- Run, Fail, Skip, Higher, Promote, CancelLowerLevel
- Proximity
- New directive.
- Auto update of slot:
rufus-dir: ua_run.c:456-10 JobId=10 NewJobId=10 using pool Full priority=10
02-Nov 12:58 rufus-dir JobId 10: Start Backup JobId 10, Job=kernsave.2007-11-02_12.58.03
- Create FileVersions table
- Look at rsync for incremental updates and deduplication
- Add MD5 or SHA1 check in SD for data validation
-- modify pruning to keep a fixed number of versions of a file,
- if requested.
- finish implementation of fdcalled -- see ua_run.c:105
- Fix problem in postgresql.c in my_postgresql_query, where the
generation of the error message doesn't differentiate result==NULL
- Implement continue spooling while despooling.
- Remove all install temp files in Win32 PLUGINSDIR.
- Audit retention periods to make sure everything is 64 bit.
-- Use E'xxx' to escape PostgreSQL strings.
- Specifying no "where" in restore causes kaboom.
- Performance: multiple spool files for a single job.
- Performance: despool attributes when despooling data (problem
multiplexing Dir connection).
- Make restore use the in-use volume reservation algorithm.
-- Look at mincore: http://insights.oetiker.ch/linux/fadvise.html
-- Unicode input http://en.wikipedia.org/wiki/Byte_Order_Mark
-- Add TLS to bat (should be done).
- When the Pool specifies Storage, the command-line Storage override does not work.
- Implement wait_for_sysop() message display in wait_for_device(), which
now prints warnings too often.
- Ensure that each device in an Autochanger has a different
Device Index.
-- Add Catalog = to Pool resource so that pools will exist
- in only one catalog -- currently Pools are "global".
- Look at sg_logs -a /dev/sg0 for getting soft errors.
- btape "test" command with Offline on Unmount = yes
and possibly changing the blobs into varchar.
- Ensure that the SD re-reads the Media record if the JobFiles
does not match -- it may have been updated by another job.
-- Look at moving the Storage directive from the Job to the
- Pool in the default conf files.
- Doc items
- Test Volume compatibility between machine architectures
- Encryption documentation
==============================
Longer term to do:
-- Implement wait on multiple objects
- - Multiple max times
- - pthread signal
- - socket input ready
- Design hierarchical storage for Bacula: Migration and Clone.
- Implement FSM (File System Modules).
- Audit M_ error codes to ensure they are correct and consistent.
=========================================================
-==========================================================
- Unsaved File design
-For each Incremental job that is run, there may be files that
-were found but not saved because they were locked (this applies
-only to Windows). Such a system could send back to the Director
-a list of Unsaved files.
-Need:
-- New UnSavedFiles table that contains:
- JobId
- PathId
- FilenameId
-- Then in the next Incremental job, the list of Unsaved Files will be
- feed to the FD, who will ensure that they are explicitly chosen even
- if standard date/time check would not have selected them.
-=============================================================
-
=====
Multiple drive autochanger data: see Alan Brown
#define S_IFDOOR in st_mode.
see: http://docs.sun.com/app/docs/doc/816-5173/6mbb8ae23?a=view#indexterm-360
- Figure out how to recycle Scratch volumes back to the Scratch Pool.
+- Implement Despooling data status.
+- Use E'xxx' to escape PostgreSQL strings.
+- Look at mincore: http://insights.oetiker.ch/linux/fadvise.html
+- Unicode input http://en.wikipedia.org/wiki/Byte_Order_Mark
+- Look at moving the Storage directive from the Job to the
+ Pool in the default conf files.
+- Look at this in src/filed/backup.c:
+> pm_strcpy(ff_pkt->fname, ff_pkt->fname_save);
+> pm_strcpy(ff_pkt->link, ff_pkt->link_save);
+- Add Catalog = to Pool resource so that pools will exist
+ in only one catalog -- currently Pools are "global".
+- Add TLS to bat (should be done).
+=== Duplicate jobs ===
+- Done, but implemented somewhat differently than described below!!!
+
+  These apply only to backup jobs.
+
+ 1. Allow Duplicate Jobs = Yes | No | Higher (Yes)
+
+ 2. Duplicate Job Interval = <time-interval> (0)
+
+  The defaults are in parentheses and would produce the same behavior as today.
+
+ If Allow Duplicate Jobs is set to No, then any job starting while a job of the
+ same name is running will be canceled.
+
+  If Allow Duplicate Jobs is set to Higher, then any job starting with the same
+  or lower level will be canceled, but any job with a higher level will start.
+  The levels are, from high to low: Full, Differential, Incremental.
+
+  Finally, if you have Duplicate Job Interval set to a non-zero value, any job
+  of the same name that starts more than <time-interval> after a previous job of
+  the same name will run; any one that starts within <time-interval> will be
+  subject to the above rules. Another way of looking at it is that the Allow
+  Duplicate Jobs directive will only apply after <time-interval> from when the
+  previous job finished (i.e. it is the minimum interval between jobs).
+
+ So in summary:
+
+ Allow Duplicate Jobs = Yes | No | HigherLevel | CancelLowerLevel (Yes)
+
+ Where HigherLevel cancels any waiting job but not any running job.
+  Where CancelLowerLevel is the same as HigherLevel but cancels any running or
+  waiting job.
+
+ Duplicate Job Proximity = <time-interval> (0)
+
+ My suggestion was to define it as the minimum guard time between
+ executions of a specific job -- ie, if a job was scheduled within Job
+ Proximity number of seconds, it would be considered a duplicate and
+ consolidated.
+
+ Skip = Do not allow two or more jobs with the same name to run
+ simultaneously within the proximity interval. The second and subsequent
+ jobs are skipped without further processing (other than to note the job
+ and exit immediately), and are not considered errors.
+
+ Fail = The second and subsequent jobs that attempt to run during the
+ proximity interval are cancelled and treated as error-terminated jobs.
+
+ Promote = If a job is running, and a second/subsequent job of higher
+ level attempts to start, the running job is promoted to the higher level
+ of processing using the resources already allocated, and the subsequent
+ job is treated as in Skip above.
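+
+  To make the proposed semantics concrete, a small decision-logic
+  sketch follows. It is purely illustrative (the enum and function
+  names and parameters are invented here, and as noted above the
+  final implementation ended up somewhat different): within the
+  proximity interval a duplicate is skipped or failed, unless a
+  higher-level job arrives while a lower-level one is running, in
+  which case the running job is promoted.
+
+   #include <ctime>
+
+   /* Illustrative only.  Levels from high to low: Full, Differential,
+    * Incremental (higher enum value = higher level). */
+   enum dup_level { L_INCREMENTAL = 0, L_DIFFERENTIAL = 1, L_FULL = 2 };
+   enum dup_action { DUP_RUN, DUP_SKIP, DUP_FAIL, DUP_PROMOTE_RUNNING };
+
+   static dup_action duplicate_action(dup_level new_level, dup_level running_level,
+                                      bool job_running, time_t prev_end,
+                                      time_t now, time_t proximity, bool fail_dups)
+   {
+      if (now - prev_end > proximity) {
+         return DUP_RUN;             /* outside the proximity interval */
+      }
+      if (job_running && new_level > running_level) {
+         return DUP_PROMOTE_RUNNING; /* Promote: upgrade the running job;
+                                      * the new job is then treated as Skip */
+      }
+      return fail_dups ? DUP_FAIL : DUP_SKIP;
+   }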
+
+
+DuplicateJobs {
+ Name = "xxx"
+ Description = "xxx"
+ Allow = yes|no (no = default)
+
+ AllowHigherLevel = yes|no (no)
+
+ AllowLowerLevel = yes|no (no)
+
+ AllowSameLevel = yes|no
+
+ Cancel = Running | New (no)
+
+ CancelledStatus = Fail | Skip (fail)
+
+ Job Proximity = <time-interval> (0)
+ My suggestion was to define it as the minimum guard time between
+ executions of a specific job -- ie, if a job was scheduled within Job
+ Proximity number of seconds, it would be considered a duplicate and
+ consolidated.
+
+}
+
+===
+- Fix bpipe.c so that it does not modify results pointer.
+ ***FIXME*** calling sequence should be changed.