X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;ds=sidebyside;f=bacula%2Fkernstodo;h=cfa76f1b9acf9ef38a113cc09c7ea50ab11b214e;hb=fc92e04201e428fbf206dbd01518a02490ba50f9;hp=0e1013f55a54987a17169a819b2ff34a77a8f626;hpb=5de97d6f0ee8002de6c0c1a907b86a2c2f42f9bc;p=bacula%2Fbacula

diff --git a/bacula/kernstodo b/bacula/kernstodo
index 0e1013f55a..cfa76f1b9a 100644
--- a/bacula/kernstodo
+++ b/bacula/kernstodo
@@ -1,14 +1,11 @@
 Kern's ToDo List
- 30 November 2005
+ 22 February 2006
 Major development:
 Project      Developer
 =======      =========
-Version 1.37 Kern (see below)
-========================================================
 Document:
-- Does ClientRunAfterJob fail the job on a bad return code?
 - Document cleaning up the spool files:
   db, pid, state, bsr, mail, conmsg, spool
 - Document the multiple-drive-changer.txt script.
@@ -16,16 +13,95 @@ Document:
 - Does WildFile match against full name? Doc.
 - %d and %v only valid on Director, not for ClientRunBefore/After.
+Priority:
+ For 1.39:
+- Fix re-read of last block to check if job has actually written
+  a block, and check if block was written by a different job
+  (i.e. multiple simultaneous jobs writing).
+- JobStatus and Termination codes.
+- Some users claim that they must do two prune commands to get a
+  Volume marked as purged.
+- Print warning message if LANG environment variable does not specify
+  UTF-8.
+=== Migration from David ===
+What I'd like to see:
+
+Job {
+  Name = "-migrate"
+  Type = Migrate
+  Messages = Standard
+  Pool = Default
+  Migration Selection Type = LowestUtil | OldestVol | PoolOccupancy |
+Client | PoolResidence | Volume | JobName | SQLquery
+  Migration Selection Pattern = "regexp"
+  Next Pool =
+}
+
+There should be no need for a Level (migration is always Full, since you
+don't calculate differential/incremental differences for migration),
+Storage should be determined by the volume types in the pool, and Client
+is really a selection issue.
+Migration should always occur to the
+NextPool defined in the pool definition. If no nextpool is defined, the
+job should end with a reason of "no place to go". If Next Pool statement
+is present, we override the check in the pool definition and use the
+pool specified.
+
+Here's how I'd define Migration Selection Types:
+
+With Regexes:
+Client -- Migrate data from selected client only. Migration Selection
+Pattern regexp provides pattern to select client names, eg ^FS00* makes
+all client names starting with FS00 eligible for migration.
+
+Jobname -- Migration all jobs matching name. Migration Selection Pattern
+regexp provides pattern to select jobnames existing in pool.
+
+Volume -- Migrate all data on specified volumes. Migration Selection
+Pattern regexp provides selection criteria for volumes to be migrated.
+Volumes must exist in pool to be eligible for migration.
+
+
+With Regex optional:
+LowestUtil -- Identify the volume in the pool with the least data on it
+and empty it. No Migration Selection Pattern required.
+
+OldestVol -- Identify the LRU volume with data written, and empty it. No
+Migration Selection Pattern required.
+
+PoolOccupancy -- if pool occupancy exceeds , migrate volumes
+(starting with most full volumes) until pool occupancy drops below
+. Pool highmig and lowmig values are in pool definition, no
+Migration Selection Pattern required.
+
+
+No regex:
+SQLQuery -- Migrate all jobuids returned by the supplied SQL query.
+Migration Selection Pattern contains SQL query to execute; should return
+a list of 1 or more jobuids to migrate.
+
+PoolResidence -- Migrate data sitting in pool for longer than
+PoolResidence value in pool definition. Migration Selection Pattern
+optional; if specified, override value in pool definition (value in
+minutes).
+
+
+[ possibly a Python event -- kes ]
+===
+- run_cmd() returns int should return JobId_t
+- get_next_jobid_from_list() returns int should return JobId_t
+- Document export LDFLAGS=-L/usr/lib64
+- Don't attempt to restore from "Disabled" Volumes.
+- Network error on Win32 should set Win32 error code.
+- What happens when you rename a Disk Volume?
+- Job retention period in a Pool (and hence Volume). The job would
+  then be migrated.
+- Detect resource deadlock in Migrate when same job wants to read
+  and write the same device.
 - Make hardlink code at line 240 of find_one.c use binary search.
 - Queue warning/error messages during restore so that they
   are reported at the end of the report rather than being
   hidden in the file listing ...
-- A Volume taken from Scratch should take on the retention period
-  of the new pool.
-- Correct doc for Maximum Changer Wait (and others) accepting only
-  integers.
-- Fix Maximum Changer Wait (and others) to accept qualifiers.
 - Look at -D_FORTIFY_SOURCE=2
 - Add Win32 FileSet definition somewhere
 - Look at fixing restore status stats in SD.
@@ -48,7 +124,6 @@ For 1.39:
   running of that Job (i.e. lets any previous invocation finish
   before doing Interval testing).
 - Look at simplifying File exclusions.
-- Fix store_yesno to be store_bitmask.
 - New directive "Delete purged Volumes"
 - new pool XXX with ScratchPoolId = MyScratchPool's PoolId and
   let it fill itself, and RecyclePoolId = XXX's PoolId so I can
@@ -56,8 +131,7 @@ For 1.39:
   MyScratchPool
 - If I want to remove this pool, I set RecyclePoolId = MyScratchPool's
   PoolId, and when it is empty remove it.
-- Figure out how to recycle Scratch volumes back to the Scratch
-  Pool.
+- Figure out how to recycle Scratch volumes back to the Scratch Pool.
 - Add Volume=SCRTCH
 - Allow Check Labels to be used with Bacula labels.
 - "Resuming" a failed backup (lost line for example) by using the
@@ -67,17 +141,6 @@ For 1.39:
   days before it needs changing.
 - Command to show next tape that will be used for a job even if the
   job is not scheduled.
---- create_file.c.orig Fri Jul 8 12:13:05 2005
-+++ create_file.c Fri Jul 8 12:13:07 2005
-@@ -195,6 +195,8 @@
-             attr->ofname, be.strerror());
-          return CF_ERROR;
-       }
-+   } else if(S_ISSOCK(attr->statp.st_mode)) {
-+      Dmsg1(200, "Skipping socket: %s\n", attr->ofname);
-    } else {
-       Dmsg1(200, "Restore node: %s\n", attr->ofname);
-       if (mknod(attr->ofname, attr->statp.st_mode, attr->statp.st_rdev) != 0 && errno != EEXIST) {
-
 From: Arunav Mandal
 1. When jobs are running and bacula for some reason crashes or if I do a
 restart it remembers and jobs it was running before it crashed or restarted
@@ -102,12 +165,6 @@ For 1.39:
 - Fix bpipe.c so that it does not modify results pointer.
   ***FIXME*** calling sequence should be changed.
-1.xx Major Projects:
-#3 Migration (Move, Copy, Archive Jobs)
-#7 Single Job Writing to Multiple Storage Devices
-- Reserve blocks other restore jobs when first cannot connect
-  to SD.
-- Add true/false to conf same as yes/no
 - For Windows disaster recovery see http://unattended.sf.net/
 - regardless of the retention period, Bacula will not prune the
   last Full, Diff, or Inc File data until a month after the
@@ -242,6 +299,50 @@ For 1.39:
 - It remains to be seen how the backup performance of the DIR's will be
   affected when comparing the catalog for a large filesystem.
+====
+From David:
+How about introducing a Type = MgmtPolicy job type? That job type would
+be responsible for scanning the Bacula environment looking for specific
+conditions, and submitting the appropriate jobs for implementing said
+policy, eg:
+
+Job {
+  Name = "Migration-Policy"
+  Type = MgmtPolicy
+  Policy Selection Job Type = Migrate
+  Scope = " "
+  Threshold = " "
+  Job Template =
+}
+
+Where is any legal job keyword, is a comparison
+operator (=,<,>,!=, logical operators AND/OR/NOT) and is a
+appropriate regexp.
+I could see an argument for Scope and Threshold
+being SQL queries if we want to support full flexibility. The
+Migration-Policy job would then get scheduled as frequently as a site
+felt necessary (suggested default: every 15 minutes).
+
+Example:
+
+Job {
+  Name = "Migration-Policy"
+  Type = MgmtPolicy
+  Policy Selection Job Type = Migration
+  Scope = "Pool=*"
+  Threshold = "Migration Selection Type = LowestUtil"
+  Job Template = "MigrationTemplate"
+}
+
+would select all pools for examination and generate a job based on
+MigrationTemplate to automatically select the volume with the lowest
+usage and migrate it's contents to the nextpool defined for that pool.
+
+This policy abstraction would be really handy for adjusting the behavior
+of Bacula according to site-selectable criteria (one thing that pops
+into mind is Amanda's ability to automatically adjust backup levels
+depending on various criteria).
+
+
 =====
 Regression tests:
@@ -1263,3 +1364,43 @@ Block Position: 0
 === Done
 - Make sure that all do_prompt() calls in Dir check for
   -1 (error) and -2 (cancel) returns.
+- Fix foreach_jcr() to have free_jcr() inside next().
+    jcr=jcr_walk_start();
+    for ( ; jcr; (jcr=jcr_walk_next(jcr)) )
+       ...
+    jcr_walk_end(jcr);
+- A Volume taken from Scratch should take on the retention period
+  of the new pool.
+- Correct doc for Maximum Changer Wait (and others) accepting only
+  integers.
+- Implement status that shows why a job is being held in reserve, or
+  rather why none of the drives are suitable.
+- Implement a way to disable a drive (so you can use the second
+  drive of an autochanger, and the first one will not be used or
+  even defined).
+- Make sure Maximum Volumes is respected in Pools when adding
+  Volumes (e.g. when pulling a Scratch volume).
+- Keep same dcr when switching device ...
+- Implement code that makes the Dir aware that a drive is an
+  autochanger (so the user doesn't need to use the Autochanger = yes
+  directive).
+- Make catalog respect ACL.
+- Add recycle count to Media record.
+- Add initial write date to Media record.
+- Fix store_yesno to be store_bitmask.
+--- create_file.c.orig Fri Jul 8 12:13:05 2005
++++ create_file.c Fri Jul 8 12:13:07 2005
+@@ -195,6 +195,8 @@
+             attr->ofname, be.strerror());
+          return CF_ERROR;
+       }
++   } else if(S_ISSOCK(attr->statp.st_mode)) {
++      Dmsg1(200, "Skipping socket: %s\n", attr->ofname);
+    } else {
+       Dmsg1(200, "Restore node: %s\n", attr->ofname);
+       if (mknod(attr->ofname, attr->statp.st_mode, attr->statp.st_rdev) != 0 && errno != EEXIST) {
+- Add true/false to conf same as yes/no
+- Reserve blocks other restore jobs when first cannot connect to SD.
+- Fix Maximum Changer Wait, Maximum Open Wait, Maximum Rewind Wait to
+  accept time qualifiers.
+- Does ClientRunAfterJob fail the job on a bad return code?
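
Illustration (not part of the diff): the "Migration from David" proposal in the second hunk describes how a migration job would pick its target pool and its volumes. The rules can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not Bacula code: the Volume record and its fields, the helper names resolve_target_pool() and select_volumes(), the capacity parameter, and the reading of highmig/lowmig as fractions of that capacity are all hypothetical. Only the Volume, LowestUtil, OldestVol and PoolOccupancy selection types are covered, and empty volumes are assumed to be ineligible.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    # Hypothetical record; these fields are illustrative, not catalog columns.
    name: str
    bytes_used: int
    last_written: int  # timestamp of the most recent write


def resolve_target_pool(job_next_pool: Optional[str],
                        pool_next_pool: Optional[str]) -> str:
    """Pick the migration target: a Next Pool on the Job overrides the pool's."""
    target = job_next_pool or pool_next_pool
    if not target:
        # The proposal ends the job with this reason when no target exists.
        raise LookupError("no place to go")
    return target


def select_volumes(selection_type: str, volumes: List[Volume],
                   pattern: Optional[str] = None, capacity: int = 0,
                   highmig: float = 0.0, lowmig: float = 0.0) -> List[Volume]:
    """Return the volumes to empty for one of four selection types."""
    # Assumption: a volume with no data on it has nothing to migrate.
    with_data = [v for v in volumes if v.bytes_used > 0]

    if selection_type == "Volume":
        if pattern is None:
            raise ValueError("Volume selection needs a pattern")
        rx = re.compile(pattern)
        return [v for v in with_data if rx.search(v.name)]

    if selection_type == "LowestUtil":
        # The single volume holding the least data.
        return [min(with_data, key=lambda v: v.bytes_used)] if with_data else []

    if selection_type == "OldestVol":
        # The least recently written volume that holds data.
        return [min(with_data, key=lambda v: v.last_written)] if with_data else []

    if selection_type == "PoolOccupancy":
        used = sum(v.bytes_used for v in with_data)
        if used <= highmig * capacity:
            return []  # occupancy has not exceeded the high mark
        chosen = []
        # Start with the fullest volumes; stop once below the low mark.
        for v in sorted(with_data, key=lambda v: v.bytes_used, reverse=True):
            if used < lowmig * capacity:
                break
            chosen.append(v)
            used -= v.bytes_used
        return chosen

    raise ValueError("unsupported selection type: " + selection_type)
```

Client, Jobname, SQLQuery and PoolResidence are left out because they select jobs rather than volumes and would need catalog data that the proposal does not spell out.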