X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;ds=inline;f=bacula%2Fkernstodo;h=963aadbf6cacceab4e3f9e6c7f5c4e7f1c327d69;hb=21ed6544eabc4f55882cb3c0ef30574e3dd41970;hp=45e793b61676cdb280f62ff6b37d4062b5c87841;hpb=64db9587378adfff02e0a31f180a4e5124587d14;p=bacula%2Fbacula

diff --git a/bacula/kernstodo b/bacula/kernstodo
index 45e793b616..963aadbf6c 100644
--- a/bacula/kernstodo
+++ b/bacula/kernstodo
@@ -1,5 +1,5 @@
-                 Kern's ToDo List
-                 31 August 2004
+                 Kern's ToDo List
+                 17 September 2004
 
 Major development:
 Project                     Developer
@@ -11,73 +11,28 @@ Version 1.35                Kern (see below)
 ========================================================
 1.35 Items to do for release:
-- Bacula rescue CDROM implement isolinux
-- Add new DCR calling sequences everywhere in SD. This will permit
-  simultaneous use of multiple devices by a single job.
-- Look at patches/bacula_db.b2z postgresql that loops during restore.
-  See Gregory Wright.
-- Perhaps add read/write programs and/or plugins to FileSets.
-- Make sure Qmsgs are dequeued by FD and SD.
-- Check if ACLs allocated at dird_conf.c:1214 are being properly
-  released.
+- Backspace to beginning of line (conio) does not erase first char.
-- Add bscan to four-concurrent-jobs regression.
-- Add IPv6 to regression
-- Alternative to static linking "ldd prog" save all binaries listed,
-  restore them and point LD_LIBRARY_PATH to them.
 - Document a get out of jail procedure if everything breaks if
   you lost/broke the Catalog -- do the same for "I know my file
   is there how do I get it back?".
-- Test/doc Tape Alerts
-- Doc update AllFromVol
-- Doc dbcheck eliminate orphaned clients.
-- Doc -p option in stored
-- Document that console commands can be abbreviated.
-- New IP address specification is used as follows:
-    [sdaddresses|diraddresses|fdaddresses] = { [[ip|ipv4|ipv6] = {
-    [[addr|port] = [^ ]+[\n;]+] }] }
-  so it could look for example like this:
-   SDaddresses = { ip = {
-        addr = 1.2.3.4; port = 1205; }
-    ipv4 = {
-        addr = 1.2.3.4; port = http; }
-    ipv6 = {
-        addr = 1.2.3.4;
-        port = 1205;
-    }
-    ip = {
-        addr = 1.2.3.4
-        port = 1205
-    }
-    ip = {
-        addr = 1.2.3.4
-    }
-    ip = {
-        addr = 201:220:222::2
-    }
-    ip = {
-        addr = bluedot.thun.net
-    }
-   }
-  as a consequence, you can now specify multiple IP addresses and
-  ports to be used. In the case of a server, it will listen on
-  all those that you specify. In the case of connecting to the server,
-  Bacula will attempt connecting to one at a time until it succeeds.
-  And, in a few other special cases, Bacula will use only the first
-  address specified.
-
-  The default port numbers are still the same and the services and hosts
-  are also resolved by name. So now you could use the real names for the
-  port numbers.
-
-  An ip section will allow resolution to either an ipv4 or an ipv6 address.
-  An ipv4 section forces the resolution to be only ipv4, and an ipv6 section
-  forces the resolution to be only ipv6.
+- Add "Rerun failed levels = yes/no" to Job resource.
+
+Maybe for 1.35:
+- Look at patches/bacula_db.b2z postgresql that loops during restore.
+  See Gregory Wright.
+- Add delete JobId to regression.
+- Add bscan to four-concurrent-jobs regression.
+- Add IPv6 to regression
+- Perhaps add read/write programs and/or plugins to FileSets.
+- How to handle backing up portables ...
 
 Documentation to do: (any release a little bit at a time)
+- Alternative to static linking "ldd prog" save all binaries listed,
+  restore them and point LD_LIBRARY_PATH to them.
+- Document add "/dev/null 2>&1" to the bacula-fd command line
 - Document query file format.
 - Add more documentation for bsr files.
 - Document problems with Verify and pruning.
@@ -107,6 +62,10 @@ Testing to do: (painful)
 For 1.37 Testing/Documentation:
+- Add "Allow multiple connections" in Catalog resource to open a new
+  database connection for each job.
+- Allow Simultaneous Priorities = yes => run up to Max concurrent jobs even
+  with multiple priorities.
 - Fix find_device in stored/dircmd.c:462 (see code)
 - Add db check test to regression. Test each function like delete,
   purge, ...
@@ -168,8 +127,8 @@ Wish list:
 - Minimal autochanger handling in Bacula and in btape.
 - Look into how tar does not save sockets and the possibility of not
   saving them in Bacula (Martin Simmons reported this).
-  The next two lines will show them.
-  localmounts=`awk '/ext/ { print $2 }' /proc/mounts` # or whatever
+- Add All Local Partitions = yes to new style saves.
+- localmounts=`awk '/ext/ { print $2 }' /proc/mounts` # or whatever
   find $localmounts -xdev -type s -ls
 - Fix restore jobs so that multiple jobs can run if they are not using
   the same tape(s).
@@ -228,14 +187,12 @@ Wish list:
 - look at mxt-changer.html
 - Make ? do a help command (no return needed).
 - Implement restore directory.
-- Add All Local Partitions = yes to new style saves.
 - Document streams and how to implement them.
 - Possibly implement "Ensure Full Backup = yes" looks for a failed full
   backup and upgrades the current backup if one exists.
 - Check that barcode reading and update slots scan works.
 - Try not to re-backup a file if a new hard link is added.
 - Add feature to backup hard links only, but not the data.
-- Add "All Local = yes" option to save to include all local partitions.
 - Fix stream handling to be simpler.
 - Add Priority and Bootstrap to Run a Job.
 - Eliminate Restore "Run Restore Job" prompt by allowing new "run command
@@ -247,13 +204,8 @@ Wish list:
 - Add display of total selected files to Restore window.
 - Add tree pane to left of window.
 - Add progress meter.
-- Polling does not work for restore. It tries a number of times,
-  gives up, and crashes the SD.
-- Lock jcr_chain when doing attach/detach in acquire.c
-- Add assert in free_jcr if attach/detach chain active.
 - Max wait time or max run time causes seg fault -- see runtime-bug.txt
 - Document writing to a CD/DVD with Bacula.
-- Add check for tape alerts.
 - Add a "base" package to the window installer for pthreadsVCE.dll
   which is needed by all packages.
 - Add message to user to check for fixed block size when the forward
@@ -263,14 +215,10 @@ Wish list:
 - Possibly implement: Action = Unmount Device="TapeDrive1" in Admin jobs.
 - Setup lrrd graphs: (http://www.linpro.no/projects/lrrd/) Mike Acar.
 - Revisit the question of multiple Volumes (disk) on a single device.
-- Finish SIGHUP work.
-- Check that all change in wait status in the SD are
-  signaled to the Director.
 - Add a block copy option to bcopy.
 - Investigate adding Mac Resource Forks.
 - Finish work on Gnome restore GUI.
 - Fix "llist jobid=xx" where no fileset or client exists.
-- Check pruning of restore jobs.
 - From Chris Hull:
   it seems to be complaining about 12:00pm which should be a valid 12
   hour time. I changed the time to 11:59am and everything works fine.
@@ -283,26 +231,14 @@ Wish list:
   then list last 20 backups.
 - Add all pools in Dir conf to DB also update them to catch changed
   LabelFormats and such.
-- Update volumes FromPool=xxx does all volumes.
 - Pass Director resource name as an option to the Console.
 - Add a "batch" mode to the Console (no unsolicited queries, ...).
-- Add code to check for tape alerts -- tapeinfo.
-- Make sure list of Volumes needed is in correct order for restore.
-  See havana.
-- Remove paths (and files that reference them) that have no trailing slash
-  in dbcheck -- or add a trailing slash.
-- Remove Filenames (and files that reference them) that have a trailing
-  slash in dbcheck -- or remove the trailing slash.
-- Remove orphaned paths/filenames by copying them to a new table with a
-  reference count, then mark all referenced files/paths and remove unreferenced
-  ones.
 - Add a .list all files in the restore tree (probably also a list all files)
   Do both a long and short form.
 - Allow browsing the catalog to see all versions of a file (with
   stat data on each file).
 - Restore attributes of directory if replace=never set but directory
   did not exist.
-- Allow "delete job jobid=xxx,yyy,aaa-bbb" i.e. list + ranges.
 - Use SHA1 on authentication if possible.
 - See comtest-xxx.zip for Windows code to talk to USB.
 - Make btape accept Device Names in addition to Archive names.
@@ -323,40 +259,23 @@ Wish list:
 - Add disk seeking on restore.
 - Allow for optional cancelling of SD and FD in case DIR gets a fatal error.
   Requested by Jesse Guardiani
-- Bizarre message: Error: Could not open WriteBootstrap file:
-- Build console in client only build.
 - Add "limit=n" for "list jobs"
 - Check new HAVE_WIN32 open bits.
 - Check if the tape has moved before writing.
 - Handling removable disks -- see below:
-- Multiple drive autochanger support -- see below.
 - Keep track of tape use time, and report when cleaning is necessary.
-- Fix FreeBSD mt_count problem.
 - Add FromClient and ToClient keywords on restore command (or
   BackupClient RestoreClient).
-- Automatic "update slots" on user configuration directive when a
-  slot error occurs.
 - Implement a JobSet, which groups any number of jobs. If the JobSet is
   started, all the jobs are started together. Allow Pool, Level, and
   Schedule overrides.
 - Enhance cancel to timeout BSOCK packets after a specific delay.
-- When I restore to Windows the Created, Accessed and Modifiedtimes are
-  those of the time of the restore, not those of the originalfile.
-  The dates you will find in your restore log seem to be the original
-  creation dates
-- Volume "add"ed to Pool gets recycled in first use.
-  VolBytes=0
-- If a tape is recycled while it is mounted, Stanislav Tvrudy must do an
-  additional mount to deblock the job.
-- From Johan Decock:
-  bscan: sql_update.c:65 UPDATE File SET MD5='Ij+5kwN6TFIxK+8l8+/I+A' WHERE FileId=0
-  bscan: bscan.c:1074 Could not add MD5/SHA1 to File record. ERR=sql_update.c:65 Update problem: affected_rows=0
 - Do scheduling by UTC using gmtime_r() in run_conf, scheduler, and
   ua_status.!!! Thanks to Alan Brown for this tip.
 - Look at updating Volume Jobs so that Max Volume Jobs = 1 will work
   correctly for multiple simultaneous jobs.
 - Correct code so that FileSet MD5 is calculated for < and | filename
   generation.
-- Mark Volume in error on error from WEOF.
 - Implement the Media record flag that indicates that the Volume does
   disk addressing.
 - Implement VolAddr, which is used when Volume is addressed like a disk,
@@ -367,8 +286,6 @@ Wish list:
 - Fix fast block rejection (stored/read_record.c:118). It passes a null
   pointer (rec) to try_repositioning().
 - Look at extracting Win data from BackupRead.
-- Having dashes in filenames apparently creates problems for restore
-  by filename??? hard to believe.
 - Implement RestoreJobRetention? Maybe better "JobRetention" in a Job,
   which would take precedence over the Catalog "JobRetention".
 - Implement Label Format in Add and Label console commands.
@@ -380,7 +297,6 @@ Wish list:
   resources, like Level? If so, I think I'd make it an optional directive
   in Job, Client, and Pool, with precedence such that Job overrides Client
   which in turn overrides Pool.
-- Print a message when a job starts if the conf file is not current.
 - Spooling ideas taken from Volker Sauer's and others' emails:
   > IMHO job spooling should be turned on
@@ -443,6 +359,11 @@ Wish list:
   from one backup Volume to another.
 - New Storage specifications:
+  - Want to write to multiple storage devices simultaneously
+  - Want to write to multiple storage devices sequentially (in one job)
+  - Want to read/write simultaneously
+  - Key is MediaType -- it must match
+
   Passed to SD as a sort of BSR record called Storage Specification
     Record or SSR.
     SSR
@@ -450,7 +371,6 @@ Wish list:
       MediaType -> Next MediaType
       Pool -> Next Pool
       Device -> Next Device
-  Write Copy Resource that makes a copy of a resource.
   Job Resource
      Allow multiple Storage specifications
      New flags
@@ -462,8 +382,8 @@ Wish list:
   Storage
      Allow Multiple Pool specifications (note, Pool currently
        in Job resource).
-     Allow Multiple MediaType specifications
-     Allow Multiple Device specifications
+     Allow Multiple MediaType specifications in Dir conf
+     Allow Multiple Device specifications in Dir conf
   Perhaps keep this in a single SSR
   Tie a Volume to a specific device by using a MediaType that
    is contained in only one device.
@@ -479,20 +399,10 @@ Ideas from Jerry Scharf:
         even more important, it's not flaky
         it has an open access catalog, opening many possibilities
         it's pushing toward heterogeneous systems capability
-  simple things:
-       I don't remember an include file directive for config files
-               (not filesets, actual config directives)
-       can you check the configs without starting the daemon?
-       some warnings about possible common mistakes
   big things:
-       doing the testing and blessing of concurrent backup writes
-               this is absolutely necessary in the enterprise
-       easy user recovery GUI with full access checking
        Macintosh file client
                macs are an interesting niche, but I fear a server is a rathole
        working bare iron recovery for windows
-       much better handling on running config changes
-               thinking through the logic of what happens to jobs in progress
        the option for inc/diff backups not reset on fileset revision
                a) use both change and inode update time against base time
                b) do the full catalog check (expensive but accurate)
@@ -545,9 +455,7 @@ Ideas from Jerry Scharf:
   to the user, who would then use "mount" as described above once he had
   actually inserted the disk.
 - Implement dump/print label to UA
-- Implement disk spooling. Two parts: 1. Spool to disk then
-  immediately to tape to speed up tape operations. 2. Spool to
-  disk only when the tape is full, then when a tape is hung move
+- Spool to disk only when the tape is full, then when a tape is hung move
   it to tape.
 - Scratch Pool where the volumes can be re-assigned to any Pool.
 - bextract is sending everything to the log file ****FIXME****
@@ -864,12 +772,8 @@ Ideas from Jerry Scharf:
   if full status requested or if some level of debug on.
 - Make database type selectable by .conf files i.e. at runtime
 - Set flag for uname -a. Add to Volume label.
-- Implement throttled work queue.
 - Restore files modified after date
 - SET LD_RUN_PATH=$HOME/mysql/lib/mysql
-- Implement Restore FileSet=
-- Create a protocol.h and protocol.c where all protocol messages
-  are concentrated.
 - Remove duplicate fields from jcr (e.g. jcr.level and jcr.jr.Level, ...).
 - Timeout a job or terminate if link goes down, or reopen link and query.
 - Concept of precious tapes (cannot be reused).
@@ -947,6 +851,105 @@ Ideas from Jerry Scharf:
 - Store info on each file system type (probably in the job header on tape).
   This could be the output of df; or perhaps some sort of /etc/mtab record.
+========= ideas ===============
+From: "Jerry K. Schieffer"
+To: 
+Subject: RE: [Bacula-users] future large programming jobs
+Date: Thu, 26 Feb 2004 11:34:54 -0600
+
+I noticed the subject thread and thought I would offer the following
+merely as sources of ideas, i.e. something to think about, not even as
+strong as a request. In my former life (before retiring) I often
+dealt with backups and storage management issues/products as a
+developer and as a consultant. I am currently migrating my personal
+network from amanda to bacula specifically because of the ability to
+cross media boundaries during storing backups.
+Are you familiar with the commercial product called ADSM (I think IBM
+now sells it under the Tivoli label)? It has a couple of interesting
+ideas that may apply to the following topics.
+
+1. Migration: Consider that when you need to restore a system, there
+may be pressure to hurry. If all the information for a single client
+can eventually end up on the same media (and in chronological order),
+the restore is facilitated by not having to search past information
+from other clients. ADSM has the concept of "client affinity" that
+may be associated with its storage pools. It seems to me that this
+concept (as an optional feature) might fit in your architecture for
+migration.
+
+ADSM also has the concept of defining one or more storage pools as
+"copy pools" (almost mirrors, but only in the sense of contents).
+These pools provide the ability to have duplicate data stored both
+onsite and offsite. The copy process can be scheduled to be handled
+by their storage manager during periods when there is no backup
+activity. Again, the migration process might be a place to consider
+implementing something like this.
+
+>
+> It strikes me that it would be very nice to be able to do things like
+> have the Job(s) backing up the machines run, and once they have all
+> completed, start a migration job to copy the data from disk Volumes to
+> a tape library and then to offsite storage. Maybe this can already be
+> done with some careful scheduling and Job prioritization; the events
+> mechanism described below would probably make it very easy.
+
+This is the goal. In the first step (before events), you simply schedule
+the Migration to tape later.
+
+2. Base jobs: In ADSM, each copy of each stored file is tracked in
+the database. Once a file (unique by path and metadata such as dates,
+size, ownership, etc.) is in a copy pool, no more copies are made. In
+other words, when you start ADSM, it begins like your concept of a
+base job. After that it is in the "incremental" mode. You can
+configure the number of "generations" of files to be retained, plus a
+retention date after which even old generations are purged. The
+database tracks the contents of media and projects the percentage of
+each volume that is valid. When the valid content of a volume drops
+below a configured percentage, the valid data are migrated to another
+volume and the old volume is marked as empty. Note, this requires
+ADSM to have an idea of the contents of a client, i.e. marking the
+database when an existing file was deleted, but this would solve your
+issue of restoring a client without restoring deleted files.
+
+This is pretty far from what bacula now does, but if you are going to
+rip things up for Base jobs,.....
+Also, the benefits of this are huge for very large shops, especially
+with media robots, but are a pain for shops with manual media
+mounting.
+
+>
+> Base jobs sound pretty useful, but I'm not dying for them.
+
+Nobody is dying for them, but when you see what it does, you will die
+without it.
+
+3.
Restoring deleted files: Since I think my comments in (2) above
+have low probability of implementation, I'll also suggest that you
+could approach the issue of deleted files by a mechanism of having the
+fd report to the dir, a list of all files on the client for every
+backup job. The dir could note in the database entry for each file
+the date that the file was seen. Then if a restore as of date X takes
+place, only files that exist from before X until after X would be
+restored. Probably the major cost here is the extra date container in
+each row of the files table.
+
+Thanks for "listening". I hope some of this helps. If you want to
+contact me, please send me an email - I read some but not all of the
+mailing list traffic and might miss a reply there.
+
+Please accept my compliments for bacula. It is doing a great job for
+me!! I sympathize with you in the need to wrestle with excellence in
+execution vs. excellence in feature inclusion.
+
+Regards,
+Jerry Schieffer
+
+==============================
+
 Longer term to do:
 - Design hierarchical storage for Bacula. Migration and Clone.
 - Implement FSM (File System Modules).
@@ -1290,4 +1293,62 @@ Block Position: 0
 - Fix error handling in spooling both data and attribute.
 - Implement Ignore FileSet Change.
 - Doc new duration time input editing.
-
+- Bacula rescue CDROM implement isolinux
+- Make sure Qmsgs are dequeued by FD and SD.
+- Check if ACLs allocated at dird_conf.c:1214 are being properly
+  released.
+- Test/doc Tape Alerts
+- Doc dbcheck eliminate orphaned clients.
+- Doc Phil's new delete job jobid scanning code.
+- Document that console commands can be abbreviated.
+- Doc update AllFromVol
+- Doc -p option in stored
+- New IP address specification is used as follows:
+    [sdaddresses|diraddresses|fdaddresses] = { [[ip|ipv4|ipv6] = {
+    [[addr|port] = [^ ]+[\n;]+] }] }
+
+  so it could look for example like this:
+   SDaddresses = { ip = {
+        addr = 1.2.3.4; port = 1205; }
+    ipv4 = {
+        addr = 1.2.3.4; port = http; }
+    ipv6 = {
+        addr = 1.2.3.4;
+        port = 1205;
+    }
+    ip = {
+        addr = 1.2.3.4
+        port = 1205
+    }
+    ip = {
+        addr = 1.2.3.4
+    }
+    ip = {
+        addr = 201:220:222::2
+    }
+    ip = {
+        addr = bluedot.thun.net
+    }
+   }
+  as a consequence, you can now specify multiple IP addresses and
+  ports to be used. In the case of a server, it will listen on
+  all those that you specify. In the case of connecting to the server,
+  Bacula will attempt connecting to one at a time until it succeeds.
+  And, in a few other special cases, Bacula will use only the first
+  address specified.
+
+  The default port numbers are still the same and the services and hosts
+  are also resolved by name. So now you could use the real names for the
+  port numbers.
+
+  An ip section will allow resolution to either an ipv4 or an ipv6 address.
+  An ipv4 section forces the resolution to be only ipv4, and an ipv6 section
+  forces the resolution to be only ipv6.
+- Fix silly restriction requiring Include { Options { xxx } } to be
+  on separate lines.
+- Restore c: with a prefix into /prefix/c/ to prevent c: and d:
+  files with the same name from overwriting each other.
+- Add "Multiple connections = yes/no" to catalog resource.
+- Add new DCR calling sequences everywhere in SD. This will permit
+  simultaneous use of multiple devices by a single job.
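Editor's note, after the diff: the wish-list item about tar not saving sockets carries a two-line awk/find fragment. As a hedged illustration only (not Bacula code), here is that technique expanded into a self-contained sketch; the `MOUNTS` override and the function names are invented for this example, and a Linux-style /proc/mounts is assumed.

```shell
#!/bin/sh
# Sketch: list Unix-domain sockets on local ext* filesystems -- the
# files tar silently skips and Bacula could skip too.
# MOUNTS defaults to the kernel mount table; override it for testing.
MOUNTS=${MOUNTS:-/proc/mounts}

# Field 2 of each mounts entry is the mount point; field 3 is the
# filesystem type, matched here against ext* (ext2/ext3).
list_local_ext_mounts() {
    awk '$3 ~ /^ext/ { print $2 }' "$MOUNTS"
}

# -xdev keeps find from crossing filesystem boundaries;
# -type s selects sockets, as in the original fragment.
list_sockets() {
    for mnt in $(list_local_ext_mounts); do
        find "$mnt" -xdev -type s -ls
    done
}
```

The original fragment matched `/ext/` against the whole line; matching the type field (`$3 ~ /^ext/`) avoids false hits on mount paths that merely contain the string "ext".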