X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;f=bacula%2Fkernstodo;h=fe4d5703501cba30b30372920506725eb86c3dae;hb=b5176d7560168c760634c017c29eff45deccf61a;hp=5a34b80e5bdde7424aadca8ffff22abd2f250130;hpb=765ee0b4659ebba18766d555451a7b174520964b;p=bacula%2Fbacula

diff --git a/bacula/kernstodo b/bacula/kernstodo
index 5a34b80e5b..fe4d570350 100644
--- a/bacula/kernstodo
+++ b/bacula/kernstodo
@@ -1,8 +1,17 @@
   Kern's ToDo List
-  23 February 2008
+  17 July 2009
+
+Rescue:
+Add to USB key:
+  gftp sshfs kile kate lsssci m4 mtx nfs-common nfs-server
+  patch squashfs-tools strace sg3-utils screen scsiadd
+  system-tools-backend telnet dpkg traceroute urar usbutils
+  whois apt-file autofs busybox chkrootkit clamav dmidecode
+  manpages-dev manpages-posix manpages-posix-dev
 
 Document:
+- package sg3-utils, program sg_map
 - !!! Cannot restore two jobs a the same time that were
   written simultaneously unless they were totally spooled.
 - Document cleaning up the spool files:
@@ -39,6 +48,12 @@ Document:
   for disaster recovery.
 
 Professional Needs:
+- Nexenta (zfs + hardy + iscsi + nas + smf support)
+- NDMP
+  - For NAS OpenNAS
+  - ndmfs -- File Server extention in NDMPv4.
+  - ndmjob -- NDMP backup/restore NDMPv2, NDMPv3, and NDMPv4
+- Base jobs
 - Migration from other vendors
   - Date change
   - Path change
@@ -48,14 +63,11 @@ Professional Needs:
 - Detect state change of system (verify)
 - Synthetic Full, Diff, Inc (Virtual, Reconstructed)
 - SD to SD
-- Modules for Databases, Exchange, ...
 - Novell NSS backup http://www.novell.com/coolsolutions/tools/18952.html
 - Compliance norms that compare restored code hash code.
 - When glibc crash, get address with
   info symbol 0x809780c
 - How to sync remote offices.
-- Exchange backup:
-  http://www.microsoft.com/technet/itshowcase/content/exchbkup.mspx
 - David's priorities
   Copypools
   Extract capability (#25)
@@ -70,80 +82,72 @@ Professional Needs:
   and http://www.openeyet.nl/scc/ for managing customer changes
 
 Priority:
-- Look at in src/filed/backup.c
-> pm_strcpy(ff_pkt->fname, ff_pkt->fname_save);
-> pm_strcpy(ff_pkt->link, ff_pkt->link_save);
-- Add Catalog = to Pool resource so that pools will exist
-  in only one catalog -- currently Pools are "global".
-- New directive "Delete purged Volumes"
+================
+
+- Why no error message if restore has no permission on the where
+  directory?
+- Possibly allow manual "purge" to purge a Volume that has not
+  yet been written (even if FirstWritten time is zero) see ua_purge.c
+  is_volume_purged().
+- Add disk block detection bsr code (make it work).
+- Remove done bsrs.
+- User options for plugins.
+- Pool Storage override precedence over command line.
+- Autolabel only if Volume catalog information indicates tape not
+  written. This will avoid overwriting a tape that gets an I/O
+  error on reading the volume label.
+- I/O error, SD thinks it is not the right Volume, should check slot
+  then disable volume, but Asks for mount.
+- Can be posible modify package to create and use configuration files in
+  the Debian manner?
+
+  For example:
+
+  /etc/bacula/bacula-dir.conf
+  /etc/bacula/conf.d/pools.conf
+  /etc/bacula/conf.d/clients.conf
+  /etc/bacula/conf.d/storages.conf
+
+  and into bacula-dir.conf file include
+
+  @/etc/bacula/conf.d/pools.conf
+  @/etc/bacula/conf.d/clients.conf
+  @/etc/bacula/conf.d/storages.conf
+- Possibly add an Inconsistent state when a Volume is in error
+  for non I/O reasons.
+- Fix #ifdefing so that smartalloc can be disabled. Check manual
+  -- the default is enabled.
+- Change calling sequence to delete_job_id_range() in ua_cmds.c
+  the preceding strtok() is done inside the subroutine only once.
+- Dangling softlinks are not restored properly.
For example, take a
+  soft link such as src/testprogs/install-sh, which points to /usr/share/autoconf...
+  move the directory to another machine where the file /usr/share/autoconf does
+  not exist, back it up, then try a full restore. It fails.
+- Softlinks that point to non-existent file are not restored in restore all,
+  but are restored if the file is individually selected. BUG!
 - Prune by Job
 - Prune by Job Level (Full, Differential, Incremental)
 - Strict automatic pruning
-- Implement unmount of USB volumes.
 - Use "./config no-idea no-mdc2 no-rc5" on building OpenSSL for
   Win32 to avoid patent problems.
-- Implement Bacula plugins -- design API
+- Implement multiple jobid specification for the cancel command,
+  similar to what is permitted on the update slots command.
+  - Better yet allow wild-cards or regexes.
+- Add Group resource for grouping Jobs so they can all be
+  run at the same time or canceled at the same time.
 - modify pruning to keep a fixed number of versions of a file,
   if requested.
-=== Duplicate jobs ===
-  hese apply only to backup jobs.
-
-  1. Allow Duplicate Jobs = Yes | No | Higher (Yes)
-
-  2. Duplicate Job Interval = (0)
-
-  The defaults are in parenthesis and would produce the same behavior as today.
-
-  If Allow Duplicate Jobs is set to No, then any job starting while a job of the
-  same name is running will be canceled.
-
-  If Allow Duplicate Jobs is set to Higher, then any job starting with the same
-  or lower level will be canceled, but any job with a Higher level will start.
-  The Levels are from High to Low: Full, Differential, Incremental
-
-  Finally, if you have Duplicate Job Interval set to a non-zero value, any job
-  of the same name which starts after a previous job of the
-  same name would run, any one that starts within would be
-  subject to the above rules. Another way of looking at it is that the Allow
-  Duplicate Jobs directive will only apply after of when the
-  previous job finished (i.e.
it is the minimum interval between jobs).
-
-  So in summary:
-
-  Allow Duplicate Jobs = Yes | No | HigherLevel | CancelLowerLevel (Yes)
-
-  Where HigherLevel cancels any waiting job but not any running job.
-  Where CancelLowerLevel is same as HigherLevel but cancels any running job or
-  waiting job.
-
-  Duplicate Job Proximity = (0)
-
-  My suggestion was to define it as the minimum guard time between
-  executions of a specific job -- ie, if a job was scheduled within Job
-  Proximity number of seconds, it would be considered a duplicate and
-  consolidated.
-
-  Skip = Do not allow two or more jobs with the same name to run
-  simultaneously within the proximity interval. The second and subsequent
-  jobs are skipped without further processing (other than to note the job
-  and exit immediately), and are not considered errors.
-
-  Fail = The second and subsequent jobs that attempt to run during the
-  proximity interval are cancelled and treated as error-terminated jobs.
-
-  Promote = If a job is running, and a second/subsequent job of higher
-  level attempts to start, the running job is promoted to the higher level
-  of processing using the resources already allocated, and the subsequent
-  job is treated as in Skip above.
-===
 - the cd-command should allow complete paths
   i.e. cd /foo/bar/foo/bar
   -> if a customer mails me the path to a certain file,
   its faster to enter the specified directory
-- Fix bpipe.c so that it does not modify results pointer.
-  ***FIXME*** calling sequence should be changed.
 - Make tree walk routines like cd, ls, ... more user friendly
   by handling spaces better.
+- When doing a restore, if the user does an "update slots"
+  after the job started in order to add a restore volume, the
+  values prior to the update slots will be put into the catalog.
+  Must retrieve catalog record merge it then write it back at the
+  end of the restore job, if we want to do this right.
 === rate design
   jcr->last_rate
   jcr->last_runtime
@@ -166,7 +170,6 @@ Priority:
 02-Nov 12:58 rufus-sd JobId 10: Wrote label to prelabeled Volume "Vol001" on device "DDS-4" (/dev/nst0)
 02-Nov 12:58 rufus-sd JobId 10: Alert: TapeAlert[7]: Media Life: The tape has reached the end of its useful life.
 02-Nov 12:58 rufus-dir JobId 10: Bacula rufus-dir 2.3.6 (26Oct07): 02-Nov-2007 12:58:51
-- Eliminate: /var is a different filesystem. Will not descend from / into /var
 - Separate Files and Directories in catalog
 - Create FileVersions table
 - Look at rsysnc for incremental updates and dedupping
@@ -187,7 +190,6 @@ Priority:
 - Performance: despool attributes when despooling data (problem
   multiplexing Dir connection).
 - Make restore use the in-use volume reservation algorithm.
-- Add TLS to bat (should be done).
 - When Pool specifies Storage command override does not work.
 - Implement wait_for_sysop() message display in wait_for_device(), which
   now prints warnings too often.
@@ -273,21 +275,16 @@ Projects:
   each data chunk -- according to James Harper 9Jan07.
 - Features
   - Better scheduling
-    - Full at least once a month, ...
-    - Cancel Inc if Diff/Full running
   - More intelligent re-run
-    - New/deleted file backup
   - FD plugins
   - Incremental backup -- rsync, Stow
-
 
 For next release:
 - Try to fix bscan not working with multiple DVD volumes bug #912.
 - Look at mondo/mindi
 - Make Bacula by default not backup tmpfs, procfs, sysfs, ...
 - Fix hardlinked immutable files when linking a second file, the
   immutable flag must be removed prior to trying to link it.
-- Implement Python event for backing up/restoring a file.
 - Change dbcheck to tell users to use native tools for fixing
   broken databases, and to ensure they have the proper indexes.
 - add udev rules for Bacula devices.
@@ -340,9 +337,6 @@ Low priority:
   http://linuxwiki.de/Bacula (in German)
 - Possibly allow SD to spool even if a tape is not mounted.
-- Fix re-read of last block to check if job has actually written
-  a block, and check if block was written by a different job
-  (i.e. multiple simultaneous jobs writing).
 - Figure out how to configure query.sql. Suggestion to use m4:
   == changequote.m4 ===
   changequote(`[',`]')dnl
@@ -369,32 +363,6 @@ Low priority:
   The problem is that it requires m4, which is not present on all machines
   at ./configure time.
 
-- Given all the problems with FIFOs, I think the solution is to do something a
-  little different, though I will look at the code and see if there is not some
-  simple solution (i.e. some bug that was introduced). What might be a better
-  solution would be to use a FIFO as a sort of "key" to tell Bacula to read and
-  write data to a program rather than the FIFO. For example, suppose you
-  create a FIFO named:
-
-  /home/kern/my-fifo
-
-  Then, I could imagine if you backup and restore this file with a direct
-  reference as is currently done for fifos, instead, during backup Bacula will
-  execute:
-
-  /home/kern/my-fifo.backup
-
-  and read the data that my-fifo.backup writes to stdout. For restore, Bacula
-  will execute:
-
-  /home/kern/my-fifo.restore
-
-  and send the data backed up to stdout. These programs can either be an
-  executable or a shell script and they need only read/write to stdin/stdout.
-
-  I think this would give a lot of flexibility to the user without making any
-  significant changes to Bacula.
-
 
 ==== SQL
 # get null file
@@ -669,147 +637,6 @@ select Path.Path from Path,File where File.JobId=nnn and
 - Bug: if a job is manually scheduled to run later, it does not appear
   in any status report and cannot be cancelled.
 
-==== Keeping track of deleted/new files ====
-- To mark files as deleted, run essentially a Verify to disk, and
-  when a file is found missing (MarkId != JobId), then create
-  a new File record with FileIndex == -1. This could be done
-  by the FD at the same time as the backup.
-
-  My "trick" for keeping track of deletions is the following.
-  Assuming the user turns on this option, after all the files
-  have been backed up, but before the job has terminated, the
-  FD will make a pass through all the files and send their
-  names to the DIR (*exactly* the same as what a Verify job
-  currently does). This will probably be done at the same
-  time the files are being sent to the SD avoiding a second
-  pass. The DIR will then compare that to what is stored in
-  the catalog. Any files in the catalog but not in what the
-  FD sent will receive a catalog File entry that indicates
-  that at that point in time the file was deleted. This
-  either transmitted to the FD or simultaneously computed in
-  the FD, so that the FD can put a record on the tape that
-  indicates that the file has been deleted at this point.
-  A delete file entry could potentially be one with a FileIndex
-  of 0 or perhaps -1 (need to check if FileIndex is used for
-  some other thing as many of the Bacula fields are "overloaded"
-  in the SD).
-
-  During a restore, any file initially picked up by some
-  backup (Full, ...) then subsequently having a File entry
-  marked "delete" will be removed from the tree, so will not
-  be restored. If a file with the same name is later OK it
-  will be inserted in the tree -- this already happens. All
-  will be consistent except for possible changes during the
-  running of the FD.
-
-  Since I'm on the subject, some of you may be wondering what
-  the utility of the in memory tree is if you are going to
-  restore everything (at least it comes up from time to time
-  on the list). Well, it is still *very* useful because it
-  allows only the last item found for a particular filename
-  (full path) to be entered into the tree, and thus if a file
-  is backed up 10 times, only the last copy will be restored.
-  I recently (last Friday) restored a complete directory, and
-  the Full and all the Differential and Incremental backups
-  spanned 3 Volumes.
The first Volume was not even mounted
-  because all the files had been updated and hence backed up
-  since the Full backup was made. In this case, the tree
-  saved me a *lot* of time.
-
-  Make sure this information is stored on the tape too so
-  that it can be restored directly from the tape.
-
-  All the code (with the exception of formally generating and
-  saving the delete file entries) already exists in the Verify
-  Catalog command. It explicitly recognizes added/deleted files since
-  the last InitCatalog. It is more or less a "simple" matter of
-  taking that code and adapting it slightly to work for backups.
-
-  Comments from Martin Simmons (I think they are all covered):
-  Ok, that should cover the basics. There are few issues though:
-
-  - Restore will depend on the catalog. I think it is better to include the
-    extra data in the backup as well, so it can be seen by bscan and bextract.
-
-  - I'm not sure if it will preserve multiple hard links to the same inode. Or
-    maybe adding or removing links will cause the data to be dumped again?
-
-  - I'm not sure if it will handle renamed directories. Possibly it will work
-    by dumping the whole tree under a renamed directory?
-
-  - It remains to be seen how the backup performance of the DIR's will be
-    affected when comparing the catalog for a large filesystem.
-
-  1. Use the current Director in-memory tree code (very fast), but currently in
-  memory. It probably could be paged.
-
-  2. Use some DB such as Berkeley DB or SQLite. SQLite is already compiled and
-  built for Win32, and it is something we could compile into the program.
-
-  3. Implement our own custom DB code.
-
-  Note, by appropriate use of Directives in the Director, we can dynamically
-  decide if the work is done in the Director or in the FD, and we can even
-  allow the user to choose.
-
-=== most recent accurate file backup/restore ===
-  Here is a sketch (i.e.
more details must be filled in later) that I recently
-  made of an algorithm for doing Accurate Backup.
-
-  1. Dir informs FD that it is doing an Accurate backup and lookup done by
-  Director.
-
-  2. FD passes through the file system doing a normal backup based on normal
-  conditions, recording the names of all files and their attributes, and
-  indicating which files were backed up. This is very similar to what Verify
-  does.
-
-  3. The Director receives the two lists of files at the end of the FD backup.
-  One, files backed up, and one files not backed up. It then looks up all the
-  files not backed up (using Verify style code).
-
-  4. The Dir sends the FD a list of:
-  a. Additional files to backup (based on user specified criteria, name, size
-  inode date, hash, ...).
-  b. Files to delete.
-
-  5. Dir deletes list of file not backed up.
-
-  6. FD backs up additional files generates a list of those backed up and sends
-  it to the Director, which adds it to the list of files backed up. The list
-  is now complete and current.
-
-  7. The FD generates delete records for all the files that were deleted and
-  sends to the SD.
-
-  8. The Dir deletes the previous CurrentBackup list, and then does a
-  transaction insert of the new list that it has.
-
-  9. The rest works as before ...
-
-  That is it.
-
-  Two new tables needed.
-  1. CurrentBackupId table that contains Client, JobName, FileSet, and a unique
-  BackupId. This is created during a Full save, and the BackupId can be set to
-  the JobId of the Full save. It will remain the same until another Full
-  backup is done. That is when new records are added during a Differential or
-  Incremental, they must use the same BackupId.
-
-  2. CurrentBackup table that contains essentially a File record (less a number
-  of fields, but with a few extra fields) -- e.g. a flag that the File was
-  backed up by a Full save (this permits doing a Differential).
The unique
-  BackupId allows us to look up the CurrentBackup for a particular Client,
-  Jobname, FileSet using that unique BackupId as the key, so this table must be
-  indexed by the BackupId.
-
-  Note any time a file is saved by the FD other than during a Full save, the
-  Full save flag is cleared. When doing a Differential backup, if a file has
-  the Full save flag set, it is skipped, otherwise it is backed up. For an
-  Incremental backup, we check to see if the file has changed since the last
-  time we backed it up.
-
-  Deleted files should have FileIndex == 0
 
 ====
 From David:
@@ -971,9 +798,6 @@ Why:
 ==========
 - Make output from status use html table tags for nicely
   presenting in a browser.
-- Can one write tapes faster with 8192 byte block sizes?
-- Document security problems with the same password for everyone in
-  rpm and Win32 releases.
 - Browse generations of files.
 - I've seen an error when my catalog's File table fills up. I
   then have to recreate the File table with a larger maximum row
@@ -1053,8 +877,6 @@ Documentation to do: (any release a little bit at a time)
 - Use gather write() for network I/O.
 - Autorestart on crash.
 - Add bandwidth limiting.
-- Add acks every once and a while from the SD to keep
-  the line from timing out.
 - When an error in input occurs and conio beeps, you can
   back up through the prompt.
 - Detect fixed tape block mode during positioning by looking at
@@ -1071,7 +893,6 @@ Documentation to do: (any release a little bit at a time)
 - Allow the user to select JobType for manual pruning/purging.
 - bscan does not put first of two volumes back with all info
   in bscan-test.
-- Implement the FreeBSD nodump flag in chflags.
 - Figure out how to make named console messages go only to that
   console and to the non-restricted console (new console class?).
 - Make restricted console prompt for password if *ask* is set or
@@ -1141,13 +962,10 @@ Documentation to do: (any release a little bit at a time)
 - Setup lrrd graphs: (http://www.linpro.no/projects/lrrd/) Mike Acar.
 - Revisit the question of multiple Volumes (disk) on a single device.
 - Add a block copy option to bcopy.
-- Finish work on Gnome restore GUI.
 - Fix "llist jobid=xx" where no fileset or client exists.
 - For each job type (Admin, Restore, ...) require only the really
   necessary fields.- Pass Director resource name as an option to the Console.
 - Add a "batch" mode to the Console (no unsolicited queries, ...).
-- Add a .list all files in the restore tree (probably also a list all files)
-  Do both a long and short form.
 - Allow browsing the catalog to see all versions of a file
   (with stat data on each file).
 - Restore attributes of directory if replace=never set but directory
@@ -1169,32 +987,22 @@ Documentation to do: (any release a little bit at a time)
 - Check new HAVE_WIN32 open bits.
 - Check if the tape has moved before writing.
 - Handling removable disks -- see below:
-- Keep track of tape use time, and report when cleaning is necessary.
 - Add FromClient and ToClient keywords on restore command
   (or BackupClient RestoreClient).
 - Implement a JobSet, which groups any number of jobs. If the
   JobSet is started, all the jobs are started together.
   Allow Pool, Level, and Schedule overrides.
-- Enhance cancel to timeout BSOCK packets after a specific delay.
-- Do scheduling by UTC using gmtime_r() in run_conf, scheduler, and
-  ua_status.!!! Thanks to Alan Brown for this tip.
 - Look at updating Volume Jobs so that Max Volume Jobs = 1 will work
   correctly for multiple simultaneous jobs.
-- Correct code so that FileSet MD5 is calculated for < and | filename
-  generation.
 - Implement the Media record flag that indicates that the Volume
   does disk addressing.
 - Implement VolAddr, which is used when Volume is addressed
   like a disk, and form it from VolFile and VolBlock.
-- Make multiple restore jobs for multiple media types specifying
-  the proper storage type.
 - Fix fast block rejection (stored/read_record.c:118). It passes a
   null pointer (rec) to try_repositioning().
-- Look at extracting Win data from BackupRead.
 - Implement RestoreJobRetention? Maybe better "JobRetention" in a
   Job, which would take precidence over the Catalog "JobRetention".
 - Implement Label Format in Add and Label console commands.
-- Possibly up network buffers to 65K. Put on variable.
 - Put email tape request delays on one or more variables. User wants
   to cancel the job after a certain time interval. Maximum Mount Wait?
 - Job, Client, Device, Pool, or Volume?
@@ -1258,8 +1066,6 @@ Documentation to do: (any release a little bit at a time)
   support for Oracle database ??
 ===
 - Look at adding SQL server and Exchange support for Windows.
-- Make dev->file and dev->block_num signed integers so that -1 can
-  be an invalid value which happens with BSR.
 - Create VolAddr for disk files in place of VolFile and VolBlock. This
   is needed to properly specify ranges.
 - Add progress of files/bytes to SD and FD.
@@ -1289,7 +1095,6 @@ Documentation to do: (any release a little bit at a time)
 - Implement some way for the File daemon to contact the Director
   to start a job or pass its DHCP obtained IP number.
 - Implement a query tape prompt/replace feature for a console
-- Copy console @ code to gnome2-console
 - Make sure that Bacula rechecks the tape after the 20 min wait.
 - Set IO_NOWAIT on Bacula TCP/IP packets.
 - Try doing a raw partition backup and restore by mounting a
@@ -1306,12 +1111,9 @@ Documentation to do: (any release a little bit at a time)
 - What to do about "list files job=xxx".
 - Look at how fuser works and /proc/PID/fd that is how Nic found
   the file descriptor leak in Bacula.
-- Implement WrapCounters in Counters.
-- Add heartbeat from FD to SD if hb interval expires.
 - Can we dynamically change FileSets?
 - If pool specified to label command and Label Format is specified,
   automatically generate the Volume name.
-- Why can't SQL do the filename sort for restore?
 - Add ExhautiveRestoreSearch
 - Look at the possibility of loading only the necessary
   data into the restore tree (i.e. do it one directory at a
@@ -1326,10 +1128,8 @@ Documentation to do: (any release a little bit at a time)
   run the job but don't save the files.
 - Make things like list where a file is saved case independent for
   Windows.
-- Use autochanger to handle multiple devices.
 - Implement a Recycle command
 - Start working on Base jobs.
-- Implement UnsavedFiles DB record.
 - From Phil Stracchino:
   It would probably be a per-client option, and would be called
   something like, say, "Automatically purge obsoleted jobs". What it
@@ -1382,7 +1182,6 @@ Documentation to do: (any release a little bit at a time)
 - bscan without -v is too quiet -- perhaps show jobs.
 - Add code to reject whole blocks if not wanted on restore.
 - Check if we can increase Bacula FD priorty in Win2000
-- Make sure the MaxVolFiles is fully implemented in SD
 - Check if both CatalogFiles and UseCatalog are set to SD.
 - Possibly add email to Watchdog if drive is unmounted too
   long and a job is waiting on the drive.
@@ -1410,8 +1209,6 @@ Documentation to do: (any release a little bit at a time)
 - Implement script driven addition of File daemon to config files.
 - Think about how to make Bacula work better with File (non-tape) archives.
 - Write Unix emulator for Windows.
-- Put memory utilization in Status output of each daemon
-  if full status requested or if some level of debug on.
 - Make database type selectable by .conf files i.e. at runtime
 - Set flag for uname -a. Add to Volume label.
 - Restore files modified after date
@@ -1462,19 +1259,13 @@ Documentation to do: (any release a little bit at a time)
 - MaxWarnings
 - MaxErrors (job?)
 =====
-- FD sends unsaved file list to Director at end of job (see
-  RFC below).
-- File daemon should build list of files skipped, and then
-  at end of save retry and report any errors.
 - Write a Storage daemon that uses pipes and
   standard Unix programs to write to the tape.
   See afbackup.
 - Need something that monitors the JCR queue and
   times out jobs by asking the deamons where they are.
 - Enhance Jmsg code to permit buffering and saving to disk.
-- device driver = "xxxx" for drives.
 - Verify from Volume
-- Ensure that /dev/null works
 - Need report class for messages. Perhaps
   report resource where report=group of messages
 - enhance scan_attrib and rename scan_jobtype, and
@@ -1569,31 +1360,12 @@ mounting.
 Nobody is dying for them, but when you see what it does, you will die
 without it.
 
-3. Restoring deleted files: Since I think my comments in (2) above
-have low probability of implementation, I'll also suggest that you
-could approach the issue of deleted files by a mechanism of having the
-fd report to the dir, a list of all files on the client for every
-backup job. The dir could note in the database entry for each file
-the date that the file was seen. Then if a restore as of date X takes
-place, only files that exist from before X until after X would be
-restored. Probably the major cost here is the extra date container in
-each row of the files table.
-
-Thanks for "listening". I hope some of this helps. If you want to
-contact me, please send me an email - I read some but not all of the
-mailing list traffic and might miss a reply there.
-
-Please accept my compliments for bacula. It is doing a great job for
-me!! I sympathize with you in the need to wrestle with excelence in
-execution vs. excelence in feature inclusion.
-
 Regards,
 Jerry Schieffer
 
 ==============================
 
 Longer term to do:
-- Design at hierarchial storage for Bacula. Migration and Clone.
 - Implement FSM (File System Modules).
 - Audit M_ error codes to ensure they are correct and consistent.
 - Add variable break characters to lex analyzer.
@@ -1604,17 +1376,8 @@ Longer term to do:
   continue a save if the Director goes down (this is NOT
   currently the case). Must detect socket error, buffer messages
   for later.
-- Enhance time/duration input to allow multiple qualifiers e.g. 3d2h
 - Add ability to backup to two Storage devices (two SD sessions) at
   the same time -- e.g. onsite, offsite.
-- Compress or consolidate Volumes of old possibly deleted files. Perhaps
-  someway to do so with every volume that has less than x% valid
-  files.
-
-
-Migration: Move a backup from one Volume to another
-Clone: Copy a backup -- two Volumes
-
 ======================================================
 
   Base Jobs design
@@ -1668,111 +1431,109 @@ Need:
   VolSessionId and VolSessionTime.
 
 =========================================================
+=========================================================
+  Preliminary design of Deletion of disk volumes
+
+tem 5: Deletion of disk Volumes when pruned
+  Date: Nov 25, 2005
+  Origin: Ross Boylan (edited
+    by Kern)
+  Status:
+
+  What: Provide a way for Bacula to automatically remove Volumes
+    from the filesystem, or optionally to truncate them.
+    Obviously, the Volume must be pruned prior removal.
+
+  Why: This would allow users more control over their Volumes and
+    prevent disk based volumes from consuming too much space.
+
+  Notes: The following two directives might do the trick:
+
+  Volume Data Retention =