X-Git-Url: https://git.sur5r.net/?a=blobdiff_plain;f=bacula%2Fkernstodo;h=ef54bfd5db8881c4963b0ba0570bfe73b890edfe;hb=68b7cad2e3ba120186129fc4c6445d6b95ecae80;hp=3a1c4d1e6dc6868dcd42e6b12aca5835076ed954;hpb=263a1f4da33d7cd7f81f5f7416398fe6853c25c4;p=bacula%2Fbacula diff --git a/bacula/kernstodo b/bacula/kernstodo index 3a1c4d1e6d..ef54bfd5db 100644 --- a/bacula/kernstodo +++ b/bacula/kernstodo @@ -1,5 +1,5 @@ Kern's ToDo List - 16 July 2006 + 12 November 2006 Major development: Project Developer @@ -21,32 +21,101 @@ Document: \bacula\working). - Document techniques for restoring large numbers of files. - Document setting my.cnf to big file usage. -- Add example of proper index output to doc. - show index from File; +- Add example of proper index output to doc. show index from File; +- Correct the Include syntax in the m4.xxx files in examples/conf +- Document JobStatus and Termination codes. +- Fix the error with the "DVI file can't be opened" while + building the French PDF. +- Document more DVD stuff +- Doc + { "JobErrors", "i"}, + { "JobFiles", "i"}, + { "SDJobFiles", "i"}, + { "SDErrors", "i"}, + { "FDJobStatus","s"}, + { "SDJobStatus","s"}, +- Document all the little details of setting up certificates for + the Bacula data encryption code. +- Document more precisely how to use master keys -- especially + for disaster recovery. + Priority: - -For 1.39: +- Why doesn't @"xxx abc" work in a conf file? +- Figure out some way to "automatically" backup conf changes. +- Look at using posix_fadvise(2) for backups -- see bug #751. + Possibly add the code at findlib/bfile.c:795 +- Add the OS version back to the Win32 client info. +- Restarted jobs have a NULL in the from field. +- Modify SD status command to indicate when the SD is writing + to a DVD (the device is not open -- see bug #732). +- Look at the possibility of adding "SET NAMES UTF8" for MySQL, + and possibly changing the blobs into varchar. +- Check if gnome-console works with TLS. 
+- Ensure that the SD re-reads the Media record if the JobFiles + does not match -- it may have been updated by another job. +- Look at moving the Storage directive from the Job to the + Pool in the default conf files. +- Test FIFO backup/restore -- make regression +- Doc items +- Test Volume compatibility between machine architectures +- Encryption documentation +- Wrong jobbytes with query 12 (todo) +- bacula-1.38.2-ssl.patch +- Bare-metal recovery Windows (todo) + + +Projects: +- GUI + - Admin + - Management reports + - Add doc for bweb -- especially Installation + - Look at Webmin + http://www.orangecrate.com/modules.php?name=News&file=article&sid=501 +- Performance + - FD-SD quick disconnect + - Despool attributes in separate thread + - Database speedups + - Embedded MySQL + - Check why restore repeatedly sends Rechdrs between + each data chunk -- according to James Harper 9Jan07. + - Building the in memory restore tree is slow. +- Features + - Better scheduling + - Full at least once a month, ... + - Cancel Inc if Diff/Full running + - More intelligent re-run + - New/deleted file backup + - FD plugins + - Incremental backup -- rsync, Stow + + + + +For next release: +- Look at mondo/mindi +- Don't restore Solaris Door files: + #define S_IFDOOR in st_mode. + see: http://docs.sun.com/app/docs/doc/816-5173/6mbb8ae23?a=view#indexterm-360 +- Make Bacula by default not backup tmpfs, procfs, sysfs, ... +- Fix hardlinked immutable files when linking a second file, the + immutable flag must be removed prior to trying to link it. +- Implement Python event for backing up/restoring a file. +- Change dbcheck to tell users to use native tools for fixing + broken databases, and to ensure they have the proper indexes. +- add udev rules for Bacula devices. +- If a job terminates, the DIR connection can close before the + Volume info is updated, leaving the File count wrong. 
+- Look at why SIGPIPE during connection can cause seg fault in + writing the daemon message, when Dir dropped to bacula:bacula +- Look at zlib 32 => 64 problems. +- Possibly turn on St. Bernard code. +- Fix bextract to restore ACLs, or better yet, use common routines. +- Do we migrate appendable Volumes? - Remove queue.c code. -- Correct the Include syntax in the m4.xxx files in examples/conf -- Get Perl replacement for bregex.c -- Fix auth compatibility with 1.38 -- Fix re-read of last block to check if job has actually written - a block, and check if block was written by a different job - (i.e. multiple simultaneous jobs writing). -- JobStatus and Termination codes. -- Some users claim that they must do two prune commands to get a - Volume marked as purged. - Print warning message if LANG environment variable does not specify UTF-8. - New dot commands from Arno. - .update volume [enabled|disabled|*see below] - > However, I could easily imagine an option to "update slots" that says - > "enable=yes|no" that would automatically enable or disable all the Volumes - > found in the autochanger. This will permit the user to optionally mark all - > the Volumes in the magazine disabled prior to taking them offsite, and mark - > them all enabled when bringing them back on site. Coupled with the options - > to the slots keyword, you can apply the enable/disable to any or all volumes. .show device=xxx lists information from one storage device, including devices (I'm not even sure that information exists in the DIR...) .move eject device=xxx mostly the same as 'unmount xxx' but perhaps with @@ -55,7 +124,56 @@ For 1.39: target slot. The catalog should be updated accordingly. 
.move transfer device=xxx fromslot=yyy toslot=zzz - +Low priority: +- Article: http://www.heise.de/open/news/meldung/83231 +- Article: http://www.golem.de/0701/49756.html +- Article: http://lwn.net/Articles/209809/ +- Article: http://www.onlamp.com/pub/a/onlamp/2004/01/09/bacula.html +- Article: http://www.linuxdevcenter.com/pub/a/linux/2005/04/07/bacula.html +- Article: http://www.osreviews.net/reviews/admin/bacula +- Article: http://www.debianhelp.co.uk/baculaweb.htm +- Article: +- It appears to me that you have run into some sort of race + condition where two threads want to use the same Volume and they + were both given access. Normally that is no problem. However, + one thread wanted the particular Volume in drive 0, but it was + loaded into drive 1 so it decided to unload it from drive 1 and + then loaded it into drive 0, while the second thread went on + thinking that the Volume could be used in drive 1 not realizing + that in between time, it was loaded in drive 0. + I'll look at the code to see if there is some way we can avoid + this kind of problem. Probably the best solution is to make the + first thread simply start using the Volume in drive 1 rather than + transferring it to drive 0. +- Fix re-read of last block to check if job has actually written + a block, and check if block was written by a different job + (i.e. multiple simultaneous jobs writing). +- Figure out how to configure query.sql. 
Suggestion to use m4: + == changequote.m4 === + changequote(`[',`]')dnl + ==== query.sql.in === + :List next 20 volumes to expire + SELECT + Pool.Name AS PoolName, + Media.VolumeName, + Media.VolStatus, + Media.MediaType, + ifdef([MySQL], + [ FROM_UNIXTIME(UNIX_TIMESTAMP(Media.LastWritten) + Media.VolRetention) AS Expire, ])dnl + ifdef([PostgreSQL], + [ media.lastwritten + interval '1 second' * media.volretention as expire, ])dnl + Media.LastWritten + FROM Pool + LEFT JOIN Media + ON Media.PoolId=Pool.PoolId + WHERE Media.LastWritten>0 + ORDER BY Expire + LIMIT 20; + ==== + Command: m4 -DMySQL changequote.m4 query.sql.in >query.sql + + The problem is that it requires m4, which is not present on all machines + at ./configure time. - Given all the problems with FIFOs, I think the solution is to do something a little different, though I will look at the code and see if there is not some simple solution (i.e. some bug that was introduced). What might be a better @@ -271,7 +389,7 @@ minutes). 3905 Device "LTO-Drive1" (/dev/nst0) open but no Bacula volume is mounted. If this is not a blank tape, try unmounting and remounting the Volume. -- Add VolumeState (enable, disable, archive) +- http://www.dwheeler.com/essays/commercial-floss.html - Add VolumeLock to prevent all but lock holder (SD) from updating the Volume data (with the exception of VolumeState). - The btape fill command does not seem to use the Autochanger @@ -293,16 +411,9 @@ minutes). - What happens when you rename a Disk Volume? - Job retention period in a Pool (and hence Volume). The job would then be migrated. -- Detect resource deadlock in Migrate when same job wants to read - and write the same device. -- Queue warning/error messages during restore so that they - are reported at the end of the report rather than being - hidden in the file listing ... - Look at -D_FORTIFY_SOURCE=2 - Add Win32 FileSet definition somewhere - Look at fixing restore status stats in SD. 
-- Make selection of Database used in restore correspond to - client. - Look at using ioctl(FIMAP) and FIGETBSZ for sparse files. http://www.informatik.uni-frankfurt.de/~loizides/reiserfs/fibmap.html - Implement a mode that says when a hard read error is @@ -315,7 +426,6 @@ minutes). ("F","Full"), ("D","Diff"), ("I","Inc"); -- Add ACL to restore only to original location. - Show files/second in client status output. - Add a recursive mark command (rmark) to restore. - "Minimum Job Interval = nnn" sets minimum interval between Jobs @@ -393,10 +503,6 @@ minutes). - In restore don't compare byte count on a raw device -- directory entry does not contain bytes. -- To mark files as deleted, run essentially a Verify to disk, and - when a file is found missing (MarkId != JobId), then create - a new File record with FileIndex == -1. This could be done - by the FD at the same time as the backup. === rate design jcr->last_rate jcr->last_runtime @@ -445,7 +551,12 @@ minutes). - Bug: if a job is manually scheduled to run later, it does not appear in any status report and cannot be cancelled. -==== Keeping track of deleted files ==== +==== Keeping track of deleted/new files ==== +- To mark files as deleted, run essentially a Verify to disk, and + when a file is found missing (MarkId != JobId), then create + a new File record with FileIndex == -1. This could be done + by the FD at the same time as the backup. + My "trick" for keeping track of deletions is the following. Assuming the user turns on this option, after all the files have been backed up, but before the job has terminated, the @@ -456,7 +567,14 @@ minutes). pass. The DIR will then compare that to what is stored in the catalog. Any files in the catalog but not in what the FD sent will receive a catalog File entry that indicates - that at that point in time the file was deleted. + that at that point in time the file was deleted. 
This is + either transmitted to the FD or simultaneously computed in + the FD, so that the FD can put a record on the tape that + indicates that the file has been deleted at this point. + A delete file entry could potentially be one with a FileIndex + of 0 or perhaps -1 (need to check if FileIndex is used for + some other thing as many of the Bacula fields are "overloaded" + in the SD). During a restore, any file initially picked up by some backup (Full, ...) then subsequently having a File entry @@ -483,6 +601,12 @@ minutes). Make sure this information is stored on the tape too so that it can be restored directly from the tape. + All the code (with the exception of formally generating and + saving the delete file entries) already exists in the Verify + Catalog command. It explicitly recognizes added/deleted files since + the last InitCatalog. It is more or less a "simple" matter of + taking that code and adapting it slightly to work for backups. + Comments from Martin Simmons (I think they are all covered): Ok, that should cover the basics. There are a few issues though: @@ -760,8 +884,6 @@ Documentation to do: (any release a little bit at a time) block numbers in btape "test". Possibly adjust in Bacula. - Fix list volumes to output volume retention in some other units, perhaps via a directive. -- If opening a tape in read/write mode fails attempt to open - it in read-only mode, and mark the tape for read only. - Allow Simultaneous Priorities = yes => run up to Max concurrent jobs even with multiple priorities. - If you use restore replace=never, the directory attributes for @@ -769,11 +891,6 @@ Documentation to do: (any release a little bit at a time) - see lzma401.zip in others directory for new compression algorithm/library. -- Minimal autochanger handling in Bacula and in btape. -- Look into how tar does not save sockets and the possibility of - not saving them in Bacula (Martin Simmons reported this). 
-- Fix restore jobs so that multiple jobs can run if they - are not using the same tape(s). - Allow the user to select JobType for manual pruning/purging. - bscan does not put first of two volumes back with all info in bscan-test. @@ -823,8 +940,6 @@ Documentation to do: (any release a little bit at a time) are not restored. See bug 213. To fix this requires creating a list of newly restored directories so that those directory permissions *can* be restored. -- Compaction of Disk space by "migrating" Volumes that have pruned - Jobs (what criteria? size, #jobs, time). - Add prune all command - Document fact that purge can destroy a part of a restore by purging one volume while others remain valid -- perhaps mark Jobs. @@ -845,9 +960,6 @@ Documentation to do: (any release a little bit at a time) - Add tree pane to left of window. - Add progress meter. - Max wait time or max run time causes seg fault -- see runtime-bug.txt -- Document writing to a CD/DVD with Bacula. -- Add a "base" package to the window installer for pthreadsVCE.dll - which is needed by all packages. - Add message to user to check for fixed block size when the forward space test fails in btape. - When unmarking a directory check if all files below are unmarked and @@ -856,7 +968,6 @@ Documentation to do: (any release a little bit at a time) - Setup lrrd graphs: (http://www.linpro.no/projects/lrrd/) Mike Acar. - Revisit the question of multiple Volumes (disk) on a single device. - Add a block copy option to bcopy. -- Investigate adding Mac Resource Forks. - Finish work on Gnome restore GUI. - Fix "llist jobid=xx" where no fileset or client exists. - For each job type (Admin, Restore, ...) require only the really necessary @@ -1006,11 +1117,6 @@ Documentation to do: (any release a little bit at a time) to start a job or pass its DHCP obtained IP number. 
- Implement a query tape prompt/replace feature for a console - Copy console @ code to gnome2-console -- Make AES the only encryption algorithm see - http://csrc.nist.gov/CryptoToolkit/aes/). It's - an officially adopted standard, has survived peer - review, and provides keys up to 256 bits. -- Take a careful look at SetACL http://setacl.sourceforge.net - Make tree walk routines like cd, ls, ... more user friendly by handling spaces better. - Make sure that Bacula rechecks the tape after the 20 min wait. @@ -1027,7 +1133,6 @@ Documentation to do: (any release a little bit at a time) in the "short" pool to the "long" pool if this pool runs out of volume space? - What to do about "list files job=xxx". -- Get and test MySQL 4.0 - Look at how fuser works and /proc/PID/fd that is how Nic found the file descriptor leak in Bacula. - Implement WrapCounters in Counters. @@ -1050,14 +1155,8 @@ Documentation to do: (any release a little bit at a time) run the job but don't save the files. - Make things like list where a file is saved case independent for Windows. -- Implement migrate - Use autochanger to handle multiple devices. -- On Windows with very long path names, it may be impossible to create - a file (and thus restore it) because the total length is too long. - We must cd into the directory then create the file without the - full path name. - Implement a Recycle command -- Test a second language e.g. french. - Start working on Base jobs. - Implement UnsavedFiles DB record. - From Phil Stracchino: @@ -1087,8 +1186,6 @@ Documentation to do: (any release a little bit at a time) - If SD cannot open a drive, make it periodically retry. - Add more of the config info to the tape label. -- If tape is marked read-only, then try opening it read-only rather than - failing, and remember that it cannot be written. 
- Refine SD waiting output: Device is being positioned > Device is being positioned for append @@ -1127,7 +1224,6 @@ Documentation to do: (any release a little bit at a time) - Compare tape to Client files (attributes, or attributes and data) - Make all database Ids 64 bit. - Allow console commands to detach or run in background. -- Fix status delay on storage daemon during rewind. - Add SD message variables to control operator wait time - Maximum Operator Wait - Minimum Message Interval @@ -1352,16 +1448,6 @@ Longer term to do: Migration: Move a backup from one Volume to another Clone: Copy a backup -- two Volumes -Bacula Migration is based on Jobs (apparently Networker is file by file). - -Migration triggered by: - Number of Jobs - Number of Volumes - Age of Jobs - Highwater mark (keep total size) - Lowwater mark - - ====================================================== Base Jobs design @@ -1607,7 +1693,6 @@ Block Position: 0 - Add ACL error messages in src/filed/acl.c. - Make authentication failures single threaded. - Make Dir and SD authentication errors single threaded. -- Install man pages - Fix catreq.c digestbuf at line 411 in src/dird/catreq.c - Make base64.c (bin_to_base64) take a buffer length argument to avoid overruns. @@ -1622,4 +1707,47 @@ Block Position: 0 LocationId NewState??? - Add Comment to Media record - +- Fix auth compatibility with 1.38 +- Update dbcheck to include Log table +- Update llist to include new fields. +- Make unmount unload autochanger. Make mount load slot. +- Fix bscan to report the JobType when restoring a job. +- Fix wx-console scanning problem with commas in names. +- Add manpages to the list of directories for make install. Notify + Scott +- Add bconsole option to use stdin/out instead of conio. +- Fix ClientRunBefore/AfterJob compatibility. +- Ensure that connection to daemon failure always indicates what + daemon it was trying to connect to. +- Freespace on DVD requested over and over even with no intervening + writes. 
+- .update volume [enabled|disabled|*see below] + > However, I could easily imagine an option to "update slots" that says + > "enable=yes|no" that would automatically enable or disable all the Volumes + > found in the autochanger. This will permit the user to optionally mark all + > the Volumes in the magazine disabled prior to taking them offsite, and mark + > them all enabled when bringing them back on site. Coupled with the options + > to the slots keyword, you can apply the enable/disable to any or all volumes. +- Restricted consoles start in the Default catalog even if it + is not permitted. +- When reading through parts on the DVD, the DVD is mounted and + unmounted for each part. +- Make sure that the restore options don't permit "seeing" other + Client's job data. +- Restore of a raw drive should not try to check the volume size. +- Lock tape drive door when open() +- Make release unload any autochanger. +- Arno's reservation deadlock. +- Eric's SD patch +- Make sure the new level=Full syntax is used in all + example conf files (especially in the manual). +- Fix prog copyright (SD) all other files. +- Document need for UTF-8 format +- Try turning on disk seek code. +- Some users claim that they must do two prune commands to get a + Volume marked as purged. +- Document fact that CatalogACL now needed for Tray monitor (fixed). +- If you have two Catalogs, it will take the first one. +- Migration Volume span bug +- Rescue release +- Bug reports