Technical notes on version 1.31a 02Aug03
02 Aug 2003
Kern Sibbald

General:

Changes submitted this submission:
02Aug03
- Yifang Dai reported a case where he stress tested Bacula and backed up to
  four volumes, but only two were selected for the restore. This is because I
  forgot that the selection could span a volume entirely.
31Jul03
- Added a missing CLIENT_FOUND_ROWS to the second attempt to open the MySQL
  database -- this prevents UPDATE errors if nothing actually changed.
- Applied corrections to the manual supplied by Bob Collins. Many thanks!
30Jul03
- Integrated Robert Mathews' improved description of Priorities into the
  manual.
- Chased down the "The data is not valid" bug on WinMe/98/95.
- Found an orphaned buffer in the set_attributes part of WinMe/98/95.
28Jul03
- Add sleep(1) to console when it gets a SIGTSTP signal to prevent it from
  using 100% of the CPU.
- Improve description of Priorities.
- Add a bit more documentation to jobq.c
- Complete hash table routine htable.c htable.h
- Change M_INFO to M_ERROR in attribs.c for Windows errors.
23Jul03
- Apply a patch from Nic Bellamy that clarifies the error messages during
  recycling volumes.
22Jul03
- Documentation.
21Jul03
- Clear VolCatInfo in askdir.c so that readbytes is zeroed.
- Add SD statistics to backup report.
- Removed old workq code.
- Fixed rescheduling after error.
- Fixed delayed starts which were not working.
- Added priority to values that can change when starting a job.
20Jul03
- Complete implementation of new job scheduler (jobq.h, jobq.c). This code
  is turned off unless specifically enabled in src/version.h
- Integrate code from Nic Bellamy to check for recycled volume in mount.c
  in SD.
19Jul03
- Fix a couple of bugs in dlist.c
- Begin implementation of new job scheduler.
17Jul03
- Take serial.h provided by David Craigon, which corrects differences in
  prototypes between serial.h and serial.c.
- Make db_get_media_ids() return Media Ids only for the current pool.
- Add new jobq.h and jobq.c derived from workq.
- Add JobPriority to jcr, and Priority to Job resource as well as to the
  run line in a Schedule.
- Remove unused pool record from autoprune.c.
- Implement Nic Bellamy's RecycleCurrentVolume.
- Implement RecycleOldestVolume.
- Begin adding new JOB_QUEUE code to the Director.
- Create a single routine recycle_volume().
- Retry accept(), bind() and socket() if EINTR occurs.
- Implement insert_before(), insert_after(), and empty() for dlist class.
  Also require offset to be given by giving item and link address.
- Make some error messages in smtp.c a bit more explicit.
14Jul03
- Marc Brueckner reported a crash during restore (a missing tree->)
- Moved host.h.in file from filed to src.
- Update btraceback to include host os, distname, distver in output.
- Split list (in lib) into alist and dlist both with .h and .c.
- Update home page to include Project status page.
11Jul03
- Manual updates.
- Clean up some unused variables detected by the IRIX compiler.
- Test two directories on Win32 -- caused a crash. I forgot to NULL the
  uid cache pointer after releasing it.
- Use bstrncpy() instead of strcpy() in find_files.
- Clear a few linked lists in the temp directory packed in find_one.c
- Eliminate an unnecessary variable in attr.c
- Clear the cache pointer after release in idcache.c
- Implement a new C++ doubly linked list class.
- Change "Back space" to Backspace.
08Jul03
- Update document for Win32 stuff.
- Ensure VolStatus value for update is permitted.
- Fix cached_path so that it is local to the jcr, otherwise, there are
  problems from job to job.
- Fixed idcache.c which was not thread safe and didn't release memory, and
  didn't always edit the userid correctly.
07Jul03
- Correct missing pool memory allocation in update voluseduration.
- Release mutex in pool_mem.c before triggering ASSERT.
06Jul03
- Lock database while recycling.
- Fix a bug in editing since where I forgot to update to the new size.
- Implement all the command line update arguments.
- Modify label to use volume=xxx for the new volume and oldvolume=yyy if
  doing a relabel.
- Added yes to run command line arguments.
- Clear errno in editing a string to utime.
- In restore print only volumes that will actually be used.
- Fix bextract -- add appropriate breaks in new case code.
- Add a new test -- bsr-opt-test for testing bsr optimization. As usual, it
  pointed out a bug where the directory tree handling code destroyed the
  restore arg list.
- Many updates to the manual.
- Pass prefix links flag to FD.
- Sort list of commands for Console
- Set default FD and SD concurrent jobs to 10.
05Jul03 (from vacation)
- Rework the find next volume code in catreq.c to correct some minor but
  subtle logic errors and to eliminate a goto.
- Did spell check on manual.
- Removed bindtextdomain() as it conflicted with RH8.0 headers
- Fixed parse_args to pass address of POOLMEM struct.
- Constrain FileIndexes written to BSR to be within range of Volume.
- Suppress writing volumes to BSR if they are not actually referenced.
- Make FOPTS use alist for match and base entries.
- Pass prefix_links to SD.
- Add command line interface to most items in "update volume=xxx"
- Add command line interface to restore "jobid", "current", "before", "all".
- Add command line "yes" to run command to suppress prompt.
- In new alist code, free only if allocated.
- Overload [] with get() code for alist.
- Fixed the code that wrote FirstIndex and LastIndex to the database. It
  was not correct at the end of a volume (basically included indexes in
  the second volume).
- Fixed bscan to work with the new code and to properly build JobMedia
  records.
- Added code to the read end of block.c to properly track Volume bytes,
  blocks, and files. I thought this was not necessary, but it is critical
  for bscan to work correctly.
- Modified read_record to properly track First/LastIndex -- needed by bscan.
- Eliminated some old Volume write code.
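
The two BSR entries above (constraining FileIndexes to the range of each
Volume, and suppressing Volumes that are not referenced) can be pictured
with a minimal sketch. It is written in Python for brevity and is not
Bacula's code: the VolumeRange type and the clamp_selection() and
bsr_entries() helpers are hypothetical names, and the inclusive index
ranges are an assumption made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class VolumeRange(NamedTuple):
    """Hypothetical record: the FileIndex range a job wrote to one Volume."""
    name: str
    first_index: int
    last_index: int


def clamp_selection(sel_first: int, sel_last: int,
                    vol: VolumeRange) -> Optional[Tuple[int, int]]:
    """Restrict a selected FileIndex range to what the Volume holds.

    Returns None when the Volume holds none of the selected files.
    """
    first = max(sel_first, vol.first_index)
    last = min(sel_last, vol.last_index)
    if first > last:
        return None
    return (first, last)


def bsr_entries(sel_first: int, sel_last: int,
                volumes: List[VolumeRange]) -> List[Tuple[str, int, int]]:
    """Build (volume, first, last) entries, skipping unreferenced Volumes."""
    entries = []
    for vol in volumes:
        clamped = clamp_selection(sel_first, sel_last, vol)
        if clamped is not None:
            entries.append((vol.name, clamped[0], clamped[1]))
    return entries


volumes = [
    VolumeRange("Vol1", 1, 100),
    VolumeRange("Vol2", 101, 250),
    VolumeRange("Vol3", 251, 400),
    VolumeRange("Vol4", 401, 500),
]
# A selection that starts in Vol1, spans Vol2 entirely, and ends in Vol3.
print(bsr_entries(50, 300, volumes))
```

Note that a selection may cover a middle Volume completely, as in the
example; an overlap test of this kind keeps that Volume in the list.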
28Jun03
- Changed RecycleOldestVolume to PurgeOldestVolume
27Jun03
- Added what I hope are the "final" touches for Win32 stuff. There are
  still a lot of annoying little problems.
- Added the "portable=yes/no" option to Include. If set, it disables use of
  BackupRead/Write for Win32, so in principle, the data should be portable.
26Jun03
- Pulled in more recent config.sub and config.guess from /usr/share/libtool
- Replaced the system fgets() by a Bacula version that ignores interrupts
  (i.e. signals). Interrupted reads had been truncating output from child
  processes.
- Make file_index int32_t everywhere.
- Moved LinkFI into ATTR structure. Also integrated data_stream there too.
- Moved code that sets the stream for writing into create_file.
- Removed a signal(SIG_IGN, SIGCHLD) from dird.c that prevented getting the
  status of child processes. This allowed removing the FreeBSD kludge to
  bpipe.c -- the status is now obtained correctly.
- Hand scan the stream header that arrives in append.c to avoid machine
  dependencies of sscanf().
25Jun03
- Implemented code to put Data stream in Attributes record.
- Check if data stream is supported, if not, ignore.
- Fix crash when multiple Includes are given (missing parens).
- Clear WroteVol in askdir.c when JobMedia record is created.
24Jun03
- Implement simple array list class for use in Bacula. New files are
  lib/list.c lib/list.h. Probably will not use until version 1.32.
23Jun03
- Change Purging Oldest Volume message to Recycling Oldest Volume.
- Limit results from find_oldest_volume to one.
- Fix possible buffer overrun in the restore tree handling routines.
- Fixed a crash in VerifyToVolume because I moved the close_db() down into
  the free_ua_context() and should not have done so.
21Jun03
- Add a "var" command in the Console that does variable expansion and
  prints it.
- Implement first cut of estimate command.
20Jun03
- Change find_next_volume() for oldest to use LastWritten instead of
  FirstWritten -- also add Append to volumes selected.
- Do normal recycling before checking for RecycleOldestVolume.
- Implemented block rejection on read. This should make restores run much
  faster. Next release will have block positioning -- even faster.
19Jun03
- Very preliminary support for Gnome-2.0. Text does not yet work.
- Correct buffer corruption in find_one.c with long directory names.
- Make setting owner on directories M_ERROR rather than M_WARNING.
- Fix printing of JobId in run listing.
- Reduce heartbeat poll interval to every 10 seconds on Cygwin because
  there is no kill.
18Jun03
- I finally implemented a test for multiple simultaneous jobs, and sure
  enough it broke when the jobs are split over multiple volumes. Now fixed
  and working!
- Eliminated a few "duplicate" error messages by testing for canceled.
17Jun03
- Add ASSERT for device use count going negative.
- Fix BlockNumber checking in stored/read.c (got first one wrong).
- If socket is timed out, do a shutdown(fd,2) instead of close().
16Jun03
- Fixed return status from SD to FD by setting JobStatus in append_end()
- Add arrays to Environment variables. Elements separated by |.
- Implement Reschedule On Error, Reschedule Interval, Reschedule Times.
- Add a new pool PM_NAME -- gets a name length buffer.
- Implement fast cancel of FD blocked on writing to SD by using
  pthread_cancel(). Turned off on Cygwin due to bug.
- Add code to handle EAGAIN in writing (probably not necessary). Use
  select().
- Eliminate size_t from pool control buffers.
15Jun03
- Complete Counter resource.
- Complete LabelFormat (except for WrapCounter) plus counter
  incrementation.
- This needs a database change to eliminate PoolId from counters.
14Jun03
- Modify the manual's index to be a bit more compact. Less space between
  lines.
- Add Phil's checkhost to examples directory (thanks Phil).
- Implement generalized LabelFormat (documentation to come).
- Implement Counter resource.
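
As a rough picture of what a counter with incrementation and a WrapCounter
might do, here is a minimal Python sketch. It does not reproduce the Counter
resource: the class, its minimum/maximum fields, and the rule that a wrap
bumps a second counter are all assumptions made for illustration.

```python
from typing import Optional


class Counter:
    """Hypothetical counter that wraps from maximum back to minimum."""

    def __init__(self, minimum: int = 0, maximum: int = 9,
                 wrap_counter: Optional["Counter"] = None) -> None:
        if minimum > maximum:
            raise ValueError("minimum must not exceed maximum")
        self.minimum = minimum
        self.maximum = maximum
        self.wrap_counter = wrap_counter  # assumed: bumped on each wrap
        self.value = minimum

    def increment(self) -> int:
        """Return the current value, then advance (wrapping if needed)."""
        current = self.value
        if self.value >= self.maximum:
            self.value = self.minimum
            if self.wrap_counter is not None:
                self.wrap_counter.increment()
        else:
            self.value += 1
        return current


wraps = Counter(0, 99)
tape_number = Counter(1, 3, wrap_counter=wraps)
issued = [tape_number.increment() for _ in range(4)]
print(issued, wraps.value)
```

A label generator could combine such values into Volume names, but the
actual LabelFormat syntax is not described in these notes and is left out
here.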
13Jun03
- Cleanup examples/kernsconfig
- Fix new bug introduced in newvol.c
- Implement restore to a specific date.
12Jun03
- Fixed a bug in automatic labeling (and use durations expiring) analysed
  and reported by Rob Proffitt (thanks!).
- Cleaned up a few Cygwin compile problems.
- Made a 10Jun03 release (it is in production here)
11Jun03
- Finally took the big plunge and fixed restoration of links and other
  files that have been changed between the backup and restore. Basically
  if the file exists, it is deleted, then re-created.
- Purge only Volumes marked Append, Full, Used, or Error.
- Allow pruning of volumes marked Append, in addition to Full and Used.
10Jun03
- Eliminated all plain email addresses and replaced them with " at " in
  place of @ to reduce harvesting by spammers. Doc + Web Site.
- Started working on making POOLMEM a struct rather than a char. Lots of
  work to do.
- Fixed bscan to handle -V option.
- Fixed bscan to handle two File volumes.
- Corrected a misplaced comma in get_fileset() in cats pointed out by bscan.
- Added a two-Volume bscan test to regression scripts -- write two volumes,
  purge and delete everything, bscan the tapes, and do a restore. It works!
09Jun03
- Reorganized the backup/restore code to move the attribute information
  into an ATTR packet, which is passed in place of tons of arguments. Moved
  some code into lib/attr.c and lib/attr.h. Then eliminated all the
  duplicate attribute code.
- Moved FT_ types into baconfig.h.
- Defined FT_ types to use only 16 bits. The upper half of the word is
  reserved for adding optional fields in the attributes packet.
- Moved jcr->where into common part of jcr and have it deleted in lib/jcr.c
- Put all attribute reading code on switch() with cases instead of a big if
  (restore.c, bls.c, bextract.c, bscan.c, ...)
- set_attributes() now takes ATTR packet, and thus has far fewer args.
- Moved print_ls_output() into lib/attr.c
- Implemented is_stream_supported().
- create_file() now takes ATTR packet so has many fewer args.
- Add mtime_only code.
- Rewrote bnet.c read and write routines to quit if bsock->terminated is
  set. This will allow setting non-blocking writes and then receiving a
  termination message and terminating the Job immediately rather than
  waiting 2 hours for the line to timeout.
- Put catalog db name in some error messages.
- Code for restore is now much cleaner, with much of it in lib/attr.c, and
  it is now common for all readers.
07Jun03
- Add first cut of proper support for Win32 Backup code.
- Fix bug in restore Win95/98/Me.
- Pass mtime_only flag to FD. Needs config record.
04Jun03
- Added documentation on how to add a Client.
- Add VolIndex update when creating JobMedia based on count.
- Sort JobMedia records by VolIndex,JobMediaId works for new and old format.
- Print an error message when cannot open database.
- Update query.sql to include
    List all backups for a Client after a specified time
    List all backups for a Client
- Change a bunch of command names to xxx_cmd.
- Add Release command.
- Removed update of EndBlock/EndFile on EOM in attempt to get it right.
- Add rawfill command to btape.
- Tweak fill command (something isn't right with last block).
03Jun03
- Fix block.c to check errno only in case of return status -1 as suggested
  by Justin Gibbs (FreeBSD).
- Implemented qfill command in btape for quick testing write/read of a tape.
- Discovered that FreeBSD pthreads re-use the same thread id, which causes
  the SD to fail when a user leaves a device unmounted (old pid is reused
  and lock_device() thinks the same thread is calling again leading to
  inconsistent state). Set id to zero after blocking the device during
  unmount.
02Jun03
- A lot of clean up, moving subroutines around for TermCode.
- Free ua->prompt when Job terminates.
- Add AutoPrune and Recycle to values copied from Pool resource into Pool
  record on create/updated.
- Implemented bsr for Verify VolumeToCatalog.
- Improved the Verify Job report using SD and FD term codes.
- Split tree handling routines from ua_restore.c to ua_tree.c
- Split bsr routines from ua_restore.c to bsr.c and bsr.h
01Jun03
- Fixed clash between FD and SD returned job values. Report now contains
  values from FD. Maybe I should change? or give both.
- Attempt to fix negative use_count for dev packet in SD by adding a couple
  of open_dev(). This may be cause of Dan's crash.
- Clear no_wait_id when device is unblocked. This may be cause of Dan's
  crash.
- Eliminate old "new lock code".
31May03
- Add configure of mtx-changer for mtx path.
- Always rewind tape before releasing it (for FreeBSD).
- StartBlock was one too large for second volume.
- Fixed restore to display status from both SD and FD.
- Unified return status message for backup and restore.
30May03
- Corrected segmentation fault reported by Dan when doing "label barcodes"
  on a File.
- Corrected a segmentation fault when attempting to send a JobMedia record
  to the Console -- reported by Dan.
- Added MySQL documentation for using the threaded libraries.
- Added new columns and tables to Catalog database.
- Wrote alter scripts and tested them (thanks to Dan for the help) on MySQL
  and SQLite.
- Started using enums wherever possible when passing flags to subroutines.
  This helps make the source much more readable.
- Corrected a bug where a vertical database listing was being used in the
  query command.
- Added new argument to parse_args() to prevent command arg overflow.
- Renamed ua_db_query.c to ua_query.c.
- Split scan.c out of lib/util.c
- Perhaps I have *finally* fixed the command line history in gnome-console.
- Added support for smartalloc for any global new or delete command by
  overloading the global operators.
- Made the default time with no qualifier day rather than seconds.
- Fixed a bug in the store_size() routine that improperly converted from
  double to uint32_t.
- Started using "bool" where possible.
- Zap SD session key once it is used.
- Added *lots* more checking for strcpy -- bstrncpy(), ...
27May03
- Added CreateTime field to FileSet record and print it to distinguish
  FileSets.
- Print an information message when a new FileSet is created.
- Include the FileSet date/time in the Job report.
- Indicate if a Job is upgraded in the Job report and from what previous
  level.
- Incremented the database version.
- Ensure that any DB error message is printed if the start_time of a
  previous save is not found.
- Free orphaned buffer in ua_restore.c in case of database error.
- Implement enum for response DISPLAY_ERROR and NO_DISPLAY
- Implement enum for create_pool (POOL_OP_CREATE, POOL_OP_UPDATE).
- Make sure FileSets printed in restore are in order.
- Add a number of bstrncat, and other protected string operations.
26May03
- Clean up old structs in dird_conf.h
- Remove all Slot invalidation code.
- Add Automatic choice message to all do_prompt() calls.
- Eliminate JobId from restore if not used.
- Clean up a few error messages.
- Make fill/unfill commands work correctly in btape.
25May03
- Enhance btape fill and unfill commands.
- Implement real Pmsg() code so that negative levels work in Dmsg()
- Implement block number check -- had to turn it off because it doesn't
  work. Need to verify that it is the correct block and that block numbers
  are properly written.
- Moved readline from depkgs1 to depkgs.
- Reworked the configure code to handle readline correctly. This was broken
  mostly due to the fact that the readline routines are nested down one
  directory. Also, I missed one header file that was needed (possibly added
  in a later version).
- Put correct include on the dependencies make for Console readline.
- Remove JobMediaId from VOL_PARAMS (no longer needed).
- Sort VOL_PARAMS by JobMediaId using SQL in cats.
- Add jcr as argument to block.c read_block... routines so that error
  messages are immediately displayed.
- Make bsr_dev() edit an error message if it is turned off and return 0.
- Add checking for the BlockNumber in the read routines -- lots of false
  matches are found -- must check writing end.
24May03
- Now sort bsr volumes by JobMediaId -- produces better results.
- It turns out that under certain circumstances, when doing a restore, the
  Volumes will not be written to the BSR in the correct order. I don't know
  exactly why, but many thanks to Dan Langille for reporting this. The
  solution is to sort the Vol_Params within each bsr (done), and to sort
  the bsr chain (not yet implemented). Note, the bsr chain should always be
  in order unless the user explicitly specifies the JobIds in a different
  order.
- Began implementing C++ structs rather than typedef structs as in C.
- Added volatile to a lot of variables that are used in two threads at the
  same time. This should prevent improper optimization.
- Fixed a missing space in the "run job=xxx where=" the where was glued to
  the end of the previous stuff (bootstrap filename).
- I *finally* found the cause of the mysterious failure of shell expansion.
  It was due to the read() getting interrupted! That's what opening up
  SIGCHLD will do!
- Remove unused default tape drive names.
- Create a new status.c file in stored and split the status code out of
  dircmd.c
22May03
- I discovered that C++ permits "prototyping" structures e.g. struct A; is
  a valid statement. This permitted me to eliminate all the void *jcr, in
  favor of JCR *jcr, which pointed out a number of bugs in block.c.
- Change lib/bmisc.c to bsys.c (system routines).
- Add set_working_directory() to lib/util.c
- Remove some unneeded setjcr_job_status() since Jmsg(jcr, M_FATAL,...)
  already sets it.
- Do not increment jcr->Errors for Fatal errors -- they represent non-fatal
  errors.
- Fix a few more places in FD where Errors was not incremented.
- Print unexpected (or incorrect) termination message returned from FD.
- Use switch() instead of giant if statement in verify_vol.c
- Protect overrun from do_shell_expansion() by passing max length.
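
The ordering fix noted under 24May03 (sort the Vol_Params within each bsr,
leave the bsr chain in the order the JobIds were given) can be sketched as
follows. This is an illustration in Python, not the Director's code: the
dictionary layout and the sort_vol_params() helper are hypothetical.

```python
from typing import Dict, List


def sort_vol_params(bsr_chain: List[Dict]) -> List[Dict]:
    """Sort the volume parameters inside each bsr by JobMediaId.

    The chain itself is deliberately left in its given order, since the
    user may have listed the JobIds in a specific sequence.
    """
    for bsr in bsr_chain:
        bsr["vol_params"].sort(key=lambda vp: vp["JobMediaId"])
    return bsr_chain


chain = [
    {"JobId": 7, "vol_params": [
        {"JobMediaId": 12, "VolumeName": "Vol2"},
        {"JobMediaId": 11, "VolumeName": "Vol1"},
    ]},
    {"JobId": 5, "vol_params": [
        {"JobMediaId": 9, "VolumeName": "Vol1"},
    ]},
]
sort_vol_params(chain)
print([vp["VolumeName"] for vp in chain[0]["vol_params"]])
```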
20May03
- Add mandrake to platforms
- Suppress error messages if no bytes written to tape.
- Suck up bootstrap file even on error so that Dir sees our error message.
- Pretty much finish off the Win32 backup code.
- Add DESTDIR code to autostart for creating non-root rpms
- Echo input read from a script in Console.
- Clarify error message for VerifyToCatalog
- Add error counts in restore for M_NOTSAVED.
- Adapt bfile.c to handle both Win95 files as well as WinXP files.
- Add MTIOCERRSTAT for FreeBSD (clear error status).
- Correct double jobmedia record when cancel at EOM reported by Phil.
- Correct possible write at beginning of tape during cancel at EOM as
  reported by Phil.
- Document in detail how Incremental and Differential jobs work.
- Add non-fatal error count on backup and restore Job reports.
- Remove a couple uses of lld -- now prefer to edit and use %s.
- Fix directory could not be accessed on Win32.
- Improve message indicating that last Full backup not found.
- Fix free() too early in directory traversal code.
19May03
- Prune Jobs with no JobFiles or that have JobStatus!='T'
- Add a few more command line scans for prune/purge.
- Restrict valid characters in a Volume name, and document it.
18May03
- Make new Win32 save/restore work. Still a bit more to do.
17May03
- Use reentrant version of mysqlclient library.
- Use more machine independent way of finding gcc version.
- Fix race condition in sql_list where messages edited before locking.
- Lots of testing saving/restoring 6GB files.
- Add where to restore where=/tmp
- Complete implementation of Win32 streams in FD. Must test. Also, must
  implement new streams in SD.
- Make termination of daemons more "error" tolerant.
- Make default "duration" days rather than seconds if there is no modifier.
- Install bcopy.
15May03
- Add detection of available Win API's so that a single binary will work on
  all Windows systems. Reference those APIs through a pointer.
- Remove use_win_backup_api and enable it in bfile.c if system supports it.
- Modify dev.c so that it works if MTEOM is not defined (BSDI).
- Change MT_xxx to BMT_xxx to prevent conflicts with BSDI.
- Detect strtoll() in configure.
- Implement replacement for strtoll() for BSDI.
- Add platform files for BSDI.
- Use Jmsg() instead of Jmsg1() in acquire because File:line prefixed in
  dev.c
- Use Jmsg() in write_block_to_dev() so that no messages are lost.
- Rework autochanger code in restore to handle case of cassette not in
  magazine.
14May03
- Implement Windows BackupRead/Write(). I now have permissions right!!!!
- Additions to the manual (Purging, Autopruning).
- Add doc to code in autoprune.
- Begin adding Level = Base.
- Make Jmsg recognize console and direct messages directly back to it.
13May03
- Hopefully fix mess in mount.c when a tape expires.
- Fix restore bug recently introduced due to Unix backwards status
  convention.
- New bacula.spec from Scott
- Add globals for database name and version and print them in traceback.
- Eliminate SubSysDirectory in each daemon conf file.
- Implement get_yesno() and get_pint() in UA.
- Make Jmsg aware of console. Messages now sent directly to Console.
11May03
- Created a single bacula.spec.in for both the MySQL and SQLite builds.
- Added proper configuration to console.in and gconsole.in
- Start adding textdomain() code for translating.
10May03
- A number of minor code cleanups.
- Rework shell expansion just a bit.
- Add rewind() when releasing a tape before acquiring the next one.
09May03
- Implement addition of Description in Service entry for Win32.
- Update manual to eliminate unclear autochanger points as mentioned by Dan
  Langille.
08May03
- Implement DESTDIR everywhere.
- Rework spec files for 1.31 and combine the main spec and the client only
  spec making a client package. At the same time, rename the packages so it
  is a bit clearer to the user.
  Also fix the build to work as non-root (scriptdir was not prefixed with
  $RPM_BUILD_ROOT).
- Correct Auto Changers and all other forms to Autochangers in the manual.
07May03
- John reported needing to do two "mount" requests, and indeed that was the
  case. It turns out that pthread_cond_timedwait() does not always return
  zero when awakened by a pthread_cond_signal().
- Include RunBeforeJob and RunAfterJob output in job output report.
- Implement a "real" Admin job that prints a mini-job report.
- Clean up a few error messages in findlib and filed.
06May03
- Recent changes to gnome-console caused initial output to be lost -- now
  fixed.
05May03
- The Win32 version crashed after each job. After hours, it turns out that
  when running with LocalSystem privilege (and not as a user), when Cygwin
  does pthread_kill(id, SIGUSR2), it gets a memory fault.
- Moved stored/fdmsg.c to lib/bget_msg.c, and moved SD messages to
  stored.c. So now bget_msg() can be used by both the SD and FD.
- Changed Director's bget_msg() to be called bget_dirmsg() to avoid any
  possible confusion.
- Implemented bget_msg() in general everywhere in the FD except for job.c
  where the Dir and FD are communicating.
- Implemented a Director only heartbeat in the FD for the cases where there
  is either no connection to the SD or the FD is already reading from the
  SD. start_dir_heartbeat() ...
04May03
- Add heartbeat to restore and verify volume.
- Add "Heartbeat Interval" to Storage resource, which sets interval the SD
  sends heartbeats to the FD and DIR, 0 disables heartbeats.
- Add "Heartbeat Interval" to FileDaemon resource, which sets the interval
  the FD sends heartbeats to the DIR, 0 disables heartbeats.
- Added heartbeat from FD to Dir every HB_TIME rather than forwarding SD
  heartbeats.
03May03
- First cut label dialog.
- Turn on new semaphore code for simultaneous Jobs.
- Fix cancel trying to release semaphores not acquired.
- Implement get_pint() and get_yesno() for UA.
- Implement find_arg_with_value() for UA.
- Allow command line "slot" to be specified for label command.
- Rework heartbeat code in FD to correctly terminate.
- Fix btraceback to use smtp and to eliminate double //
02May03
- Fix "storage" command to include ssl for verify and restores.
- Add Heartbeat code when SD is waiting on a tape -- heartbeat every 20
  mins to keep stateful firewalls from timing out the connections.
- Fix src/stored/Makefile.in typo causing problems in statically linking
  btape. Thanks to Lutz for reporting this.
- Create an is_client_alive script for checking if a client is alive. Using
  this script prevents generating error messages.
- Added corrections and updates to manual provided by Phil -- thanks.
01May03
- Added RequireSSL to each program/daemon configuration.
- Added EnableSSL to each correspondent for each program.
- Added the Console resource to the Director (need to implement individual
  Console authorization).
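
The "Heartbeat Interval" behaviour described in the 02May03 and 04May03
entries (send a heartbeat once the interval has elapsed; 0 disables
heartbeats) reduces to a small timing check. The Python sketch below is only
an illustration; heartbeat_due() and its parameters are hypothetical names,
and times are assumed to be in seconds.

```python
def heartbeat_due(now: float, last_sent: float, interval: float) -> bool:
    """Return True when a heartbeat should be sent.

    An interval of 0 disables heartbeats entirely, matching the
    "0 disables heartbeats" rule in the notes above.
    """
    if interval <= 0:
        return False
    return now - last_sent >= interval


TWENTY_MINUTES = 20 * 60  # the interval mentioned for an SD waiting on a tape

print(heartbeat_due(now=1200, last_sent=0, interval=TWENTY_MINUTES))
print(heartbeat_due(now=600, last_sent=0, interval=TWENTY_MINUTES))
print(heartbeat_due(now=10_000, last_sent=0, interval=0))
```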