Technical notes on version 2.2.x
General:
+Release Version 2.2.4
+14Sep07
+kes Increase size of name string when FD making connection to SD.
+ May fix bug #953.
+13Sep07
+kes Add code to try to fix bug #908.
+kes Add waits to multiple exit detection code to try to force pid
+ file to always be deleted.
+kes Restore good dev.tar.gz to rescue set appropriate binary property.
+ This fixes bug #950.
+kes Fix seg fault in error exit of acquire_for_read after unsuccessfully
+ trying to switch drives by checking for blocking before unblocking.
+ Fixes bug #906.
+kes Cancel storage daemon in all cases where FD reports error. This
+ should fix virtually all cases of bug #920.
+12Sep07
+kes Fix error message that was clobbered when Dir tells SD it does not
+ have write permission on Volume. This should fix a minor point
+ in bug #942, but not the main problem.
+kes Add code to cancel job in SD if FD connection fails. This should
+ fix bug #920.
+kes Add code in FD exit to prevent loops and a crash on FreeBSD.
+kes Fix migration code to get correct Volume name with multiple volumes
+ by skipping |. Fixes bug #936.
+kes Implement patch supplied by Landon to fix bug #944 where using
+ TLS with bconsole uses 99+% of the CPU.
+kes Note, you need GTK >= 2.10 to be able to link the Tray Monitor
+ program.
+kes Move patches into patches directory.
+11Sep07
+ebl Fix bug #946 about "bacula-dir -t" which doesn't work
+ as expected.
+09Sep07
+ebl Using "m" in bconsole will show messages like before,
+ and not memory usage.
+
+Release Version 2.2.3
+kes Note, you need GTK >= 2.10 to be able to link the Tray Monitor
+ program.
+09Sep07
+kes Fix bug #935, and probably also bug #903 where files were not
+ restored. MediaId was not properly set in JobMedia record after
+ a Volume change.
+07Sep07
+kes Add ./configure search in qwt-qt4 for qwt package
+kes Apply Martin Simmons patch that should turn off the new API usage
+ when batch insert is turned off allowing building on older
+ PostgreSQLs.
+
Release Version 2.2.2
04Sep07
ebl Detect if new PosgreSQL batch insert API is present.
- Release Notes for Bacula 2.2.3
+ Release Notes for Bacula 2.2.4
Bacula code: Total files = 520 Total lines = 195,550 (*.h *.c *.in)
82 new files, 41,221 new lines of code, 208,380 lines of change from 2.0.3
not have to upgrade all your File daemons when you upgrade. There is
no database upgrade needed from version 2.0.x to 2.2.0.
+Version 2.2.4 is a minor bug fix release to version 2.2.3
+- Possible fix for authorization problems bug #953.
+- Possible fix for bug #908.
+- Add waits to multiple exit detection code to try to force pid
+ file to always be deleted.
+- Restore good dev.tar.gz to rescue set appropriate binary property.
+ This fixes bug #950.
+- Fix seg fault in error exit of acquire_for_read after unsuccessfully
+ trying to switch drives by checking for blocking before unblocking.
+ Fixes bug #906.
+- Cancel storage daemon in all cases where FD reports error. This
+ should fix virtually all cases of bug #920 and will ensure that Devices
+ are released as soon as possible.
+- Fix error message that was clobbered when Dir tells SD it does not
+ have write permission on Volume. This should fix a minor point
+ in bug #942, but not the main problem.
+- Fix migration code to get correct Volume name with multiple volumes
+ by skipping |. Fixes bug #936.
+- Implement patch supplied by Landon to fix bug #944 where using
+ TLS with bconsole uses 99+% of the CPU.
+- Fix bug #946 about "bacula-dir -t" which doesn't work
+ as expected.
+- Using "m" in bconsole will show messages as in prior versions
+ and not memory usage.
+
+- Note, you need GTK >= 2.10 to be able to link the Tray Monitor
+ program.
+
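The "multiple exit detection" item in the list above refers to a re-entry guard in the daemons' terminate routines. The following is a minimal sketch of that pattern, not the Bacula code itself: `ExitGuard`, `TerminateAction`, and `terminate_once` are hypothetical names, and the sketch returns a value where the real routines call `bmicrosleep(2, 0)` and then `exit(1)`.

```cpp
// Hypothetical stand-ins for illustration only.
enum TerminateAction { RAN_CLEANUP, SECOND_ENTRY };

struct ExitGuard {
    bool already_here = false;  // set once the first caller starts cleanup
    int cleanup_runs = 0;       // how many times cleanup actually ran
    int yields = 0;             // how many re-entries were turned away
};

// The first caller runs the cleanup; any later caller is turned away.
// In the patched daemons the later caller sleeps briefly before exiting,
// which gives the first caller time to finish (for example, to delete
// the pid file).
TerminateAction terminate_once(ExitGuard &g)
{
    if (g.already_here) {
        g.yields++;             // real code: bmicrosleep(2, 0); exit(1);
        return SECOND_ENTRY;
    }
    g.already_here = true;
    g.cleanup_runs++;           // stands in for writing state, deleting pid file
    return RAN_CLEANUP;
}
```

As in the source, the flag is a plain `bool` with no locking; the sketch only shows the control flow.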
Version 2.2.3 is a critical bug fix release to version 2.2.2
- Fix bug #935, and possibly also bug #903 where files were not
restored. MediaId was not properly set in JobMedia record after
set_jcr_job_status(jcr, JS_ErrorTerminated);
Dmsg1(400, "wait for sd. use=%d\n", jcr->use_count());
/* Cancel SD */
- if (jcr->store_bsock) {
- jcr->store_bsock->fsend("cancel Job=%s\n", jcr->Job);
- }
+ cancel_storage_daemon_job(jcr);
wait_for_storage_daemon_termination(jcr);
Dmsg1(400, "after wait for sd. use=%d\n", jcr->use_count());
return false;
}
bnet_sig(fd, BNET_TERMINATE); /* tell Client we are terminating */
+ /* Force cancel in SD if failing */
+ if (job_canceled(jcr) || !fd_ok) {
+ cancel_storage_daemon_job(jcr);
+ }
+
/* Note, the SD stores in jcr->JobFiles/ReadBytes/JobBytes/Errors */
wait_for_storage_daemon_termination(jcr);
Jmsg((JCR *)NULL, M_ERROR_TERM, 0, _("Please correct configuration file: %s\n"), configfile);
}
- if (background) {
- daemon_start();
- init_stack_dump(); /* grab new pid */
+ if (!test_config) { /* we don't need to do this block in test mode */
+ if (background) {
+ daemon_start();
+ init_stack_dump(); /* grab new pid */
+ }
+
+ /* Create pid must come after we are a daemon -- so we have our final pid */
+ create_pid_file(director->pid_directory, "bacula-dir", get_first_port_host_order(director->DIRaddrs));
+ read_state_file(director->working_directory, "bacula-dir", get_first_port_host_order(director->DIRaddrs));
}
- /* Create pid must come after we are a daemon -- so we have our final pid */
- create_pid_file(director->pid_directory, "bacula-dir", get_first_port_host_order(director->DIRaddrs));
- read_state_file(director->working_directory, "bacula-dir", get_first_port_host_order(director->DIRaddrs));
-
drop(uid, gid); /* reduce privileges if requested */
if (!check_catalog()) {
static bool already_here = false;
if (already_here) { /* avoid recursive temination problems */
+ bmicrosleep(2, 0); /* yield */
exit(1);
}
already_here = true;
generate_daemon_event(NULL, "Exit");
write_state_file(director->working_directory, "bacula-dir", get_first_port_host_order(director->DIRaddrs));
delete_pid_file(director->pid_directory, "bacula-dir", get_first_port_host_order(director->DIRaddrs));
-// signal(SIGCHLD, SIG_IGN); /* don't worry about children now */
term_scheduler();
term_job_server();
if (runjob) {
return true;
}
+void cancel_storage_daemon_job(JCR *jcr)
+{
+ UAContext *ua = new_ua_context(jcr);
+ JCR *control_jcr = new_control_jcr("*JobCancel*", JT_SYSTEM);
+ BSOCK *sd;
+
+ ua->jcr = control_jcr;
+ if (jcr->store_bsock) {
+ if (!ua->jcr->wstorage) {
+ if (jcr->rstorage) {
+ copy_wstorage(ua->jcr, jcr->rstorage, _("Job resource"));
+ } else {
+ copy_wstorage(ua->jcr, jcr->wstorage, _("Job resource"));
+ }
+ } else {
+ USTORE store;
+ if (jcr->rstorage) {
+ store.store = jcr->rstore;
+ } else {
+ store.store = jcr->wstore;
+ }
+ set_wstorage(ua->jcr, &store);
+ }
+
+ if (!connect_to_storage_daemon(ua->jcr, 10, SDConnectTimeout, 1)) {
+ goto bail_out;
+ }
+ Dmsg0(200, "Connected to storage daemon\n");
+ sd = ua->jcr->store_bsock;
+ sd->fsend("cancel Job=%s\n", jcr->Job);
+ while (sd->recv() >= 0) {
+ }
+ sd->signal(BNET_TERMINATE);
+ sd->close();
+ ua->jcr->store_bsock = NULL;
+ }
+bail_out:
+ free_jcr(control_jcr);
+ free_ua_context(ua);
+}
static void job_monitor_destructor(watchdog_t *self)
{
extern bool create_restore_bootstrap_file(JCR *jcr);
extern void dird_free_jcr(JCR *jcr);
extern void dird_free_jcr_pointers(JCR *jcr);
+extern void cancel_storage_daemon_job(JCR *jcr);
/* migration.c */
extern bool do_migration(JCR *jcr);
jcr->unlink_bsr = false;
}
+ if (job_canceled(jcr)) {
+ cancel_storage_daemon_job(jcr);
+ }
+
switch (TermCode) {
case JS_Terminated:
if (jcr->ExpectedFiles > jcr->jr.JobFiles) {
{ NT_("list"), list_cmd, _("list [pools | jobs | jobtotals | media <pool=pool-name> | files <jobid=nn>]; from catalog")},
{ NT_("label"), label_cmd, _("label a tape")},
{ NT_("llist"), llist_cmd, _("full or long list like list command")},
- { NT_("memory"), memory_cmd, _("print current memory usage")},
{ NT_("messages"), messagescmd, _("messages")},
+ { NT_("memory"), memory_cmd, _("print current memory usage")},
{ NT_("mount"), mount_cmd, _("mount <storage-name>")},
{ NT_("prune"), prunecmd, _("prune expired records from catalog")},
{ NT_("purge"), purgecmd, _("purge records from catalog")},
*/
set_jcr_job_status(jcr, JS_Blocked);
if (!connect_to_file_daemon(jcr, 10, FDConnectTimeout, 1)) {
- return false;
+ goto bail_out;
}
set_jcr_job_status(jcr, JS_Running);
Dmsg0(30, ">filed: Send include list\n");
if (!send_include_list(jcr)) {
- return false;
+ goto bail_out;
}
Dmsg0(30, ">filed: Send exclude list\n");
if (!send_exclude_list(jcr)) {
- return false;
+ goto bail_out;
}
/*
}
bnet_fsend(fd, storaddr, jcr->rstore->address, jcr->rstore->SDDport);
if (!response(jcr, fd, OKstore, "Storage", DISPLAY_ERROR)) {
- return false;
+ goto bail_out;
}
/*
*/
if (!send_bootstrap_file(jcr, fd) ||
!response(jcr, fd, OKbootstrap, "Bootstrap", DISPLAY_ERROR)) {
- return false;
+ goto bail_out;
}
if (!jcr->RestoreBootstrap) {
Jmsg0(jcr, M_FATAL, 0, _("Deprecated feature ... use bootstrap.\n"));
- return false;
+ goto bail_out;
}
level = "volume";
default:
Jmsg2(jcr, M_FATAL, 0, _("Unimplemented Verify level %d(%c)\n"), jcr->JobLevel,
jcr->JobLevel);
- return false;
+ goto bail_out;
}
if (!send_runscripts_commands(jcr)) {
- return false;
+ goto bail_out;
}
/*
* Send verify command/level to File daemon
*/
- bnet_fsend(fd, verifycmd, level);
+ fd->fsend(verifycmd, level);
if (!response(jcr, fd, OKverify, "Verify", DISPLAY_ERROR)) {
- return false;
+ goto bail_out;
}
/*
default:
Jmsg1(jcr, M_FATAL, 0, _("Unimplemented verify level %d\n"), jcr->JobLevel);
- return false;
+ goto bail_out;
}
stat = wait_for_job_termination(jcr);
verify_cleanup(jcr, stat);
return true;
}
+
+bail_out:
+ verify_cleanup(jcr, JS_ErrorTerminated);
return false;
}
update_job_end(jcr, TermCode);
+ if (job_canceled(jcr)) {
+ cancel_storage_daemon_job(jcr);
+ }
+
if (jcr->unlink_bsr && jcr->RestoreBootstrap) {
unlink(jcr->RestoreBootstrap);
jcr->unlink_bsr = false;
void terminate_filed(int sig)
{
+ static bool already_here = false;
+
+ if (already_here) {
+ bmicrosleep(2, 0); /* yield */
+ exit(1); /* prevent loops */
+ }
+ already_here = true;
+ stop_watchdog();
+
bnet_stop_thread_server(server_tid);
generate_daemon_event(NULL, "Exit");
write_state_file(me->working_directory, "bacula-fd", get_first_port_host_order(me->FDaddrs));
if (configfile != NULL) {
free(configfile);
}
+
if (debug_level > 0) {
print_memory_pool_stats();
}
- free_config_resources();
term_msg();
- stop_watchdog();
+ free_config_resources();
cleanup_crypto();
close_memory_pool(); /* release free memory in pool */
sm_dump(false); /* dump orphaned buffers */
tv.tv_sec = 10;
tv.tv_usec = 0;
/* Block until we can read */
- select(fdmax, &fdset, NULL, &fdset, &tv);
+ select(fdmax, &fdset, NULL, NULL, &tv);
break;
case SSL_ERROR_WANT_WRITE:
/* If we timeout of a select, this will be unset */
tv.tv_sec = 10;
tv.tv_usec = 0;
/* Block until we can write */
- select(fdmax, NULL, &fdset, &fdset, &tv);
+ select(fdmax, NULL, &fdset, NULL, &tv);
break;
default:
/* Socket Error Occured */
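The two `select()` changes above stop passing the same descriptor set as the exception set. Below is a minimal POSIX sketch of waiting for readability with only the read set populated. `wait_readable` is a hypothetical helper, not part of Bacula, and it computes the first argument as `fd + 1` rather than using the `fdmax` variable from the source.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Wait until fd is readable or the timeout expires.
// Returns 1 if readable, 0 on timeout, -1 on error.
int wait_readable(int fd, long timeout_ms)
{
    fd_set fdset;
    FD_ZERO(&fdset);
    FD_SET(fd, &fdset);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    // Only the read set is passed; write and exception sets are null.
    int n = select(fd + 1, &fdset, nullptr, nullptr, &tv);
    if (n < 0) {
        return -1;
    }
    return n > 0 ? 1 : 0;
}
```

The write case is symmetric: pass the set as the third argument and leave the second null.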
Dmsg2(50, "Dec reserve=%d dev=%s\n", dev->reserved_device, dev->print_name());
dcr->reserved_device = false;
}
- dev->dunblock(DEV_LOCKED);
+ /*
+ * Normally we are blocked, but in at least one error case above
+ * we are not blocked because we unsuccessfully tried changing
+ * devices.
+ */
+ if (dev->is_blocked()) {
+ dev->dunblock(DEV_LOCKED);
+ }
Dmsg1(950, "jcr->dcr=%p\n", jcr->dcr);
return ok;
}
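The hunk above guards the unblock call with `is_blocked()`. A minimal sketch of the same guard on a toy device follows; `Device` here is a hypothetical stand-in with only the state needed to show the check, and `release_on_exit` is an illustrative name.

```cpp
// Toy device: tracks only the blocked state and the number of unblock calls.
struct Device {
    bool blocked = false;
    int unblock_calls = 0;

    bool is_blocked() const { return blocked; }
    void block() { blocked = true; }
    void dunblock() { unblock_calls++; blocked = false; }
};

// Unblock only when the device is actually blocked, so an error path that
// never blocked the device does not unblock it.
// Returns true if an unblock was performed.
bool release_on_exit(Device &dev)
{
    if (dev.is_blocked()) {
        dev.dunblock();
        return true;
    }
    return false;
}
```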
int i;
bool found, quit;
int bnet_stat = 0;
- char name[MAX_NAME_LENGTH];
+ char name[500];
if (bs->recv() <= 0) {
Emsg0(M_ERROR, 0, _("Connection request failed.\n"));
/*
* Do a sanity check on the message received
*/
- if (bs->msglen < 25 || bs->msglen > (int)sizeof(name)-25) {
+ if (bs->msglen < 25 || bs->msglen > (int)sizeof(name)) {
Emsg1(M_ERROR, 0, _("Invalid connection. Len=%d\n"), bs->msglen);
bnet_close(bs);
return NULL;
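The patched sanity check above accepts a message length between 25 and the size of the 500-byte `name` buffer. A minimal sketch of that bounds test as a standalone function follows; `hello_length_ok` and the two constants are hypothetical names that mirror the literal values in the hunk.

```cpp
#include <cstddef>

const int MIN_HELLO_LEN = 25;            // lower bound used in the hunk
const std::size_t NAME_BUF_SIZE = 500;   // size of the enlarged name buffer

// True when msglen falls inside the range the patched check accepts.
bool hello_length_ok(int msglen)
{
    return msglen >= MIN_HELLO_LEN && msglen <= (int)NAME_BUF_SIZE;
}
```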
if (!(jcr=get_jcr_by_full_name(Job))) {
bnet_fsend(dir, _("3904 Job %s not found.\n"), Job);
} else {
- jcr->lock();
oldStatus = jcr->JobStatus;
set_jcr_job_status(jcr, JS_Canceled);
if (!jcr->authenticated && oldStatus == JS_WaitFD) {
pthread_cond_signal(&jcr->job_start_wait); /* wake waiting thread */
}
- jcr->unlock();
if (jcr->file_bsock) {
bnet_sig(jcr->file_bsock, BNET_TERMINATE);
} else {
bstrncpy(VolumeName, dcr->VolumeName, sizeof(VolumeName));
bstrncpy(dcr->VolumeName, dev->VolHdr.VolumeName, sizeof(dcr->VolumeName));
if (!dir_get_volume_info(dcr, GET_VOL_INFO_FOR_WRITE)) {
+ POOL_MEM vol_info_msg;
+ pm_strcpy(vol_info_msg, jcr->dir_bsock->msg); /* save error message */
/* Restore desired volume name, note device info out of sync */
/* This gets the info regardless of the Pool */
bstrncpy(dcr->VolumeName, dev->VolHdr.VolumeName, sizeof(dcr->VolumeName));
" Current Volume \"%s\" not acceptable because:\n"
" %s"),
dcrVolCatInfo.VolCatName, dev->VolHdr.VolumeName,
- jcr->dir_bsock->msg);
+ vol_info_msg.c_str());
ask = true;
/* Restore saved DCR before continuing */
bstrncpy(dcr->VolumeName, VolumeName, sizeof(dcr->VolumeName));
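The hunk above copies the Director's error text into `vol_info_msg` before a later request reuses the socket's message buffer. A minimal sketch of that save-before-reuse pattern follows, with `std::string` standing in for `POOL_MEM`; `Channel`, `describe_failure`, and the message strings are hypothetical and chosen only for illustration.

```cpp
#include <string>

// Stand-in for a socket whose single message buffer is overwritten
// by every reply.
struct Channel {
    std::string msg;
    void receive(const std::string &reply) { msg = reply; }
};

// Build an error report from the first reply even though a second reply
// arrives before the report is formatted.
std::string describe_failure(Channel &ch)
{
    ch.receive("no write permission");   // first request fails
    std::string saved = ch.msg;          // save before the buffer is reused
    ch.receive("second reply");          // a later request overwrites ch.msg
    return "not acceptable because: " + saved;
}
```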
JCR *jcr;
if (in_here) { /* prevent loops */
+ bmicrosleep(2, 0); /* yield */
exit(1);
}
in_here = true;
if (debug_level > 10) {
print_memory_pool_stats();
}
- term_reservations_lock();
term_msg();
cleanup_crypto();
free_volume_list();
+ term_reservations_lock();
close_memory_pool();
sm_dump(false); /* dump orphaned buffers */
Technical notes on version 2.2
General:
+Release Version 2.2.4
+14Sep07
+kes Increase size of name string when FD making connection to SD.
+ May fix bug #953.
+13Sep07
+kes Add code to try to fix bug #908.
+kes Add waits to multiple exit detection code to try to force pid
+ file to always be deleted.
+kes Restore good dev.tar.gz to rescue set appropriate binary property.
+ This fixes bug #950.
+kes Fix seg fault in error exit of acquire_for_read after unsuccessfully
+ trying to switch drives by checking for blocking before unblocking.
+ Fixes bug #906.
+kes Cancel storage daemon in all cases where FD reports error. This
+ should fix virtually all cases of bug #920.
12Sep07
+kes Fix error message that was clobbered when Dir tells SD it does not
+ have write permission on Volume. This should fix a minor point
+ in bug #942, but not the main problem.
+kes Add code to cancel job in SD if FD connection fails. This should
+ fix bug #920.
+kes Add code in FD exit to prevent loops and a crash on FreeBSD.
kes Fix migration code to get correct Volume name with multiple volumes
by skipping |. Fixes bug #936.
kes Implement patch supplied by Landon to fix bug #944 where using
TLS with bconsole uses 99+% of the CPU.
+kes Note, you need GTK >= 2.10 to be able to link the Tray Monitor
+ program.
kes Move patches into patches directory.
+11Sep07
+ebl Fix bug #946 about "bacula-dir -t" which doesn't work
+ as expected.
+09Sep07
+ebl Using "m" in bconsole will show messages like before,
+ and not memory usage.
Release Version 2.2.3
kes Note, you need GTK >= 2.10 to be able to link the Tray Monitor