4 Documentation to do: (a little bit at a time)
5 - Document running a test version.
6 - Make sure restore options are documented
7 - Document query file format.
9 Testing to do: (painful)
10 - that restore options work in FD.
11 - that mod of restore options works.
12 - that console command line options work
13 - blocksize recognition code.
18 - Make BSR accept count (total files to be restored).
19 - Make BSR return next_block when it knows record is not
20 in block, done when count is reached, and possibly other
21 optimizations. I.e. add a state word.
22 - Continue improving the restore process (handling
23 of tapes, efficiency improvements e.g. use FSF to
24 position the tape, ...)
25 - Add code to fast seek to proper place on tape/file
26 when doing Restore. If it doesn't work, try linear
28 - Add code to reject whole blocks if not wanted on restore.
30 - Figure out how to allow multiple simultaneous file Volumes on
32 - Start working on Base jobs.
33 - Implement FileOptions (see end of this document)
34 - Replace popen() and pclose() -- fail safe and timeout, no SIG dep.
35 - Ensure that restore of differential jobs works (check SQL).
36 - Make sure the MaxVolFiles is fully implemented in SD
37 - Flush all the daemon messages at the end of every job.
38 - Check if both CatalogFiles and UseCatalog are set to SD.
39 - Check if we can bump Bacula FD priority in Win2000
40 - Make bcopy read through bad tape records.
41 - Need return status on read_cb() from read_records(). Need multiple
42 records -- one per Job, maybe a JCR or some other structure with
44 - Think about how to make Bacula work better with File archives.
46 - Work more on how to do a Bacula restore beginning with
47 just a Bacula tape and a boot floppy (bare metal recovery).
48 - Try bare metal Windows restore
49 - Fix read_record to handle multiple sessions.
50 - Program files (i.e. execute a program to read/write files).
51 Pass read date of last backup, size of file last time.
52 - Put system type returned by FD into catalog.
53 - Possibly add email to Watchdog if drive is unmounted too
54 long and a job is waiting on the drive.
55 - Strip trailing slashes from Include directory names in the FD.
56 - Use read_record.c in SD code.
57 - Why don't we get an error message from Win32 FD when bootstrap
58 file cannot be created for restore command?
59 - When Marking a file in Restore that is a hard link, also
60 mark the link so that the data will be reloaded.
61 - Restore program that errors in SD due to no tape reports
62 OK incorrectly in output.
63 - After unmount, if restore job started, ask to mount.
64 - Fix db_get_fileset in cats/sql_get.c for multiple records.
65 - Fix catalog filename truncation in sql_get and sql_create. Use
66 only a single filename split routine.
67 - Make Restore report an error if FD or SD term codes are not OK.
68 - Convert all %x substitution variables, which are hard to remember
69 and read, to %(variable-name). Idea from TMDA.
70 - Add JobLevel in FD status (but make sure it is defined).
71 - Make Pool resource handle Counter resources.
72 - Remove NextId for SQLite. Optimize.
73 - Fix gethostbyname() to use gethostbyname_r()
74 - Implement ./configure --with-client-only
75 - Strip trailing / from Include
76 - Move all SQL statements into a single location.
77 - Cleanup db_update_media and db_update_pool
78 - Add UA rc and history files.
79 - put termcap (used by console) in ./configure and
80 allow --with-termcap-dir.
81 - Enhance time and size scanning routines.
82 - Fix Autoprune for Volumes to respect need for full save.
83 - DateWritten field on tape may be wrong.
84 - Fix Win32 config file definition name on /install
85 - No READLINE_SRC if found in alternate directory.
86 - Add Client FS/OS id (Linux, Win95/98, ...).
87 - Test a second language e.g. french.
88 - Compare tape to Client files (attributes, or attributes and data)
89 - Restore options (overwrite, overwrite if older,
90 overwrite if newer, never overwrite, ...)
91 - Restore to a particular time -- e.g. before date, after date.
92 - Make all database Ids 64 bit.
93 - Write an applet for Linux.
94 - Add estimate to Console commands
95 - Find solution to blank filename (i.e. path only) problem.
96 - Implement new daemon communications protocol.
97 - Remove PoolId from Job table, it exists in Media.
98 - Allow console commands to detach or run in background.
99 - Fix status delay on storage daemon during rewind.
100 - Add SD message variables to control operator wait time
101 - Maximum Operator Wait
102 - Minimum Message Interval
103 - Maximum Message Interval
104 - Send Operator message when cannot read tape label.
105 - Verify level=Volume (scan only), level=Data (compare of data to file).
106 Verify level=Catalog, level=InitCatalog
108 - Add keyword search to show command in Console.
109 - Fix Win2000 error with no messages during startup.
110 - Events : tape has more than xxx bytes.
111 - Restrict characters permitted in a Resource name.
112 - Complete code in Bacula Resources -- this will permit
113 reading a new config file at any time.
114 - Handle ctl-c in Console
115 - Implement LabelTemplate (at least first cut).
116 - Implement script driven addition of File daemon to config files.
118 - see setgroup and user for Bacula p4-5 of stunnel.c
119 - Implement new serialize subroutines
120 send(socket, "string", &Vol, "uint32", &i, NULL)
121 - Audit all UA commands to ensure that we always prompt where possible.
122 - If ./btape is called without /dev, assume argument is a Storage resource name.
123 - Put memory utilization in Status output of each daemon
124 if full status requested or if some level of debug on.
125 - Make database type selectable by .conf files i.e. at runtime
126 - gethostbyname failure in bnet_connect() continues
127 generating errors -- should stop.
128 - Add HOST to Volume label.
129 - Set flag for uname -a. Add to Volume label.
130 - Implement throttled work queue.
131 - Check for EOT at ENOSPC or EIO or ENXIO (unix Pc)
132 - Allow multiple Storage specifications (or multiple names on
133 a single Storage specification) in the Job record. Thus a job
134 can be backed up to a number of storage devices.
135 - Implement dump label to UA
136 - Copy volume using single drive.
137 - Concept of VolumeSet during restore which is a list
138 of Volume names needed.
139 - Restore files modified after date
140 - Restore file modified before date
141 - Emergency restore info:
143 - Backup working directory
145 - Restore -- do nothing but show what would happen
146 - SET LD_RUN_PATH=$HOME/mysql/lib/mysql
147 - Implement Restore FileSet=
148 - Create a protocol.h and protocol.c where all protocol messages
150 - If SD cannot open a drive, make it periodically retry.
151 - Remove duplicate fields from jcr (e.g. jcr.level and jcr.jr.Level, ...).
152 - Time out a job or terminate if link goes down, or reopen link and query.
153 - Fill all fields in Vol/Job Header -- ensure that everything
154 needed is written to tape. Think about restore to Catalog
155 from tape. Client record needs improving.
156 - Find general solution for sscanf size problems (as well
157 as sprintf). Do at run time?
158 - Concept of precious tapes (cannot be reused).
159 - Make bcopy copy with a single tape drive.
160 - Permit changing ownership during restore.
162 - Restore should get Device and Pool information from
163 job record rather than from config.
164 - Autolabel should be specified by DR instead of SD.
165 - Find out how to get the system tape block limits, e.g.:
166 Apr 22 21:22:10 polymatou kernel: st1: Block limits 1 - 245760 bytes.
167 Apr 22 21:22:10 polymatou kernel: st0: Block limits 2 - 16777214 bytes.
170 - AutoScan (check checksum of tape)
171 - Format command = "format /dev/nst0"
175 - Seek resolution (usually corresponds to buffer size)
176 - EODErrorCode=ENOSPC or code
177 - Partial Read error code
178 - Partial write error code
179 - Nonformatted read error
180 - Nonformatted write error
181 - WriteProtected error
185 - IgnoreCloseErrors=yes
195 - FD sends unsaved file list to Director at end of job.
196 - Write a Storage daemon that uses pipes and
197 standard Unix programs to write to the tape.
199 - Need something that monitors the JCR queue and
200 times out jobs by asking the daemons where they are.
202 - Enhance Jmsg code to permit buffering and saving to disk.
203 - device driver = "xxxx" for drives.
204 - restart: paranoid: read label, fsf to
205 eom, read append block, and go
206 super-paranoid: read label, read all files
207 in between, read append block, and go
208 verify: backspace, read append block, and go
209 permissive: same as above but frees drive
210 if tape is not valid.
212 - Ensure that /dev/null works
213 - File daemon should build list of files skipped, and then
214 at end of save retry and report any errors.
215 - Need report class for messages. Perhaps
216 report resource where report=group of messages
217 - enhance scan_attrib and rename scan_jobtype, and
218 fill in code for "since" option
219 - Need to save contents of FileSet to tape?
220 - Director needs a time after which the report status is sent
221 anyway -- or better yet, a retry time for the job.
222 Don't reschedule a job if previous incarnation is still running.
223 - Figure out how to save the catalog (possibly a special FileSet).
224 - Figure out how to restore the catalog.
225 - Some way to automatically backup everything is needed????
226 - Need a structure for pending actions:
228 - termination status (part of buffered msgs?)
229 - Concept of grouping Storage devices and job can use
230 any of a number of devices
232 Read, Write, Clean, Delete
233 - Login to Bacula; Bacula users with different permissions:
234 owner, group, user, quotas
235 - Store info on each file system type (probably in the job header on tape).
236 This could be the output of df; or perhaps some sort of /etc/mtab record.
239 - Implement FSM (File System Modules).
240 - Identify unchanged or "system" files and save them to a
241 special tape thus removing them from the standard
242 backup FileSet -- BASE backup.
243 - Turn virtually all sprintfs into snprintfs.
244 - Heartbeat between daemons.
245 - Audit M_ error codes to ensure they are correct and consistent.
246 - Add variable break characters to lex analyzer.
247 Either a bit mask or a string of chars so that
248 the caller can change the break characters.
249 - Make a single T_BREAK to replace T_COMMA, etc.
250 - Ensure that File daemon and Storage daemon can
251 continue a save if the Director goes down (this
252 is NOT currently the case). Must detect socket error,
253 buffer messages for later.
254 - Enhance time/duration input to allow multiple qualifiers e.g. 3d2h
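The multiple-qualifier duration input in the last item (e.g. 3d2h) can be sketched as follows. This is only an illustration in Python, not Bacula's scanning code: the function name parse_duration and the set of single-letter qualifiers (s, m, h, d, w) are assumptions made for the example.

```python
import re

# Hypothetical unit table (seconds per qualifier); not Bacula's actual set.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

# One token is a run of digits followed by a single qualifier letter.
_TOKEN = re.compile(r"(\d+)([smhdw])")


def parse_duration(text):
    """Return total seconds for input such as '3d2h'; raise ValueError on junk."""
    text = text.strip().lower()
    pos = 0
    total = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"bad duration at offset {pos}: {text!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total
```

With this table, parse_duration("3d2h") adds three days and two hours and returns the sum in seconds. Rejecting unknown qualifiers outright, rather than ignoring them, is a design choice of the sketch.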
258 Bacula Projects Roadmap
260 last update 27 November 2002
262 Item 1: Multiple simultaneous Jobs. (done)
265 What: Permit multiple simultaneous jobs in Bacula.
267 Why: An enterprise level solution needs to go fast without the
268 need for the system administrator to carefully tweak
269 timing. Based on the benchmarks, during a full
270 backup, NetWorker typically hit 10 times the bandwidth to
271 the tape compared to Bacula. This is probably due largely to
272 running parallel jobs and multi-threaded filling of buffers
273 and writing them to tape. This should also make things work
274 better when you have a mix of fast and slow machines backing
277 Notes: Bacula was designed to run multiple simultaneous jobs. Thus
278 implementing this is a matter of some small cleanups and
282 Item 2: Make the Storage daemon use intermediate file storage to buffer data.
283 Deferred -- not necessary yet.
285 What: If data is coming into the SD too fast, buffer it to
286 disk if the user has configured this option.
288 Why: This would be nice, especially if it more or less falls out
289 when implementing (1) above. If not, it probably should not
290 be given a high priority because fundamentally the backup time
291 is limited by the tape bandwidth. Even though you may finish a
292 client job quicker by spilling to disk, you still have to
293 eventually get it onto tape. If intermediate disk buffering
294 allows us to improve write bandwidth to tape, it may make
297 Notes: Whether or not this is implemented will depend upon performance
298 testing after item 1 is implemented.
301 Item 3: Write the bscan program -- also write a bcopy program.
304 What: Write a program that reads a Bacula tape and puts all the
305 appropriate data into the catalog. This allows recovery
306 from a tape that is no longer in the database, or it allows
307 re-creation of a database if lost.
309 Why: This is a fundamental robustness and disaster recovery tool
310 which will increase the comfort level of a sysadmin
311 considering adopting Bacula.
313 Notes: A skeleton of this program already exists, but much work
314 needs to be done. Implementing this will also make apparent
315 any deficiencies in the current Bacula tape format.
318 Item 4: Implement Base jobs.
320 What: A base job is sort of like a Full save except that you
321 will want the FileSet to contain only files that are unlikely
322 to change in the future (i.e. a snapshot of most of your
323 system after installing it). After the base job has been run,
324 when you are doing a Full save, you can specify to exclude
325 all files saved by the base job that have not been modified.
327 Why: This is something none of the competition does, as far as we know
328 (except BackupPC, which is a Perl program that saves to disk
329 only). It is a big win for the user: it makes Bacula stand out
330 as offering a unique optimization that immediately saves time
333 Notes: Big savings in tape usage. Will require more resources because
334 the DIR must send FD a list of files/attribs, and the FD must
335 search the list and compare it for each file to be saved.
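The per-file comparison described in these notes might look like the sketch below. It assumes the list sent by the DIR is loaded into a dictionary keyed by path, and that a file counts as unchanged when its size and modification time both match; the FileAttr record and the files_to_save name are hypothetical, and real attributes would carry more fields.

```python
from collections import namedtuple

# Hypothetical attribute record; real Bacula attributes carry more fields.
FileAttr = namedtuple("FileAttr", ["size", "mtime"])


def files_to_save(current, base):
    """Return sorted paths that are new or changed relative to the base job.

    current and base both map path -> FileAttr. A dict gives constant-time
    lookup per file, which matters because the check runs for every file.
    """
    return sorted(path for path, attr in current.items()
                  if base.get(path) != attr)
```

A file present in the base list with identical attributes is skipped; anything missing from the base list, or with a different size or mtime, is saved as usual.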
338 Item 5: Implement Label templates
340 What: This is a mechanism whereby Bacula can automatically create
341 a tape label for new tapes according to a detailed specification
342 provided by the user.
344 Why: It is a major convenience item for folks who use automated label
347 Notes: Bacula already has a working form of automatic tape label
348 creation, but it is very crude. The design for the complete
349 tape labeling project is already documented in the manual.
352 Item 6: Write a regression script.
355 What: This is an automatic script that runs and tests as many features
356 of Bacula as possible. The output is compared to previous
357 versions of Bacula and any differences are reported.
359 Why: This is an enormous help in preventing introduction of new
360 errors in parts of the program that already work correctly.
362 Notes: This probably should be ranked higher, even though it's something the typical
363 user doesn't see. Depending on how it's implemented, it may
364 make sense to defer it until the archival tape format and
365 user interface mature.
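The comparison step of such a regression script can be sketched with Python's standard difflib module. The function name report_differences is hypothetical, and the sketch assumes both outputs have already been captured as text; anything that legitimately varies between runs (dates, job ids) would have to be normalized first.

```python
import difflib


def report_differences(previous, current):
    """Return unified-diff lines between two captured outputs (empty if same)."""
    return list(difflib.unified_diff(
        previous.splitlines(), current.splitlines(),
        fromfile="previous", tofile="current", lineterm=""))
```

An empty list means the new version produced the same output as the previous one; any other result is what the script would report.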
368 Item 7: GUI for interactive restore
369 Item 8: GUI for interactive backup
371 What: The current interactive restore is implemented with a tty
372 interface. It would be much nicer to be able to "see" the
373 list of files backed up in typical GUI tree format.
374 The same mechanism could also be used for creating
375 ad-hoc backup FileSets (item 8).
377 Why: Ease of use -- especially for the end user.
379 Notes: Rather than implementing in Gtk, we probably should go directly
380 for a Browser implementation, even if doing so meant the
381 capability wouldn't be available until much later. Not only
382 is there the question of Windows sites, most
383 Solaris/HP/IRIX, etc, shops can't currently run Gtk programs
384 without installing lots of stuff admins are very wary about.
385 Real sysadmins will always use the command line anyway, and
386 the user who's doing an interactive restore or backup of his
387 own files will in most cases be on a Windows machine running
391 Item 9: Add SSL to daemon communications.
393 What: This provides for secure communications between the daemons.
395 Why: This would allow doing backup across the Internet without
396 privacy concerns (or with much less concern).
398 Notes: The vast majority of near term potential users will be backing up
399 a single site over a LAN and, correctly or not, they probably
400 won't be concerned with security, at least not enough to go to
401 the trouble to set up keys, etc. to screw things down. We suspect
402 that many users genuinely interested in multi-site backup
403 already run some form of VPN software in their internetwork
404 connections, and are willing to delegate security to that layer.
407 Item 10: Define definitive tape format.
410 What: Define the definitive tape format that will not change
411 for the next millennium.
413 Why: Stability, security.
415 Notes: See notes for item 11 below.
418 Item 11: New daemon communication protocol.
420 What: The current daemon to daemon protocol is basically an ASCII
421 printf() and sending the buffer. On the receiving end, the
422 buffer is sscanf()ed to unpack it. The new scheme would
423 be a binary format that allows quick packing and unpacking
424 of any data type with named fields.
426 Why: Using binary packing would be faster. Named fields will permit
427 error checking to ensure that what is sent is what the
428 receiver really wants.
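To make the idea of binary packing with named fields concrete, here is a small Python sketch. The wire layout, the type tags, and the function names pack_fields and unpack_fields are all invented for this illustration; nothing here describes an actual or planned Bacula protocol.

```python
import struct

# Hypothetical wire layout, not Bacula's protocol:
#   name_len (1 byte) | name | type tag (1 byte) | value
# uint32 values are 4 bytes in network byte order; strings are a 4-byte
# length followed by UTF-8 bytes.
T_UINT32, T_STRING = 1, 2


def pack_fields(fields):
    """Pack a list of (name, type_tag, value) tuples into bytes."""
    out = bytearray()
    for name, tag, value in fields:
        raw_name = name.encode("utf-8")
        out += struct.pack("!B", len(raw_name)) + raw_name
        out += struct.pack("!B", tag)
        if tag == T_UINT32:
            out += struct.pack("!I", value)
        elif tag == T_STRING:
            raw = value.encode("utf-8")
            out += struct.pack("!I", len(raw)) + raw
        else:
            raise ValueError(f"unknown type tag {tag}")
    return bytes(out)


def unpack_fields(data, expected):
    """Unpack bytes into a dict; fail if the names differ from expected."""
    result, pos = {}, 0
    while pos < len(data):
        (name_len,) = struct.unpack_from("!B", data, pos)
        pos += 1
        name = data[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (tag,) = struct.unpack_from("!B", data, pos)
        pos += 1
        if tag == T_UINT32:
            (value,) = struct.unpack_from("!I", data, pos)
            pos += 4
        elif tag == T_STRING:
            (size,) = struct.unpack_from("!I", data, pos)
            pos += 4
            value = data[pos:pos + size].decode("utf-8")
            pos += size
        else:
            raise ValueError(f"unknown type tag {tag}")
        result[name] = value
    if set(result) != set(expected):
        raise ValueError(f"field mismatch: got {sorted(result)}")
    return result
```

Because every value travels with its name, the receiver can reject a message whose fields are not the ones it asked for, which is the error checking the paragraph above refers to.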
430 Notes: These are internal improvements in the interest of the
431 long-term stability and evolution of the program. On the one
432 hand, the sooner they're done, the less code we have to rip
433 up when the time comes to install them. On the other hand, they
434 don't bring an immediately perceptible benefit to potential
435 users. Item 10 and possibly item 11 should be deferred until Bacula
436 is well established with a growing user community more or
437 less happy with the feature set. At that time, it will make a
438 good "next generation" upgrade in the interest of data
444 ====================================
449 Subject: File Backup Options
452 A few days ago, a Bacula user who is backing up to file volumes and
453 using compression asked if it was possible to suppress compressing
454 all .gz files since it was a waste of CPU time. Although Bacula
455 currently permits using different options (compression, ...) on
456 a directory by directory basis, it cannot do it on a file by
457 file basis, which is clearly what was desired.
459 Proposed Implementation:
460 To solve this problem, I propose the following:
462 - Add a new Director resource type called FileOptions.
464 - The FileOptions resource will have records for all
465 options that can currently be specified on the Include record
466 (in a FileSet). Examples below.
468 - The FileOptions resource will permit an exclude option as well
469 as a number of additional options.
471 - The heart of the FileOptions resource is the ability to
472 supply any number of ApplyTo records which specify POSIX
473 regular expressions. These ApplyTo regular expressions are
474 applied to the fully qualified filename (path and all). If
475 one matches, then the FileOptions will be used.
477 - When an ApplyTo specification matches an included file, the
478 options specified in the FileOptions resource will override
479 the default options specified on the Include record.
481 - Include records will be modified to permit referencing one or
482 more FileOptions resources. The FileOptions will be used
483 in the order listed on the Include record and the first
484 one that matches will be applied.
486 - Options (or specifications) currently supplied on the Include
487 record will be deprecated (i.e. removed in a later version a
488 year or so from now).
490 - The Exclude record will be deprecated as the same functionality
491 can be obtained by using an Exclude = yes in the FileOptions.
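The matching rules proposed above (ApplyTo expressions tested against the full path, FileOptions tried in the order listed, first match overrides the Include defaults) can be sketched as follows. The class and function names are hypothetical, and the patterns use Python's re syntax, which differs from POSIX regular expressions.

```python
import re


class FileOptions:
    """Hypothetical in-memory form of one FileOptions resource."""

    def __init__(self, name, options, apply_to):
        self.name = name
        self.options = options          # e.g. {"compression": "GZIP"}
        self.patterns = [re.compile(p) for p in apply_to]

    def matches(self, path):
        # Any single ApplyTo expression matching the full path is enough.
        return any(p.search(path) for p in self.patterns)


def select_options(path, include_defaults, file_options):
    """Return the options for path: defaults overridden by the first match."""
    for fo in file_options:             # order as listed on the Include record
        if fo.matches(path):
            merged = dict(include_defaults)
            merged.update(fo.options)
            return merged
    return dict(include_defaults)
```

For instance, a resource with patterns such as r"\.gz$" and r"\.Z$" listed ahead of a catch-all resource would win for compressed files, while every other file would fall through to the catch-all.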
494 The following records can appear in the FileOptions resource. An
495 asterisk preceding the name indicates a feature not currently
499 - Compression= (GZIP, ...)
500 - Signature= (MD5, SHA1, ...)
502 - OneFs= (yes/no) - remain on one filesystem
503 - Recurse= (yes/no) - recurse into subdirectories
504 - Sparse= (yes/no) - do sparse file backup
505 - *Exclude= (yes/no) - exclude file from being saved
506 - *Reader= (filename) - external read (backup) program
509 - verify= (ipnougsamc5) - verify options
512 - replace= (always/ifnewer/ifolder/never) - replace options currently
514 - *Writer= (filename) - external write (restore) program
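As an illustration of the replace= choices listed above, a decision function might look like the sketch below. It assumes that ifnewer means "replace only when the backed-up copy is newer than the file on disk" and ifolder the reverse; that reading, like the should_replace name and the use of modification times, is an assumption of the example.

```python
def should_replace(policy, backup_mtime, existing_mtime):
    """Decide whether a restored file may overwrite an existing one.

    existing_mtime is None when no file exists at the destination.
    """
    if existing_mtime is None:
        return True                     # nothing to overwrite
    if policy == "always":
        return True
    if policy == "never":
        return False
    if policy == "ifnewer":
        return backup_mtime > existing_mtime
    if policy == "ifolder":
        return backup_mtime < existing_mtime
    raise ValueError(f"unknown replace policy: {policy!r}")
```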
518 Currently options specifying compression, MD5 signatures, recursion,
519 ... of a FileSet are supplied on the Include record. These will now
520 all be collected into a FileOptions resource, which will be
521 specified on the Include in place of the options. Multiple FileOptions
522 may be specified. Since the FileOptions contain regular expressions
523 that are applied to the full filename, this will give the ability
524 to specify backup options on a file by file basis to whatever level
533 Include = compression=GZIP signature=MD5 {
542 Include = FileOptions=Opts {
553 That's a lot more to do the same thing, but it gives the ability to
554 apply options on a file by file basis. For example, suppose you
555 want to compress all files but not any file with extensions .gz or .Z.
556 You could do so as follows:
560 Include = FileOptions=NoCompress FileOptions=Opts {
568 ApplyTo = /*.?*/ # matches all files
573 # Note multiple ApplyTos are ORed
574 ApplyTo = /*.gz/ # matches .gz files
575 ApplyTo = /*.Z/ # matches .Z files
578 Now, since the NoCompress FileOptions is specified first on the
579 Include line, any *.gz or *.Z file will have an MD5 signature computed,
580 but will not be compressed. For all other files, the NoCompress will not
581 match, so the Opts options will be used which will include GZIP
585 - Is it necessary to provide some means of ANDing regular expressions
586 and negation? (not currently planned)
588 e.g. ApplyTo = /*.gz/ && !/big.gz/
590 - I see that Networker has a "null" module which, if specified, does not
591 back up the file, but does make a record of the file in the catalog
592 so that the catalog will reflect an exact picture of the filesystem.
593 The result is that the file can be "seen" when "browsing" the save
594 sets, but it cannot be restored.
596 Is this really useful? Should it be implemented in Bacula?
599 After implementing the above, the user will be able to specify
600 on a file by file basis (using regular expressions) what options are
601 applied for the backup.
602 ====================================
604 Done: (see kernsdone for more)
605 - Add EOM records? No, not at this time. The current system works and
607 - Add VolumeUseDuration and MaximumVolumeJobs to Pool db record and
609 - Add VOLUME_CAT_INFO to the EOS tape record (as
610 well as to the EOD record). -- No, not at this time.
611 - Put MaximumVolumeSize in Director (MaximumVolumeJobs, MaximumVolumeFiles,
613 - Enhance schedule to have 1stSat, ...
614 - Make sure catalog doesn't keep growing.
615 - On I/O error, write EOF, then try to write again ? No, keep it simple.
616 - Figure out how to compress everything except .gz,... files.
617 Implement FileOptions.
618 - Put Bacula version somewhere in Job stream, probably Start Session Labels.
619 - Fix start/end blocks for File devices
620 - Make Job err if WriteBootstrap fails.