Bacula Projects Roadmap

Item 1: Implement Base jobs.

What: A base job is sort of like a Full save except that you
will want the FileSet to contain only files that are unlikely
to change in the future (i.e. a snapshot of most of your
system after installing it). After the base job has been run,
when you are doing a Full save, you specify one or more
Base jobs to be used. All files that have been backed up in
the Base job/jobs but not modified will then be excluded from
the backup. During a restore, the Base jobs will be
automatically pulled in where necessary.

Why: This is something none of the competition does, as far as we
know (except BackupPC, which is a Perl program that saves to disk
only). It is a big win for the user: it makes Bacula stand out
as offering a unique optimization that immediately saves time
and money. Basically, imagine that you have 100 nearly identical
Windows or Linux machines containing the OS and user files.
Now for the OS part, a Base job will be backed up once, and
rather than making 100 copies of the OS, there will be only
one. If one or more of the systems have some files updated,
no problem: the modified files are saved in the normal backup
and will be automatically restored.

Notes: Huge savings in tape usage even for a single machine. Will
require more resources because the DIR must send the FD a list of
files/attribs, and the FD must search the list and compare it
for each file to be saved.

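The comparison described in the Notes for Item 1 can be sketched as
follows. This is only an illustration, not Bacula code: the function
name files_to_save and the (size, mtime) attribute tuple are
assumptions made for the example, and the real list of attributes
sent by the DIR would be richer.

```python
from typing import Dict, List, Tuple

# Hypothetical attribute record: (size in bytes, modification time).
Attribs = Tuple[int, float]


def files_to_save(base_list: Dict[str, Attribs],
                  current: Dict[str, Attribs]) -> List[str]:
    """Return the paths a Full save must still back up.

    A file is skipped only if it appears in the Base job list with
    identical attributes; new or modified files are always saved.
    """
    return sorted(path for path, attribs in current.items()
                  if base_list.get(path) != attribs)


# List received from the DIR (contents of the Base job).
base = {"/bin/ls": (100, 1.0), "/etc/hosts": (20, 1.0)}
# What the FD finds on the client now.
now = {"/bin/ls": (100, 1.0),
       "/etc/hosts": (25, 2.0),
       "/home/u/a.txt": (5, 3.0)}

print(files_to_save(base, now))   # ['/etc/hosts', '/home/u/a.txt']
```

Holding the list in a dictionary keeps each lookup cheap, which
matters because the FD has to consult it once per file.
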
Item 2: Job Data Spooling.

What: Make the Storage daemon use intermediate file storage to
buffer the data to disk before writing it to the tape.

Why: This would be a nice project and is the most requested
feature. Even though you may finish a client job quicker by
spooling to disk, you still have to eventually get it onto tape.
If intermediate disk buffering allows us to improve write
bandwidth to tape, it may make sense. In addition, you can
run multiple simultaneous jobs that all spool to disk, then the
data can be written one job at a time to the tape at full
tape speed. This keeps the tape running smoothly and prevents
blocks from different simultaneous jobs from being intermixed
on the tape, which is very inefficient for restores.

Notes: Need multiple spool directories. Should possibly be able
to spool by Job type, ... Possibly need high and low spool

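The effect of spooling on block order, as described in Item 2, can
be sketched like this. It is a toy model under stated assumptions:
the (job name, data) block format and the function name
spool_then_despool are invented for the example, and an in-memory
dictionary stands in for the spool files on disk.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical block format: (job name, data).
Block = Tuple[str, bytes]


def spool_then_despool(incoming: Iterable[Block]) -> List[Block]:
    """Collect interleaved blocks per job, then emit them job by job."""
    # One list per job stands in for one spool file per job.
    spool: Dict[str, List[bytes]] = defaultdict(list)
    for job, data in incoming:
        spool[job].append(data)

    # Despool one job at a time, in the order the jobs first appeared.
    tape: List[Block] = []
    for job, blocks in spool.items():
        tape.extend((job, data) for data in blocks)
    return tape


# Two simultaneous jobs whose blocks arrive intermixed.
arriving = [("A", b"a1"), ("B", b"b1"), ("A", b"a2"), ("B", b"b2")]
for job, data in spool_then_despool(arriving):
    print(job, data)
```

Each job's blocks end up contiguous on the "tape", which is the
property that makes restores efficient.
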
Item 3: GUI for interactive restore
Item 4: GUI for interactive backup

What: The current interactive restore is implemented with a tty
interface. It would be much nicer to be able to "see" the
list of files backed up in typical GUI tree format.
The same mechanism could also be used for creating
ad-hoc backup FileSets (item 4).

Why: Ease of use -- especially for the end user.

Notes: Rather than implementing in Gtk, we probably should go
for a Browser implementation, even if doing so meant the
capability wouldn't be available until much later. Not only
is there the question of Windows sites, but most
Solaris/HP/IRIX, etc., shops can't currently run Gtk programs
without installing lots of stuff that admins are very wary about.
Real sysadmins will always use the command line anyway, and
the user who's doing an interactive restore or backup of his
own files will in most cases be on a Windows machine running

Item 5: Implement a Migration job type that will move the job
data from one device to another.

What: The ability to copy, move, or archive data that is on a
device to another device is very important.

Why: An ISP might want to back up to disk, but after 30 days
migrate the data to tape backup and delete it from disk.
Bacula should be able to handle this automatically. It needs
to know what was put where, and when, and what to migrate -- it
is a bit like retention periods. Doing so would allow space to
be freed up for current backups while maintaining older data

Notes: Migration could be triggered by:
Highwater size (keep total size)

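The "Highwater size" trigger from the Notes for Item 5 could work
roughly as sketched below. The Job record, the function name
select_for_migration, and the oldest-first policy are all
assumptions made for illustration; the roadmap does not say which
jobs should be migrated first.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    """Hypothetical catalog record for a job stored on disk."""
    job_id: int
    end_time: float   # seconds since the epoch
    size: int         # bytes occupied on the source device


def select_for_migration(jobs: List[Job], highwater: int) -> List[Job]:
    """Pick the oldest jobs until the remaining total fits under highwater."""
    total = sum(job.size for job in jobs)
    selected: List[Job] = []
    for job in sorted(jobs, key=lambda j: j.end_time):
        if total <= highwater:
            break
        selected.append(job)
        total -= job.size
    return selected


on_disk = [Job(1, 10.0, 40), Job(2, 20.0, 40), Job(3, 30.0, 40)]
# 120 bytes are on disk; keep at most 50, so the two oldest jobs move.
print([job.job_id for job in select_for_migration(on_disk, 50)])   # [1, 2]
```
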
Item 6: Embedded Perl Scripting (precursor to 7).

What: On a configuration parameter, embed the Perl language in
Bacula.

Why: The embedded Perl scripting can be called to implement
Events such as "Volume Name needed", "End of Tape",
"Tape at x% of rated capacity", "Job started",
"Job Ended", "Job error", ...

Notes: This needs Events.

Item 7: Implement Events (requires 6).

What: When a particular user-defined Event occurs, call the
embedded Perl interpreter.

Why: This will provide the ultimate in user customization for
Bacula. Almost anything imaginable can be done if Events
are called at the appropriate place.

Notes: There is a certain amount of work to be done on how
the user defines or "registers" events.

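One possible shape for the "register" step mentioned in the Notes
for Item 7 is a table mapping event names to handlers. The sketch
below uses Python callables in place of the embedded Perl
interpreter, and the names register and fire are hypothetical; it
only illustrates the idea of registering handlers and calling them
when an Event occurs.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]

# Event name -> list of user handlers (hypothetical registry).
_handlers: Dict[str, List[Handler]] = defaultdict(list)


def register(event: str, handler: Handler) -> None:
    """Attach a user handler to a named Event."""
    _handlers[event].append(handler)


def fire(event: str, **info) -> int:
    """Call every handler registered for event; return how many ran."""
    handlers = _handlers.get(event, [])
    for handler in handlers:
        handler(info)
    return len(handlers)


log: List[str] = []
register("Job Ended", lambda info: log.append("ended: " + info["job"]))

fire("Job Ended", job="nightly")   # runs the handler above
fire("End of Tape")                # no handler registered, nothing runs
print(log)                         # ['ended: nightly']
```
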
Item 8: Multiple Storage Devices for a Single Job

What: Allow any Job to use more than one Storage device.

Why: With two devices, for example, the second device could
have the next backup tape pre-mounted, reducing operator
intervention in the middle of the night.

Item 9: Backup a Single Job Simultaneously to Multiple Storage
Devices

What: Make two copies of the backup data at the same time.

Why: Large shops typically do this and then take one set of
backups off-site. Some design work is needed in how to
specify the type of backup (backup, archive, ...) for each

Item 10: Break the one-to-one Relationship between a Job and a
Specific Storage Device (or Devices if #10 is implemented).

What: Allow a Job to simply specify one or more MediaTypes, and the
Storage daemon will select a device for it. In fact, the user
should be able to specify one or more MediaTypes, Storage
daemons, and/or devices to be used.

Why: To allow more flexibility in large shops that have multiple
drives and/or multiple drives of different types.

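The selection step in Item 10 might look something like the sketch
below. The Device record, the busy flag, and the rule of trying
media types in the order the Job lists them are assumptions for the
example, not a description of how the Storage daemon works.

```python
from typing import NamedTuple, Optional, Sequence


class Device(NamedTuple):
    """Hypothetical view of a drive known to the Storage daemon."""
    name: str
    media_type: str
    busy: bool


def select_device(devices: Sequence[Device],
                  wanted_types: Sequence[str]) -> Optional[Device]:
    """Return the first idle device whose media type the Job accepts.

    Media types are tried in the order the Job lists them.
    """
    for media_type in wanted_types:
        for dev in devices:
            if dev.media_type == media_type and not dev.busy:
                return dev
    return None


drives = [Device("dlt-0", "DLT8000", busy=True),
          Device("dlt-1", "DLT8000", busy=False),
          Device("dds-0", "DDS-4", busy=False)]

chosen = select_device(drives, ["DLT8000", "DDS-4"])
print(chosen.name if chosen else "no device available")   # dlt-1
```
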
Item 11: Add Regular Expression Matching and Plug-ins to the
FileSet Include statements.

What: Allow users to specify wild-card and/or regular expressions
to be matched in both the Include and Exclude directives
in a FileSet. At the same time, allow users to define plug-ins
to be called (based on regular expression/wild-card matching).

Why: This would give the users the ultimate ability to control how
files are backed up/restored. A user could write a plug-in
that knows how to back up his Oracle database without
stopping/starting it, for example.

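The matching described in Item 11 can be illustrated with Python's
standard fnmatch and re modules. The function name select_files and
the choice of wild-cards for Include and regular expressions for
Exclude are assumptions made to keep the example short; the item
itself allows either kind of pattern in both directives.

```python
import fnmatch
import re
from typing import Iterable, List


def select_files(paths: Iterable[str],
                 include_wild: Iterable[str],
                 exclude_regex: Iterable[str]) -> List[str]:
    """Keep paths matching any Include wild-card and no Exclude regex."""
    includes = list(include_wild)
    excludes = [re.compile(rx) for rx in exclude_regex]
    selected = []
    for path in paths:
        included = any(fnmatch.fnmatchcase(path, pat) for pat in includes)
        excluded = any(rx.search(path) for rx in excludes)
        if included and not excluded:
            selected.append(path)
    return selected


found = ["/home/u/a.txt", "/home/u/a.tmp", "/etc/passwd"]
print(select_files(found, ["/home/*"], [r"\.tmp$"]))   # ['/home/u/a.txt']
```

A plug-in hook could use the same matching to decide which handler
to call for a given file instead of simply keeping or dropping it.
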
Item 12: Implement data encryption (as opposed to communications
encryption)

What: Currently the data that is stored on the Volume is not
encrypted. For confidentiality, encryption of data at
the File daemon level is essential. Note, communications
encryption encrypts the data when leaving the File daemon,
then decrypts the data on entry to the Storage daemon.
Data encryption encrypts the data in the File daemon and
decrypts the data in the File daemon during a restore.

Why: Large sites require this.

Notes: The only algorithm that is needed is AES.
http://csrc.nist.gov/CryptoToolkit/aes/

Item 13: New daemon communication protocol.

What: The current daemon to daemon protocol is basically an ASCII
printf() and sending the buffer. On the receiving end, the
buffer is sscanf()ed to unpack it. The new scheme would
retain the current ASCII sending, but would add an
argc/argv-like, table-driven scanner to replace sscanf.

Why: Named fields will permit error checking to ensure that what is
sent is what the receiver really wants. The fields can be in
any order and additional fields can be ignored, allowing better
upward compatibility. Much better checking of the types and
values passed can be done.

Notes: These are internal improvements in the interest of the
long-term stability and evolution of the program. On the one
hand, the sooner they're done, the less code we have to rip
up when the time comes to install them. On the other hand,
they don't bring an immediately perceptible benefit to

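A table-driven scanner for named fields, as proposed in Item 13,
could be sketched as follows. The key=value wire format, the field
names, and the function scan_message are assumptions for
illustration only; the roadmap does not define the message syntax.
For simplicity, values in this sketch cannot contain spaces.

```python
from typing import Callable, Dict, Iterable

# Table driving the scanner: field name -> converter (hypothetical names).
JOB_FIELDS: Dict[str, Callable[[str], object]] = {
    "JobId": int,
    "Name": str,
    "Level": str,
}


def scan_message(line: str,
                 table: Dict[str, Callable[[str], object]],
                 required: Iterable[str] = ()) -> dict:
    """Parse 'key=value' tokens in any order, ignoring unknown keys."""
    result = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"malformed field: {token!r}")
        if key in table:
            # The converter raises ValueError if the value has the wrong type.
            result[key] = table[key](value)
    missing = [name for name in required if name not in result]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return result


# Fields arrive in a different order, with one field the receiver ignores.
msg = "Level=Full JobId=42 Extra=1 Name=nightly"
print(scan_message(msg, JOB_FIELDS, required=("JobId", "Name")))
```

Because unknown fields are skipped rather than rejected, a newer
sender can add fields without breaking an older receiver.
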
Completed items from last year's list:
Item 1: Multiple simultaneous Jobs. (done)
Item 3: Write the bscan program -- also write a bcopy program (done).
Item 5: Implement Label templates (done).
Item 6: Write a regression script (done)
Item 9: Add SSL to daemon communications (For now, implement with
Item 10: Define definitive tape format (done)